Dictaphone Demo

A voice recording app that demonstrates transparent RPC and client-side AI. The Stimulus controller calls Clip.create() and clip.audio.attach() directly — the same Ruby code works in the browser (IndexedDB) and on Node.js (SQLite via RPC). Audio transcription and English-to-Spanish translation run entirely in the browser via Whisper and OPUS-MT (Transformers.js) — no API keys, no server calls.

Create the App

Try it live — no install required.

To run locally:

npx github:ruby2js/juntos --demo dictaphone
cd dictaphone

This creates a Rails app with:

  • Clip model — stores audio recordings with transcriptions and translations
  • Active Storage — manages audio file attachments
  • Dictaphone controller — Stimulus controller written in Ruby
  • Whisper integration — local speech-to-text via Transformers.js
  • OPUS-MT translation — local English-to-Spanish translation via Transformers.js
  • Tailwind CSS — clean recording UI with waveform visualization

Run with Rails

The demo includes a Stimulus controller written in Ruby (app/javascript/controllers/dictaphone_controller.rb). To transpile it automatically, install ruby2js:

bundle add ruby2js --github ruby2js/ruby2js --branch master
bin/rails generate ruby2js:install
RAILS_ENV=production bin/rails db:prepare
bin/rails server -e production

Open http://localhost:3000. Click “Start Recording” to begin capturing audio. Stop recording to save the clip and trigger transcription.

Run in the Browser

Stop Rails. Run the same app in your browser:

bin/juntos dev -d dexie

Open http://localhost:3000. Same recording interface. Same transcription and translation. But now:

  • No Ruby runtime — the browser runs transpiled JavaScript
  • IndexedDB storage — audio files persist in your browser via Active Storage’s IndexedDB adapter
  • Local AI models — Whisper (~75MB) and OPUS-MT (~30MB) download on first use, cached in IndexedDB
  • Hot reload — edit a Ruby file, save, browser refreshes

Microphone Permissions

The browser will request microphone access when you click “Start Recording”. Grant permission to enable audio capture. Recordings are stored as WebM audio blobs in IndexedDB.

Run on Node.js

bin/juntos db:prepare -d sqlite
bin/juntos up -d sqlite

Open http://localhost:3000. Same app — but now Node.js serves requests, and audio files are stored on the local filesystem via Active Storage’s disk adapter. The Stimulus controller’s model operations (Clip.create(), clip.audio.attach()) are automatically routed through RPC to the server — no fetch calls or form submissions needed.

Environment Variables

S3 Storage (Edge Targets)

For edge deployments (Fly.io, Cloudflare Workers, Vercel Edge, Deno Deploy), configure S3-compatible storage:

export S3_BUCKET=my-bucket
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
# For Cloudflare R2 or MinIO:
export AWS_ENDPOINT_URL=https://...
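
In a standard Rails setup, these variables would be referenced from config/storage.yml. A minimal sketch, assuming the stock S3 service from aws-sdk-s3 (the service name and exact layout in this demo may differ):

```yaml
# config/storage.yml — hypothetical S3 service entry
amazon:
  service: S3
  bucket: <%= ENV["S3_BUCKET"] %>
  access_key_id: <%= ENV["AWS_ACCESS_KEY_ID"] %>
  secret_access_key: <%= ENV["AWS_SECRET_ACCESS_KEY"] %>
  region: <%= ENV["AWS_REGION"] %>
  # Only needed for S3-compatible stores such as Cloudflare R2 or MinIO:
  endpoint: <%= ENV["AWS_ENDPOINT_URL"] %>
```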

The Code

The Stimulus controller imports a model and calls it directly. In the browser, this hits IndexedDB. On Node.js, the build automatically generates RPC so the same code calls the server:

import ["Clip"], from: 'juntos:models'

class DictaphoneController < Stimulus::Controller
  async def save(event)
    event.preventDefault()
    return unless @audioBlob

    clip = await Clip.create(
      name: nameTarget.value || "Untitled Recording",
      transcript: transcriptTarget.value,
      translation: translationTarget.value,
      duration: parseFloat(durationTarget.value)
    )

    extension = @audioBlob.type.include?('webm') ? 'webm' : 'm4a'
    await clip.audio.attach(@audioBlob,
      filename: "recording.#{extension}",
      content_type: @audioBlob.type
    )
  end
end
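
The extension choice on the attach call can be read as a small standalone helper — a sketch for illustration only (the extension_for name is hypothetical, not part of the demo):

```ruby
# Hypothetical helper mirroring the controller's extension selection.
# Chrome and Firefox MediaRecorder produce WebM; Safari records MP4/AAC,
# which the controller saves with an .m4a extension.
def extension_for(mime_type)
  mime_type.include?('webm') ? 'webm' : 'm4a'
end
```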

Try it — the model uses Active Storage:

class Clip < ApplicationRecord
  has_one_attached :audio

  validates :name, presence: true
  broadcasts_to -> { "clips" }, inserts_by: :prepend
end

What This Demo Shows

Transparent RPC

  • Direct model access — Clip.create() in a Stimulus controller, no fetch or form submission
  • Automatic routing — browser target uses IndexedDB, Node.js target uses RPC to the server
  • Build-time detection — the build pipeline detects model imports and generates the RPC layer
  • Like React Server Functions — but for Stimulus controllers and Ruby syntax
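
To make the routing concrete, here is a minimal sketch of the idea — hypothetical names throughout, not the actual Juntos RPC layer: a generated proxy serializes the model call and hands it to whatever transport the build target wires in (an in-process adapter for IndexedDB, an HTTP round trip for Node.js).

```ruby
require 'json'

# Hypothetical sketch of a transparent-RPC proxy. In the real build this
# layer is generated; on the Node.js target the transport would POST the
# payload to the server instead of handling it locally.
class RpcModelProxy
  def initialize(model_name, transport)
    @model_name = model_name
    @transport  = transport   # callable: JSON payload -> JSON response
  end

  def create(attributes)
    payload = JSON.generate(model: @model_name, action: 'create', args: attributes)
    JSON.parse(@transport.call(payload))
  end
end

# A fake in-process transport standing in for the HTTP round trip:
echo = ->(payload) { JSON.generate(JSON.parse(payload)['args'].merge('id' => 1)) }
clip = RpcModelProxy.new('Clip', echo).create(name: 'Untitled Recording')
```

The calling code never changes — only the transport handed to the proxy does, which is the point of build-time RPC generation.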

Active Storage Integration

  • Browser — IndexedDB adapter stores blobs locally
  • Node.js — Disk adapter stores files on filesystem
  • Edge — S3 adapter stores in cloud object storage
  • Same API — has_one_attached :audio works everywhere

Client-Side AI

  • Whisper transcription — speech-to-text runs in the browser via Transformers.js
  • OPUS-MT translation — English-to-Spanish translation, also via Transformers.js
  • No API keys — both models download on first use (~75MB + ~30MB), then are cached
  • Same library — both pipelines use @xenova/transformers, imported from a Stimulus controller written in Ruby
  • Progress indicators — model download progress shown in the UI
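
The download-once behavior is a simple memoization pattern. A sketch under stated assumptions — ModelCache and the fetch block are illustrative, not the demo's actual code, which persists weights to IndexedDB rather than a hash:

```ruby
# Illustrative "download on first use, then cache" pattern. The in-memory
# hash stands in for IndexedDB; the block stands in for the real download.
class ModelCache
  def initialize(&fetcher)
    @fetcher = fetcher
    @store = {}
  end

  def get(name)
    @store[name] ||= @fetcher.call(name)
  end
end

downloads = 0
cache = ModelCache.new { |name| downloads += 1; "weights:#{name}" }
cache.get('whisper-tiny.en')  # triggers the (simulated) download
cache.get('whisper-tiny.en')  # served from cache; no second download
```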

Audio Recording

  • MediaRecorder API — captures audio from microphone
  • Raw PCM capture — audio samples captured directly via ScriptProcessorNode for reliable Whisper input
  • Waveform visualization — real-time audio level display

Next Steps

Back to Juntos/demos