Dictaphone Demo

A voice recording app that demonstrates transparent RPC and client-side AI. The Stimulus controller calls Clip.create() and clip.audio.attach() directly — the same Ruby code works in the browser (IndexedDB) and on Node.js (SQLite via RPC). Audio transcription and English-to-Spanish translation run entirely in the browser via Whisper and OPUS-MT (Transformers.js) — no API keys, no server calls.

Create the App

Try it live — no install required.

To run locally:

npx github:ruby2js/juntos --demo dictaphone
cd dictaphone

This creates a Rails app with:

  • Clip model — stores audio recordings with transcriptions and translations
  • Active Storage — manages audio file attachments
  • Dictaphone controller — Stimulus controller written in Ruby
  • Whisper integration — local speech-to-text via Transformers.js
  • OPUS-MT translation — local English-to-Spanish translation via Transformers.js
  • Tailwind CSS — clean recording UI with waveform visualization

Run with Rails

The demo includes a Stimulus controller written in Ruby (app/javascript/controllers/dictaphone_controller.rb). To transpile it automatically, install ruby2js:

bundle add ruby2js --github ruby2js/ruby2js --branch master
bin/rails generate ruby2js:install
RAILS_ENV=production bin/rails db:prepare
bin/rails server -e production

Open http://localhost:3000. Click “Start Recording” to begin capturing audio. Stop recording to save the clip and trigger transcription.

Run in the Browser

Stop Rails. Run the same app in your browser:

bin/juntos dev -d dexie

Open http://localhost:3000. Same recording interface. Same transcription and translation. But now:

  • No Ruby runtime — the browser runs transpiled JavaScript
  • IndexedDB storage — audio files persist in your browser via Active Storage’s IndexedDB adapter
  • Local AI models — Whisper (~75MB) and OPUS-MT (~30MB) download on first use, cached in IndexedDB
  • Hot reload — edit a Ruby file, save, browser refreshes

Microphone Permissions

The browser will request microphone access when you click “Start Recording”. Grant permission to enable audio capture. Recordings are stored as WebM audio blobs in IndexedDB.

Run on Node.js

bin/juntos db:prepare -d sqlite
bin/juntos up -d sqlite

Open http://localhost:3000. Same app — but now Node.js serves requests, and audio files are stored on the local filesystem via Active Storage’s disk adapter. The Stimulus controller’s model operations (Clip.create(), clip.audio.attach()) are automatically routed through RPC to the server — no fetch calls or form submissions needed.

Environment Variables

S3 Storage (Edge Targets)

For edge deployments (Fly.io, Cloudflare Workers, Vercel Edge, Deno Deploy), configure S3-compatible storage:

export S3_BUCKET=my-bucket
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
# For Cloudflare R2 or MinIO:
export AWS_ENDPOINT_URL=https://...
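
In a standard Rails setup, these variables would be referenced from config/storage.yml. A minimal sketch, assuming the stock S3 service from aws-sdk-s3 (the service name and exact layout in this demo may differ):

```yaml
# config/storage.yml — hypothetical S3 service entry
amazon:
  service: S3
  bucket: <%= ENV["S3_BUCKET"] %>
  access_key_id: <%= ENV["AWS_ACCESS_KEY_ID"] %>
  secret_access_key: <%= ENV["AWS_SECRET_ACCESS_KEY"] %>
  region: <%= ENV["AWS_REGION"] %>
  # Only needed for S3-compatible stores such as Cloudflare R2 or MinIO:
  endpoint: <%= ENV["AWS_ENDPOINT_URL"] %>
```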

The Code

The Stimulus controller imports a model and calls it directly. In the browser, this hits IndexedDB. On Node.js, the build automatically generates RPC so the same code calls the server:

import ["Clip"], from: 'juntos:models'

class DictaphoneController < Stimulus::Controller
  async def save(event)
    event.preventDefault()
    return unless @audioBlob

    clip = await Clip.create(
      name: nameTarget.value || "Untitled Recording",
      transcript: transcriptTarget.value,
      translation: translationTarget.value,
      duration: parseFloat(durationTarget.value)
    )

    extension = @audioBlob.type.include?('webm') ? 'webm' : 'm4a'
    await clip.audio.attach(@audioBlob,
      filename: "recording.#{extension}",
      content_type: @audioBlob.type
    )
  end
end
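
The extension choice on the attach call can be read as a small standalone helper — a sketch for illustration only (the extension_for name is hypothetical, not part of the demo):

```ruby
# Hypothetical helper mirroring the controller's extension selection.
# Chrome and Firefox MediaRecorder produce WebM; Safari records MP4/AAC,
# which the controller saves with an .m4a extension.
def extension_for(mime_type)
  mime_type.include?('webm') ? 'webm' : 'm4a'
end
```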

Try it — the model uses Active Storage:

class Clip < ApplicationRecord
  has_one_attached :audio

  validates :name, presence: true
  broadcasts_to -> { "clips" }, inserts_by: :prepend
end

What This Demo Shows

Transparent RPC

  • Direct model access — Clip.create() in a Stimulus controller, no fetch or form submission
  • Automatic routing — browser target uses IndexedDB, Node.js target uses RPC to the server
  • Build-time detection — the build pipeline detects model imports and generates the RPC layer
  • Like React Server Functions — but for Stimulus controllers and Ruby syntax
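
To make the routing concrete, here is a minimal sketch of the idea — hypothetical names throughout, not the actual Juntos RPC layer: a generated proxy serializes the model call and hands it to whatever transport the build target wires in (an in-process adapter for IndexedDB, an HTTP round trip for Node.js).

```ruby
require 'json'

# Hypothetical sketch of a transparent-RPC proxy. In the real build this
# layer is generated; on the Node.js target the transport would POST the
# payload to the server instead of handling it locally.
class RpcModelProxy
  def initialize(model_name, transport)
    @model_name = model_name
    @transport  = transport   # callable: JSON payload -> JSON response
  end

  def create(attributes)
    payload = JSON.generate(model: @model_name, action: 'create', args: attributes)
    JSON.parse(@transport.call(payload))
  end
end

# A fake in-process transport standing in for the HTTP round trip:
echo = ->(payload) { JSON.generate(JSON.parse(payload)['args'].merge('id' => 1)) }
clip = RpcModelProxy.new('Clip', echo).create(name: 'Untitled Recording')
```

The calling code never changes — only the transport handed to the proxy does, which is the point of build-time RPC generation.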

Active Storage Integration

  • Browser — IndexedDB adapter stores blobs locally
  • Node.js — Disk adapter stores files on filesystem
  • Edge — S3 adapter stores in cloud object storage
  • Same API — has_one_attached :audio works everywhere

Client-Side AI

  • Whisper transcription — speech-to-text runs in the browser via Transformers.js
  • OPUS-MT translation — English-to-Spanish translation, also via Transformers.js
  • No API keys — both models download on first use (~75MB + ~30MB), then are cached
  • Same library — both pipelines use @xenova/transformers, imported from a Stimulus controller written in Ruby
  • Progress indicators — model download progress shown in the UI
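
The download-once behavior is a simple memoization pattern. A sketch under stated assumptions — ModelCache and the fetch block are illustrative, not the demo's actual code, which persists weights to IndexedDB rather than a hash:

```ruby
# Illustrative "download on first use, then cache" pattern. The in-memory
# hash stands in for IndexedDB; the block stands in for the real download.
class ModelCache
  def initialize(&fetcher)
    @fetcher = fetcher
    @store = {}
  end

  def get(name)
    @store[name] ||= @fetcher.call(name)
  end
end

downloads = 0
cache = ModelCache.new { |name| downloads += 1; "weights:#{name}" }
cache.get('whisper-tiny.en')  # triggers the (simulated) download
cache.get('whisper-tiny.en')  # served from cache; no second download
```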

Audio Recording

  • MediaRecorder API — captures audio from microphone
  • Raw PCM capture — audio samples captured directly via ScriptProcessorNode for reliable Whisper input
  • Waveform visualization — real-time audio level display

Next Steps

Back to Juntos/demos