Integrations · Internal

Zoom rehearsal (one bot)

FLY API · API base: https://digital-delegate-api.fly.dev

Binds one Session and one stored ZoomConnection to a meeting id and validates the meeting over Zoom REST. It then either (A) moves straight to LISTENING for browser-only rehearsal, or (B), when ZOOM_WORKER_SECRET is set on the API, stops at CONNECTED until a Meeting SDK worker reports in_meeting. TTS is produced on the API; piping it into the Zoom session requires your own SDK and virtual mic setup (see apps/zoom-worker/README.md).
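The two paths above can be read as a small state machine. The sketch below is illustrative only: the type and function names are hypothetical, and only the CONNECTED and LISTENING states, the in_meeting report, and the ZOOM_WORKER_SECRET condition come from the description.

```typescript
// Only the two states named in the description are modelled here.
type BindingState = "CONNECTED" | "LISTENING";

interface BindingContext {
  // True when ZOOM_WORKER_SECRET is set on the API (path B).
  workerSecretConfigured: boolean;
}

// State right after the meeting has been validated over Zoom REST.
function stateAfterValidation(ctx: BindingContext): BindingState {
  return ctx.workerSecretConfigured ? "CONNECTED" : "LISTENING";
}

// Apply a worker status report. Only in_meeting promotes CONNECTED to LISTENING;
// every other report leaves the state unchanged.
function applyWorkerReport(state: BindingState, report: string): BindingState {
  if (state === "CONNECTED" && report === "in_meeting") {
    return "LISTENING";
  }
  return state;
}
```

With the worker secret set, a binding stays at CONNECTED until applyWorkerReport sees in_meeting; without it, the binding is LISTENING as soon as validation succeeds.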

LiveSession save target

Start here for the demo: create or load a LiveSession, run browser rehearsal turns below, then save the snapshot/result back to that session. Operator-entered replies are the current workaround until STT is implemented.

Status

No LiveSession linked

Current transcript source: no browser turns yet.

Live Zoom Worker Checklist

Real Zoom participation uses the Meeting SDK worker path. Browser demo audio below is separate and does not put audio into Zoom.

Browser Demo Mode

Proves worker intelligence and voice locally with ElevenLabs in the browser. It does not put audio into Zoom.

Live Zoom Worker Mode

Requires the Meeting SDK worker to join and report in_meeting. Binding and worker must use the same API instance.

Phase 5C Native Runtime Proof

Populate native SDK assets, build the Linux worker image, run ldd, then prove CONNECTED becomes LISTENING after in_meeting.

Panelist Mode

Preserves Delegate, Analyst, and Provocationist rehearsal behavior for live content use cases.

Zoom OAuth configured: Checking
Meeting SDK key configured: Checking
Meeting SDK secret configured: Checking
Worker secret configured: Checking
Public API base URL configured: Checking
ElevenLabs configured: Checking
Default voice configured: Checking
Database configured: Checking
Binding created: Missing
Worker status: No binding
Worker in meeting: Not reported yet
Audio output ready: Waiting for worker in meeting
Listening/STT status: Not implemented yet
API configuration: Missing

API needs DATABASE_URL and DIRECT_URL for sessions, bindings, OAuth tokens, responses, and warnings.

Next: Set database env vars in apps/api/.env and restart the API.
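A startup check along these lines could surface the missing variables before the API serves requests. This is a sketch, not the API's actual check: the function name and the idea of passing the environment in as a plain object are assumptions; only the variable names DATABASE_URL and DIRECT_URL come from the text above.

```typescript
type Env = Record<string, string | undefined>;

// Variable names taken from the text above; treating them as one group is an assumption.
const REQUIRED_DATABASE_ENV: readonly string[] = ["DATABASE_URL", "DIRECT_URL"];

// Returns the names that are unset or blank, in the order they were requested.
function missingEnv(env: Env, required: readonly string[] = REQUIRED_DATABASE_ENV): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node process the environment object would typically be process.env. The same helper can be called with another list of names, for example the Meeting SDK variables.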

Zoom OAuth: Missing

OAuth loads Zoom connections and obtains the ZAK needed for Meeting SDK join.

Next: Confirm Zoom app Client ID, Client Secret, and redirect URI.

Meeting SDK: Missing

Meeting SDK key/secret signs the SDKAuth JWT for the external worker.

Next: Set ZOOM_MEETING_SDK_KEY and ZOOM_MEETING_SDK_SECRET on API and worker.

Worker runtime: Missing

Live Zoom requires apps/zoom-worker to join and report in_meeting to move CONNECTED to LISTENING.

Next: Run local worker against local API, or deployed worker against deployed API. Do not mix local API with Fly worker.
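One way to catch a mixed setup early is to compare the API base the binding was created against with the API base the worker is configured to use. The helper below is a hypothetical sketch. Matching origins is a necessary check rather than proof that both sides reach the same running instance.

```typescript
// Reduce a base URL to its origin so trailing slashes and paths do not matter.
function apiOrigin(base: string): string {
  return new URL(base).origin;
}

// True when both sides point at the same scheme, host, and port.
function sameApiOrigin(bindingApiBase: string, workerApiBase: string): boolean {
  return apiOrigin(bindingApiBase) === apiOrigin(workerApiBase);
}
```

A worker pointed at the Fly API while the binding was created on a local API would fail this check, which matches the "do not mix" rule above.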

TTS/audio: Missing

API generates ElevenLabs MP3; zoom-worker injects it through PulseAudio into the Meeting SDK client.

Next: Set ElevenLabs env and ensure worker has ffmpeg, PulseAudio, ZoomOut/ZoomIn, and native Zoom SDK runtime.

Listening/STT: Not implemented

The worker does not yet capture Zoom meeting audio or stream it to STT.

Next: Use operator-entered prospect replies until STT capture, segmentation, and trigger policy are implemented.

Binding/session: Missing

A binding links one Session and Zoom connection to a Zoom meeting id.

Next: Create a session, set up a participant, paste a Zoom URL or meeting id, then create the binding.
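Since the binding form accepts either a Zoom URL or a bare meeting id, the input has to be reduced to a numeric id at some point. The sketch below shows one way to do that. It assumes join URLs carry the id as /j/<digits> and that meeting ids are 9 to 11 digits long; the function name is hypothetical and this is not necessarily how the API parses the field.

```typescript
// Accepts a bare meeting id (spaces or dashes allowed) or a join URL whose
// path contains /j/<digits>. Returns the digits only, or null if nothing matches.
function parseMeetingId(input: string): string | null {
  const trimmed = input.trim();

  // Bare id such as "123 4567 8901" or "123-456-7890".
  const bare = trimmed.replace(/[\s-]/g, "");
  if (/^\d{9,11}$/.test(bare)) {
    return bare;
  }

  // Join URL: take the digits that follow /j/ up to the next separator.
  const match = /\/j\/(\d{9,11})(?:[/?#]|$)/.exec(trimmed);
  return match?.[1] ?? null;
}
```

Returning null instead of throwing lets the form show a validation message before any call to Zoom REST is attempted.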

  1. Start API.
  2. Start web.
  3. Load Zoom connection.
  4. Create session.
  5. Set up worker.
  6. Paste Zoom join URL.
  7. Create binding.
  8. Click Join.
  9. Immediately run doctor or start worker with that binding.
  10. Wait for worker in_meeting.
  11. Only then use Speak.

Listening/STT is not implemented. Operator-entered replies are required.

Native runtime proof checklist

Current binding

Create a binding first.

Current API

FLY API: https://digital-delegate-api.fly.dev

Create the binding, click Join, and run the worker against this same API base.

Native SDK assets missing?

Run native layout and asset verification, place files under apps/zoom-worker/vendor/zoom, then build and run ldd. Do not commit vendor binaries.

Join bundle state

Create binding and click Join to arm the bundle.

  1. Deploy API and zoom-worker from the same repo revision.
  2. Confirm API and worker Fly secrets use the same worker secret and Meeting SDK credentials.
  3. Use deployed web/app or local web configured to the deployed API.
  4. Create binding and click Join against that API.
  5. Start the worker with the current binding id immediately.
  6. Watch logs, admit the worker in Zoom, then confirm CONNECTED becomes LISTENING.

After Join, start the worker immediately. The API keeps the join bundle in memory for only a few minutes.
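The timing constraint follows from how a short-lived in-memory store behaves. The sketch below illustrates the idea only: the class name, the bundle shape, the three-minute default, and the single-use take are all assumptions, since the text says only that the bundle is held in API memory for a few minutes.

```typescript
interface JoinBundle {
  bindingId: string;
  payload: unknown; // stands in for whatever the worker needs in order to join
}

class JoinBundleStore {
  private readonly entries = new Map<string, { bundle: JoinBundle; expiresAt: number }>();
  private readonly ttlMs: number;

  // The default TTL is an assumed value, not the API's real setting.
  constructor(ttlMs: number = 3 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Called when the operator clicks Join: arms the bundle for this binding.
  arm(bundle: JoinBundle, now: number = Date.now()): void {
    this.entries.set(bundle.bindingId, { bundle, expiresAt: now + this.ttlMs });
  }

  // Called when the worker starts: returns the bundle once if it is still fresh.
  // Unknown, expired, or already-taken ids yield null.
  take(bindingId: string, now: number = Date.now()): JoinBundle | null {
    const entry = this.entries.get(bindingId);
    if (!entry) {
      return null;
    }
    this.entries.delete(bindingId);
    return now < entry.expiresAt ? entry.bundle : null;
  }
}
```

Because the map lives inside one process, a worker that asks a different API instance, or asks too late, gets nothing back.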

Direct Zoom TTS proof

Browser demo audio

Plays ElevenLabs audio in this browser only. It does not send audio into Zoom.

Direct worker audio

Requires the worker to be in the meeting, then API Speak emits TTS to the worker over Socket.IO for ffmpeg/PulseAudio injection.

Worker in meeting

Wait for worker in_meeting and binding LISTENING before using Speak for direct Zoom audio.
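The two preconditions for Speak can be expressed as a small guard. This is a hypothetical sketch: the RuntimeStatus shape and the function name are assumptions, and only the in_meeting and LISTENING values come from the text.

```typescript
interface RuntimeStatus {
  bindingState: string;        // for example "CONNECTED" or "LISTENING"
  workerStatus: string | null; // for example "in_meeting"; null when nothing was reported
}

// Returns the reasons Speak should stay disabled; an empty list means it may proceed.
function speakBlockers(status: RuntimeStatus): string[] {
  const blockers: string[] = [];
  if (status.workerStatus !== "in_meeting") {
    blockers.push("worker has not reported in_meeting");
  }
  if (status.bindingState !== "LISTENING") {
    blockers.push("binding is not LISTENING");
  }
  return blockers;
}
```

Returning reasons rather than a boolean lets the page show the operator which precondition is still missing.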

Audio proof status

No worker audio proof event yet

  1. Run audio doctor in the same Linux/Docker/Fly environment as the worker.
  2. Use TTS dry-run to prove download and ffmpeg conversion without claiming Zoom output.
  3. After the worker reports in_meeting, click Speak or call the speak endpoint.
  4. Watch worker logs for tts_audio_ready_received, tts_audio_downloaded, ffmpeg injection start, and audio_complete.
  5. Have a human confirm whether the Zoom participant heard the audio.

TTS dry-run proves download and conversion only. Direct Zoom audio is proven only when the worker is in a real meeting and a participant hears it.
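When watching worker logs, it can help to track which of the named events have appeared so far. The sketch below scans log lines for the three event names listed in step 4. The ffmpeg injection start is left out because the text gives no exact event name for it. The function name and the assumption that event names appear verbatim in log lines are illustrative.

```typescript
// Event names copied from the checklist above.
const AUDIO_PROOF_EVENTS = [
  "tts_audio_ready_received",
  "tts_audio_downloaded",
  "audio_complete",
] as const;

type AudioProofEvent = (typeof AUDIO_PROOF_EVENTS)[number];

// Marks each event as seen if any log line contains its name.
function audioProofProgress(logLines: string[]): Record<AudioProofEvent, boolean> {
  const seen: Record<AudioProofEvent, boolean> = {
    tts_audio_ready_received: false,
    tts_audio_downloaded: false,
    audio_complete: false,
  };
  for (const line of logLines) {
    for (const event of AUDIO_PROOF_EVENTS) {
      if (line.includes(event)) {
        seen[event] = true;
      }
    }
  }
  return seen;
}
```

Even with all three flags set, this only shows that the worker processed the audio. Step 5, a human confirming they heard it, is still what proves direct Zoom audio.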

Listening/STT proof

Listening/STT

Not implemented or proven. The worker does not yet capture Zoom audio or transcribe it.

Transcript API

Ready for operator-entered, browser rehearsal, and future STT transcript segments on LiveSession.

Future capture source

ZOOM_CAPTURE_SOURCE / ZOOM_AUDIO_SOURCE / PULSE_SOURCE, defaulting to ZoomIn.
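A resolution order for these variables might look like the sketch below. The precedence (left to right as listed) is an assumption, as is the function name; only the three variable names and the ZoomIn default come from the text.

```typescript
// Checked in this order; the first non-blank value wins (assumed precedence).
const CAPTURE_SOURCE_VARS = ["ZOOM_CAPTURE_SOURCE", "ZOOM_AUDIO_SOURCE", "PULSE_SOURCE"] as const;

function resolveCaptureSource(env: Record<string, string | undefined>): string {
  for (const name of CAPTURE_SOURCE_VARS) {
    const value = env[name]?.trim();
    if (value) {
      return value;
    }
  }
  // Default named in the text above.
  return "ZoomIn";
}
```

Resolving the name does not prove the source exists. That still has to be checked inside the Linux/Docker/Fly environment where the worker runs.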

Self-feedback risk

Worker TTS must be separated from prospect audio before automatic listening can trigger worker turns.

  1. Prove the capture source exists in Linux/Docker/Fly.
  2. Record short chunks without sending them to STT.
  3. Add a real STT provider and persist final segments.
  4. Suppress worker TTS feedback before automatic triggers.
  5. Keep manual operator-entered replies as the safe workaround.

No live Zoom listening/STT claim is made here. This is readiness scaffolding for a future capture and transcription pipeline.

Operator commands

Operator commands unavailable until the API responds.

Production requires app.delegateworker.com plus a deployed API and worker that use the same API instance. A Fly worker cannot consume a join bundle created on a local API because the bundle lives in that API's memory.

1. Link ids

Load connections after OAuth (integrations page).

2. Session + panelist

Use the Show id printed by pnpm db:seed (the operator “Show id”). This step creates a new Session row each time it runs.

3. Binding + join

4. Speak (invocation + TTS)

5. Demo cockpit

Browser-only demo turn. Uses the current binding id when present, but does not require Zoom audio or a live call.

LISTENING

Ongoing Sales Discovery session

The worker leads the call, asks one follow-up at a time, speaks with ElevenLabs when available, and updates qualification fields.

READY

Start the discovery session to let the worker take the first turn.

Result so far

Budget: Open question
Timeline: Open question
Decision maker: Open question
Pain points: Open question
Qualification status: Needs discovery
Objections: Open question
Recommended next step: Continue qualification.
Follow-up email draft: Open question

Runtime status

Create a binding to see status (auto-polls every 2s when binding id is set).
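The auto-poll behaviour can be sketched as a timer that only starts once a binding id exists. Everything here except the 2-second interval is an assumption: the function name, the callback shape, and the immediate first read are illustrative and not taken from the page's code.

```typescript
type StopPolling = () => void;

// Starts polling only when a binding id is set; returns a function that stops it.
function startStatusPolling(
  bindingId: string | null,
  fetchStatus: (bindingId: string) => void,
  intervalMs: number = 2000,
): StopPolling {
  if (!bindingId) {
    // No binding yet: nothing to poll, and stopping is a no-op.
    return () => {};
  }
  const id: string = bindingId;
  fetchStatus(id); // immediate first read (an assumption)
  const timer = setInterval(() => fetchStatus(id), intervalMs);
  return () => clearInterval(timer);
}
```

Calling the returned function when the binding id changes or the page unmounts avoids leaving a stale timer running.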