video-streaming-server

Purpose

video-streaming-server (VSS) is a single Node.js (ESM) process that owns the real-time media and session side of an AI-led interview: LiveKit room/token management, session orchestration and liveness, the recording pipeline to S3, neural text-to-speech, streaming speech-to-text, and avatar session minting. Every third-party secret stays server-side. The AI interview intelligence (question planning, evaluation, feedback) lives in the Python bot-backend service; VSS still carries a legacy in-process brain that is being retired (see Legacy AI brain).

Architecture

The frontends (interviews-ui, and the live session room in smart-interview-ui) never talk to OpenAI, Deepgram, Simli, or Spatius directly. They call VSS over three transports:

Express 4 HTTP API — LiveKit routes (tokens, recording control, webhooks), voice/avatar routes (TTS, avatar session minting), session-lifecycle routes (heartbeats, resume parsing), and the remaining legacy AI-brain routes while they migrate out.
A raw WebSocket proxy — streams the candidate's 16 kHz PCM audio to Deepgram for speech-to-text; the API key stays on the server.
Socket.io — legacy WebRTC signaling and the in-app (Puppeteer-based) recording flow.

Audio and video media flow between the browser and LiveKit Cloud over WebRTC; VSS mints the access tokens and controls recording egress. Recordings are produced by LiveKit room-composite egress as MP4 into S3, with a server-side finalize step that copies them into their final location. FFmpeg is part of the deployment environment for the recording pipeline.

Legacy AI brain (being retired)

VSS predates bot-backend and still carries an in-process OpenAI "interview brain": question generation, answer evaluation, and feedback-report routes mounted alongside the media API. That brain is being migrated away under the TI-340 wave plan. New AI-interview capability (question planning, evaluation, blended technical interviews) lands in bot-backend, which is the AI interview brain in the current workflow. In this repo the rule is bug fixes and migration-shim work only. Before touching brain code, check which TI-340 wave owns the surface; it may already be scheduled for deletion.

Key components

Area	What it owns
`livekit/token.js`	LiveKit JWT issuance — security-critical; coordinate before changing grant logic.
`livekit/recording.js`, `recordingFinalize.js`, `recordingTimeline.js`, `egressUtils.js`, `webHooks.js`	The recording pipeline: egress control, finalize-to-S3, Redis timeline, LiveKit webhook handling.
`ai-interviews/services/openaiService.js`	The one centralized OpenAI client (TTS + remaining legacy-brain calls). All OpenAI calls go through it; per-surface model overrides resolve here.
`ai-interviews/routes/{question,evaluation,feedback}Routes.js`	The legacy AI brain — question generation, answer scoring, feedback reports. Being retired under TI-340; net-new logic for these surfaces lands in `bot-backend`.
`ai-interviews/routes/ttsRoutes.js`	Neural OpenAI TTS → MP3 (8 s timeout).
`ai-interviews/routes/sttStreamRoutes.js`	Deepgram Nova-3 STT WebSocket proxy (key stays server-side; 10-minute hard cap per stream).
`ai-interviews/routes/simliRoutes.js`, `spatiusRoutes.js`	Avatar session-token minting; Simli pool/queue management.
`ai-interviews/routes/heartbeatRoutes.js` + `jobs/zombieSessionCleanup.js`	Session liveness heartbeats and abandoned-session cleanup/refund.
`ai-interviews/config/env.js`	Env loading and precedence — imported first in `index.js`.
`ai-interviews/middleware/timeout.js`	Tight timeouts on external calls (Simli mint 5 s, TTS 8 s) — deliberate latency budgets.
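
The latency budgets in the last row can be pictured as a small deadline wrapper around each outbound call. This is a minimal sketch, not the contents of `ai-interviews/middleware/timeout.js`: the helper name `withTimeout`, the `BUDGETS_MS` map, and the `ETIMEDOUT` error code are all assumptions made for illustration. Only the 5 s and 8 s values come from the table.

```javascript
// Hypothetical helper, NOT the real ai-interviews/middleware/timeout.js.
// Budget values are the ones listed in the table above.
const BUDGETS_MS = { simliMint: 5_000, tts: 8_000 };

// Runs task(signal) and rejects if it has not settled within `ms`.
// The AbortSignal lets the task cancel its own outbound request.
function withTimeout(task, ms, label = 'external call') {
  const controller = new AbortController();
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      const err = new Error(`${label} exceeded ${ms} ms`);
      err.code = 'ETIMEDOUT'; // assumed error code, for callers to branch on
      reject(err);
    }, ms);
  });
  // Promise.resolve().then(...) also turns a synchronous throw into a rejection.
  const work = Promise.resolve().then(() => task(controller.signal));
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

A call site might look like `withTimeout((signal) => fetch(ttsUrl, { signal }), BUDGETS_MS.tts, 'TTS')`, where `ttsUrl` is a placeholder. Passing the signal through matters: without it the race rejects on time but the underlying request keeps running.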

Route groups: AI interview routes are mounted under /api/ai-mock-interviews/*, LiveKit routes under /livekit/*, plus /health and a legacy Socket.io/recorder surface. If the LiveKit module fails to load, /livekit/* returns 503 but the rest of the server keeps serving.
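
The "503 but keep serving" behavior described above can be sketched as a guarded optional module load. The function names `loadOptionalModule` and `guard`, the response body shape, and the idea of passing a `loader` callback (something like `() => import('./livekit/index.js')`, a path this document does not confirm) are assumptions for illustration only.

```javascript
// Hypothetical sketch: `loader` stands in for a dynamic import of the LiveKit module.
async function loadOptionalModule(name, loader) {
  try {
    return { name, available: true, module: await loader() };
  } catch (err) {
    // The failure is logged and recorded, never rethrown, so startup continues.
    console.warn(`[${name}] failed to load, its routes will return 503: ${err.message}`);
    return { name, available: false, module: null };
  }
}

// Builds an Express-style (req, res, next) handler that answers 503 while
// the module is unavailable and passes through otherwise.
function guard(state) {
  return (req, res, next) => {
    if (state.available) return next();
    res.status(503).json({ error: `${state.name} unavailable` });
  };
}
```

Under these assumptions the server would mount `guard(state)` ahead of the `/livekit/*` router, so a failed import affects only that prefix while `/health` and the AI interview routes keep responding.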

Local development

Prerequisites

Node.js 20 LTS or newer — ESM project ("type": "module"), no engines pin.
PostgreSQL reachable via the DB_* (or RDS_*) env vars — required for session/feedback persistence.
Redis — only needed for the LiveKit recording pipeline; the AI interview loop boots without it.
API keys per the table below. For a minimal "does it start" boot, only OPENAI_API_KEY is strictly needed — everything else degrades gracefully.

Commands

cp .env.example .env   # fill in your own values — NEVER commit .env / .env.local
npm install
npm run dev            # nodemon (auto-restart)
npm start              # prod-style: node index.js
npm test               # vitest run — run once, do not leave watch mode running

There is no build step (npm run build is a no-op). A perf suite exists via npm run test:perf. The server listens on PORT (set it in your .env).

Env file precedence (lowest → highest): .env → .env.{development|production} → .env.local (gitignored dev override), loaded by ai-interviews/config/env.js before any other import.
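
The precedence order above amounts to a layered merge where later files win. The sketch below shows only that layering; the real loader in `ai-interviews/config/env.js` may parse and merge differently. The function names, the tiny parser, and the choice to treat any non-production `NODE_ENV` as development are assumptions.

```javascript
// Hypothetical sketch of the layering only, NOT the real env.js.
function envFileOrder(nodeEnv) {
  // Assumption: anything other than "production" is treated as development.
  const mode = nodeEnv === 'production' ? 'production' : 'development';
  return ['.env', `.env.${mode}`, '.env.local']; // lowest -> highest
}

// Minimal KEY=VALUE parser: skips blanks and # comments, strips one pair of quotes.
function parseEnv(text) {
  const out = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (!line || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq <= 0) continue; // no key before "=", ignore the line
    const key = line.slice(0, eq).trim();
    out[key] = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, '$2');
  }
  return out;
}

// Later layers overwrite earlier ones, giving lowest -> highest precedence.
function mergeLayers(layers) {
  return layers.reduce((acc, layer) => ({ ...acc, ...layer }), {});
}
```

With three layers that each set the same key, the value from `.env.local` is the one that survives, which is why it works as a personal override that never needs to be committed.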

Environment variables

Names only — use your own values; never commit real secrets.

Variable	Purpose
`OPENAI_API_KEY`	Required for any AI surface — questions, evaluation, feedback, TTS fail without it.
`DB_*` / `RDS_*`	PostgreSQL connection for session/feedback persistence.
`PORT`	HTTP listen port, e.g. `<PORT>`.
`NODE_ENV`	`development` / `production`.
`FRONTEND_URL`	Allowed CORS origin, e.g. https://develop.theinterviews.ai.
`LIVEKIT_API_KEY` / `LIVEKIT_API_SECRET` / `LIVEKIT_HOST`	LiveKit credentials + host (e.g. `<LIVEKIT_WS_URL>`) — needed to mint room tokens for live interviews.
`REDIS_HOST`	Redis for recording state/timeline; unset disables that pipeline.
`AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_BUCKET_NAME`	Recording storage, e.g. `<S3_BUCKET>`; unset breaks egress finalize.
`DEEPGRAM_API_KEY` (+ `STT_BACKEND=deepgram`)	Streaming STT; unset → the STT stream endpoint returns `503`.
`SIMLI_API_KEY` + `SIMLI_FACE_ID`	Simli avatar session minting; unset → `503`, frontend falls back to the lite orb.
`SPATIUS_API_KEY` (+ `SPATIUS_REGION`, `SPATIUS_APP_ID`)	Spatius avatar session minting; unset → `503`, frontend falls back to the lite orb.
`INTERNAL_SERVICE_JWT_SECRET`	Auth for the zombie-cleanup quota-refund call to user-management.
`OPENAI_QUESTION_MODEL`, `OPENAI_FEEDBACK_MODEL`, `OPENAI_TTS_MODEL`, …	Per-surface model overrides resolved in `openaiService.js`; legacy `OPENAI_MODEL` is a fallback.

The full annotated list lives in the repo's .env.example (placeholders only).
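
The per-surface model overrides in the last table row follow an "override first, legacy fallback second" lookup. The sketch below illustrates that lookup for two surfaces; the real resolution lives in `openaiService.js` and is not shown here. The `resolveModel` name, the surface keys, and the decision to throw when nothing is configured are assumptions.

```javascript
// Hypothetical sketch, NOT the real openaiService.js.
// The variable names come from the table; the surface keys are illustrative.
const SURFACE_VARS = {
  question: 'OPENAI_QUESTION_MODEL',
  feedback: 'OPENAI_FEEDBACK_MODEL',
};

function resolveModel(surface, env) {
  const varName = SURFACE_VARS[surface];
  if (!varName) throw new Error(`unknown surface: ${surface}`);
  // Per-surface override first, then the legacy OPENAI_MODEL fallback.
  const model = env[varName] || env.OPENAI_MODEL;
  // Assumption: fail loudly rather than pick a default silently.
  if (!model) throw new Error(`no model configured: set ${varName} or OPENAI_MODEL`);
  return model;
}
```

Taking `env` as a parameter instead of reading `process.env` directly keeps the lookup testable and makes the fallback path explicit, which is useful given the production warning about the legacy fallback.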

Gotchas

One OpenAI client, ever. Don't instantiate OpenAI clients in route files — everything goes through ai-interviews/services/openaiService.js. And don't point the legacy OPENAI_MODEL fallback at a small/cheap model in production: it collapses question diversity across sessions for the same resume.
Env loads first. import './ai-interviews/config/env.js' is the first line of index.js. Anything reading process.env at module-init time depends on that ordering.
Timeouts are latency budgets, not arbitrary. Simli mint 5 s, TTS 8 s (ai-interviews/middleware/timeout.js). Tune deliberately.
livekit/token.js is security-critical. It mints the JWTs that grant room access — coordinate before touching grant logic.
Secrets stay server-side. The STT WebSocket proxy and avatar-session routes exist precisely so DEEPGRAM_API_KEY, SIMLI_API_KEY, and SPATIUS_API_KEY never reach the browser or logs.
Graceful degradation, not crashes. Missing optional integrations return 503 and the frontend falls back (lite orb, no STT). If LiveKit fails to load, only /livekit/* goes dark.
Idempotent liveness. Heartbeat-stop and zombie cleanup treat unknown sessions as no-ops; the client keeps beating across WebRTC reconnects, so a transient drop under 60 s never reaps a live session.
Real-time is prod-touching. A live interview can't be rolled back mid-session. Isolate avatar/streaming layers defensively — an observed failure cascade went Simli timeout → CDN fetch failure → WebGL context loss. One layer's failure must not take down the others.
Watch the migration waves. Before changing AI "brain" code, check which TI-340 wave owns that surface — it may already be scheduled for deletion in favor of bot-backend.
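
The idempotent liveness rule above can be illustrated with a small in-memory registry. This is a sketch under stated assumptions, not the code in `heartbeatRoutes.js` or `jobs/zombieSessionCleanup.js`: the class name, the in-memory `Map`, and the exact 60 s threshold are hypothetical, and the quota-refund call is left out entirely.

```javascript
// Hypothetical in-memory sketch; the real state store and refund call are not shown.
const STALE_AFTER_MS = 60_000; // assumed threshold, based on the "under 60 s" note above

class HeartbeatRegistry {
  constructor(now = () => Date.now()) {
    this.now = now; // injectable clock, so tests do not need to sleep
    this.lastBeat = new Map(); // sessionId -> timestamp of last heartbeat
  }

  beat(sessionId) {
    this.lastBeat.set(sessionId, this.now());
  }

  // Idempotent: stopping an unknown or already-stopped session is a no-op.
  stop(sessionId) {
    return this.lastBeat.delete(sessionId);
  }

  // Removes and returns sessions whose last beat is older than the threshold.
  reap() {
    const cutoff = this.now() - STALE_AFTER_MS;
    const reaped = [];
    for (const [id, ts] of this.lastBeat) {
      if (ts < cutoff) {
        this.lastBeat.delete(id);
        reaped.push(id);
      }
    }
    return reaped;
  }
}
```

Because a client that keeps beating through a reconnect refreshes its timestamp, a short gap never crosses the cutoff, and a second `stop` or a reap pass over an already-removed session changes nothing.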
