bot-backend
Purpose
bot-backend is the AI interview brain for TheInterviews.ai: a Python 3.13 / FastAPI service plus a livekit-agents worker that hosts the voice-interview intelligence — question planning, the voice loop (STT → LLM → TTS), and answer evaluation — behind swappable capability interfaces. It powers AI-bot interviews in production, including the blended technical interview format, and is absorbing the remaining surfaces of the legacy OpenAI interviewer inside video-streaming-server (VSS) wave by wave under the TI-340 migration plan.
Architecture
One repo ships two runtimes:
| Runtime | Entry | Role |
|---|---|---|
| FastAPI app | bot_backend_ai.main:app | The brain's service surface. Serves health/metrics, an internal upstream probe, question planning (including the blended technical interview arc), and the evaluation routes (POST /v1/evaluate, POST /v1/evaluate-code) that took over from VSS's bot-evaluate / bot-evaluate-code. Posts results back to user-management's internal API. |
| LiveKit agent worker | python livekit_agent.py dev | The legacy-style real-time worker built on livekit-agents. Joins a LiveKit room as a silent participant, transcribes audio, and generates a post-meeting summary it ships to the backend. This path is being decomposed into the capability interfaces. |
The service holds no direct database connection. Interview results live in the ai_bot_interview_sessions Postgres table — the schema is owned bot-side, but the actual row writes are performed by user-management in response to bot-backend's POSTs to its JWT-gated /api/internal/* endpoints. user-management then reads the table back through a read-only Hibernate entity (BotInterviewSession, @Immutable) to feed Profile Card ratings. Schema changes to that table are therefore cross-service.
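The internal calls above are authenticated with a service JWT, which the Key components section describes as a short-lived 60-second HS256 token. The following is a minimal stdlib sketch of how such a token can be minted and checked, not the repo's auth.py: the function names and the `iss` claim are assumptions for illustration, and only the HS256 algorithm and the 60-second lifetime come from this README.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: str) -> str:
    """HMAC-SHA256 over the 'header.payload' string, base64url-encoded."""
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256)
    return _b64url(digest.digest())


def mint_service_jwt(secret: str, ttl_seconds: int = 60, now: Optional[int] = None) -> str:
    """Build an HS256 JWT that expires ttl_seconds after it is issued."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    # The "iss" value is a hypothetical claim; the real claim set is not documented here.
    payload = {"iss": "bot-backend", "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_service_jwt(token: str, secret: str, now: Optional[int] = None) -> bool:
    """Return True only if the signature matches and the token has not expired."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, signature_b64):
        return False
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    current = int(time.time()) if now is None else now
    return current < payload["exp"]
```

Because the signature depends on the exact secret bytes, a token minted with one secret fails verification against any other, which is why the secret has to match on both services.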
The migration is wave-based, so not every capability lives here yet. The LLM surfaces — question planning (including blended technical interviews) and evaluation (/v1/evaluate*) — serve production traffic, and the worker transcribes via the LiveKit OpenAI plugin; the Deepgram STT, Simli avatar, and SSE audio-out vendors are Wave-4/5 stubs that raise NotImplementedError, with video-streaming-server still fronting those media surfaces until their waves land.
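A rough sketch of what a capability interface with a not-yet-landed vendor behind it can look like. Every name here (STTClient, DeepgramSTTStub, EchoSTT, select_stt_client) is hypothetical and chosen for illustration; the real interfaces and factories in this repo may have different methods and signatures.

```python
from abc import ABC, abstractmethod


class STTClient(ABC):
    """Hypothetical capability interface for speech-to-text."""

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        """Return the transcript for a chunk of audio."""


class DeepgramSTTStub(STTClient):
    """Placeholder vendor: satisfies the interface but is not implemented yet."""

    def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError("this vendor lands in a later wave")


class EchoSTT(STTClient):
    """Toy vendor that exists only to show callers depend on the interface."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


def select_stt_client(vendor: str) -> STTClient:
    """Pick a vendor by name; callers only ever see the STTClient type."""
    vendors = {"deepgram": DeepgramSTTStub, "echo": EchoSTT}
    if vendor not in vendors:
        raise ValueError(f"unknown STT vendor: {vendor}")
    return vendors[vendor]()
```

The point of the shape is that promoting a stub to a live vendor changes one class body and leaves every caller untouched.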
Key components
- bot_backend_ai/main.py: FastAPI app factory (create_app()) with request-context and metrics middleware.
- bot_backend_ai/api/: route modules health.py (/healthz, /readyz, /metrics), internal.py (upstream probe of user-management), and eval.py (/v1/evaluate verbal + /v1/evaluate-code coding, mirroring VSS).
- bot_backend_ai/infra/: the plumbing, namely settings.py (typed pydantic-settings config), auth.py (mints a short-lived 60-second HS256 service JWT for /api/internal/* calls), be_client.py (retrying async httpx client), be_eval_results.py (posts eval results with an idempotency key), factories.py (wires interfaces to vendors), plus structlog logging and a Prometheus latency histogram.
- bot_backend_ai/interfaces/: capability ABCs (LLM / STT / TTS / Avatar / AudioSink). This is the swap seam: vendors plug in behind these without touching callers.
- bot_backend_ai/vendors/: concrete providers. OpenAI chat and OpenAI TTS are live; Deepgram STT, Simli avatar token, and the SSE audio sink are stubs until Waves 4/5.
- bot_backend_ai/prompts/eval/: LLM prompt builders for coding and verbal evaluation (with moderation fallbacks).
- livekit_agent.py: the legacy worker. Joins rooms named meeting-{id}, extracts the meeting id from the room name, and can be suppressed per room with a -noagent suffix; auto-join is additionally gated by a backend flag.
- bot_backend_ai/tests/: unit / integration / contract suites for the FastAPI app. The legacy worker has no tests.
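The worker's room-name handling described above could be sketched roughly as follows. This is an illustration under assumptions, not the code in livekit_agent.py: the function names are hypothetical, the suffix is assumed to follow the meeting id directly, and the backend flag is modeled as a plain boolean.

```python
from typing import Optional

ROOM_PREFIX = "meeting-"
NO_AGENT_SUFFIX = "-noagent"


def parse_room_name(room_name: str) -> tuple[Optional[str], bool]:
    """Return (meeting_id, agent_allowed) for a LiveKit room name.

    meeting_id is None when the name does not follow the meeting-{id} pattern.
    agent_allowed is False when the name carries the -noagent suffix.
    """
    agent_allowed = not room_name.endswith(NO_AGENT_SUFFIX)
    base = room_name if agent_allowed else room_name[: -len(NO_AGENT_SUFFIX)]
    if not base.startswith(ROOM_PREFIX):
        return None, agent_allowed
    meeting_id = base[len(ROOM_PREFIX):]
    return (meeting_id or None), agent_allowed


def should_auto_join(room_name: str, backend_flag_enabled: bool) -> bool:
    """Join only named meetings that are not suppressed and are enabled by the backend."""
    meeting_id, agent_allowed = parse_room_name(room_name)
    return meeting_id is not None and agent_allowed and backend_flag_enabled
```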
Local development
Prerequisites
- Python 3.13 (.python-version pins it; uv will install a managed 3.13 if absent).
- uv: the package manager and lockfile tool for this repo (see Astral's installation docs).
- An OpenAI API key (required for the app to boot).
- A locally running user-management backend reachable at BACKEND_API_URL for the eval-result POST path.
- (Optional, worker only) a LiveKit server or cloud project plus LIVEKIT_* credentials.
Install and configure
uv sync # FastAPI app deps from pyproject.toml
uv sync --extra dev # add pytest, ruff, mypy
cp .env.example .env.local # then fill in placeholders
The two runtimes load env differently: the FastAPI app uses pydantic-settings (reads .env, then .env.local); the worker uses python-dotenv (reads .env.{ENV}, selected by the ENV var). Note that .env.example is written for the worker and omits two keys the app side needs: add INTERNAL_SERVICE_JWT_SECRET (required to boot) and, if you are building the avatar client, SIMLI_API_KEY.
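To make the layering concrete, here is a small stdlib sketch of env-file resolution. It assumes the usual pydantic-settings precedence (later files override earlier ones, and real environment variables override both); it is not the repo's loader. parse_env_file and resolve_settings are hypothetical names, and the parser handles only simple KEY=VALUE lines.

```python
from pathlib import Path
from typing import Iterable, Mapping


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values  # a missing file is skipped rather than treated as an error
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_settings(env_files: Iterable[Path], process_env: Mapping[str, str]) -> dict[str, str]:
    """Merge env files in order, then let the process environment win."""
    merged: dict[str, str] = {}
    for path in env_files:
        merged.update(parse_env_file(path))  # later files override earlier ones
    merged.update(process_env)  # real environment variables take priority
    return merged
```

A real settings class reads only the fields it declares; this sketch merges every key it sees, which is enough to show the ordering.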
Run
# FastAPI app (health: /healthz, ready: /readyz, metrics: /metrics)
uv run uvicorn bot_backend_ai.main:app --reload --port 8000
# LiveKit agent worker
ENV=local uv run python livekit_agent.py dev # or: ./run_agent.sh
OpenAPI docs are intentionally disabled (docs_url=None) — don't go looking for /docs.
Test, lint, types
uv run pytest # full suite (FastAPI app only)
uv run ruff check .
uv run mypy bot_backend_ai # strict mode
Environment variables
All values below are placeholders — never commit real values.
| Variable | Purpose |
|---|---|
| OPENAI_API_KEY | OpenAI key for chat (eval) and TTS. Required for the app to boot. Example: <OPENAI_API_KEY>. |
| BACKEND_API_URL | Base URL of the user-management API the service posts results to. Example: <BACKEND_API_URL>. |
| INTERNAL_SERVICE_JWT_SECRET | Shared HS256 secret (min 32 chars) for service-to-service JWTs; must byte-match user-management's value. Example: <INTERNAL_SERVICE_JWT_SECRET>. |
| ENV | Environment name selecting which env file the worker loads (e.g. local). |
| LOG_LEVEL | Logging verbosity for the app. |
| FEEDBACK_MODEL | Overrides the default LLM used for evaluation feedback. |
| DEEPGRAM_API_KEY | Deepgram STT key, only needed once the STT client is built. Example: <DEEPGRAM_API_KEY>. |
| SIMLI_API_KEY | Simli avatar key, only needed once the avatar client is built. Example: <SIMLI_API_KEY>. |
| LIVEKIT_URL | LiveKit server WebSocket endpoint for the worker. Example: <LIVEKIT_WS_URL>. |
| LIVEKIT_API_KEY | LiveKit API key for the worker. Example: <LIVEKIT_API_KEY>. |
| LIVEKIT_API_SECRET | LiveKit API secret for the worker. Example: <LIVEKIT_API_SECRET>. |
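A small preflight check can catch the two boot-blocking variables before starting the app. This is a hypothetical helper (check_app_env is not part of the repo) that encodes only what the table states: OPENAI_API_KEY must be present, and INTERNAL_SERVICE_JWT_SECRET must be present and at least 32 characters long.

```python
from typing import Mapping

MIN_JWT_SECRET_LENGTH = 32


def check_app_env(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems: list[str] = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is missing")
    secret = env.get("INTERNAL_SERVICE_JWT_SECRET", "")
    if not secret:
        problems.append("INTERNAL_SERVICE_JWT_SECRET is missing")
    elif len(secret) < MIN_JWT_SECRET_LENGTH:
        problems.append(
            f"INTERNAL_SERVICE_JWT_SECRET is shorter than {MIN_JWT_SECRET_LENGTH} characters"
        )
    return problems
```

Calling `check_app_env(os.environ)` after loading the env files gives a quick readout. It cannot detect a secret that differs from user-management's value; that only surfaces as 401 responses on internal calls.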
Gotchas
- The Python 3.13 pin is load-bearing. livekit-agents ≥ 1.3.0 does not support Python 3.14; the repo pins livekit-agents==1.2.18 and .python-version to 3.13. Don't bump either without checking the comments in requirements.txt.
- Two dependency manifests, on purpose. requirements.txt carries older pins for the legacy worker and its deploy path; the FastAPI app builds from pyproject.toml via uv. They are intentionally separate; do not "reconcile" them.
- Two env loaders, two contracts. The app reads .env / .env.local (pydantic-settings); the worker reads .env.{ENV} (python-dotenv). Keep both fed, or one runtime silently misconfigures.
- .env.example is incomplete for the app. It's missing INTERNAL_SERVICE_JWT_SECRET: without it the app won't boot, and with a mismatched value every internal call returns 401.
- Most vendors are stubs. Only OpenAI chat and TTS are wired. Calling build_stt_client() / build_avatar_client() without the corresponding key raises at build time.
- VSS parity is the spec. The eval routes deliberately mirror VSS behavior (same feedback model, 200-char feedback clamp, "thank you…" stripping, keyword/length score gates). A behavior difference vs VSS is a bug unless a wave spec says otherwise.
- BackendClient is an async context manager. Use async with; any other usage raises.
- Idempotency matters. post_eval_result generates one idempotencyKey per logical post, outside the retry loop, so retries don't double-append conversation history.
- Wave-gated, not all-live. Question planning and evaluation serve production; STT/avatar/audio-out do not yet. Before assuming a change here has live impact, confirm the surface you're touching has been wave-promoted.
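The async-context-manager and idempotency gotchas can be illustrated together. The sketch below uses hypothetical stand-ins (SketchBackendClient, post_with_retries) and an injected send function instead of real HTTP; it is not the repo's BackendClient or post_eval_result. What it shows is the key being created once, before the retry loop, so every attempt carries the same value.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable


class SketchBackendClient:
    """Hypothetical client that only works inside 'async with'."""

    def __init__(self, send: Callable[[dict[str, Any]], Awaitable[int]]) -> None:
        self._send = send
        self._open = False

    async def __aenter__(self) -> "SketchBackendClient":
        self._open = True
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        self._open = False

    async def post(self, body: dict[str, Any]) -> int:
        if not self._open:
            raise RuntimeError("use the client inside 'async with'")
        return await self._send(body)


async def post_with_retries(
    client: SketchBackendClient, payload: dict[str, Any], max_attempts: int = 3
) -> int:
    """Retry on connection errors while reusing one idempotency key."""
    # Generated once per logical post, outside the loop, so the receiver
    # can recognise repeated deliveries of the same result.
    body = {**payload, "idempotencyKey": str(uuid.uuid4())}
    for attempt in range(1, max_attempts + 1):
        try:
            return await client.post(body)
        except ConnectionError:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(0)  # placeholder for a real backoff delay
    raise ValueError("max_attempts must be at least 1")
```

Had the key been generated inside the loop, each retry would look like a new post to the receiver, which is the double-append failure the gotcha warns about.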