bot-backend

Purpose

bot-backend is the AI interview brain for TheInterviews.ai: a Python 3.13 / FastAPI service plus a livekit-agents worker that hosts the voice-interview intelligence — question planning, the voice loop (STT → LLM → TTS), and answer evaluation — behind swappable capability interfaces. It powers AI-bot interviews in production, including the blended technical interview format, and is absorbing the remaining surfaces of the legacy OpenAI interviewer inside video-streaming-server (VSS) wave by wave under the TI-340 migration plan.

Architecture

One repo ships two runtimes:

Runtime	Entry	Role
FastAPI app	`bot_backend_ai.main:app`	The brain's service surface. Serves health/metrics, an internal upstream probe, question planning (including the blended technical interview arc), and the evaluation routes (`POST /v1/evaluate`, `POST /v1/evaluate-code`) that took over from VSS's `bot-evaluate` / `bot-evaluate-code`. Posts results back to user-management's internal API.
LiveKit agent worker	`python livekit_agent.py dev`	The legacy-style real-time worker built on `livekit-agents`. Joins a LiveKit room as a silent participant, transcribes audio, and generates a post-meeting summary it ships to the backend. This path is being decomposed into the capability interfaces.

The service holds no direct database connection. Interview results live in the ai_bot_interview_sessions Postgres table — the schema is owned bot-side, but the actual row writes are performed by user-management in response to bot-backend's POSTs to its JWT-gated /api/internal/* endpoints. user-management then reads the table back through a read-only Hibernate entity (BotInterviewSession, @Immutable) to feed Profile Card ratings. Schema changes to that table are therefore cross-service.

The migration is wave-based, so not every capability lives here yet. The LLM surfaces — question planning (including blended technical interviews) and evaluation (/v1/evaluate*) — serve production traffic, and the worker transcribes via the LiveKit OpenAI plugin; the Deepgram STT, Simli avatar, and SSE audio-out vendors are Wave-4/5 stubs that raise NotImplementedError, with video-streaming-server still fronting those media surfaces until their waves land.
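
The stub pattern described above can be sketched as follows. All class and function names here (STTClient, DeepgramSTTStub, EchoSTT, build_stt) are hypothetical, and the real interfaces are likely async. The point is only the shape: callers depend on the abstract interface, and an unbuilt vendor fails loudly with NotImplementedError.

```python
from abc import ABC, abstractmethod


class STTClient(ABC):
    """Hypothetical capability interface: turn audio bytes into text."""

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        """Return the transcript for one chunk of audio."""


class DeepgramSTTStub(STTClient):
    """Placeholder vendor: satisfies the interface but is not built yet."""

    def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError("Deepgram STT lands in a later wave")


class EchoSTT(STTClient):
    """Toy vendor for this sketch: treats the bytes as UTF-8 text."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


def build_stt(vendor: str) -> STTClient:
    """Hypothetical factory: the only place that names concrete vendors."""
    registry: dict[str, type[STTClient]] = {
        "deepgram": DeepgramSTTStub,
        "echo": EchoSTT,
    }
    return registry[vendor]()
```

Swapping a vendor then means changing the factory's registry entry. Code that calls transcribe() stays untouched.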

Key components

bot_backend_ai/main.py — FastAPI app factory (create_app()) with request-context and metrics middleware.
bot_backend_ai/api/ — route modules: health.py (/healthz, /readyz, /metrics), internal.py (upstream probe of user-management), eval.py (/v1/evaluate verbal + /v1/evaluate-code coding, mirroring VSS).
bot_backend_ai/infra/ — the plumbing: settings.py (typed pydantic-settings config), auth.py (mints a short-lived 60-second HS256 service JWT for /api/internal/* calls), be_client.py (retrying async httpx client), be_eval_results.py (posts eval results with an idempotency key), factories.py (wires interfaces to vendors), plus structlog logging and a Prometheus latency histogram.
bot_backend_ai/interfaces/ — capability ABCs (LLM / STT / TTS / Avatar / AudioSink). This is the swap seam: vendors plug in behind these without touching callers.
bot_backend_ai/vendors/ — concrete providers. OpenAI chat and OpenAI TTS are live; Deepgram STT, Simli avatar token, and the SSE audio sink are stubs until Waves 4/5.
bot_backend_ai/prompts/eval/ — LLM prompt builders for coding and verbal evaluation (with moderation fallbacks).
livekit_agent.py — the legacy worker. Joins rooms named meeting-{id} and extracts the meeting id from the room name. It can be suppressed per room with a -noagent suffix, and auto-join is additionally gated by a backend flag.
bot_backend_ai/tests/ — unit / integration / contract suites for the FastAPI app. The legacy worker has no tests.

Local development

Prerequisites

Python 3.13 (.python-version pins it; uv will install a managed 3.13 if absent).
uv — the package manager and lockfile tool for this repo (see Astral's installation docs).
An OpenAI API key (required for the app to boot).
A locally running user-management backend reachable at BACKEND_API_URL for the eval-result POST path.
(Optional, worker only) a LiveKit server or cloud project plus LIVEKIT_* credentials.

Install and configure

uv sync                # FastAPI app deps from pyproject.toml
uv sync --extra dev    # add pytest, ruff, mypy
cp .env.example .env.local   # then fill in placeholders

The two runtimes load env differently: the FastAPI app uses pydantic-settings (reads .env, then .env.local); the worker uses python-dotenv (reads .env.{ENV}, selected by the ENV variable). Note that .env.example is written for the worker and omits two keys the app uses: INTERNAL_SERVICE_JWT_SECRET, which is required to boot, and SIMLI_API_KEY, which is needed only when building the avatar client. Add them yourself.

Run

# FastAPI app (health: /healthz, ready: /readyz, metrics: /metrics)
uv run uvicorn bot_backend_ai.main:app --reload --port 8000

# LiveKit agent worker
ENV=local uv run python livekit_agent.py dev    # or: ./run_agent.sh

OpenAPI docs are intentionally disabled (docs_url=None) — don't go looking for /docs.

Test, lint, types

uv run pytest                          # full suite (FastAPI app only)
uv run ruff check .
uv run mypy bot_backend_ai             # strict mode

Environment variables

All values below are placeholders — never commit real values.

Variable	Purpose
`OPENAI_API_KEY`	OpenAI key for chat (eval) and TTS. Required for the app to boot. Example: `<OPENAI_API_KEY>`.
`BACKEND_API_URL`	Base URL of the user-management API the service posts results to. Example: `<BACKEND_API_URL>`.
`INTERNAL_SERVICE_JWT_SECRET`	Shared HS256 secret (min 32 chars) for service-to-service JWTs; must byte-match user-management's value. Example: `<INTERNAL_SERVICE_JWT_SECRET>`.
`ENV`	Environment name selecting which env file the worker loads (e.g. `local`).
`LOG_LEVEL`	Logging verbosity for the app.
`FEEDBACK_MODEL`	Overrides the default LLM used for evaluation feedback.
`DEEPGRAM_API_KEY`	Deepgram STT key — only needed once the STT client is built. Example: `<DEEPGRAM_API_KEY>`.
`SIMLI_API_KEY`	Simli avatar key — only needed once the avatar client is built. Example: `<SIMLI_API_KEY>`.
`LIVEKIT_URL`	LiveKit server WebSocket endpoint for the worker. Example: `<LIVEKIT_WS_URL>`.
`LIVEKIT_API_KEY`	LiveKit API key for the worker. Example: `<LIVEKIT_API_KEY>`.
`LIVEKIT_API_SECRET`	LiveKit API secret for the worker. Example: `<LIVEKIT_API_SECRET>`.
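
A quick pre-flight check of the app's environment might look like the sketch below. It is not part of the repo: the function name check_app_env is hypothetical, and the required set is limited to the two variables this document says the app cannot boot without.

```python
from collections.abc import Mapping

# Only the variables this document states are required for the app to boot.
REQUIRED_FOR_APP = ("OPENAI_API_KEY", "INTERNAL_SERVICE_JWT_SECRET")
MIN_JWT_SECRET_CHARS = 32


def check_app_env(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the env looks usable."""
    problems = [
        f"{name} is missing or empty" for name in REQUIRED_FOR_APP if not env.get(name)
    ]
    secret = env.get("INTERNAL_SERVICE_JWT_SECRET", "")
    if secret and len(secret) < MIN_JWT_SECRET_CHARS:
        problems.append(
            f"INTERNAL_SERVICE_JWT_SECRET must be at least {MIN_JWT_SECRET_CHARS} characters"
        )
    return problems
```

Calling check_app_env(os.environ) after loading the env file lists anything missing. It cannot detect a secret that differs from user-management's value; that only surfaces as 401 responses at call time.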

Gotchas

The Python 3.13 pin is load-bearing. livekit-agents ≥ 1.3.0 does not support Python 3.14; the repo pins livekit-agents==1.2.18 and .python-version to 3.13. Don't bump either without checking the comments in requirements.txt.
Two dependency manifests, on purpose. requirements.txt carries older pins for the legacy worker and its deploy path; the FastAPI app builds from pyproject.toml via uv. They are intentionally separate — do not "reconcile" them.
Two env loaders, two contracts. The app reads .env / .env.local (pydantic-settings); the worker reads .env.{ENV} (python-dotenv). Keep both fed, or one runtime silently misconfigures.
.env.example is incomplete for the app. It's missing INTERNAL_SERVICE_JWT_SECRET — without it the app won't boot, and with a mismatched value every internal call returns 401.
Most vendors are stubs. Only OpenAI chat and TTS are wired. Calling build_stt_client() / build_avatar_client() without the corresponding key raises at build time.
VSS parity is the spec. The eval routes deliberately mirror VSS behavior (same feedback model, 200-char feedback clamp, "thank you…" stripping, keyword/length score gates). A behavior difference vs VSS is a bug unless a wave spec says otherwise.
BackendClient is an async context manager. Use async with; any other usage raises.
Idempotency matters. post_eval_result generates one idempotencyKey per logical post, outside the retry loop, so retries don't double-append conversation history.
Wave-gated, not all-live. Question planning and evaluation serve production; STT/avatar/audio-out do not yet. Before assuming a change here has live impact, confirm the surface you're touching has been wave-promoted.
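
The idempotency gotcha is easiest to see in code. The sketch below is not post_eval_result itself: post_with_retries and the send callable are hypothetical, and the real client's retry policy and backoff are not described here. What it shows is the placement: the key is created once, before the loop, so every retry carries the same value.

```python
import asyncio
import uuid
from collections.abc import Awaitable, Callable

# Hypothetical transport: sends one JSON-like body, raises ConnectionError on failure.
Send = Callable[[dict], Awaitable[None]]


async def post_with_retries(send: Send, payload: dict, attempts: int = 3) -> str:
    """Send payload up to `attempts` times, reusing one idempotency key throughout."""
    # Generated once per logical post, outside the retry loop.
    key = str(uuid.uuid4())
    body = {**payload, "idempotencyKey": key}
    last_error: ConnectionError | None = None
    for _ in range(attempts):
        try:
            await send(body)
            return key
        except ConnectionError as exc:
            last_error = exc
            await asyncio.sleep(0)  # stand-in for a real backoff delay
    raise RuntimeError("all attempts failed") from last_error
```

If the key were generated inside the loop, a request that succeeded on the server but timed out on the client would be retried under a new key, and the receiver could not tell the retry from a second post.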
