bot-backend

Purpose

bot-backend is the AI interview brain for TheInterviews.ai: a Python 3.13 / FastAPI service plus a livekit-agents worker that hosts the voice-interview intelligence — question planning, the voice loop (STT → LLM → TTS), and answer evaluation — behind swappable capability interfaces. It powers AI-bot interviews in production, including the blended technical interview format, and is absorbing the remaining surfaces of the legacy OpenAI interviewer inside video-streaming-server (VSS) wave by wave under the TI-340 migration plan.

Architecture

One repo ships two runtimes:

| Runtime | Entry | Role |
| --- | --- | --- |
| FastAPI app | bot_backend_ai.main:app | The brain's service surface. Serves health/metrics, an internal upstream probe, question planning (including the blended technical interview arc), and the evaluation routes (POST /v1/evaluate, POST /v1/evaluate-code) that took over from VSS's bot-evaluate / bot-evaluate-code. Posts results back to user-management's internal API. |
| LiveKit agent worker | python livekit_agent.py dev | The legacy-style real-time worker built on livekit-agents. Joins a LiveKit room as a silent participant, transcribes audio, and generates a post-meeting summary it ships to the backend. This path is being decomposed into the capability interfaces. |

The service holds no direct database connection. Interview results live in the ai_bot_interview_sessions Postgres table — the schema is owned bot-side, but the actual row writes are performed by user-management in response to bot-backend's POSTs to its JWT-gated /api/internal/* endpoints. user-management then reads the table back through a read-only Hibernate entity (BotInterviewSession, @Immutable) to feed Profile Card ratings. Schema changes to that table are therefore cross-service.

The migration is wave-based, so not every capability lives here yet. The LLM surfaces — question planning (including blended technical interviews) and evaluation (/v1/evaluate*) — serve production traffic, and the worker transcribes via the LiveKit OpenAI plugin; the Deepgram STT, Simli avatar, and SSE audio-out vendors are Wave-4/5 stubs that raise NotImplementedError, with video-streaming-server still fronting those media surfaces until their waves land.
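
The swap seam and the stub behavior described above can be sketched roughly as follows. This is an illustration only, not the repo's code: STTClient, StubDeepgramSTT, EchoSTT, build_stt, and handle_turn are hypothetical names, and the real interfaces may well be async and carry more methods.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class STTClient(ABC):
    """Hypothetical capability interface: callers depend only on this."""

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        """Return the transcript for a chunk of audio."""


class StubDeepgramSTT(STTClient):
    """Placeholder vendor for a wave that has not landed yet."""

    def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError("Deepgram STT is not wired yet")


class EchoSTT(STTClient):
    """Toy vendor used only to show the swap; it decodes bytes as UTF-8."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


def build_stt(vendor: str) -> STTClient:
    """Hypothetical factory: the one place that maps a name to a vendor."""
    vendors: dict[str, type[STTClient]] = {
        "deepgram": StubDeepgramSTT,
        "echo": EchoSTT,
    }
    return vendors[vendor]()


def handle_turn(stt: STTClient, audio: bytes) -> str:
    """Caller code: unchanged no matter which vendor sits behind the seam."""
    return stt.transcribe(audio).strip()
```

Swapping vendors changes only the argument to the factory; handle_turn never learns which provider it is talking to, and a stubbed vendor fails loudly the first time it is actually called.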

Key components

  • bot_backend_ai/main.py — FastAPI app factory (create_app()) with request-context and metrics middleware.
  • bot_backend_ai/api/ — route modules: health.py (/healthz, /readyz, /metrics), internal.py (upstream probe of user-management), eval.py (/v1/evaluate verbal + /v1/evaluate-code coding, mirroring VSS).
  • bot_backend_ai/infra/ — the plumbing: settings.py (typed pydantic-settings config), auth.py (mints a short-lived 60-second HS256 service JWT for /api/internal/* calls), be_client.py (retrying async httpx client), be_eval_results.py (posts eval results with an idempotency key), factories.py (wires interfaces to vendors), plus structlog logging and a Prometheus latency histogram.
  • bot_backend_ai/interfaces/ — capability ABCs (LLM / STT / TTS / Avatar / AudioSink). This is the swap seam: vendors plug in behind these without touching callers.
  • bot_backend_ai/vendors/ — concrete providers. OpenAI chat and OpenAI TTS are live; Deepgram STT, Simli avatar token, and the SSE audio sink are stubs until Waves 4/5.
  • bot_backend_ai/prompts/eval/ — LLM prompt builders for coding and verbal evaluation (with moderation fallbacks).
  • livekit_agent.py — the legacy worker. Joins rooms named meeting-{id}, extracts the meeting id from the room name, can be suppressed per-room with a -noagent suffix, and auto-join is additionally gated by a backend flag.
  • bot_backend_ai/tests/ — unit / integration / contract suites for the FastAPI app. The legacy worker has no tests.
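
The room-name convention used by livekit_agent.py can be sketched as a small parser. This is a sketch under assumptions: it assumes the suffix is appended as meeting-{id}-noagent, parse_room_name is a hypothetical helper rather than the worker's actual function, and the backend auto-join flag is not modeled.

```python
from __future__ import annotations

import re

# Assumed layout: "meeting-<id>" with an optional trailing "-noagent".
_ROOM_RE = re.compile(r"^meeting-(?P<meeting_id>.+?)(?P<noagent>-noagent)?$")


def parse_room_name(room_name: str) -> tuple[str, bool] | None:
    """Return (meeting_id, should_join), or None if the name is not a meeting room."""
    match = _ROOM_RE.match(room_name)
    if match is None:
        return None
    # The agent stays out of rooms that carry the suppression suffix.
    should_join = match.group("noagent") is None
    return match.group("meeting_id"), should_join
```

For example, parse_room_name("meeting-42") yields ("42", True), while parse_room_name("meeting-42-noagent") yields ("42", False).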

Local development

Prerequisites

  • Python 3.13 (.python-version pins it; uv will install a managed 3.13 if absent).
  • uv — the package manager and lockfile tool for this repo (see Astral's installation docs).
  • An OpenAI API key (required for the app to boot).
  • A locally running user-management backend reachable at BACKEND_API_URL for the eval-result POST path.
  • (Optional, worker only) a LiveKit server or cloud project plus LIVEKIT_* credentials.

Install and configure

uv sync # FastAPI app deps from pyproject.toml
uv sync --extra dev # add pytest, ruff, mypy
cp .env.example .env.local # then fill in placeholders

The two runtimes load env differently: the FastAPI app uses pydantic-settings (reads .env, then .env.local); the worker uses python-dotenv (reads .env.{ENV}, selected by the ENV variable). Note that .env.example is written for the worker and omits two keys the app requires: add INTERNAL_SERVICE_JWT_SECRET (and SIMLI_API_KEY if building the avatar client) yourself.
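
Because the copied example file does not cover everything the app needs, a quick preflight check can catch a missing key before uvicorn does. The sketch below uses only the standard library and is not how pydantic-settings itself works: read_env_file and missing_app_keys are hypothetical helpers, the parser handles only simple KEY=VALUE lines, and it assumes later files override earlier ones.

```python
from __future__ import annotations

from pathlib import Path

# Keys this page says the app needs in order to boot.
REQUIRED_APP_KEYS = ("OPENAI_API_KEY", "INTERNAL_SERVICE_JWT_SECRET")


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blanks, comments, and missing files."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_app_keys(paths: list[Path]) -> list[str]:
    """Merge files in order (later wins) and list required keys left unset."""
    merged: dict[str, str] = {}
    for path in paths:
        merged.update(read_env_file(path))
    return [key for key in REQUIRED_APP_KEYS if not merged.get(key)]
```

Calling missing_app_keys with the app's env files in load order returns an empty list when both required keys are present. Real environment variables are not consulted here, so treat the result as a hint rather than a guarantee.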

Run

# FastAPI app (health: /healthz, ready: /readyz, metrics: /metrics)
uv run uvicorn bot_backend_ai.main:app --reload --port 8000

# LiveKit agent worker
ENV=local uv run python livekit_agent.py dev # or: ./run_agent.sh

OpenAPI docs are intentionally disabled (docs_url=None) — don't go looking for /docs.

Test, lint, types

uv run pytest # full suite (FastAPI app only)
uv run ruff check .
uv run mypy bot_backend_ai # strict mode

Environment variables

All values below are placeholders — never commit real values.

| Variable | Purpose |
| --- | --- |
| OPENAI_API_KEY | OpenAI key for chat (eval) and TTS. Required for the app to boot. Example: <OPENAI_API_KEY>. |
| BACKEND_API_URL | Base URL of the user-management API the service posts results to. Example: <BACKEND_API_URL>. |
| INTERNAL_SERVICE_JWT_SECRET | Shared HS256 secret (min 32 chars) for service-to-service JWTs; must byte-match user-management's value. Example: <INTERNAL_SERVICE_JWT_SECRET>. |
| ENV | Environment name selecting which env file the worker loads (e.g. local). |
| LOG_LEVEL | Logging verbosity for the app. |
| FEEDBACK_MODEL | Overrides the default LLM used for evaluation feedback. |
| DEEPGRAM_API_KEY | Deepgram STT key; only needed once the STT client is built. Example: <DEEPGRAM_API_KEY>. |
| SIMLI_API_KEY | Simli avatar key; only needed once the avatar client is built. Example: <SIMLI_API_KEY>. |
| LIVEKIT_URL | LiveKit server WebSocket endpoint for the worker. Example: <LIVEKIT_WS_URL>. |
| LIVEKIT_API_KEY | LiveKit API key for the worker. Example: <LIVEKIT_API_KEY>. |
| LIVEKIT_API_SECRET | LiveKit API secret for the worker. Example: <LIVEKIT_API_SECRET>. |
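
To make the byte-match requirement on INTERNAL_SERVICE_JWT_SECRET concrete, here is a standard-library sketch of minting a 60-second HS256 token and checking its signature. It is not the code in auth.py: mint_service_jwt and signature_matches are hypothetical names, the claim set (iat and exp only) is an assumption, and a real service would more likely use a JWT library and also validate expiry.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def mint_service_jwt(secret: str, ttl_seconds: int = 60, now: int | None = None) -> str:
    """Mint a short-lived HS256 token; the claims here are illustrative only."""
    if len(secret) < 32:
        raise ValueError("secret must be at least 32 characters")
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def signature_matches(token: str, secret: str) -> bool:
    """Recompute the signature with `secret` and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

A token minted with one secret fails signature_matches under any other secret, which is the situation that surfaces as a 401 when the two services hold different values.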

Gotchas

  • The Python 3.13 pin is load-bearing. livekit-agents ≥ 1.3.0 does not support Python 3.14; the repo pins livekit-agents==1.2.18 and .python-version to 3.13. Don't bump either without checking the comments in requirements.txt.
  • Two dependency manifests, on purpose. requirements.txt carries older pins for the legacy worker and its deploy path; the FastAPI app builds from pyproject.toml via uv. They are intentionally separate — do not "reconcile" them.
  • Two env loaders, two contracts. The app reads .env / .env.local (pydantic-settings); the worker reads .env.{ENV} (python-dotenv). Keep both fed, or one runtime silently misconfigures.
  • .env.example is incomplete for the app. It's missing INTERNAL_SERVICE_JWT_SECRET — without it the app won't boot, and with a mismatched value every internal call returns 401.
  • Most vendors are stubs. Only OpenAI chat and TTS are wired. Calling build_stt_client() / build_avatar_client() without the corresponding key raises at build time.
  • VSS parity is the spec. The eval routes deliberately mirror VSS behavior (same feedback model, 200-char feedback clamp, "thank you…" stripping, keyword/length score gates). A behavior difference vs VSS is a bug unless a wave spec says otherwise.
  • BackendClient is an async context manager. Use async with; any other usage raises.
  • Idempotency matters. post_eval_result generates one idempotencyKey per logical post, outside the retry loop, so retries don't double-append conversation history.
  • Wave-gated, not all-live. Question planning and evaluation serve production; STT/avatar/audio-out do not yet. Before assuming a change here has live impact, confirm the surface you're touching has been wave-promoted.
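
The BackendClient and idempotency gotchas above can be illustrated together. The sketch below is not the repo's be_client.py or be_eval_results.py: FakeBackendClient and post_with_idempotency are hypothetical, the key is placed in the request body as an assumption, and ConnectionError stands in for whatever transient errors the real client retries on.

```python
from __future__ import annotations

import asyncio
import uuid
from typing import Any, Awaitable, Callable


class FakeBackendClient:
    """In-memory stand-in: fails the first `failures` posts, records every body."""

    def __init__(self, failures: int = 0) -> None:
        self._failures = failures
        self._open = False
        self.received: list[dict[str, Any]] = []

    async def __aenter__(self) -> "FakeBackendClient":
        self._open = True
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        self._open = False

    async def post(self, body: dict[str, Any]) -> None:
        if not self._open:
            raise RuntimeError("use 'async with' before posting")
        # Record first: the server may have seen the request even if the reply is lost.
        self.received.append(body)
        if self._failures > 0:
            self._failures -= 1
            raise ConnectionError("simulated transient failure")


async def post_with_idempotency(
    send: Callable[[dict[str, Any]], Awaitable[None]],
    payload: dict[str, Any],
    attempts: int = 3,
    base_delay: float = 0.0,
) -> str:
    """Post with retries, reusing one key for every attempt (attempts must be >= 1)."""
    key = str(uuid.uuid4())  # generated once, outside the retry loop
    body = {**payload, "idempotencyKey": key}
    for attempt in range(1, attempts + 1):
        try:
            await send(body)
            break
        except ConnectionError:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    return key
```

Because every attempt carries the same key, a receiver that deduplicates on it can accept the first copy and ignore the rest, even when an earlier attempt was processed but its response never arrived.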