System Overview

TheInterviews.ai is a voice-AI hiring platform: a candidate joins a live, spoken interview with a photo-real AI interviewer that asks role-relevant questions, listens, follows up, and scores the answers. Five services cooperate to make that one session happen. This page shows the whole path once, end to end, so every other page has a map to hang off.

The end-to-end session path

Walking the path step by step

1. Browser → interviews-ui

Everything starts at the web app (https://www.theinterviews.ai), served by interviews-ui (Next.js 15). It renders the marketing site, dashboards, plan/checkout pages, the Profile Card, and the entry point into an interview. The front end never calls AI providers directly with user data — it has exactly three data paths: REST+JWT to the Java backend through typed service clients, a small set of Next.js serverless routes that exist only to keep the OpenAI key server-side, and a WebSocket connection to LiveKit for real-time audio/video.

2. interviews-ui → user-management

Before a session can begin, user-management (Java Spring Boot) does the gatekeeping. It is the source of truth for authentication (user JWTs), customers, subscription plans, billing, and plan/usage enforcement — e.g. consuming one AI-interview credit from the customer's quota when a session starts. It also hosts an AI-routing filter that flips AI traffic per surface onto bot-backend as the TI-340 migration waves graduate (more on that in step 5).
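The credit accounting described above can be pictured with a small sketch. This is only an illustration in Python, not the Java implementation: the class name, field names, and exception are hypothetical, and the refund path mirrors the session-abandon/quota-refund call mentioned in step 7.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when a customer has no AI-interview credits left."""


@dataclass
class InterviewQuota:
    # Hypothetical shape: credits granted by the plan vs. credits already used.
    granted: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.granted - self.used

    def consume(self) -> int:
        """Consume one credit at session start; refuse if none remain."""
        if self.remaining <= 0:
            raise QuotaExceeded("no AI-interview credits remaining")
        self.used += 1
        return self.remaining

    def refund(self) -> int:
        """Return one credit, for example after an abandoned session."""
        if self.used > 0:
            self.used -= 1
        return self.remaining
```

Checking the balance before incrementing means a failed start never consumes a credit. A real service would also need to make this check-and-increment atomic in its database.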

3. interviews-ui → smart-interview-ui (the session room)

The live interview room itself is a separate app: smart-interview-ui, a plain React SPA that interviews-ui embeds in an iframe — it is not a standalone destination. The parent app hands it the session ID via query parameters or an origin-validated postMessage; the Java backend remains the authority for that session ID, and the room forwards it as a correlation header on its server calls.
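The two hand-off routes can be sketched as follows. The real check runs in the browser; the logic is shown in Python only to illustrate the rule. The `sessionId` parameter name, the message shape, and the allowlist contents are assumptions, not taken from the codebase.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical allowlist; the actual parent origins are an assumption here.
ALLOWED_PARENT_ORIGINS = {"https://www.theinterviews.ai"}


def session_id_from_url(room_url: str) -> Optional[str]:
    """Read the session ID from the iframe URL's query string."""
    values = parse_qs(urlparse(room_url).query).get("sessionId")
    return values[0] if values else None


def session_id_from_message(origin: str, data: dict) -> Optional[str]:
    """Accept a posted session ID only from an allowlisted parent origin."""
    if origin not in ALLOWED_PARENT_ORIGINS:
        return None
    session_id = data.get("sessionId")
    if isinstance(session_id, str) and session_id:
        return session_id
    return None
```

Comparing the full origin string against an allowlist, rather than matching a substring, is what stops a look-alike domain from injecting a session ID.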

4. Session room → video-streaming-server → LiveKit SFU

To get media flowing, the session room asks video-streaming-server (Node.js) for a LiveKit access token — tokens are minted only server-side, never in the browser. With that token the room connects over secure WebSocket (<LIVEKIT_WS_URL>) to the LiveKit SFU, the WebRTC media server that relays audio and video between everyone in the room. video-streaming-server also supplies the session's media plumbing — neural text-to-speech for the interviewer's voice, a streaming speech-to-text proxy, and avatar session minting. (It additionally carries a legacy in-process AI brain that is being retired under the TI-340 migration; the interview intelligence now lives in bot-backend, next step.)
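To make "minted only server-side" concrete: a LiveKit access token is a JWT signed with the project's API secret, which is why it cannot be produced in the browser. The sketch below builds one with the Python standard library. It is illustrative only. video-streaming-server is a Node.js service and would normally use the official LiveKit server SDK, and the function name, parameters, and default lifetime are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def mint_room_token(api_key: str, api_secret: str, identity: str, room: str,
                    ttl_seconds: int = 600, now: Optional[int] = None) -> str:
    """Build an HS256 JWT carrying a LiveKit-style room-join grant."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,        # the API key identifies the issuer
        "sub": identity,       # participant identity inside the room
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)
```

Because the signature depends on the API secret, only a server holding that secret can issue a token the SFU will accept. The short expiry limits the damage if a token leaks.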

5. LiveKit SFU ↔ bot-backend (the AI participant)

bot-backend (Python, FastAPI + livekit-agents) is the AI interview brain. It plans the questions (including the blended technical interview format), runs the voice-loop intelligence, and drives evaluation, and it serves production AI-bot interviews today. Its worker joins the LiveKit room as a participant, so the AI interviewer is literally another attendee in the call, and it transcribes what it hears. The visible face of the interviewer is a Simli photo-real avatar: the browser streams the AI's synthesized speech to Simli and gets lip-synced avatar video back, using a short-lived session token minted by video-streaming-server so the vendor API key never reaches the browser. (Routing the avatar through bot-backend instead is planned for a later migration wave.)


6. Recording: tracks → <S3_BUCKET> → ffmpeg merge

If the session is recorded (recording is policy-driven per job posting: off, optional, or mandatory), video-streaming-server orchestrates it. Recording is started and stopped through authenticated endpoints, LiveKit egress writes the recorded media to <S3_BUCKET>, and a finalize step merges the recorded tracks with ffmpeg into the final recording stored under a per-customer folder. Recording state and a session timeline are tracked in Redis along the way, and LiveKit notifies video-streaming-server of egress completion via webhook.
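The three policy values can be read as a small decision rule, sketched below. The enum and function names are hypothetical. Treating "optional" as "record only if the participant opted in" is an assumption about what optional means here, not something this page states.

```python
from enum import Enum


class RecordingPolicy(Enum):
    OFF = "off"
    OPTIONAL = "optional"
    MANDATORY = "mandatory"


def should_record(policy: RecordingPolicy, opted_in: bool) -> bool:
    """Decide whether to start recording for a session."""
    if policy is RecordingPolicy.MANDATORY:
        return True
    if policy is RecordingPolicy.OPTIONAL:
        # Assumption: an optional policy defers to an explicit opt-in.
        return opted_in
    return False
```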

7. Results → user-management

Once evaluation completes, the results flow home. bot-backend holds no direct database connection: it POSTs evaluation results and post-meeting summaries to user-management's internal API, authenticated with a short-lived service-to-service JWT and an idempotency key, and user-management performs the row writes (interview results land in a sessions table that ultimately feeds the candidate's Profile Card scores). video-streaming-server uses the same internal API for session-abandon/quota-refund calls. user-management thus closes the loop: auth and entitlements on the way in, feedback and Profile Card data on the way out.
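The idempotency key is what makes a retried POST safe. The sketch below shows one way that could work, under stated assumptions: the key derivation from session ID and result kind, and the in-memory store standing in for user-management's row writes, are both hypothetical.

```python
import hashlib


def idempotency_key(session_id: str, kind: str) -> str:
    """Derive a stable key so a retried POST carries the same key."""
    return hashlib.sha256(f"{session_id}:{kind}".encode("utf-8")).hexdigest()


class ResultStore:
    """In-memory stand-in for the receiving side's row writes."""

    def __init__(self) -> None:
        self._rows = {}

    def write(self, key: str, payload: dict) -> bool:
        """Store the payload once per key; return False for a replayed key."""
        if key in self._rows:
            return False
        self._rows[key] = payload
        return True
```

With a deterministic key, a sender that times out and retries cannot create a second result row for the same session.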

Where to go deeper

interviews-ui — the Next.js web app and its three data paths.
smart-interview-ui — the embedded live session room.
video-streaming-server — tokens, rooms, and the recording pipeline.
bot-backend — the AI interview brain (FastAPI + livekit-agents).
user-management — auth, plans, billing, entitlements.
LiveKit infrastructure — the SFU, tokens, and the recording pipeline in detail.
