AirPilot already retrieves grounded knowledge. It just never shows its work. This is the plan to turn a support bot into a cited, audible, source-grounded knowledge system — mostly by connecting pipes that already exist.
Strip away the consumer wrapper and NotebookLM is five engineering properties. Each one maps onto something AirPilot already half-builds.
Answers are constrained to your uploaded sources, not the model's open-ended knowledge, which sharply reduces hallucination.
Every claim carries a pin back to the exact passage in the exact document.
Briefing docs, study guides, FAQs and timelines auto-written from the corpus.
Two AI hosts discuss your sources as a podcast you can listen to on the move.
Sources · grounded chat · studio — three panes around one private corpus.
Every column on the left already exists in the codebase on the right. The work is not invention — it is exposure and synthesis.
KnowledgeDocument + per-vendor Qdrant collection airpilot_vendor_{id}, fed by panel import & manual articles.
core/rag/: chunker (900/200) → embedder → retriever → pipeline, invoked at ChatEngine step 9.
core/analysis/: ticket categorize → cluster (cosine 0.8) → FAQGenerator draft FAQs.
portal/.../vendor/knowledge/page.tsx: sources table, counts, import & reindex controls.
ChatEngine.process_message(): 17-step pipeline, RAG context injected into the system prompt.
The four things that are missing are exactly the four things NotebookLM is famous for. Each is a contained workstream, not a rewrite.
Existing dataflow in slate. The four amber nodes are the only new surface area — watch where each one attaches.
Each block: the concrete gap (with the real file and line), the fix, and a running simulation so the behaviour is obvious before a single line is written.
The retriever already knows the document title and
external_source_id of every chunk it returns. They simply never
leave the engine.
Gap: engine.py:339–349 rebuilds ChatResponse.sources as text[:200] + score + source_type only. external_source_id and title are dropped, so ConversationMessage.retrieved_context stores title-less, id-less stubs. The answer can never link back.
Fix: thread title + external_source_id through retriever → engine → message JSON; teach PromptBuilder to emit inline [n] markers; render clickable source pins in ConversationTranscript.tsx that resolve to KnowledgeDocument.
Smallest change, highest trust ROI, and it unblocks everything downstream. Phase 1.
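The citation threading described above can be sketched in a few lines of Python. This is a minimal illustration, not AirPilot's code: RetrievedChunk, build_sources, build_context_block and resolve_citations are hypothetical names, the dict layout is an assumption about what the message JSON might carry, and the sample chunks are made up.

```python
import re
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """Hypothetical shape of one retriever hit; the real class may differ."""
    text: str
    score: float
    source_type: str
    title: str
    external_source_id: str


def build_sources(chunks):
    """Number each chunk and keep title + external_source_id next to the excerpt."""
    return [
        {
            "n": n,
            "text": chunk.text[:200],
            "score": chunk.score,
            "source_type": chunk.source_type,
            "title": chunk.title,
            "external_source_id": chunk.external_source_id,
        }
        for n, chunk in enumerate(chunks, start=1)
    ]


def build_context_block(sources):
    """Render numbered excerpts for the system prompt so the model can cite [n]."""
    lines = ["Answer only from the sources below and cite each claim inline as [n]."]
    lines += [f"[{s['n']}] {s['title']}: {s['text']}" for s in sources]
    return "\n".join(lines)


_MARKER = re.compile(r"\[(\d+)\]")


def resolve_citations(answer, sources):
    """Map [n] markers to sources in first-seen order, skipping unknown numbers."""
    by_number = {s["n"]: s for s in sources}
    cited = []
    for match in _MARKER.finditer(answer):
        source = by_number.get(int(match.group(1)))
        if source is not None and source not in cited:
            cited.append(source)
    return cited


# Made-up sample data, for illustration only.
chunks = [
    RetrievedChunk("Restart the client, then pick a new server.", 0.91,
                   "article", "Reconnect guide", "kb-17"),
    RetrievedChunk("Refunds are issued within 7 days.", 0.84,
                   "article", "Billing policy", "kb-04"),
]
sources = build_sources(chunks)
answer = "Restart the client first [1]. A marker like [9] has no source and is ignored."
for pin in resolve_citations(answer, sources):
    print(pin["n"], pin["title"], pin["external_source_id"])
```

Numbering the sources once, in build_sources, means the prompt, the stored message JSON and the rendered pins all agree on what [n] refers to. The transcript component would then only need the external_source_id to look up the matching KnowledgeDocument.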
The analysis pipeline already clusters tickets and surfaces
common_issues — then throws the narrative away.
Gap: the analysis output lives only in Job.payload["result"]. There is no AnalysisReport model, so nothing is browsable, diff-able, or feedable to audio.
Fix: add an AnalysisReport model and a step 6 in AnalysisPipeline.run_full_analysis() that asks the LLM to write a narrative Support Health Briefing from common_issues + category distribution. Surface it as a Studio artifact.
Reuses the entire analysis pipeline. Phase 2.
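A rough sketch of what the synthesis step could look like, under stated assumptions: the AnalysisReport fields, the keys on each common_issues entry (summary, ticket_count) and the complete callable standing in for the LLM call are all hypothetical, since the source does not show the real schema or the ORM in use. A plain dataclass stands in for the persisted model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class AnalysisReport:
    """Hypothetical persisted report; a real model would be an ORM table."""
    vendor_id: int
    title: str
    body: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def build_briefing_prompt(common_issues, category_distribution):
    """Turn clustered issues and category counts into a single LLM prompt."""
    total = sum(category_distribution.values()) or 1  # avoid dividing by zero
    ranked = sorted(category_distribution.items(), key=lambda kv: kv[1], reverse=True)
    category_lines = [
        f"- {name}: {count} tickets ({count / total:.0%})" for name, count in ranked
    ]
    # Assumed issue shape: {"summary": str, "ticket_count": int}.
    issue_lines = [
        f"- {issue['summary']} ({issue['ticket_count']} tickets)" for issue in common_issues
    ]
    return "\n".join(
        [
            "Write a narrative Support Health Briefing for a support lead.",
            "Use only the data below; do not invent figures.",
            "",
            "Category distribution:",
            *category_lines,
            "",
            "Common issues:",
            *issue_lines,
        ]
    )


def synthesize_briefing(vendor_id, common_issues, category_distribution,
                        complete: Callable[[str], str]):
    """Build the prompt, call the LLM via `complete`, and wrap the text as a report."""
    prompt = build_briefing_prompt(common_issues, category_distribution)
    return AnalysisReport(
        vendor_id=vendor_id,
        title="Support Health Briefing",
        body=complete(prompt),
    )


# Stub LLM so the sketch runs offline; a real step would call the chat model.
report = synthesize_briefing(
    vendor_id=1,
    common_issues=[{"summary": "Login loop after update", "ticket_count": 3}],
    category_distribution={"billing": 3, "connectivity": 1},
    complete=lambda prompt: "Stub briefing built from:\n" + prompt,
)
print(report.title)
print(report.body)
```

Passing the LLM call in as a function keeps the prompt construction testable without a network call. It also leaves the choice of model to whatever service the pipeline already uses.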
Once the briefing is persisted text, an audio overview is a thin layer: script it as a dialogue, speak it, store the mp3.
Gap: openai_service.py has no speech method: no create_speech(), no audio job type, no player.
Fix: add OpenAIService.create_speech() (TTS), a dialogue-script generator over the briefing, and a briefing_audio job; store the file and play it in the portal.
Use case: a support lead listens to “this week’s support health” on the commute. Phase 3.
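The audio layer might be sketched as below. This is a hedged outline, not the planned implementation: parse_dialogue, render_dialogue and the "A:"/"B:" script format are assumptions, and create_speech is written as a free function taking an injected client rather than as the OpenAIService method. The call client.audio.speech.create(model=..., voice=..., input=...) follows the OpenAI Python SDK's text-to-speech endpoint. The model "tts-1" and the voices "alloy" and "onyx" are example choices, and reading the bytes with response.read() is an assumption to check against the installed SDK version.

```python
from typing import Any

# Assumed mapping of the two hosts to TTS voices (example values).
VOICES = {"A": "alloy", "B": "onyx"}


def parse_dialogue(script_text: str) -> list[tuple[str, str]]:
    """Parse 'A: ...' / 'B: ...' lines into (speaker, line) turns, skipping anything else."""
    turns = []
    for raw in script_text.splitlines():
        speaker, sep, line = raw.partition(":")
        speaker, line = speaker.strip().upper(), line.strip()
        if sep and speaker in VOICES and line:
            turns.append((speaker, line))
    return turns


def create_speech(client: Any, text: str, voice: str, model: str = "tts-1") -> bytes:
    """Synthesize one utterance; `client` is expected to be an openai.OpenAI instance."""
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    return response.read()


def render_dialogue(client: Any, script_text: str) -> list[bytes]:
    """Return one audio segment per turn, alternating voices by speaker."""
    return [
        create_speech(client, line, VOICES[speaker])
        for speaker, line in parse_dialogue(script_text)
    ]


script = """A: Welcome to this week's support health.
B: Billing questions led the queue again.
(stage direction, ignored)"""
print(parse_dialogue(script))
```

The sketch stops at a list of per-turn segments. Stitching them into a single file, storing it and wiring the briefing_audio job are left out, because the source does not say how jobs or file storage work.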
No new infrastructure — Sources, grounded Chat and the Studio are all existing endpoints, re-laid-out as one workspace.
Fix: re-lay out vendor/knowledge/page.tsx into Sources | Grounded Chat | Studio, composing WS1–WS3 outputs. Pure front-end re-composition.
The payoff surface, where vendors feel the system. Phase 4.
Each phase ships standalone value. Citations first — it is the smallest change and the largest trust gain, and every later phase consumes it.
Thread id+title end-to-end · inline [n] markers · clickable pins in the transcript.
AnalysisReport model · narrative synthesis step · persisted, browsable.
create_speech() · dialogue scripting · briefing_audio job · player.
Three-pane re-composition of the knowledge page — the surface vendors touch.
VPN customers ask sensitive billing & connectivity questions. “Here is the source” is a feature vendors will pay for — and an audit trail on every reply.
Briefings and audio overviews turn AirPilot into a system support leads use, not a widget they install and forget. Stickier, harder to rip out.
Reuses RAG + analysis + openai_service. Net-new: ~one model, one TTS
method, one prompt change, one portal re-layout.