A clinical decision-support assistant for renal drug dosing. PDFs of drug monographs are ingested into a vector-searchable "brain"; the chat answers dosing questions strictly from retrieved monograph context, with inline citations and a built-in Cockcroft–Gault creatinine-clearance calculation.
Retrieval-first by design: the LLM only reasons over passages pulled from the index, never free-associates a dose.
- Runtime / build: Bun, Next.js 16 (App Router), React 19, TypeScript
- Styling: Tailwind CSS v4
- Auth: Clerk
- Data + vectors: MongoDB Atlas (app collections + Atlas Vector Search)
- Parsing: LlamaIndex / LlamaParse (`LLAMA_CLOUD_API_KEY`)
- LLM: Gemini (primary) → OpenRouter (fallback) for chat & ingestion
- Embeddings: OpenRouter `text-embedding-3-small` (1536-dim)
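The Gemini → OpenRouter ordering can be pictured as a small wrapper around two chat functions. This is a minimal sketch, not the project's actual code: `withFallback`, `ChatFn`, and the stub providers are hypothetical names, and the real module may treat retries and error types differently.

```typescript
// Hypothetical sketch of a primary -> fallback LLM call; all names are illustrative.
type ChatFn = (prompt: string) => Promise<string>;

// Returns a ChatFn that tries `primary` first and uses `fallback`
// when the primary is missing (e.g. no API key configured) or throws.
function withFallback(primary: ChatFn | undefined, fallback: ChatFn): ChatFn {
  return async (prompt: string): Promise<string> => {
    if (primary) {
      try {
        return await primary(prompt);
      } catch {
        // Primary failed; fall through to the fallback provider.
      }
    }
    return fallback(prompt);
  };
}

// Stub providers standing in for real Gemini and OpenRouter clients.
const failingPrimary: ChatFn = async () => {
  throw new Error("primary unavailable");
};
const stubFallback: ChatFn = async (prompt) => `fallback:${prompt}`;
```

Passing `undefined` as the primary models the "key unset" case; a throwing primary models the "failing" case.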
Two paths share one store: ingestion writes monographs into the index; query reads them back to ground an answer.
```mermaid
flowchart TD
  PDF["brain/docs/*.pdf"] -->|upload / ingest| ING
  subgraph ING["Ingestion · lib/brain/ingest.ts"]
    direction LR
    P[parse] --> C[custom chunk] --> E[LLM enrich] --> M[embed] --> S[persist]
  end
  ING --> MONGO[("MongoDB Atlas<br/>brain_documents · brain_chunks")]
  ING --> MIR["agent/plasma/memory/docs/*.md<br/>memory mirror"]
  MONGO --> VEC{{"Atlas Vector Search<br/>brain_chunks.embedding"}}
  Q(["user question"]) --> CHAT
  subgraph CHAT["Query · lib/brain/chat.ts"]
    direction LR
    EQ[embed query] --> VS[vector search top-K] --> BP[build prompt] --> GEN[generateChat]
  end
  VEC --> VS
  GEN --> ANS["cited answer + sources + CrCl"]
  API["Next.js API · app/api/v1/*"] <--> UI["React chat UI · components/chat/*"]
  CHAT --- API
```
Ingestion runs per file on upload, or in bulk over `brain/docs/`. Each PDF is
content-hashed (SHA-256); unchanged files are skipped unless `force` is set.
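The hash-and-skip check can be sketched with Node's built-in `crypto` module (also available under Bun). The function names and the `storedHash` parameter are assumptions for illustration; the project's own ingest code may structure this differently.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a file's bytes.
function contentHash(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Decide whether a PDF needs (re-)ingestion.
// `storedHash` is the hash recorded at the previous ingest, if any (hypothetical).
function shouldIngest(
  bytes: Uint8Array,
  storedHash: string | undefined,
  force = false,
): boolean {
  if (force) return true; // force bypasses the unchanged-file check
  return contentHash(bytes) !== storedHash;
}
```

A file that was never ingested has no stored hash, so it always qualifies.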
| Stage | Module | Output |
|---|---|---|
| Parse | `parse.ts` | Page-indexed text via LlamaParse (page numbers preserved) |
| Chunk | `chunk.ts` | Deterministic split guided by `custom-chunking/`; keeps pages, headings, formulas, dose rules |
| Enrich | `enrich.ts` | LLM section/metadata annotation (toggle `BRAIN_ENRICH`) |
| Embed | `index-store.ts` | `text-embedding-3-small`, 1536-dim vectors |
| Persist | `store.ts`, `memory.ts` | Chunks/docs → MongoDB and a Markdown mirror in `agent/plasma/memory/docs/` |
MongoDB and the memory mirror stay in sync; the vector index
(ensureVectorIndex) is created/refreshed on brain_chunks.embedding.
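For orientation, an Atlas Vector Search index over `brain_chunks.embedding` could be described by a definition like the one built here. This is a sketch under assumptions: `vectorIndexSpec` is a hypothetical helper, and the `cosine` similarity is a guess, since the source does not say which metric `ensureVectorIndex` uses.

```typescript
// Shape of an Atlas Vector Search index description for one vector field.
interface VectorIndexSpec {
  name: string;
  type: "vectorSearch";
  definition: {
    fields: Array<{
      type: "vector";
      path: string;
      numDimensions: number;
      similarity: "cosine" | "euclidean" | "dotProduct";
    }>;
  };
}

// Build the index description; defaults mirror the documented index name and EMBED_DIM.
function vectorIndexSpec(name = "brain_chunks_vector", dims = 1536): VectorIndexSpec {
  return {
    name,
    type: "vectorSearch",
    definition: {
      fields: [
        // Assumption: cosine similarity; the actual metric is not stated in this document.
        { type: "vector", path: "embedding", numDimensions: dims, similarity: "cosine" },
      ],
    },
  };
}
```

With a recent MongoDB Node.js driver, an object of this shape would typically be handed to `collection.createSearchIndex(...)`; `numDimensions` has to equal the embedding model's output size.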
retrieve.ts embeds the query and runs Atlas Vector Search for top-K chunks;
retrieveCase additionally expands retrieval per drug detected in the question.
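A top-K lookup of this kind is usually expressed as a `$vectorSearch` aggregation stage. The builder below is a minimal sketch, not the contents of `retrieve.ts`: the function name, the `numCandidates` multiplier, and the follow-up stages are assumptions.

```typescript
// Typed three-stage pipeline: search, attach score, drop the raw vector.
type VectorSearchPipeline = [
  {
    $vectorSearch: {
      index: string;
      path: string;
      queryVector: number[];
      numCandidates: number;
      limit: number;
    };
  },
  { $addFields: { score: { $meta: "vectorSearchScore" } } },
  { $project: { embedding: 0 } },
];

function vectorSearchPipeline(
  queryVector: number[],
  topK = 6,
  index = "brain_chunks_vector",
): VectorSearchPipeline {
  return [
    {
      $vectorSearch: {
        index,
        path: "embedding",
        queryVector,
        numCandidates: topK * 20, // assumption: oversample candidates before taking top-K
        limit: topK,
      },
    },
    { $addFields: { score: { $meta: "vectorSearchScore" } } },
    { $project: { embedding: 0 } }, // keep responses small by omitting the vector
  ];
}
```

The resulting array would be passed to `collection.aggregate(...)` on `brain_chunks`; a per-drug expansion could call the same builder once per detected drug and merge the results.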
chat.ts assembles the system prompt from agent/plasma/RULES.md plus a fixed
answer-format appendix (renal status, drug-by-drug review, interactions,
recommended actions, alternatives), enforces citation discipline, and requests
formulas as LaTeX. Answers carry explicit confidence labels and source chips.
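The prompt assembly can be illustrated by numbering the retrieved chunks so the model can cite them as `[n]`. Everything here is hypothetical: the `RetrievedChunk` fields, the function names, and the citation format are stand-ins, not the layout used by `chat.ts`.

```typescript
// Hypothetical shape of a retrieved chunk; real field names may differ.
interface RetrievedChunk {
  docName: string;
  page: number;
  text: string;
}

// Number each chunk so answers can reference it as [1], [2], ...
function buildContextBlock(chunks: RetrievedChunk[]): string {
  return chunks
    .map((c, i) => `[${i + 1}] ${c.docName}, p. ${c.page}\n${c.text}`)
    .join("\n\n");
}

// Concatenate rules, the answer-format appendix, and the numbered context.
function buildSystemPrompt(
  rules: string,
  formatAppendix: string,
  chunks: RetrievedChunk[],
): string {
  return [
    rules.trim(),
    formatAppendix.trim(),
    "Context (cite as [n]):",
    buildContextBlock(chunks),
  ].join("\n\n");
}
```

Keeping the numbering stable between the prompt and the returned source list is what lets a citation in the answer map back to a specific monograph page.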
creatinine.ts implements Cockcroft–Gault server-side; lib/chat-client.ts
mirrors it for live client-side preview. When patient context (age, weight, sex,
serum creatinine) is supplied, CrCl is computed and folded into the prompt.
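The Cockcroft–Gault estimate is CrCl (mL/min) = ((140 − age) × weight) / (72 × serum creatinine), multiplied by 0.85 for female patients, with age in years, weight in kg, and serum creatinine in mg/dL. A minimal sketch follows; the type and field names are assumptions, and `creatinine.ts` may validate or round differently.

```typescript
type Sex = "male" | "female";

// Hypothetical patient-context shape; units are encoded in the field names.
interface PatientContext {
  ageYears: number;
  weightKg: number;
  sex: Sex;
  serumCreatinineMgDl: number;
}

// Cockcroft–Gault creatinine clearance in mL/min.
function cockcroftGault(p: PatientContext): number {
  if (p.serumCreatinineMgDl <= 0) {
    throw new RangeError("serum creatinine must be positive");
  }
  const base = ((140 - p.ageYears) * p.weightKg) / (72 * p.serumCreatinineMgDl);
  return p.sex === "female" ? base * 0.85 : base;
}
```

For a 60-year-old, 72 kg male with serum creatinine 1.0 mg/dL this gives (80 × 72) / 72 = 80 mL/min; the same inputs for a female patient give 68 mL/min.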
```
app/
  (home)/            marketing / landing
  chat/              chat UI + drug library (knowledge) routes
  api/v1/            REST endpoints
  manifest.ts        PWA manifest
  icons/[size]/      generated PNG app icons (next/og)
components/chat/     chat UI: sidebar, header, thread, composer, ...
lib/brain/           ingestion, retrieval, store, db, llm, creatinine
lib/chat-client.ts   typed browser client for /api/v1
brain/docs/          source monograph PDFs (the knowledge base)
custom-chunking/     chunking prompts (boundary detection, structure analysis)
agent/plasma/        GitAgent memory: RULES.md, SOUL.md, memory/docs
```
All endpoints share the base path `/api/v1`.
| Method | Path | Purpose |
|---|---|---|
| POST | `/users` | Upsert user (keyed by email) |
| GET | `/conversations?userId=` | List a user's conversations |
| GET | `/conversations/:chatId?userId=` | Conversation + messages |
| PATCH | `/conversations/:chatId` | Rename (`{ userId, title }`) |
| DELETE | `/conversations/:chatId?userId=` | Delete conversation + messages |
| POST | `/chat` | Answer a question (`{ userId, chatId?, question, patient? }`) |
| POST | `/creatinine-clearance` | Cockcroft–Gault CrCl |
| GET | `/status` | Brain/index/Mongo health, document list |
| POST | `/upload` | Multipart PDF upload + ingest |
| POST | `/ingest` | Ingest all pending / a single file (`{ fileName?, force? }`) |
| POST | `/rebuild` | Rebuild the vector index |
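A request to `POST /api/v1/chat` can be sketched as a small typed builder. The top-level body fields come from the table above; the fields inside `patient`, the `buildChatRequest` name, and the default base URL are assumptions, and the response shape is not documented here.

```typescript
// Assumed patient fields; only the existence of `patient` is documented.
interface PatientInput {
  age: number;
  weight: number;
  sex: "male" | "female";
  serumCreatinine: number;
}

// Body shape taken from the endpoint table: { userId, chatId?, question, patient? }.
interface ChatRequestBody {
  userId: string;
  chatId?: string;
  question: string;
  patient?: PatientInput;
}

// Build the URL and fetch options without performing the request.
function buildChatRequest(body: ChatRequestBody, baseUrl = "http://localhost:3000") {
  return {
    url: `${baseUrl}/api/v1/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage (inside an async function):
//   const { url, init } = buildChatRequest({ userId: "u1", question: "..." });
//   const res = await fetch(url, init);
```

Separating request construction from `fetch` keeps the shape easy to unit-test without a running server.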
- `users`, `conversations`, `messages`: application data (unique indexes on `userId`, `email`, `chatId`; recency index for listing).
- `brain_documents`: one record per ingested PDF (hash, pages, chunk count).
- `brain_chunks`: chunk text + metadata + `embedding`; backs Atlas Vector Search (`brain_chunks_vector`).
Environment variables are set in `.env`.
Required
| Var | Description |
|---|---|
| `DATABASE_URL` | MongoDB Atlas connection string |
| `OPENROUTER_API_KEY` | Embeddings + chat/ingestion fallback |
| `LLAMA_CLOUD_API_KEY` | LlamaParse PDF parsing |
| `NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY`, `CLERK_SECRET_KEY` | Clerk auth |
Optional
| Var | Default | Description |
|---|---|---|
| `GEMINI_API_KEY` / `GEMINI_KEY` | (none) | Primary LLM; falls back to OpenRouter if unset/failing |
| `GEMINI_MODEL` | `gemini-2.5-flash` | Gemini model |
| `OPENROUTER_EMBED_MODEL` | `openai/text-embedding-3-small` | Embedding model (must match index) |
| `OPENROUTER_CHAT_MODEL` | `meta-llama/llama-3.3-70b-instruct:free` | Chat fallback model |
| `MONGODB_DB` | `peakplasma` | Database name |
| `MONGODB_VECTOR_INDEX` | `brain_chunks_vector` | Atlas Vector Search index name |
| `EMBED_DIM` | `1536` | Embedding dimensionality (must match embed model) |
| `BRAIN_ENRICH` | `true` | Enable LLM chunk enrichment |
| `BRAIN_TOP_K` | `6` | Default retrieval depth |
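The required/optional split and the documented defaults can be captured in a single loader. This is a sketch, not the project's config module: `loadBrainConfig`, the `BrainConfig` field names, and the way `BRAIN_ENRICH` is parsed (anything other than the string `false` counts as enabled) are assumptions.

```typescript
interface BrainConfig {
  databaseUrl: string;
  openrouterApiKey: string;
  llamaCloudApiKey: string;
  geminiApiKey?: string;
  geminiModel: string;
  embedModel: string;
  chatModel: string;
  dbName: string;
  vectorIndex: string;
  embedDim: number;
  enrich: boolean;
  topK: number;
}

type Env = Record<string, string | undefined>;

// Read settings from an env-like object, applying the documented defaults.
function loadBrainConfig(env: Env): BrainConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing required env var ${key}`);
    return value;
  };
  return {
    databaseUrl: required("DATABASE_URL"),
    openrouterApiKey: required("OPENROUTER_API_KEY"),
    llamaCloudApiKey: required("LLAMA_CLOUD_API_KEY"),
    geminiApiKey: env.GEMINI_API_KEY ?? env.GEMINI_KEY, // either name is accepted
    geminiModel: env.GEMINI_MODEL ?? "gemini-2.5-flash",
    embedModel: env.OPENROUTER_EMBED_MODEL ?? "openai/text-embedding-3-small",
    chatModel: env.OPENROUTER_CHAT_MODEL ?? "meta-llama/llama-3.3-70b-instruct:free",
    dbName: env.MONGODB_DB ?? "peakplasma",
    vectorIndex: env.MONGODB_VECTOR_INDEX ?? "brain_chunks_vector",
    embedDim: Number(env.EMBED_DIM ?? "1536"),
    enrich: (env.BRAIN_ENRICH ?? "true") !== "false", // assumed parsing rule
    topK: Number(env.BRAIN_TOP_K ?? "6"),
  };
}
```

In the app this would be called with `process.env`; the Clerk keys are left out because Clerk's SDK reads them directly.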
```bash
bun install
bun run dev      # http://localhost:3000
```

Other scripts:

```bash
bun run build    # production build
bun run start    # serve production build (required to test the PWA service worker)
bun run lint
```

Add monographs by dropping PDFs into `brain/docs/` (or the in-app drug library),
then ingest via the UI or `POST /api/v1/ingest`. A new/changed PDF refreshes both
the vector index and the memory mirror.
- The service worker (`public/sw.js`) registers in production builds only.
- Embedding model and `EMBED_DIM` must stay consistent with the existing index; changing either requires a re-embed / index rebuild.