Local-first document intelligence for private data pipelines.
Classify, sanitize, index, and query unstructured data without sending it to a cloud by default.
Phantom turns messy folders of documents, logs, configs, and code into structured intelligence: searchable chunks, sensitivity findings, sanitized exports, RAG-ready vector indexes, and audit-friendly processing reports.
It is built for operators who care about data boundaries. The default path is
local inference through llama.cpp, local vector search with FAISS/BM25, and a
reproducible Nix development environment. Cloud providers can be added through
the provider abstraction, but the core workflow does not require them.
Data stays local. Search gets smarter. Operators keep control.
| Need | Phantom gives you |
|---|---|
| Keep private data local | Local-first processing, llama.cpp provider, no cloud dependency by default |
| Understand large document sets | CORTEX chunking, embeddings, insight extraction, and RAG chat |
| Prepare data safely | DAG classification, PII detection, pseudonymization, sanitization, quarantine |
| Search beyond keywords | Hybrid retrieval with FAISS dense search plus BM25 sparse search |
| Operate like a real system | Typer CLI, FastAPI service, Prometheus metrics, Nix, Docker, tests |
| Give users a GUI | Cortex Desktop, a Tauri 2 + SvelteKit client for the local API |
flowchart LR
raw[Raw files] --> dag[DAG pipeline<br/>classify + sanitize]
dag --> cortex[CORTEX<br/>chunk + extract]
cortex --> vectors[FAISS + BM25<br/>hybrid retrieval]
vectors --> rag[RAG chat<br/>streaming API]
cli[phantom CLI] --> dag
api[FastAPI service] --> dag
desktop[Cortex Desktop] --> api
providers[LLM providers<br/>llama.cpp first] --> cortex
events[NATS hooks] --> api
phantom/
├── src/phantom/ # Python runtime: CLI, FastAPI, CORTEX, RAG, DAG, providers
├── cortex-desktop/ # Tauri 2 + SvelteKit desktop client
├── intelagent/ # Rust agent and quality-gate primitives
├── spectre/ # Companion signal/pattern extraction scaffold
├── nix/ + flake.nix # Reproducible development, packages, and checks
├── docs/ # Architecture, guides, deployment notes, history
├── arch/ # Generated architecture reports
├── tests/ # Unit, integration, and e2e tests
└── .archive/ # Historical experiments and dead-code snapshots
For the canonical topology map, see docs/architecture/project_topology.rst.
Phantom is happiest inside its pinned Nix shell.
git clone https://github.com/VoidNxSEC/phantom
cd phantom
nix develop
just test
just serve
Then check the API:
curl http://localhost:8008/health
Run the desktop client:
just desktop
Use the CLI directly:
phantom scan ./documents
phantom classify ./documents --dry-run
phantom rag ingest ./docs --collection local
phantom rag query "What are the main compliance risks?" --collection local
CORTEX splits large inputs into semantic chunks, embeds them locally, and extracts structured insights through an LLM provider.
Document -> SemanticChunker -> EmbeddingGenerator -> LLM Provider -> Pydantic schema
It is designed for long documents, bounded context windows, and GPU-aware local inference.
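To make the bounded-context idea concrete, here is a minimal sketch of the first step in that flow. It is not Phantom's `SemanticChunker`: the function name `chunk_text` and its parameters are hypothetical, and it uses a plain fixed-size word window with overlap rather than semantic boundaries.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows (illustrative, not semantic)."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the input.
        if start + max_words >= len(words):
            break
    return chunks


# Hypothetical usage: ten words, windows of four words, one word of overlap.
sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text(sample, max_words=4, overlap=1)
```

The overlap keeps some shared context between neighbouring chunks, so a sentence that straddles a window boundary is still visible in full to at least one embedding or extraction call.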
Phantom combines dense semantic search with sparse keyword retrieval.
Query -> FAISS cosine search ----+
+-> Reciprocal Rank Fusion -> ranked results
Query -> BM25 keyword search ----+
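The fusion step in the diagram can be sketched in a few lines. This is an illustration of standard Reciprocal Rank Fusion, not Phantom's own code: the function name is hypothetical, and `k=60` is the commonly used smoothing constant, assumed here rather than taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; each list contributes 1 / (k + rank) per document.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the dense (FAISS) and sparse (BM25) retrievers.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense, sparse])
```

Because RRF uses only rank positions, the cosine similarities and BM25 scores never have to be put on a common scale before they are combined.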
Index and search through HTTP:
curl -X POST http://localhost:8008/vectors/index \
-F "file=@docs/architecture/CORTEX_V2_ARCHITECTURE.md"
curl -X POST http://localhost:8008/vectors/search \
-H "Content-Type: application/json" \
-d '{"query": "semantic chunking tradeoffs", "top_k": 5, "mode": "hybrid"}'
curl -N -X POST http://localhost:8008/api/chat/stream \
-H "Content-Type: application/json" \
-d '{
"message": "Summarize the indexed architecture decisions.",
"conversation_id": "demo",
"history": [],
"context_size": 5
}'
The DAG pipeline classifies files, detects sensitive patterns, optionally sanitizes content, records fingerprints, and isolates suspicious outputs.
| Stage | Purpose |
|---|---|
| Discover | Walk input trees and prepare file records |
| Fingerprint | Capture SHA256, BLAKE3, xxHash, size, and timestamps |
| Classify | Detect document, code, data, config, log, crypto, media, and unknown files |
| Detect | Find PII, secrets, keys, tokens, network indicators, and identifiers |
| Sanitize | Strip metadata, redact PII, or perform full sanitization |
| Persist | Write audit records, reports, outputs, and quarantine entries |
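As a rough illustration of the Fingerprint, Detect, and Sanitize stages, the sketch below uses only the Python standard library. All names and regex patterns are hypothetical and far narrower than a real detector. It computes only SHA256, since BLAKE3 and xxHash need third-party packages.

```python
import hashlib
import re

# Illustrative patterns only; a production detector needs many more rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def fingerprint(data: bytes) -> dict:
    """Record a content hash and size for an input blob."""
    return {"sha256": hashlib.sha256(data).hexdigest(), "size": len(data)}


def detect(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs, grouped by pattern."""
    return [
        (label, match.group(0))
        for label, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]


def redact(text: str) -> str:
    """Replace every match with a bracketed label such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text


sample = "Contact alice@example.com from 10.0.0.7"
record = fingerprint(sample.encode("utf-8"))
findings = detect(sample)
clean = redact(sample)
```

Fingerprinting the raw bytes before redaction means an audit record can still identify the original input after the sanitized copy is written.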
| Surface | Entry point | What it owns |
|---|---|---|
| CLI | phantom | Extraction, analysis, classification, scans, RAG, tools, API startup |
| API | phantom-api / just serve | Health, metrics, upload, process, vector, chat, pipeline, judge endpoints |
| Desktop | just desktop | Tauri/Svelte GUI for local workflows |
| Nix | nix develop, nix build, nix flake check | Reproducible shell, packages, and checks |
| Docker | Dockerfile | OCI fallback for non-Nix environments |
| IntelAgent | intelagent/ | Rust agent abstractions and quality-gate primitives |
The FastAPI server exposes OpenAPI docs at /docs when running.
| Area | Endpoints |
|---|---|
| Health | GET /health, GET /ready, GET /metrics, GET /api/system/metrics |
| Documents | POST /extract, POST /process, POST /upload, POST /api/upload |
| Vectors | POST /vectors/index, POST /vectors/batch-index, POST /vectors/search |
| Chat | POST /api/chat, POST /api/chat/stream, GET /api/models, POST /api/prompt/test |
| Pipeline | POST /api/pipeline, POST /api/pipeline/scan |
| Integrations | GET /rag/query, POST /judge |
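For scripted access, a small standard-library client for `POST /vectors/search` might look like the sketch below. The helper names are hypothetical. The URL, port, and the `query`, `top_k`, and `mode` fields come from the curl example in this README. The response shape is not documented here, so the sketch returns the decoded JSON unchanged.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8008"  # default local address used in this README


def build_search_request(query: str, top_k: int = 5, mode: str = "hybrid",
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to /vectors/search."""
    payload = json.dumps({"query": query, "top_k": top_k, "mode": mode})
    return urllib.request.Request(
        f"{base_url}/vectors/search",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, **kwargs):
    """Send the request to a running server and decode the JSON body."""
    request = build_search_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no server; calling search() requires `just serve`.
request = build_search_request("semantic chunking tradeoffs", top_k=3)
```

Building the request separately from sending it makes the payload easy to inspect or unit-test without a live API.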
nix develop # enter the pinned shell
just # list available recipes
just lint # ruff + mypy
just fmt # ruff format
just test # pytest
just test-cov # pytest with coverage report
just ci # lint + tests
just check # nix flake checks
just stats # project statistics
Useful focused commands:
just test-file tests/unit/test_vector_store.py
just test-match "rag"
just ruff-fix
just audit
| Document | Purpose |
|---|---|
| Project Topology | Canonical map of live code, docs, generated reports, and archive areas |
| CORTEX Architecture | Chunking, embeddings, vector storage, retrieval, and VRAM notes |
| Roadmap | Shipped, active, and planned work |
| Deployment | Deployment notes for production surfaces |
| Desktop Setup | Cortex Desktop development setup |
| Security Policy | Vulnerability reporting and security process |
| Component | Status |
|---|---|
| Python CLI and core package | Live |
| FastAPI service and metrics | Live |
| CORTEX chunking and extraction | Live |
| FAISS/BM25 retrieval | Live |
| DAG classification and sanitization | Live |
| Cortex Desktop | Beta |
| IntelAgent Rust workspace | Scaffolded |
| Cloud LLM providers | Planned |
| Redis semantic cache | Planned |
| Helm/Kubernetes packaging | Planned |
Near-term work:
- Finish desktop sub-components and frontend test infrastructure.
- Add a system metrics dashboard tab wired to /api/system/metrics.
- Implement markdown/code rendering in chat.
- Add Redis or in-memory semantic caching for repeated embeddings and queries.
- Expand provider implementations beyond the current llama.cpp path.
Longer-term work:
- Standalone Linux/macOS binaries.
- Docker/OCI hardening.
- NixOS module for system-level deployment.
- Distributed and multi-node processing.
- IntelAgent advanced governance, memory, quality, MCP, and ZK features.
Phantom is designed for sensitive local workloads, but it is still alpha-stage software. Treat it as an operator tool, review outputs before production use, and keep test datasets separate from regulated production data until your own controls are in place.
Found a vulnerability? See SECURITY.md.
Apache 2.0. See LICENSE.
Read CONTRIBUTING.md before opening a PR. For architecture changes or significant API modifications, open an issue with the proposed design and the affected runtime surfaces.