Enterprise Agent Memory

Persistent memory for AI coding agents — deployed on Azure, built for the enterprise.
Your coding agent remembers everything. No more re-explaining. No more context loss between sessions.
Based on agentmemory, re-engineered for Azure with multi-tenancy, managed services, and one-click deployment.

Install • Deploy • How It Works • Agents • Viewer • API • vs Original • Config

Why Enterprise Agent Memory?

Every coding agent forgets everything when the session ends. You waste the first 5 minutes re-explaining your stack. Enterprise Agent Memory fixes this — it silently captures what your agent does, compresses it into searchable memory, builds a knowledge graph, and injects the right context when the next session starts.

Session 1: "Add JWT auth to the API"
  Agent writes code, runs tests, fixes bugs
  → agentmemory captures every tool call silently
  → Observations compressed into structured memory via GPT-4o
  → Knowledge graph auto-extracted (entities, relationships)
  → Vectors embedded with text-embedding-3-large (3072 dims)
  → Everything stored in Cosmos DB + indexed in AI Search

Session 2: "Now add rate limiting"
  Agent already knows:
    ✓ Auth uses jose middleware in src/middleware/auth.ts
    ✓ Tests in test/auth.test.ts cover token validation
    ✓ You chose jose over jsonwebtoken for Edge compatibility
  → Zero re-explaining. Starts working immediately.

What makes this different from the original agentmemory? This is the Azure enterprise edition — same memory pipeline, but running on managed Azure services with multi-tenant isolation, auto-scaling, and one-click deployment. No SQLite files, no local iii-engine runtime, no single-process limitations.

Architecture

                         ┌─────────────────────────────┐
                         │    AI Coding Agents          │
                         │  Claude Code · Cursor · Codex│
                         │  Gemini CLI · Any MCP client │
                         └─────────────┬───────────────┘
                                       │ REST API / MCP
                                       ▼
                    ┌──────────────────────────────────────┐
                    │       Azure Container Apps           │
                    │   Fastify v5 · Entra ID JWT Auth     │
                    │   Multi-tenant · Rate Limited        │
                    │   Auto-scales 1 → 10 replicas        │
                    └────┬──────┬──────┬──────┬───────────┘
                         │      │      │      │
              ┌──────────┘      │      │      └──────────┐
              ▼                 ▼      ▼                  ▼
     ┌─────────────┐  ┌──────────┐  ┌─────────────┐  ┌─────────────┐
     │  Cosmos DB   │  │Azure AI  │  │Azure OpenAI │  │Blob Storage │
     │  (Serverless)│  │ Search   │  │  (GPT-4o +  │  │  (Archive)  │
     │             │  │(BM25 +   │  │  embedding)  │  │             │
     │ Sessions    │  │ Vector)  │  │             │  │ Raw obs     │
     │ Observations│  │          │  │ Compress    │  │ Audit trail │
     │ Memories    │  │ 3072-dim │  │ Embed       │  │             │
     │ Graph nodes │  │ vectors  │  │ Graph extract│  │             │
     │ Graph edges │  │          │  │             │  │             │
     │ Audit log   │  │          │  │             │  │             │
     └─────────────┘  └──────────┘  └─────────────┘  └─────────────┘
              │
              └──────────────────┐
                                 ▼
                        ┌─────────────────┐
                        │  App Insights    │
                        │  (Monitoring)    │
                        └─────────────────┘

Memory Pipeline

Agent tool call fires
  → Archive raw observation to Blob Storage
  → LLM compress via GPT-4o → structured facts + concepts + narrative
  → Generate vector embedding (text-embedding-3-large, 3072 dimensions)
  → Store compressed observation in Cosmos DB
  → Index in Azure AI Search (BM25 + vector)
  → Extract knowledge graph entities (fire-and-forget)
  → Increment session observation count + audit entry

Works with Every Agent

Enterprise Agent Memory exposes a standard REST API that any agent can call. It also works with the original agentmemory MCP server and plugins.

Claude Code
_{hooks + MCP}

Codex CLI
_{hooks + MCP}

Cursor
_{MCP server}

Gemini CLI
_{MCP server}

Windsurf
_{MCP server}

Cline
_{MCP server}

Aider
_{REST API}

_{Works with any agent that speaks MCP or HTTP — one API, memories shared across all of them.}

Install

Prerequisites

Node.js 18+ and npm
Azure subscription with the following services provisioned (or use Deploy to Azure):
- Azure Cosmos DB (NoSQL, serverless)
- Azure AI Search (Basic or Standard)
- Azure OpenAI (GPT-4o + text-embedding-3-large)
- Azure Blob Storage

Quick Start

# Clone
git clone https://github.com/msftse/enterprise-agent-memory.git
cd enterprise-agent-memory

# Install dependencies
npm install

# Configure (see Configuration section below)
cp .env.azure.example .env
# Edit .env with your Azure resource endpoints and keys

# Run in development mode
npm run dev

# Run tests (102 passing)
npm test

# Build for production
npm run build && npm start

The server starts on http://localhost:8080. Open http://localhost:8080/viewer for the built-in dashboard.

Deploy to Azure

One-Click Deploy

This deploys all required Azure services using our Bicep templates:

Resource	What it creates
Cosmos DB	Serverless NoSQL account + `agentmemory` database with 6 containers
AI Search	Basic tier search service with vector index (3072 dims)
Azure OpenAI	GPT-4o + text-embedding-3-large deployments (optional, requires quota)
Blob Storage	LRS storage account for raw observation archive
Container Apps	Consumption-plan app with auto-scale (1–10 replicas)
App Insights	Application monitoring and telemetry
Container Registry	ACR for Docker image hosting

Manual Deploy (CLI)

# 1. Create resource group
az group create --name rg-agentmemory --location westus2

# 2. Deploy infrastructure (Bicep)
az deployment group create \
  --resource-group rg-agentmemory \
  --template-file infra/main.bicep \
  --parameters baseName=agentmem environment=dev

# 3. Build & push Docker image to ACR
az acr build \
  --registry <your-acr-name> \
  --image agent-memory:latest \
  --file Dockerfile .

# 4. Update Container App with new image
az containerapp update \
  --name app-<baseName>-dev \
  --resource-group rg-agentmemory \
  --image <acr>.azurecr.io/agent-memory:latest

Infrastructure Modules

The infra/ directory contains 8 Bicep modules for full Azure deployment:

infra/
├── main.bicep                 # Orchestrator — wires all modules together
└── modules/
    ├── cosmos.bicep            # Cosmos DB NoSQL (serverless)
    ├── ai-search.bicep         # Azure AI Search
    ├── openai.bicep            # Azure OpenAI (conditional)
    ├── storage.bicep           # Blob Storage
    ├── container-app.bicep     # Container Apps + Environment
    ├── monitoring.bicep        # App Insights + Log Analytics
    └── networking.bicep        # VNet + private endpoints (optional)

How It Works

Memory Lifecycle

┌─────────────────────────────────────────────────────────────────┐
│                    OBSERVATION PIPELINE                          │
│                                                                 │
│  Raw Input ──→ Blob Archive ──→ LLM Compress ──→ Embed (3072d) │
│                                      │                          │
│                                      ▼                          │
│                              ┌───────────────┐                  │
│                              │ Compressed Obs │                  │
│                              │  • title       │                  │
│                              │  • content     │                  │
│                              │  • facts[]     │                  │
│                              │  • concepts[]  │                  │
│                              │  • importance  │                  │
│                              └───────┬───────┘                  │
│                                      │                          │
│                     ┌────────────────┼────────────────┐         │
│                     ▼                ▼                ▼         │
│              Cosmos DB          AI Search       Graph Extract   │
│              (store)         (vector index)    (fire & forget)  │
└─────────────────────────────────────────────────────────────────┘

Knowledge Graph

Entities and relationships are automatically extracted from every observation:

Nodes: Files, concepts, libraries, functions, people, errors, patterns, projects, decisions
Edges: imports, uses, depends_on, related_to, caused_by, solves, implements, tested_by
Deduplication: Nodes matched by name + type + tenantId; edges by source + target + type + tenantId
Weight: Edges start at 1.0, increment by 0.5 on re-observation (capped at 10)

Multi-Tenant Isolation

Every record is scoped by tenantId. Every query filters by it. Tenant A never sees Tenant B's data.

Tenant A                         Tenant B
├── Sessions (scoped)            ├── Sessions (scoped)
├── Observations                 ├── Observations
├── Memories                     ├── Memories
├── Knowledge Graph              ├── Knowledge Graph
└── Search Index (filtered)      └── Search Index (filtered)

Authentication uses Microsoft Entra ID JWT tokens. The x-tenant-id header provides tenant scoping. For development, set AUTH_DISABLED=true.

Real-Time Dashboard

A built-in web viewer is served at /viewer (root / redirects there). No separate process, no extra dependencies — it's a self-contained SPA served directly from the Fastify API.

Features:

View	What it shows
📊 Dashboard	Stats overview — sessions, observations, memories, graph node counts
💚 Health	Real-time status of all Azure services (Cosmos, AI Search, Blob)
📁 Sessions	Browse sessions with drill-down detail (project, model, tags, obs count)
👁 Observations	Browse by session — view compressed content, facts, concepts, importance
💡 Memories	List memories with strength bars, versioning, concept tags
🔍 Search	Semantic search across all observations and memories
🕸️ Knowledge Graph	Interactive force-directed graph visualization with click-to-inspect

The viewer supports dark/light theme toggle and auto-detects system preference. The API URL and tenant ID are configurable in the top bar.

vs Original agentmemory

This project takes the core concepts from rohitg00/agentmemory and re-engineers them for Azure enterprise deployment.

	agentmemory (original)	Enterprise Agent Memory (Azure)
Storage	SQLite on local disk	Azure Cosmos DB (serverless, globally distributed)
Vector Search	BM25 + local embeddings (MiniLM)	Azure AI Search (BM25 + vector, 3072-dim text-embedding-3-large)
LLM	Any OpenAI-compatible provider	Azure OpenAI (GPT-4o for compression + graph extraction)
Multi-tenancy	Single user	Full tenant isolation (Entra ID + tenantId scoping)
Scaling	Single process	Container Apps auto-scale (1–10 replicas)
Runtime	iii-engine (Rust binary required)	Pure Node.js — no external runtime needed
Knowledge Graph	Optional (iii-engine)	Auto-extraction on every observation (fire-and-forget)
Auth	HMAC secret	Microsoft Entra ID JWT + RBAC
Deployment	npm install / Docker Compose	One-click Azure deploy (Bicep IaC)
Compliance	—	GDPR purge endpoint, audit trail in Blob Storage
Viewer	Port 3113 (separate process)	Built-in at /viewer (same server, no proxy)
Tests	950+	102 (unit + integration, vitest)

API Reference

Base URL: https://your-app.azurecontainerapps.io/api/v1

All endpoints (except /health) require:

Authorization: Bearer token (Entra ID JWT) — or set AUTH_DISABLED=true for development
x-tenant-id: Tenant identifier header

Sessions

Method	Path	Description
`POST`	`/sessions`	Create a new agent session
`GET`	`/sessions`	List sessions (paginated)
`GET`	`/sessions/:id`	Get session by ID
`PATCH`	`/sessions/:id`	Update session metadata
`POST`	`/sessions/:id/end`	End session (set status to completed)

Observations

Method	Path	Description
`POST`	`/observations`	Capture observation → compress → embed → store → index → graph
`GET`	`/observations/:id`	Get observation by ID
`GET`	`/sessions/:id/observations`	List observations for a session

Memories

Method	Path	Description
`POST`	`/memories`	Create a memory (with embedding)
`GET`	`/memories`	List memories (paginated)
`GET`	`/memories/:id`	Get memory by ID
`PUT`	`/memories/:id/evolve`	Evolve memory (creates new version)
`DELETE`	`/memories/:id`	Forget memory (soft delete, sets strength to 0)

Search

Method	Path	Description
`POST`	`/search`	Hybrid search — BM25 + vector + semantic reranking

curl -X POST https://your-app.azurecontainerapps.io/api/v1/search \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: my-team" \
  -d '{"query": "how does authentication work", "limit": 5}'

Knowledge Graph

Method	Path	Description
`GET`	`/graph/nodes`	List graph nodes (filter by type)
`GET`	`/graph/edges`	List graph edges (filter by nodeId)
`POST`	`/graph/nodes`	Create node
`POST`	`/graph/edges`	Create edge
`POST`	`/graph/traverse`	BFS traversal from a start node
`POST`	`/graph/extract`	Extract entities from a single observation
`POST`	`/graph/extract-batch`	Extract entities from all observations in a session

Admin

Method	Path	Description
`GET`	`/health`	Health check — all Azure service statuses (no auth)
`GET`	`/admin/metrics`	Per-tenant usage metrics
`DELETE`	`/admin/tenant/:id`	GDPR purge — delete all data for a tenant

Azure Services

Service	Purpose	SKU	Scaling
Cosmos DB	Sessions, observations, memories, graph nodes/edges, audit	Serverless	Auto-scales RU/s per request
Azure AI Search	Hybrid search (BM25 + 3072-dim vector)	Basic → Standard	Add replicas for throughput, partitions for index size
Azure OpenAI	GPT-4o (compression, graph extraction) + text-embedding-3-large	Standard	TPM-based rate limiting
Blob Storage	Raw observation archive + audit trail	LRS	Unlimited
Container Apps	Stateless API runtime	Consumption	0.5 vCPU / 1GB → auto-scales to 10 replicas
App Insights	Distributed tracing + monitoring	—	—

Cost Estimate

Tier	Users	Monthly Cost (est.)
Dev	1–5	~$15–30 (serverless Cosmos + basic Search)
Team	5–50	~$80–150 (basic Search + moderate Cosmos RU)
Enterprise	50+	~$300+ (standard Search + provisioned Cosmos)

Configuration

Environment Variables

# Required — Azure service endpoints
COSMOS_ENDPOINT=https://your-cosmos.documents.azure.com:443/
AI_SEARCH_ENDPOINT=https://your-search.search.windows.net
STORAGE_ACCOUNT_URL=https://yourstorage.blob.core.windows.net

# Optional — Azure OpenAI (LLM features disabled without this)
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com
AZURE_OPENAI_API_KEY=your-key          # or use Managed Identity

# Optional — key-based auth for local development
COSMOS_KEY=your-cosmos-account-key
STORAGE_ACCOUNT_KEY=your-storage-key
AI_SEARCH_ADMIN_KEY=your-search-admin-key

# Server
PORT=8080                              # default: 8080
LOG_LEVEL=info                         # debug | info | warn | error
AUTH_DISABLED=true                     # skip Entra ID auth (dev only)

In production, the app uses Managed Identity (DefaultAzureCredential) — no keys needed. Set COSMOS_KEY / STORAGE_ACCOUNT_KEY only for local development where RBAC isn't configured.

Project Structure

├── src/
│   ├── types/              # 60+ domain model interfaces (tenantId multi-tenancy)
│   │   ├── models.ts       # Session, CompressedObservation, Memory, GraphNode, GraphEdge
│   │   └── api.ts          # Request/response types
│   ├── config/             # Zod-validated Azure config with graceful degradation
│   ├── adapters/           # Azure service adapters
│   │   ├── cosmos.adapter.ts          # Cosmos DB (key + DefaultAzureCredential)
│   │   ├── ai-search.adapter.ts       # AI Search (vector + BM25)
│   │   ├── azure-openai.adapter.ts    # OpenAI (compress, embed, graph extract)
│   │   ├── blob-storage.adapter.ts    # Blob Storage (archive, audit)
│   │   └── fabric/lakehouse.adapter.ts # Fabric Lakehouse (analytics)
│   ├── engine/             # Core memory pipeline
│   │   ├── observe.ts      # 7-step pipeline: archive → compress → embed → store → index → graph → audit
│   │   ├── compress.ts     # GPT-4o observation compression
│   │   ├── remember.ts     # Memory creation + versioning
│   │   ├── forget.ts       # Soft deletion (set strength to 0)
│   │   ├── search.ts       # Hybrid search orchestration
│   │   └── graph.ts        # Knowledge graph CRUD + entity extraction + deduplication
│   ├── middleware/          # Auth (Entra ID JWT), tenant isolation, rate limiting
│   ├── routes/             # 20+ Fastify route handlers
│   │   ├── sessions.routes.ts
│   │   ├── observations.routes.ts
│   │   ├── memories.routes.ts
│   │   ├── search.routes.ts
│   │   ├── graph.routes.ts
│   │   ├── admin.routes.ts
│   │   └── viewer.routes.ts    # Serves built-in dashboard
│   ├── viewer/
│   │   └── index.html          # Self-contained SPA dashboard
│   └── index.ts                # Fastify server entrypoint
├── infra/                      # 8 Bicep modules for Azure deployment
│   ├── main.bicep
│   └── modules/
├── src/__tests__/              # 102 tests (vitest)
│   ├── unit/                   # 9 unit test suites
│   └── integration/            # API integration tests
├── docs/
│   ├── PRD.md                  # Product requirements document
│   └── architecture.excalidraw
├── Dockerfile                  # Multi-stage Node 22 Alpine build
├── vitest.config.ts
└── package.json

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Run tests (npm test — all 102 should pass)
Commit your changes
Push to the branch and open a Pull Request

License

This project is licensed under the Apache License 2.0 — see the LICENSE file for details.

Acknowledgments

agentmemory by Rohit Ghumare — the original persistent memory system for AI coding agents that inspired this enterprise edition.
iii engine — the runtime that powers the original agentmemory.
Built with Azure Cosmos DB, Azure AI Search, Azure OpenAI, and Azure Container Apps.

_{Built with ❤️ by the Microsoft SE team · Powered by Azure}

Name		Name	Last commit message	Last commit date
Latest commit History 60 Commits
.github/workflows		.github/workflows
bench		bench
docs		docs
infra		infra
landing		landing
mcp		mcp
src		src
.dockerignore		.dockerignore
.env.azure.example		.env.azure.example
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Folders and files

Latest commit

History

Repository files navigation

Enterprise Agent Memory

Why Enterprise Agent Memory?

Architecture

Memory Pipeline

Works with Every Agent

Install

Prerequisites

Quick Start

Deploy to Azure

One-Click Deploy

Manual Deploy (CLI)

Infrastructure Modules

How It Works

Memory Lifecycle

Knowledge Graph

Multi-Tenant Isolation

Real-Time Dashboard

vs Original agentmemory

API Reference

Sessions

Observations

Memories

Search

Knowledge Graph

Admin

Azure Services

Cost Estimate

Configuration

Environment Variables

Project Structure

Contributing

License

Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages