193 changes: 193 additions & 0 deletions docs/ai/design/2026-05-26-feature-agent-start.md
@@ -0,0 +1,193 @@
---
phase: design
title: System Design & Architecture
description: Define the technical architecture, components, and data models
---

# System Design & Architecture

## Architecture Overview

```mermaid
graph TD
CLI["agent start --type claude --name myagent --cwd /path"]
TmuxMgr["TmuxManager\n(new)"]
Registry["AgentRegistry\n(new)\n~/.ai-devkit/agents.json"]
Tmux["tmux session: myagent\n$ claude"]
AgentMgr["AgentManager\n(modified)"]
Adapters["ClaudeCodeAdapter / CodexAdapter\n(unchanged)"]
ListCmd["agent list / agent send / agent open\n(modified: registry-first name resolution)"]

CLI --> TmuxMgr
CLI --> Registry
TmuxMgr --> Tmux
ListCmd --> AgentMgr
AgentMgr --> Adapters
AgentMgr --> Registry
```

**`agent start` flow (happy path):**
1. Validate `--name` format, tmux availability, `--cwd` exists, name not live in registry
2. `TmuxManager.createSession(name, cwd)` — `tmux new-session -d -s <name> -c <cwd>`
3. `TmuxManager.sendKeys(name, agentCommand)` — `tmux send-keys -t <name> "<cmd>" Enter`
4. Poll up to 5s (500ms interval) via `TmuxManager.findAgentPid(name, AGENTS[type].matches)` for the agent process PID (walks the process tree to skip wrapper scripts)
5. `AgentRegistry.register({ name, pid, type, tmuxSession: name, cwd, startedAt })`
6. Print success output with attach command

**`agent start` flow (PID poll timeout):**
- If no matching agent PID is found within 5s: `tmux kill-session -t <name>`, exit non-zero with message "Agent process not found — verify `<cmd>` is in PATH"
- No orphaned tmux sessions left behind
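
The poll in step 4 and its timeout branch can be sketched as a small helper. This is a minimal sketch, not the PR's code: `pollForPid` and `PidProbe` are hypothetical names, and the probe stands in for a call to `TmuxManager.findAgentPid`.

```typescript
// Hypothetical probe: resolves to a PID once the agent process is visible, else null.
type PidProbe = () => Promise<number | null>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `probe` every `intervalMs` until it yields a PID or `timeoutMs` elapses.
// Defaults mirror the 5s / 500ms figures from the flow above.
async function pollForPid(
  probe: PidProbe,
  timeoutMs = 5000,
  intervalMs = 500,
): Promise<number | null> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const pid = await probe();
    if (pid !== null) return pid; // agent process detected
    if (Date.now() + intervalMs > deadline) return null; // the next probe would land past the deadline
    await sleep(intervalMs);
  }
}
```

A `null` result corresponds to the timeout flow: the caller kills the tmux session and exits non-zero.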

**`agent list` / `agent send` / `agent open` flow:**
- `AgentManager.listAgents()`: aggregate adapter results → overlay registry names by PID equality
- `AgentManager.resolveAgent(input, agents)`: registry lookup by name first → fall through to exact/partial name match

## Data Models

### Registry entry (`~/.ai-devkit/agents.json`)

```typescript
interface RegistryEntry {
  name: string;        // user-supplied, unique key
  type: AgentType;     // 'claude' | 'codex' | 'gemini_cli' | 'opencode'
  pid: number;         // agent process PID
  tmuxSession: string; // tmux session name (== name for `agent start` entries)
  cwd: string;         // resolved absolute path
  startedAt: string;   // ISO 8601
}

type RegistryFile = { entries: RegistryEntry[] };
```

**Identity = PID.** Liveness check is `process.kill(pid, 0)`. Simple, single-tier.

**Known limitation**: if the agent process restarts inside its tmux pane (e.g., user hits Ctrl+C and re-runs the binary), the stored PID is dead and the registry entry will be pruned. The pane and the new process are still there — the registry just loses the name. Recovery: `tmux kill-session -t <name>` and run `agent start` again. Acceptable trade-off for v1 simplicity.

**Registering the right PID**: the direct child of the tmux shell is often an npm bin shim or shell wrapper that `exec`s/forks into the real node process and exits. To avoid registering a wrapper PID that dies seconds later, `TmuxManager.findAgentPid` walks the process tree and returns the deepest descendant whose `ps` command line is accepted by the agent's matcher.

`prune()` removes entries that fail `isAlive`. Not load-bearing for correctness (every consumer guards stale entries via PID equality), so it runs opportunistically.
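
A minimal sketch of the liveness check and prune described above. The `Entry` shape is trimmed to the two fields the check needs, and treating `EPERM` as "alive" is an assumption that the document does not state.

```typescript
// Trimmed-down registry entry; the real RegistryEntry has more fields.
interface Entry {
  name: string;
  pid: number;
}

// Signal 0 performs existence and permission checks only; nothing is delivered.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // Assumption: EPERM means the process exists but is owned by another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Keeps only entries whose PID still passes the liveness check.
// `alive` is injectable so the function can be exercised without real processes.
function prune(entries: Entry[], alive: (pid: number) => boolean = isAlive): Entry[] {
  return entries.filter((entry) => alive(entry.pid));
}
```

Because PIDs can be reused by the OS, a check like this can report a recycled PID as alive; the document accepts PID-only identity for v1.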

## API Design

### `TmuxManager` (new — `packages/agent-manager/src/terminal/TmuxManager.ts`)

```typescript
class TmuxManager {
  isAvailable(): Promise<boolean>;
  sessionExists(name: string): Promise<boolean>;
  createSession(name: string, cwd: string): Promise<void>;
  sendKeys(session: string, keys: string): Promise<void>;
  killSession(name: string): Promise<void>;
  findAgentPid(session: string, matches: (psCommand: string) => boolean): Promise<number | null>;
}
```

- `createSession`: `tmux new-session -d -s <name> -c <cwd>`
- `sendKeys`: `tmux send-keys -t <session> <keys> Enter`
- `findAgentPid`: BFS walks the process tree from the pane's shell PID and returns the **deepest** descendant whose `ps` command line is accepted by the caller-supplied `matches` function. The "deepest match" rule handles both wrapper scripts above the agent and helper subprocesses (e.g. MCP servers) below it. `TmuxManager` is generic — agent-type knowledge lives in the matcher.
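
The "deepest match" rule can be illustrated on an in-memory process table. This is a sketch under assumptions: `ProcessTable` and `findDeepestMatch` are hypothetical names, the lookups are synchronous for brevity, and ties at the same depth go to the first node visited (the document does not specify tie-breaking).

```typescript
type Matcher = (psCommand: string) => boolean;

// Hypothetical abstraction over the OS process tree.
interface ProcessTable {
  childrenOf(pid: number): number[]; // could be backed by `pgrep -P <pid>`
  commandOf(pid: number): string;    // could be backed by `ps -p <pid> -o command=`
}

interface Candidate {
  pid: number;
  depth: number;
}

// Breadth-first walk from the pane's shell PID; returns the deepest matching descendant.
function findDeepestMatch(rootPid: number, table: ProcessTable, matches: Matcher): number | null {
  let best: Candidate | null = null;
  const queue: Candidate[] = [{ pid: rootPid, depth: 0 }];
  const seen = new Set<number>([rootPid]);

  while (queue.length > 0) {
    const node = queue.shift();
    if (!node) break;
    // The root (the pane's shell) is never a candidate itself.
    if (node.depth > 0 && matches(table.commandOf(node.pid))) {
      if (best === null || node.depth > best.depth) best = node;
    }
    for (const child of table.childrenOf(node.pid)) {
      if (!seen.has(child)) {
        seen.add(child);
        queue.push({ pid: child, depth: node.depth + 1 });
      }
    }
  }
  return best === null ? null : best.pid;
}
```

With a tree of shell → shim → agent → helper, a matcher that accepts the shim and the agent but not the helper selects the agent, which is the outcome the rule is designed for.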

### `AGENTS` registry (new — `packages/agent-manager/src/utils/agents.ts`)

```typescript
export type StartableAgentType = Exclude<AgentType, 'other'>;

export interface AgentConfig {
  command: string;                          // shell command sent to tmux
  matches: (psCommand: string) => boolean;  // identifies the process in `ps` output
}

export const AGENTS: Record<StartableAgentType, AgentConfig> = {
  claude:     { command: 'claude',   matches: matchArgv0('claude') },
  codex:      { command: 'codex',    matches: matchArgv0('codex') },
  opencode:   { command: 'opencode', matches: matchArgv0('opencode') },
  gemini_cli: { command: 'gemini',   matches: matchAnyToken('gemini') },
};
```

Each agent owns both its launch command and a matcher that knows that agent's distribution quirks. Most agents ship as npm bin shims where `argv[0]` is the executable name (`matchArgv0`). Gemini ships as a Node script, so `ps` shows it as `node /path/to/gemini ...` and the real binary basename lives in `argv[1..]` — `matchAnyToken` scans every token.
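
One plausible shape for the two matcher factories is sketched below. The bodies are an assumption (the document only names `matchArgv0` and `matchAnyToken` and describes their intent), and the whitespace tokenization would misbehave on paths that contain spaces.

```typescript
type Matcher = (psCommand: string) => boolean;

// Last path segment of a token, e.g. "/opt/homebrew/bin/gemini" -> "gemini".
const basename = (token: string): string => token.slice(token.lastIndexOf("/") + 1);

// Splits a `ps` command line on whitespace, dropping empty tokens.
const tokenize = (psCommand: string): string[] =>
  psCommand.trim().split(/\s+/).filter(Boolean);

// Accepts when the basename of the first token equals `name`.
const matchArgv0 = (name: string): Matcher => (psCommand) => {
  const [first] = tokenize(psCommand);
  return first !== undefined && basename(first) === name;
};

// Accepts when the basename of any token equals `name`.
const matchAnyToken = (name: string): Matcher => (psCommand) =>
  tokenize(psCommand).some((token) => basename(token) === name);
```

For the Gemini case from the paragraph above, `matchArgv0('gemini')` rejects `node /opt/homebrew/bin/gemini` because the first token is `node`, while `matchAnyToken('gemini')` accepts it.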

### `AgentRegistry` (new — `packages/agent-manager/src/utils/AgentRegistry.ts`)

```typescript
class AgentRegistry {
  register(entry: RegistryEntry): void;
  lookup(name: string): RegistryEntry | null;
  list(): RegistryEntry[];
  prune(): void;
  isAlive(entry: RegistryEntry): boolean; // kill(pid, 0)
}
```

Path: `~/.ai-devkit/agents.json`. Creates directory/file if absent. Writes are atomic (write to `.tmp` then rename).
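
The write-then-rename pattern might look like the following sketch. `writeJsonAtomic` is a hypothetical helper name, and the usage lines write to a throwaway temp directory rather than the real registry path.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Writes `data` as JSON to `target` by writing a sibling .tmp file and renaming it.
// rename replaces the target in one step when both paths are on the same filesystem.
function writeJsonAtomic(target: string, data: unknown): void {
  fs.mkdirSync(path.dirname(target), { recursive: true }); // idempotent
  const tmp = `${target}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2));
  fs.renameSync(tmp, target);
}

// Usage: create an empty registry under a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-registry-"));
const file = path.join(dir, "nested", "agents.json");
writeJsonAtomic(file, { entries: [] });
```

A reader that opens the file during a write sees either the old or the new contents, never a half-written document. Two concurrent writers still share the same `.tmp` name, which is consistent with the document's note that locking is out of scope.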

**Concurrent access:** two simultaneous `agent start` calls with the same `--name` may both pass the name-free check before either writes. This is an acceptable edge case (rare, self-correcting on next prune). File locking is not implemented in this iteration.

### CLI subcommands (`packages/cli/src/commands/agent.ts`)

```
agent start
  --type <type>   required   'claude' | 'codex' | 'gemini_cli' | 'opencode'
  --name <name>   optional   lowercase alphanumeric + hyphens, 2-64 chars; default: {folder}-{timestamp}
  --cwd <path>    optional   defaults to process.cwd()
```

**Output on success:**
```
Agent "myagent" started (claude, PID 12345)
Working directory: ~/projects/myapp
Attach: tmux attach -t myagent
```

### `AgentManager` modifications

**Constructor injection:**
```typescript
class AgentManager {
  constructor(private registry: AgentRegistry = AgentRegistry.default()) {}
}
```
`AgentRegistry.default()` returns a module-level singleton backed by the default path `~/.ai-devkit/agents.json`. Tests can inject a custom instance.

**`listAgents()`:** after aggregating adapter results, for each `AgentInfo` whose `pid` matches a registry entry, override `info.name` with the registry name.

**`resolveAgent(input, agents)`:** before exact/partial name matching, check `registry.lookup(input)`; if found and its PID appears in the agent list, return that agent.
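
The overlay and the registry-first lookup can be sketched as two pure functions. The `AgentInfo` and `RegistryEntry` shapes here are trimmed to `name` and `pid`, and the fallback (exact match, then a unique substring match) is an assumption about the existing matching behaviour, which the document does not spell out.

```typescript
interface AgentInfo {
  name: string;
  pid: number;
}

interface RegistryEntry {
  name: string;
  pid: number;
}

// Replaces adapter-derived names with registry names wherever the PIDs are equal.
function overlayNames(agents: AgentInfo[], entries: RegistryEntry[]): AgentInfo[] {
  const nameByPid = new Map(entries.map((e) => [e.pid, e.name] as const));
  return agents.map((agent) => {
    const registered = nameByPid.get(agent.pid);
    return registered === undefined ? agent : { ...agent, name: registered };
  });
}

// Registry lookup first; a stale entry (PID not in the live list) falls through.
function resolveAgent(
  input: string,
  agents: AgentInfo[],
  entries: RegistryEntry[],
): AgentInfo | null {
  const entry = entries.find((e) => e.name === input);
  if (entry) {
    const live = agents.find((a) => a.pid === entry.pid);
    if (live) return live;
  }
  const exact = agents.find((a) => a.name === input);
  if (exact) return exact;
  // Assumption: an ambiguous substring match resolves to nothing.
  const partial = agents.filter((a) => a.name.includes(input));
  return partial.length === 1 ? partial[0] : null;
}
```

The PID-equality guard is what makes pruning optional for correctness: an entry whose process has died never matches a live agent.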

**`createAgentManager()` CLI helper** passes `AgentRegistry.default()` explicitly to ensure the same instance is used across `agent start`, `agent list`, and `agent send` within a process lifetime.

## Component Breakdown

| Component | Location | Change |
|---|---|---|
| `TmuxManager` | `packages/agent-manager/src/terminal/TmuxManager.ts` | New |
| `AgentRegistry` | `packages/agent-manager/src/utils/AgentRegistry.ts` | New |
| `AGENTS` registry | `packages/agent-manager/src/utils/agents.ts` | New |
| `agent start` subcommand | `packages/cli/src/commands/agent.ts` | New subcommand |
| `AgentManager.listAgents()` | `packages/agent-manager/src/AgentManager.ts` | Registry name overlay by PID |
| `AgentManager.resolveAgent()` | `packages/agent-manager/src/AgentManager.ts` | Registry-first lookup |
| Package exports | `packages/agent-manager/src/index.ts` | Export new classes and `AGENTS` |
| `createAgentManager()` (CLI helper) | `packages/cli/src/commands/agent.ts` | Pass registry instance |

## Design Decisions

**Registry over adapter modification:** Agent detection adapters are intentionally decoupled from identity management. Injecting the registry overlay into `AgentManager` (not adapters) keeps adapters stateless and testable.

**Name = tmux session name:** Keeping them identical simplifies lookup and the `tmux attach -t <name>` hint.

**Walk the process tree to find the real agent PID:** The shell's direct child is often an npm bin shim or shell wrapper that exits after spawning the real node process, and the real agent may itself spawn helper subprocesses (e.g. MCP servers). `findAgentPid` BFS-walks descendants and returns the **deepest** descendant whose `ps` command line passes the per-agent matcher. This handles both shapes uniformly. Without it, the registered PID could be either the dead wrapper or a transient helper child.

**PID-only identity (deferred: pane-based identity):** The simplest design that works. Trade-off: if the user kills and re-runs the agent inside the same tmux pane, the registry entry becomes stale (its stored PID is dead). User recovers by killing the tmux session and re-running `agent start`. A future iteration can add `paneId` to the schema for in-pane restart resilience and to enable an `agent name` command for labeling existing agents.

**Atomic registry writes:** Prevents corrupt JSON if the process is killed mid-write.

**`--type` allowlist at CLI:** Accepted types are the keys of `AGENTS` (exported from `agent-manager`). Each entry pairs a launch command with a `ps`-output matcher. No shell interpolation of user input.

**`pgrep` dependency:** `pgrep -P <pid>` is used by `findAgentPid` to walk the process tree. `pgrep` is available on macOS (built-in) and Linux (via `procps`). Soft dependency — if absent, `agent start` will fail at the PID poll step with a clear error.

**1 session = 1 agent:** each `agent start` creates one dedicated tmux session named after the agent. Sessions are independent — killing one does not affect others. This is the confirmed model (vs. shared session with windows).

## Non-Functional Requirements

- Registry reads/writes complete in <50ms (local file, small JSON)
- `agent start` completes (session created + process detected) in <3s on a typical machine
- No new external runtime dependencies (tmux is a system tool, not an npm package)
- Name validation: `/^[a-z0-9][a-z0-9-]{0,62}[a-z0-9]$/` (lowercase alphanumeric + hyphens, 2-64 chars) to be safe as tmux session names
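
A quick check of what the name pattern accepts. `NAME_REGEX` is the identifier the implementation guide uses; `isValidAgentName` is a hypothetical wrapper added for illustration.

```typescript
// First and last characters must be [a-z0-9]; up to 62 characters of [a-z0-9-] in between,
// which gives a total length of 2 to 64.
const NAME_REGEX = /^[a-z0-9][a-z0-9-]{0,62}[a-z0-9]$/;

function isValidAgentName(name: string): boolean {
  return NAME_REGEX.test(name);
}

// A few representative inputs.
const samples = ["myagent", "my-agent-2", "a", "My-Agent", "-lead", "trail-"];
for (const sample of samples) {
  console.log(`${sample}: ${isValidAgentName(sample)}`);
}
```

Note that single-character names fail because the pattern requires distinct first and last characters, and uppercase input is rejected rather than lowercased.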
100 changes: 100 additions & 0 deletions docs/ai/implementation/2026-05-26-feature-agent-start.md
@@ -0,0 +1,100 @@
---
phase: implementation
title: Implementation Guide
description: Technical implementation notes, patterns, and code guidelines
---

# Implementation Guide

## Development Setup

- tmux must be installed and on `PATH` (`brew install tmux` / `apt install tmux`).
- `pgrep` must be on `PATH` (default everywhere except minimal containers — see `procps-ng` on Alpine).
- Repo install: `npm install` at the monorepo root; nx wires the two affected packages (`packages/agent-manager`, `packages/cli`).
- Run the local CLI: `cd packages/cli && npm run dev -- agent start --type claude`.

## Code Structure

```
packages/agent-manager/src/
  utils/AgentRegistry.ts           ← new: JSON file at ~/.ai-devkit/agents.json
  utils/agents.ts                  ← new: AGENTS registry { command, matches }
  terminal/TmuxManager.ts          ← new: tmux wrapper + findAgentPid BFS
  AgentManager.ts                  ← modified: optional AgentRegistry, name overlay, registry-first resolve
  index.ts                         ← modified: export new modules + types

packages/cli/src/
  commands/agent.ts                ← modified: `agent start` subcommand (parse, validate, format)
  services/agent/agent.service.ts  ← modified: + `startAgent` orchestration + typed errors
```

## Implementation Notes

### `AgentRegistry` (`utils/AgentRegistry.ts`)
- Plain JSON file at `~/.ai-devkit/agents.json`, schema `{ entries: RegistryEntry[] }`.
- Atomic writes: `writeFileSync(tmp)` then `renameSync(tmp, target)`. `mkdirSync(dir, { recursive: true })` on every write — cheap and idempotent.
- `readFile` is tolerant: malformed JSON or non-array `entries` returns `{ entries: [] }`. No crash on corrupt state.
- `isAlive` is just `process.kill(pid, 0)` — no `startedAt` cross-check (the original `lstart`/`etimes` cross-check was removed after the macOS timezone bug; see planning doc).
- `static default()` returns a module-level singleton at the default path; tests inject a custom path via the constructor.
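
The tolerant read can be sketched as a pure parser over the file contents. `parseRegistry` is a hypothetical name, and this version checks only the top-level shape; it does not validate individual entries.

```typescript
interface RegistryEntry {
  name: string;
  type: string;
  pid: number;
  tmuxSession: string;
  cwd: string;
  startedAt: string;
}

type RegistryFile = { entries: RegistryEntry[] };

// Returns an empty registry for malformed JSON or a non-array `entries` field.
function parseRegistry(text: string): RegistryFile {
  try {
    const parsed: unknown = JSON.parse(text);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      Array.isArray((parsed as { entries?: unknown }).entries)
    ) {
      return { entries: (parsed as RegistryFile).entries };
    }
  } catch {
    // Malformed JSON: fall through to the empty registry.
  }
  return { entries: [] };
}
```

Keeping the parser separate from file I/O makes the corrupt-state cases easy to unit test without touching `~/.ai-devkit`.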

### `TmuxManager` (`terminal/TmuxManager.ts`)
- Thin wrapper around `child_process.execFile` (promisified). No shell interpolation — all args passed as an array.
- `findAgentPid` BFS:
  1. Read pane PID via `tmux list-panes -t <session> -F '#{pane_pid}'`.
  2. Walk descendants with `pgrep -P <pid>`.
  3. For each non-root node, run `ps -p <pid> -o command=` and call `matches(command)`.
  4. Track the **deepest** matching node. Return it after BFS exhausts the tree.
- Returns `null` when no descendant matches yet — caller polls (5s / 500ms).
- `TmuxManager` has no agent-type knowledge; the matcher comes from `AGENTS[type].matches`.
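
A sketch of the subprocess plumbing these notes describe, assuming promisified `execFile` with argument arrays. The helper names (`newSessionArgs`, `sendKeysArgs`, `parsePids`, `childPids`) are hypothetical, and `childPids` deliberately treats every `pgrep` failure as "no children", which is a simplification.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Argument arrays go straight to execFile, so no shell ever parses user input.
const newSessionArgs = (name: string, cwd: string): string[] =>
  ["new-session", "-d", "-s", name, "-c", cwd];

const sendKeysArgs = (session: string, keys: string): string[] =>
  ["send-keys", "-t", session, keys, "Enter"];

// Parses newline-separated PIDs, as printed by `pgrep -P <pid>`
// or `tmux list-panes -t <session> -F '#{pane_pid}'`.
function parsePids(stdout: string): number[] {
  return stdout
    .split("\n")
    .map((line) => Number.parseInt(line.trim(), 10))
    .filter((pid) => Number.isInteger(pid) && pid > 0);
}

// Direct children of `pid`. pgrep exits non-zero when nothing matches,
// which makes the promisified call reject; this sketch maps that to [].
async function childPids(pid: number): Promise<number[]> {
  try {
    const { stdout } = await run("pgrep", ["-P", String(pid)]);
    return parsePids(stdout);
  } catch {
    return [];
  }
}
```

A session would then be created with `run("tmux", newSessionArgs(name, cwd))`, and the agent launched with `run("tmux", sendKeysArgs(name, command))`.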

### `AGENTS` registry (`utils/agents.ts`)
- One `{ command, matches }` per `StartableAgentType` (= `AgentType` minus `'other'`).
- `matchArgv0(name)`: returns a matcher that accepts a command line when the basename of its first whitespace-delimited token equals `name`. Works for npm bin shims where the shim execs into node with the binary name preserved (claude, codex, opencode).
- `matchAnyToken(name)`: returns a matcher that accepts a command line when the basename of any token equals `name`. Needed for gemini, which ships as a Node script: `ps` shows `node /opt/homebrew/bin/gemini ...`, so the real binary basename is in `argv[1]`, not `argv[0]`.

### `AgentManager` changes (`AgentManager.ts`)
- Constructor now takes an optional `AgentRegistry`, defaulting to `AgentRegistry.default()`. Lets tests inject a stub and ensures the CLI uses one shared instance via `createAgentManager()`.
- `listAgents()`: after aggregating adapter results, for each agent whose `pid` matches a registry entry, overwrite `agent.name` with the registry name. Registry is **not** pruned on read — that runs opportunistically in `agent start`.
- `resolveAgent(input, agents)`: registry-first lookup (by exact name match), then falls through to the existing exact / substring matching on `AgentInfo.name`.

### `agent start` CLI (`commands/agent.ts`)
The handler is a thin shell: parse options, default name if `--name` omitted, validate input format (type ∈ `AGENTS`, name matches `NAME_REGEX`, cwd exists), then delegate to the service. Typed errors from the service are mapped to `ui.error(...)` + `process.exit(1)`. Success prints name, type, PID, cwd, and the `tmux attach` hint.

### `startAgent` service (`services/agent/agent.service.ts`)
Orchestration lives here so it's unit-testable independent of the CLI:
1. `tmux.isAvailable()` — throws `TmuxUnavailableError`.
2. `registry.prune()` then `registry.lookup(name)` — throws `AgentNameInUseError(name, pid)` if live.
3. `tmux.sessionExists(name)` — if true, fire `onWarning(...)` and `tmux.killSession(name)` to replace the orphan.
4. `tmux.createSession(name, cwd)` → `tmux.sendKeys(name, agent.command)`.
5. Poll `tmux.findAgentPid(name, agent.matches)` every 500ms for up to 5s.
6. On success: `registry.register({ name, pid, type, tmuxSession: name, cwd, startedAt })`, return the entry.
7. On timeout: `tmux.killSession(name)`, throw `AgentPidPollTimeoutError(name, command, timeoutMs)`.

`startAgent` lives in the same file as `waitForAgentResponse` because both orchestrate tmux, registry, and adapter primitives for the `agent ...` subcommands.
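
The seven steps can be sketched with injected dependencies so the flow is testable with fakes. This is an illustration under assumptions, not the PR's code: the `Deps` shape, the `waitForPid` dependency (which stands in for the 5s / 500ms poll), and the error messages are hypothetical; only the error class names and step order come from the list above.

```typescript
interface Entry {
  name: string; pid: number; type: string;
  tmuxSession: string; cwd: string; startedAt: string;
}

interface Deps {
  tmux: {
    isAvailable(): Promise<boolean>;
    sessionExists(name: string): Promise<boolean>;
    createSession(name: string, cwd: string): Promise<void>;
    sendKeys(session: string, keys: string): Promise<void>;
    killSession(name: string): Promise<void>;
  };
  registry: { prune(): void; lookup(name: string): Entry | null; register(entry: Entry): void };
  waitForPid(session: string): Promise<number | null>; // hypothetical wrapper around the poll
  onWarning(message: string): void;
}

class TmuxUnavailableError extends Error {
  constructor() { super("tmux is not available on PATH"); this.name = "TmuxUnavailableError"; }
}
class AgentNameInUseError extends Error {
  constructor(agent: string, pid: number) {
    super(`Agent "${agent}" is already running (PID ${pid})`);
    this.name = "AgentNameInUseError";
  }
}
class AgentPidPollTimeoutError extends Error {
  constructor(agent: string, command: string, timeoutMs: number) {
    super(`No process for "${agent}" after ${timeoutMs}ms; verify ${command} is in PATH`);
    this.name = "AgentPidPollTimeoutError";
  }
}

async function startAgent(
  deps: Deps,
  opts: { name: string; type: string; command: string; cwd: string; timeoutMs?: number },
): Promise<Entry> {
  const { tmux, registry } = deps;
  if (!(await tmux.isAvailable())) throw new TmuxUnavailableError();      // step 1
  registry.prune();                                                        // step 2
  const existing = registry.lookup(opts.name);
  if (existing) throw new AgentNameInUseError(opts.name, existing.pid);
  if (await tmux.sessionExists(opts.name)) {                               // step 3
    deps.onWarning(`Replacing orphaned tmux session "${opts.name}"`);
    await tmux.killSession(opts.name);
  }
  await tmux.createSession(opts.name, opts.cwd);                           // step 4
  await tmux.sendKeys(opts.name, opts.command);
  const pid = await deps.waitForPid(opts.name);                            // step 5
  if (pid === null) {                                                      // step 7
    await tmux.killSession(opts.name);
    throw new AgentPidPollTimeoutError(opts.name, opts.command, opts.timeoutMs ?? 5000);
  }
  const entry: Entry = {                                                   // step 6
    name: opts.name, pid, type: opts.type, tmuxSession: opts.name,
    cwd: opts.cwd, startedAt: new Date().toISOString(),
  };
  registry.register(entry);
  return entry;
}
```

Passing the collaborators in as plain objects keeps the CLI handler thin: it only has to map the three error names to `ui.error(...)` output.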

## Integration Points

- `AgentRegistry.default()` is the single source of truth across the process. `createAgentManager()` passes it to `AgentManager`, and `agent start` reads/writes it directly.
- Existing adapters (`ClaudeCodeAdapter`, `CodexAdapter`, `GeminiCliAdapter`, `OpenCodeAdapter`) are **unchanged**. The registry is a pure overlay.
- `@ai-devkit/agent-manager` exports the new public surface from `src/index.ts`: `AgentRegistry`, `TmuxManager`, `AGENTS`, `AgentConfig`, `StartableAgentType`, `RegistryEntry`.

## Error Handling

- All CLI failures use `ui.error(...)` + `process.exit(1)` to match the surrounding command style.
- tmux subprocess failures are normalized: `isAvailable`/`sessionExists` return booleans; `killSession` swallows "already gone" errors; `findAgentPid` returns `null` rather than throwing so the poll loop can retry.
- Registry I/O failures (missing file, bad JSON) degrade silently to an empty registry — the agent flow does not depend on stored state to succeed.
- No retries for tmux commands; a transient tmux failure surfaces as an `agent start` failure with the original tmux error chained.

## Performance Considerations

- `findAgentPid` is the hot path during the 5s startup poll. Each BFS visit runs two subprocesses (`pgrep`, `ps`). Typical trees are 1–3 nodes, so total cost per poll is ~10–30ms.
- An earlier optimization tried a single-`ps`-snapshot strategy and 200ms polling; it didn't improve real-world wall time (the agent process simply isn't there sooner) and was reverted.
- Registry reads parse a small JSON file; writes are atomic. No concurrency control by design; see "Concurrent access" in the design doc.

## Security Notes

- No shell interpolation: every tmux/ps/pgrep call uses `execFile` with an args array; user input never reaches a shell.
- `--type` is allowlisted against `AGENTS` keys before any subprocess runs.
- `--name` is regex-validated (`/^[a-z0-9][a-z0-9-]{0,62}[a-z0-9]$/`) so it's safe as a tmux session name and a filename fragment.
- `--cwd` is `path.resolve`-d and `fs.existsSync`-checked before use.
- The registry file lives in the user's home directory; no secrets are written — only `name`, `type`, `pid`, `tmuxSession`, `cwd`, `startedAt`.