A Japanese version of this README is also available.
Give Claude Code structured Codex traces, not raw output.
For Claude Code users who want GPT-5.4 as a real tool: claude-code-codex-agents parses the entire JSONL event stream from Codex CLI and returns a structured execution report -- which tools it used, which files it touched, how long it took, and what went wrong. No other Codex MCP bridge does this.
graph LR
A["Claude Code<br/>(Opus 4.6)"] -->|MCP Protocol| B["claude-code-codex-agents<br/>MCP Server"]
B -->|"subprocess + stdin"| C[Codex CLI]
C -->|JSONL stream| B
C -->|API call| D["OpenAI API<br/>(GPT-5.4)"]
B -->|Structured Report| A
Without claude-code-codex-agents -- You call Codex CLI and get a wall of text. You don't know which tools it used, which files it changed, or whether it actually succeeded.
With claude-code-codex-agents -- Claude Code gets a structured execution trace:
[Codex gpt-5.4] Completed
⏱ Execution time: 8.3s
🧵 Thread: 019d436e-4c39-7093-b7ed-f8a26aca7938
📦 Tools used (3):
  ✅ read_file → src/auth.py
  ✅ edit_file → src/auth.py
  ✅ shell → python -m pytest tests/
📁 Files touched (1):
  • src/auth.py
━━━ Codex Response ━━━
Fixed the authentication logic. Token validation order was incorrect.
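How a JSONL stream can be folded into a report like the one above is sketched here. This is an illustration only, not the code in `server.py`: the event shape (`type`, `tool`, `target`, `text` keys) and the names `Trace` and `parse_events` are assumptions made for the example, and the real Codex CLI event schema may differ.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical container for the fields a report needs."""
    tools: list[tuple[str, str]] = field(default_factory=list)
    files: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)
    message: str = ""


def parse_events(stream: str) -> Trace:
    """Fold a JSONL string into a Trace, skipping lines that are not JSON objects."""
    trace = Trace()
    for line in stream.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise in the stream
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "tool_call":  # assumed event name
            tool, target = event.get("tool", "?"), event.get("target", "")
            trace.tools.append((tool, target))
            # Only file tools contribute to the "files touched" list.
            if tool in {"read_file", "edit_file"} and target not in trace.files:
                trace.files.append(target)
        elif kind == "error":
            trace.errors.append(str(event.get("message", "")))
        elif kind == "message":
            trace.message = str(event.get("text", ""))
    return trace


sample = "\n".join([
    '{"type": "tool_call", "tool": "read_file", "target": "src/auth.py"}',
    '{"type": "tool_call", "tool": "edit_file", "target": "src/auth.py"}',
    '{"type": "tool_call", "tool": "shell", "target": "python -m pytest tests/"}',
    "not json",
    '{"type": "message", "text": "Fixed the authentication logic."}',
])
trace = parse_events(sample)
print(len(trace.tools), trace.files)
```

Keeping the parser tolerant of malformed lines means one bad event does not discard the rest of the trace.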
There are 6+ Codex MCP bridges on GitHub. Here's what makes this one different:
| | Other bridges | claude-code-codex-agents |
|---|---|---|
| Output | Raw text dump | Structured trace (tools, files, timing, errors) |
| Parallel tasks | 1 at a time | Up to 6 simultaneous |
| Session continuity | Stateless | threadId persistence across calls |
| Security | Pass-through | 3-tier sandbox + terminal injection prevention |
| Tests | Few or none | 59 tests (parsing, security, sessions, edge cases, agent lifecycle) |
| Review | Basic or none | Adversarial Review Loop (GPT-5.4 challenges Claude's code) |
- Full JSONL Trace Parsing -- Every Codex event (tool calls, file ops, errors) parsed into a structured report
- Parallel Execution -- Run up to 6 Codex tasks simultaneously via `parallel_execute`
- Session Management -- Continue previous threads with `session_continue` (threadId persistence)
- Agent Lifecycle -- Run Codex as a background Claude Code-style worker via `spawn_codex_agent`, `send_codex_agent_input`, and `wait_codex_agent`
- Adversarial Review Loop -- GPT-5.4 reviews Claude's code from a different perspective
- Sandbox Security -- 3-tier policy (read-only / workspace-write / danger-full-access) + terminal injection prevention
- Cross-Model Discussion -- Get GPT-5.4's opinion on design decisions via `discuss`
- Zero External Dependencies -- Just FastMCP + Codex CLI. No databases, no Docker, no config files
- Japanese Native -- Full Japanese prompt and report support
- 59 Tests -- Comprehensive coverage including security, parsing, session management, agent lifecycle, and edge cases
npm install -g @openai/codex
codex login
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync

Claude Code (~/.claude/settings.json):
{
  "mcpServers": {
    "claude-code-codex-agents": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
      "env": { "PYTHONUTF8": "1" }
    }
  }
}

Cursor (~/.cursor/mcp.json):
{
  "mcpServers": {
    "claude-code-codex-agents": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
      "env": { "PYTHONUTF8": "1" }
    }
  }
}

VS Code / Windsurf
Add to your MCP settings:
{
"claude-code-codex-agents": {
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}

| Tool | Description | Sandbox |
|---|---|---|
| `execute` | Delegate tasks to Codex with structured trace report | workspace-write |
| `trace_execute` | Same as `execute`, plus full event timeline | workspace-write |
| `parallel_execute` | Run up to 6 tasks simultaneously | read-only |
| `review` | Adversarial code review by GPT-5.4 | read-only |
| `explain` | Code explanation (brief/medium/detailed) | read-only |
| `generate` | Code generation with optional file output | workspace-write |
| `discuss` | Get GPT-5.4's perspective on design decisions | read-only |
| `session_continue` | Continue a previous Codex thread | workspace-write |
| `session_list` | List session history with thread IDs | - |
| `spawn_codex_agent` | Launch a background Codex worker with default / explorer / worker roles | role-based |
| `send_codex_agent_input` | Continue a background Codex worker with follow-up instructions | same as agent |
| `wait_codex_agent` | Wait for an agent turn and fetch the last structured result | - |
| `list_codex_agents` | Inspect tracked background Codex agents | - |
| `close_codex_agent` | Close an idle Codex agent | - |
| `status` | Check Codex CLI status and auth | - |

The new agent lifecycle tools let Claude Code treat Codex more like a persistent sub-agent than a one-shot CLI call.
- Use `spawn_codex_agent` to start a background worker with a role preset: `default` for balanced execution, `explorer` for read-heavy investigation, `worker` for implementation.
- Use `send_codex_agent_input` to continue the same worker after you read its last result.
- Use `wait_codex_agent` to poll for completion without blocking other work.
- Use `list_codex_agents` and `close_codex_agent` to manage idle workers.
Claude Code writes code, then asks GPT-5.4 to review it:
[Codex Review] GPT-5.4 Review Result
⏱ Execution time: 15.7s
━━━ Codex Response ━━━
- [CRITICAL] `run(cmd)` calls `os.system(cmd)` directly -- command injection
if `cmd` contains user input. Use `subprocess.run([...], shell=False)`.
- [WARNING] `divide(a, b)` raises ZeroDivisionError when b == 0.
Add a pre-check or explicit error message.
- [INFO] No type hints on function signatures. Add `def divide(a: float,
b: float) -> float:` for readability.
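Applying the three findings above might look like the following. The reviewed code itself is not shown in this README, so the signatures here are assumptions; in particular, `run` is changed to take an argument list rather than a command string.

```python
import subprocess
import sys


def run(argv: list[str]) -> str:
    """Run a command from an argument list; no shell parses the input."""
    result = subprocess.run(
        argv, shell=False, capture_output=True, text=True, check=True
    )
    return result.stdout


def divide(a: float, b: float) -> float:
    """Divide a by b, failing with an explicit message on a zero divisor."""
    if b == 0:
        raise ValueError("divide(): b must be non-zero")
    return a / b


# The current interpreter is used as a portable example command.
print(run([sys.executable, "-c", "print('ok')"]).strip())  # prints "ok"
print(divide(6, 3))  # prints 2.0
```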
Analyze multiple tasks simultaneously:
[Parallel Execution Complete] 3 tasks
━━━ Task 1 ✅ ━━━
Instruction: Analyze src/auth.py for security issues
⏱ 5.2s
...
━━━ Task 2 ✅ ━━━
Instruction: Review database query patterns in src/db.py
⏱ 7.8s
...
━━━ Task 3 ✅ ━━━
Instruction: Check error handling in src/api.py
⏱ 4.1s
...
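One common way to cap concurrency at six is a semaphore around each task. The sketch below assumes that approach; whether `parallel_execute` queues or rejects work beyond the limit is not stated in this README, and `run_task` / `run_all` are hypothetical names with a short sleep standing in for a Codex call.

```python
import asyncio

MAX_PARALLEL = 6  # the limit this README states for parallel_execute


async def run_task(instruction: str, limit: asyncio.Semaphore) -> str:
    """Run one task while holding a slot from the shared semaphore."""
    async with limit:
        await asyncio.sleep(0.01)  # stand-in for one Codex invocation
        return f"done: {instruction}"


async def run_all(instructions: list[str]) -> list[str]:
    """Run all tasks concurrently, at most MAX_PARALLEL at a time."""
    limit = asyncio.Semaphore(MAX_PARALLEL)
    # gather() returns results in the order the tasks were passed in.
    return await asyncio.gather(*(run_task(i, limit) for i in instructions))


results = asyncio.run(run_all([
    "Analyze src/auth.py for security issues",
    "Review database query patterns in src/db.py",
    "Check error handling in src/api.py",
]))
print(results)
```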
sequenceDiagram
participant C as Claude Code
participant H as claude-code-codex-agents
participant X as Codex CLI
participant O as OpenAI API
C->>H: MCP tool call (execute)
H->>H: _validate() + _enforce_sandbox()
H->>X: subprocess (stdin prompt)
X->>O: API request (GPT-5.4)
O-->>X: Response
X-->>H: JSONL event stream
H->>H: parse_jsonl_events() → CodexTrace
H->>H: _sanitize() → format_report()
H-->>C: Structured report
| Sandbox Mode | File Write | Shell Exec | Use Case |
|---|---|---|---|
| read-only | Blocked | Blocked | Review, explain, discuss |
| workspace-write | CWD only | Allowed | Execute, generate |
| danger-full-access | Anywhere | Allowed | Full system access (use with caution) |
Additional protections:
- ANSI/OSC escape sequence sanitization (terminal injection prevention)
- Input validation on all parameters
- Process kill on timeout
- `--ephemeral` flag (no persistent Codex state)
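Escape-sequence sanitization of the kind listed above is typically done with a regular expression over the model's output before it reaches the terminal. The following is a minimal sketch, not the project's `_sanitize()`: the patterns cover OSC, CSI, and two-character escapes, and the choice to drop every control character except tab and newline is an assumption.

```python
import re

_ESCAPES = re.compile(
    r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"  # OSC, terminated by BEL or ESC \
    r"|\x1b\[[0-?]*[ -/]*[@-~]"           # CSI (colors, cursor movement)
    r"|\x1b[@-Z\\-_]"                     # remaining two-character escapes
)
# Control characters other than tab (\x09) and newline (\x0a).
_CONTROLS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def sanitize(text: str) -> str:
    """Strip terminal escape sequences, then any leftover control characters."""
    return _CONTROLS.sub("", _ESCAPES.sub("", text))


print(sanitize("\x1b[31mred\x1b[0m"))         # prints "red"
print(sanitize("\x1b]0;evil title\x07safe"))  # prints "safe"
```

The OSC pattern is listed first because `ESC ]` would otherwise match the two-character rule and leave the payload text behind.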
# Setup
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync --extra dev
# Run tests (59 tests)
uv run pytest tests/ -v
# Run server directly
uv run python server.py

Project structure: Single file (`server.py`, ~820 lines). Easy to read, modify, and contribute.
- Cross-Model Code Review -- Claude writes code, GPT-5.4 reviews it. Reduces single-model bias.
- Parallel Codebase Analysis -- Analyze up to 6 files simultaneously, get structured reports for each.
- Design Discussion -- Get GPT-5.4's alternative perspective on architectural decisions via `discuss`.
- Session-Based Refactoring -- Large refactoring across multiple `session_continue` calls with context preservation.
- AI Second Opinion -- When Claude's answer seems off, ask GPT-5.4 for a sanity check.
- Python 3.12+
- Codex CLI (`npm install -g @openai/codex`)
- OpenAI account (Codex CLI must be authenticated via `codex login`)
- uv (recommended) or pip
- helix-ai-studio -- All-in-one AI chat studio with 7 providers, RAG, MCP tools, and pipeline
- helix-pilot -- GUI automation MCP server: AI controls Windows desktop via local Vision LLM
- helix-agent -- Extend Claude Code with local Ollama models, cutting token costs by 60-80%
- helix-sandbox -- Secure sandbox MCP server (Docker + Windows Sandbox)
- codex-plugin-cc -- Official OpenAI plugin for Claude Code
- codex-mcp-server -- Alternative Codex MCP bridge (Node.js)
