tsunamayo7/claude-code-codex-agents

claude-code-codex-agents

MIT License Python 3.12+ Tests MCP Compatible

A Japanese version of this README is also available.

Give Claude Code structured Codex traces, not raw output.

For Claude Code users who want GPT-5.4 as a real tool: claude-code-codex-agents parses the entire JSONL event stream from Codex CLI and returns a structured execution report -- which tools it used, which files it touched, how long it took, and what went wrong. No other Codex MCP bridge does this.

Architecture Overview

graph LR
    A["Claude Code<br/>(Opus 4.6)"] -->|MCP Protocol| B["claude-code-codex-agents<br/>MCP Server"]
    B -->|"subprocess + stdin"| C[Codex CLI]
    C -->|JSONL stream| B
    C -->|API call| D["OpenAI API<br/>(GPT-5.4)"]
    B -->|Structured Report| A

Without vs With claude-code-codex-agents

Without -- You call Codex CLI and get a wall of text. You don't know what tools it used, what files it changed, or if it actually succeeded.

With claude-code-codex-agents -- Claude Code gets a structured execution trace:

[Codex gpt-5.4] Completed

⏱ Execution time: 8.3s
🧵 Thread: 019d436e-4c39-7093-b7ed-f8a26aca7938

📦 Tools used (3):
  ✅ read_file -- src/auth.py
  ✅ edit_file -- src/auth.py
  ✅ shell -- python -m pytest tests/

📁 Files touched (1):
  • src/auth.py

━━━ Codex Response ━━━
Fixed the authentication logic. Token validation order was incorrect.

Why claude-code-codex-agents?

There are 6+ Codex MCP bridges on GitHub. Here's what makes this one different:

| | Other bridges | claude-code-codex-agents |
|---|---|---|
| Output | Raw text dump | Structured trace (tools, files, timing, errors) |
| Parallel tasks | 1 at a time | Up to 6 simultaneous |
| Session continuity | Stateless | threadId persistence across calls |
| Security | Pass-through | 3-tier sandbox + terminal injection prevention |
| Tests | Few or none | 59 tests (parsing, security, sessions, edge cases, agent lifecycle) |
| Review | Basic or none | Adversarial Review Loop (GPT-5.4 challenges Claude's code) |

Key Features

  • Full JSONL Trace Parsing -- Every Codex event (tool calls, file ops, errors) parsed into a structured report
  • Parallel Execution -- Run up to 6 Codex tasks simultaneously via parallel_execute
  • Session Management -- Continue previous threads with session_continue (threadId persistence)
  • Agent Lifecycle -- Run Codex as a background Claude Code-style worker via spawn_codex_agent, send_codex_agent_input, and wait_codex_agent
  • Adversarial Review Loop -- GPT-5.4 reviews Claude's code from a different perspective
  • Sandbox Security -- 3-tier policy (read-only / workspace-write / danger-full-access) + terminal injection prevention
  • Cross-Model Discussion -- Get GPT-5.4's opinion on design decisions via discuss
  • Zero External Dependencies -- Just FastMCP + Codex CLI. No databases, no Docker, no config files
  • Japanese Native -- Full Japanese prompt and report support
  • 59 Tests -- Comprehensive coverage including security, parsing, session management, agent lifecycle, and edge cases

Quick Start

1. Install Codex CLI

npm install -g @openai/codex
codex login

2. Install claude-code-codex-agents

git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync

3. Add to your MCP client

Claude Code (~/.claude/settings.json):

{
  "mcpServers": {
    "claude-code-codex-agents": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
      "env": { "PYTHONUTF8": "1" }
    }
  }
}

Cursor (~/.cursor/mcp.json):

{
  "mcpServers": {
    "claude-code-codex-agents": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
      "env": { "PYTHONUTF8": "1" }
    }
  }
}

VS Code / Windsurf

Add to your MCP settings:

{
  "claude-code-codex-agents": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
    "env": { "PYTHONUTF8": "1" }
  }
}

Tools

| Tool | Description | Sandbox |
|---|---|---|
| execute | Delegate tasks to Codex with structured trace report | workspace-write |
| trace_execute | Same as execute, plus full event timeline | workspace-write |
| parallel_execute | Run up to 6 tasks simultaneously | read-only |
| review | Adversarial code review by GPT-5.4 | read-only |
| explain | Code explanation (brief/medium/detailed) | read-only |
| generate | Code generation with optional file output | workspace-write |
| discuss | Get GPT-5.4's perspective on design decisions | read-only |
| session_continue | Continue a previous Codex thread | workspace-write |
| session_list | List session history with thread IDs | - |
| spawn_codex_agent | Launch a background Codex worker with default / explorer / worker roles | role-based |
| send_codex_agent_input | Continue a background Codex worker with follow-up instructions | same as agent |
| wait_codex_agent | Wait for an agent turn and fetch the last structured result | - |
| list_codex_agents | Inspect tracked background Codex agents | - |
| close_codex_agent | Close an idle Codex agent | - |
| status | Check Codex CLI status and auth | - |

Claude Code-Style Agents

The new agent lifecycle tools let Claude Code treat Codex more like a persistent sub-agent than a one-shot CLI call.

  • Use spawn_codex_agent to start a background worker with a role preset: default for balanced execution, explorer for read-heavy investigation, worker for implementation.
  • Use send_codex_agent_input to continue the same worker after you read its last result.
  • Use wait_codex_agent to poll for completion without blocking other work.
  • Use list_codex_agents and close_codex_agent to manage idle workers.

Real-World Example: Adversarial Code Review

Claude Code writes code, then asks GPT-5.4 to review it:

[Codex Review] GPT-5.4 Review Result

⏱ Execution time: 15.7s

━━━ Codex Response ━━━
- [CRITICAL] `run(cmd)` calls `os.system(cmd)` directly -- command injection
  if `cmd` contains user input. Use `subprocess.run([...], shell=False)`.

- [WARNING] `divide(a, b)` raises ZeroDivisionError when b == 0.
  Add a pre-check or explicit error message.

- [INFO] No type hints on function signatures. Add `def divide(a: float,
  b: float) -> float:` for readability.
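
For illustration, the three findings above might be addressed roughly as follows. The reviewed code is not shown in this README, so the function bodies here are assumptions reconstructed from the review text alone.

```python
import subprocess
import sys


def run(args: list[str]) -> str:
    """Run a command from an argument list, so no shell ever parses the input."""
    completed = subprocess.run(
        args, shell=False, capture_output=True, text=True, check=True
    )
    return completed.stdout


def divide(a: float, b: float) -> float:
    """Divide a by b, failing with an explicit message on a zero divisor."""
    if b == 0:
        raise ValueError("divide(): b must not be zero")
    return a / b


# Usage: the interpreter itself serves as a portable example command.
output = run([sys.executable, "-c", "print('ok')"])
quotient = divide(6, 3)
```

Passing a list with `shell=False` means user-supplied text stays a single argument instead of being interpreted as shell syntax, which is the core of the CRITICAL finding.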

Real-World Example: Parallel Execution

Analyze multiple tasks simultaneously:

[Parallel Execution Complete] 3 tasks

━━━ Task 1 ✅ ━━━
Instruction: Analyze src/auth.py for security issues
⏱ 5.2s
...

━━━ Task 2 ✅ ━━━
Instruction: Review database query patterns in src/db.py
⏱ 7.8s
...

━━━ Task 3 ✅ ━━━
Instruction: Check error handling in src/api.py
⏱ 4.1s
...
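
One common way to cap concurrency at a fixed number of tasks is an `asyncio.Semaphore`. The sketch below assumes that approach; this README does not say how parallel_execute enforces its limit of 6, or whether extra tasks are queued or rejected. The task body is a placeholder sleep, not a Codex run.

```python
import asyncio

MAX_PARALLEL = 6  # the limit stated in this README


async def run_task(instruction: str, gate: asyncio.Semaphore, active: dict) -> dict:
    """Placeholder for one Codex run; sleeps instead of spawning a subprocess."""
    async with gate:  # at most MAX_PARALLEL tasks pass this point at once
        active["now"] += 1
        active["peak"] = max(active["peak"], active["now"])
        await asyncio.sleep(0.01)
        active["now"] -= 1
        return {"instruction": instruction, "ok": True}


async def run_all(instructions: list[str]) -> tuple[list[dict], int]:
    gate = asyncio.Semaphore(MAX_PARALLEL)
    active = {"now": 0, "peak": 0}
    # gather() returns results in the order the instructions were given.
    results = await asyncio.gather(
        *(run_task(text, gate, active) for text in instructions)
    )
    return list(results), active["peak"]


# Eight tasks are submitted; the semaphore makes the extra two wait their turn.
reports, peak = asyncio.run(run_all([f"task {n}" for n in range(8)]))
```

Because `gather` preserves input order, the reports can be numbered Task 1, Task 2, and so on regardless of which task finishes first.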

Architecture

sequenceDiagram
    participant C as Claude Code
    participant H as claude-code-codex-agents
    participant X as Codex CLI
    participant O as OpenAI API

    C->>H: MCP tool call (execute)
    H->>H: _validate() + _enforce_sandbox()
    H->>X: subprocess (stdin prompt)
    X->>O: API request (GPT-5.4)
    O-->>X: Response
    X-->>H: JSONL event stream
    H->>H: parse_jsonl_events() → CodexTrace
    H->>H: _sanitize() → format_report()
    H-->>C: Structured report

Security Model

| Sandbox Mode | File Write | Shell Exec | Use Case |
|---|---|---|---|
| read-only | Blocked | Blocked | Review, explain, discuss |
| workspace-write | CWD only | Allowed | Execute, generate |
| danger-full-access | Anywhere | Allowed | Full system access (use with caution) |

Additional protections:

  • ANSI/OSC escape sequence sanitization (terminal injection prevention)
  • Input validation on all parameters
  • Process kill on timeout
  • --ephemeral flag (no persistent Codex state)

Development

# Setup
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync --extra dev

# Run tests (59 tests)
uv run pytest tests/ -v

# Run server directly
uv run python server.py

Project structure: Single file (server.py, ~820 lines). Easy to read, modify, and contribute.

Use Cases

  1. Cross-Model Code Review -- Claude writes code, GPT-5.4 reviews it. Eliminates single-model bias.
  2. Parallel Codebase Analysis -- Analyze 6 files simultaneously, get structured reports for each.
  3. Design Discussion -- Get GPT-5.4's alternative perspective on architectural decisions via discuss.
  4. Session-Based Refactoring -- Large refactoring across multiple session_continue calls with context preservation.
  5. AI Second Opinion -- When Claude's answer seems off, ask GPT-5.4 for a sanity check.

Requirements

  • Python 3.12+
  • Codex CLI (npm install -g @openai/codex)
  • OpenAI account (Codex CLI must be authenticated via codex login)
  • uv (recommended) or pip

Related Projects

Helix Ecosystem

  • helix-ai-studio -- All-in-one AI chat studio with 7 providers, RAG, MCP tools, and pipeline
  • helix-pilot -- GUI automation MCP server: AI controls the Windows desktop via a local Vision LLM
  • helix-agent -- Extend Claude Code with local Ollama models and cut token costs by 60-80%
  • helix-sandbox -- Secure sandbox MCP server (Docker + Windows Sandbox)

License

MIT
