
feat(providers): add OpenAI Codex provider with Responses API support #128

Draft · b3nw wants to merge 9 commits into Mirrowel:dev from b3nw:feature/codex-all

Conversation


@b3nw (Contributor) commented Feb 15, 2026

Add full Codex provider implementation with OAuth-based authentication, Responses API integration, and quota tracking.

New files:

  • codex_provider.py: Codex provider with Responses API streaming/non-streaming
  • openai_oauth_base.py: Reusable OAuth base for OpenAI-style authentication
  • codex_quota_tracker.py: Header-based quota tracking with tier detection
  • codex_prompt.txt: System prompt template for Codex interactions

Integration points:

  • Register codex in credential_manager, credential_tool, provider_factory
  • Inject usage manager references into providers that support it
  • Skip limitless quota groups (rotation-only) in TUI summary/detail views

Important

Adds OpenAI Codex provider with OAuth authentication, Responses API support, and quota tracking.

  • Behavior:
    • Adds CodexProvider in codex_provider.py for OpenAI Codex models with OAuth authentication and Responses API.
    • Supports streaming and non-streaming responses, reasoning output, and tool calling.
    • Integrates quota tracking via CodexQuotaTracker.
  • Authentication:
    • Implements OAuth2 with PKCE in openai_oauth_base.py for token management (a minimal PKCE sketch follows this summary).
    • Supports token refresh and API key exchange.
  • Quota Management:
    • Adds codex_quota_tracker.py for tracking rate limits and credits.
    • Updates quota from API headers and stores baselines in UsageManager.
  • Integration:
    • Registers Codex in provider_factory.py and updates credential management.
    • Modifies quota_viewer.py to skip limitless quota groups in views.

This description was created by Ellipsis for 993793c. You can customize this summary. It will automatically update as commits are pushed.
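
For reference, here is a minimal, self-contained sketch of PKCE verifier/challenge generation (RFC 7636, S256 method) as mentioned under Authentication above. The function name is illustrative and does not reproduce the actual openai_oauth_base.py code.

import base64
import hashlib
import secrets

def generate_pkce_pair() -> tuple[str, str]:
    # code_verifier: high-entropy URL-safe string (43-128 chars per RFC 7636)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # code_challenge: BASE64URL(SHA-256(verifier)), i.e. the S256 method
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge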


b3nw commented Feb 15, 2026

@mirrobot-agent gimme a review ya lazy bot

@mirrobot-agent

@b3nw, I'm on it! I'll dive into the Codex provider implementation and give you a thorough review. 🕵️‍♂️

@mirrobot-agent bot left a comment

Overall Assessment

The OpenAI Codex provider implementation is exceptionally thorough and well-integrated into the existing rotator architecture. The move to the Responses API allows for advanced features like reasoning output and streaming tool calls, which are critical for agentic workflows. The OAuth implementation with PKCE is robust and handles headless environments elegantly.

Architectural Feedback

  • Separation of Concerns: The use of mixins (OpenAIOAuthBase, CodexQuotaTracker) keeps the main CodexProvider class focused on request/response translation.
  • Resilience: The token refresh logic and re-auth coordinator handle credential expiration gracefully.
  • Quota Tracking: Using virtual models (_5h_window, _weekly_window) is a clever way to expose Codex's unique rate limit windows to the TUI.

Key Suggestions

  • Error Handling: Refine the generic exceptions in the streaming loop to include more context (status codes, etc.).
  • Fragile Parsing: Move away from regex-based retry-after extraction if the API provides structured alternatives.
  • Backend API usage: Double-check the stability/longevity of the chatgpt.com/backend-api endpoints as they appear to be internal.

Questions for the Author

  • GPT-5 Models: Are the gpt-5 model names intended as placeholders or are they targeting a specific internal preview?
  • System Message Conversion: Is the conversion of system messages to user messages a strict requirement of the Responses API, or a strategy to bypass the fixed instructions field?

This review was generated by an AI assistant.

error_message = error.get("message", "Response failed")

if error_message:
raise Exception(f"Codex response failed: {error_message}")

Consider using a more specific exception type here, or including the HTTP status code if available, to help with debugging and error handling downstream.
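
A hypothetical shape for such an exception (the class name and fields are illustrative, not taken from the PR):

class CodexResponseError(Exception):
    """Carries the HTTP status alongside the message for easier downstream handling."""

    def __init__(self, message: str, status_code: int | None = None):
        self.status_code = status_code
        suffix = f" (HTTP {status_code})" if status_code is not None else ""
        super().__init__(f"Codex response failed: {message}{suffix}")

# At the raise site quoted above (assuming the status code is available from the response):
# raise CodexResponseError(error_message, status_code=response.status_code)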


# Try to extract from message
import re
match = re.search(r"try again in (\d+)s", message)

Parsing rate limit retry times from strings can be fragile. If the API provides a Retry-After header or a structured field in the JSON error response, those would be safer to use.
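
A hedged sketch of that preference order, assuming the parsed response headers are available as a dict (the helper name is hypothetical):

import re
from typing import Optional

def extract_retry_after(headers: dict, message: str) -> Optional[float]:
    # Prefer the structured Retry-After header when the API sends one
    retry_after = headers.get("Retry-After") or headers.get("retry-after")
    if retry_after is not None:
        try:
            return float(retry_after)
        except (TypeError, ValueError):
            pass
    # Fall back to the existing regex on the error message
    match = re.search(r"try again in (\d+)s", message)
    return float(match.group(1)) if match else None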

# - If base contains /backend-api: use /wham/usage (ChatGptApi style)
# - Otherwise: use /api/codex/usage (CodexApi style)
# Since we use chatgpt.com/backend-api, we need /wham/usage
CODEX_USAGE_URL = "https://chatgpt.com/backend-api/wham/usage"

This looks like an internal backend API endpoint. Ensure that the proxy is intended to hit this directly and that there are no public-facing alternatives that might be more stable.

if item.get("type") == "function_call":
call_id = item.get("call_id") or item.get("id", "")
name = item.get("name", "")
arguments = item.get("arguments", "")

User-provided system messages are converted to user messages here. While this ensures compatibility with the Responses API's strict instruction field, it might affect models that rely heavily on the system role for personality/constraints. It's worth confirming if this is the intended behavior for all Codex-style models.
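
For clarity, a minimal sketch of the conversion being described (the helper name is hypothetical; the actual provider code may differ):

def convert_system_messages(messages: list[dict]) -> list[dict]:
    converted = []
    for msg in messages:
        if msg.get("role") == "system":
            # Re-emit the system prompt as a user turn for Responses API compatibility
            converted.append({"role": "user", "content": msg.get("content", "")})
        else:
            converted.append(msg)
    return converted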

@mirrobot-agent

@b3nw, I've finished the review! Overall, it's a very solid implementation. I've left some detailed feedback and a few questions in the formal review. Let me know if you want me to look into anything else! 🚀

@Mirrowel added the enhancement, Agent Monitored, and Priority labels on Feb 15, 2026
@mirrobot-agent

I'm starting my review of the OpenAI Codex provider integration. I see some new retry logic for handling garbled tool calls in the Responses API - checking the implementation details now!

@mirrobot-agent bot left a comment

Follow-up Review: Codex Provider & Responses API

The implementation of the OpenAI Codex provider is very thorough and the integration with the Responses API is well-handled. The latest additions introducing retry logic for garbled tool calls are a welcome step toward production-level reliability.

Assessment of New Changes:

  • Retry Logic: The introduction of _stream_with_retry and _non_stream_with_retry correctly addresses the intermittent "ChatGPT internal format" reversion issue.
  • Marker Detection: The use of the GARBLED_TOOL_CALL_MARKER is a practical heuristic for detecting when the model has entered an invalid state.

Key Suggestions:

  • Streaming Duplication (High Priority): As noted in the inline comment, retrying a stream after yielding chunks will cause duplication of the initial content (like reasoning tokens) in the client's view. This should be addressed by buffering the start of the stream or implementing a "reset" signal for the client if possible.
  • Fragmented Marker Detection: The current check might miss markers split across two network packets/chunks.
  • Cleanup: There is a minor unreachable code path in the non-streaming retry wrapper.

Note on Previous Feedback:
The previous suggestions regarding specific exception types (line 1186), fragile regex-based retry extraction (line 1242), and the stability of internal backend API endpoints still apply and should be considered for future hardening.

This review was generated by an AI assistant.

Comment on lines 713 to 760
for attempt in range(GARBLED_TOOL_CALL_MAX_RETRIES):
    garbled_detected = False

    try:
        async for chunk in self._stream_response(
            client, headers, payload, model, reasoning_compat, credential_path
        ):
            # Inspect content deltas for garbled tool call marker
            if hasattr(chunk, "choices") and chunk.choices:
                choice = chunk.choices[0]
                delta = getattr(choice, "delta", None)
                if delta:
                    content = getattr(delta, "content", None) or ""
                    if GARBLED_TOOL_CALL_MARKER in content:
                        garbled_detected = True
                        lib_logger.warning(
                            f"[Codex] Garbled tool call detected in stream for {model}, "
                            f"attempt {attempt + 1}/{GARBLED_TOOL_CALL_MAX_RETRIES}. "
                            f"Content snippet: {content[:100]!r}"
                        )
                        break  # Stop consuming this stream

            yield chunk

        if not garbled_detected:
            return  # Stream completed successfully

    except Exception:
        if garbled_detected:
            # Exception during stream teardown after garble detected - continue to retry
            pass
        else:
            raise  # Non-garble exception - propagate

    # Garbled stream detected - retry if we have attempts left
    if attempt < GARBLED_TOOL_CALL_MAX_RETRIES - 1:
        lib_logger.info(
            f"[Codex] Retrying request for {model} after garbled tool call "
            f"(attempt {attempt + 2}/{GARBLED_TOOL_CALL_MAX_RETRIES})"
        )
        await asyncio.sleep(GARBLED_TOOL_CALL_RETRY_DELAY)
    else:
        lib_logger.error(
            f"[Codex] Garbled tool call persisted after {GARBLED_TOOL_CALL_MAX_RETRIES} "
            f"attempts for {model}. Giving up."
        )
        # Final attempt already yielded whatever chunks it had
        return

There is a significant duplication risk in the streaming retry logic. Since yield chunk occurs inside the same loop that detects the garbled marker, any chunks yielded before the marker is found (e.g., reasoning tokens or early valid text) will be sent to the client. When the request is retried, the client will receive those same chunks again from the new stream, leading to duplicated content in the UI.

Consider buffering the early chunks or only retrying if no content has been yielded yet. If retrying mid-stream is unavoidable, the client should be notified to clear the previous partial response.
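
One possible shape for the "only retry if nothing has been yielded" variant (a sketch with assumed names and marker value; the PR later adopts a buffer-then-flush approach instead):

from typing import AsyncIterator, Callable

GARBLED_TOOL_CALL_MARKER = "+#+#"  # marker value assumed for illustration
MAX_RETRIES = 3

async def stream_with_guard(make_stream: Callable[[], AsyncIterator[str]]) -> AsyncIterator[str]:
    for attempt in range(MAX_RETRIES):
        yielded_any = False
        garbled = False
        async for chunk in make_stream():
            if GARBLED_TOOL_CALL_MARKER in chunk:
                garbled = True
                break
            yielded_any = True
            yield chunk
        if not garbled:
            return
        if yielded_any:
            # Partial output already reached the client; a silent retry would duplicate it
            raise RuntimeError("garbled tool call detected after partial output")
        # Nothing yielded yet, so a retry is invisible to the client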

delta = getattr(choice, "delta", None)
if delta:
    content = getattr(delta, "content", None) or ""
    if GARBLED_TOOL_CALL_MARKER in content:

Checking for GARBLED_TOOL_CALL_MARKER in individual chunks may fail if the marker (+#+#) is split across two consecutive chunks (e.g., +# followed by +#).

Consider maintaining a small rolling buffer of the last few characters to ensure the marker is detected even when fragmented.
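
A minimal sketch of such a rolling-buffer check (the marker value is assumed and the class name is illustrative):

class MarkerDetector:
    """Detects a marker even when it is split across consecutive stream chunks."""

    def __init__(self, marker: str = "+#+#"):
        self.marker = marker
        self._tail = ""  # last len(marker) - 1 characters of the previous window

    def feed(self, chunk: str) -> bool:
        window = self._tail + chunk
        found = self.marker in window
        keep = len(self.marker) - 1
        self._tail = window[-keep:] if keep > 0 else ""
        return found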

return response

# Should not reach here, but return the last response as fallback
return response

This line is unreachable because the return response at line 805 is inside the loop and covers the final iteration fallback.

@b3nw force-pushed the feature/codex-all branch from 41085fe to f3739c2 on February 15, 2026 at 16:07
@mirrobot-agent bot left a comment

Follow-up Review: Anthropic & Cursor Integration

The PR continues to expand with robust implementations for Anthropic (OAuth + API key) and Cursor providers. The new Log Viewer TUI is also a great addition for observability.

Assessment of New Changes:

  • Anthropic Provider: The dual routing (OAuth/API key) is well-handled. The interleaved thinking support is correctly implemented.
  • Cursor Provider: Solid integration with the sidecar and clever usage of a virtual model for quota tracking.
  • Log Viewer: A very functional addition. The lazy loading of metadata and prompts is a good choice for performance.
  • Request Transforms: Centralized transforms for Kimi and GLM-5/4 help with model-specific quirks.

Note on Previous Feedback:
The critical issues raised in the previous review for src/rotator_library/providers/codex_provider.py (specifically the Streaming Duplication risk on line 760 and the Fragmented Marker detection on line 726) still apply as the file has not been modified in the latest commits. These should be addressed to ensure reliability in production scenarios.

This review was generated by an AI assistant.

ANTHROPIC_MESSAGES_ENDPOINT = f"{ANTHROPIC_API_BASE}/v1/messages"

# Required headers for OAuth requests
ANTHROPIC_BETA_HEADER = "oauth-2025-04-20,interleaved-thinking-2025-05-14"

The date 2025-05-14 in the ANTHROPIC_BETA_HEADER seems very far in the future. Is this a placeholder or a requirement for a specific private beta?

ANTHROPIC_USER_AGENT = "claude-cli/2.1.2 (external, cli)"

# Tool name prefix for OAuth path
TOOL_PREFIX = "mcp_"

Why is the TOOL_PREFIX set to mcp_? If this refers to "Model Context Protocol", it might be worth adding a comment explaining the choice, as it might affect client expectations for tool naming.

from rich.syntax import Syntax


def _get_logs_dir() -> Path:

This function re-implements logic already available in src/rotator_library/utils/paths.py. It is better to import get_logs_dir from there to ensure consistency and avoid duplication.
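
Assuming the package root is importable as rotator_library, the replacement would be roughly:

from rotator_library.utils.paths import get_logs_dir

logs_dir = get_logs_dir()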

@b3nw force-pushed the feature/codex-all branch from f3739c2 to 8b094e3 on February 15, 2026 at 18:39
@mirrobot-agent bot left a comment

Follow-up Review: Codex Garble Handling Refinements

The latest updates to the Codex provider significantly harden the garbled tool call detection and retry logic. The move to a buffer-then-flush strategy for streaming is a particularly robust fix for the duplication risks previously identified.

Assessment of New Changes:

  • Robust Detection: Expanding GARBLED_TOOL_CALL_MARKERS and implementing accumulated_text tracking ensures that garbled output is caught even when markers are split across chunks.
  • Streaming Reliability: The buffer-then-flush approach in _stream_with_retry is a great solution. While it adds latency to the initial chunk, it guarantees that no "corrupted" or duplicated content reaches the client.
  • Diagnostics: The new logging for response.completed will be very helpful for tracking down why the model sometimes reverts to ChatML-style tool calls.

Note on Previous Feedback:

  • Resolved: The high-priority issue regarding Streaming Duplication and the concern about Fragmented Markers have been successfully addressed.
  • Still Applies: The minor unreachable code path in _non_stream_with_retry (line 863) still exists. Additionally, the previous suggestions regarding specific exception types (line 1255 of codex_provider.py in the full file) and regex-based retry extraction in the quota tracker remain relevant for future hardening.

This review was generated by an AI assistant.

return response

# Should not reach here, but return the last response as fallback
return response

This line remains unreachable because the return response at line 860 covers all exit paths from the loop. While minor, removing it would clean up the implementation.

Comment on lines +728 to +743
Uses a buffer-then-flush approach: all chunks are collected first,
then checked for the garbled marker. Only if the stream is clean
are chunks yielded to the caller. This allows true retry since
no chunks have been sent to the HTTP client yet.

Detection is done both per-chunk (for early abort) AND on the
accumulated text after stream completion (to catch markers that
are split across multiple SSE chunks).
"""
for attempt in range(GARBLED_TOOL_CALL_MAX_RETRIES):
    garbled_detected = False
    buffered_chunks: list = []
    accumulated_text = ""  # Track all text content across chunks

    try:
        async for chunk in self._stream_response(

The buffer-then-flush strategy effectively addresses the streaming duplication risk identified in the previous review. Note that this does mean real-time streaming is disabled for the client as they must wait for the entire response to be generated before any chunks are yielded. Given the intermittent nature of the Responses API garbling, this seems like the correct trade-off for reliability.

buffered_chunks.append(chunk)

# Post-stream check: inspect accumulated text for markers split across chunks
if not garbled_detected and _is_garbled_tool_call(accumulated_text):

Using accumulated_text for a post-stream check is a great addition to catch garble markers that might be split across network packets. This addresses the fragmented marker concern from the previous review.

@b3nw force-pushed the feature/codex-all branch from 8b094e3 to 64c0a37 on February 15, 2026 at 18:40
@mirrobot-agent bot left a comment

Follow-up Review

This update focuses on cleaning up diagnostic logging within the Codex provider's streaming logic.

Assessment of New Changes:

  • Log Cleanup: The removal of the verbose lib_logger.info block in the response.completed handler is a sensible refinement now that the Responses API integration is stable.

Overall Status:
The implementation remains robust. The core logic for handling garbled tool calls and the buffer-then-flush streaming strategy are well-executed.

Note on Previous Feedback:
The following items from previous reviews remain unaddressed:

  • codex_provider.py: The unreachable code path at line 863 and the use of generic exceptions at lines 1123 and 1242.
  • codex_quota_tracker.py: The recommendation to favor structured retry-after data over regex parsing where the API allows.
  • anthropic_provider.py: The questions regarding the future-dated beta header (2025-05-14) and the choice of the mcp_ tool prefix.
  • log_viewer.py: The suggestion to use the centralized get_logs_dir utility for path consistency.

Great work on the cleanup!

This review was generated by an AI assistant.

b3nw and others added 9 commits February 15, 2026 19:43
… support (#33)

* feat(rotator_library): ✨ add OpenAI Codex provider with Responses API support

Adds a new provider for OpenAI Codex models (GPT-5, GPT-5.1, GPT-5.2, Codex, Codex Mini)
via the ChatGPT Responses API with OAuth PKCE authentication.

Key features:
- OAuth base class for OpenAI authentication with PKCE flow and token refresh
- Responses API streaming with SSE event handling
- Reasoning/thinking output with configurable effort levels (minimal to xhigh)
- Tool calling support translated from OpenAI format
- System prompt validation using official opencode prompt
- Usage tracking with proper litellm.Usage objects

Files added:
- codex_provider.py: Main provider implementation
- openai_oauth_base.py: OAuth base class with PKCE support
- codex_prompt.txt: Required system prompt for API validation

* fix(codex): 🐛 prefix model names with provider identifier

Models must be returned with the 'codex/' prefix (e.g., 'codex/gpt-5.2')
to match the convention used by other providers like antigravity.
This ensures proper provider routing in the RotatingClient.

* feat(codex): ✨ add reasoning effort model variants

Exposes reasoning effort levels as model variants:
- codex/gpt-5.2:low, :medium, :high, :xhigh
- codex/gpt-5.1:low, :medium, :high
- And similar for all codex models

This allows clients to control reasoning effort by model name,
similar to how gemini models use :thinking suffix.

Total models: 9 base + 30 reasoning variants = 39 models

* feat(codex): ✨ add identity override for user system prompt precedence

Injects an identity override as the first user message that tells the model
to prioritize user-provided system prompts over the required opencode
instructions. This mirrors the pattern used by Antigravity provider.

Message order:
1. Identity override (<system_override> tag)
2. User's system message (converted to user message)
3. Rest of conversation

Controlled by CODEX_INJECT_IDENTITY_OVERRIDE env var (default: true)

* docs: 📝 add Codex provider configuration to .env.example

Documents all Codex environment variables:
- CODEX_REASONING_EFFORT: low/medium/high/xhigh
- CODEX_REASONING_SUMMARY: auto/concise/detailed/none
- CODEX_REASONING_COMPAT: think-tags/raw/none
- CODEX_INJECT_IDENTITY_OVERRIDE: true/false
- CODEX_INJECT_INSTRUCTION: true/false
- CODEX_EMPTY_RESPONSE_ATTEMPTS: retry count
- CODEX_EMPTY_RESPONSE_RETRY_DELAY: seconds
- CODEX_OAUTH_PORT: callback port

* fix(codex): 🐛 register Codex in provider factory for OAuth TUI

Adds Codex to PROVIDER_MAP so it appears in the credential tool's
"Add OAuth Credential" menu alongside other OAuth providers.

* feat(codex): ✨ add quota tracking with rate limit header parsing

Integrate CodexQuotaTracker mixin into CodexProvider to track API usage
via response headers and periodic quota API calls. Enables smart
credential selection based on remaining quota and automatic cooldowns
when limits are approached.

Signed-off-by: Moeeze Hassan <fammas.maz@gmail.com>

* feat(codex): add gpt-5.3-codex to model list and reasoning config

* feat(codex): wire quota header updates to UsageManager for TUI display

- Add set_usage_manager() to CodexQuotaTracker mixin
- Push parsed rate limit headers to UsageManager via update_quota_baseline
- Inject UsageManager reference during RotatingClient initialization
- Enables quota status display in TUI after each Codex API response

* feat(codex): default to sequential credential selection for token caching

Sequential mode reuses the same credential across requests, enabling
upstream token caching on the Codex API.

* fix(codex): standardize ChatGPT-Account-Id header casing

Match the exact capitalization used by the Codex CLI, consistent with
codex_quota_tracker.py. Addresses PR review feedback.
…, and reasoning config

Squash of feature/codex-openai-auth branch improvements:
- Fix model list: remove non-existent codex-mini, add gpt-5.3-codex-spark (Pro-only)
- Add quota tracking with rate limit header parsing
- Wire quota header updates to UsageManager for TUI display
- Standardize ChatGPT-Account-Id header casing
- Add identity override for user system prompt precedence
- Add reasoning effort model variants with gpt-5.3 support
- Default to sequential credential selection for token caching
- Update .env.example documentation
…e_manager

Align method signature with the calling convention in rotating_client.py.
Also update update_quota_baseline calls to use the current keyword arg API.
Squash of fix/oauth-docker-url-paste branch.
Allows OAuth completion via URL paste when redirect callbacks are
unavailable (e.g., Docker containers without exposed callback ports).
Combines work from:
- feature/codex-openai-auth (12 commits)
- fix/think-tag-spacing (2 commits)
- fix/codex-cached-tokens-and-rotation (1 commit)

Changes include:
- OpenAI Codex provider with Responses API support
- Reasoning effort model variants
- Identity override for user system prompt precedence
- Provider factory registration for OAuth TUI
- Quota tracking with rate limit header parsing
- UsageManager wire-up for TUI display
- Sequential credential selection for token caching
- ChatGPT-Account-Id header standardization
- Correct model list with gpt-5.3-codex-spark
- Stream reasoning as reasoning_content instead of think-tags
- Map cached tokens from Responses API
- codex-mini alias and shared sequential rotation
- Refactored openai_oauth_base.py for cleaner auth flow
…-codex

gpt-5.3-codex-spark doesn't exist as a distinct model. Aliases now
redirect spark requests to gpt-5.3-codex instead.
…iew items

Root cause: Missing text.verbosity parameter in Responses API payload.
Without it, the model defaulted to a verbose text mode that caused:
1. Tool calls emitted as garbled text content (+#+#, ♀♀♀♀, etc.)
2. Model announcing actions in text then stopping (finish_reason=stop)

Changes:
- Add text.verbosity: 'medium' to Codex Responses API payload
- Fix garbled tool call detection: delta is a dict, not object
- Add multi-marker detection: +#+#, to=functions., ♀♀♀♀
- Add accumulated text check for markers split across SSE chunks
- Buffer-then-flush retry: buffer all chunks before yielding to allow retry
- Replace generic Exception with StreamedAPIError for response.failed
- Remove unreachable return statement in _non_stream_with_retry
…ntial rotation

When Codex returns 'exceeds the context window' via response.failed SSE event,
the error was classified as 'unknown' and rotated through all 5 credentials
pointlessly. Now StreamedAPIError is classified as context_window_exceeded
or invalid_request — both non-rotatable — so it fails immediately.

Also applies PR review fixes to codex_provider.py:
- Replace generic Exception with StreamedAPIError for response.failed
- Remove unreachable return in _non_stream_with_retry
- Import StreamedAPIError from core.errors
…ls on startup

- Codex Responses API requires 'instructions' field to be non-optional.
  Always include it in payload; fall back to CODEX_SYSTEM_INSTRUCTION
  if no caller system prompt is provided. Fixes 'Instructions are required'
  400 errors.

- On first background job run, fetch quota baselines for ALL credentials
  and apply cooldowns for any that are exhausted. This prevents wasting
  a request attempt on a depleted credential after container restart.
@b3nw force-pushed the feature/codex-all branch from 64c0a37 to 73b27bf on February 15, 2026 at 22:04

b3nw commented Feb 15, 2026

@mirrobot-agent major cleanup and fixed a stall issue around instructions.

Repository owner deleted a comment from mirrobot-agent bot Feb 16, 2026
Repository owner deleted a comment from b3nw Feb 16, 2026
@mirrobot-agent bot left a comment

Follow-up Review: Codex Implementation Refined

The latest changes significantly harden the OpenAI Codex provider. The refactoring of instruction handling and the introduction of a robust streaming retry strategy address the most critical stability concerns raised in previous reviews.

Assessment of New Changes:

  • Retry Logic: The buffer-then-flush approach in _stream_with_retry successfully eliminates the risk of yielding duplicated or garbled content to the client.
  • Instruction Handling: Utilizing the instructions field of the Responses API directly (with proper fallbacks) is a much cleaner approach than converting system messages to user messages and appears to have resolved the stall issues.
  • Error Handling: Replacing generic Exception calls with StreamedAPIError allows the rotator library to correctly classify and handle model-generated errors.
  • Cleanup: Unreachable code paths have been removed, and the overall provider logic is more concise.

Key Suggestions:

  • Regex Parsing (Minor): The parse_quota_error method still relies on regex for retry-after extraction. While functional, favoring headers (if the Responses API eventually provides them) would be more robust.
  • Scope Clarification: I noticed that the Anthropic and Cursor providers, along with the Log Viewer, were deleted in this update. I assume this is a deliberate move to narrow the PR scope to Codex only, but please confirm if these were intended to be removed.

Overall Status:
The Codex implementation is now very stable and follows project conventions well. The addition of the bot management infrastructure also looks like a great step forward for the repository's maintenance.

This review was generated by an AI assistant.

accumulated text after stream completion (to catch markers that
are split across multiple SSE chunks).
"""
for attempt in range(GARBLED_TOOL_CALL_MAX_RETRIES):

The buffer-then-flush strategy in _stream_with_retry is a robust solution to the streaming duplication risk. It ensures that no garbled content is yielded to the client before detection is complete.

# The Codex Responses API requires the 'instructions' field — it's non-optional.
# Always include it; fall back to the Codex system instruction if nothing else.
if not instructions:
    instructions = CODEX_SYSTEM_INSTRUCTION

The refactored instructions handling properly utilizes the Responses API field and includes a fallback to CODEX_SYSTEM_INSTRUCTION. This should effectively address the stall issues previously identified.

error = evt.get("response", {}).get("error", {})
error_msg = error.get("message", "Response failed")
lib_logger.error(f"Codex response failed: {error_msg}")
raise StreamedAPIError(f"Codex response failed: {error_msg}")

Thanks for replacing generic exceptions with StreamedAPIError. This allows for more granular error classification in the rotator.


# Try to extract from message
import re
match = re.search(r"try again in (\d+)s", message)

This regex-based extraction still exists. While acceptable if the API doesn't provide structured retry data, consider using headers if available in the future.
