Conversation

@devin-ai-integration bot commented Jan 16, 2026

feat: add debug logging when OutputParserError triggers agent retry

Summary

Addresses issue #4246 by adding debug-level logging when OutputParserError triggers an agent retry. Previously, when format_answer() raised OutputParserError, the agent would silently retry with no visibility into what the LLM returned or why parsing failed.

The new debug logging captures:

  • The parsing error message (first line only for brevity)
  • Raw LLM output that failed to parse (truncated to 500 characters, newlines escaped)
  • Retry count per agent turn
  • Agent role context for easier debugging

Logging is controlled via Python's standard logging module at DEBUG level, allowing users to enable it via environment variables or log configuration without code changes.
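
For example, a user can opt in with nothing but standard logging configuration; the logger name below matches the module path given in the test plan, and no crew code changes are needed:

```python
import logging

# Send log records to stderr with a minimal format.
logging.basicConfig(format="%(name)s %(levelname)s %(message)s")

# Enable only the new parse-failure diagnostics; the rest of the
# library stays at its default verbosity.
logging.getLogger("crewai.utilities.agent_utils").setLevel(logging.DEBUG)
```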

Review & Testing Checklist for Human

  • Verify integration in agent executors: The new unit tests cover only handle_output_parser_exception() itself; the changes to crew_agent_executor.py (both sync and async loops) and lite_agent.py that pass raw_output and agent_role are not covered by integration tests. Consider manually triggering a parse failure to verify the full flow.
  • Confirm the debug log format meets expectations: The log reads Parse failed for agent 'RoleName': <error>, followed by the truncated raw output. Verify this matches what was requested in issue #4246 ([Enhancement] Add debug logging when OutputParserError triggers agent retry); a pytest-style sketch of this check follows the list.
  • Test with actual LLM that produces malformed output: Run a crew with an agent that's likely to produce parse errors (e.g., a model that doesn't follow ReAct format well) with DEBUG logging enabled to verify the logs appear correctly.
  • Verify async path works correctly: The _ainvoke_loop() method in crew_agent_executor.py has the same changes as the sync path but isn't explicitly tested.
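
As a concrete illustration of the format check above, here is a pytest-style sketch. The import paths and the exact call signature of handle_output_parser_exception() are assumptions based on this PR's description (only the raw_output and agent_role parameters are confirmed), so treat it as a starting point rather than the shipped test:

```python
import logging

import pytest

# Import paths below are assumptions; adjust to the actual module layout.
from crewai.agents.parser import OutputParserError
from crewai.utilities.agent_utils import handle_output_parser_exception


def test_parse_failure_is_logged(caplog: pytest.LogCaptureFixture) -> None:
    # Assumed call shape: the real signature may take additional arguments
    # (messages, iteration counters, etc.); only raw_output and agent_role
    # are named in this PR's summary.
    with caplog.at_level(logging.DEBUG, logger="crewai.utilities.agent_utils"):
        handle_output_parser_exception(
            OutputParserError("Invalid ReAct format: missing Action"),
            raw_output="I'll answer directly:\nParis is the capital of France.",
            agent_role="Researcher",
        )

    assert "Parse failed for agent 'Researcher'" in caplog.text
```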

Recommended Test Plan

  1. Enable DEBUG logging: logging.getLogger("crewai.utilities.agent_utils").setLevel(logging.DEBUG)
  2. Create a simple crew with an agent
  3. Either mock the LLM to return malformed output, or use a model configuration that's likely to produce parse errors
  4. Verify the debug logs show the raw output, error message, and retry count (the full plan is sketched in code below)
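
Putting the four steps together, a minimal sketch might look like the following. The Agent/Task/Crew constructors are crewai's public API; the patch target used to force malformed output is an assumption and should be pointed at whatever call actually returns the raw LLM text:

```python
import logging
from unittest.mock import patch

from crewai import Agent, Crew, Task

# Step 1: enable DEBUG logging for the module that emits the new messages.
logging.basicConfig(format="%(name)s %(levelname)s %(message)s")
logging.getLogger("crewai.utilities.agent_utils").setLevel(logging.DEBUG)

# Step 2: a simple single-agent crew.
agent = Agent(
    role="Researcher",
    goal="Summarize a topic",
    backstory="Minimal agent used only to exercise the retry path.",
)
task = Task(
    description="Summarize today's AI news.",
    expected_output="A short summary.",
    agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])

# Step 3: mock the LLM so every call returns output that cannot be parsed.
# Patching agent.llm.call is an assumption about where the raw text comes from.
with patch.object(agent.llm, "call", return_value="no ReAct structure here"):
    # Step 4: watch stderr for "Parse failed for agent 'Researcher': ..." lines.
    # The run may ultimately fail once retries are exhausted, which is fine
    # for this check.
    crew.kickoff()
```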

Notes

Changes:
- Modified handle_output_parser_exception() in agent_utils.py to accept raw_output
  and agent_role parameters for debug logging
- Updated CrewAgentExecutor._invoke_loop() and _ainvoke_loop() to pass raw output
  and agent role to the exception handler
- Updated LiteAgent._invoke_loop() to pass raw output and agent role
- Added comprehensive tests for the new debug logging functionality
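
For reviewers reading this without the diff open, a plausible shape of the logging side of the change is sketched below. Everything beyond what the summary states (first line of the error only, 500-character truncation, escaped newlines, agent role, retry count) is a hypothetical illustration, including the helper's name:

```python
import logging

logger = logging.getLogger(__name__)  # "crewai.utilities.agent_utils" in the real module

_MAX_RAW_OUTPUT_CHARS = 500  # truncation limit named in the summary


def _log_parse_failure(error: Exception, raw_output: str, agent_role: str, retries: int) -> None:
    """Hypothetical helper mirroring the debug logging this PR describes."""
    # Keep only the first line of the error message, for brevity.
    first_line = str(error).splitlines()[0] if str(error) else ""
    # Truncate the raw LLM output and escape newlines so it stays on one log line.
    snippet = raw_output[:_MAX_RAW_OUTPUT_CHARS].replace("\n", "\\n")
    logger.debug(
        "Parse failed for agent '%s': %s (retry %d). Raw output: %s",
        agent_role,
        first_line,
        retries,
        snippet,
    )
```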

Co-Authored-By: João <[email protected]>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring
