Trace: PRD prd-v0-1-smoke-evaluation-run · FR-006 · SPEC RunAggregates schema (architect finding #2 resolution)
Capability: At status=aggregating step, compute counts_by_status, counts_by_error_class, total_cost_usd, total_wall_clock_ms, per_task_metrics, budget_breach, available_models_count per SPEC RunAggregates.
Acceptance:
- counts_by_status sum equals len(evals[]) (invariant for FR-009)
- total_cost_usd cross-checked with LiteLLM proxy /credits
Implementation locus: apps/eval-core-py/src/orchestrator/aggregates.py
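
A minimal sketch of the aggregation step, covering only the fields whose meaning is evident from this entry. The EvalRecord shape, the compute_aggregates and cost_matches names, the budget_usd parameter, the tolerance, and the choice to sum per-eval wall-clock times are all assumptions made for illustration; the SPEC RunAggregates schema is authoritative. per_task_metrics and available_models_count are omitted because their definitions are not given here. The proxy-reported cost is passed in as a plain number, so no LiteLLM call is shown.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalRecord:
    # Hypothetical record shape; the real evals[] entries are defined by the SPEC.
    task_id: str
    status: str
    error_class: Optional[str]  # None when the eval did not fail
    cost_usd: float
    wall_clock_ms: int


def compute_aggregates(evals, budget_usd):
    """Compute a subset of the RunAggregates fields (assumed semantics)."""
    counts_by_status = Counter(e.status for e in evals)
    counts_by_error_class = Counter(
        e.error_class for e in evals if e.error_class is not None
    )
    total_cost_usd = math.fsum(e.cost_usd for e in evals)
    # Assumption: total wall clock is the sum of per-eval durations.
    total_wall_clock_ms = sum(e.wall_clock_ms for e in evals)

    # Acceptance invariant (FR-009): every eval is counted under one status.
    if sum(counts_by_status.values()) != len(evals):
        raise ValueError("counts_by_status does not sum to len(evals)")

    return {
        "counts_by_status": dict(counts_by_status),
        "counts_by_error_class": dict(counts_by_error_class),
        "total_cost_usd": total_cost_usd,
        "total_wall_clock_ms": total_wall_clock_ms,
        # Assumption: a breach means total cost exceeded the run budget.
        "budget_breach": total_cost_usd > budget_usd,
    }


def cost_matches(total_cost_usd, proxy_reported_usd, tolerance_usd=0.01):
    """Cross-check the computed cost against a proxy-reported figure."""
    return math.isclose(total_cost_usd, proxy_reported_usd, abs_tol=tolerance_usd)
```

Keeping the cross-check as a separate function means the aggregation itself stays pure and can be tested without the proxy being reachable.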