feat: add Evaluator class for judge orchestration #1331
jsonbailey wants to merge 2 commits into main
…-1657)

Introduces `Evaluator` wrapping judges and JudgeConfiguration. The evaluator runs all configured judges in parallel, warns and skips on missing judge keys, and intentionally does NOT call tracker.trackJudgeResult; that responsibility belongs in the managed layer. Attaches Evaluator to LDAICompletionConfig and LDAIAgentConfig via createChat/createAgent. Adds Evaluator.noop() static factory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
46ab0a4 to c751ce6
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit c751ce6.
    return Evaluator.noop();
  }

  const judgesRecord = await this._initializeJudges(
Could this return a Map<string, Judge> directly? Or do we need to deep copy?
- Include judgeConfigKey in error LDJudgeResults emitted from the catch block so error results are attributable to a specific judge config.
- Switch from Promise.allSettled to Promise.all in Evaluator.evaluate; the map callback already catches internally so allSettled never sees rejections. Filter out null returns from the missing-judge path.
- Mark the Evaluator class @internal and remove it from the public api/judge re-exports. It is consumed only by the managed layer; tests import via the source path.
- Mark the evaluator property on LDAICompletionConfig and LDAIAgentConfig as @internal so it is excluded from the published API surface.
- Simplify _initializeJudges to return Map<string, Judge> directly, eliminating the Record-then-convert step in _buildEvaluator.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
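The switch from Promise.allSettled to Promise.all described in this commit relies on each mapped callback catching its own errors. A minimal sketch of that pattern follows. `runJudges`, `SketchJudge`, and `SketchResult` are hypothetical names chosen for illustration, and the result shape is an assumption rather than the SDK's actual `LDJudgeResult`.

```typescript
// Hypothetical result and judge shapes, for illustration only.
interface SketchResult {
  judgeConfigKey: string;
  success: boolean;
  error?: string;
}

type SketchJudge = (input: string, output: string) => Promise<SketchResult>;

// Runs every configured key in parallel. Each callback resolves to a result,
// an error result, or null (missing judge), so Promise.all never rejects here.
async function runJudges(
  keys: string[],
  judges: Map<string, SketchJudge>,
  input: string,
  output: string,
): Promise<SketchResult[]> {
  const settled = await Promise.all(
    keys.map(async (key): Promise<SketchResult | null> => {
      const judge = judges.get(key);
      if (!judge) {
        // Missing judge: warn and skip instead of failing the whole batch.
        console.warn(`No judge found for key "${key}"; skipping.`);
        return null;
      }
      try {
        return await judge(input, output);
      } catch (err) {
        // Attribute the failure to the judge config that produced it.
        const message = err instanceof Error ? err.message : String(err);
        return { judgeConfigKey: key, success: false, error: message };
      }
    }),
  );
  // Drop the nulls produced by the missing-judge path.
  return settled.filter((r): r is SketchResult => r !== null);
}
```

Because no callback can reject, Promise.allSettled would only ever report fulfilled entries, so Promise.all plus a null filter yields the same results with less unwrapping.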

Summary
- Adds `Evaluator` class that wraps judges and `JudgeConfiguration`
- `Evaluator.noop()` static factory returns a no-op evaluator (resolves to `[]`)
- `evaluate(input, output)` runs all configured judges in parallel; missing judge key → warning + skip (not error)
- `Evaluator` does NOT call `tracker.trackJudgeResult`; that belongs in the managed layer
- Attaches `Evaluator` to `LDAICompletionConfig` and `LDAIAgentConfig` (populated in `createChat`/`createAgent`)

Test plan
- `Evaluator.test.ts` covers noop(), judge evaluation, missing judge warns+skips, error handling, no tracker calls

🤖 Generated with Claude Code
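To illustrate the split between evaluation and tracking described in the summary, here is a rough sketch. All names (`SketchEvaluator`, `SketchJudge`, `SketchTracker`, `evaluateAndTrack`) are hypothetical and the shapes are assumptions, not the SDK's real types. Missing-judge and error handling are left out to keep the focus on `noop()` and on who calls the tracker.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
interface SketchJudgeResult {
  judgeConfigKey: string;
  success: boolean;
}

interface SketchJudge {
  evaluate(input: string, output: string): Promise<SketchJudgeResult>;
}

interface SketchTracker {
  trackJudgeResult(result: SketchJudgeResult): void;
}

class SketchEvaluator {
  private readonly judges: Map<string, SketchJudge>;

  constructor(judges: Map<string, SketchJudge>) {
    this.judges = judges;
  }

  // A no-op evaluator has no judges, so evaluate() resolves to [].
  static noop(): SketchEvaluator {
    return new SketchEvaluator(new Map());
  }

  // Returns results only; tracking is deliberately left to the caller.
  async evaluate(input: string, output: string): Promise<SketchJudgeResult[]> {
    const judges = Array.from(this.judges.values());
    return Promise.all(judges.map((judge) => judge.evaluate(input, output)));
  }
}

// Assumed managed-layer caller: it owns the tracker and records each result.
async function evaluateAndTrack(
  evaluator: SketchEvaluator,
  tracker: SketchTracker,
  input: string,
  output: string,
): Promise<SketchJudgeResult[]> {
  const results = await evaluator.evaluate(input, output);
  results.forEach((result) => tracker.trackJudgeResult(result));
  return results;
}
```

Keeping the tracker out of the evaluator means a caller can run judges without emitting tracking events, and the layer that owns the tracker decides when results are recorded.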
Note
Medium Risk
Moderate risk: introduces a new judge orchestration path and changes judge initialization to use a `Map`, plus mutates AI config objects to attach an internal `evaluator`, which could affect downstream integrations if they depend on config shape or judge wiring.

Overview
Adds an internal `Evaluator` abstraction to run all configured judges for an input/output pair in parallel, returning `LDJudgeResult[]` while treating missing judges as warn + skip and converting judge exceptions into error results.

Updates `LDAIClientImpl.createChat` to build and attach an `evaluator` onto the returned `LDAICompletionConfig` (via a new `_buildEvaluator` helper and `Map`-based judge initialization), while still producing the legacy `Record<string, Judge>` for `TrackedChat` compatibility.

Extends `LDAICompletionConfig` and `LDAIAgentConfig` types with an internal optional `evaluator` field, and adds unit tests covering `Evaluator.noop`, parallel execution, missing-judge handling, and error wrapping.

Reviewed by Cursor Bugbot for commit 8722689. Bugbot is set up for automated code reviews on this repo.
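One way a `Map` of judges could still yield the legacy `Record<string, Judge>` shape is sketched below. `toLegacyRecord` and `SketchJudge` are hypothetical names; how the SDK actually performs this conversion is not shown in this conversation.

```typescript
// Hypothetical judge shape, for illustration only.
interface SketchJudge {
  name: string;
}

// Builds a plain-object view of the judges for callers that still expect a Record.
// The judges themselves are shared by reference, not deep-copied.
function toLegacyRecord(judges: Map<string, SketchJudge>): Record<string, SketchJudge> {
  const record: Record<string, SketchJudge> = {};
  judges.forEach((judge, key) => {
    record[key] = judge;
  });
  return record;
}
```

Since only the container changes, both views point at the same judge instances; a deep copy would be needed only if one consumer mutated judges in a way the other must not observe.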