feat: add Evaluator class for judge orchestration#1331

Open
jsonbailey wants to merge 2 commits into main from jb/aic-2174/server-ai

Conversation

@jsonbailey
Contributor

@jsonbailey jsonbailey commented Apr 28, 2026

Summary

  • Adds Evaluator class that wraps judges and JudgeConfiguration
  • Evaluator.noop() static factory returns a no-op evaluator (resolves to [])
  • evaluate(input, output) runs all configured judges in parallel; missing judge key → warning + skip (not error)
  • Evaluator does NOT call tracker.trackJudgeResult — that belongs in the managed layer
  • Attaches Evaluator to LDAICompletionConfig and LDAIAgentConfig (populated in createChat/createAgent)
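
A minimal sketch of the behavior listed above, for readers who want something concrete. It is not the code in this PR: the `Judge`, `JudgeResult`, `JudgeConfiguration`, and `Logger` shapes and the constructor signature are simplified assumptions made for illustration only.

```typescript
// Hypothetical, simplified shapes; the real SDK types are richer.
interface JudgeResult {
  judgeConfigKey: string;
  success: boolean;
  score?: number;
}

interface Judge {
  evaluate(input: string, output: string): Promise<JudgeResult>;
}

interface JudgeConfiguration {
  judges: { key: string }[];
}

interface Logger {
  warn(message: string): void;
}

class Evaluator {
  private readonly judges: Map<string, Judge>;
  private readonly config: JudgeConfiguration;
  private readonly logger?: Logger;

  constructor(judges: Map<string, Judge>, config: JudgeConfiguration, logger?: Logger) {
    this.judges = judges;
    this.config = config;
    this.logger = logger;
  }

  // No-op evaluator: nothing is configured, so evaluate() resolves to [].
  static noop(): Evaluator {
    return new Evaluator(new Map<string, Judge>(), { judges: [] });
  }

  async evaluate(input: string, output: string): Promise<JudgeResult[]> {
    const pending: Promise<JudgeResult>[] = [];
    for (const { key } of this.config.judges) {
      const judge = this.judges.get(key);
      if (!judge) {
        // Missing judge key: warn and skip instead of failing the whole run.
        this.logger?.warn(`No judge found for key "${key}"; skipping.`);
        continue;
      }
      // Start the call now and await later, so judges run concurrently.
      pending.push(judge.evaluate(input, output));
    }
    return Promise.all(pending);
  }
}
```

Note that this sketch never touches a tracker, matching the stated intent that recording judge results belongs to the managed layer.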

Test plan

  • All 188 existing tests pass
  • New Evaluator.test.ts covers noop(), judge evaluation, missing judge warns+skips, error handling, no tracker calls

🤖 Generated with Claude Code


Note

Medium Risk
Moderate risk: introduces new judge orchestration path and changes judge initialization to use a Map, plus mutates AI config objects to attach an internal evaluator, which could affect downstream integrations if they depend on config shape or judge wiring.

Overview
Adds an internal Evaluator abstraction to run all configured judges for an input/output pair in parallel, returning LDJudgeResult[] while treating missing judges as warn + skip and converting judge exceptions into error results.

Updates LDAIClientImpl.createChat to build and attach an evaluator onto the returned LDAICompletionConfig (via a new _buildEvaluator helper and Map-based judge initialization), while still producing the legacy Record<string, Judge> for TrackedChat compatibility.
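
One way to keep a `Map<string, Judge>` as the primary structure while still handing a legacy `Record<string, Judge>` to older consumers is a small conversion helper. The sketch below is an assumption about how that could look; `toLegacyRecord` is a hypothetical name and `Judge` is a stand-in type, neither taken from the PR.

```typescript
// Hypothetical stand-in for the SDK's Judge type.
interface Judge {
  evaluate(input: string, output: string): Promise<unknown>;
}

// Copy the Map's entries into a plain object keyed by judge config key.
// The Judge instances are shared, not cloned, so this is a shallow copy.
function toLegacyRecord(judges: Map<string, Judge>): Record<string, Judge> {
  const record: Record<string, Judge> = {};
  judges.forEach((judge, key) => {
    record[key] = judge;
  });
  return record;
}
```

Because the copy is shallow, both structures refer to the same judge objects.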

Extends LDAICompletionConfig and LDAIAgentConfig types with an internal optional evaluator field, and adds unit tests covering Evaluator.noop, parallel execution, missing-judge handling, and error wrapping.

Reviewed by Cursor Bugbot for commit 8722689. Bugbot is set up for automated code reviews on this repo.

@github-actions
Contributor

@launchdarkly/js-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 25623 bytes
Compressed size limit: 29000
Uncompressed size: 125843 bytes

@github-actions
Contributor

github-actions Bot commented Apr 28, 2026

@launchdarkly/browser size report
This is the brotli compressed size of the ESM build.
Compressed size: 179311 bytes
Compressed size limit: 200000
Uncompressed size: 830815 bytes

@github-actions
Contributor

github-actions Bot commented Apr 28, 2026

@launchdarkly/js-client-sdk size report
This is the brotli compressed size of the ESM build.
Compressed size: 31866 bytes
Compressed size limit: 34000
Uncompressed size: 113634 bytes

@github-actions
Contributor

@launchdarkly/js-client-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 38473 bytes
Compressed size limit: 39000
Uncompressed size: 211104 bytes

@jsonbailey jsonbailey changed the title from "feat(server-sdk-ai): add Evaluator class for judge orchestration" to "feat: add Evaluator class for judge orchestration" Apr 28, 2026
…-1657)

Introduces `Evaluator` wrapping judges and JudgeConfiguration. The evaluator
runs all configured judges in parallel, warns+skips on missing judge keys, and
intentionally does NOT call tracker.trackJudgeResult — that responsibility
belongs in the managed layer. Attaches Evaluator to LDAICompletionConfig and
LDAIAgentConfig via createChat/createAgent. Adds Evaluator.noop() static factory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jsonbailey jsonbailey force-pushed the jb/aic-2174/server-ai branch from 46ab0a4 to c751ce6 Compare April 28, 2026 23:12
@jsonbailey jsonbailey marked this pull request as ready for review April 28, 2026 23:23
@jsonbailey jsonbailey requested a review from a team as a code owner April 28, 2026 23:23
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit c751ce6.

Comment thread packages/sdk/server-ai/src/api/judge/Evaluator.ts
Comment thread packages/sdk/server-ai/src/api/judge/Evaluator.ts
Comment thread packages/sdk/server-ai/src/LDAIClientImpl.ts
return Evaluator.noop();
}

const judgesRecord = await this._initializeJudges(
Contributor

Could this return a Map<string, Judge> directly? Or do we need to deep copy?

Comment thread packages/sdk/server-ai/src/api/judge/Evaluator.ts Outdated
Comment thread packages/sdk/server-ai/src/api/judge/Evaluator.ts
Comment thread packages/sdk/server-ai/src/api/config/types.ts Outdated
- Include judgeConfigKey in error LDJudgeResults emitted from the catch
  block so error results are attributable to a specific judge config.
- Switch from Promise.allSettled to Promise.all in Evaluator.evaluate;
  the map callback already catches internally so allSettled never sees
  rejections. Filter out null returns from the missing-judge path.
- Mark the Evaluator class @internal and remove it from the public
  api/judge re-exports. It is consumed only by the managed layer; tests
  import via the source path.
- Mark the evaluator property on LDAICompletionConfig and LDAIAgentConfig
  as @internal so it is excluded from the published API surface.
- Simplify _initializeJudges to return Map<string, Judge> directly,
  eliminating the Record-then-convert step in _buildEvaluator.
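
The reasoning behind the `Promise.all` change can be sketched as follows. This is an illustration under assumptions, not the PR's code: `runJudges`, `JudgeFn`, and the `success`/`errorMessage` fields are hypothetical names. Because each callback catches its own failure and returns a value, the promises passed to `Promise.all` never reject, so `Promise.allSettled` adds nothing here.

```typescript
// Hypothetical result shape; only judgeConfigKey is named in the commit message.
interface JudgeResult {
  judgeConfigKey: string;
  success: boolean;
  errorMessage?: string;
}

type JudgeFn = (input: string, output: string) => Promise<JudgeResult>;

async function runJudges(
  keys: string[],
  judges: Map<string, JudgeFn>,
  input: string,
  output: string,
): Promise<JudgeResult[]> {
  const settled = await Promise.all(
    keys.map(async (key): Promise<JudgeResult | null> => {
      const judge = judges.get(key);
      if (!judge) {
        // Missing-judge path: return null and filter it out below.
        return null;
      }
      try {
        return await judge(input, output);
      } catch (err) {
        // Convert the failure into an error result attributed to this judge.
        // Since nothing is rethrown, Promise.all cannot short-circuit.
        return {
          judgeConfigKey: key,
          success: false,
          errorMessage: err instanceof Error ? err.message : String(err),
        };
      }
    }),
  );
  // The type guard narrows (JudgeResult | null)[] to JudgeResult[].
  return settled.filter((r): r is JudgeResult => r !== null);
}
```

`Promise.all` preserves input order, so results come back in the order of `keys`, minus any skipped entries.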

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2 participants