Add EvalSync for synchronous evaluation without async code, with comprehensive tests and refactorings by ankrgyl · Pull Request #817 · braintrustdata/braintrust-sdk-javascript

Ankur Goyal (ankrgyl) · 2025-07-27T16:12:47Z

Summary

Introduces EvalSync, a new evaluation function that runs synchronously without any async code.
Provides an alternative for environments where async is not supported or desired.
Implements internal synchronous evaluation logic with timeout and error handling.
Refactors common evaluation logic into reusable helper functions to reduce duplication between sync and async paths.
Adds comprehensive tests for EvalSync covering basic usage, async task rejection, hooks support, and scorer classes.

Changes

Core Functionality

Added _run_eval_sync and _run_evaluator_sync internal functions to handle synchronous evaluation.
Implemented EvalSync function with parameters mirroring existing evaluators but running synchronously.
Refactored common evaluation logic into helper functions like _process_score_result, _prepare_score_logging, _prepare_task_args, _create_eval_result, _create_root_span, _resolve_scorers, _handle_scorer_errors, and _prepare_data_iterator.
Supports synchronous task execution, scoring, metadata handling, and reporting.
Rejects async tasks explicitly to prevent misuse.
Supports trial counts, metadata, error score handling, and experiment base comparisons.
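To make the behaviour listed above concrete, here is a minimal sketch of a synchronous evaluation loop that rejects async tasks, repeats each case `trial_count` times, and falls back to an `error_score` when a task or scorer raises. It is not the code in this PR: `run_eval_sync_sketch`, `SketchResult`, and the parameter names are hypothetical, chosen only for illustration.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Optional


@dataclass
class SketchResult:
    """Hypothetical per-trial result; not the SDK's own result type."""

    input: Any
    output: Any = None
    scores: Dict[str, Optional[float]] = field(default_factory=dict)
    error: Optional[BaseException] = None


def run_eval_sync_sketch(
    data: Iterable[Dict[str, Any]],
    task: Callable[[Any], Any],
    scores: List[Callable[..., float]],
    trial_count: int = 1,
    error_score: Optional[float] = None,
) -> List[SketchResult]:
    # Reject coroutine functions up front so misuse fails before any case runs.
    if inspect.iscoroutinefunction(task):
        raise TypeError("synchronous evaluation requires a non-async task")

    results: List[SketchResult] = []
    for case in data:
        for _ in range(trial_count):
            result = SketchResult(input=case["input"])
            try:
                result.output = task(case["input"])
                for scorer in scores:
                    result.scores[scorer.__name__] = scorer(
                        input=case["input"],
                        output=result.output,
                        expected=case.get("expected"),
                    )
            except Exception as exc:
                # Record the failure instead of aborting the whole run.
                result.error = exc
                if error_score is not None:
                    # Fill in only the scorers that did not produce a value.
                    for scorer in scores:
                        result.scores.setdefault(scorer.__name__, error_score)
            results.append(result)
    return results
```

Checking the task with `inspect.iscoroutinefunction` before the loop means an async task fails once, loudly, rather than once per data row.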

Refactoring

Extracted common logic from async evaluator to helper functions for better code reuse and clarity.
Improved error handling and logging in both sync and async evaluation paths.
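As an illustration of the kind of helper the refactoring describes, the sketch below coerces the different shapes a scorer might return into a uniform list of score records, so that sync and async paths can share one code path afterwards. The `Score` dataclass and `normalize_score_result` are stand-ins written for this example; the accepted return shapes are an assumption, not a description of `_process_score_result`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Score:
    """Illustrative standardized score record (not the SDK's Score class)."""

    name: str
    score: Optional[float]
    metadata: Dict[str, Any] = field(default_factory=dict)


def normalize_score_result(scorer_name: str, raw: Any) -> List[Score]:
    """Coerce a scorer's return value into a list of Score objects."""
    if isinstance(raw, Score):
        return [raw]
    if isinstance(raw, (list, tuple)):
        # A scorer may return several results; flatten them recursively.
        out: List[Score] = []
        for item in raw:
            out.extend(normalize_score_result(scorer_name, item))
        return out
    if isinstance(raw, dict):
        return [
            Score(
                name=raw.get("name", scorer_name),
                score=raw.get("score"),
                metadata=raw.get("metadata", {}),
            )
        ]
    if raw is None:
        return [Score(name=scorer_name, score=None)]
    if isinstance(raw, (int, float)):
        # bool is a subclass of int, so True/False become 1.0/0.0.
        return [Score(name=scorer_name, score=float(raw))]
    raise TypeError(f"unsupported score result from {scorer_name}: {type(raw).__name__}")
```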

Testing

Added test_eval_sync_basic to verify basic synchronous evaluation correctness.
Added test_eval_sync_rejects_async_task to ensure async tasks raise errors.
Added test_eval_sync_with_hooks to verify hooks are passed and metadata is updated.
Added test_eval_sync_with_scorer_class to test compatibility with scorer classes.
Added test_eval_sync_exists_and_is_callable to verify that EvalSync exists, has the expected signature, and is a plain synchronous function rather than a coroutine function.
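A signature check of the sort listed above can be written with the standard library alone. The sketch below is an assumption about how such a check might look, not the test from this PR: `eval_sync_stub` is a hypothetical stand-in for the function under test, and the required parameter names are illustrative.

```python
import inspect
from typing import Any, Callable, Iterable, List


def eval_sync_stub(name: str, data: Callable[[], Iterable[dict]], task, scores) -> List[Any]:
    """Hypothetical stand-in for the synchronous entry point under test."""
    return [task(case["input"]) for case in data()]


def check_sync_callable(fn: Any, required_params: List[str]) -> None:
    """Assert that fn is callable, is not async, and accepts the given parameters."""
    assert callable(fn), "expected a callable"
    assert not inspect.iscoroutinefunction(fn), "expected a plain function, got async def"
    params = inspect.signature(fn).parameters
    missing = [p for p in required_params if p not in params]
    assert not missing, f"missing parameters: {missing}"


check_sync_callable(eval_sync_stub, ["name", "data", "task", "scores"])
```

Because it never calls the function, a check like this runs without any API connection.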

Test plan

Run all new and existing tests to ensure no regressions.
Verify synchronous evaluation runs correctly with various inputs and scorers.
Confirm async tasks are rejected with appropriate error messages.
Validate metadata propagation and reporting behavior in synchronous mode.

🌿 Generated by Terry

ℹ️ Tag Terragon Labs (@terragon-labs) to ask questions and address PR feedback

📎 Task: https://www.terragonlabs.com/task/30542e86-0f50-499a-9ae3-c3498181556f

…c code

- Introduced EvalSync function to run evaluators synchronously without async support.
- Added internal _run_eval_sync and _run_evaluator_sync functions to handle sync evaluation logic.
- EvalSync supports tasks, scoring, metadata, reporting, and experiment management synchronously.
- Added tests for EvalSync covering basic usage, async task rejection, hooks, and scorer classes.
- Updated __all__ exports to include EvalSync.

This feature enables evaluation in environments where async is not supported or desired, providing a fully synchronous evaluation alternative.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

- Skip integration tests that require full API mocking
- Add simple test to verify EvalSync exists and has correct signature
- Remove unused imports from test file
- All tests now pass without requiring real API connection

Add py/test_venv/ to .gitignore to exclude Python test virtual environments from version control.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

- Extract common helper functions for score processing, logging, task argument preparation, eval result creation, root span creation, scorer resolution, scorer error handling, and data iterator preparation.
- Replace duplicated code in async and sync evaluator runs with calls to these helpers.
- Improve error handling and metadata logging for scorer exceptions.
- Simplify and unify the handling of scorer results into standardized Score objects.
- Enhance clarity and maintainability of evaluator execution flow.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

Refactored string concatenations to use implicit concatenation for better readability. Reformatted multi-line function calls for consistent style. Improved error message formatting for clarity. No functional changes were made.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
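For readers unfamiliar with the term, implicit concatenation means placing adjacent string literals inside parentheses instead of joining them with `+`. The example below only illustrates the style; the message text and the `name` variable are made up and do not come from the diff.

```python
name = "EvalSync"  # hypothetical value used only to build the message

# Explicit concatenation: every piece is joined with "+".
explicit = (
    "Task passed to " + name + " must be synchronous; "
    + "remove 'async' from the task definition."
)

# Implicit concatenation: adjacent literals (including f-strings) are merged
# by the parser, so no "+" operators are needed.
implicit = (
    f"Task passed to {name} must be synchronous; "
    "remove 'async' from the task definition."
)

# Both spellings build the same string, so the change is purely stylistic.
assert explicit == implicit
```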

Standardize the method signature formatting for SyncScorerLike.__call__ and AsyncScorerLike.eval_async to be single-line with trailing ellipsis on the next line for improved readability and consistency.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
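The formatting described here (signature on one line, `...` body on the next) looks roughly like the sketch below. Only the class and method names are taken from the commit message; the parameter lists, return types, and the `ExactMatch` scorer are assumptions made for illustration and may not match the SDK.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class SyncScorerLike(Protocol):
    # Assumed parameters; the single-line signature with "..." below is the point.
    def __call__(self, input: Any, output: Any, expected: Optional[Any] = None, **kwargs: Any) -> Any:
        ...


@runtime_checkable
class AsyncScorerLike(Protocol):
    # Assumed parameters, shown only to mirror the formatting.
    async def eval_async(self, output: Any, expected: Optional[Any] = None, **kwargs: Any) -> Any:
        ...


class ExactMatch:
    """Hypothetical class-based scorer that satisfies the sync protocol."""

    def __call__(self, input: Any, output: Any, expected: Optional[Any] = None, **kwargs: Any) -> float:
        return 1.0 if output == expected else 0.0


scorer = ExactMatch()
# runtime_checkable protocols only test that the method exists, not its signature.
print(isinstance(scorer, SyncScorerLike), isinstance(scorer, AsyncScorerLike))
```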

github-actions · 2026-03-13T00:41:44Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If this PR is still relevant, please leave a comment, push an update, or remove the stale label. Thank you for your contributions!

Ankur Goyal (ankrgyl) changed the title ~~Add EvalSync for synchronous evaluation without async code~~ Add EvalSync for synchronous evaluation without async code, with comprehensive tests Jul 27, 2025

ghost force-pushed the terragon/add-evalsync-explicit-sync branch from d148272 to 29e9b02 Compare July 27, 2025 17:26

Ankur Goyal (ankrgyl) and others added 5 commits July 27, 2025 17:37

chore: remove test_venv from git tracking and add to .gitignore

d2e9efc

chore(gitignore): ignore Python test virtual environments

5671e9c

Ankur Goyal (ankrgyl) changed the title ~~Add EvalSync for synchronous evaluation without async code, with comprehensive tests~~ Add EvalSync for synchronous evaluation without async code, with comprehensive tests and refactorings Jul 27, 2025

Ankur Goyal (ankrgyl) and others added 2 commits July 27, 2025 18:56

some tweaks

c6737d8

Olmo Maldonado (ibolmo) assigned Matt Perpick (clutchski) Aug 12, 2025

github-actions bot added the stale label Mar 13, 2026

Ankur Goyal (ankrgyl) wants to merge 8 commits into main from terragon/add-evalsync-explicit-sync
