[codex] tighten Expo eval follow-ups by grabbou · Pull Request #382 · callstackincubator/evals

grabbou · 2026-06-24T22:25:04Z

Summary

This PR tightens the Expo eval expansion after review against current SDK 56 docs and local runner behavior.

- Aligns the testbench dependencies with Expo's current SDK 56-compatible versions via `expo install --fix`.
- Fixes the Expo Router data-loader eval to use `unstable_useServerDataLoaders` instead of the unrelated `asyncRoutes` config.
- Makes the inline-modules prebuild/build note judgeable by adding an explicit input/reference note file.
- Corrects the Expo Modules web-platform eval to use `platforms: ["apple", "android", "web"]` plus `.web.ts` platform resolution, avoiding the unsupported `web.modules` config.
- Tightens the MediaLibrary reference to create a real `.png` file, verify `file.exists`, and avoid double-adding the asset to an album.
- Seeds static Expo Modules evals with the config/native/script files their requirements ask solvers to edit.
- Adds missing official Expo SDK source links and clarifies whitepaper wording around requirement weights and `inputs.files` metadata.
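
For reviewers less familiar with the `.web.ts` convention mentioned above, here is a minimal sketch of how platform-specific extension resolution generally behaves. It is a simplified illustration, not the bundler's actual resolver: the function names, the `ExampleModule` file names, and the `Platform` values are all hypothetical and chosen only for this example.

```typescript
// Simplified sketch of platform-specific extension resolution.
// Assumption: this mirrors the general "most specific file wins" convention;
// it is NOT the real bundler implementation.
type Platform = "ios" | "android" | "web";

// Candidate file names for a module, most specific first.
function candidateFiles(baseName: string, platform: Platform, ext = "ts"): string[] {
  const candidates = [`${baseName}.${platform}.${ext}`];
  if (platform !== "web") {
    // Native platforms conventionally also try a shared ".native" variant.
    candidates.push(`${baseName}.native.${ext}`);
  }
  candidates.push(`${baseName}.${ext}`);
  return candidates;
}

// Pick the first candidate that exists in the given set of files.
function resolveModuleFile(
  baseName: string,
  platform: Platform,
  existing: ReadonlySet<string>,
): string | undefined {
  return candidateFiles(baseName, platform).find((file) => existing.has(file));
}

// Hypothetical module layout: a default implementation plus a web override.
const files = new Set(["ExampleModule.ts", "ExampleModule.web.ts"]);
const webFile = resolveModuleFile("ExampleModule", "web", files);
const iosFile = resolveModuleFile("ExampleModule", "ios", files);
```

With this layout the web build picks up the `.web.ts` file while native builds fall back to the shared `.ts` file, which is why a separate `web.modules` entry is not needed for the override to take effect.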

Validation

- `bun lint`
- `bun test runner/evaluators/llm/tests/discovery.test.ts runner/evaluators/llm/tests/requirements.test.ts runner/evaluators/llm/tests/files.test.ts`
- `bunx expo install --check` from `testbench/`
- `bun runner/run.ts --pattern "evals/expo-sdk/17-rn-expo-media-library-file-asset-create" --model noop --output /tmp/evals-expo-sdk-17`
- `bun runner/run.ts --pattern "evals/expo-router/09-rn-expo-router-data-loaders-config" --model noop --output /tmp/evals-expo-router-09`
- `bun runner/run.ts --pattern "evals/expo-modules/**" --model noop --output /tmp/evals-expo-modules`
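
The three noop runs above share one shape, so they could be generated from a small table. The sketch below only builds the command strings; the `NoopRun` type and the helper names are hypothetical and not part of the runner, while the flags (`--pattern`, `--model`, `--output`) are the ones used in the commands listed here.

```typescript
// Hypothetical helper for reproducing the noop validation runs listed above.
interface NoopRun {
  pattern: string; // eval glob passed to --pattern
  output: string;  // directory passed to --output
}

const runs: NoopRun[] = [
  {
    pattern: "evals/expo-sdk/17-rn-expo-media-library-file-asset-create",
    output: "/tmp/evals-expo-sdk-17",
  },
  {
    pattern: "evals/expo-router/09-rn-expo-router-data-loaders-config",
    output: "/tmp/evals-expo-router-09",
  },
  { pattern: "evals/expo-modules/**", output: "/tmp/evals-expo-modules" },
];

// Argument vector suitable for a process-spawning API (no shell quoting needed).
function toArgv(run: NoopRun): string[] {
  return ["bun", "runner/run.ts", "--pattern", run.pattern, "--model", "noop", "--output", run.output];
}

// Display form for pasting into a shell; the pattern is quoted so that
// globs such as "**" are not expanded by the shell.
function toShellCommand(run: NoopRun): string {
  return `bun runner/run.ts --pattern "${run.pattern}" --model noop --output ${run.output}`;
}

const commands = runs.map(toShellCommand);
```

Passing the argument vector directly to a spawn-style API sidesteps shell glob expansion, whereas the string form keeps the quoting seen in the commands above.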

Notes

I intentionally did not change the broader whitepaper claim about the historical 66-eval result snapshot versus the current 151-eval inventory, per review direction.

grabbou added 8 commits June 25, 2026 02:19

chore(testbench): align Expo SDK 56 packages

5fb8889

fix(expo-router): configure data loader eval

1846713

fix(expo-modules): make inline module note judgeable

eef5325

fix(expo-modules): correct web platform eval

7187ad8

fix(expo-sdk): tighten media library asset reference

a085763

chore(expo-modules): seed static eval inputs

d1deae5

docs(expo-sdk): add missing official sources

be87aa8

docs(whitepaper): clarify eval metadata contract

b50ea25

grabbou wants to merge 8 commits into main from codex/expo-eval-followups