narrow tls sharing for dav1d #679
Draft
oinoom wants to merge 10 commits into
added 9 commits
April 17, 2026 12:26
Strict dav1d single-thread decode on x86_64 was still failing in IA2 libc compartment mode because a small amount of writable loader/libc state is process-global ABI state, not compartment-private state.
Keep only the state that must remain shared:
- writable ld.so heap mappings tagged with IA2_LDSO_HEAP_MARKER
- the ld.so TLS-generation page read during __tls_get_addr resolution
- the libc tuning pages holding the x86_64 memmove/memset threshold globals
The intent is to replace broader carveouts with narrow runtime policy rooted in what the loader and libc actually read and write during the dav1d reproduction. ia2_unprotect_loader_heap_maps() handles the writable ld.so heap mappings at startup, and protect_pages() adds the page-level carveouts before retagging the rest of each segment.
This is the runtime-only part of the minimal main-based IA2 delta. The paired dav1d strict reproduction still reaches `dav1d --version` and decodes test.ivf successfully with these carveouts in place.
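The page-level carveout described above amounts to splitting a segment around the pages that stay shared and retagging only what is left. A minimal sketch of that range arithmetic follows. It is not the actual protect_pages() code: `struct range`, `ranges_to_retag()`, and the 4 KiB page size are hypothetical names and assumptions introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_BYTES ((uintptr_t)4096)

/* Half-open address range [start, end). Hypothetical helper type. */
struct range {
    uintptr_t start, end;
};

/*
 * Split a page-aligned segment around one page-aligned carveout.
 * Writes the sub-ranges that still need a compartment pkey into out[]
 * (at most two) and returns how many were written.
 */
static size_t ranges_to_retag(struct range seg, struct range carve,
                              struct range out[2]) {
    assert(seg.start <= seg.end && carve.start <= carve.end);
    assert(seg.start % PAGE_SIZE_BYTES == 0 && seg.end % PAGE_SIZE_BYTES == 0);
    assert(carve.start % PAGE_SIZE_BYTES == 0 &&
           carve.end % PAGE_SIZE_BYTES == 0);

    size_t n = 0;
    /* Clamp the carveout to the segment. */
    uintptr_t cs = carve.start < seg.start ? seg.start : carve.start;
    uintptr_t ce = carve.end > seg.end ? seg.end : carve.end;

    if (cs >= ce) {
        /* No overlap: the whole segment is retagged. */
        if (seg.start < seg.end)
            out[n++] = seg;
        return n;
    }
    if (seg.start < cs)
        out[n++] = (struct range){seg.start, cs}; /* part below the carveout */
    if (ce < seg.end)
        out[n++] = (struct range){ce, seg.end};   /* part above the carveout */
    return n;
}
```

Each returned range would then be handed to whatever retagging primitive the runtime uses (on Linux, typically `pkey_mprotect`), while the carveout keeps its shared key.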
The minimal dav1d stack needs calls to shared_malloc(), shared_free(), and the other shared allocator helpers to route to the shared allocator compartment. Encoding that rule only in the rewriter makes the policy implicit and application-specific. Annotate the allocator declarations themselves with ia2_extern_pkey:1 and add a libia2 wrapper header that exposes those declarations from the runtime include tree. This moves the ownership policy into the API contract: code that includes these declarations can say directly that the entrypoints live in the shared pkey. This commit intentionally changes interface metadata only. The rewriter change that consumes these annotations follows next.
The rewriter previously learned external pkeys only from system-header handling and hardcoded exceptions. That was enough to get a working dav1d branch, but it kept shared allocator routing as tool-specific knowledge instead of making it a declaration-driven rule. Teach SourceRewriter to parse ia2_extern_pkey:<n> annotations on declarations and record the annotated pkey even when the declaration would otherwise be ignored for rewriting. This lets external APIs such as the shared allocator state their required compartment directly in headers. Together with the previous commit, this preserves the generated callgate routing needed by the minimal strict dav1d decode stack while removing the need for a dav1d-specific `shared_*` exception in the rewriter.
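For readers unfamiliar with the annotation format, the sketch below shows one way an `ia2_extern_pkey:<n>` string could be parsed into a pkey. It is illustrative only: the real SourceRewriter is C++ and its parsing code is not shown in this PR, and `parse_extern_pkey()` is a hypothetical helper. The upper bound of 15 reflects the 16 protection keys available on x86_64.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define EXTERN_PKEY_PREFIX "ia2_extern_pkey:"

/*
 * Parse "ia2_extern_pkey:<n>" and store <n> in *pkey_out.
 * Returns false for any other string, for a missing or malformed number,
 * and for values outside 0..15.
 */
static bool parse_extern_pkey(const char *annotation, int *pkey_out) {
    assert(annotation != NULL && pkey_out != NULL);

    size_t prefix_len = strlen(EXTERN_PKEY_PREFIX);
    if (strncmp(annotation, EXTERN_PKEY_PREFIX, prefix_len) != 0)
        return false;

    const char *digits = annotation + prefix_len;
    /* Require a leading digit so strtol cannot accept signs or whitespace. */
    if (*digits < '0' || *digits > '9')
        return false;

    char *end = NULL;
    errno = 0;
    long value = strtol(digits, &end, 10);
    if (errno != 0 || *end != '\0')
        return false; /* overflow or trailing characters */
    if (value > 15)
        return false; /* x86_64 protection keys are 0..15 */

    *pkey_out = (int)value;
    return true;
}
```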
Single-thread dav1d decode still passes after removing the ia2_extern_pkey annotation machinery from ia2_allocator.h.
This was worth testing because the original minimal stack carried two extra IA2 commits whose stated purpose was to annotate shared allocator entrypoints and teach the rewriter to honor those annotations. In the current dav1d single-thread path, those annotations are not what makes shared_malloc/shared_free work:
- the generated dav1d binaries still reference shared_malloc and shared_free as direct dynamic symbols rather than wrapped callgates
- disabling annotated-extern-pkey handling in SourceRewriter.cpp did not change single-thread decode behavior
- after no-oping the annotation macro and restoring the normal rewriter, strict single-thread decode still passed
So for this reduced branch pair, the annotation syntax in the header is currently dead scaffolding. Remove only the macro expansion logic and leave the allocator API surface unchanged.
Validation:
- rebuilt dav1d through rewrite.py against this IA2 tree
- dav1d --version returned 0
- single-thread decode of test.ivf returned 0
Problem
- The reduced principle-check IA2 branch currently makes strict single-thread
dav1d decode pass by retagging writable loader-heap TLS state to shared pkey
0 during startup.
- We wanted to know whether the older e664-style x86_64 TLS policy could still
support the reduced dav1d branch without falling back to the broader
loader-heap carveout.
- Restoring only the TCB/static-TLS neighborhood sharing was not enough.
`dav1d --version` succeeded, but `dav1d --threads 1` still crashed in
`__tls_get_addr+16` while the IVF handoff path entered
`dav1d_data_wrap -> malloc -> PartitionAlloc::ScopedDisallowAllocations`.
- GDB showed the exact mismatch at the failing read:
- `%fs` base was on `[anon: ia2-loader-heap]` with `pkey 0`
- `*(%fs:8)` (the DTV pointer) was on the adjacent
`[anon: ia2-loader-heap]` mapping with `pkey 1`
- the faulting `cmp %rax,(%rdx)` used that DTV pointer as `%rdx`
What this changes
- Keep the x86_64 PT_TLS prefix below the TCB page shared in
`protect_tls_pages()` so `%fs`-relative ABI/TLS state that lives below the
thread pointer remains accessible across compartment transitions.
- Reintroduce `ia2_unprotect_thread_pointer_mapping()` and call it from
`ia2_start()` so the startup thread's TCB neighborhood is retagged shared
after IA2 has finished compartment setup.
- Add `ia2_unprotect_thread_dtv_page()` and call it from `ia2_start()` so the
startup thread's DTV header page is explicitly retagged shared as well.
- Leave the newer targeted runtime follow-ups from the reduced branch in place:
the ld.so TLS-generation page carveout and the libc memmove tuning-page
carveouts remain unchanged.
- Stop depending on `ia2_unprotect_loader_heap_maps()` for this reproduction.
Why this fixes it
- The targeted TCB/static-TLS carveout fixes `%fs`-relative accesses in the TCB
neighborhood, but `__tls_get_addr` also dereferences `THREAD_DTV()` via
`%fs:8`.
- On this reduced single-thread dav1d path, PartitionAlloc reaches that
`__tls_get_addr` fast path during the IVF packet handoff allocation path.
Leaving the DTV header page compartment-private therefore still faults even
though the TCB page itself is shared.
- Sharing the DTV header page closes that remaining gap without going back to a
blanket retag of every writable IA2 loader-heap mapping.
- The current single-thread reproduction does not need the larger thread-start
changes from the earlier e664 line. In particular, the extra
`pthread_create()` PKRU wrapper, the per-thread post-TLS retag in
`ia2_thread_begin()`, and the standalone one-page
`ia2_unprotect_thread_pointer_page()` startup call were all removed while the
strict dav1d reproduction continued to pass.
- The resulting policy is narrower and better motivated: share the startup
thread's TCB/static-TLS neighborhood and DTV page, rather than all marked
loader-heap VMAs.
Validation
- rebuilt `principle_check/dav1d-ia2` through `rewrite.py` against this IA2
tree and dav1d `66a21b9`
- `tools/dav1d --version` returned 0
- `tools/dav1d -i /home/davidanekstein/immunant/test.ivf -o /dev/null --muxer
null --threads 1` returned 0
- decode completed: `Decoded 2/2 frames (100.0%)`
Problem
- `loader_minimal_malloc` was written against the older assumption that every VMA tagged `ia2-loader-heap` would remain compartment-private with `pkey 1`.
- The reduced single-thread dav1d branch no longer has that policy. It intentionally retags part of the loader heap to shared `pkey 0` so the startup thread can access the TCB/TLS/DTV state required by the strict decode path.
- The old test only looked at the first `ia2-loader-heap` entry in `/proc/self/smaps` and asserted that one mapping had `ProtectionKey: 1`. Once one shared loader-heap mapping appeared first, the test failed even when other loader-heap mappings still remained private with `pkey 1`.
What this changes
- Parse `/proc/self/smaps` entry-by-entry and collect every mapping tagged with `IA2_LDSO_HEAP_MARKER` instead of stopping at the first match.
- Record the address range and `ProtectionKey` for each loader-heap mapping.
- Print the full loader-heap pkey breakdown so the test output shows the actual mixed policy when debugging regressions.
- Change the assertion from "the first loader-heap mapping has pkey 1" to "at least one loader-heap mapping still has pkey 1".
Why this fixes it
- The old test was overfitting to a single mapping order rather than checking the actual security/property question.
- On the current branch, the useful invariant is that the loader heap must not collapse entirely to shared `pkey 0`. Some pages may be shared for startup TLS reasons, but other loader-heap mappings should still remain compartment-private.
- Enumerating all loader-heap mappings lets the test distinguish "mixed-policy carveout working as intended" from "everything got retagged shared".
Observed breakdown on this branch
- direct run of the updated test reported:
  - `0x74e9292fe000-0x74e929307000 pkey=0`
  - `0x74e929307000-0x74e92930a000 pkey=1`
  - `0x74e929338000-0x74e92933a000 pkey=1`
  - `0x74e929685000-0x74e929687000 pkey=1`
- That is exactly why the old test was stale: the loader heap now has a mixed pkey distribution, not a uniform one.
Validation
- rebuilt `tests/loader_minimal_malloc/loader_minimal_malloc`
- direct run printed the mixed pkey breakdown above and exited 0
- `ctest --test-dir build/x86_64 -R loader_minimal_malloc --output-on-failure` passed
- strict single-thread dav1d decode still passed after the runtime simplification and test update
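The entry-by-entry scan can be sketched as a small line-driven parser. This is not the code in `tests/loader_minimal_malloc`; `HEAP_MARKER` is an assumed stand-in for the value of `IA2_LDSO_HEAP_MARKER`, and `struct heap_scan`, `heap_scan_line()`, and `heap_scan_count_pkey()` are hypothetical names. It relies only on the documented smaps layout: a header line of the form `start-end perms ...` followed by `Field: value` lines, including `ProtectionKey:` on kernels with pkey support.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stand-in for the value of IA2_LDSO_HEAP_MARKER. */
#define HEAP_MARKER "ia2-loader-heap"
#define MAX_HEAP_MAPS 64

struct heap_map {
    unsigned long start, end;
    int pkey; /* -1 until a ProtectionKey line is seen */
};

struct heap_scan {
    struct heap_map maps[MAX_HEAP_MAPS];
    size_t count;
    int in_heap_entry; /* nonzero while inside a marked smaps entry */
};

/* Feed one smaps line into the scan. */
static void heap_scan_line(struct heap_scan *s, const char *line) {
    assert(s != NULL && line != NULL);
    unsigned long start, end;
    char perms[5];
    int pkey;

    if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3) {
        /* Entry header: every new mapping resets the "inside" flag. */
        s->in_heap_entry = strstr(line, HEAP_MARKER) != NULL &&
                           s->count < MAX_HEAP_MAPS;
        if (s->in_heap_entry)
            s->maps[s->count++] = (struct heap_map){start, end, -1};
    } else if (s->in_heap_entry &&
               sscanf(line, "ProtectionKey: %d", &pkey) == 1) {
        /* Attribute line belonging to the most recent marked mapping. */
        s->maps[s->count - 1].pkey = pkey;
    }
}

/* Number of collected mappings tagged with the given pkey. */
static size_t heap_scan_count_pkey(const struct heap_scan *s, int pkey) {
    size_t n = 0;
    for (size_t i = 0; i < s->count; i++)
        if (s->maps[i].pkey == pkey)
            n++;
    return n;
}
```

A test built on this would read `/proc/self/smaps` with `fgets`, pass each line to `heap_scan_line()`, print every collected range with its pkey, and assert that `heap_scan_count_pkey(&scan, 1)` is at least 1.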
The previous dav1d single-thread fix retagged the startup thread's TCB
window and DTV page from ia2_start(). That made the decode path work, but
it also regressed standard tracer builds: with tracer on, debug on, and
IA2_LIBC_COMPARTMENT off, tests started failing during process startup
because the tracer observed IA2 retagging loader-owned TLS state after the
loader had already initialized it.
This change removes that runtime-side startup retag path entirely:
- delete the x86_64 thread-pointer / DTV helper declarations and
implementations
- stop calling them from ia2_start()
- update the glibc submodule to move the initial-thread TCB/DTV retag into
rtld's init_tls() path instead
The paired glibc change now computes the exact initial-thread static TLS
page range from dl_tls_static_size and TLS_TCB_SIZE instead of retagging a
conservative fixed eight-page window below the TCB. That keeps the loader-
side policy narrow while preserving the dav1d decode path and the tracer
startup behavior.
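To illustrate why a computed range is narrower than a fixed eight-page window, here is a sketch of the page arithmetic. The glibc change itself is not shown in this PR, so the layout is an assumption: the static TLS block is taken to occupy `static_size - tcb_size` bytes directly below the thread pointer, and the TCB `tcb_size` bytes starting at it (the usual x86_64 variant II arrangement). `initial_tls_pages()` and its parameters are hypothetical stand-ins for values derived from `dl_tls_static_size` and `TLS_TCB_SIZE`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_BYTES ((uintptr_t)4096)

struct page_range {
    uintptr_t start, end; /* half-open, page-aligned */
};

static uintptr_t page_down(uintptr_t addr) {
    return addr & ~(PAGE_SIZE_BYTES - 1);
}

static uintptr_t page_up(uintptr_t addr) {
    return page_down(addr + PAGE_SIZE_BYTES - 1);
}

/*
 * Pages covering the initial thread's static TLS block and TCB.
 * Assumed layout: [tp - (static_size - tcb_size), tp) is static TLS,
 * [tp, tp + tcb_size) is the TCB.
 */
static struct page_range initial_tls_pages(uintptr_t tp, size_t static_size,
                                           size_t tcb_size) {
    assert(tcb_size <= static_size);
    assert(tp >= static_size - tcb_size);

    uintptr_t lo = tp - (static_size - tcb_size);
    uintptr_t hi = tp + tcb_size;
    return (struct page_range){page_down(lo), page_up(hi)};
}
```

With illustrative values `tp = 0x203740`, `static_size = 0x1900`, and `tcb_size = 0x900`, the range is `[0x202000, 0x205000)`, three pages instead of eight.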
Moving the startup retag into the loader is not sufficient by itself. Once
rtld keeps the initial-thread TCB neighborhood shared, protect_tls_pages()
can still retag that page back to a compartment pkey later when IA2 walks
PT_TLS segments. In practice that showed up as a dav1d --version crash on
exit: libstdc++ / exception-handling state accessed at fs:-0x20 lived on the
single page immediately below the TCB, and that page had been retagged to
pkey 2, causing SEGV_PKUERR during _dl_fini / __cxa_finalize.
Fix that by keeping the x86_64 PT_TLS carveout narrow:
- preserve the historical ia2_stackptr_0 shared page for compartment 1
- for non-default compartments, preserve only the single page immediately
below the main-thread TCB
- do not restore the older broad "share everything up to the TCB" rule,
which weakened TLS isolation and broke threads, tls_protected, and
three_keys_minimal
Validation:
- fresh ./rewrite.py --llvm-config /usr/bin/llvm-config-18 build
- dav1d --version => RC=0
- dav1d test.ivf --threads 1 => RC=0, Decoded 2/2 frames
- tracer/debug/no-libc sweep:
ctest --test-dir build/tracer_debug_standard_computed_20260417 \
--output-on-failure -j1 -E terminating_threads
=> 35/35 passed
- debug/no-tracer/no-libc sweep:
ctest --test-dir build/standard_debug_notracer_computed_20260417 \
--output-on-failure -j1 -E terminating_threads
=> 35/35 passed
- release/libc-compartment sweep:
ctest --test-dir build/libc_release_computed_20260417 \
--output-on-failure -j1
=> 13/13 passed
The shared allocator declarations no longer emit ia2_extern_pkey annotations. Remove the empty macro and its no-op uses.
This branch no longer emits ia2_extern_pkey annotations, so the matching declaration-parsing path in SourceRewriter is unused.
d79a697 to
1228990
Compare
The dav1d single-thread decode work uses ia2_ldso_heap.h only to interpret x86_64 loader-heap markers emitted by the custom glibc loader. AArch64 does not build or use that loader-heap marker path, but ia2.c was including the header unconditionally. In the Arch/AArch64 CI failure this showed up as a fatal include error when compiling runtime/libia2/ia2.c. Guard the include with __x86_64__ so non-x86 builds do not see the x86-only marker header. This keeps the dav1d-specific loader/TLS path out of ARM builds without changing any x86 behavior.