chore: Avoid Box<dyn SendStream> and Box<dyn RecvStream>#3603
chore: Avoid Box<dyn SendStream> and Box<dyn RecvStream>#3603larseggert wants to merge 7 commits into
Box<dyn SendStream> and Box<dyn RecvStream>#3603Conversation
Extends the pattern from mozilla#3451 and mozilla#3602 to the HTTP/3 stream maps. - `Box<dyn RecvStream>` - `Box<dyn SendStream>` - `Box<dyn Protocol>`
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3603 +/- ##
==========================================
- Coverage 95.15% 95.00% -0.16%
==========================================
Files 111 116 +5
Lines 38033 38529 +496
Branches 38033 38529 +496
==========================================
+ Hits 36192 36604 +412
- Misses 1153 1225 +72
- Partials 688 700 +12
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR continues the ongoing effort (per #3451 and #3602) to remove Box<dyn ...> usage in neqo-http3 by replacing HTTP/3 stream map trait objects with concrete enum dispatch types, reducing vtable-based dynamic dispatch.
Changes:
- Introduces
RecvStreamImplandSendStreamImplenums (withneqo_common::dispatch!) to replaceBox<dyn RecvStream>/Box<dyn SendStream>in stream maps. - Introduces
ProtocolImplto replaceBox<dyn Protocol>for Extended CONNECT protocol dispatch. - Adds
neqo_common::impl_from_variants!macro to generateFrom<T>conversions for enum wrapper variants.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| neqo-http3/src/lib.rs | Adds RecvStreamImpl/SendStreamImpl enum dispatch and From conversions for stream storage. |
| neqo-http3/src/frames/reader.rs | Updates stream reader wrapper to accept &mut dyn RecvStream instead of &mut Box<dyn RecvStream>. |
| neqo-http3/src/features/extended_connect/webtransport_session.rs | Updates protocol control stream types to use RecvStreamImpl/SendStreamImpl. |
| neqo-http3/src/features/extended_connect/session.rs | Replaces boxed Protocol with ProtocolImpl; updates session control streams to concrete stream impl enums. |
| neqo-http3/src/features/extended_connect/mod.rs | Updates ExtendedConnectType::new_protocol() to return ProtocolImpl. |
| neqo-http3/src/features/extended_connect/connect_udp_session.rs | Adjusts protocol trait method signatures to use concrete stream impl enums. |
| neqo-http3/src/control_stream_local.rs | Updates control stream APIs to operate on HashMap<StreamId, Box<RecvStreamImpl>>. |
| neqo-http3/src/connection.rs | Replaces stream maps with Box<RecvStreamImpl> / Box<SendStreamImpl> and updates insertion sites accordingly. |
| neqo-http3/src/connection_server.rs | Updates server-side stream creation to insert concrete stream impl enums. |
| neqo-http3/src/connection_client.rs | Updates client-side push stream registration to insert concrete stream impl enums. |
| neqo-common/src/lib.rs | Adds impl_from_variants! macro used to simplify enum wrapper conversions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Clean extension of the devirtualization pattern from #3451 / #3602 to the HTTP/3 stream maps. Replacing Box<dyn RecvStream>, Box<dyn SendStream>, and Box<dyn Protocol> with concrete enum dispatch (RecvStreamImpl, SendStreamImpl, ProtocolImpl) eliminates vtable overhead on hot paths while retaining Box for HashMap storage.
The new impl_from_variants! macro in neqo-common is a nice reusable utility, and the existing dispatch! macro keeps the boilerplate in the trait impls manageable. The ProtocolImpl stored inline (no Box) in Session is a good call since the struct is already heap-allocated via Rc<RefCell<…>>.
One item to verify: the private_interfaces lint suppression removal on StreamReaderRecvStreamWrapper::new under #[cfg(fuzzing)] (see inline comment).
There was a problem hiding this comment.
Clean extension of the devirtualization pattern from #3451 / #3602 to the HTTP/3 stream maps. Replacing Box<dyn RecvStream>, Box<dyn SendStream>, and Box<dyn Protocol> with concrete enum dispatch (RecvStreamImpl, SendStreamImpl, ProtocolImpl) eliminates vtable overhead on the hot receive/send paths while keeping Box for HashMap storage to avoid large moves on rehash.
The impl_from_variants! macro is a nice reusable addition, and the dispatch!-based boilerplate is manageable. Storing ProtocolImpl inline (no Box) in Session is a good call since the struct is already heap-allocated via Rc<RefCell<…>>.
A few minor notes (see inline comments):
- The
Protocoltrait signatures still pass&mut Box<…>where&mut RecvStreamImpl/&mut SendStreamImplwould avoid a needless indirection layer — worth a follow-up cleanup. statsimpls can use the dispatch macros for consistency with other methods.- The
cfg_attr(fuzzing, expect(private_interfaces))issue flagged by prior reviewers has been addressed in the latest commit.
No blocking issues found.
There was a problem hiding this comment.
Clean devirtualization of Box<dyn RecvStream>, Box<dyn SendStream>, and Box<dyn Protocol> into concrete enum dispatch, consistent with the pattern established in #3451 and #3602.
A few observations:
- The
impl_from_variants!macro inneqo-commonis a nice addition that reduces boilerplate across the enum definitions. - The manual
matcharms forstats,http_stream, andextended_connect_sessioninRecvStreamImpl/SendStreamImplare necessary — they cannot use thedispatch!macro becauseRc<RefCell<Session>>implements bothRecvStreamandSendStream, making those shared method names ambiguous. (The bot suggestions to use the dispatch macro forstatswould not compile — see inline comment.) - The
const fnpromotions on theHttp3Connectionaccessor methods are a nice bonus from removing trait object references. - The
StreamReaderRecvStreamWrapperchange from&mut Box<dyn RecvStream>to&mut dyn RecvStreamis a good independent cleanup; thecfg_attr(fuzzing, expect(private_interfaces))was properly restored. - The earlier bot concerns (fuzzing
cfg_attr, misleading Box comment) have been addressed in the latest commit.
LGTM overall.
There was a problem hiding this comment.
Clean extension of the devirtualization pattern from #3451 / #3602. Replacing Box<dyn RecvStream>, Box<dyn SendStream>, and Box<dyn Protocol> with concrete enum dispatch (RecvStreamImpl, SendStreamImpl, ProtocolImpl) eliminates vtable overhead on the hot receive/send paths.
A few architectural observations:
- The manual
matcharms forstats,http_stream, andextended_connect_sessionare necessary —Rc<RefCell<Session>>implements bothRecvStreamandSendStream, so thedispatch!macro would produce ambiguous method calls for those shared names. The earlier bot suggestions to use dispatch macros there would not compile. - The
impl_from_variants!macro is a nice reusable utility. Note that it would produce conflictingFromimpls if two variants ever wrap the same type — not an issue today, but worth keeping in mind if the enums grow. - The
const fnpromotions onHttp3Connectionaccessors are a nice side benefit of removing trait object references. - The
Boxwrapping is still justified: these enums are sized to their largest variant, and boxing avoids large moves onHashMaprehash.
No blocking issues found. Three minor inline suggestions for future work (see comments).
5756ac3 to
993b98d
Compare
There was a problem hiding this comment.
Clean continuation of the devirtualization effort from #3451 and #3602. Replacing Box<dyn RecvStream>, Box<dyn SendStream>, and Box<dyn Protocol> with concrete enum dispatch (RecvStreamImpl, SendStreamImpl, ProtocolImpl) is well-executed.
A few observations on the overall design:
-
#[inline(never)]strategy is sound. The annotations are applied comprehensively to all variant impls of the hot dispatch methods (receive,send,send_data,read_data), preventing LTO from inlining every variant body into the enum dispatch site. Smaller methods (reset,done,has_data_to_send, etc.) are left un-annotated, which is the right call — their bodies are small enough that inlining at the match site is beneficial. -
Manual matches for
stats,http_stream,extended_connect_sessionare necessary and correct.Rc<RefCell<Session>>implements bothRecvStreamandSendStream, both of which definehttp_streamandstats(with different return types). Thedispatch!macro expands tov.method(), which would be ambiguous for theExtendedConnectvariant. The earlier bot suggestions to use the dispatch macros for these methods would not compile. -
const fnpromotions on theHttp3Connectionaccessor methods are a nice side benefit — these couldn't beconstwhen the return type containeddyn Trait. -
impl_from_variants!macro is a good reusable addition. Note that it would produce conflictingFromimpls if two variants of the same enum ever wrapped the same type — not an issue today sinceRc<RefCell<ExtConnSession>>appears in both enums but they are different enums.
No blocking issues. See inline comments for two minor architectural notes about the residual Protocol trait and the StreamReaderRecvStreamWrapper vtable path.
993b98d to
8b5dbda
Compare
There was a problem hiding this comment.
Clean continuation of the devirtualization effort from #3451 and #3602. Replacing Box<dyn RecvStream>, Box<dyn SendStream>, and Box<dyn Protocol> with concrete enum dispatch (RecvStreamImpl, SendStreamImpl, ProtocolImpl) is well-executed.
A few notes on the design:
- The manual
matcharms forstats,http_stream, andextended_connect_sessionare necessary and correct —Rc<RefCell<Session>>implements bothRecvStreamandSendStream, making those shared method names ambiguous for thedispatch!macro. The earlier bot suggestions to replace them with the dispatch macros would not compile. - The
#[inline(never)]onreceive_internalis well-targeted — it prevents the ~200-line body from leaking into the dispatch site under LTO, while keeping thereceivewrapper thin enough to inline.read_datais also dispatched but its heaviest work already goes throughreceive_internal(line 445), so the current annotation placement is effective. - The
impl_from_variants!macro is a good reusable addition toneqo-common. Note that conflictingFromimpls (two variants wrapping the same type) would be caught at compile time. - Storing
ProtocolImplinline (noBox) inSessionis the right call since the struct is already heap-allocated viaRc<RefCell<…>>. See inline comment about extending this to the control stream fields.
No blocking issues.
Benchmark resultsNo significant performance differences relative to 2d08e02. All resultstransfer/1-conn/1-100mb-resp (aka. Download): No change in performance detected. time: [175.47 ms 175.86 ms 176.26 ms]
thrpt: [567.34 MiB/s 568.63 MiB/s 569.91 MiB/s]
change:
time: [-0.3637% +0.0789% +0.4717] (p = 0.72 > 0.05)
thrpt: [-0.4695% -0.0789% +0.3650]
No change in performance detected.transfer/1-conn/10_000-parallel-1b-resp (aka. RPS): Change within noise threshold. time: [282.54 ms 284.53 ms 286.52 ms]
thrpt: [34.902 Kelem/s 35.145 Kelem/s 35.394 Kelem/s]
change:
time: [+0.1097% +1.1095% +2.1162] (p = 0.03 < 0.05)
thrpt: [-2.0724% -1.0973% -0.1095]
Change within noise threshold.transfer/1-conn/1-1b-resp (aka. HPS): No change in performance detected. time: [38.986 ms 39.180 ms 39.393 ms]
thrpt: [25.385 B/s 25.523 B/s 25.650 B/s]
change:
time: [-1.1631% -0.4326% +0.3067] (p = 0.27 > 0.05)
thrpt: [-0.3057% +0.4345% +1.1768]
No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low mild
4 (4.00%) high mild
10 (10.00%) high severetransfer/1-conn/1-100mb-req (aka. Upload): Change within noise threshold. time: [178.45 ms 178.87 ms 179.35 ms]
thrpt: [557.57 MiB/s 559.05 MiB/s 560.38 MiB/s]
change:
time: [-0.6524% -0.3315% -0.0195] (p = 0.05 < 0.05)
thrpt: [+0.0195% +0.3326% +0.6567]
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severestreams/walltime/1-streams/each-1000-bytes: No change in performance detected. time: [570.60 µs 572.71 µs 575.16 µs]
change: [-0.9581% -0.4219% +0.1544] (p = 0.14 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) high mild
9 (9.00%) high severestreams/walltime/1000-streams/each-1-bytes: No change in performance detected. time: [11.769 ms 11.802 ms 11.846 ms]
change: [-0.3438% +0.0788% +0.5491] (p = 0.74 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severestreams/walltime/1000-streams/each-1000-bytes: Change within noise threshold. time: [40.316 ms 40.365 ms 40.417 ms]
change: [+0.5124% +0.6806% +0.8538] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severestreams-flow-controlled/walltime/1-streams/each-4194304-bytes: Change within noise threshold. time: [31.252 ms 31.297 ms 31.345 ms]
change: [-0.7940% -0.5596% -0.3103] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mildstreams-flow-controlled/walltime/10-streams/each-1048576-bytes: No change in performance detected. time: [88.921 ms 90.137 ms 91.399 ms]
change: [-2.5803% -0.6927% +1.4131] (p = 0.49 > 0.05)
No change in performance detected.transfer/walltime/pacing-false/varying-seeds: Change within noise threshold. time: [21.332 ms 21.356 ms 21.390 ms]
change: [+0.8462% +1.0020% +1.1771] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severetransfer/walltime/pacing-true/varying-seeds: Change within noise threshold. time: [21.818 ms 21.845 ms 21.882 ms]
change: [+1.3815% +1.5403% +1.7365] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severetransfer/walltime/pacing-false/same-seed: No change in performance detected. time: [21.172 ms 21.193 ms 21.220 ms]
change: [-0.3908% -0.1742% +0.0117] (p = 0.09 > 0.05)
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severetransfer/walltime/pacing-true/same-seed: Change within noise threshold. time: [21.464 ms 21.489 ms 21.521 ms]
change: [-0.3615% -0.1995% -0.0136] (p = 0.02 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severeDownload data for |
Failed Interop TestsQUIC Interop Runner, client vs. server, differences relative to
All resultsSucceeded Interop TestsQUIC Interop Runner, client vs. server neqo-pr as client
neqo-pr as server
Unsupported Interop TestsQUIC Interop Runner, client vs. server neqo-pr as client
neqo-pr as server
|
Client/server transfer resultsPerformance differences relative to 2d08e02. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Table above only shows statistically significant changes. See all results below. All resultsTransfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Download data for |
Extends the pattern from #3451 and #3602 to the HTTP/3 stream maps.
Box<dyn RecvStream>Box<dyn SendStream>Box<dyn Protocol>