fix(iroh-relay): Try connecting to relays over both IPv4 and IPv6#4299
fix(iroh-relay): Try connecting to relays over both IPv4 and IPv6#4299Frando wants to merge 12 commits into
Conversation
|
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/4299/docs/iroh/ Last updated: 2026-06-05T13:09:11Z |
dfcc037 to
4da8397
Compare
4da8397 to
3d14dc0
Compare
flub
left a comment
There was a problem hiding this comment.
looks decent, i was kind of expecting this would be harder when i saw the happy-eyeballs mentioned. I've long thought such a strategy would make more sense, thanks for going that way.
I think the resolving itself being a stream is possible given how it's used? Only challenge then is how to prefer IPv6 but you could give all IPv4 results an extra 50-100ms delay if IPv6 is preferred or something like that?
|
I pushed another commit that makes the design fully streaming, and fix semantics especially for fail fast (i.e. when the first attempt fails, do not wait until starting the second one). Also now interleaves attempts between IPv6 and IPv4, as recommended by the happy eyeballs RFC. This all increases complexity a bit, but I think it is quite correct now and still fairly straightforward. |
flub
left a comment
There was a problem hiding this comment.
this is looking lovely! One nitpick, one slightly more involved question that I might be wrong on.
| // Delay after which to start the next connection attempt, or `None` for immediately. | ||
| let next_dial_delay = MaybeFuture::None; | ||
| tokio::pin!(next_dial_delay); |
There was a problem hiding this comment.
I don't particularly like using None for "immediately". MaybeFuture::None is always pending and this is awaited on. So the code depends on always setting it back to Some before restarting the loop. It seems a bit dangerous.
Would this not be easier with making next_dial_delay a tokio::time::Interval? You can then use reset_immediately to dial immediately, and poll_tick to see if you need to dial.
There was a problem hiding this comment.
I didn't find a good way to structure this with Interval, because we only want to set a timer if a connection attempt is already inflight. So I kept the MaybeFuture version, which is correct AFAICS, and improved the variable naming and inline comments.
Description
Currently the relay client resolved a relay's hostname to a single address and dials
only that one. A
prefer_ipv6flag decides whether we query for an A or AAAA record.If the DNS query returns an unreachable AAAA record for whatever reason, or we somehow
end in a state where
prefer_ipv6is true but the machine does not actually support IPv6,then dialing the relay fails.
This PR fixes this by resolving both A and AAAA records and racing them happy-eyeballs style: the addresses are tried in
preference order, each dial starting a short delay after the previous, and the
first connection to succeed wins while the rest are cancelled. The same race
covers both direct and proxied relay connections.
Fixes #4069
Notes & open questions
Also fixes a unrelated doc comment that was very weirdly written and referred to, I think, some configuration from a very long time ago.
Change checklist