fix(iroh-relay): Try connecting to relays over both IPv4 and IPv6#4299

Open
Frando wants to merge 12 commits into main from Frando/relay-v6-v4

Conversation

Member

@Frando Frando commented Jun 2, 2026

Description

Currently the relay client resolves a relay's hostname to a single address and dials
only that one. A prefer_ipv6 flag decides whether we query for an A or AAAA record.
If the DNS query returns an unreachable AAAA record for whatever reason, or we somehow
end up in a state where prefer_ipv6 is true but the machine does not actually support IPv6,
then dialing the relay fails.

This PR fixes this by resolving both A and AAAA records and racing them happy-eyeballs style: the addresses are tried in
preference order, each dial starting a short delay after the previous, and the
first connection to succeed wins while the rest are cancelled. The same race
covers both direct and proxied relay connections.
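
To make the race concrete, here is a minimal, simulated sketch of the scheme described above. It is not the code from this PR: the function names, the 250 ms delay (the default suggested by the happy eyeballs RFC, RFC 8305) and the idea of modelling each dial as a fixed latency are all assumptions made for illustration. Real code would run the dials concurrently and cancel the losers.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Hypothetical helper: puts the preferred address family first.
/// `sort_by_key` is stable, so the resolver's order is kept within a family.
fn preference_order(mut addrs: Vec<SocketAddr>, prefer_ipv6: bool) -> Vec<SocketAddr> {
    // Key is `false` for the preferred family, and `false` sorts before `true`.
    addrs.sort_by_key(|a| a.is_ipv6() != prefer_ipv6);
    addrs
}

/// Hypothetical helper: attempt `i` starts `i * delay` after the first one.
fn staggered_schedule(addrs: &[SocketAddr], delay: Duration) -> Vec<(Duration, SocketAddr)> {
    addrs
        .iter()
        .enumerate()
        .map(|(i, addr)| (delay * i as u32, *addr))
        .collect()
}

/// Simulated race: `connect_latency` returns `Some(latency)` if a dial to the
/// address would succeed after `latency`, or `None` if it would never succeed.
/// The winner is the attempt that finishes first; ties go to the earlier attempt.
fn pick_winner(
    schedule: &[(Duration, SocketAddr)],
    connect_latency: impl Fn(&SocketAddr) -> Option<Duration>,
) -> Option<SocketAddr> {
    schedule
        .iter()
        .filter_map(|(start, addr)| connect_latency(addr).map(|lat| (*start + lat, *addr)))
        .min_by_key(|(done, _)| *done)
        .map(|(_, addr)| addr)
}
```

With one unreachable AAAA address and one working A address, `pick_winner` returns the IPv4 address even when `prefer_ipv6` is true, which is the fallback the linked issue asks for.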

Fixes #4069

Notes & open questions

Also fixes an unrelated doc comment that was very weirdly written and referred to, I think, some configuration from a very long time ago.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@Frando Frando changed the title Frando/relay v6 v4 fix(iroh-relay): Try connecting to relays over both IPv4 and IPv6 Jun 2, 2026

github-actions Bot commented Jun 2, 2026

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/4299/docs/iroh/

Last updated: 2026-06-05T13:09:11Z

@Frando Frando force-pushed the Frando/relay-v6-v4 branch from dfcc037 to 4da8397 Compare June 2, 2026 11:25
@Frando Frando force-pushed the Frando/relay-v6-v4 branch from 4da8397 to 3d14dc0 Compare June 2, 2026 11:25

github-actions Bot commented Jun 2, 2026

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 462d662

@n0bot n0bot Bot added this to iroh Jun 2, 2026
@github-project-automation github-project-automation Bot moved this to 🚑 Needs Triage in iroh Jun 2, 2026
Contributor

@flub flub left a comment

looks decent, i was kind of expecting this would be harder when i saw the happy-eyeballs mentioned. I've long thought such a strategy would make more sense, thanks for going that way.

I think the resolving itself being a stream is possible given how it's used? Only challenge then is how to prefer IPv6 but you could give all IPv4 results an extra 50-100ms delay if IPv6 is preferred or something like that?
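
As a rough illustration of the suggestion above, a per-address penalty could look like the sketch below. The helper name and its parameters are hypothetical and do not come from the PR; the only idea taken from the comment is that IPv4 results get an extra delay when IPv6 is preferred.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Hypothetical helper: extra delay to apply before dialing `addr`.
/// IPv4 addresses are held back by `ipv4_penalty` only when IPv6 is preferred.
fn family_penalty(addr: &SocketAddr, prefer_ipv6: bool, ipv4_penalty: Duration) -> Duration {
    if prefer_ipv6 && addr.is_ipv4() {
        ipv4_penalty
    } else {
        Duration::ZERO
    }
}
```

Because the penalty is computed per result, it works on a stream of resolved addresses without needing the full list up front.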

Comment thread iroh-dns/src/dns.rs Outdated
Comment thread iroh-relay/src/client/tls.rs Outdated
Comment thread iroh-relay/src/server.rs Outdated
Member Author

Frando commented Jun 4, 2026

I pushed another commit that makes the design fully streaming and fixes the semantics, especially for fail-fast (i.e. when the first attempt fails, do not wait before starting the second one). It also now interleaves attempts between IPv6 and IPv4, as recommended by the happy eyeballs RFC.

This all increases complexity a bit, but I think it is quite correct now and still fairly straightforward.
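
The two behaviours mentioned in this comment, interleaving and fail-fast, can be sketched as pure functions. This is an illustration under assumptions, not the streaming implementation from the PR: both helper names are hypothetical, the interleaving works on an already collected list, and the fail-fast rule only models a single previous attempt.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Hypothetical helper: alternates address families, starting with the
/// preferred one, and keeps the original order within each family.
fn interleave_families(addrs: &[SocketAddr], prefer_ipv6: bool) -> Vec<SocketAddr> {
    let (preferred, other): (Vec<SocketAddr>, Vec<SocketAddr>) = addrs
        .iter()
        .copied()
        .partition(|a| a.is_ipv6() == prefer_ipv6);
    let mut preferred = preferred.into_iter();
    let mut other = other.into_iter();
    let mut out = Vec::with_capacity(addrs.len());
    loop {
        match (preferred.next(), other.next()) {
            (None, None) => break,
            // `Option` is iterable, so a missing side simply adds nothing.
            (p, o) => {
                out.extend(p);
                out.extend(o);
            }
        }
    }
    out
}

/// Hypothetical fail-fast rule: the next attempt starts when the stagger timer
/// fires, or as soon as the previous attempt fails, whichever comes first.
fn next_start(prev_start: Duration, prev_failed_at: Option<Duration>, delay: Duration) -> Duration {
    let timer = prev_start + delay;
    match prev_failed_at {
        Some(failed) => failed.min(timer),
        None => timer,
    }
}
```

For example, with two AAAA records and one A record and IPv6 preferred, the dial order is IPv6, IPv4, IPv6, so a broken IPv6 path delays the first IPv4 attempt by at most one stagger interval.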

Comment thread iroh-relay/src/server.rs Outdated
@Frando Frando requested a review from flub June 5, 2026 08:09
Contributor

@flub flub left a comment

this is looking lovely! One nitpick, one slightly more involved question that I might be wrong on.

Comment thread iroh-dns/src/dns.rs
Comment thread iroh-relay/src/client/tls.rs Outdated
Comment on lines +268 to +270
// Delay after which to start the next connection attempt, or `None` for immediately.
let next_dial_delay = MaybeFuture::None;
tokio::pin!(next_dial_delay);
Contributor

I don't particularly like using None for "immediately". MaybeFuture::None is always pending and this is awaited on. So the code depends on always setting it back to Some before restarting the loop. It seems a bit dangerous.

Would this not be easier with making next_dial_delay a tokio::time::Interval? You can then use reset_immediately to dial immediately, and poll_tick to see if you need to dial.

Member Author

I didn't find a good way to structure this with Interval, because we only want to set a timer if a connection attempt is already inflight. So I kept the MaybeFuture version, which is correct AFAICS, and improved the variable naming and inline comments.
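
For readers following this thread, the behaviour under discussion can be shown with a simplified stand-in for a `MaybeFuture`-style type. This is an assumption-laden sketch, not the type used in the PR: it requires `Unpin` to avoid pin projection, and `poll_once` with its no-op waker is a hypothetical test helper. It demonstrates only the point raised in review, that the `None` variant never completes, so a loop awaiting it must have another branch that makes progress or must set it to `Some` first.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Simplified stand-in: either wraps a future or is empty.
enum MaybeFuture<F> {
    Some(F),
    None,
}

impl<F: Future + Unpin> Future for MaybeFuture<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        match self.get_mut() {
            // Delegate to the inner future when one is set.
            MaybeFuture::Some(fut) => Pin::new(fut).poll(cx),
            // The empty variant is pending forever.
            MaybeFuture::None => Poll::Pending,
        }
    }
}

/// Waker that does nothing, enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls `fut` exactly once outside of any runtime.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Polling `MaybeFuture::None` any number of times yields `Poll::Pending`, while `MaybeFuture::Some(std::future::ready(x))` resolves on the first poll.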

@Frando Frando requested a review from flub June 5, 2026 13:40
Projects

Status: 🚑 Needs Triage

Development

Successfully merging this pull request may close these issues.

bug: No fallback to IPv4 when IPv6 is advertised but failing (iroh_relay::client)

2 participants