Skip to content

sweep: isolate bad inputs on mempool rejection#10842

Draft
yyforyongyu wants to merge 10 commits into
lightningnetwork:masterfrom
yyforyongyu:isolate-mempool-failure
Draft

sweep: isolate bad inputs on mempool rejection#10842
yyforyongyu wants to merge 10 commits into
lightningnetwork:masterfrom
yyforyongyu:isolate-mempool-failure

Conversation

@yyforyongyu

Copy link
Copy Markdown
Member

Summary

  • add no-broadcast subset probes for unattributed mempool/script rejections
  • route completed diagnoses through BadInputs so only singleton bad inputs are failed fatally
  • document sweeper failure-attribution roles and add a release note

Fixes #10840.

Tests

  • git diff --check
  • make lint
  • GOWORK=off go test ./contractcourt ./lnrpc/walletrpc ./sweep

@gemini-code-assist

Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a diagnostic mechanism for the sweeper to identify and isolate problematic inputs when a transaction batch is rejected by the mempool. By performing no-broadcast probes on subsets of inputs, the system can now pinpoint specific inputs causing failures rather than failing the entire batch. This improves the robustness of the sweeping process by ensuring that only truly invalid inputs are treated as fatal, while valid inputs can continue to be processed and retried.

Highlights

  • Mempool Rejection Diagnosis: Implemented no-broadcast subset probes to isolate specific inputs causing mempool or script rejections, preventing the fatal failure of entire batches when only a single input is invalid.
  • Sweeper Failure Attribution: Updated the UtxoSweeper to handle diagnosed bad inputs by marking them as fatal while allowing the remaining inputs in the batch to be retried.
  • Documentation and Testing: Added release notes documenting the new failure-attribution logic and included comprehensive unit tests to verify the diagnosis flow and edge cases.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to isolate and diagnose bad inputs in a rejected sweep batch using no-broadcast mempool acceptance probes. When a multi-input batch is rejected with an unattributed non-fee mempool or script error, the TxPublisher performs a binary search probe of the input subsets to identify the specific inputs causing the failure. The UtxoSweeper then marks only those identified bad inputs as fatal, allowing the remaining valid inputs in the batch to be retried. A review comment suggests adding a defensive nil check for r.feeFunction in shouldDiagnoseBadInputs to prevent a potential nil pointer dereference panic.

Comment thread sweep/fee_bumper.go
@github-actions github-actions Bot added the severity-critical Requires expert review - security/consensus critical label May 25, 2026
@github-actions

Copy link
Copy Markdown

🔴 PR Severity: CRITICAL

Automated classification | 4 files (excl. tests) | 417 lines changed (excl. tests)

🔴 Critical (2 files)
  • sweep/fee_bumper.go - Fee bumping logic in the sweep package (output sweeping, fund recovery)
  • sweep/sweeper.go - Core sweeper implementation for on-chain output sweeping and fund recovery
🟢 Low (2 files)
  • docs/release-notes/release-notes-0.21.1.md - Release notes documentation
  • sweep/README.md - Package documentation

Analysis

This PR modifies sweep/fee_bumper.go and sweep/sweeper.go, both under the sweep/ package which is explicitly classified as CRITICAL because it handles output sweeping, fund recovery, and fee bumping — all security-sensitive on-chain fund management operations. Errors in this package can result in loss of funds or inability to recover outputs from force-closed channels.

The bulk of changes (265 lines) are in sweep/fee_bumper.go, which directly impacts how fee rates are calculated and applied to sweep transactions. This requires expert review to ensure correctness under all fee market conditions.

Severity bump check: 4 non-test files (threshold: >20), 417 non-test lines (threshold: >500), single critical package — no bump applied.


To override, add a severity-override-{critical,high,medium,low} label.
<!-- pr-severity-bot -->

Comment thread sweep/fee_bumper.go
Comment thread sweep/fee_bumper.go
Comment thread sweep/fee_bumper.go
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/sweeper.go Outdated
@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch from 87d1e3e to 75b9fe1 Compare May 26, 2026 12:27
@saubyk saubyk added this to v0.21 May 26, 2026
@saubyk saubyk moved this to In progress in v0.21 May 26, 2026
@saubyk saubyk added this to the v0.21.1 milestone Jun 1, 2026
@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch from 75b9fe1 to 4ef522b Compare June 10, 2026 07:16
Comment thread sweep/fee_bumper.go Outdated
@yyforyongyu

Copy link
Copy Markdown
Member Author

LGTM ⚡ (claude-opus-4-8[1m])

@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch from 4ef522b to b1443e9 Compare June 10, 2026 11:18
@yyforyongyu

yyforyongyu commented Jun 10, 2026

Copy link
Copy Markdown
Member Author

LGTM 🦦 openai/gpt-5.5

@yyforyongyu

Copy link
Copy Markdown
Member Author

LGTM ✨ (claude-opus-4-8[1m])

Comment thread sweep/sweeper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/sweeper_test.go Outdated
@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch 2 times, most recently from a5f51c3 to 24a2759 Compare June 18, 2026 12:51
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/fee_bumper.go Outdated
Comment thread sweep/sweeper.go
@yyforyongyu

Copy link
Copy Markdown
Member Author

No issues found by claude-opus-4-8[1m] 🌟

@yyforyongyu

Copy link
Copy Markdown
Member Author

No issues found by openai/gpt-5.5 ✅

@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch from 24a2759 to 7ca790c Compare June 18, 2026 13:57
Comment thread sweep/fee_bumper.go Outdated
@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch from 7ca790c to 7991909 Compare June 18, 2026 14:33
@saubyk saubyk modified the milestones: v0.21.1, v0.21.2 Jun 18, 2026
Mark errors returned from mempool acceptance checks so later
logic can distinguish rejected transactions from construction, signing,
or fee-selection failures.
Add the eligibility checks for no-broadcast bad-input probes and
cover the cases that must keep the existing whole-set failure path.
Add no-broadcast probe transactions for input subsets, preserving
missing-input handling and returning concrete mempool policy errors to
the caller.
Identify the mempool rejection classes that can be attributed to a
single input script or witness, leaving broader policy failures for
the caller to handle without fatal attribution.
Add the final-shape bad-input search helper that recursively probes
subsets without broadcasting, and cover both singleton attribution and
non-attributable policy failures.
Wire bad-input diagnosis into the non-fee mempool rejection path so
a singleton script failure becomes a retryable failed batch annotated
with the diagnosed input.
Cover non-attributable diagnosis outcomes, probe construction and
missing-input errors, and initial-broadcast paths that must preserve
the existing whole-set fatal behavior.
Handle diagnosed bad-input results in the sweeper by marking only
the isolated input fatal and returning the rest of the set to the
normal retry flow.
@yyforyongyu yyforyongyu force-pushed the isolate-mempool-failure branch from 7991909 to e111613 Compare June 22, 2026 09:54
Comment thread sweep/fee_bumper.go

badInput, probeErr := t.findBadInput(r)
if errors.Is(probeErr, ErrInputMissing) {
return t.handleMissingInputs(r)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call is introduced in 8df710517, but handleMissingInputs is only made nil-r.tx safe in the later 0c86e21e5 test commit. That means checking out or bisecting the implementation commit can panic on the initial-broadcast probe-missing-input path before a tx exists, and the production fix is hidden in a sweep/test: commit. Please fold the nil guard into the commit that adds this call, or split the test commit so each commit is self-contained. (openai/gpt-5.5)

Comment thread sweep/fee_bumper.go
// input-standardness failures narrow the search left.
// Missing inputs keep their dedicated handling. Policy/backend
// errors stop the search without marking any input fatal.
err := t.probeInputSet(r, left)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probing only left can abort diagnosis before the actual bad input is inspected. If the set needed wallet inputs, or the left half contains only a required-output HTLC, probeInputSet returns a construction error such as ErrNotEnoughInputs; this default path returns that error and handleBadInputs later emits TxFailed with no BadInput, so the original cluster can be retried unchanged forever even when the bad input is in the right half. Please keep subset probes constructible, or continue with the other half/a singleton scan when a subset construction error is hit, so one invalid input does not pin the rest of the batch. (openai/gpt-5.5)

@yyforyongyu

Copy link
Copy Markdown
Member Author

No issues found by claude-opus-4-8[1m] 🌿

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

severity-critical Requires expert review - security/consensus critical

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

[bug]: sweeper batched 4 force-close outputs into invalid tx, then removed all inputs as State=Fatal

2 participants