Skip to content

Discriminate WG21 index refresh failures and propagate stale signal t…#76

Open
henry0816191 wants to merge 3 commits into
developfrom
feat/index-refresh-error-discrimination
Open

Discriminate WG21 index refresh failures and propagate stale signal t…#76
henry0816191 wants to merge 3 commits into
developfrom
feat/index-refresh-error-discrimination

Conversation

@henry0816191

@henry0816191 henry0816191 commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Add IndexRefreshError and IndexRefreshResult so WG21Index.refresh() / fetch() distinguish transient download failures (TIMEOUT, RATE_LIMIT, NETWORK) from permanent ones (CONFIGURATION).
  • Return IndexRefreshResult(papers, stale=True) on stale cache fallback instead of silently returning an empty or in-memory dict.
  • Raise ConfigurationError when no index data is available (e.g. non-dict JSON with no cache).
  • Update Scheduler to unwrap IndexRefreshResult, log INDEX-STALE, skip advancing _last_successful_poll on stale index, and handle IndexRefreshError in run_forever() (retry next cycle).

Test plan

  • pytest — full suite (392 passed)
  • ruff check / ruff format --check
  • mypy on protocols, monitor, __main__
  • New unit tests: timeout/429/stale/configuration error paths in test_sources.py
  • New integration tests: error propagation and run_forever() behavior in test_monitor.py

Related Issue

Summary by CodeRabbit

  • Bug Fixes

    • Improved WG21 index refresh robustness with retry-eligible error classification versus configuration-fatal failures.
    • Added structured refresh outcomes to support consistent stale-cache fallback when live refresh fails.
    • Updated polling behavior and logs to correctly suppress scheduling delays when an index refresh is marked stale.
  • Tests

    • Expanded coverage for transient timeout/rate-limit handling, stale fallback behavior, and correct “continue polling vs halt” outcomes.

@henry0816191 henry0816191 self-assigned this Jun 23, 2026
@henry0816191 henry0816191 requested a review from wpak-ai as a code owner June 23, 2026 18:38
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds IndexRefreshError (with FailureCategory) and IndexRefreshResult (with papers and stale) types. Updates WG21Index._download() to return a (data, category) tuple, refresh()/fetch() to return IndexRefreshResult or raise typed errors. Scheduler gains _index_stale tracking, a _advance_staleness_clock_if_ok() helper, and a dedicated IndexRefreshError handler in run_forever(). Tests updated throughout.

Changes

WG21 Index Refresh Error Discrimination

Layer / File(s) Summary
New error and result contracts
src/paperscout/errors.py, src/paperscout/models.py
IndexRefreshError exception carries a FailureCategory; IndexRefreshResult frozen dataclass carries papers and a stale bool flag.
WG21Index download and refresh restructuring
src/paperscout/sources.py
_download() returns (data, category) tuple with per-exception FailureCategory; refresh() returns IndexRefreshResult for fresh and stale-cache paths and raises ConfigurationError or IndexRefreshError via _raise_on_download_failure(); fetch() return type updated to IndexRefreshResult.
Scheduler staleness gating and IndexRefreshError handling
src/paperscout/monitor.py
Scheduler.__init__ adds _index_stale flag; _poll_sources reads IndexRefreshResult.stale and logs INDEX-STALE; new _advance_staleness_clock_if_ok() suppresses _last_successful_poll advancement when stale; poll_once call sites replaced with the helper; run_forever gains a dedicated except IndexRefreshError clause.
WG21Index source tests updated for structured results
tests/test_sources.py
refresh() tests assert result.stale and result.papers; _download() tests assert (data, category) tuple with CONFIGURATION/NETWORK/RATE_LIMIT categories; timeout/stale-fallback/no-data tests assert IndexRefreshError categories and ConfigurationError.
Scheduler integration tests for index refresh error paths
tests/test_monitor.py
Existing mock updated to return IndexRefreshResult; _make_scheduler_with_real_wg21 helper added; four new tests cover poll_once propagation of IndexRefreshError, stale-index usage, run_forever continuation on timeout, and run_forever halt on configuration error.
Protocol conformance test mock alignment
tests/test_callback_protocols.py, tests/test_datasource_protocol.py
WG21 mock fetch return values updated from plain {} dicts to IndexRefreshResult instances.

Sequence Diagram(s)

sequenceDiagram
  participant Scheduler
  participant WG21Index
  participant httpx

  rect rgba(70, 130, 180, 0.5)
    Note over Scheduler,httpx: Fresh download path
    Scheduler->>WG21Index: fetch()
    WG21Index->>httpx: GET index JSON
    httpx-->>WG21Index: 200 OK dict
    WG21Index-->>Scheduler: IndexRefreshResult(papers=..., stale=False)
    Scheduler->>Scheduler: _advance_staleness_clock_if_ok()
  end

  rect rgba(180, 70, 70, 0.5)
    Note over Scheduler,httpx: Timeout with stale cache
    Scheduler->>WG21Index: fetch()
    WG21Index->>httpx: GET index JSON
    httpx-->>WG21Index: ReadTimeout
    WG21Index-->>Scheduler: IndexRefreshResult(papers=cached, stale=True)
    Scheduler->>Scheduler: _index_stale=True, log INDEX-STALE
    Scheduler->>Scheduler: _advance_staleness_clock_if_ok() → no-op
  end

  rect rgba(180, 100, 30, 0.5)
    Note over Scheduler,httpx: Timeout with no cache
    Scheduler->>WG21Index: fetch()
    WG21Index->>httpx: GET index JSON
    httpx-->>WG21Index: ReadTimeout
    WG21Index-->>Scheduler: raises IndexRefreshError(TIMEOUT)
    Scheduler->>Scheduler: run_forever logs failure_category=TIMEOUT, continues
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • cppalliance/paperscout#29: Established the FailureCategory enum and initial _download() error logging in sources.py and run_forever exception handling that this PR directly extends.
  • cppalliance/paperscout#40: Both PRs modify src/paperscout/monitor.py's Scheduler.run_forever() exception-handling logic to add dedicated handling for specific failure types.
  • cppalliance/paperscout#65: Introduced DataSource.fetch() protocol and wired WG21Index.fetch() through Scheduler, which this PR updates to return IndexRefreshResult.

Suggested reviewers

  • wpak-ai

Poem

🐇 Oh hop, oh hop, the index cried,
"Am I stale or fresh inside?"
Now TIMEOUT, NETWORK, CONFIG too —
Each failure tagged with category true!
The scheduler checks the stale flag's gleam,
And run_forever stays on beam. 🌟

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.74% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title partially summarizes the main changes (error discrimination and stale signal propagation) but is truncated and does not complete the full thought. Complete the truncated title to clearly convey both aspects of the change, e.g., 'Discriminate WG21 index refresh failures and propagate stale signal to scheduler'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description is comprehensive and well-structured, covering summary, test plan with verification checkmarks, and linked issue closure.
Linked Issues check ✅ Passed All acceptance criteria from issue #72 are fully addressed: error discrimination (TIMEOUT, RATE_LIMIT, NETWORK, CONFIGURATION), stale signal propagation via IndexRefreshResult, ConfigurationError on no data, and comprehensive test coverage.
Out of Scope Changes check ✅ Passed All changes are directly aligned with issue #72 requirements: new error types, stale result handling, scheduler integration, and test updates for error scenarios.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/index-refresh-error-discrimination

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/sources.py`:
- Around line 99-107: The code currently accepts any dict as a successful
response, including empty dicts which should be rejected. In the
isinstance(data, dict) check, add an additional condition to verify the dict is
not empty before returning it as successful. If the data is an empty dict, treat
it the same as a non-dict response by logging a warning with
FailureCategory.CONFIGURATION and returning None, FailureCategory.CONFIGURATION
instead, to prevent empty payloads from overwriting the cache.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e6c6db1-416a-4e62-a98f-ca48d5cb20cc

📥 Commits

Reviewing files that changed from the base of the PR and between be4f507 and 3009c86.

📒 Files selected for processing (8)
  • src/paperscout/errors.py
  • src/paperscout/models.py
  • src/paperscout/monitor.py
  • src/paperscout/sources.py
  • tests/test_callback_protocols.py
  • tests/test_datasource_protocol.py
  • tests/test_monitor.py
  • tests/test_sources.py

Comment thread src/paperscout/sources.py
@henry0816191

Copy link
Copy Markdown
Collaborator Author

@coderabbitai full review again

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

@henry0816191 Sure! I'll kick off a full review of the PR now.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
tests/test_sources.py (1)

264-272: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a malformed-JSON regression for the terminal config path.

The no-data test covers non-dict JSON, but not resp.json() raising ValueError; that branch determines whether no-cache refresh becomes ConfigurationError or retryable IndexRefreshError.

🧪 Proposed test
     async def test_refresh_no_data_raises_configuration_error(self, fake_pool):
         index = WG21Index(fake_pool)
         mock_resp = _make_response(200, json_data=[1, 2, 3])
         mock_client = _make_async_client(get_resp=mock_resp)
         with patch("paperscout.sources.httpx.AsyncClient") as mock_cls:
             mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
             mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
             with pytest.raises(ConfigurationError, match="No index data available"):
                 await index.refresh()
+
+    async def test_refresh_malformed_json_raises_configuration_error(self, fake_pool):
+        index = WG21Index(fake_pool)
+        mock_resp = _make_response(200, json_data=SAMPLE_INDEX_DATA)
+        mock_resp.json = MagicMock(side_effect=ValueError("invalid json"))
+        mock_client = _make_async_client(get_resp=mock_resp)
+        with patch("paperscout.sources.httpx.AsyncClient") as mock_cls:
+            mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
+            mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
+            with pytest.raises(ConfigurationError, match="No index data available"):
+                await index.refresh()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_sources.py` around lines 264 - 272, Add a regression test for
malformed JSON handling in the WG21Index refresh flow. Currently,
test_refresh_no_data_raises_configuration_error covers non-dict JSON responses,
but not the case where resp.json() raises a ValueError due to malformed JSON.
Create a new test (or extend the existing one) that mocks the async response
object to raise a ValueError when .json() is called (using AsyncMock with
side_effect), then verify that index.refresh() still raises ConfigurationError
with an appropriate error message. This ensures the error path that determines
whether no-cache refresh becomes ConfigurationError versus retryable
IndexRefreshError is properly covered.
tests/test_monitor.py (1)

812-827: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Cover stale fallback after the scheduler is already seeded.

This test currently hits the cold-start seed early return, so it does not exercise the normal poll completion path where _advance_staleness_clock_if_ok() runs after diffing and notifications. Add a second stale poll or pre-seed the scheduler to cover that path.

Example extension
         assert "P2300R10" in wg21.papers
         assert "INDEX-STALE" in caplog.text
         assert scheduler._last_successful_poll is None
         prober.fetch.assert_called_once()
+
+        prober.fetch.reset_mock()
+        with patch("paperscout.sources.httpx.AsyncClient") as mock_cls:
+            mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
+            mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
+            await scheduler.poll_once()
+        assert scheduler._last_successful_poll is None
+        prober.fetch.assert_called_once()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_monitor.py` around lines 812 - 827, The
test_poll_once_uses_stale_index test currently exits early during the cold-start
seed phase, so it does not exercise the normal poll completion path where
_advance_staleness_clock_if_ok() runs. To cover this path, add a second await
scheduler.poll_once() call after the first one (before the assertions) to ensure
the scheduler is already seeded and initialized, so the second poll attempt will
go through the full completion path instead of taking the cold-start early
return.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/sources.py`:
- Around line 59-63: The code checks if cached data is not None but doesn't
validate that it contains actual entries. An empty dictionary from
`_cache.read_if_fresh()` will pass the `is not None` check and result in
`_parse_and_index(cached)` returning empty papers while still returning
`stale=False`, which incorrectly advances downstream staleness clocks. Modify
the condition that checks `if cached is not None` to also verify that cached is
not empty (has length greater than zero) before proceeding with
`_parse_and_index(cached)` and returning the IndexRefreshResult with
stale=False. This fix needs to be applied in both locations mentioned: the
cached read_if_fresh path and the also applies section around line 72-79.
- Around line 136-143: The except clause in the WG21 index fetching logic
currently catches both httpx.HTTPError and ValueError together and classifies
both as NETWORK failures, but ValueError from malformed JSON should be treated
as a CONFIGURATION failure instead. Split the exception handling into two
separate except clauses: keep httpx.HTTPError with FailureCategory.NETWORK
classification, and add a new except clause specifically for ValueError that
logs and returns FailureCategory.CONFIGURATION to properly surface parse
failures as terminal configuration errors rather than retryable network errors.

In `@tests/test_monitor.py`:
- Around line 872-874: The test for the fatal-path in scheduler.run_forever()
lacks a shutdown mechanism to prevent hanging if the expected ConfigurationError
path regresses. Replace the AsyncMock() for the patched asyncio.sleep with a
mock that raises an exception instead of returning immediately. This ensures
that if the code path ever reaches the sleep call (indicating the
ConfigurationError was not properly raised), the test will fail immediately with
an exception rather than entering an infinite loop.

---

Nitpick comments:
In `@tests/test_monitor.py`:
- Around line 812-827: The test_poll_once_uses_stale_index test currently exits
early during the cold-start seed phase, so it does not exercise the normal poll
completion path where _advance_staleness_clock_if_ok() runs. To cover this path,
add a second await scheduler.poll_once() call after the first one (before the
assertions) to ensure the scheduler is already seeded and initialized, so the
second poll attempt will go through the full completion path instead of taking
the cold-start early return.

In `@tests/test_sources.py`:
- Around line 264-272: Add a regression test for malformed JSON handling in the
WG21Index refresh flow. Currently,
test_refresh_no_data_raises_configuration_error covers non-dict JSON responses,
but not the case where resp.json() raises a ValueError due to malformed JSON.
Create a new test (or extend the existing one) that mocks the async response
object to raise a ValueError when .json() is called (using AsyncMock with
side_effect), then verify that index.refresh() still raises ConfigurationError
with an appropriate error message. This ensures the error path that determines
whether no-cache refresh becomes ConfigurationError versus retryable
IndexRefreshError is properly covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ed944b6-b15b-4e7a-8401-9bf606446a8b

📥 Commits

Reviewing files that changed from the base of the PR and between be4f507 and fb2e835.

📒 Files selected for processing (8)
  • src/paperscout/errors.py
  • src/paperscout/models.py
  • src/paperscout/monitor.py
  • src/paperscout/sources.py
  • tests/test_callback_protocols.py
  • tests/test_datasource_protocol.py
  • tests/test_monitor.py
  • tests/test_sources.py

Comment thread src/paperscout/sources.py
Comment on lines 59 to +63
cached = self._cache.read_if_fresh()
if cached is not None:
log.info("Loaded %d entries from cache", len(cached))
self.papers = self._parse_and_index(cached)
return self.papers
return IndexRefreshResult(self.papers, stale=False)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don’t treat empty cache rows as usable index data.

Line 60 and Line 73 accept {} from PaperCache as valid because they only check is not None. An empty persisted row can parse to no papers and return stale=False on the fresh-cache path, allowing downstream staleness clocks to advance with no index data.

🐛 Proposed fix
         cached = self._cache.read_if_fresh()
-        if cached is not None:
+        if cached:
             log.info("Loaded %d entries from cache", len(cached))
             self.papers = self._parse_and_index(cached)
             return IndexRefreshResult(self.papers, stale=False)
+        if cached == {}:
+            log.warning("INDEX-CACHE  empty fresh cache ignored")
 
         raw, category = await self._download()
         if raw is not None:
             self._cache.write(raw)
             log.info("Downloaded and cached %d entries", len(raw))
@@
-        if stale is not None:
+        if stale:
             log.warning(
                 "INDEX-STALE-FALLBACK  entries=%d  (download failed; using persisted cache)",
                 len(stale),
             )
             self.papers = self._parse_and_index(stale)
             return IndexRefreshResult(self.papers, stale=True)
+        if stale == {}:
+            log.warning("INDEX-STALE-FALLBACK  skipped empty persisted cache")

Also applies to: 72-79

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/paperscout/sources.py` around lines 59 - 63, The code checks if cached
data is not None but doesn't validate that it contains actual entries. An empty
dictionary from `_cache.read_if_fresh()` will pass the `is not None` check and
result in `_parse_and_index(cached)` returning empty papers while still
returning `stale=False`, which incorrectly advances downstream staleness clocks.
Modify the condition that checks `if cached is not None` to also verify that
cached is not empty (has length greater than zero) before proceeding with
`_parse_and_index(cached)` and returning the IndexRefreshResult with
stale=False. This fix needs to be applied in both locations mentioned: the
cached read_if_fresh path and the also applies section around line 72-79.

Comment thread src/paperscout/sources.py
Comment on lines 136 to +143
except (httpx.HTTPError, ValueError) as exc:
log.error(
"INDEX-FETCH failure_category=%s url=%s %s",
FailureCategory.NETWORK.value,
WG21_INDEX_URL,
exc,
)
return None
return None, FailureCategory.NETWORK

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Classify malformed JSON as configuration/no-data, not retryable network.

ValueError from payload parsing currently shares the HTTP error path, so a no-cache refresh raises retryable IndexRefreshError(NETWORK) instead of the terminal configuration path. Catch it separately as CONFIGURATION. As per PR objectives, parse failures with no cache should surface ConfigurationError.

🐛 Proposed fix
-        except (httpx.HTTPError, ValueError) as exc:
+        except ValueError as exc:
+            log.warning(
+                "INDEX-FETCH  failure_category=%s  url=%s  invalid JSON payload: %s",
+                FailureCategory.CONFIGURATION.value,
+                WG21_INDEX_URL,
+                exc,
+            )
+            return None, FailureCategory.CONFIGURATION
+        except httpx.HTTPError as exc:
             log.error(
                 "INDEX-FETCH  failure_category=%s  url=%s  %s",
                 FailureCategory.NETWORK.value,
                 WG21_INDEX_URL,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/paperscout/sources.py` around lines 136 - 143, The except clause in the
WG21 index fetching logic currently catches both httpx.HTTPError and ValueError
together and classifies both as NETWORK failures, but ValueError from malformed
JSON should be treated as a CONFIGURATION failure instead. Split the exception
handling into two separate except clauses: keep httpx.HTTPError with
FailureCategory.NETWORK classification, and add a new except clause specifically
for ValueError that logs and returns FailureCategory.CONFIGURATION to properly
surface parse failures as terminal configuration errors rather than retryable
network errors.

Comment thread tests/test_monitor.py
Comment on lines +872 to +874
with caplog.at_level(logging.CRITICAL, logger="paperscout.monitor"):
with patch("asyncio.sleep", AsyncMock()):
await scheduler.run_forever()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make the fatal-path test fail fast instead of hanging.

run_forever() has no shutdown event here; if the expected ConfigurationError path regresses, AsyncMock() makes sleep return immediately and the test can loop forever. Make sleep raise if reached.

Proposed test hardening
             with caplog.at_level(logging.CRITICAL, logger="paperscout.monitor"):
-                with patch("asyncio.sleep", AsyncMock()):
+                with patch(
+                    "asyncio.sleep",
+                    AsyncMock(
+                        side_effect=AssertionError(
+                            "run_forever should halt before sleeping"
+                        )
+                    ),
+                ):
                     await scheduler.run_forever()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_monitor.py` around lines 872 - 874, The test for the fatal-path in
scheduler.run_forever() lacks a shutdown mechanism to prevent hanging if the
expected ConfigurationError path regresses. Replace the AsyncMock() for the
patched asyncio.sleep with a mock that raises an exception instead of returning
immediately. This ensures that if the code path ever reaches the sleep call
(indicating the ConfigurationError was not properly raised), the test will fail
immediately with an exception rather than entering an infinite loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WG21Index.refresh() error discrimination (5pt)

1 participant