WG21Index.refresh() error discrimination (5pt) #72

Description

@henry0816191

Problem

WG21Index.refresh() in src/paperscout/sources.py returns an empty self.papers dict on every failure path (network error, timeout, rate limit, parse failure, and "no data available") without raising or otherwise propagating a discriminated error. Callers (Scheduler.seed() and Scheduler.poll_once()) therefore receive the same empty dict regardless of root cause, which prevents differentiated retry logic. The _download() helper logs a FailureCategory, but that structured signal goes only to log.warning/log.error and is dropped from the return path.

Acceptance Criteria

  • refresh() raises (or returns a result object carrying) a discriminated error distinguishing at least TIMEOUT, RATE_LIMIT, NETWORK, and CONFIGURATION categories
  • When stale fallback is used, callers receive both the data and a stale signal
  • The "no index data available" terminal path raises ConfigurationError
  • Existing tests are updated to assert discriminated error handling
  • New tests cover: timeout → retry-eligible, 429 → rate-limit, stale fallback → stale signal, no data → permanent error
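One possible shape for the "result object" option in the criteria above is sketched here. It is an illustration under stated assumptions, not the project's code: RefreshResult, resolve_refresh, and RETRY_ELIGIBLE are hypothetical names, the FailureCategory enum is a stand-in for the existing one in src/paperscout/errors.py (whose members may differ), and the choice to treat TIMEOUT, RATE_LIMIT, and NETWORK as retry-eligible is an assumed policy.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class FailureCategory(Enum):
    # Stand-in for the existing enum in src/paperscout/errors.py;
    # the real member names and values may differ.
    TIMEOUT = "timeout"
    RATE_LIMIT = "rate_limit"
    NETWORK = "network"
    CONFIGURATION = "configuration"


# Assumed policy: categories a caller may retry.
RETRY_ELIGIBLE = frozenset(
    {FailureCategory.TIMEOUT, FailureCategory.RATE_LIMIT, FailureCategory.NETWORK}
)


class ConfigurationError(Exception):
    """Hypothetical permanent error for the 'no index data available' path."""

    category = FailureCategory.CONFIGURATION


@dataclass
class RefreshResult:
    """Hypothetical result: the data, a stale signal, and the failure cause."""

    papers: dict = field(default_factory=dict)
    stale: bool = False
    failure: Optional[FailureCategory] = None

    @property
    def retry_eligible(self) -> bool:
        return self.failure in RETRY_ELIGIBLE


def resolve_refresh(fresh, cached, failure):
    """Combine a download outcome with any cached index.

    fresh:   newly parsed papers, or None if the download failed
    cached:  previously stored papers, or None if nothing is cached
    failure: FailureCategory of the failed download, or None
    """
    if fresh is not None:
        return RefreshResult(papers=fresh)
    if cached is not None:
        # Stale fallback: the caller gets the data and the reason it is stale.
        return RefreshResult(papers=cached, stale=True, failure=failure)
    if failure in RETRY_ELIGIBLE:
        # Nothing to serve yet, but the cause is transient.
        return RefreshResult(failure=failure)
    # Terminal path: no fresh data, no cache, no transient cause.
    raise ConfigurationError("no index data available")
```

With a shape like this, Scheduler.seed() and Scheduler.poll_once() could branch on result.stale and result.retry_eligible instead of inferring the cause from an empty dict. Raising for every failure rather than returning a result would satisfy the first criterion equally well; the result object is shown only because the stale-fallback case needs to carry data and a signal together.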

Implementation Notes

  • Primary file: src/paperscout/sources.py — modify refresh() and _download()
  • Error taxonomy: src/paperscout/errors.py — add IndexRefreshError(category: FailureCategory)
  • Caller updates: src/paperscout/monitor.py — handle discriminated result
  • Tests: tests/test_sources.py, tests/test_scout.py, tests/test_monitor.py
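The notes above could translate into something like the following sketch. Only IndexRefreshError(category: FailureCategory) comes from the issue; classify_failure and download_or_raise are hypothetical helpers, the enum is a stand-in for the existing one, and the sketch assumes download failures surface as standard-library OSError subclasses (the HTTP client actually used by _download() is not stated here). Defaulting unrecognised failures to CONFIGURATION is likewise an assumption.

```python
from enum import Enum
from typing import Callable, Optional


class FailureCategory(Enum):
    # Stand-in for the existing enum in src/paperscout/errors.py;
    # the real member names and values may differ.
    TIMEOUT = "timeout"
    RATE_LIMIT = "rate_limit"
    NETWORK = "network"
    CONFIGURATION = "configuration"


class IndexRefreshError(Exception):
    """Carries the category so it survives past the log call."""

    def __init__(self, category: FailureCategory, message: str = ""):
        super().__init__(message or category.value)
        self.category = category


def classify_failure(
    exc: Optional[BaseException] = None, status: Optional[int] = None
) -> FailureCategory:
    """Hypothetical mapping from a failure to a category."""
    if status == 429:
        return FailureCategory.RATE_LIMIT
    # TimeoutError is a subclass of OSError, so it must be checked first.
    if isinstance(exc, TimeoutError):
        return FailureCategory.TIMEOUT
    if isinstance(exc, OSError):
        # Covers ConnectionError and its subclasses, among others.
        return FailureCategory.NETWORK
    # Assumed default for anything unrecognised.
    return FailureCategory.CONFIGURATION


def download_or_raise(fetch: Callable[[], bytes]) -> bytes:
    """Call fetch() and re-raise OS-level failures with a category attached."""
    try:
        return fetch()
    except OSError as exc:
        raise IndexRefreshError(classify_failure(exc), str(exc)) from exc
```

A caller in src/paperscout/monitor.py could then catch IndexRefreshError and inspect exc.category to choose between retrying, backing off, or giving up.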

References

  • src/paperscout/errors.py (existing FailureCategory enum)
  • src/paperscout/sources.py (download error handlers)
