[Bug] backfill stores missing PR author as empty string, allowing unknown solvers through mirror solved-issue pipeline #85

@jonathanchang31

Summary

Backfill persists missing PR author IDs as '' (empty string) instead of NULL. The miner-issues SQL uses sp.author_github_id IS NOT NULL to exclude unknown solvers, so empty-string values pass the filter and get emitted as solving_pr.author_github_id.

Impact

High / scoring integrity:

  • Issues can be classified as solved with an unknown solver identity.
  • Mirror output violates the intended “no author => skip solver credit” behavior.
  • Downstream gittensor issue-discovery can consume a malformed (empty) solver identity.

Code references

  • packages/das/src/webhook/github-fetcher.service.ts
    • PR backfill upsert: authorGithubId: String(pr.author?.databaseId ?? "")
  • packages/das/src/api/miners/miners.service.ts
    • solving PR subquery:
      • comment: skip null-author solvers
      • condition: sp.author_github_id IS NOT NULL (does not exclude '')

Repro (conceptual)

  1. Backfill repo containing PR where GraphQL author databaseId is missing/unavailable.
  2. PR row stored with author_github_id = ''.
  3. Linked issue with solved_by_pr resolves solving_pr because '' IS NOT NULL evaluates to true, so the guard does not filter the row.
  4. API returns solved issue with empty solver identity.
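
The repro steps can be illustrated with a minimal in-memory sketch. The `PrNode` shape and the `passesIsNotNullGuard` helper are hypothetical stand-ins for the GraphQL response and the SQL condition; only the `String(pr.author?.databaseId ?? "")` expression is taken from the code reference above.

```typescript
// Hypothetical shape of a GraphQL PR node; only the fields used here.
interface PrNode {
  author?: { databaseId?: number | null } | null;
}

// Mirrors the expression quoted from the backfill upsert.
function currentAuthorGithubId(pr: PrNode): string {
  return String(pr.author?.databaseId ?? "");
}

// Models `sp.author_github_id IS NOT NULL` over a nullable text value.
function passesIsNotNullGuard(value: string | null): boolean {
  return value !== null;
}

// A PR whose author is unavailable.
const ghostPr: PrNode = { author: null };

// The missing ID collapses to an empty string rather than null...
const stored = currentAuthorGithubId(ghostPr);

// ...and the empty string passes a guard that only checks for null.
const credited = passesIsNotNullGuard(stored);
```

Under these assumptions, `stored` is the empty string and `credited` is `true`, which matches steps 2 and 3.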

Expected behavior

Unknown solver identity should be represented as NULL and excluded from solver credit path.

Suggested fix direction

  • In backfill, persist missing author IDs as NULL, not empty string.
  • Harden SQL guard to exclude blanks too:
    • e.g. sp.author_github_id IS NOT NULL AND sp.author_github_id <> ''
  • Add normalization for existing bad rows if needed.
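
One possible shape for the write-path fix and the cleanup of existing rows is sketched below. The names `PrNode`, `PrRow`, `toAuthorGithubId`, and `normalizeLegacyRow` are hypothetical and not taken from the repository; the actual upsert and migration may look different.

```typescript
// Hypothetical shape of a GraphQL PR node; only the fields used here.
interface PrNode {
  author?: { databaseId?: number | null } | null;
}

// Hypothetical row shape; only the column relevant to this issue.
interface PrRow {
  authorGithubId: string | null;
}

// Write path: a missing author ID becomes null instead of "".
function toAuthorGithubId(pr: PrNode): string | null {
  const id = pr.author?.databaseId;
  return id === undefined || id === null ? null : String(id);
}

// Cleanup: maps a legacy "" value to null, leaving other rows untouched.
// This is the in-memory analogue of setting author_github_id to NULL
// wherever it currently equals ''.
function normalizeLegacyRow(row: PrRow): PrRow {
  return row.authorGithubId === "" ? { ...row, authorGithubId: null } : row;
}
```

Returning `string | null` from the helper keeps the "unknown author" case explicit in the type, so the upsert cannot silently coerce it to a blank string.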

Acceptance criteria

  • PRs with missing author IDs persist NULL author_github_id.
  • Solving PR subquery excludes null/blank solver identities.
  • Regression test covers missing-author PR in backfill flow.
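
As a rough sketch of what the regression case could assert, the fixture below models the hardened guard in memory. `SolvingPrRow`, `creditableSolvers`, and the fixture values are hypothetical; a real test would exercise the backfill flow and the solving PR subquery directly.

```typescript
// Hypothetical row shape for a candidate solving PR.
interface SolvingPrRow {
  prNumber: number;
  authorGithubId: string | null;
}

// Models the hardened condition:
// author_github_id IS NOT NULL AND author_github_id <> ''
function creditableSolvers(rows: SolvingPrRow[]): SolvingPrRow[] {
  return rows.filter(
    (row) => row.authorGithubId !== null && row.authorGithubId !== "",
  );
}

const fixtures: SolvingPrRow[] = [
  { prNumber: 1, authorGithubId: "12345" }, // known author
  { prNumber: 2, authorGithubId: null },    // missing author, stored as NULL
  { prNumber: 3, authorGithubId: "" },      // legacy row written before the fix
];

// Only the PR with a known author should remain eligible for solver credit.
const creditedPrNumbers = creditableSolvers(fixtures).map((row) => row.prNumber);
```

Keeping the legacy blank row in the fixture checks that the guard still holds even if normalization of existing rows has not run yet.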
