
Fix KeyError in User.__init__ and error handling (#417) #421

Closed
PriyanshAroraa wants to merge 2 commits into
d60:mainfrom
PriyanshAroraa:fix/user-keyerror-417

Conversation

@PriyanshAroraa

@PriyanshAroraa PriyanshAroraa commented Apr 18, 2026


Fixes #417.

Problem

Two separate KeyError crashes when X's API returns responses with missing fields:

1. User.__init__ — crashes on legacy['entities']['description']['urls'], legacy['withheld_in_countries'], and any of the ~30 other fields accessed via direct dict lookup. X's API intermittently omits these for certain accounts.

2. Client.request: response_data['errors'][0]['code'] can raise KeyError because X sometimes returns error objects without a code field. The real failure is then masked by a confusing KeyError raised during error handling itself.

Fix

Switch direct dict lookups to .get() with sensible defaults:

  • Empty string for string fields (location, description, etc.)
  • 0 for counts (followers_count, media_count, etc.)
  • False for booleans (verified, is_translator, etc.)
  • Empty list for list fields (pinned_tweet_ids, withheld_in_countries, description_urls, urls)
  • None for error_code (matches the existing error_message line which already uses .get())

No behavior change when fields are present; graceful fallback when they are not.
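
As a rough illustration of the pattern described above (not the actual twikit code), the sketch below reads a trimmed-down legacy payload with .get() and the defaults listed. The function name parse_legacy_user and the reduced field set are hypothetical and chosen only for this example.

```python
from typing import Any, Dict


def parse_legacy_user(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: read a few legacy fields with safe defaults."""
    # Chained .get() calls with {} fallbacks, so a missing intermediate
    # dict yields [] instead of raising KeyError.
    description_urls = (
        legacy.get('entities', {}).get('description', {}).get('urls', [])
    )
    return {
        'location': legacy.get('location', ''),                  # string
        'followers_count': legacy.get('followers_count', 0),     # count
        'verified': legacy.get('verified', False),               # boolean
        'withheld_in_countries': legacy.get('withheld_in_countries', []),
        'description_urls': description_urls,                    # list
    }


# A sparse payload falls back to defaults; a fuller one is read unchanged.
sparse = parse_legacy_user({'location': 'Tokyo'})
full = parse_legacy_user({
    'followers_count': 12,
    'entities': {'description': {'urls': [{'url': 'https://example.com'}]}},
})
```

Note that .get(key, default) only covers an absent key. A key that is present with a null value would still return None and need a separate guard.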

Test

Hit the bug in production polling 50+ accounts across multiple niches. With the patch, get_user_by_screen_name and get_user_tweets have run cleanly across thousands of calls — no KeyErrors.

Summary by Sourcery

Handle missing fields in X API responses without raising KeyError in user initialization and error handling.

Bug Fixes:

  • Prevent User.__init__ from crashing when optional legacy fields or nested entity URLs are missing by providing safe default values.
  • Avoid KeyError in Client.request error handling when an error object lacks a code field by treating the code as optional.

Summary by CodeRabbit

  • Bug Fixes
    • Gracefully handle API JSON errors when error codes or structures are missing, preventing crashes and ensuring consistent account-state handling.
    • Robustly parse user profiles and supply sensible defaults for missing fields (names, counts, flags, URLs, images), avoiding failures during profile construction.
    • Reduces runtime errors and improves overall stability and resilience.

X's API response omits certain legacy user fields intermittently
(entities.description.urls, withheld_in_countries, etc.), causing
User.__init__ to crash with KeyError.

Error responses from X sometimes omit the 'code' field too, which
masks the actual failure with a confusing KeyError in client.request.

Switch all direct dict lookups in User.__init__ and the error-code
read in Client.request to .get() with sensible defaults (empty string
for strings, 0 for counts, False for booleans, empty list for list
fields). No behavior change when fields are present; graceful
fallback when they are not.
@sourcery-ai

sourcery-ai Bot commented Apr 18, 2026


Reviewer's Guide

Replaces brittle direct dict indexing in User model initialization and error handling with safe .get() accessors and sensible defaults to prevent intermittent KeyErrors when the X API omits optional fields.

Class diagram for updated User and Client.request behavior

classDiagram
    class Client {
        +request(method, url, params, json_body, files, headers) async
    }

    class User {
        +Client client
        +dict data
        +str id
        +str created_at
        +str name
        +str screen_name
        +str profile_image_url
        +str profile_banner_url
        +str url
        +str location
        +str description
        +list description_urls
        +list urls
        +list pinned_tweet_ids
        +bool is_blue_verified
        +bool verified
        +bool possibly_sensitive
        +bool can_dm
        +bool can_media_tag
        +bool want_retweets
        +bool default_profile
        +bool default_profile_image
        +bool has_custom_timelines
        +int followers_count
        +int fast_followers_count
        +int normal_followers_count
        +int following_count
        +int favourites_count
        +int listed_count
        +int media_count
        +int statuses_count
        +bool is_translator
        +str translator_type
        +list withheld_in_countries
        +bool protected
        +__init__(client, data)
    }

    Client "1" --> "many" User : creates

File-Level Changes

Change: Harden User.__init__ against missing fields in X API responses by using .get() with defaults for legacy and data attributes.
Details:
  • Replace direct legacy[...] indexing for core string fields (created_at, name, screen_name, profile_image_url, location, description, translator_type) with legacy.get(..., '') defaults.
  • Change nested entity access for description_urls and urls to chained .get() calls with {} fallbacks and [] as the default list value.
  • Update pinned_tweet_ids, withheld_in_countries, and other list-like attributes to use .get(..., []) to avoid KeyErrors when arrays are absent.
  • Default boolean flags (verification, sensitivity, DM/media capabilities, profile flags, custom timelines, translation) via .get(..., False) when fields are missing.
  • Default numeric counters (followers, fast/normal followers, friends/following, favourites, listed, media, statuses) via .get(..., 0) when fields are missing.
  • Access is_blue_verified on the root data dict via data.get('is_blue_verified', False) instead of direct indexing.
Files: twikit/user.py

Change: Make HTTP error handling resilient to error objects that omit the code field.
Details:
  • Access the first error's code via response_data['errors'][0].get('code') instead of direct dict indexing.
  • Leave existing .get() usage for message unchanged and continue to branch on error_code values when present.
Files: twikit/client/client.py
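
The error-code change can be illustrated in isolation. The snippet below is a minimal sketch with a made-up response payload (the message text is illustrative, not taken from a real X response). It contrasts direct indexing with .get() on an error object that lacks a code field.

```python
# Made-up error payload whose first error object has no 'code' key.
response_data = {'errors': [{'message': 'Something went wrong'}]}

first_error = response_data['errors'][0]

# Direct indexing raises KeyError, which would hide the real API error.
try:
    first_error['code']
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True

# .get() returns None for the missing key, so error handling can continue
# and still report the message.
error_code = first_error.get('code')
error_message = first_error.get('message')
```

This version still indexes errors[0] directly, so it assumes the errors list is non-empty.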

Assessment against linked issues

Issue Objective Addressed Explanation
#417 Ensure User.__init__ handles missing fields in Twitter API legacy user data without raising KeyError by using safe access patterns and sensible defaults.
#417 Ensure Client.request error handling does not raise KeyError when error objects in the Twitter API response lack a 'code' field, by safely accessing this field.


@coderabbitai

coderabbitai Bot commented Apr 18, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 773689c0-5e38-46c7-933c-88e4aa17952a

📥 Commits

Reviewing files that changed from the base of the PR and between dde1e76 and 04e6e2e.

📒 Files selected for processing (1)
  • twikit/client/client.py

📝 Walkthrough


The changes make JSON parsing defensive: Client.request and User.__init__ now use safe .get() accessors with defaults to avoid KeyError when Twitter API responses omit expected fields.

Changes

Cohort: Error handling
File(s): twikit/client/client.py
Summary: Use response_data.get('errors') and safely access the first error object's .get('code')/.get('message') to avoid KeyError on missing code.

Cohort: User data parsing
File(s): twikit/user.py
Summary: Replaced direct dict indexing with .get(..., default) across many legacy and data-derived attributes (strings, lists, ints, booleans) to provide sensible defaults when fields are absent.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble on bytes in the moonlit code night,
Replacing sharp keys with cushions of light;
.get() pads the path where wild APIs roam,
Defaults like carrots guide data back home. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title 'Fix KeyError in User.__init__ and error handling (#417)' accurately summarizes the main changes: fixing KeyError crashes in both User initialization and Client.request error handling.
  • Linked Issues check: ✅ Passed. The PR directly addresses all objectives from issue #417: safely handling missing fields in User.__init__ via .get() with defaults, defensive error handling in Client.request, and preventing KeyError crashes with appropriate fallback values.
  • Out of Scope Changes check: ✅ Passed. All changes are directly related to fixing KeyError crashes in User.__init__ and Client.request error handling as specified in issue #417; no unrelated modifications are present.


@sourcery-ai sourcery-ai Bot left a comment


Hey - I've left some high level feedback:

  • In Client.request, you now tolerate missing code keys but still assume errors[0] exists; consider guarding against an empty errors list (or non-list) before indexing to avoid IndexError in these cases.
  • In User.__init__, silently defaulting all missing numeric/boolean fields to 0/False may hide upstream API changes; you might want to selectively allow None for fields where absence is meaningfully different from zero/false, or document the intentional fallback semantics in the type hints (e.g., Optional[int]).
  • The repeated .get(..., default) pattern in User.__init__ could be refactored into a small helper or mapping of field names to defaults to reduce duplication and make future changes to default behavior easier to manage.
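
The helper-or-mapping idea in the last point could look roughly like the sketch below. The names LEGACY_DEFAULTS and read_legacy_fields are hypothetical and are not part of twikit. Only three fields are shown.

```python
from typing import Any, Dict

# Hypothetical mapping of legacy field names to their fallback values.
# Only immutable defaults are listed here; list-valued fields would need
# a fresh list per call rather than a shared default object.
LEGACY_DEFAULTS: Dict[str, Any] = {
    'name': '',
    'followers_count': 0,
    'verified': False,
}


def read_legacy_fields(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Return every mapped field, using its default when the key is absent."""
    return {
        key: legacy.get(key, default)
        for key, default in LEGACY_DEFAULTS.items()
    }


fields = read_legacy_fields({'name': 'alice', 'followers_count': 3})
```

Keeping the defaults in one table means a later switch to None for selected fields would touch a single place instead of every assignment.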


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@twikit/user.py`:
- Line 94: The constructor currently sets self.created_at: str =
legacy.get('created_at', '') which leaves created_at as an empty string and
causes User.created_at_datetime to pass that into timestamp_to_datetime() and
raise ValueError; change created_at to be nullable (e.g., Optional[str]) and
update created_at assignment to allow None (use legacy.get('created_at') without
default or explicit None), then modify created_at_datetime to guard the fallback
by returning None if self.created_at is falsy/None or only calling
timestamp_to_datetime(self.created_at) when self.created_at is a non-empty
value; reference symbols: created_at attribute on the User class,
created_at_datetime property, and timestamp_to_datetime().

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0ad15fe8-e97a-4d68-812e-0fc7b415b2a4

📥 Commits

Reviewing files that changed from the base of the PR and between c3b7220 and dde1e76.

📒 Files selected for processing (2)
  • twikit/client/client.py
  • twikit/user.py

Comment thread twikit/user.py
self.name: str = legacy['name']
self.screen_name: str = legacy['screen_name']
self.profile_image_url: str = legacy['profile_image_url_https']
self.created_at: str = legacy.get('created_at', '')


⚠️ Potential issue | 🟠 Major

Handle missing created_at in created_at_datetime too.

Defaulting created_at to '' avoids the constructor KeyError, but User.created_at_datetime still passes that empty string to timestamp_to_datetime(), which will raise ValueError. Please make the property nullable or otherwise guard the fallback.

🐛 Proposed fix
-        self.created_at: str = legacy.get('created_at', '')
+        self.created_at: str = legacy.get('created_at', '')
@@
-    def created_at_datetime(self) -> datetime:
-        return timestamp_to_datetime(self.created_at)
+    def created_at_datetime(self) -> datetime | None:
+        if not self.created_at:
+            return None
+        return timestamp_to_datetime(self.created_at)
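
For readers who want to try the guarded property outside the diff, here is a self-contained sketch. UserSketch is a hypothetical, reduced stand-in for the User class, and the timestamp_to_datetime body (including its format string) is an assumption made for this example, not a copy of twikit's helper.

```python
from datetime import datetime
from typing import Optional


def timestamp_to_datetime(timestamp: str) -> datetime:
    # Stand-in for the library helper; the format string is an assumption.
    return datetime.strptime(timestamp, '%a %b %d %H:%M:%S %z %Y')


class UserSketch:
    """Hypothetical, reduced stand-in for the User class."""

    def __init__(self, legacy: dict) -> None:
        # Missing created_at falls back to an empty string.
        self.created_at: str = legacy.get('created_at', '')

    @property
    def created_at_datetime(self) -> Optional[datetime]:
        # Guard the fallback: an empty string cannot be parsed by strptime.
        if not self.created_at:
            return None
        return timestamp_to_datetime(self.created_at)


missing = UserSketch({}).created_at_datetime
parsed = UserSketch(
    {'created_at': 'Sat Apr 18 10:30:00 +0000 2026'}
).created_at_datetime
```

Without the guard, the empty-string fallback would reach strptime and raise ValueError, which is the failure the comment above describes.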

Per Sourcery review on d60#421: the previous patch handled missing 'code'
on the first error, but still assumed errors[0] exists and is a dict.
X can return errors as an empty list or a non-list; guard against both
before indexing.
@PriyanshAroraa

Author

@sourcery-ai thanks for the review. Addressed the real one, leaving the others as call-outs for the maintainer:

1. Guarding errors[0] — fixed in 04e6e2e.
Good catch. X can return errors: [] or occasionally non-list shapes during partial outages. Changed to:

errors = response_data.get('errors') if isinstance(response_data, dict) else None
if errors and isinstance(errors, list):
    first_error = errors[0] if isinstance(errors[0], dict) else {}
    error_code = first_error.get('code')
    error_message = first_error.get('message')

2. None defaults vs 0/False — left as-is.
Fair philosophical point, but it is a behaviour change. Changing followers_count to Optional[int] breaks any downstream code doing arithmetic on it without a None check. The purpose of this PR is to stop crashes without changing the API surface. If the maintainer prefers Optional semantics that is a larger design decision worth its own PR and typing pass.

3. Helper for the repeated .get() — left as-is.
Agree the pattern is verbose, but abstracting it (e.g. a _field(legacy, key, default) helper or a declarative field map) would make this PR larger and harder to review. Happy to do a followup if the maintainer wants it. Keeping this PR focused on the bug fix.
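
The guard from item 1 can be exercised on its own by wrapping it in a function. The sketch below does that. The function name extract_first_error and the sample payloads are hypothetical, and the guard lines themselves follow the snippet quoted in item 1.

```python
from typing import Any, Optional, Tuple


def extract_first_error(response_data: Any) -> Tuple[Optional[Any], Optional[Any]]:
    """Hypothetical wrapper: return (code, message) of the first error, if any."""
    errors = response_data.get('errors') if isinstance(response_data, dict) else None
    # Only index when errors is a non-empty list.
    if errors and isinstance(errors, list):
        # Tolerate a first element that is not a dict.
        first_error = errors[0] if isinstance(errors[0], dict) else {}
        return first_error.get('code'), first_error.get('message')
    return None, None


# Made-up payload shapes covering the cases discussed in the thread.
empty_list = extract_first_error({'errors': []})
non_list = extract_first_error({'errors': 'unavailable'})
no_code = extract_first_error({'errors': [{'message': 'failed'}]})
complete = extract_first_error({'errors': [{'code': 1, 'message': 'failed'}]})
```

Every malformed shape yields (None, None) instead of raising KeyError or IndexError, so the caller can fall through to its generic error path.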

@PawiX25

PawiX25 commented Apr 18, 2026


Duplicate of #418 which was opened two days before this. Please close this PR.

@PriyanshAroraa

Author

@PawiX25 thanks for the heads up, missed #418 in my search. Closing as a duplicate - yours is the correct one.

One thing my PR picked up from the Sourcery review that might be worth adding to #418: guarding against empty or non-list errors arrays before indexing errors[0]. The current one-line .get('code') fix still crashes if X returns {"errors": []}. Posting a suggestion on #418 in case you want to fold it in.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

KeyError in User.__init__ and Client.request when Twitter API omits expected fields

2 participants