Skip to content

ref(db): Replace create_or_update with update_or_create in Buffer.process#117513

Draft
vgrozdanic wants to merge 1 commit into
masterfrom
ref/buffer-replace-create-or-update
Draft

ref(db): Replace create_or_update with update_or_create in Buffer.process#117513
vgrozdanic wants to merge 1 commit into
masterfrom
ref/buffer-replace-create-or-update

Conversation

@vgrozdanic

Copy link
Copy Markdown
Member

Buffer.process had one remaining caller of Sentry's legacy create_or_update in the elif model: branch — the path taken for non-Group models like ReleaseProject, ReleaseProjectEnvironment, GroupTombstone, and GroupRelease.

A simple update_or_create doesn't work here because update_kwargs is built from F() increment expressions (F("new_groups") + 1). Django evaluates defaults on both the UPDATE and INSERT paths, and F-expressions fail on INSERT since there is no existing row to resolve against. This is what caused the original migration attempt (reverted in #97840) to fail.

Instead this PR uses the explicit pattern: filter().update() first; if no rows were affected, create() with raw column values inside a savepoint, retrying as a plain update on IntegrityError in case a concurrent worker inserts between the two steps. All models that reach this branch have unique constraints on their filter fields, so the create-with-retry pattern is safe.

Regression tests added for the update path (F() increment applied, buffer_incr_complete signal fires with created=False), the create path (initial value set, signal fires with created=True), and the race-condition path (IntegrityError on create is swallowed, no exception propagates). Also removes the stale create_or_update mock from test_signal_only and removes src/sentry/buffer/base.py from the allowlist.

Part of the broader create_or_update deprecation tracked in the repo.

@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Jun 12, 2026
@vgrozdanic vgrozdanic force-pushed the ref/buffer-replace-create-or-update branch from d0e326c to 7d8cb69 Compare June 12, 2026 11:01
…cess

The `elif model:` branch in `Buffer.process` was the last remaining
caller of Sentry's custom `create_or_update` utility in non-manager
production code.

A simple `update_or_create` cannot be used here because `update_kwargs`
contains `F()` increment expressions that Django would try to evaluate
on the INSERT path and fail. Instead, use the explicit pattern:
`filter().update()` first; if no rows were affected, `create()` with
raw column values inside a savepoint, retrying as an update on
`IntegrityError` in case a concurrent writer beat us to the insert.

All models that reach this branch — `GroupTombstone`, `ReleaseProject`,
`ReleaseProjectEnvironment`, and `GroupRelease` — have unique constraints
on their filter fields, so the create-with-retry pattern is safe.

Adds regression tests covering the update path (F() increment applied,
signal fires with `created=False`), the create path (initial value set,
signal fires with `created=True`), and the race-condition path
(IntegrityError swallowed, no exception propagated).

Co-Authored-By: Claude <noreply@anthropic.com>
@vgrozdanic vgrozdanic force-pushed the ref/buffer-replace-create-or-update branch from 7d8cb69 to 9670c9e Compare June 12, 2026 11:19
Comment thread src/sentry/buffer/base.py
# only rolls back this attempt, not any outer transaction.
with transaction.atomic(using=router.db_for_write(model)):
model.objects.create(**create_kwargs)
created = True

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra fields silently dropped on INSERT path in Buffer.process

In the elif model: rollout branch, when filter().update() matches no rows and a new record is created, create_kwargs is built from filters and columns only — extra is omitted. The legacy create_or_update it replaces did include extra on the INSERT path (it resolved both F-expressions and plain values from values/defaults into create_kwargs). As a result, caller-supplied fields like last_seen are silently dropped on creation and fall back to model defaults. The create branch only fires when the filtered row is absent (e.g. a GroupTombstone/GroupRelease row deleted between the buffer enqueue and flush), so this is a narrow correctness regression rather than a crash.

Evidence
  • create_or_update in db/models/query.py builds create_kwargs from kwargs plus iterating values.items() and defaults.items(), so non-F extra values (e.g. last_seen) were written on INSERT.
  • The new branch in buffer/base.py sets create_kwargs = {**filters, **{col: v for col, v in columns.items()}}, omitting extra, so last_seen is not passed to model.objects.create().
  • GroupTombstone caller in event_manager.py:320-331 passes extra={'last_seen': max(event.datetime, group_tombstone.last_seen)}; if the tombstone row is gone at flush time the create path fires and last_seen falls back to auto_now_add.
  • ReleaseProject/ReleaseProjectEnvironment callers pass no extra, and GroupRelease filters by an existing id, so the create path with extra is only reached on a deletion race — limiting real-world impact.

Identified by Warden sentry-backend-bugs · C5Q-Q33

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant