fix: memory leak when profiling is enabled #5133
Semver Impact of This PR: ⚪ None (no version bump detected)

Changelog Preview (how these changes will appear in the changelog):
Fixes 🐛
Dependencies ⬆️ Deps

🤖 This preview updates automatically when you update the PR.
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@ Coverage Diff @@
## main    #5133      +/-   ##
==========================================
+ Coverage   73.99%   74.03%   +0.03%
==========================================
  Files         499      500       +1
  Lines       18067    18090      +23
  Branches     3520     3520
==========================================
+ Hits        13369    13393      +24
+ Misses       3838     3837       -1
  Partials      860      860
```
This was passing accidentally before, because ConcurrentBag typically follows a LIFO model for adding and removing items. The SQL listener test had a bug that this LIFO behaviour would mask: the test did `OrderByDescending(x => x.StartTimestamp).First()` to get the "newest" span, but that fails whenever two spans share the same timestamp, which is common on Windows due to the ~15ms clock resolution. Making ConcurrentBagLite LIFO would make the test pass again, but only accidentally: the newer span would happen to enumerate first and survive the stable sort. The brittleness would still be there; if the spans ever got equal timestamps in a different order (e.g. a background thread adds one), the test would break again. So the test is being changed/fixed instead.
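The timestamp-tie brittleness described above can be seen in a small standalone sketch (the tuple-based "spans" are a hypothetical minimal model, not the SDK's span type):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Two spans with identical StartTimestamp, e.g. created within the
        // same ~15ms clock tick on Windows.
        var spans = new[]
        {
            (Name: "older", StartTimestamp: new DateTimeOffset(2024, 1, 1, 0, 0, 0, TimeSpan.Zero)),
            (Name: "newer", StartTimestamp: new DateTimeOffset(2024, 1, 1, 0, 0, 0, TimeSpan.Zero)),
        };

        // LINQ's OrderByDescending is a stable sort: tied keys keep their
        // source enumeration order. So "newest" here really means "whichever
        // enumerated first", which depends on the collection's ordering.
        var newest = spans.OrderByDescending(x => x.StartTimestamp).First();
        Console.WriteLine(newest.Name); // "older" - not the span added last
    }
}
```

With a LIFO collection the later span enumerates first and the test happens to pass; with any other enumeration order it does not, which is why the test is being fixed rather than the collection's ordering.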
```csharp
public void Add(T item)
{
    lock (_items)
```
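For context, a complete lock-backed bag along these lines might look something like the following. This is a sketch only; the actual ConcurrentBagLite in the PR may differ in naming and details:

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Sketch of a lock-backed ConcurrentBag replacement: a plain List<T> guarded
// by a lock, with snapshot enumeration. Unlike ConcurrentBag<T>, it keeps no
// per-thread state, so short-lived threads leave nothing behind.
internal sealed class ConcurrentBagLite<T> : IEnumerable<T>
{
    private readonly List<T> _items = new();

    public void Add(T item)
    {
        lock (_items)
        {
            _items.Add(item);
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Enumerate over a snapshot so callers never iterate under the lock.
        T[] snapshot;
        lock (_items)
        {
            snapshot = _items.ToArray();
        }
        return ((IEnumerable<T>)snapshot).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```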
Worth checking whether we have a hotspot where `Add(IEnumerable<T> items)` would be useful.
Good call... we have these:
- sentry-dotnet/src/Sentry/Scope.cs, lines 530 to 534 (a16638f)
- sentry-dotnet/src/Sentry/Scope.cs, lines 555 to 568 (a16638f)
- sentry-dotnet/src/Sentry/Scope.cs, lines 687 to 690 (a16638f)
- sentry-dotnet/src/Sentry/Scope.cs, lines 713 to 716 (a16638f)
- sentry-dotnet/src/Sentry/Scope.cs, lines 739 to 742 (a16638f)
I don't think I'd describe any of those as a hot path though... with the exception of the first one (attachments), it's all one-off stuff that happens at init for a very constrained number of items - and even attachments is likely to be only a handful of items.
I think I'll leave it for now then, in the interests of expediting the fix to the memory leak.
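If a bulk path ever did become hot, a batch add taking the lock once for the whole sequence would be the natural shape. A hypothetical sketch (not part of this PR; the class and method names here are illustrative only):

```csharp
using System.Collections.Generic;

// Hypothetical standalone type for illustration; not the PR's ConcurrentBagLite.
internal sealed class BulkBag<T>
{
    private readonly List<T> _items = new();

    public void Add(T item)
    {
        lock (_items)
        {
            _items.Add(item);
        }
    }

    // One lock acquisition for the whole batch instead of one per item.
    public void AddRange(IEnumerable<T> items)
    {
        lock (_items)
        {
            _items.AddRange(items);
        }
    }
}
```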
fixes: #5113
fixes: #4721
Fixes memory leaks caused by `ConcurrentBag` and `ConcurrentQueue`.

Why ConcurrentBag leaks
ConcurrentBag is backed by a ThreadLocal — internally it keeps a per-thread linked list. Every thread that ever touches the bag allocates a ThreadLocalList node that stays attached to the bag for the bag's lifetime, even after that thread has died. In a high-throughput app (profiler + ASP.NET Core thread pool), each bag instance accumulates stranded per-thread slots, and each SdkVersion instance carries that weight.
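A minimal sketch of the retention pattern, using nothing beyond the BCL (not code from this PR):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class Program
{
    static void Main()
    {
        var bag = new ConcurrentBag<byte[]>();

        // Each short-lived thread touches the bag exactly once. ConcurrentBag
        // allocates a per-thread list (via thread-local storage) on first use,
        // and that list stays rooted by the bag even after the thread exits.
        for (var i = 0; i < 1000; i++)
        {
            var t = new Thread(() => bag.Add(new byte[1024]));
            t.Start();
            t.Join();
        }

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Everything added by the (now dead) threads is still reachable
        // through the bag's stranded per-thread lists.
        Console.WriteLine(bag.Count); // 1000
    }
}
```

In the SDK's case the items themselves are wanted, but the per-thread list nodes for dead pool threads are pure overhead, multiplied by every bag instance the profiler workload creates.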
Why the profiler amplifies it
Every SentryEvent, SentryTransaction, Scope, and TransactionTracer allocates its own SdkVersion → two fresh ConcurrentBag instances. AddPackage gets called on most of these from multiple callsites (SentryMiddleware.cs:252, Enricher.cs:72, Scope.cs:527). The profiler drives the event/transaction throughput way up, so both the count of bags and the per-bag thread-local overhead explode.
There's also one hotter target: NoOpTransaction.cs:14 returns SdkVersion.Instance (the singleton) as its Sdk. Any code path that ends up calling AddPackage on a NoOpTransaction's Sdk writes into that singleton bag forever.
The fix pattern
Same as #3432: replace `ConcurrentBag` with a simpler lock-backed list (as suggested in #4721).

Testing
Memory does keep growing in the process but this seems to be just because I have loads of memory so the GC doesn't bother kicking in. If I force a GC then memory is released again:

As far as I can tell then, there are no remaining memory leaks in the minimal repro that @kanadaj provided.