perf: enhance monitor stability with non-blocking LOCK acquisition (#1754) #1773
Open
zhu6201976 wants to merge 1 commit into unclecode:develop from
Summary:
Although the recent changes in develop introduce a global wait_for timeout for update_timeline, the monitor still queues for the global LOCK. Under extreme load, this adds unnecessary contention on that lock.
Improvements:
Non-blocking Stats: Changed LOCK acquisition within monitor.py to a fail-fast mechanism (0.5s timeout). If the crawler pool is busy, the monitor skips the current data point instead of adding to the lock queue.
System Resilience: Ensures that telemetry collection never interferes with the core mission-critical path (browser lifecycle management).
Error Silencing: Downgraded lock-timeout warnings to debug logs to avoid cluttering production logs during expected high-load spikes.
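The fail-fast behavior described above could look roughly like the following. This is a minimal sketch, not the actual monitor.py change: `sample_pool_stats`, `read_stats`, `LOCK_TIMEOUT_S`, and the logger name are hypothetical, and the only values taken from this PR are the 0.5s timeout and the debug-level log.

```python
import asyncio
import logging

logger = logging.getLogger("monitor_sketch")  # hypothetical logger name

LOCK_TIMEOUT_S = 0.5  # fail-fast timeout quoted in this PR


async def sample_pool_stats(lock, read_stats):
    """Return a stats snapshot, or None if the lock stays busy past the timeout.

    `lock` is an asyncio.Lock shared with the crawler pool (assumed);
    `read_stats` is a hypothetical synchronous callable that reads the stats.
    """
    try:
        # Wait at most LOCK_TIMEOUT_S for the lock instead of queueing indefinitely.
        await asyncio.wait_for(lock.acquire(), timeout=LOCK_TIMEOUT_S)
    except asyncio.TimeoutError:
        # Expected under load, so log at debug level rather than warning.
        logger.debug("pool lock busy; skipping this monitor sample")
        return None
    try:
        return read_stats()
    finally:
        lock.release()


async def demo():
    lock = asyncio.Lock()
    # Lock is free: the sample is taken.
    free = await sample_pool_stats(lock, lambda: {"active": 3})
    # Lock is held elsewhere: the sample is skipped after the timeout.
    async with lock:
        busy = await sample_pool_stats(lock, lambda: {"active": 3})
    return free, busy
```

The caller treats `None` as "no data point for this tick" and carries on, so a busy pool costs the monitor one sample rather than a place in the lock queue.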
Why this matters:
In high-concurrency enterprise environments, skipping a single monitoring heartbeat is preferable to adding latency to the asyncio event loop.