Release staging to main #457
Lets callers opt out of automatic redirect following so they can inspect 3xx responses directly instead of receiving a thrown "Too many redirects" error.
The probe passed maxRedirects: 0 to safeFetch expecting it to return 3xx responses for the outer loop to follow manually. safeFetch instead threw "Too many redirects (>0)" on any single redirect, causing every http→https or apex→www site to be reported DOWN. Switch to the new followRedirects: false flag so the outer loop sees the 3xx and follows it through MAX_REDIRECTS hops as intended.
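The interaction between the two options can be sketched as follows. This is a minimal illustration, not the project's actual `safeFetch`: the names `safeFetchSketch`, `probeSketch`, `FetchOnce`, and `MinimalResponse` are hypothetical, the single-request function is injected so the example runs without network access, and the `MAX_REDIRECTS` value of 10 is an assumption.

```typescript
// Hypothetical sketch of the behaviour described above; not the real safeFetch.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Performs exactly one request and never follows redirects itself (assumption).
type FetchOnce = (url: string) => Promise<MinimalResponse>;

interface SafeFetchOptions {
  maxRedirects?: number;
  followRedirects?: boolean; // false: hand the 3xx response back to the caller
}

const isRedirect = (status: number): boolean => status >= 300 && status < 400;

async function safeFetchSketch(
  fetchOnce: FetchOnce,
  url: string,
  options: SafeFetchOptions = {},
): Promise<MinimalResponse> {
  const { maxRedirects = 5, followRedirects = true } = options;
  let current = url;
  for (let hops = 0; ; hops++) {
    const res = await fetchOnce(current);
    const location = res.headers.get("location");
    if (!isRedirect(res.status) || !location || !followRedirects) return res;
    // With maxRedirects: 0 this throws on the very first redirect.
    if (hops >= maxRedirects) {
      throw new Error(`Too many redirects (>${maxRedirects})`);
    }
    current = new URL(location, current).toString();
  }
}

const MAX_REDIRECTS = 10; // assumed value, for illustration only

// Outer loop that inspects each 3xx itself and follows it manually.
async function probeSketch(
  fetchOnce: FetchOnce,
  url: string,
): Promise<{ up: boolean; finalUrl: string; status: number }> {
  let current = url;
  for (let hop = 0; hop <= MAX_REDIRECTS; hop++) {
    const res = await safeFetchSketch(fetchOnce, current, { followRedirects: false });
    const location = res.headers.get("location");
    if (isRedirect(res.status) && location) {
      current = new URL(location, current).toString();
      continue;
    }
    return { up: res.status >= 200 && res.status < 400, finalUrl: current, status: res.status };
  }
  return { up: false, finalUrl: current, status: 0 }; // redirect budget exhausted
}
```

In this sketch, `maxRedirects: 0` with the default `followRedirects: true` reproduces the thrown error on a single redirect, while `followRedirects: false` returns the 3xx so the outer loop can decide what to do with it.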
Whitespace and ordering cleanups across builder configs (devices.ts, performance.ts, vitals.ts), the percentageOf type field ordering, and extraction of two regex constants in SimpleQueryBuilder.getColumnAlias. No behavior changes.
events_by_date and summary_metrics aggregated session_agg via a JOIN that materialised one row per session per bucket. Adding a per-bucket pre-aggregation (session_by_bucket / session_summary) lets the final JOIN run small-to-small and drops the per-session row materialisation entirely. medianIf is replaced with quantileTDigestIf on the same path.
outbound_links and outbound_domains LEFT JOIN-ed analytics.events on session_id with a ±60s window. With multiple events per click in the window, each click row was multiplied, so COUNT(*) reported clicks times matched events. The joined event-side columns were pulled but never returned. Dropping the JOIN fixes the count and removes the biggest contributor to those builders' memory cost.
error_summary scanned analytics.error_spans twice through two CTEs that produced identical uniq(session_id). Folded into one CTE (error_stats.affectedSessions) and reused in the error-rate calculation.
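The row multiplication described for outbound_links and outbound_domains can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the `Click` and `EventRow` shapes and both function names are hypothetical, timestamps are plain seconds, and the loop only mimics what a LEFT JOIN with a ±60s window does to `COUNT(*)`; it is not the builders' actual SQL.

```typescript
// Hypothetical row shapes, for illustration only.
interface Click {
  sessionId: string;
  href: string;
  timestamp: number; // seconds
}

interface EventRow {
  sessionId: string;
  timestamp: number; // seconds
}

// Mimics COUNT(*) over clicks LEFT JOIN events on session_id within ±windowSec.
// Every matching event produces one joined row, so a click with N matches counts N times.
function countWithWindowJoin(clicks: Click[], events: EventRow[], windowSec = 60): number {
  let rows = 0;
  for (const click of clicks) {
    const matches = events.filter(
      (e) =>
        e.sessionId === click.sessionId &&
        Math.abs(e.timestamp - click.timestamp) <= windowSec,
    ).length;
    // A LEFT JOIN still emits one row when nothing matches.
    rows += Math.max(matches, 1);
  }
  return rows;
}

// Mimics COUNT(*) over the clicks table alone: one row per click.
function countWithoutJoin(clicks: Click[]): number {
  return clicks.length;
}

const clicks: Click[] = [
  { sessionId: "s1", href: "https://example.org/a", timestamp: 1000 },
  { sessionId: "s2", href: "https://example.org/b", timestamp: 2000 },
];

const events: EventRow[] = [
  { sessionId: "s1", timestamp: 990 },
  { sessionId: "s1", timestamp: 1000 },
  { sessionId: "s1", timestamp: 1050 },
  { sessionId: "s1", timestamp: 5000 }, // outside the window, ignored
];

const inflated = countWithWindowJoin(clicks, events); // 3 rows for s1 + 1 row for s2
const actual = countWithoutJoin(clicks);
```

Here two real clicks are counted as four joined rows, which is the inflation the commit removes by dropping the JOIN.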
ClickHouse OOM telemetry over the last 7 days shows 386 failures, all hitting the same ~55 GiB total-memory ceiling, with two websites accounting for 60% of them. The pattern is concurrent load: single queries are well under 300 MiB at p99, but a dashboard load fans 10+ widgets out at once and several heavy users do that in parallel.
executeBatch already groups requests by schema and runs each group as a single UNION, but Promise.all sends every group to ClickHouse at once. Replacing that with a small worker-pool helper (mapWithConcurrency) caps in-flight groups per batch at BATCH_GROUP_CONCURRENCY (default 3, env-tunable).
executeDynamicQuery in apps/api now wraps executeBatch with a tiny in-process keyed semaphore (runPerWebsite) so a single project cannot have more than PER_WEBSITE_QUERY_CONCURRENCY (default 8, env-tunable) batches in flight at once. Excess requests queue instead of bursting the cluster.
Together these protect the cluster from concurrent-load spikes while keeping cached refreshes (use_query_cache=1 is already on) at ~1 ms.
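The two limits described above could look roughly like the following. This is a sketch, not the code in the PR: only the name `mapWithConcurrency` comes from the description, its signature here is assumed, and `KeyedSemaphore` is a hypothetical stand-in for whatever backs `runPerWebsite`.

```typescript
// Worker-pool map: at most `limit` calls to `fn` are in flight at any time.
// Results keep the order of `items`. The signature is an assumption.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    // Each worker pulls the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index] as T, index);
    }
  };
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Hypothetical in-process semaphore keyed by website ID.
class KeyedSemaphore {
  private readonly limit: number;
  private readonly active = new Map<string, number>();
  private readonly waiters = new Map<string, Array<() => void>>();

  constructor(limit: number) {
    this.limit = limit;
  }

  // Runs `task` once a slot for `key` is free; excess callers queue in FIFO order.
  async run<R>(key: string, task: () => Promise<R>): Promise<R> {
    await this.acquire(key);
    try {
      return await task();
    } finally {
      this.release(key);
    }
  }

  private acquire(key: string): Promise<void> {
    const inFlight = this.active.get(key) ?? 0;
    if (inFlight < this.limit) {
      this.active.set(key, inFlight + 1);
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      const queue = this.waiters.get(key) ?? [];
      queue.push(resolve);
      this.waiters.set(key, queue);
    });
  }

  private release(key: string): void {
    const nextWaiter = this.waiters.get(key)?.shift();
    if (nextWaiter) {
      // Hand the slot straight to the next waiter; the active count is unchanged.
      nextWaiter();
      return;
    }
    const remaining = (this.active.get(key) ?? 1) - 1;
    if (remaining > 0) {
      this.active.set(key, remaining);
    } else {
      // Drop idle keys so the maps do not grow without bound.
      this.active.delete(key);
      this.waiters.delete(key);
    }
  }
}
```

A caller might wrap each batch as `semaphore.run(websiteId, () => executeBatch(...))` and use `mapWithConcurrency(groups, 3, runGroup)` inside the batch; both call shapes are assumptions. Because the semaphore lives in process memory, a limit of this kind applies per API instance rather than across the whole deployment.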
0 issues found across 1 file (changes from recent commits).
Shadow auto-approve: would not auto-approve. Auto-approval blocked by 13 unresolved issues from previous reviews.
0 issues found across 4 files (changes from recent commits).
0 issues found across 2 files (changes from recent commits).
0 issues found across 2 files (changes from recent commits).
You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment |
Summary
http→https / apex→www site was tripping Too many redirects (>0). Adds followRedirects: false option to safeFetch and switches the probe to use it so the outer loop handles redirects as intended.
Test plan
Too many redirects (>0) (e.g. databuddy.cc → www.databuddy.cc) now report UP after deploy
Summary by cubic
Ships a dedicated @databuddy/insights worker for queued insight generation, refreshes the public status page, and fixes uptime redirect false alarms. Also adds query concurrency controls, optimizes analytics builders, centralizes public query access in @databuddy/ai, swaps Unkey env vars to Railway, hardens the worker by initializing run IDs before queuing jobs, stabilizes dashboard analytics e2e, and simplifies filter plumbing.
New Features
@databuddy/insights app with BullMQ queue, scheduler, stale-run recovery, rollups, evlog; CI health checks and Docker image.
insightGeneration router and expanded insights; legacy API insights routes removed. Public query access now in @databuddy/ai (canReadQueryTypesPublicly) and used by apps/api.
streamdown.
followRedirects: false; Railway deployment metadata.
DATABUDDY_E2E_MODE + key; restore local e2e access guards; simplify dashboard filter plumbing and prevent filter URL sync loops.
Migration
insights service (see insights.Dockerfile, docker-compose.selfhost.yml; port INSIGHTS_PORT, default 4002).
INSIGHTS_* (dispatch/maintenance/stale/workers), optional INSIGHTS_BULLMQ_REDIS_URL (falls back to BULLMQ_REDIS_URL), and SUPERMEMORY_API_KEY.
APP_ENV/RAILWAY_ENVIRONMENT_NAME and RAILWAY_* vars for uptime and insights logging/metadata.
Written for commit c8fa804. Summary will update on new commits.