CloudWatch · eu-central-1 · /ecs/lockin-backend-task

Ten hours of controlled failure

Production backend stability report covering heap OOM crash-loop, Postgres connection-pool starvation, and the APIs and background jobs implicated in the cascade.

Window 2026-06-30 21:03 → 2026-07-01 07:03 UTC · Account 225523106684 · Service lockin-backend-service · Generated 2026-07-01
OOM heartbeat — 44 process deaths across 10 hours (avg 14 min apart)
2026-06-30 21:28:13.669 UTC → 2026-07-01 07:11:39.620 UTC

Executive stats

Three failure modes stack: heap grows until V8 kills the process, ECS replaces the task, and surviving work fights over a 10-slot request-path Postgres pool.

  • Heap OOM crashes: 44
  • DB pool timeout events: 122
  • Avg time between OOMs: 14m
  • HTTP 500 with pool proof: 1
  • Reached heap limit: 25
  • GC mark-compact failures: 19

Pool errors by component

Component | Events | Share (of 122)
Scheduler | 52 | 42.6%
ExchangeWatchdogService | 22 | 18.0%
ArchetypeStaleSweep | 19 | 15.6%
TradeMonitorCronService | 7 | 5.7%
RankRecomputeJob | 6 | 4.9%
ArchetypeService | 4 | 3.3%
ExternalPrefetchService | 4 | 3.3%
PartnerVolumeSyncJob | 2 | 1.6%
ClosedTradesLiveSyncService | 1 | 0.8%
OrphanedLivePositionsSweep | 1 | 0.8%
NowConfigService | 1 | 0.8%
PartnerActivationSyncJob | 1 | 0.8%
InsightOutcomeWorker | 1 | 0.8%
ExceptionsHandler (API) | 1 | 0.8%
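
A per-component tally like the one above can be reproduced from exported log lines. The sketch below is one possible way to do it, not the script used for this report: classify and tallyByComponent are hypothetical helpers, and the regular expression assumes each line starts with the log level followed by the logger tag in square brackets, as in the log extract later in this report. A tag-based tally uses the logger name, which for a few rows differs from the component label in the table (for example, lines tagged [ExternalPrefetchWorker] are listed under ExternalPrefetchService).

```typescript
type PoolErrorKind = "Acquire timeout" | "Connection terminated" | "Other";

interface PoolEvent {
  component: string;
  kind: PoolErrorKind;
}

// Extract the logger tag and map the pg-pool error text to an error type.
function classify(message: string): PoolEvent | undefined {
  const tag = /^(?:error|warn) \[([^\]]+)\]/.exec(message);
  if (!tag) return undefined; // not a line in the assumed format

  let kind: PoolErrorKind = "Other";
  if (message.includes("timeout exceeded when trying to connect")) {
    kind = "Acquire timeout";
  } else if (message.includes("Connection terminated due to connection timeout")) {
    kind = "Connection terminated";
  }
  return { component: tag[1] ?? "unknown", kind };
}

// Count events per logger tag; unparseable lines are skipped.
function tallyByComponent(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const event = classify(line);
    if (!event) continue;
    counts.set(event.component, (counts.get(event.component) ?? 0) + 1);
  }
  return counts;
}

// Three message strings taken from the log extract in this report.
const sampleLines: string[] = [
  'warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: timeout exceeded when trying to connect {"service":"mentor-api"}',
  'warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}',
  'error [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}',
];

for (const [component, count] of tallyByComponent(sampleLines)) {
  console.log(component, count);
}
```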

Background jobs & watchdogs (schedule)

High-frequency crons

  • Scheduler — Nest wrapper for all @Cron jobs (52 pool hits)
  • ExchangeWatchdogService — setInterval every 60s (22 hits)
  • ArchetypeStaleSweep — every 2–3 min recovery + daily sweep (19 hits)
  • RankRecomputeJob — every 5 min (6 hits)
  • TradeMonitorCronService — every 5 min (7 hits)
  • ExternalPrefetchService — every 5 min (4 hits)

Flagged user APIs

  • GET /api/now — confirmed 500, 19.7s, pg-pool acquire timeout
  • GET /api/v1/subscription — 5.1s wait before 401 (pool pressure)
  • NowConfigService — archetype lookup pool timeout (feeds /api/now)
  • Onboarding / trade-feed / insights bulk — heavy by design (code path)

How it crashes

Not a single bug — a feedback loop between memory leak, task churn, and connection starvation.

1 · Heap climb

  • Primary: in-process CacheModule stores large OHLCV payloads from ExternalInsightDataService — see OOM root cause
  • Secondary: WebSocket listener leak on reconnect (Blofin/Weex/Extended)
  • Node hits --max-old-space-size (4096 MB in prod) → FATAL ERROR: Allocation failed - JavaScript heap out of memory
  • Observed cadence: ~14 minutes between deaths; 19 crashes show GC Ineffective mark-compacts

2 · ECS churn

  • Task killed → ALB drains → new task starts (~12 min cycle in ECS events)
  • Cold start replays crons, watchdog tick, archetype recovery sweeps
  • Each restart opens fresh DB connections while old pool state is gone

3 · Pool starvation

  • Request path pool = 10 connections per ECS task (POSTGRES_POOL_SIZE)
  • Background jobs share same pool — no worker bulkhead enabled
  • Errors: timeout exceeded when trying to connect (acquire) and Connection terminated due to connection timeout
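
The missing worker bulkhead can be pictured as a concurrency cap on background work, so that crons and sweeps can never hold every slot of the shared pool. The following is a minimal sketch under stated assumptions, not the repo's implementation: Bulkhead is a hypothetical class, the limit of 3 is an arbitrary illustrative split of the 10 slots, and the jobs are stand-ins that do not touch a database.

```typescript
// Hypothetical bulkhead: at most `limit` tasks run at once, the rest wait in FIFO order.
class Bulkhead {
  private readonly limit: number;
  private active = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  get inFlight(): number {
    return this.active;
  }

  get queued(): number {
    return this.waiters.length;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // No free slot: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter; `active` stays the same
      } else {
        this.active--;
      }
    }
  }
}

// Assumed split: background work may use at most 3 of the 10 pool slots.
const backgroundBulkhead = new Bulkhead(3);

// Stand-in for a DB-backed cron body; a real job would query Postgres here.
const fakeJob = (name: string) => async (): Promise<string> => name;

void Promise.all(
  ["rank-recompute", "trade-monitor", "watchdog-tick", "archetype-sweep"].map((name) =>
    backgroundBulkhead.run(fakeJob(name)),
  ),
).then((done) => console.log("finished:", done.join(", ")));
```

With a cap like this, a burst of cron work after a cold start queues behind the bulkhead instead of behind the request path. A separate, smaller Postgres pool for workers would achieve a similar isolation at the connection level.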

4 · API pile-up

  • GET /api/now fans out ~6 computeAllMetrics DB loads per screen open
  • Onboarding refreshForConnection + archetype refresh competes on same pool
  • HTTP request lines are logged only for errors, and the fleet was mostly crash-looping, so user-visible 500s are under-counted
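
The fan-out arithmetic behind the pile-up can be made explicit. This is a back-of-the-envelope sketch only: waitingAcquires is a hypothetical helper, and it assumes that the ~6 loads of one request run in parallel and that each holds one pool connection for its whole duration, which the logs do not confirm.

```typescript
// Number of connection requests left waiting when all loads start at once.
function waitingAcquires(
  poolSize: number,
  concurrentRequests: number,
  loadsPerRequest: number,
): number {
  const wanted = concurrentRequests * loadsPerRequest;
  return Math.max(0, wanted - poolSize);
}

const POOL_SIZE = 10; // POSTGRES_POOL_SIZE per ECS task
const LOADS_PER_NOW_REQUEST = 6; // "~6 computeAllMetrics DB loads per screen open"

for (const requests of [1, 2, 3]) {
  console.log(requests, waitingAcquires(POOL_SIZE, requests, LOADS_PER_NOW_REQUEST));
}
// prints: 1 0, then 2 2, then 3 8
```

Under these assumptions a second simultaneous screen open already leaves two acquires waiting, before any background job has taken a slot.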

OOM root cause — in-memory cache (debugged)

Code review + CloudWatch correlation. OOM is not caused by Postgres pool exhaustion — pool timeouts are a downstream symptom. The process dies because the V8 heap fills with cached insight market data that is never reclaimed.

The leak mechanism

  • Nest CacheModule.register() default store = Keyv backed by an unbounded Map
  • ExternalInsightDataService writes ExternalInsightData objects: OHLCV candle arrays, OI/liquidation history, per-trade external maps
  • Cache keys churn on time buckets + symbols + connection IDs — e.g. insights:external:…:b{nowBucket}:…
  • Keyv evicts expired entries only on re-access — churning keys are never read again → entries never freed
  • Heap climbs until --max-old-space-size cap → V8 FATAL (documented: exit 139 / OOM every ~12–21 min)
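
The mechanism above can be reproduced in a few lines. The sketch below is a hypothetical stand-in for a Map-backed TTL store with lazy expiry; it is not Keyv's code and only mimics the behaviour described in the bullets. The key shape (demo:b plus a bucket number) and the one-minute bucket width are assumptions for illustration.

```typescript
type Entry = { value: unknown; expiresAt: number };

// Hypothetical lazy-expiry cache: entries are dropped only when the same key is read again.
class LazyTtlCache {
  private readonly store = new Map<string, Entry>();

  set(key: string, value: unknown, ttlMs: number, now: number): void {
    this.store.set(key, { value, expiresAt: now + ttlMs });
  }

  get(key: string, now: number): unknown {
    const entry = this.store.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= now) {
      this.store.delete(key); // the only place an expired entry is freed
      return undefined;
    }
    return entry.value;
  }

  get size(): number {
    return this.store.size;
  }
}

const TTL_MS = 10 * 60 * 1000; // 10 min TTL, matching the cache write path table
const BUCKET_MS = 60 * 1000; // assumed bucket width, for illustration only
const SIMULATED_MS = 3 * 60 * 60 * 1000; // three hours of writes

const lazyCache = new LazyTtlCache();

// Each write uses a key that embeds the current time bucket, so no key is ever reused.
for (let t = 0; t < SIMULATED_MS; t += BUCKET_MS) {
  const nowBucket = Math.floor(t / BUCKET_MS);
  lazyCache.set(`demo:b${nowBucket}`, new Array<number>(1000).fill(0), TTL_MS, t);
}

// All but the last ten minutes of entries are expired, yet every one is still held.
console.log(lazyCache.size); // 180
```

Because expired entries are only removed on a read of the same key, the entry count tracks the number of distinct keys ever written rather than the number of live entries.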

Evidence from this window

  • 44 heap OOM fatals — steady ~14 min cadence, not spike-driven
  • 25 × Reached heap limit — hard cap hit
  • 19 × Ineffective mark-compacts near heap limit — GC cannot reclaim enough
  • Pool errors (122) occur between OOMs while task still alive — separate failure mode
  • ECS replaces task after each OOM → cold start replays crons → pool pressure worsens

Fix in repo (may not be deployed)

  • boundedMemoryStore() in src/common/cache/bounded-cache-store.ts — caps entry count, evicts oldest on insert
  • Wired in app.module.ts (2000), insights.module.ts (1500), now.module.ts (2000), briefing.module.ts (1000)
  • Caveat: cap is entry count, not bytes — 2000 × large OHLCV payloads can still be hundreds of MB
  • Prod still OOM-ing every ~14 min → likely an old build without the fix, or a byte cap is still needed
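
To make the caveat concrete: if a single cached payload were around 200 KB (an assumed figure, not measured in this window), 2000 entries would hold roughly 400 MB. The sketch below shows one way to cap both entry count and approximate bytes. It is a hypothetical BoundedCache written for illustration, not the repo's boundedMemoryStore(), and it estimates size from the JSON text length, which is only a rough proxy for heap usage and assumes JSON-serializable values.

```typescript
// Hypothetical store that evicts oldest-first when either cap is exceeded.
class BoundedCache<V> {
  private readonly store = new Map<string, { value: V; bytes: number }>();
  private readonly maxEntries: number;
  private readonly maxBytes: number;
  private totalBytes = 0;

  constructor(maxEntries: number, maxBytes: number) {
    this.maxEntries = maxEntries;
    this.maxBytes = maxBytes;
  }

  get size(): number {
    return this.store.size;
  }

  get approxBytes(): number {
    return this.totalBytes;
  }

  has(key: string): boolean {
    return this.store.has(key);
  }

  get(key: string): V | undefined {
    return this.store.get(key)?.value;
  }

  delete(key: string): void {
    const entry = this.store.get(key);
    if (entry === undefined) return;
    this.totalBytes -= entry.bytes;
    this.store.delete(key);
  }

  set(key: string, value: V): void {
    // Rough estimate: JSON text length, not exact V8 heap bytes.
    const bytes = JSON.stringify(value).length;
    this.delete(key); // re-inserting moves the key to the newest position
    this.store.set(key, { value, bytes });
    this.totalBytes += bytes;

    // A Map iterates in insertion order, so the first key is the oldest.
    while (this.store.size > this.maxEntries || this.totalBytes > this.maxBytes) {
      const oldest = this.store.keys().next();
      if (oldest.done) break;
      this.delete(oldest.value);
    }
  }
}

// Illustrative limits: 2000 entries and an assumed 64 MB budget.
const candleCache = new BoundedCache<number[]>(2000, 64 * 1024 * 1024);
for (let i = 0; i < 5000; i++) {
  candleCache.set(`demo:b${i}`, new Array<number>(500).fill(i));
}
console.log(candleCache.size, candleCache.approxBytes);
```

An entry larger than the byte budget evicts everything, including itself, so very large payloads are effectively not cached. Whether that is acceptable depends on how the callers handle a miss.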

Secondary heap contributor

  • WebSocket services (Blofin, Weex, Extended) leak one abort listener per reconnect
  • Documented in blofin-websocket.service.ts — same heap-OOM class bug
  • prod/compose.yml references blofin-ws-oom-leak branch as complementary fix
  • Alone this may not explain 14 min cadence; combined with cache leak it accelerates death
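
The listener leak pattern, and one common way to avoid it, can be sketched without any WebSocket code. Everything below is hypothetical: CountingSignal is a minimal stand-in for an AbortSignal-like object so that listeners can be counted, and ReconnectingClient illustrates the general remedy (remove the previous listener before registering a new one), not the actual change on the referenced branch.

```typescript
type Listener = () => void;

// Minimal stand-in for an abort signal that exposes its listener count.
class CountingSignal {
  private readonly listeners = new Set<Listener>();

  addEventListener(_type: "abort", listener: Listener): void {
    this.listeners.add(listener);
  }

  removeEventListener(_type: "abort", listener: Listener): void {
    this.listeners.delete(listener);
  }

  get listenerCount(): number {
    return this.listeners.size;
  }
}

// Leaky pattern: every reconnect registers a fresh closure and never removes the old one.
function reconnectLeaky(signal: CountingSignal): void {
  signal.addEventListener("abort", () => {
    /* close the socket created by this reconnect */
  });
}

// Remedy: keep a handle to the current listener and detach it before re-registering.
class ReconnectingClient {
  private readonly signal: CountingSignal;
  private onAbort: Listener | undefined;

  constructor(signal: CountingSignal) {
    this.signal = signal;
  }

  reconnect(): void {
    if (this.onAbort) {
      this.signal.removeEventListener("abort", this.onAbort);
    }
    this.onAbort = () => {
      /* close the current socket */
    };
    this.signal.addEventListener("abort", this.onAbort);
  }
}

const leakySignal = new CountingSignal();
for (let i = 0; i < 100; i++) reconnectLeaky(leakySignal);

const fixedSignal = new CountingSignal();
const client = new ReconnectingClient(fixedSignal);
for (let i = 0; i < 100; i++) client.reconnect();

console.log(leakySignal.listenerCount, fixedSignal.listenerCount); // 100 1
```

Each leaked closure also keeps alive whatever it captures, such as the old socket and its buffers, which is why one listener per reconnect can matter for heap growth.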

Cache write path (code)

Service | What gets cached | TTL | Heap risk
ExternalInsightDataService | Full ExternalInsightData: candles, fear/greed, OI, liquidations, trade maps | 10 min | Critical
ExternalInsightDataService | Hyperliquid candleSnapshot arrays per symbol/interval | 5 min | Critical
ExternalInsightDataService | Coinalyze markets / liquidation events | 6 h | High
InsightsService | Bulk closed-trade insight results | 10 min | High
NowScoringService | Metric computation results per user | 5 min | Medium
BriefingCacheService | Briefing summary / full payloads | varies | Medium

Conclusion

Yes: in-memory caching is the primary OOM cause, specifically the unbounded (or insufficiently bounded) Nest CacheModule store that holds ExternalInsightDataService OHLCV payloads. Pool starvation and API timeouts are collateral damage from the crash-restart loop, not the heap killer.

Every pool incident (122 events)

Complete log extract, sorted chronologically. Scheduler rows come from the generic Nest cron wrapper; the underlying jobs include rank recompute, trade monitor, partner sync, reports worker, etc.

Timestamp (UTC) | Component | Error type | Message
2026-06-30 21:26:48.744 | ClosedTradesLiveSyncService | Connection terminated | error [ClosedTradesLiveSyncService] Closed trades live refresh failed for connection cf5b19e5-76e1-4cd9-8609-f077102180e9: Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._conn…
2026-06-30 21:27:16.251 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 21:27:16.262 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 22:27:37.209 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 22:27:37.223 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 22:27:37.224 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:27:37.225 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:27:37.225 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:27:37.226 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 22:33:45.304 | ExchangeWatchdogService | Acquire timeout | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 22:33:47.181 | ExchangeWatchdogService | Acquire timeout | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 22:34:26.789 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:35:29.482 | RankRecomputeJob | Connection terminated | error [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-06-30 22:35:29.488 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:36:11.707 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=6e998057-a638-4a28-8928-bca056877122: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 22:42:10.215 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:42:10.216 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 22:42:14.129 | ArchetypeService | Acquire timeout | warn [ArchetypeService] round-trip reconstruction failed conn=cf5b19e5-76e1-4cd9-8609-f077102180e9 userId=6e998057-a638-4a28-8928-bca056877122: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 22:42:35.188 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 22:43:01.597 | ArchetypeStaleSweep | Connection terminated | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=f61a8166-7f7c-497c-a797-e5d63a1f1086: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 22:57:09.988 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 22:57:10.187 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 23:09:12.228 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 23:12:11.359 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:12:11.362 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:12:11.366 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:12:11.367 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 23:12:11.367 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:12:28.344 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 23:14:46.114 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 23:14:46.116 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-06-30 23:14:46.122 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:14:46.123 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:14:46.124 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:24:16.635 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-06-30 23:28:11.316 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-06-30 23:28:47.444 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 00:00:10.243 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:00:10.243 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:00:10.243 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:00:10.244 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:00:10.244 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:00:10.244 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:00:10.245 | PartnerVolumeSyncJob | Connection terminated | error [PartnerVolumeSyncJob] Partner volume sync failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 00:00:10.245 | OrphanedLivePositionsSweep | Connection terminated | warn [OrphanedLivePositionsSweepService] orphaned open_positions_live sweep failed: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 00:01:14.766 | NowConfigService | Connection terminated | warn [NowConfigService] NowConfigService.resolve(6d371d5d-efff-41ac-a1c8-da3675d84fc3) — archetype lookup failed, using default. Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 00:08:11.576 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:35:14.977 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 00:35:15.251 | RankRecomputeJob | Connection terminated | error [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 00:35:35.156 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 00:37:02.986 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 00:59:46.329 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:01:14.561 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:13:11.527 | ArchetypeStaleSweep | Connection terminated | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:13:11.528 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:13:20.084 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:13:20.085 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:13:20.598 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=6e998057-a638-4a28-8928-bca056877122: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 01:13:45.285 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=f61a8166-7f7c-497c-a797-e5d63a1f1086: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 01:15:23.847 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:15:23.848 | ExternalPrefetchService | Connection terminated | warn [ExternalPrefetchWorker] Prefetch cron failed: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:15:23.848 | RankRecomputeJob | Connection terminated | error [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 01:15:23.849 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:15:23.850 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:15:23.850 | PartnerVolumeSyncJob | Connection terminated | error [PartnerVolumeSyncJob] Partner volume sync failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 01:15:23.851 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:15:34.064 | ArchetypeService | Acquire timeout | warn [ArchetypeService] round-trip reconstruction failed conn=cf5b19e5-76e1-4cd9-8609-f077102180e9 userId=6e998057-a638-4a28-8928-bca056877122: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 01:17:08.422 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:17:08.423 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:20:12.093 | PartnerActivationSyncJob | Connection terminated | warn [PartnerActivationSyncJob] Activation sync failed for 11bd3c66-bac4-4b53-8a96-9d349e13cbb2: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:21:10.725 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:21:28.067 | ArchetypeService | Acquire timeout | warn [ArchetypeService] round-trip reconstruction failed conn=04c7824b-1b18-4408-8fd0-821f52674e80 userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 01:21:36.948 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 01:35:12.643 | RankRecomputeJob | Connection terminated | error [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 01:36:20.299 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 01:36:30.687 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 01:36:30.699 | ArchetypeService | Acquire timeout | warn [ArchetypeService] round-trip reconstruction failed conn=04c7824b-1b18-4408-8fd0-821f52674e80 userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 02:02:08.628 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:02:08.629 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 02:07:09.275 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:07:09.279 | TradeMonitorCronService | Connection terminated | warn [TradeMonitorCronService] [TradeMonitorCron] Failed to load distinct users: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 02:07:09.280 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:07:09.280 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 02:07:09.281 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 02:07:09.306 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:07:09.307 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:26:09.305 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:37:25.854 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 02:50:19.708 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 02:54:11.539 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 03:06:06.557 | InsightOutcomeWorker | Acquire timeout | warn [InsightOutcomeWorker] score outcome failed (496f6663-0842-49f9-80b6-da9513162867-trade:RB-WR-004 u=5e2de41a-4c8e-4f26-bdaf-26942496edd2): timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 03:09:15.005 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 03:09:35.318 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=6e998057-a638-4a28-8928-bca056877122: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 03:10:02.064 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=f61a8166-7f7c-497c-a797-e5d63a1f1086: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 03:36:10.168 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=5e2de41a-4c8e-4f26-bdaf-26942496edd2: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 03:37:19.709 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 03:39:14.730 | ArchetypeStaleSweep | Connection terminated | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=456773dd-311b-41da-b856-8483e53cde77: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 03:39:14.755 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 03:53:47.977 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 03:54:09.394 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 03:54:09.587 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 04:20:01.362 | ExchangeWatchdogService | Connection terminated | warn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 04:23:12.144 | Scheduler | Connection terminated | error [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 04:45:14.456 | RankRecomputeJob | Connection terminated | error [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 04:52:36.282 | ArchetypeStaleSweep | Acquire timeout | warn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=5e2de41a-4c8e-4f26-bdaf-26942496edd2: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 04:53:00.747ArchetypeStaleSweepConnection terminatedwarn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=f61a8166-7f7c-497c-a797-e5d63a1f1086: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 04:53:44.730ExchangeWatchdogServiceConnection terminatedwarn [ExchangeWatchdogService] Exchange watchdog tick failed: Error: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 04:54:31.429SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 04:54:31.432SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 04:57:12.659ArchetypeStaleSweepAcquire timeoutwarn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=6e998057-a638-4a28-8928-bca056877122: timeout exceeded when trying to connect {"service":"mentor-api"}
2026-07-01 05:16:08.256SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 05:17:08.359SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 06:10:13.444SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 06:10:15.505ArchetypeStaleSweepConnection terminatedwarn [ArchetypeStaleSweep] nodata-recovery refresh failed userId=37c46997-fb77-4ccc-bfeb-f846bf71dc66: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 06:10:15.531RankRecomputeJobConnection terminatederror [RankRecomputeJob] Rank recompute failed: Connection terminated due to connection timeout {"service":"mentor-api","stack":[null]}
2026-07-01 06:11:16.308ExternalPrefetchServiceConnection terminatedwarn [ExternalPrefetchService] refreshPrefetch failed symbol=SOL: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 06:11:16.326SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 06:25:14.424ExternalPrefetchServiceConnection terminatedwarn [ExternalPrefetchService] refreshPrefetch failed symbol=XYZ: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 06:33:05.865ExternalPrefetchServiceConnection terminatedwarn [ExternalPrefetchService] refreshPrefetch failed symbol=ENA: Connection terminated due to connection timeout {"service":"mentor-api"}
2026-07-01 06:33:18.739SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 06:33:18.740SchedulerConnection terminatederror [Scheduler] Connection terminated due to connection timeout {"service":"mentor-api","stack":["Error: Connection terminated due to connection timeout\n at Client._connectionCallback (/app/node_modules/pg-pool/index.js:262:17)\n at Connection.<anonymous> (/app/node_modu…
2026-07-01 06:34:16.331ExceptionsHandler (API)Acquire timeouterror [6652e0d2-c650-4b5e-ad41-394613a9a086][ExceptionsHandler] timeout exceeded when trying to connect {"service":"mentor-api","stack":["Error: timeout exceeded when trying to connect\n at Timeout._onTimeout (/app/node_modules/pg-pool/index.js:216:27)\n at listOnTimeout (n…

Every OOM crash (44 events)

Each row is a process-level V8 fatal. Task restarts immediately after; no graceful drain.

Timestamp (UTC) | Variant | Message
2026-06-30 21:28:13.669 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 21:57:06.360 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 22:10:52.503 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 22:44:02.378 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 22:59:23.107 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 23:16:09.608 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 23:29:39.917 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 23:43:56.962 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-06-30 23:58:02.532 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:06:40.604 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:10:09.884 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:13:36.449 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:19:05.367 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:19:10.866 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:36:30.971 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:44:06.780 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 00:49:11.568 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 01:01:52.915 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 01:17:13.594 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 01:21:48.602 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 01:36:38.941 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 01:52:56.165 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 02:04:13.164 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 02:08:55.110 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 02:20:12.315 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 02:38:12.060 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 02:51:58.803 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 03:10:19.831 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 03:20:46.053 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 03:40:11.587 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 03:56:43.853 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 04:07:37.075 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 04:24:36.319 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 04:37:44.900 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 04:56:22.724 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 05:07:28.928 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 05:24:03.287 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 05:38:19.445 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 05:55:18.170 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 06:12:05.698 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 06:26:24.685 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 06:44:22.562 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 06:56:15.475 | Heap limit reached | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2026-07-01 07:11:39.620 | Mark-compact failure | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
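
The headline average gap can be reproduced from the first and last rows of this table: 44 crashes leave 43 gaps between them. The sketch below is only a check of that arithmetic; the function name is illustrative and not part of the codebase.

```typescript
// Illustrative helper (hypothetical name): mean gap between evenly counted events.
function meanGapMinutes(firstIso: string, lastIso: string, eventCount: number): number {
  // N events are separated by N - 1 gaps.
  const spanMs = Date.parse(lastIso) - Date.parse(firstIso);
  return spanMs / (eventCount - 1) / 60000;
}

// First and last OOM timestamps from the table, written as ISO 8601 UTC.
const gap = meanGapMinutes("2026-06-30T21:28:13.669Z", "2026-07-01T07:11:39.620Z", 44);
console.log(gap.toFixed(1)); // 13.6
```

The result, about 13.6 minutes, is what the report rounds to 14.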

Flagged HTTP endpoints

HTTP logger only emits 4xx/5xx. Scanner/probe traffic (unrelated vulnerability scans) excluded. Only Lockin app endpoints shown.

Timestamp | Method | Path | Status | Duration | Notes
2026-07-01 06:34:16.331 | GET | /api/now | 500 | 19,743ms | Pool acquire timeout; only confirmed 500
2026-07-01 06:23:47.388 | GET | /api/v1/subscription | 401 | 5,140ms | Slow; likely waited on pool before auth fail
2026-07-01 06:16:26.260 | POST | /api/exchanges/hyperliquid/link-direct | 409 | 21ms | Conflict; already linked
2026-07-01 06:16:04.516 | POST | /api/exchanges/hyperliquid/link-direct | 409 | 12ms | Conflict; already linked
2026-07-01 06:11:52.719 | POST | /api/insights/session/end | 401 | 1ms | Unauthenticated
2026-07-01 06:11:37.415 | POST | /api/notifications/tokens | 401 | 1ms | Unauthenticated
2026-07-01 06:04:08.892 | GET | /api/exchanges/trades | 404 | 7ms | Route not found
2026-07-01 06:01:49.099 | GET | /api/exchanges/connections | 401 | 0ms | Unauthenticated
2026-07-01 05:55:30.773 | GET | /api/rewards/summary | 403 | 51ms | Forbidden

Heavy APIs (code-identified, pool risk)

Endpoint | Why it stresses the pool
GET /api/now | 5-metric score; ~6 parallel computeAllMetrics fan-outs per screen
GET /api/now/:metric | Per-metric scoring reload
GET /api/now/:metric/trend | Trend + metrics DB reads
POST /api/onboarding/exchange/connect | Import + closedTradesLive.refreshForConnection + archetype refresh
GET /api/trade-feed/open-fast | Connection-bundle fan-out for open trades
GET /api/trade-feed/closed-fast | Closed-trade bundle load
POST /api/insights/trades/closed/bulk | Bulk PRD insight compute per trade page
GET /api/insights/curriculum | Ranked insight curriculum query
POST /api/archetype/refresh | Force refreshNow + fp-svc + DB
GET /api/reports/* | Daily/weekly/monthly report computation

Recommended fixes

Ordered by impact. Heap fix stops the crash loop; pool isolation stops the cascade; infra alignment stops false kills.

P0 — stop the bleed

Deploy bounded in-process cache (OOM root cause)

Ship boundedMemoryStore() for all CacheModule.register() calls. This stops unbounded Keyv Map growth from ExternalInsightDataService OHLCV payloads: cache keys churn on time buckets, so expired entries are never re-accessed and therefore never freed, and the heap hits OOM roughly every 14 min. Verify the fix is actually deployed to ECS; prod still crash-looping suggests an old image or an insufficient cap.
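
A minimal sketch of the bounded-store idea, assuming an LRU entry cap plus an explicit sweep for expired keys. This is not the project's boundedMemoryStore(); the class name, method names, and injectable clock are hypothetical and only illustrate why a cap and active expiry together keep time-bucketed keys from accumulating.

```typescript
// Hypothetical sketch; not the project's boundedMemoryStore() implementation.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class BoundedTtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(maxEntries: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get size(): number {
    return this.entries.size;
  }

  set(key: string, value: V, ttlMs: number): void {
    // Delete first so the key moves to the end of the Map's insertion order.
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
    // Evict from the front (least recently used) until back under the cap.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazy expiry on read
      return undefined;
    }
    this.entries.delete(key); // refresh recency
    this.entries.set(key, entry);
    return entry.value;
  }

  // Active expiry: frees entries whose keys will never be read again.
  sweepExpired(): number {
    const now = this.now();
    let removed = 0;
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= now) {
        this.entries.delete(key);
        removed += 1;
      }
    }
    return removed;
  }
}
```

Lazy expiry alone would not help with the pattern described above, because a key tied to a past time bucket is never read again; the cap and the sweep are what release it.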

P0 — stop the bleed

Add byte-weighted cache eviction (follow-up)

boundedMemoryStore(2000) caps entry count, not size. One ExternalInsightData object can hold thousands of candles across symbols, so 2000 entries can still exhaust the heap. Consider max-bytes eviction or moving OHLCV to Redis with TTL.
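
One way max-bytes eviction could look, as a sketch under stated assumptions: entry weight is approximated by the length of the JSON encoding (which is not the real V8 heap cost), and eviction removes the oldest-inserted entries first. The class and its methods are hypothetical names, not existing project code.

```typescript
// Hypothetical sketch of max-bytes eviction; names are illustrative only.
class ByteBudgetCache<V> {
  private readonly entries = new Map<string, { value: V; bytes: number }>();
  private readonly maxBytes: number;
  private usedBytes = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  get bytesInUse(): number {
    return this.usedBytes;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  // Approximate weight: length of the JSON encoding. It does not equal real
  // heap usage, but it grows with payload size, which is what eviction needs.
  private weigh(value: V): number {
    return (JSON.stringify(value) ?? "").length;
  }

  delete(key: string): void {
    const entry = this.entries.get(key);
    if (entry === undefined) return;
    this.usedBytes -= entry.bytes;
    this.entries.delete(key);
  }

  // Returns false when a single value is larger than the whole budget.
  set(key: string, value: V): boolean {
    this.delete(key);
    const bytes = this.weigh(value);
    if (bytes > this.maxBytes) return false;
    this.entries.set(key, { value, bytes });
    this.usedBytes += bytes;
    // Evict oldest-inserted entries until the budget is respected again.
    while (this.usedBytes > this.maxBytes) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.delete(oldest);
    }
    return true;
  }
}
```

With a byte budget, one oversized OHLCV payload displaces several small entries instead of counting as a single slot, which is the gap an entry-count cap leaves open.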

P0 — stop the bleed

Ship WebSocket listener leak fixes

Blofin/Weex/Extended WS services: detach abort listeners on normal completion. Secondary heap contributor — same OOM class documented in blofin-websocket.service.ts.
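
The leak class named here can be sketched as follows, assuming the pattern is a per-request abort listener that is only removed when the abort fires. The helper and the minimal signal interface are hypothetical and not taken from blofin-websocket.service.ts; the interface mirrors the two AbortSignal methods the helper uses.

```typescript
// Hypothetical minimal shape of the signal used below.
interface AbortSignalLike {
  readonly aborted: boolean;
  addEventListener(type: "abort", listener: () => void): void;
  removeEventListener(type: "abort", listener: () => void): void;
}

// Races `work` against the signal and always detaches the listener,
// including on normal completion (the case that otherwise leaks).
async function withAbort<T>(work: Promise<T>, signal: AbortSignalLike): Promise<T> {
  if (signal.aborted) throw new Error("aborted");
  let onAbort: () => void = () => {};
  const aborted = new Promise<never>((_, reject) => {
    onAbort = () => reject(new Error("aborted"));
    signal.addEventListener("abort", onAbort);
  });
  try {
    return await Promise.race([work, aborted]);
  } finally {
    signal.removeEventListener("abort", onAbort);
  }
}
```

Without the finally block, every completed call on a long-lived signal leaves one closure attached to it, and each closure keeps its captured request state reachable.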

P1 — pool isolation

Enable worker DB bulkhead

Set WORKER_DB_POOL_ENABLED=true so BullMQ processors and crons draw from a separate pool sized by WORKER_POOL_SIZE (5) instead of the 10-slot request-path pool.
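
A sketch of how the flag could translate into pool sizing. Only the three environment variable names and the 10 and 5 figures come from this report; the helper functions, the PoolPlan shape, and the fallback behaviour are assumptions for illustration.

```typescript
// Hypothetical helper; only the env var names and default sizes come from the report.
interface PoolPlan {
  requestPoolMax: number;
  workerPoolMax: number | null; // null means workers share the request pool
}

function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function resolvePoolPlan(env: Record<string, string | undefined>): PoolPlan {
  const requestPoolMax = parsePositiveInt(env["POSTGRES_POOL_SIZE"], 10);
  const workerEnabled = env["WORKER_DB_POOL_ENABLED"] === "true";
  return {
    requestPoolMax,
    // The bulkhead: background work gets its own, smaller budget.
    workerPoolMax: workerEnabled ? parsePositiveInt(env["WORKER_POOL_SIZE"], 5) : null,
  };
}
```

With both pools enabled at these sizes, each task can open up to 15 connections, so the per-task total still has to fit the database's connection limit.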

P1 — pool isolation

Move crons off request pool

Scheduler (52 hits), ExchangeWatchdogService (22), and ArchetypeStaleSweep (19) account for 93 of the 122 pool timeout events and must not share the POSTGRES_POOL_SIZE pool with HTTP. Route them through the worker DataSource or a dedicated read pool.

P2 — API hardening

Single-flight already on Now — extend coverage

computeAllMetrics is already deduplicated; ensure all /api/now/* routes share the same flight, and cap concurrent metric-detail fetches from the Flutter client.
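
For reference, the single-flight pattern can be sketched as below. This is a generic illustration, not the existing computeAllMetrics dedup; the class name and the key format in the usage note are hypothetical.

```typescript
// Generic single-flight sketch: concurrent callers with the same key share one promise.
class SingleFlight<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(key: string, fn: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing !== undefined) return existing;
    // Defer the call by one microtask so the map entry is registered before
    // any cleanup runs, even if fn throws synchronously.
    const flight = Promise.resolve()
      .then(fn)
      .finally(() => {
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, flight);
    return flight;
  }
}
```

For the /api/now/* routes to share one flight, they would need to key on the same value, for example a per-user key such as `now:<userId>` rather than the route path. That key choice is an assumption about how the existing dedup is scoped.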

P2 — API hardening

Decouple onboarding fingerprint from closedTradesLive build

Feed fp-svc from in-memory round-trip reconstruction — skip heavy refreshForConnection on card hot path (onboarding-scale-architecture.md).

P2 — API hardening

Throttle background sweeps during saturation

Skip ArchetypeStaleSweep / ExternalPrefetch when admission control is saturated or pool probe fails.
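
A possible saturation guard, sketched under assumptions: the stats fields mirror the totalCount, idleCount, and waitingCount counters that node-postgres exposes on a Pool, while the function names and the exact thresholds are hypothetical.

```typescript
// Field names mirror the counters on a node-postgres Pool; thresholds are assumptions.
interface PoolStats {
  totalCount: number;   // connections currently open
  idleCount: number;    // open connections not checked out
  waitingCount: number; // callers queued for a connection
}

function shouldSkipSweep(stats: PoolStats, maxConnections: number): boolean {
  const queued = stats.waitingCount > 0;
  const exhausted = stats.totalCount >= maxConnections && stats.idleCount === 0;
  return queued || exhausted;
}

// Runs the sweep only when the pool has headroom; otherwise reports the skip.
function runSweepIfHealthy(
  stats: PoolStats,
  maxConnections: number,
  sweep: () => void,
): "ran" | "skipped" {
  if (shouldSkipSweep(stats, maxConnections)) return "skipped";
  sweep();
  return "ran";
}
```

Skipping a tick costs little for a sweep that retries on its next schedule, whereas running it adds one more waiter to a queue that request traffic is already stuck in.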

P3 — infra

RDS Proxy cutover

Removes Aurora Serverless scaling lag that amplifies acquire timeouts. Allows tuning pool sizes per CONNECTION_BUDGET.md.

P3 — infra

ALB health → /api/health/live

Point the ALB health check at /api/health/live instead of GET /api, whose 5s timeout kills busy-but-alive tasks. This prevents a restart cascade on top of the OOM loop.

P3 — observability

Log successful HTTP at sample rate

Current logger only emits 4xx/5xx — under-reports API load during incidents. Add sampled info-level HTTP for volume attribution.
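
The sampling decision could be as small as the sketch below. The function names, the level mapping, and the 5% rate in the usage note are assumptions, not the current logger's API.

```typescript
// Hypothetical sampling rule: keep every 4xx/5xx, sample everything else.
function shouldLogRequest(
  status: number,
  sampleRate: number,
  random: () => number = Math.random,
): boolean {
  if (status >= 400) return true; // preserves today's behaviour for errors
  return random() < sampleRate;
}

// Assumed level mapping for the emitted line.
function levelForStatus(status: number): "error" | "warn" | "info" {
  if (status >= 500) return "error";
  if (status >= 400) return "warn";
  return "info";
}
```

At a sample rate of 0.05, roughly one in twenty successful requests would be logged, so observed counts need to be multiplied by 20 when estimating volume.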