Summary
Replace the current N-way webhook fan-out with a centralized event bus that platforms subscribe to. Publishers emit each event once; consumers pull or receive only the packet types they care about. The existing packet format is preserved: transport and distribution change, payload shape does not.
Context: Current Architecture
Today, every state change in W3DS triggers a webhook dispatch loop that iterates over every registered platform and performs an outbound HTTP POST per platform, per event.
flowchart LR
S[W3DS Core] -->|event occurs| D[Dispatcher]
D -->|POST| P1[Platform 1]
D -->|POST| P2[Platform 2]
D -->|POST| P3[Platform 3]
D -->|POST| P4[Platform 4]
D -->|POST| PN[Platform N]
For every event, the dispatcher makes N outbound calls, where N is the number of registered platforms. Every platform receives every event, regardless of whether it cares about that packet type.
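To make the cost concrete, the loop below is a minimal sketch of the fan-out described above. It is not the actual W3DS dispatcher: the `PLATFORMS` registry, its URLs, the `profile.updated` packet type, and the injected `post` callable are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical registry: platform name -> webhook URL (illustrative values only).
PLATFORMS: Dict[str, str] = {
    "platform-1": "https://platform-1.example/webhook",
    "platform-2": "https://platform-2.example/webhook",
    "platform-3": "https://platform-3.example/webhook",
}


def dispatch(event: dict, post: Callable[[str, dict], None]) -> int:
    """Send one event to every registered platform and return the call count."""
    calls = 0
    for url in PLATFORMS.values():
        post(url, event)  # one outbound POST per platform, with no filtering
        calls += 1
    return calls


# Record the calls instead of performing real HTTP requests.
sent: List[str] = []
n = dispatch({"packet_type": "profile.updated"}, lambda url, ev: sent.append(url))
```

Every event costs `len(PLATFORMS)` calls here, whether or not a given platform wants that packet type.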
Problem
Dispatch cost scales linearly with platform count. Adding the Nth platform adds a full outbound request to every event.
Failure handling is per-platform and per-event. A slow or failing consumer blocks the dispatcher or burns retry budget on traffic the consumer may not even want.
No filtering at the edge. Consumers receive events they do not care about and must drop them client-side, wasting bandwidth and processing on both sides.
No replay. If a platform is down and exhausts retries, events are lost. There is no cursor, no backfill, no durable log.
No pull semantics. Consumers cannot catch up at their own pace or query historical events.
Producer is tightly coupled to consumer addressability. The core system must know every registered platform, every endpoint URL, every auth scheme.
Proposal: Awareness as a Service
Introduce a central event bus that sits between the W3DS core and all consuming platforms. The core publishes once, the bus handles distribution, filtering, durability, and delivery.
flowchart LR
S[W3DS Core] -->|publish once| B[(Awareness Bus)]
B -->|filtered push| P1[Platform 1]
B -->|filtered push| P2[Platform 2]
B -.->|pull on demand| P3[Platform 3]
B -.->|pull on demand| P4[Platform 4]
B -->|filtered push| PN[Platform N]
Each consumer declares which packet types it wants, how it wants them delivered (push or pull), and from what position (live, or from a cursor). The core is unaware of how many consumers exist or what they want.
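One possible shape for such a declaration is sketched below. The `Subscription` and `Delivery` names, the field names, and the example packet types are hypothetical; the proposal does not prescribe a concrete schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Delivery(Enum):
    PUSH = "push"
    PULL = "pull"


@dataclass(frozen=True)
class Subscription:
    """What a consumer declares to the bus (hypothetical field names)."""
    platform: str
    packet_types: FrozenSet[str]
    delivery: Delivery
    # None means "start live"; an integer means "resume from this log position".
    start_cursor: Optional[int] = None

    def wants(self, packet_type: str) -> bool:
        return packet_type in self.packet_types


sub = Subscription(
    platform="platform-3",
    packet_types=frozenset({"profile.updated"}),
    delivery=Delivery.PULL,
    start_cursor=120,
)
```

Because the declaration lives with the bus, the core never needs to see it.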
Event Flow
sequenceDiagram
participant Core as W3DS Core
participant Bus as Awareness Bus
participant Sub as Subscription Registry
participant P as Platform
Core->>Bus: publish(event, packet_type)
Bus->>Bus: append to durable log
Bus->>Sub: lookup subscribers for packet_type
Sub-->>Bus: matching subscriptions
alt push subscription
Bus->>P: deliver(event)
P-->>Bus: ack
else pull subscription
P->>Bus: fetch(cursor, filters)
Bus-->>P: events batch
P->>Bus: commit(cursor)
end
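The sequence above can be sketched as a small in-memory model. This is a toy under stated assumptions, not a proposed implementation: the `AwarenessBus` class, its method signatures, and the use of a list index as the cursor are illustrative, and filtering is limited to packet type.

```python
from typing import Callable, Dict, List, Set, Tuple


class AwarenessBus:
    """Toy in-memory bus: append-only log, push handlers, pull cursors."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, dict]] = []  # (packet_type, event), append-only
        self.push_subs: Dict[str, List[Callable[[dict], None]]] = {}
        self.cursors: Dict[str, int] = {}  # committed position per pull subscriber

    def subscribe_push(self, packet_type: str, handler: Callable[[dict], None]) -> None:
        self.push_subs.setdefault(packet_type, []).append(handler)

    def publish(self, event: dict, packet_type: str) -> int:
        self.log.append((packet_type, event))  # append to the log first
        for handler in self.push_subs.get(packet_type, []):
            handler(event)  # push only to matching subscriptions
        return len(self.log) - 1  # position of the event in the log

    def fetch(self, cursor: int, filters: Set[str], limit: int = 100) -> Tuple[List[dict], int]:
        """Return matching events from `cursor` onward and the next cursor."""
        batch: List[dict] = []
        pos = cursor
        while pos < len(self.log) and len(batch) < limit:
            packet_type, event = self.log[pos]
            if packet_type in filters:
                batch.append(event)
            pos += 1
        return batch, pos

    def commit(self, subscriber: str, cursor: int) -> None:
        self.cursors[subscriber] = cursor

    def committed(self, subscriber: str) -> int:
        return self.cursors.get(subscriber, 0)


bus = AwarenessBus()
received: List[dict] = []
bus.subscribe_push("profile.updated", received.append)  # push subscriber

bus.publish({"id": 1}, "profile.updated")
bus.publish({"id": 2}, "message.created")

# Pull subscriber: fetch from its committed cursor, then commit the new one.
batch, next_cursor = bus.fetch(bus.committed("platform-3"), {"message.created"})
bus.commit("platform-3", next_cursor)
```

The push subscriber sees only `profile.updated`, while the pull subscriber advances its own cursor independently of everyone else.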
Required Capabilities
The bus must support the following, described as capabilities rather than specific technology choices.
Topics or streams keyed by packet type. Consumers subscribe to the packet types they care about, not a firehose.
Server-side filtering. Filter predicates beyond packet type, for example by tenant, by entity id, by severity.
Push and pull delivery. Push for consumers that want live delivery. Pull for consumers that want to control throughput or backfill.
Durable, append-only log with retention. Events persist for a configured window so consumers can replay, resume after downtime, or reprocess after bugs.
Cursor-based consumption. Each subscription tracks its position independently. Slow consumers do not affect fast ones.
Dead-letter handling. Events that fail delivery past a retry threshold go to a dead-letter queue (DLQ) instead of being dropped.
Ordering guarantees per partition key. Events for the same entity arrive in order. No global ordering requirement.
At-least-once delivery with idempotency keys on every event. Consumers dedupe on their side using the key.
Auth and authorization per subscription. Platforms can only subscribe to packet types they are entitled to receive.
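Two of the capabilities above, dead-letter handling and at-least-once delivery with idempotency keys, interact closely, so a short sketch may help. The retry threshold of 3, the function and class names, and the event layout are assumptions; a real bus would also back off between attempts and persist both the DLQ and the set of seen keys.

```python
from typing import Callable, List, Set

MAX_ATTEMPTS = 3  # assumed retry threshold, not specified by the proposal


def deliver_with_dlq(event: dict, deliver: Callable[[dict], None],
                     dlq: List[dict], max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Try delivery up to max_attempts times, then park the event in the DLQ."""
    for _ in range(max_attempts):
        try:
            deliver(event)
            return True
        except Exception:
            continue  # a real bus would wait (back off) before retrying
    dlq.append(event)
    return False


class DedupingConsumer:
    """Consumer that tolerates redelivery by remembering idempotency keys."""

    def __init__(self) -> None:
        self.seen: Set[str] = set()
        self.processed: List[dict] = []

    def handle(self, event: dict) -> None:
        key = event["idempotency_key"]
        if key in self.seen:
            return  # duplicate from at-least-once delivery: ignore it
        self.seen.add(key)
        self.processed.append(event)


def always_fail(event: dict) -> None:
    raise ConnectionError("consumer unavailable")


consumer = DedupingConsumer()
dlq: List[dict] = []
event = {"idempotency_key": "evt-1", "packet": b"..."}

deliver_with_dlq(event, consumer.handle, dlq)
deliver_with_dlq(event, consumer.handle, dlq)  # redelivery of the same event
ok = deliver_with_dlq(event, always_fail, dlq)  # exhausts retries
```

The redelivered event is processed once, and the event whose delivery never succeeds ends up in `dlq` rather than being lost.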
Packet Format
Unchanged. The existing packet schema is preserved end to end. The bus transports packets; it does not transform them. Consumers that parse the current format continue to work without modification. Only the transport envelope around the packet changes, and that envelope is additive metadata (packet_type, timestamp, idempotency_key, cursor_position).
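As an illustration of "additive metadata", the sketch below wraps an existing packet without touching its bytes. The four metadata field names come from the text above; the `Envelope` class, the `wrap` helper, the UUID-based key, and the example packet contents are assumptions.

```python
import json
import time
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    packet_type: str
    timestamp: float
    idempotency_key: str
    cursor_position: int
    packet: bytes  # the existing packet, carried byte for byte


def wrap(packet: bytes, packet_type: str, cursor_position: int) -> Envelope:
    """Attach transport metadata once, at publish time; never alter the packet."""
    return Envelope(
        packet_type=packet_type,
        timestamp=time.time(),
        idempotency_key=str(uuid.uuid4()),  # assumed key scheme
        cursor_position=cursor_position,
        packet=packet,
    )


# Hypothetical packet contents, serialized the way an existing consumer expects.
original = json.dumps({"id": "abc", "name": "example"}).encode("utf-8")
env = wrap(original, "profile.updated", cursor_position=42)
```

A consumer that already parses the current format can keep doing so by reading `env.packet` and ignoring the rest.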
Migration Path
Backward compatibility during transition. Existing webhook consumers keep working while new bus-native consumers come online.
flowchart LR
S[W3DS Core] -->|publish| B[(Awareness Bus)]
B --> N1[New: Bus-native Platform]
B --> N2[New: Bus-native Platform]
B --> SH[Webhook Shim]
SH -->|POST legacy webhook| L1[Legacy Platform 1]
SH -->|POST legacy webhook| L2[Legacy Platform 2]
Phases:
Stand up the bus alongside the existing dispatcher. Core dual-writes to both.
Build a webhook shim that consumes from the bus and performs outbound POSTs in the existing webhook format to legacy platforms. Legacy platforms see no change.
Migrate platforms one by one from legacy webhook to native bus subscription. Each migration is a config change on the platform side, not a core change.
Once every platform is served by the bus, either natively or through the shim, retire the old dispatcher. The shim stays for as long as legacy consumers remain and can be removed after the last one migrates.
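The shim in the second phase can be sketched as an ordinary bus consumer that strips the envelope and forwards the packet unchanged. The `shim_step` function, the dict-shaped envelope, the endpoint URLs, and the injected `post` callable are assumptions; a real shim would use an HTTP client, retries, and cursor commits.

```python
from typing import Callable, Dict, List, Tuple


def shim_step(batch: List[Dict], legacy_urls: List[str],
              post: Callable[[str, bytes], None]) -> int:
    """Forward each envelope's packet, unchanged, to every legacy endpoint."""
    delivered = 0
    for envelope in batch:
        body = envelope["packet"]  # envelope metadata is dropped here
        for url in legacy_urls:
            post(url, body)  # same payload the legacy webhook sends today
            delivered += 1
    return delivered


# Record the calls instead of performing real HTTP requests.
posted: List[Tuple[str, bytes]] = []
batch = [
    {"packet_type": "profile.updated", "cursor_position": 7, "packet": b'{"id": "abc"}'},
]
legacy = ["https://legacy-1.example/webhook", "https://legacy-2.example/webhook"]
count = shim_step(batch, legacy, lambda url, body: posted.append((url, body)))
```

The fan-out loop still exists, but it now lives in the shim and shrinks as platforms migrate, rather than sitting on the core publish path.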
Out of Scope
Changing the packet schema itself.
Exposing the bus to third-party external consumers outside the W3DS platform network. Initial scope is internal consumers.
Cross-region replication. Single-region to start, revisit if needed.
Open Questions
Retention window. How many days of event history do we keep on the bus? The answer depends on the worst-case consumer downtime we want to tolerate and on storage cost.
Partition key. Is it tenant id, entity id, or something else? The choice affects ordering semantics.
Subscription registration flow. Self-serve via API, or gated through admin provisioning?
Schema evolution policy. How do we version packet types when fields are added or removed, and how do consumers opt in to new versions?
Ordering strictness. Strict per-key ordering costs throughput. Is best-effort ordering acceptable for any packet types?
Billing and quotas. Do we meter per-subscription consumption, and if so on what axis (events, bytes, retained storage)?
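Whichever partition key is chosen, per-key ordering usually relies on mapping each key to a fixed partition. The sketch below shows one common way to do that with a stable hash. The `partition_for` name, the SHA-256 choice, the partition count of 8, and the example keys are assumptions, not decisions made by this proposal.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a partition key to a partition index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Events for the same entity always land on the same partition, so they can be
# kept in order there; different entities may or may not share a partition.
first = partition_for("entity-123", 8)
second = partition_for("entity-123", 8)
```

Python's built-in `hash()` is avoided because string hashing is randomized per process by default, which would break the stable mapping.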
Success Criteria
Core publish path is O(1) with respect to consumer count. Adding a new platform does not increase publish latency or load on the core.
Consumers can subscribe to a subset of packet types and receive only those.
A platform can go offline for the full retention window, come back, and catch up from its last cursor without losing events.
DLQ depth, subscription lag, and delivery error rate are observable per subscription in real time.
Legacy webhook consumers continue to function unchanged through the shim for the entire migration window.
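The observability criterion above can be made concrete with a per-subscription snapshot. The `SubscriptionStats` class, its field names, and the sample numbers are hypothetical; they only show how lag and error rate might be derived from counters the bus would already track.

```python
from dataclasses import dataclass


@dataclass
class SubscriptionStats:
    """Hypothetical per-subscription counters exposed by the bus."""
    log_head: int          # position of the newest event in the log
    committed_cursor: int  # last position this subscription committed
    delivered: int         # successful delivery attempts
    failed: int            # failed delivery attempts
    dlq_depth: int         # events currently parked in the DLQ

    @property
    def lag(self) -> int:
        return max(0, self.log_head - self.committed_cursor)

    @property
    def error_rate(self) -> float:
        attempts = self.delivered + self.failed
        return self.failed / attempts if attempts else 0.0


stats = SubscriptionStats(log_head=1000, committed_cursor=940,
                          delivered=95, failed=5, dlq_depth=2)
```

With these sample values, `stats.lag` is 60 events and `stats.error_rate` is 0.05.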