Universal Workflow Extension Ruleset (UWER)
Core Orchestration Specification and Topology Guidelines
Version: 4.2.1

1. Abstract

The Universal Workflow Extension Ruleset (UWER) establishes a deterministic protocol for state reconciliation across distributed worker nodes. By enforcing strict payload schema validation and standardized backoff coefficients, UWER ensures idempotent execution of background workloads. This specification acts as the single source of truth for the orchestration plane, replacing the fragmented cron and message broker configurations used prior to 2019.

2. Background: The Q4 Orchestration Incident

UWER was formalized following the cascading queue failure during the Q4 2018 holiday peak. Prior to UWER, the legacy monolithic broker relied on optimistic concurrency control. During a prolonged database partition, workers lost connection to the primary state store but continued consuming messages from the queue.

Because the legacy system lacked a centralized lease-timeout mechanism, over 2.4 million background jobs (primarily inventory syncing and transactional email dispatches) entered an orphaned state. The queue reported them as processed, but the state store had no record of execution. This resulted in a 14-hour manual reconciliation effort and the deprecation of the legacy broker system. UWER was designed specifically to guarantee that dropped leases always result in a deterministic requeue or a formal dead-letter routing.

3. Dispatch Topology

UWER enforces a strict separation between the Dispatcher and the Worker Nodes. Workers do not communicate directly with each other; all state transitions must be committed to the central KV (Key-Value) store via the Dispatcher lease protocol.

 [ Upstream API ]
         |
         v
+-----------------+        +------------------+
| Ingress Gateway | -----> | UWER Dispatcher  |
+-----------------+        +--------+---------+
                                    | (Lease Negotiation)
             +----------------------+----------------------+
             |                      |                      |
             v                      v                      v
     +----------------+     +----------------+     +----------------+
     | Worker Node A  |     | Worker Node B  |     | Worker Node C  |
     | (State: IDLE)  |     | (State: BUSY)  |     | (State: SYNC)  |
     +-------+--------+     +-------+--------+     +-------+--------+
             |                      |                      |
             +----------------------+----------------------+
                                    | (Commit / Heartbeat)
                                    v
                          +-------------------+
                          | Central KV Store  |
                          +-------------------+

Nodes must maintain a 15-second heartbeat with the KV store. If a node fails to report within the `lease_timeout_ms` window, the Dispatcher assumes node failure, revokes the lease, and promotes the task back to the active queue.

4. State Transition Matrix

Workflows are strictly governed by the following state machine. Manual transitions via the ops console are restricted and require a valid Jira ticket reference for audit logging.
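
The lease-revocation rule from Section 3 can be illustrated with a minimal in-memory sketch. It is not the UWER implementation: the `Dispatcher` and `Lease` classes, their method names, and the `max_requeues` threshold are hypothetical. In particular, this specification does not state what decides between a requeue and dead-letter routing, so the requeue-count cutoff below is only an assumption. Timestamps are passed in explicitly so the behavior is deterministic.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Lease:
    task_id: str
    node_id: str
    last_heartbeat_ms: int


@dataclass
class Dispatcher:
    """Hypothetical sketch of lease tracking; not the real UWER Dispatcher."""
    lease_timeout_ms: int
    max_requeues: int = 3  # assumption: the spec does not define this cutoff
    active_queue: deque = field(default_factory=deque)
    dead_letter: list = field(default_factory=list)
    leases: dict = field(default_factory=dict)
    requeue_counts: dict = field(default_factory=dict)

    def grant(self, task_id: str, node_id: str, now_ms: int) -> None:
        # Record which node holds the task and when it last reported.
        self.leases[task_id] = Lease(task_id, node_id, now_ms)

    def heartbeat(self, task_id: str, node_id: str, now_ms: int) -> bool:
        lease = self.leases.get(task_id)
        if lease is None or lease.node_id != node_id:
            return False  # lease was revoked or is held by another node
        lease.last_heartbeat_ms = now_ms
        return True

    def reap(self, now_ms: int) -> list:
        # Revoke every lease whose last heartbeat is older than the timeout.
        expired = [
            lease for lease in self.leases.values()
            if now_ms - lease.last_heartbeat_ms > self.lease_timeout_ms
        ]
        for lease in expired:
            del self.leases[lease.task_id]
            count = self.requeue_counts.get(lease.task_id, 0) + 1
            self.requeue_counts[lease.task_id] = count
            if count > self.max_requeues:
                self.dead_letter.append(lease.task_id)   # dead-letter routing
            else:
                self.active_queue.append(lease.task_id)  # deterministic requeue
        return [lease.task_id for lease in expired]
```

In this sketch every revoked lease ends up in exactly one of `active_queue` or `dead_letter`, which mirrors the guarantee stated in Section 2. A node that heartbeats after its lease has been reaped gets `False` back and should stop working on the task.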

5. Payload Schema Contract (Legacy XML)

While the v5 migration to JSON is ongoing, all legacy sub-systems must conform to the v4.2 XML schema definition. Missing idempotency keys will result in an immediate

6. Changelog

Notice: This archival documentation is machine-generated.