Sequencer Architectures in Capital Markets

15th January 2026

From principles to practice: operating sequenced distributed systems in capital markets

This article continues our exploration of modern distributed systems for capital markets by moving from design principles to concrete implementation choices around sequencer architectures. The on-demand recording includes detailed diagrams, timelines, and walkthroughs for deeper study.

If you read the first post on ‘Building consistent, low‑latency distributed systems for capital markets’, this is the practical follow‑on; if you did not, this article stands alone with a deeper focus on how to structure sequencer architectures as deterministic state machines, recover quickly from faults, and keep latency in check at scale.

What this blog covers:

  • Why consistency is a requirement in trading workflows and how state divergence across services creates business and operational risk.
  • How a global ordered log combined with determinism can provide consistency across distributed services.
  • How Raft consensus and a global ordered log enable advanced resiliency and availability models.
  • Where other approaches (databases in the fast path, distributed caches, general‑purpose message brokers) break causality and force performance/consistency trade‑offs.
  • Practical implications for low‑latency, high‑throughput, resilient systems that operate continuously, including state machine replication, snapshots, pipelining, and asynchronous communication.

Sequencer design taxonomy: non‑gating, gating, and consensus (and their trade‑offs)

There are several ways to build a sequencer, each making different trade‑offs across latency, availability, and data safety. Broadly, the designs fall into three families:

  1. Non‑gating (optimize for speed and tolerate loss).
  2. Gating (reduce loss with synchronous replication).
  3. Consensus‑based (replicate to a quorum of leader + follower nodes with automated failover and leader election).

Figure 1: Sequencer Design Options


Understanding where each fits prevents surprises during failure and recovery in production.

  • Non‑gating: The primary broadcasts events to clients while replicating to a secondary in parallel; this minimizes hops and reduces latency, but if the primary fails before the secondary commits, clients may have observed effects that replicas do not, making recovery protocols complex. Failover is typically manual and recovery must be tolerant to loss (RPO > 0). This can be acceptable for some market‑data fan‑outs but not for trading state or risk workflows.
  • Gating: The primary synchronously replicates to a secondary and waits for an acknowledgement (ACK) before publishing to clients, reducing the risk of data loss compared to non‑gating designs but introducing synchronization latency. Operationally, failover is often manual, and if the secondary is unavailable the primary may stall, effectively limiting availability until the secondary is operational.
  • Consensus‑based: A cluster runs a leader and followers using a protocol such as Raft; events are replicated and considered committed when a majority acknowledges receipt, preserving order and durability while enabling automatic leader election on failure. This removes manual orchestration from common failure paths and sustains availability as long as quorum holds, aligning better with continuous‑operation requirements in trading systems.

Taken together: non‑gating designs optimize for speed but accept loss; gating reduces loss but adds synchronization and manual operations; consensus‑based sequencing combines ordered durability with automated failover to balance availability and performance in production environments.

State and consistency revisited: adopt state machine replication, not eventual consistency

With these sequencing options in view, the next question is how to keep state consistent under load while maintaining predictable performance. In trading, every component should agree on the same state at the same logical time; consistency is a core safety property of modern trading systems.

This is achieved by modelling systems as deterministic state machines that apply ordered inputs from the sequencer to transition state and produce outputs. The internal state of these systems remains consistent across restarts, and they produce the same side effects.

By leveraging the sequencer to deliver inputs to multiple instances of the same state machine we are able to achieve state machine replication (SMR). Given the same starting state and identical order of inputs, every replica evolves to the same state and emits the same outputs. This allows us to model core domains (orders, accounts, order books) and offer advanced availability models, such as active/active with predictable recovery and auditability.

In practice, applications modelled using state machine replication consume only from that log, avoiding non‑deterministic inputs (wall‑clock time, randomness, unordered iteration), and start from the same initial state. This combination produces a consistent computation history that is testable, reproducible, and explainable across environments and time.
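The replication argument above can be sketched in a few lines. This is a minimal, hypothetical state machine (the class and event shapes are illustrative): given the same initial state and the same ordered inputs, two replicas converge to identical state and emit identical outputs.

```python
# Minimal deterministic state-machine sketch. Each replica applies the same
# ordered inputs from the same initial state, so all replicas converge.

class AccountStateMachine:
    def __init__(self) -> None:
        self.balances: dict[str, int] = {}   # initial state matches across replicas

    def apply(self, event: tuple[str, str, int]) -> str:
        """Apply one sequenced event deterministically: no wall-clock time,
        no randomness, no unordered iteration."""
        kind, account, amount = event
        if kind == "credit":
            self.balances[account] = self.balances.get(account, 0) + amount
        elif kind == "debit":
            self.balances[account] = self.balances.get(account, 0) - amount
        return f"{kind}:{account}:{self.balances[account]}"

# The global ordered log delivered by the sequencer:
log = [("credit", "acct1", 100), ("debit", "acct1", 30), ("credit", "acct2", 50)]

replica_a, replica_b = AccountStateMachine(), AccountStateMachine()
outputs_a = [replica_a.apply(e) for e in log]
outputs_b = [replica_b.apply(e) for e in log]

assert replica_a.balances == replica_b.balances == {"acct1": 70, "acct2": 50}
assert outputs_a == outputs_b                    # identical externally visible outputs
```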

Availability modes and orchestrated failover: active/active cores, active/warm edges

With ordering and determinism in place, availability becomes an orchestration problem. Deterministic domains (risk, matching, positions) can run active/active: multiple replicas consume the same ordered inputs and will produce the same outputs; the sequencer handles de‑duplication of inputs so losing one replica does not change externally visible results. At the ingress edge, FIX and WebSocket gateways are inherently non‑deterministic because they accept inputs outside the sequencer. These can be operated as active/warm pairs, with the sequencer managing promotion/demotion on failure and return to service without leaking non‑determinism into core state machines.
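The de-duplication step mentioned above is simple because active/active replicas emit identical outputs for identical sequenced inputs. A sketch of sequencer-side dedup, keyed by sequence number (the function and event shapes here are hypothetical):

```python
# Sketch of sequencer-side de-duplication of replica outputs. Two active
# replicas emit the same output per sequenced input; the sequencer publishes
# only the first copy it sees, keyed by sequence number.

def dedupe(outputs):
    """Yield each output once, in order of first arrival."""
    seen = set()
    for seq_no, payload in outputs:
        if seq_no not in seen:
            seen.add(seq_no)
            yield (seq_no, payload)

# Interleaved arrivals from replicas A and B for sequence numbers 1..3:
arrivals = [(1, "fill"), (1, "fill"), (2, "ack"), (3, "fill"), (2, "ack")]
assert list(dedupe(arrivals)) == [(1, "fill"), (2, "ack"), (3, "fill")]
```

Because outputs are deterministic, dropping duplicates loses no information, and losing one replica changes nothing externally visible.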

End‑to‑end workflow: preserving causality via a global ordered log

Once domains are modeled as state machines, a global ordered log becomes the backbone that preserves causality across services. Consider a typical flow:

A new order is sequenced; the risk domain validates and reserves funds and emits a downstream “order” event; the matching domain fills and emits an execution report; then risk finalizes accounting and order state. Because each service consumes the same ordered sequence and runs deterministic logic, replicas converge to the same state and publish identical outputs, which the sequencer can de‑duplicate before distribution to observers and gateways. The causality and timing relationships remain explicit and replayable, which simplifies reasoning and recovery.
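The flow above can be traced with a toy model. Everything here is illustrative (the handler names and event types are hypothetical): each service reacts only to events it reads from the log, and everything it emits is appended back through the sequencer, so causality stays explicit and replayable.

```python
# Toy walkthrough of the order flow: risk and matching consume the same
# ordered log, and every event they emit is sequenced back onto it.

log = []  # the global ordered log: (seq_no, event_type, payload)

def sequence(event_type, payload):
    log.append((len(log) + 1, event_type, payload))

def risk(event):
    _, kind, payload = event
    if kind == "new_order":
        sequence("order", payload)               # funds reserved, order emitted
    elif kind == "execution_report":
        sequence("order_done", payload)          # accounting and order state finalized

def matching(event):
    _, kind, payload = event
    if kind == "order":
        sequence("execution_report", payload)    # order filled

sequence("new_order", "buy 100 XYZ")             # ingress: the order is sequenced
cursor = 0
while cursor < len(log):                         # replay the log in order
    event = log[cursor]
    risk(event)
    matching(event)
    cursor += 1

# Causal chain is explicit in the log itself:
assert [kind for _, kind, _ in log] == [
    "new_order", "order", "execution_report", "order_done"]
```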


Snapshots and durable recovery: restart from state, not from zero

Even with fast logs, replaying millions of events during restart is rarely acceptable. The remedy is snapshots of in‑memory state taken at safe points. On failure, this allows you to restore the latest snapshot and replay only subsequent events from that safe point. This shifts recovery from “replay the entire log” to “load snapshot + apply delta since snapshot,” which materially reduces time‑to‑service.
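The "load snapshot + apply delta" idea can be demonstrated directly. This is a deliberately simple sketch (the event shape and snapshot format are illustrative, not a production format): recovery from a snapshot taken at a safe point yields exactly the state that a full replay from zero would.

```python
# Sketch of snapshot-based recovery: a restarting replica loads the latest
# snapshot and applies only events sequenced after the snapshot's safe point.
import copy

log = [("set", f"key{i}", i) for i in range(1_000)]   # full ordered log

def apply(state, event):
    _, key, value = event
    state[key] = value

# Take a snapshot of in-memory state at a safe point (after 900 events):
state = {}
for event in log[:900]:
    apply(state, event)
snapshot = {"upto": 900, "state": copy.deepcopy(state)}

# Recovery path: load snapshot + apply only the delta since the safe point.
recovered = copy.deepcopy(snapshot["state"])
for event in log[snapshot["upto"]:]:                  # only 100 events, not 1,000
    apply(recovered, event)

# Full replay from zero yields the same state but touches the entire log.
full = {}
for event in log:
    apply(full, event)
assert recovered == full
```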

Figure 2: Snapshots and Recovery

For host‑level or zonal failures where the local snapshot is gone, snapshot durability should not rely on a single machine or component. Snapshots can be persisted and replicated to multiple locations, which may include—but do not have to be limited to—the sequencer itself. The key requirement is that, after a leader election or any infrastructure failover, a surviving store can serve the same point‑in‑time snapshot to any recovering replica so all peers restart from an identical state. Decoupling snapshot storage from the ordered log gives operators flexibility in placement, capacity planning, and recovery topology, while ensuring that snapshots themselves are not a single point of failure.

Latency under consensus: keeping p99 tight with pipelining and asynchronous communication

Sequencer consensus incurs additional network round trips, but several techniques manage that cost effectively enough to meet trading latency targets.

  • The first lever is pipelining and asynchrony: replicate in batches, send new batches while awaiting acknowledgements, and overlap commit‑to‑client sends with replication; this reduces head‑of‑line blocking and leader queuing.
  • The second lever is quorum behavior: in a consensus-based cluster, commits require a majority, so the leader advances on the fastest majority of followers rather than waiting for the slowest—reducing the impact of jitter from transient network or host effects compared to traditional gated sequencer designs.
  • A third lever is at the application boundary: when deterministic consumers run active/active, only the first response is required; slower duplicates are redundant, which flattens the cost of jitter across different processes, machines, and network routes.
  • Finally, companies should consider composing deterministic state machines where appropriate to avoid unnecessary consensus round‑trips on hot paths while still publishing outcomes for downstream consumers.
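The payoff of the first lever is easy to see with back-of-envelope arithmetic. The numbers below are hypothetical (a 50 µs round trip, 2 µs per-batch send cost): a stop-and-wait leader pays one round trip per batch, while a pipelined leader overlaps sends with outstanding ACKs and pays roughly one round trip in total.

```python
# Back-of-envelope sketch of pipelining with illustrative numbers.

RTT_US = 50.0        # hypothetical network round trip per ACK
SEND_US = 2.0        # hypothetical per-batch serialization/send cost
BATCHES = 100

# Stop-and-wait: each batch waits for its ACK before the next is sent.
stop_and_wait = BATCHES * (SEND_US + RTT_US)

# Pipelined: new batches go out while earlier ACKs are still in flight,
# so the round-trip cost is paid roughly once, not per batch.
pipelined = BATCHES * SEND_US + RTT_US

assert stop_and_wait == 5200.0   # microseconds
assert pipelined == 250.0        # microseconds
assert pipelined < stop_and_wait
```

The same overlap principle applies to commit-to-client sends: publishing need not wait for the full replication pipeline to drain.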

Together, these techniques support tens‑of‑microseconds targets at institutional message rates while keeping p99 and beyond under control.

Lessons learned: properties your sequencer and services must provide

Let’s summarize what we have learned about high-performance sequencer architectures: a production‑grade platform for capital markets should exhibit a number of non-negotiable characteristics and quality attributes:

  1. A Raft consensus-based sequencer provides a fault-tolerant, highly available global ordered log.
  2. State machine replication combined with sequencing enables determinism and consistency across replicas of a service.
  3. Snapshotting, with snapshot replication across the sequencer cluster, provides a resilient mechanism for fast recovery of processes.
  4. These architectures enable availability modes (both active/active and active/warm) for services, with group membership managed via the sequencer.
  5. Pervasive asynchrony, pipelining, and batching enable systems to retain high throughput and low latency, and compensate for the causes of high tail latencies.

Taken together, these properties align consistency, availability, and performance (rather than forcing a trade-off between them), making sequencer architectures a solid, scalable foundation for future capital markets platforms.