Multi-Leader Replication: When One Leader Isn't Enough
Your users are in Tokyo, London, and New York. A single leader means trans-Pacific latency on every write. Multi-leader replication solves this... but introduces the nightmare of write conflicts.
The Hook
Your users are in Tokyo, London, and New York. A single leader in Virginia means 300ms latency on every write for Asian users. Multi-leader replication puts a leader in each region—writes are fast everywhere. But now you have a new problem: what happens when two users edit the same document simultaneously from different continents? Welcome to the nightmare of write conflicts.
Learning Objectives
By the end of this article, you will be able to:
- Identify the three use cases where multi-leader replication is necessary
- Analyze the conflict resolution strategies (LWW, merge, custom logic)
- Evaluate the trade-offs between different replication topologies
- Recognize the dangers of multi-leader in single-datacenter deployments
The Big Picture: When Single-Leader Isn't Enough
Single-leader replication has one critical limitation: all writes must go through one node. This creates problems at global scale.
Diagram Explanation: With single leader, Tokyo users face 300ms round-trip latency. Multi-leader puts a leader in each region—local writes are fast, and leaders synchronize asynchronously in the background.
Three Use Cases for Multi-Leader
Use Case 1: Multi-Datacenter Operation
The primary use case. Each datacenter has its own leader:
Diagram Explanation: Three datacenters, each with its own leader. Within each datacenter, leader → follower replication is synchronous (or semi-sync). Between datacenters, leader ↔ leader replication is asynchronous.
| Aspect | Single-Leader | Multi-Leader |
|--------|---------------|--------------|
| Write latency | High (cross-datacenter) | Low (local leader) |
| Datacenter failure | Failover required | Other DCs continue |
| Network tolerance | Writes blocked if leader unreachable | Writes succeed locally |
| Conflict handling | None needed | Critical problem |
Use Case 2: Clients with Offline Operation
Think calendar apps, note-taking apps, or email clients:
Diagram Explanation: Each device is essentially a datacenter with one replica. When offline, you make local changes (writes to local leader). When back online, devices sync—and conflicts can arise.
Tools that handle this: CouchDB and PouchDB are designed for multi-leader, offline-first sync, and Firestore's client-side offline persistence follows a similar pattern.
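A rough sketch of the offline-first pattern, using made-up class and method names rather than any real sync API: writes always succeed against the device's local replica, queue up while offline, and surface conflicts when the device syncs on reconnect.

```python
# Illustrative only: a device acting as its own leader, with a naive
# "changed on both sides" conflict check (real systems track revisions).

class OfflineDevice:
    def __init__(self, name):
        self.name = name
        self.local_state = {}   # this device's replica (its local "leader")
        self.pending = []       # writes made while offline, in order

    def write(self, key, value):
        # Local writes succeed immediately, even with no network.
        self.local_state[key] = value
        self.pending.append((key, value))

    def sync(self, server_state):
        # On reconnect, push queued writes. If the server's copy changed
        # while we were offline, record a conflict for the app to resolve.
        conflicts = []
        for key, value in self.pending:
            if key in server_state and server_state[key] != value:
                conflicts.append((key, server_state[key], value))
            else:
                server_state[key] = value
        self.pending.clear()
        return conflicts

phone = OfflineDevice("phone")
phone.write("meeting:42/time", "10:00")        # made on the train, offline
server = {"meeting:42/time": "11:00"}          # changed by a laptop meanwhile
print(phone.sync(server))                      # [('meeting:42/time', '11:00', '10:00')]
```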
Use Case 3: Collaborative Editing
Google Docs, Notion, Figma—real-time collaboration where multiple users edit simultaneously:
Diagram Explanation: Both users write to the same document at nearly the same time. The system must decide how to merge these concurrent edits into a consistent final state.
The Conflict Problem
Conflict = two writes to the same data that the system cannot automatically reconcile.
Diagram Explanation: Two leaders accept different values for the same field before they can sync. When replication catches up, the conflict must be resolved—but there's no "correct" answer.
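To make the divergence concrete, here is a minimal simulation with made-up leader objects: each leader acknowledges its local write immediately, and by the time replication catches up they hold different values for the same key.

```python
# Illustrative only: two leaders accept conflicting writes before syncing.

class Leader:
    def __init__(self, name):
        self.name = name
        self.data = {}   # local replica: key -> (value, accepted_by)

    def write(self, key, value):
        # Accept and acknowledge the write locally, with no coordination.
        self.data[key] = (value, self.name)

tokyo, london = Leader("tokyo"), Leader("london")

# Two users update the same title at nearly the same time, each against
# their nearest leader. Both writes are acknowledged as successful.
tokyo.write("doc:42/title", "B")
london.write("doc:42/title", "C")

# When asynchronous replication delivers each write to the other leader,
# the replicas disagree and something must decide which value survives.
print(tokyo.data["doc:42/title"])    # ('B', 'tokyo')
print(london.data["doc:42/title"])   # ('C', 'london')
```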
Why This Doesn't Happen with Single-Leader
With a single leader, concurrent writes are serialized:
Diagram Explanation: The single leader processes writes sequentially. There's no conflict—just one write overwriting another. The final state is deterministic.
Conflict Avoidance: The Simple Solution
The best way to handle conflicts is to avoid them entirely.
Diagram Explanation: Route all writes for a particular record to the same leader. User 123's data always goes through DC1—no conflicts possible because writes are serialized.
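A minimal sketch of that routing rule, with hypothetical leader names: hash the record's ID to pick a deterministic home leader, so all writes for that record are serialized in one place. Its limits are listed below.

```python
# Illustrative only: deterministic record -> home-leader routing.

import hashlib

LEADERS = ["dc1-leader", "dc2-leader", "dc3-leader"]   # hypothetical names

def home_leader(record_id: str) -> str:
    # Hash the record ID so the same record always routes to the same leader.
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return LEADERS[int.from_bytes(digest[:8], "big") % len(LEADERS)]

# Every write for user 123 goes through one leader, so its writes are
# serialized there and cannot conflict with each other.
print(home_leader("user:123"))
print(home_leader("user:123") == home_leader("user:123"))   # True, always
```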
Limitations:
- Breaks if user moves locations or leader fails
- Doesn't work for shared data (collaborative docs, shared calendars)
- Requires careful routing logic
Conflict Resolution Strategies
When conflicts can't be avoided, they must be resolved. Options:
1. Last Write Wins (LWW)
Most common, most dangerous.
Diagram Explanation: Attach a timestamp to each write. The write with the latest timestamp wins. Simple, but data loss is guaranteed for concurrent writes.
The problem: Both writes were "successful"—but only one survives. This is acceptable for caches, not for user data.
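A minimal sketch of LWW resolution with made-up timestamps: both leaders accepted a write, but only the one with the larger timestamp survives, and skewed clocks can make "latest" pick the wrong one.

```python
# Illustrative only: last-write-wins keeps the highest timestamp and
# silently discards the other acknowledged write.

def lww_resolve(write_a, write_b):
    # Ties would need an arbitrary tiebreak (e.g. node ID); clock skew
    # means the "winner" may not be the write the user actually made last.
    return max(write_a, write_b, key=lambda w: w["ts"])

a = {"ts": 1_700_000_001.204, "value": "set title = 'B'", "origin": "tokyo"}
b = {"ts": 1_700_000_001.317, "value": "set title = 'C'", "origin": "london"}

print(lww_resolve(a, b)["value"])   # "set title = 'C'" -- Tokyo's write is gone
```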
2. Merge Values
For some data types, you can intelligently merge:
Diagram Explanation: For sets and counters, concurrent additions can be merged (union). No data loss! But not all data types support this.
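For example, two leaders that each added a tag to the same record can be reconciled with a plain set union. This is a simplified sketch; handling removals takes more machinery, which is what OR-Sets address.

```python
# Illustrative only: merging concurrent additions with a set union.

tags_on_leader_1 = {"urgent", "finance"}      # added in one datacenter
tags_on_leader_2 = {"urgent", "q3-review"}    # added concurrently in another

merged = tags_on_leader_1 | tags_on_leader_2
print(merged)   # {'urgent', 'finance', 'q3-review'} -- nothing is lost

# A plain union cannot distinguish "never added" from "added then removed",
# so deletions need extra bookkeeping (see OR-Set below).
```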
3. CRDTs: Conflict-Free Replicated Data Types
CRDTs are data structures designed for automatic conflict resolution:
| CRDT Type | Use Case | How It Works |
|-----------|----------|--------------|
| G-Counter | Like counts | Each node's count merged |
| PN-Counter | Likes/unlikes | Separate counters for +/- |
| G-Set | Tags, members | Union of all additions |
| LWW-Register | Last writer wins | Timestamp comparison |
| OR-Set | Add/remove items | Unique IDs per add |
Real-world usage: Riak uses CRDTs extensively, and Redis Enterprise offers CRDT-based Active-Active databases for geo-distributed deployments.
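A toy G-Counter sketch (not Riak's or Redis's implementation) shows why CRDT merges are safe: each replica increments only its own slot, and merging takes per-replica maximums, so merges can be applied in any order, any number of times, and the replicas still converge.

```python
# Illustrative only: a grow-only counter (G-Counter) CRDT.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}                  # replica_id -> count

    def increment(self, n=1):
        # A replica only ever increments its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        # Take the max per replica: commutative, associative, idempotent.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self):
        return sum(self.counts.values())

tokyo, london = GCounter("tokyo"), GCounter("london")
tokyo.increment(3)
london.increment(2)
tokyo.merge(london)
london.merge(tokyo)
print(tokyo.value(), london.value())   # 5 5 -- both replicas converge
```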
4. Custom Application Logic
When automatic resolution isn't possible:
```python
class ConflictError(Exception):
    """Raised when a conflict must be resolved manually."""

    def __init__(self, message, versions):
        super().__init__(message)
        self.versions = versions


def resolve_document_conflict(version_a, version_b, user_context):
    """
    Custom conflict resolution for document conflicts.
    Prefer the current user's own version; otherwise surface
    both versions for a manual merge.
    """
    if version_a['author'] == user_context['current_user']:
        # Prefer the user's own version
        return version_a
    # Can't auto-resolve - prompt the user
    raise ConflictError(
        "Manual resolution required",
        versions=[version_a, version_b]
    )
```

Replication Topologies
How do leaders communicate with each other?
Diagram Explanation: Three topologies for multi-leader replication. Circular and star have single points of failure. All-to-all is most resilient but can cause ordering issues.
| Topology | Fault Tolerance | Ordering | Complexity |
|----------|-----------------|----------|------------|
| Circular | ❌ One node breaks chain | Good | Low |
| Star | ❌ Root is SPOF | Good | Medium |
| All-to-All | ✅ Any node can fail | ⚠️ Causality issues | High |
The All-to-All Causality Problem
Diagram Explanation: With all-to-all replication, messages take different network paths. Leader 3 might receive the UPDATE before the INSERT—violating causality. Solution: Version vectors to track causal dependencies.
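A minimal sketch of the version-vector check, with hypothetical leader names: each write carries the vector of events its issuer had already applied, so a receiving leader can tell that an UPDATE depends on an INSERT it has not seen yet and hold it back.

```python
# Illustrative only: using version vectors to detect a causality violation.

def has_seen(local_vv, required_vv):
    # True if this replica has already applied everything in required_vv.
    return all(local_vv.get(node, 0) >= count for node, count in required_vv.items())

# The UPDATE was issued on leader 2 after it applied leader 1's INSERT,
# so the UPDATE's dependency vector includes that INSERT.
update_depends_on = {"leader1": 1}

leader3_vv = {}                        # leader 3 has not applied the INSERT yet
if not has_seen(leader3_vv, update_depends_on):
    print("Buffer the UPDATE until the INSERT arrives")

leader3_vv = {"leader1": 1}            # after the INSERT is applied...
print(has_seen(leader3_vv, update_depends_on))   # True -- safe to apply the UPDATE
```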
The Single-Datacenter Warning
⚠️ Multi-leader within a single datacenter is almost never a good idea.
Diagram Explanation: Multi-leader adds conflict resolution complexity. The only benefit is reduced write latency—but within a single datacenter, writes to a single leader are already fast enough.
If you're considering multi-leader for performance in a single DC: Try partitioning instead. Each partition has its own leader—no conflicts within a partition.
Real-World Examples
MySQL with Tungsten Replicator
External tooling for MySQL multi-leader:
- Handles conflict detection
- Provides custom resolution hooks
- Common in MySQL-heavy environments
PostgreSQL with BDR
Bi-Directional Replication for PostgreSQL:
- Commercial extension
- CRDT support for some data types
- Conflict logging and resolution APIs
CockroachDB
Modern distributed SQL with Raft consensus:
- Looks multi-leader from the client's side: any node can accept writes
- Internally, each range of data has a single Raft leader, so concurrent writes are serialized rather than conflicting
- Trades write latency (consensus round trips) for strong consistency
Key Takeaways
- Multi-leader is for specific use cases: Multi-datacenter, offline-capable apps, and real-time collaboration. Don't use it just because you can.
- Conflict resolution is THE challenge: LWW loses data. Merging only works for some data types. Custom logic pushes complexity to the application.
- Avoidance beats resolution: Route related writes to the same leader when possible. Accept conflicts only when the use case demands it.
- CRDTs are powerful but limited: Great for counters, sets, and text editing. Not suitable for all data types. Research-heavy field.
- Topology matters: All-to-all is most resilient but introduces causality issues. Use version vectors to track dependencies.
Common Pitfalls
| ❌ Misconception | ✅ Reality |
|-----------------|-----------|
| "Multi-leader is always better for performance" | It adds complexity; single-leader + partitioning is often simpler |
| "Last-write-wins is fine" | LWW silently loses data—unacceptable for most user data |
| "Conflicts are rare" | Under high concurrency, conflicts can be common |
| "We'll handle conflicts later" | Conflict handling must be designed upfront; retrofitting is painful |
| "CRDTs solve everything" | CRDTs work for specific data types; not a silver bullet |
| "Google Docs is simple multi-leader" | It uses Operational Transformation, a complex algorithm for conflict-free editing |
What's Next?
Multi-leader replication has conflict resolution complexity but still has leaders—designated nodes that coordinate writes.
In the next section, we'll explore the radical alternative: Leaderless Replication. What if any node could accept writes? Dynamo, Cassandra, and Riak took this approach—and discovered new consistency challenges with quorums, read repair, and the limits of "N, R, W" formulas.