Operational Transformation
The Core Idea
Operational Transformation has an elegant premise: when two operations happen concurrently, transform one against the other so it accounts for the effect of the concurrent edit.
Alice inserts "X" at position 1. Bob deletes position 3. These edits were made on the same document state, but by the time Alice receives Bob's edit, her document has changed (she already inserted "X"). So Bob's "delete position 3" needs to become "delete position 4" — because Alice's insert shifted everything after position 1 to the right by one.
That adjustment is the transform function. Given two concurrent operations, it produces new versions of each that can be applied to the other's result and produce the same final document.
Operations and Their Structure
OT works with operations, not states. An operation describes a change:
type Operation =
| { type: 'insert'; position: number; char: string }
| { type: 'delete'; position: number };
In practice, most OT systems use a more compact representation — a sequence of retain, insert, and delete instructions:
type OTOp = (
| { retain: number }
| { insert: string }
| { delete: number }
)[];
// "Insert 'X' at position 5 in a 20-char document"
// becomes: [{ retain: 5 }, { insert: 'X' }, { retain: 15 }]
// "Delete 3 characters starting at position 2 in a 20-char document"
// becomes: [{ retain: 2 }, { delete: 3 }, { retain: 15 }]
This representation is nice because operations compose: you can merge multiple operations into one, and the retain/insert/delete format maps directly to how text editors describe changes.
Transform Functions: The Heart of OT
The transform function xform(a, b) takes two operations that were made against the same document state and returns two new operations (a', b') such that:
apply(apply(doc, a), b') === apply(apply(doc, b), a')
Both paths produce the same result. This is called TP1 (Transformation Property 1).
Let's implement transforms for basic insert and delete:
function transformInsertInsert(
a: { type: 'insert'; position: number; char: string },
b: { type: 'insert'; position: number; char: string }
): [typeof a, typeof b] {
if (a.position < b.position) {
return [a, { ...b, position: b.position + 1 }];
}
if (a.position > b.position) {
return [{ ...a, position: a.position + 1 }, b];
}
// Same position: need a tiebreaker (e.g., client ID)
// By convention, the "first" client wins left position
return [a, { ...b, position: b.position + 1 }];
}
function transformInsertDelete(
ins: { type: 'insert'; position: number; char: string },
del: { type: 'delete'; position: number }
): [typeof ins, typeof del] {
if (ins.position <= del.position) {
return [ins, { ...del, position: del.position + 1 }];
}
return [{ ...ins, position: ins.position - 1 }, del];
}
function transformDeleteDelete(
a: { type: 'delete'; position: number },
b: { type: 'delete'; position: number }
): [typeof a | null, typeof b | null] {
if (a.position < b.position) {
return [a, { ...b, position: b.position - 1 }];
}
if (a.position > b.position) {
return [{ ...a, position: a.position - 1 }, b];
}
// Same position: both delete the same character — both become no-ops
return [null, null];
}
Think of OT like two people giving directions to the same starting point. Alice says "turn right at the 3rd light." Bob says "turn left at the 5th light." If a new traffic light was added before both of them (an insert), Alice's directions still work, but Bob's "5th light" is now the "6th light." The transform adjusts Bob's directions to account for Alice's change to the shared landscape.
TP1 and TP2: The Correctness Properties
TP1 (Transformation Property 1)
For any two concurrent operations a and b, and any document state S:
apply(apply(S, a), xform(b, a).b') = apply(apply(S, b), xform(a, b).a')
Both transformation paths converge. This is sufficient for a client-server architecture where the server sequences operations.
TP2 (Transformation Property 2)
For three concurrent operations a, b, c:
xform(xform(c, a).c', xform(b, a).b').c'' = xform(xform(c, b).c', xform(a, b).a').c''
Transforming c against a then b' gives the same result as transforming c against b then a'. This is needed for peer-to-peer OT where there's no central sequencer.
Here's the uncomfortable truth: TP2 is incredibly hard to satisfy. Many published OT algorithms that claim to work in peer-to-peer settings have been shown to violate TP2 in edge cases. This is a major reason why Google Docs uses a central server.
The Client-Server OT Architecture
Google Docs uses a simplified architecture: the server acts as a single sequencer.
interface ServerState {
document: string;
revision: number;
history: Array<{ op: OTOp; clientId: string }>;
}
class OTServer {
private state: ServerState = {
document: '',
revision: 0,
history: [],
};
receiveOperation(
clientId: string,
clientRevision: number,
op: OTOp
): { transformedOp: OTOp; revision: number } {
let transformedOp = op;
for (let i = clientRevision; i < this.state.revision; i++) {
const serverOp = this.state.history[i].op;
const [, transformed] = transformOps(serverOp, transformedOp);
transformedOp = transformed;
}
this.state.document = applyOp(this.state.document, transformedOp);
this.state.history.push({ op: transformedOp, clientId });
this.state.revision++;
return {
transformedOp,
revision: this.state.revision,
};
}
}
The client side maintains a state machine with three states:
type ClientOTState =
| { type: 'synchronized' }
| { type: 'awaitingAck'; pending: OTOp }
| { type: 'awaitingAckWithBuffer'; pending: OTOp; buffer: OTOp };
class OTClient {
private state: ClientOTState = { type: 'synchronized' };
private revision = 0;
applyLocal(op: OTOp, sendToServer: (rev: number, op: OTOp) => void): void {
switch (this.state.type) {
case 'synchronized':
sendToServer(this.revision, op);
this.state = { type: 'awaitingAck', pending: op };
break;
case 'awaitingAck':
this.state = {
type: 'awaitingAckWithBuffer',
pending: this.state.pending,
buffer: op,
};
break;
case 'awaitingAckWithBuffer':
this.state = {
...this.state,
buffer: composeOps(this.state.buffer, op),
};
break;
}
}
applyServer(
op: OTOp,
applyToEditor: (op: OTOp) => void,
sendToServer: (rev: number, op: OTOp) => void
): void {
this.revision++;
switch (this.state.type) {
case 'synchronized':
applyToEditor(op);
break;
case 'awaitingAck': {
const [pending, transformed] = transformOps(this.state.pending, op);
this.state = { type: 'awaitingAck', pending };
applyToEditor(transformed);
break;
}
case 'awaitingAckWithBuffer': {
const [pending1, serverOp1] = transformOps(this.state.pending, op);
const [buffer1, serverOp2] = transformOps(this.state.buffer, serverOp1);
this.state = {
type: 'awaitingAckWithBuffer',
pending: pending1,
buffer: buffer1,
};
applyToEditor(serverOp2);
break;
}
}
}
serverAck(sendToServer: (rev: number, op: OTOp) => void): void {
this.revision++;
switch (this.state.type) {
case 'awaitingAck':
this.state = { type: 'synchronized' };
break;
case 'awaitingAckWithBuffer':
sendToServer(this.revision, this.state.buffer);
this.state = { type: 'awaitingAck', pending: this.state.buffer };
break;
}
}
}
Why OT Is Hard: The N-Client Problem
With 2 clients and a server, OT is manageable. With N clients in a peer-to-peer setting, the complexity explodes.
Consider 3 clients making concurrent edits. Client A's operation needs to be transformed against B's and C's. But the order of transformation matters — transforming A against B first, then against C-transformed-against-B, must give the same result as transforming A against C first, then against B-transformed-against-C. This is TP2, and it's where most OT implementations break.
The number of transformation paths grows factorially with the number of concurrent operations. For N concurrent operations, there are N! possible orderings, and all must converge.
The academic history of OT is littered with published algorithms later proven incorrect. The dOPT algorithm (1989) was shown to violate convergence in certain cases. The adOPTed algorithm attempted to fix it but introduced new edge cases. Jupiter (used by Google Wave) worked for client-server but couldn't be generalized to peer-to-peer. The GOT and GOTO algorithms attempted general solutions with increasing complexity.
This 20+ year struggle is exactly why CRDTs — which guarantee convergence by construction — have become the preferred approach for new systems.
OT's Limitations
- 1OT requires a central server to avoid TP2 (peer-to-peer OT is nearly impossible to get right)
- 2Transform functions must handle every combination of operation types — complexity is O(n^2) in operation types
- 3Server becomes a bottleneck and single point of failure for all edits
- 4Offline editing is extremely difficult with OT — how do you transform against operations you haven't seen?
- 5Undo is complex — you must transform the undo operation against all operations that happened since the original
Why the Industry Is Moving to CRDTs
OT served Google Docs well for over a decade. But the requirements have changed:
- Offline-first: Users expect apps to work without internet. OT's server dependency makes this painful.
- Decentralized architectures: Local-first software, peer-to-peer sync, edge computing — OT doesn't fit.
- Multi-device sync: Your phone, laptop, and tablet editing the same note. CRDTs handle this naturally.
- Simplicity: A correct OT implementation is thousands of lines of subtle code. CRDT libraries (Yjs, Automerge) give you correctness out of the box.
New collaborative apps almost universally choose CRDTs. Google Docs itself is an OT legacy — if built today, it would likely use a CRDT-based approach.
| What developers do | What they should do |
|---|---|
| Implementing OT from scratch for a new project OT implementations are notoriously bug-prone. Even published academic algorithms have been proven incorrect. Unless you're at Google scale with a dedicated team, a CRDT library gives you convergence guarantees without the risk. | Use a battle-tested CRDT library like Yjs or Automerge |
| Thinking OT and CRDTs are interchangeable The choice between OT and CRDTs determines your entire system architecture — server topology, offline support, sync protocol, conflict resolution strategy. | Understand the architectural implications: OT needs a central server, CRDTs work decentralized |
| Using position-based operations without transformation Raw position-based operations (insert at index 5) break when concurrent operations shift indices. You must either adjust positions via transformation or eliminate position-dependency entirely. | Either transform positions (OT) or use position-independent identifiers (CRDTs) |
Deep Dive: OT Transform Correctness
Given a document "ABC" and three concurrent operations: (1) Insert "X" at position 1, (2) Delete position 2, (3) Insert "Y" at position 2. Trace through all possible orderings and show that your transform functions produce the same final document. What happens if a transform function has a bug for the insert-insert-at-same-position case? Show how it breaks convergence.