Operational Transformation

advanced18 min read

The Core Idea

Operational Transformation has an elegant premise: when two operations happen concurrently, transform one against the other so it accounts for the effect of the concurrent edit.

Alice inserts "X" at position 1. Bob deletes position 3. These edits were made on the same document state, but by the time Alice receives Bob's edit, her document has changed (she already inserted "X"). So Bob's "delete position 3" needs to become "delete position 4" — because Alice's insert shifted everything after position 1 to the right by one.

That adjustment is the transform function. Given two concurrent operations, it produces new versions of each that can be applied to the other's result and produce the same final document.

Operations and Their Structure

OT works with operations, not states. An operation describes a change:

type Operation =
  | { type: 'insert'; position: number; char: string }
  | { type: 'delete'; position: number };

In practice, most OT systems use a more compact representation — a sequence of retain, insert, and delete instructions:

type OTOp = (
  | { retain: number }
  | { insert: string }
  | { delete: number }
)[];

// "Insert 'X' at position 5 in a 20-char document"
// becomes: [{ retain: 5 }, { insert: 'X' }, { retain: 15 }]

// "Delete 3 characters starting at position 2 in a 20-char document"
// becomes: [{ retain: 2 }, { delete: 3 }, { retain: 15 }]

This representation is nice because operations compose: you can merge multiple operations into one, and the retain/insert/delete format maps directly to how text editors describe changes.

Transform Functions: The Heart of OT

The transform function xform(a, b) takes two operations that were made against the same document state and returns two new operations (a', b') such that:

apply(apply(doc, a), b') === apply(apply(doc, b), a')

Both paths produce the same result. This is called TP1 (Transformation Property 1).

Let's implement transforms for basic insert and delete:

function transformInsertInsert(
  a: { type: 'insert'; position: number; char: string },
  b: { type: 'insert'; position: number; char: string }
): [typeof a, typeof b] {
  if (a.position < b.position) {
    return [a, { ...b, position: b.position + 1 }];
  }
  if (a.position > b.position) {
    return [{ ...a, position: a.position + 1 }, b];
  }
  // Same position: need a tiebreaker (e.g., client ID)
  // By convention, the "first" client wins left position
  return [a, { ...b, position: b.position + 1 }];
}

function transformInsertDelete(
  ins: { type: 'insert'; position: number; char: string },
  del: { type: 'delete'; position: number }
): [typeof ins, typeof del] {
  if (ins.position <= del.position) {
    return [ins, { ...del, position: del.position + 1 }];
  }
  return [{ ...ins, position: ins.position - 1 }, del];
}

function transformDeleteDelete(
  a: { type: 'delete'; position: number },
  b: { type: 'delete'; position: number }
): [typeof a | null, typeof b | null] {
  if (a.position < b.position) {
    return [a, { ...b, position: b.position - 1 }];
  }
  if (a.position > b.position) {
    return [{ ...a, position: a.position - 1 }, b];
  }
  // Same position: both delete the same character — both become no-ops
  return [null, null];
}

Mental Model

Think of OT like two people giving directions to the same starting point. Alice says "turn right at the 3rd light." Bob says "turn left at the 5th light." If a new traffic light was added before both of them (an insert), Alice's directions still work, but Bob's "5th light" is now the "6th light." The transform adjusts Bob's directions to account for Alice's change to the shared landscape.

Quiz

Document is 'HELLO'. Alice inserts 'X' at position 2. Bob inserts 'Y' at position 4. What are the transformed operations?

ABCD

TP1 and TP2: The Correctness Properties

TP1 (Transformation Property 1)

For any two concurrent operations a and b, and any document state S:

apply(apply(S, a), xform(b, a).b') = apply(apply(S, b), xform(a, b).a')

Both transformation paths converge. This is sufficient for a client-server architecture where the server sequences operations.

TP2 (Transformation Property 2)

For three concurrent operations a, b, c:

xform(xform(c, a).c', xform(b, a).b').c'' = xform(xform(c, b).c', xform(a, b).a').c''

Transforming c against a then b' gives the same result as transforming c against b then a'. This is needed for peer-to-peer OT where there's no central sequencer.

Here's the uncomfortable truth: TP2 is incredibly hard to satisfy. Many published OT algorithms that claim to work in peer-to-peer settings have been shown to violate TP2 in edge cases. This is a major reason why Google Docs uses a central server.

Quiz

Why does Google Docs require a central server for its OT implementation?

ABCD

The Client-Server OT Architecture

Google Docs uses a simplified architecture: the server acts as a single sequencer.

interface ServerState {
  document: string;
  revision: number;
  history: Array<{ op: OTOp; clientId: string }>;
}

class OTServer {
  private state: ServerState = {
    document: '',
    revision: 0,
    history: [],
  };

  receiveOperation(
    clientId: string,
    clientRevision: number,
    op: OTOp
  ): { transformedOp: OTOp; revision: number } {
    let transformedOp = op;

    for (let i = clientRevision; i < this.state.revision; i++) {
      const serverOp = this.state.history[i].op;
      const [, transformed] = transformOps(serverOp, transformedOp);
      transformedOp = transformed;
    }

    this.state.document = applyOp(this.state.document, transformedOp);
    this.state.history.push({ op: transformedOp, clientId });
    this.state.revision++;

    return {
      transformedOp,
      revision: this.state.revision,
    };
  }
}

The client side maintains a state machine with three states:

Execution Trace

Synchronized

No pending operations. Client revision matches server.

Happy path — user sees exactly what's on the server

Awaiting ACK

User edited locally. Operation sent to server, waiting for acknowledgment.

Local state is ahead of server. Incoming remote ops are transformed against pending op.

Awaiting ACK + Buffer

User edited again while waiting for ACK of first edit.

Two pending operations: one in-flight, one buffered. Remote ops transform against both.

type ClientOTState =
  | { type: 'synchronized' }
  | { type: 'awaitingAck'; pending: OTOp }
  | { type: 'awaitingAckWithBuffer'; pending: OTOp; buffer: OTOp };

class OTClient {
  private state: ClientOTState = { type: 'synchronized' };
  private revision = 0;

  applyLocal(op: OTOp, sendToServer: (rev: number, op: OTOp) => void): void {
    switch (this.state.type) {
      case 'synchronized':
        sendToServer(this.revision, op);
        this.state = { type: 'awaitingAck', pending: op };
        break;
      case 'awaitingAck':
        this.state = {
          type: 'awaitingAckWithBuffer',
          pending: this.state.pending,
          buffer: op,
        };
        break;
      case 'awaitingAckWithBuffer':
        this.state = {
          ...this.state,
          buffer: composeOps(this.state.buffer, op),
        };
        break;
    }
  }

  applyServer(
    op: OTOp,
    applyToEditor: (op: OTOp) => void,
    sendToServer: (rev: number, op: OTOp) => void
  ): void {
    this.revision++;

    switch (this.state.type) {
      case 'synchronized':
        applyToEditor(op);
        break;
      case 'awaitingAck': {
        const [pending, transformed] = transformOps(this.state.pending, op);
        this.state = { type: 'awaitingAck', pending };
        applyToEditor(transformed);
        break;
      }
      case 'awaitingAckWithBuffer': {
        const [pending1, serverOp1] = transformOps(this.state.pending, op);
        const [buffer1, serverOp2] = transformOps(this.state.buffer, serverOp1);
        this.state = {
          type: 'awaitingAckWithBuffer',
          pending: pending1,
          buffer: buffer1,
        };
        applyToEditor(serverOp2);
        break;
      }
    }
  }

  serverAck(sendToServer: (rev: number, op: OTOp) => void): void {
    this.revision++;

    switch (this.state.type) {
      case 'awaitingAck':
        this.state = { type: 'synchronized' };
        break;
      case 'awaitingAckWithBuffer':
        sendToServer(this.revision, this.state.buffer);
        this.state = { type: 'awaitingAck', pending: this.state.buffer };
        break;
    }
  }
}

Quiz

In the OT client state machine, what happens when a user makes an edit while already waiting for server acknowledgment of a previous edit?

ABCD

Why OT Is Hard: The N-Client Problem

With 2 clients and a server, OT is manageable. With N clients in a peer-to-peer setting, the complexity explodes.

Consider 3 clients making concurrent edits. Client A's operation needs to be transformed against B's and C's. But the order of transformation matters — transforming A against B first, then against C-transformed-against-B, must give the same result as transforming A against C first, then against B-transformed-against-C. This is TP2, and it's where most OT implementations break.

The number of transformation paths grows factorially with the number of concurrent operations. For N concurrent operations, there are N! possible orderings, and all must converge.

The academic history of OT is littered with published algorithms later proven incorrect. The dOPT algorithm (1989) was shown to violate convergence in certain cases. The adOPTed algorithm attempted to fix it but introduced new edge cases. Jupiter (used by Google Wave) worked for client-server but couldn't be generalized to peer-to-peer. The GOT and GOTO algorithms attempted general solutions with increasing complexity.

This 20+ year struggle is exactly why CRDTs — which guarantee convergence by construction — have become the preferred approach for new systems.

OT's Limitations

Key Rules

1OT requires a central server to avoid TP2 (peer-to-peer OT is nearly impossible to get right)
2Transform functions must handle every combination of operation types — complexity is O(n^2) in operation types
3Server becomes a bottleneck and single point of failure for all edits
4Offline editing is extremely difficult with OT — how do you transform against operations you haven't seen?
5Undo is complex — you must transform the undo operation against all operations that happened since the original

Why the Industry Is Moving to CRDTs

OT served Google Docs well for over a decade. But the requirements have changed:

Offline-first: Users expect apps to work without internet. OT's server dependency makes this painful.
Decentralized architectures: Local-first software, peer-to-peer sync, edge computing — OT doesn't fit.
Multi-device sync: Your phone, laptop, and tablet editing the same note. CRDTs handle this naturally.
Simplicity: A correct OT implementation is thousands of lines of subtle code. CRDT libraries (Yjs, Automerge) give you correctness out of the box.

New collaborative apps almost universally choose CRDTs. Google Docs itself is an OT legacy — if built today, it would likely use a CRDT-based approach.

What developers do	What they should do
Implementing OT from scratch for a new project OT implementations are notoriously bug-prone. Even published academic algorithms have been proven incorrect. Unless you're at Google scale with a dedicated team, a CRDT library gives you convergence guarantees without the risk.	Use a battle-tested CRDT library like Yjs or Automerge
Thinking OT and CRDTs are interchangeable The choice between OT and CRDTs determines your entire system architecture — server topology, offline support, sync protocol, conflict resolution strategy.	Understand the architectural implications: OT needs a central server, CRDTs work decentralized
Using position-based operations without transformation Raw position-based operations (insert at index 5) break when concurrent operations shift indices. You must either adjust positions via transformation or eliminate position-dependency entirely.	Either transform positions (OT) or use position-independent identifiers (CRDTs)

Interview Question

Deep Dive: OT Transform Correctness

Given a document "ABC" and three concurrent operations: (1) Insert "X" at position 1, (2) Delete position 2, (3) Insert "Y" at position 2. Trace through all possible orderings and show that your transform functions produce the same final document. What happens if a transform function has a bug for the insert-insert-at-same-position case? Show how it breaks convergence.