Generational Garbage Collection
Most Objects Die Young
If you could watch every object your JavaScript app creates, you'd see something wild: the vast majority of them — often 80-95% — become garbage within milliseconds of being born. They live fast and die young.
function processItems(items) {
  // These objects are born and die within one function call:
  const mapped = items.map(item => ({ ...item, processed: true }));
  const filtered = mapped.filter(item => item.valid);
  const result = filtered.reduce((acc, item) => acc + item.value, 0);
  // 'mapped' and 'filtered' arrays + all spread copies are now garbage
  return result;
}
Every .map() call creates a new array and N new objects. Every .filter() creates another array. Most of these objects live for microseconds. A few objects — module-level state, cached data, DOM references — live for the entire application lifetime.
This is the generational hypothesis: most objects die young, and the survivors tend to live long. And this observation is so consistently true across virtually all programs that V8 built its entire garbage collection strategy around it.
Think of the heap as a hospital with an Emergency Room and an ICU. The ER (Young Generation) sees a flood of patients — most are treated and released in minutes. The few who need long-term care are transferred to the ICU (Old Generation), where more thorough but less frequent care is provided. Running ICU-level diagnostics on every ER patient would be catastrophically slow.
V8's Heap Architecture
Let's look at how V8 actually organizes memory. It divides the heap into several spaces, each with a specific purpose:
┌──────────────────────────────────────────┐
│ V8 Heap │
├────────────────────┬─────────────────────┤
│ Young Generation │ Old Generation │
│ (New Space) │ (Old Space) │
│ ┌──────┬───────┐ │ ┌───────────────┐ │
│ │ From │ To │ │ │ Old Pointer │ │
│ │ Space│ Space │ │ │ Space │ │
│ │1-8MB │1-8MB │ │ ├───────────────┤ │
│ └──────┴───────┘ │ │ Old Data │ │
│ │ │ Space │ │
│ │ ├───────────────┤ │
│ │ │ Large Object │ │
│ │ │ Space │ │
│ │ ├───────────────┤ │
│ │ │ Code Space │ │
│ │ ├───────────────┤ │
│ │ │ Map Space │ │
│ │ └───────────────┘ │
└────────────────────┴─────────────────────┘
- New Space: 1-8 MB, split into two semi-spaces. All new objects are allocated here (unless too large)
- Old Pointer Space: objects that survived GC and contain pointers to other objects
- Old Data Space: objects that survived GC and contain only data (no pointers) — strings, boxed numbers
- Large Object Space: objects larger than a page size (~512 KB). Never moved — too expensive to copy
- Code Space: compiled machine code (JIT output)
- Map Space: hidden class (Map) objects
The Scavenger: Young Generation Collection
New Space uses a semi-space copying collector called the Scavenger. The algorithm is surprisingly elegant — and once you see it, you'll understand why V8 is so fast at handling short-lived objects:
Allocation: Bump Pointer
New objects are allocated with a bump allocator — the simplest possible allocation strategy:
if (allocation_pointer + object_size > space_end) trigger_scavenge();
address = allocation_pointer;
allocation_pointer += object_size;
return address;
One pointer increment. That's it. No free-list scanning, no fragmentation management, no free-block coalescing. This is nearly as fast as stack allocation — the bump pointer approach works because the semi-space is a contiguous memory region. In practice, allocating a JavaScript object is almost free.
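As a toy model, bump allocation fits in a few lines of JavaScript (purely illustrative: V8 does this in C++ on raw memory, and the class and names here are invented for the sketch):

```javascript
// Toy bump allocator: the allocation "address" is just an offset into
// a contiguous region, and allocating costs one addition.
class BumpAllocator {
  constructor(size) {
    this.spaceEnd = size;
    this.allocationPointer = 0;
  }
  allocate(objectSize) {
    if (this.allocationPointer + objectSize > this.spaceEnd) {
      return null; // in V8 this triggers a scavenge instead
    }
    const address = this.allocationPointer;
    this.allocationPointer += objectSize; // the entire cost: one addition
    return address;
  }
}

const semiSpace = new BumpAllocator(1024);
console.log(semiSpace.allocate(64)); // 0
console.log(semiSpace.allocate(32)); // 64
```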
Collection: Copy the Living, Forget the Dead
When the active semi-space (From-Space) fills up, the Scavenger runs:

- Trace from the roots (plus the remembered set, covered below) into New Space.
- Copy each live object it reaches into To-Space, leaving a forwarding pointer at the old address so other references to the same object find the copy.
- Swap the two semi-spaces: To-Space becomes the new From-Space, and the entire old From-Space is discarded at once.
The cost of the Scavenger is proportional to the number of live objects, not the heap size. If 90% of objects are dead, the Scavenger only copies the surviving 10%. Dead objects are never visited — their memory is reclaimed for free when the semi-space is wiped.
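The copy step can be sketched as a toy model in JavaScript (purely illustrative: objects here are plain records, and recursion stands in for the scan-pointer worklist that Cheney-style collectors actually use):

```javascript
// Toy semi-space copy: trace from roots, copy live objects to to-space,
// and record forwardings so shared references are copied only once.
function scavenge(roots) {
  const toSpace = [];
  const forwarding = new Map(); // from-space object -> its to-space copy

  function copy(obj) {
    if (obj === null) return null;
    if (forwarding.has(obj)) return forwarding.get(obj); // already moved
    const clone = { value: obj.value, child: null };
    forwarding.set(obj, clone); // leave a "forwarding pointer"
    toSpace.push(clone);
    clone.child = copy(obj.child); // trace children
    return clone;
  }

  return { roots: roots.map(copy), toSpace };
}

// 'temp' is never referenced from a root, so it is never visited at all.
const temp = { value: 'temp', child: null };
const live = { value: 'kept', child: { value: 'also kept', child: null } };
const { toSpace } = scavenge([live]);
console.log(toSpace.length); // 2: the dead object cost us nothing
```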
Parallel Scavenging
Modern V8 (post-2018) runs the Scavenger in parallel across multiple threads. The main thread and several worker threads cooperate to trace and copy live objects simultaneously. This reduces Scavenger pause times from ~2ms to ~0.5ms in typical workloads.
Promotion: Young to Old
So what happens to the objects that don't die young? Objects that survive two Scavenger cycles are promoted (tenured) to the Old Generation. V8 tracks survival with an age bit on each object:
First Scavenge: Object survives → age bit set to 1
Second Scavenge: Object still alive, age bit = 1 → promote to Old Space
// This object will be promoted
const cache = new Map(); // Lives for the entire application → promoted after 2 GC cycles

// This object will never be promoted
function render() {
  const styles = { color: 'red', fontSize: 14 }; // Dies when render() returns
  applyStyles(styles);
}
The Promotion Problem
Here's the thing most people miss about promotion. It creates a tricky complication: Old Generation objects can now point to Young Generation objects. But the Scavenger only traces from roots into New Space — it doesn't scan the entire Old Generation (that would defeat the purpose of generational GC).
Write barriers solve this. Whenever a write stores a reference from an Old object to a Young object, a write barrier records this in a remembered set:
const oldObj = {}; // Already promoted to Old Space
const youngObj = { data: 42 }; // In New Space
oldObj.ref = youngObj; // Write barrier fires → record this cross-generational reference
During Scavenging, V8 traces from both the regular root set AND the remembered set. This ensures that Young objects referenced only by Old objects are correctly identified as live.
Every store that might create a cross-generational reference must execute a write barrier — a small check that adds ~2-5 ns per write. In write-heavy code (building large data structures), write barriers can add measurable overhead. This is one reason V8 keeps New Space relatively small: fewer promotions mean fewer cross-generational references and fewer write barrier triggers.
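A minimal sketch of the idea in JavaScript (the remembered set here is just a Set, and the `gen` tags stand in for "which space the object lives in"; real V8 does this check in generated code and tracks references per memory page):

```javascript
// Toy write barrier: every store checks whether it created an
// old -> young reference, and if so records the holder.
const rememberedSet = new Set();

function writeBarrier(holder, field, value) {
  holder[field] = value; // the actual store
  if (holder.gen === 'old' && value && value.gen === 'young') {
    rememberedSet.add(holder); // extra root for the next scavenge
  }
}

const oldObj = { gen: 'old', ref: null };    // promoted earlier
const youngObj = { gen: 'young', data: 42 }; // freshly allocated
writeBarrier(oldObj, 'ref', youngObj);
console.log(rememberedSet.has(oldObj)); // true
```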
Old Generation: Mark-Sweep-Compact
The Old Generation plays by completely different rules. A copying collector like the Scavenger would be a poor fit here, because:
- Most Old objects are long-lived — copying them every collection is wasteful
- Old Space is much larger (hundreds of MB) — a semi-space approach would waste half the memory
- Old GC runs less frequently — higher per-collection cost is acceptable
Phase 1: Marking
Starting from GC roots (global object, stack frames, handles), the marker visits every reachable object and sets a mark bit:
Roots → [Global] → [moduleA.cache] → [Map entry 1] → [cached value]
→ [moduleB.state] → [subscriber list] → ...
Every reachable object gets marked. Unmarked objects are garbage.
Phase 2: Sweeping
V8 walks through Old Space pages and builds a free list from unmarked (dead) object regions:
Page: [LIVE][dead][dead][LIVE][dead][LIVE][dead][dead][dead]
Free list: [dead@2, size:2], [dead@5, size:1], [dead@7, size:3]
Sweeping doesn't move objects — it just records free regions for future allocation.
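The sweep phase can be modeled as a tiny function that turns a page's mark bits into a free list (a sketch; real V8 sweeps mark bitmaps over raw pages, and note that the diagram above numbers slots from 1 while this sketch uses 0-based offsets):

```javascript
// Toy sweep: walk a page's mark bits and coalesce each run of dead
// slots into one free-list entry. Nothing moves.
function sweep(marks) { // marks[i] === true means slot i is live
  const freeList = [];
  let i = 0;
  while (i < marks.length) {
    if (marks[i]) { i++; continue; }
    const start = i;
    while (i < marks.length && !marks[i]) i++;
    freeList.push({ offset: start, size: i - start });
  }
  return freeList;
}

// The page from the diagram above:
const marks = [true, false, false, true, false, true, false, false, false];
console.log(sweep(marks));
// [ { offset: 1, size: 2 }, { offset: 4, size: 1 }, { offset: 6, size: 3 } ]
```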
Phase 3: Compaction (Optional)
If fragmentation is too high (too many small free regions that can't satisfy allocation requests), V8 compacts by moving live objects together:
Before: [LIVE][ ][LIVE][ ][LIVE][ ][LIVE]
After: [LIVE][LIVE][LIVE][LIVE][ ]
Compaction is expensive (requires updating all pointers to moved objects) and only runs when fragmentation exceeds a threshold.
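A toy version of the slide-and-forward idea (a sketch: slots in an array stand in for objects on a page, and null stands in for free space):

```javascript
// Toy compaction: slide live slots to the front of the page and build a
// forwarding table mapping old offsets to new ones. Every pointer into
// the page must then be rewritten, which is why compaction needs a pause.
function compact(page) {
  const forwarding = new Map();
  let dest = 0;
  for (let src = 0; src < page.length; src++) {
    if (page[src] !== null) {
      forwarding.set(src, dest);
      page[dest++] = page[src];
    }
  }
  for (let i = dest; i < page.length; i++) page[i] = null;
  return forwarding;
}

const page = ['LIVE1', null, 'LIVE2', null, 'LIVE3'];
const fwd = compact(page);
console.log(page);       // [ 'LIVE1', 'LIVE2', 'LIVE3', null, null ]
console.log(fwd.get(4)); // 2: the object at offset 4 moved to offset 2
```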
Concurrent and Incremental Marking
Now here's the engineering problem that keeps V8 developers up at night. A full marking pause on a 200 MB Old Space could take 50-100ms — that's multiple dropped frames, visible jank, angry users. V8 uses two clever techniques to minimize pause time:
Incremental Marking
Instead of marking the entire heap in one pause, V8 interleaves small marking steps with JavaScript execution:
[JS 5ms] [Mark 1ms] [JS 5ms] [Mark 1ms] [JS 5ms] [Mark 1ms] ... [Final mark 0.5ms]
Each incremental step marks a few hundred objects, then yields back to JavaScript. The total marking time is the same, but no single pause exceeds ~1ms.
Concurrent Marking
V8 runs marking on a separate thread while JavaScript executes on the main thread:
Main thread: [JS running normally ........................]
Worker thread: [ marking objects in parallel ][done]
Main thread: [tiny pause to finalize marking][JS continues]
The final pause only needs to re-trace objects that were modified during concurrent marking (using write barriers that log mutations).
Tri-Color Marking and the Write Barrier
Concurrent and incremental marking use a tri-color abstraction:
- White: not yet visited (presumed garbage until proven otherwise)
- Gray: visited but children not yet scanned
- Black: visited and all children scanned
The marking worklist is the set of gray objects. The algorithm processes gray objects by scanning their children (turning children gray) and then turning the processed object black.
The danger: while the marker runs concurrently, JavaScript might store a reference from a black object to a white object. The marker already finished scanning the black object, so it won't revisit it — the white object would be incorrectly collected.
The write barrier catches this: when a black object stores a reference to a white object, the barrier turns the black object back to gray (re-enqueues it for scanning). This guarantees no live object is missed.
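The invariant is easier to see in a toy marker. This sketch (invented names; plain objects with a refs array) processes the gray worklist one object at a time, and its write barrier re-grays a black object that gains a new edge mid-mark:

```javascript
// Toy tri-color marker. Absent from `color` = white (presumed garbage).
function createMarker(roots) {
  const color = new Map();
  const grayList = [];
  for (const r of roots) { color.set(r, 'gray'); grayList.push(r); }

  return {
    step() { // scan one gray object; returns false when marking is done
      const obj = grayList.pop();
      if (!obj) return false;
      for (const child of obj.refs) {
        if (!color.has(child)) { color.set(child, 'gray'); grayList.push(child); }
      }
      color.set(obj, 'black');
      return true;
    },
    writeBarrier(holder, child) { // runs on every store while marking
      holder.refs.push(child);
      if (color.get(holder) === 'black' && !color.has(child)) {
        color.set(holder, 'gray'); // black -> gray: rescan its children
        grayList.push(holder);
      }
    },
    isLive: (obj) => color.has(obj),
  };
}

const a = { refs: [] }, b = { refs: [] };
const marker = createMarker([a]);
marker.step();             // a is black, b is still white
marker.writeBarrier(a, b); // mutator stores a -> b mid-mark
while (marker.step());     // finish marking
console.log(marker.isLive(b)); // true: the barrier saved b
```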
Parallel Sweeping and Compaction
Sweeping is embarrassingly parallel — each memory page can be swept independently. V8 runs sweeping on background threads while JavaScript executes:
Main thread: [JS ...... JS ...... JS ...... JS]
Sweep thread 1: [sweep page 1][sweep page 4][...]
Sweep thread 2: [sweep page 2][sweep page 5][...]
Sweep thread 3: [sweep page 3][sweep page 6][...]
JavaScript only pauses briefly if it needs to allocate on a page that hasn't been swept yet.
Compaction is also parallelized but requires a brief pause to update all pointers atomically.
GC Pauses in Practice
Modern V8 (post-Orinoco) GC pause times for typical web applications:
| Collection Type | Typical Pause | Frequency |
|---|---|---|
| Scavenge (Young GC) | 0.5-2 ms | Every few seconds |
| Incremental Mark step | 0.5-1 ms | Interleaved with JS |
| Final marking pause | 1-5 ms | When marking completes |
| Sweeping | Near-zero (concurrent) | After marking |
| Compaction | 2-10 ms | Infrequent |
Total visible pause per major GC: typically under 5ms on modern hardware. This is a dramatic improvement from early V8, which used a stop-the-world mark-sweep with pauses of 50-200ms.
Key Rules
1. V8 splits the heap into Young Generation (New Space, 1-8 MB) and Old Generation (hundreds of MB). Different collection strategies for each.
2. Young Generation uses semi-space copying (Scavenger). Cost is O(live objects) — dead objects are free to reclaim.
3. New objects are allocated with a bump pointer — nearly stack-speed allocation. No free-list overhead.
4. Objects surviving 2 Scavenge cycles are promoted to Old Space. Write barriers track cross-generational references.
5. Old Generation uses Mark-Sweep-Compact. Marking finds live objects, sweeping builds free lists from dead regions, compaction defragments.
6. Concurrent marking runs on background threads while JavaScript executes. Write barriers ensure correctness when JS mutates the heap during marking.
7. Incremental marking breaks the marking phase into small steps interleaved with JS execution, capping individual pauses at ~1ms.
8. Modern V8 GC pauses are typically under 5ms for major collections. Scavenge pauses are under 2ms.