DOM Construction and CSSOM

Advanced · 8 min read

Building the Trees That Power Every Page

The DOM and CSSOM are not abstract concepts — they are literal data structures that the browser builds, stores in memory, and queries thousands of times per second. Every querySelector, every computed style, every layout calculation operates on these trees. Understanding how they're built reveals why certain performance patterns matter.

Mental Model

Imagine you're reading a novel written in a foreign language. First, you recognize individual characters (byte decoding). Then you group characters into words (tokenizing). Then you understand each word's role — noun, verb, adjective (node creation). Finally, you build the sentence structure — who does what to whom (tree construction). The browser does exactly this with HTML and CSS, except it processes them character by character in a single pass.

The HTML Parsing Pipeline

HTML parsing follows a four-stage pipeline, with each stage feeding the next:

Stage 1: Bytes to Characters

The browser receives raw bytes from the network. It looks at the Content-Type header and any <meta charset> tag to determine the encoding (usually UTF-8), then decodes bytes into characters.

Raw bytes:   3C 68 31 3E 48 65 6C 6C 6F 3C 2F 68 31 3E
Characters:  < h 1 > H e l l o < / h 1 >
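You can reproduce this stage with the standard TextDecoder API. The sketch below assumes UTF-8 and splits the bytes into two chunks purely to mimic data arriving over the network; the chunk boundary is arbitrary.

```javascript
// The same 14 bytes shown above, as they might arrive from the network.
const rawBytes = new Uint8Array([
  0x3c, 0x68, 0x31, 0x3e, 0x48, 0x65, 0x6c,
  0x6c, 0x6f, 0x3c, 0x2f, 0x68, 0x31, 0x3e,
]);

// "utf-8" is assumed here; a real browser picks the encoding from the
// Content-Type header or the <meta charset> tag.
const decoder = new TextDecoder("utf-8");

// { stream: true } tells the decoder more bytes may follow, so a multi-byte
// character split across chunks would still decode correctly.
const firstChunk = decoder.decode(rawBytes.subarray(0, 6), { stream: true });
const secondChunk = decoder.decode(rawBytes.subarray(6));
const characters = firstChunk + secondChunk;

console.log(characters); // <h1>Hello</h1>
```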

Stage 2: Characters to Tokens

The tokenizer (a state machine) scans characters and emits tokens. The HTML spec defines the exact states and transitions: 80 tokenizer states in total.

Characters:  <h1>Hello</h1>
Tokens:      StartTag{h1} Character{Hello} EndTag{h1}

The tokenizer doesn't understand nesting. It produces a flat stream of tokens. Structure comes from the next stage.
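To make the state-machine idea concrete, here is a deliberately tiny tokenizer. It is a sketch with three made-up states, not the spec's algorithm: it ignores attributes, comments, and character references, and the token shape is an assumption chosen for illustration.

```javascript
// A toy tokenizer with three states. The real HTML tokenizer has many more
// states and handles attributes, comments, character references, and so on.
function tokenize(input) {
  const tokens = [];
  let state = "data"; // "data" | "tagOpen" | "tagName"
  let buffer = "";    // text or tag name collected so far
  let isEndTag = false;

  for (const ch of input) {
    if (state === "data") {
      if (ch === "<") {
        // Flush any pending text before the tag starts.
        if (buffer) tokens.push({ type: "Character", data: buffer });
        buffer = "";
        state = "tagOpen";
      } else {
        buffer += ch;
      }
    } else if (state === "tagOpen") {
      // A slash right after "<" marks an end tag.
      isEndTag = ch === "/";
      buffer = isEndTag ? "" : ch;
      state = "tagName";
    } else if (state === "tagName") {
      if (ch === ">") {
        tokens.push({ type: isEndTag ? "EndTag" : "StartTag", name: buffer.toLowerCase() });
        buffer = "";
        state = "data";
      } else {
        buffer += ch;
      }
    }
  }
  if (state === "data" && buffer) tokens.push({ type: "Character", data: buffer });
  return tokens;
}

const tokens = tokenize("<h1>Hello</h1>");
console.log(tokens.map((t) => `${t.type}{${t.name ?? t.data}}`).join(" "));
// StartTag{h1} Character{Hello} EndTag{h1}
```

Notice that the output is a flat array: nothing in it records that Hello sits inside h1.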

Stage 3: Tokens to Nodes

Each token is converted to a node object with properties and relationships. A StartTag{h1} becomes an Element node. Character{Hello} becomes a Text node.

Stage 4: Nodes to DOM Tree

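The tree construction algorithm takes nodes and builds the tree by tracking open elements on a stack. When a StartTag arrives, the parser creates the element, appends it to the element on top of the stack, and pushes it. When an EndTag arrives, it pops. The top of the stack is always the parent of the next node. (Void elements such as <img> and <br> are appended but do not stay on the stack, because they cannot have children.)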

Execution Trace

Token: StartTag{html} → push html onto stack
Stack: [html]

Token: StartTag{head} → push head onto stack
Stack: [html, head]

Token: StartTag{title} → push title
Stack: [html, head, title]

Token: Character{Hello} → create Text node, append to title
Stack: [html, head, title]

Token: EndTag{title} → pop title from stack
Stack: [html, head]

Token: EndTag{head} → pop head
Stack: [html]

Token: StartTag{body} → push body
Stack: [html, body]

Token: StartTag{h1} → push h1
Stack: [html, body, h1]

Token: Character{World} → create Text node, append to h1
Stack: [html, body, h1]

Token: EndTag{h1} → pop h1
Stack: [html, body]
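
The trace can be replayed with a few lines of code. This is a minimal sketch that assumes well-formed input; the token objects and the buildTree and outline helpers are hypothetical names, not browser APIs.

```javascript
// Build a tree from a flat token list using a stack of open elements.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // the top of the stack is the current parent

  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "StartTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element);
      stack.push(element);
    } else if (token.type === "Character") {
      parent.children.push({ name: "#text", data: token.data });
    } else if (token.type === "EndTag") {
      // Simplification: only pop when the end tag matches the top of the stack.
      if (stack.length > 1 && parent.name === token.name) stack.pop();
    }
  }
  return root;
}

// Render the tree as an indented outline, one node per line.
function outline(node, depth = 0) {
  const label = node.name === "#text" ? `"${node.data}"` : node.name;
  const lines = ["  ".repeat(depth) + label];
  for (const child of node.children ?? []) lines.push(...outline(child, depth + 1));
  return lines;
}

const tokens = [
  { type: "StartTag", name: "html" },
  { type: "StartTag", name: "head" },
  { type: "StartTag", name: "title" },
  { type: "Character", data: "Hello" },
  { type: "EndTag", name: "title" },
  { type: "EndTag", name: "head" },
  { type: "StartTag", name: "body" },
  { type: "StartTag", name: "h1" },
  { type: "Character", data: "World" },
  { type: "EndTag", name: "h1" },
];
const tree = buildTree(tokens);
console.log(outline(tree).join("\n"));
```

Printing the outline shows head and body as siblings under html, which is exactly the structure the flat token stream did not carry.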

HTML error recovery: the parser never fails

Unlike XML, HTML parsing never aborts on bad input. The spec defines exact recovery behavior for every possible malformed input. Missing </p> tag? The parser auto-closes the paragraph when it encounters a block-level element such as <div>. <table> inside a <p>? In standards mode, the parser closes the <p> first. Misnested formatting tags such as <b><i></b></i>? The adoption agency algorithm sorts them out. This error tolerance is why "view source" often looks different from the DOM: the parser fixes your markup silently. Recovery is not free, though: malformed HTML can push the parser through extra recovery steps, so valid markup remains the safer choice.
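
The auto-closing of <p> can be sketched by adding one recovery rule to a stack-based tree builder. Everything here is an illustrative assumption: the CLOSES_P set is a small hand-picked subset, and the real spec checks whether a p element is "in button scope" rather than only looking at the top of the stack.

```javascript
// Hypothetical, tiny subset of the start tags that implicitly close an open <p>.
const CLOSES_P = new Set(["p", "div", "h1", "ul", "table"]);

function buildWithRecovery(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root];

  for (const token of tokens) {
    if (token.type === "StartTag") {
      // Recovery rule: close an open <p> before inserting one of these elements.
      if (CLOSES_P.has(token.name) && stack[stack.length - 1].name === "p") {
        stack.pop();
      }
      const element = { name: token.name, children: [] };
      stack[stack.length - 1].children.push(element);
      stack.push(element);
    } else if (token.type === "EndTag") {
      // Pop back to the matching open element; ignore end tags with no match.
      const index = stack.map((n) => n.name).lastIndexOf(token.name);
      if (index > 0) stack.length = index;
    }
  }
  return root;
}

// Markup equivalent: <p><p><div></div>  (neither <p> is ever closed)
const recovered = buildWithRecovery([
  { type: "StartTag", name: "p" },
  { type: "StartTag", name: "p" },
  { type: "StartTag", name: "div" },
  { type: "EndTag", name: "div" },
]);
const siblings = recovered.children.map((n) => n.name);
console.log(siblings); // the two paragraphs and the div end up as siblings
```

No exception is thrown for the missing end tags; the builder just produces a different tree from the one the markup seems to describe.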

Speculative Parsing (Preload Scanner)

When the main parser is blocked by a synchronous <script>, the browser does not sit idle. A preload scanner (also called the speculative parser) continues scanning the HTML looking for resource URLs — images, stylesheets, scripts — and starts downloading them early.

<head>
  <link rel="stylesheet" href="styles.css">
  <script src="heavy-script.js"></script>
  <!-- Main parser is BLOCKED waiting for heavy-script.js -->
  <!-- But the preload scanner has already found these: -->
  <script src="analytics.js" defer></script>
  <link rel="stylesheet" href="above-fold.css">
</head>
<body>
  <img src="hero.jpg">
  <!-- All three start downloading in parallel while the main parser waits -->
</body>
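
Conceptually, the preload scanner is a lightweight pass that looks only for URLs and never builds a tree. The sketch below imitates that with a regular expression; scanForResources is a hypothetical helper, and a real scanner is a proper tokenizer that also considers attributes such as rel, media, and srcset.

```javascript
// A toy "preload scanner": it builds no tree, it only collects resource URLs.
function scanForResources(html) {
  const found = [];
  // Match <script src>, <img src>, and <link href> with double-quoted values.
  const pattern = /<(script|img|link)\b[^>]*?\b(?:src|href)="([^"]+)"/gi;
  for (const match of html.matchAll(pattern)) {
    found.push({ tag: match[1].toLowerCase(), url: match[2] });
  }
  return found;
}

const markup = `
  <link rel="stylesheet" href="styles.css">
  <script src="heavy-script.js"></script>
  <script src="analytics.js" defer></script>
  <link rel="stylesheet" href="above-fold.css">
  <img src="hero.jpg">
`;
const urls = scanForResources(markup).map((r) => r.url);
console.log(urls);
```

All five URLs are found in one pass, including the three that appear after the blocking script.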

Don't break the preload scanner

The preload scanner can only find resources declared in HTML markup. If you load resources dynamically from JavaScript (const img = new Image(); img.src = 'hero.jpg'), the preload scanner cannot discover them: the request starts only after the script itself has been downloaded and executed. For critical resources, declare them in HTML or use <link rel="preload">.
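
If your HTML is rendered on the server, one way to follow this advice is to emit preload hints from a list of known critical resources. The helper below is a sketch: preloadTag and the extension map are assumptions for illustration, and it expects simple, trusted URLs with no query strings.

```javascript
// Map a file extension to the `as` value a preload hint needs.
// This mapping is a small illustrative subset.
const AS_BY_EXTENSION = {
  css: "style",
  js: "script",
  jpg: "image",
  png: "image",
  woff2: "font",
};

function preloadTag(url) {
  const extension = url.split(".").pop().toLowerCase();
  const as = AS_BY_EXTENSION[extension];
  if (!as) throw new Error(`No preload type known for ${url}`);
  // Fonts are fetched in CORS mode, so font preloads need the crossorigin attribute.
  const crossorigin = as === "font" ? " crossorigin" : "";
  return `<link rel="preload" href="${url}" as="${as}"${crossorigin}>`;
}

const hint = preloadTag("hero.jpg");
console.log(hint); // <link rel="preload" href="hero.jpg" as="image">
```

Because the hint ends up in the HTML itself, the preload scanner can see it even when the image is later requested from JavaScript.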

CSSOM Construction

CSS follows a similar pipeline: bytes → characters → tokens → nodes → CSSOM tree. But there is a critical difference: a partially built CSSOM cannot safely be used for rendering.

HTML parsing is incremental: the browser can build part of the DOM and render it. CSS does not work that way. A rule at the end of a stylesheet can override a rule at the beginning, so the browser must process the entire stylesheet before it can determine the final computed style of any element.

/* Rule 1: Set color to blue */
h1 { color: blue; }

/* ... 5000 lines of CSS later ... */

/* Rule 5001: Override to red — without parsing this, computed style is wrong */
.hero h1 { color: red; }

This is why CSS is render-blocking. The browser does not know the final style of <h1> until every CSS rule has been processed.
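
To see why a partial stylesheet gives the wrong answer, here is a stripped-down cascade. It is a sketch under heavy assumptions: it handles only type and class selectors, ignores !important, origins, and layers, and takes the list of matching rules as given.

```javascript
// Specificity for a tiny selector subset: type and class selectors separated
// by spaces. Returns [classCount, typeCount].
function specificity(selector) {
  let classes = 0;
  let types = 0;
  for (const part of selector.trim().split(/\s+/)) {
    classes += (part.match(/\.[\w-]+/g) ?? []).length;
    if (/^[a-z]/i.test(part)) types += 1;
  }
  return [classes, types];
}

// Compare two keys element by element, like comparing version numbers.
function compare(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Higher specificity wins; on a tie, the rule later in source order wins.
function winningValue(matchingRules) {
  let winner = null;
  matchingRules.forEach((rule, order) => {
    const key = [...specificity(rule.selector), order];
    if (!winner || compare(key, winner.key) > 0) winner = { ...rule, key };
  });
  return winner.value;
}

// Rules that match an <h1> inside .hero, in source order.
const partial = [{ selector: "h1", value: "blue" }];
const complete = [...partial, { selector: ".hero h1", value: "red" }];

console.log(winningValue(partial));  // blue
console.log(winningValue(complete)); // red
```

With only the first rule available, the answer is blue; once the last rule is known, it flips to red. Rendering from the partial list would paint the wrong color and then have to repaint.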

The CSSOM Tree

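A common simplification, used here, pictures the CSSOM as a tree that mirrors the DOM and holds the resolved styles for each node. Strictly speaking, the CSSOM exposed to scripts (for example through document.styleSheets) is a tree of stylesheets and rules, and the per-element result of the cascade is the element's computed style. The mirrored picture is still a handy way to reason about what the browser has to work out for every DOM node that participates in rendering.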

CSSOM Tree:
  body
    font-size: 16px
    color: #333
    ├── h1
    │     font-size: 2em (→ 32px computed)
    │     color: red (overridden from blue)
    │     margin: 0.67em auto
    └── p
          font-size: 1em (→ 16px computed)
          line-height: 1.5

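In this model every value is computed: relative units such as em are resolved against the inherited font size, shorthand properties are expanded into their longhands, and inherit and initial keywords are replaced by concrete values. After this step there is no ambiguity about which declaration applies to an element, although some values, such as percentages or auto, only become final pixel sizes during layout.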
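
The font-size values in the tree above can be reproduced with a short recursive walk. This sketch handles only px and em, and the 16px starting size is an assumption standing in for the browser default; resolveFontSize and computeFontSizes are hypothetical helpers.

```javascript
// Resolve a font-size declaration against the parent's computed size.
// Only px and em are handled in this sketch.
function resolveFontSize(declared, parentPx) {
  const match = /^([\d.]+)(px|em)$/.exec(declared);
  if (!match) throw new Error(`Unsupported value: ${declared}`);
  const amount = Number(match[1]);
  return match[2] === "px" ? amount : amount * parentPx;
}

// Walk a style tree, attaching a computed font size (in px) to every node.
// A node without its own fontSize inherits the parent's computed value.
function computeFontSizes(node, parentPx = 16) {
  const computedPx = node.fontSize
    ? resolveFontSize(node.fontSize, parentPx)
    : parentPx;
  return {
    name: node.name,
    computedPx,
    children: (node.children ?? []).map((child) => computeFontSizes(child, computedPx)),
  };
}

const styled = computeFontSizes({
  name: "body",
  fontSize: "16px",
  children: [
    { name: "h1", fontSize: "2em" },
    { name: "p", fontSize: "1em" },
  ],
});
console.log(styled.children.map((n) => `${n.name}: ${n.computedPx}px`));
// h1 resolves to 32px and p to 16px
```

The important detail is that em is resolved against the parent's computed size, which is why the walk has to go top-down.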

Production Scenario: The Invisible Above-the-Fold

A SaaS dashboard loaded a 380KB CSS file (uncompressed) containing styles for every page in the app. On first navigation:

  1. HTML parsed in ~50ms
  2. CSS download took 800ms on a typical connection
  3. No rendering occurred until CSS was fully parsed
  4. Users saw a white screen for nearly a second

The fix: Extract critical CSS for each route (the styles needed for above-the-fold content) and inline it in <head>. Load the full stylesheet asynchronously.

<head>
  <!-- Critical CSS inlined — no network request needed -->
  <style>
    .dashboard-header { display: flex; height: 64px; }
    .sidebar { width: 240px; }
    .main-content { flex: 1; padding: 24px; }
    /* ~3KB of above-the-fold styles */
  </style>

  <!-- Full stylesheet loaded async — does not block rendering -->
  <link rel="stylesheet" href="full.css" media="print" onload="this.media='all'">
  <noscript><link rel="stylesheet" href="full.css"></noscript>
</head>

Result: First paint dropped from 950ms to 180ms.

What developers do vs. what they should do

What developers do: Assume the DOM is built all at once after HTML downloads
What they should do: Treat DOM construction as incremental; the parser emits nodes as it goes
Why: The browser starts building the DOM from the first bytes received, enabling progressive rendering

What developers do: Think a partially loaded stylesheet can be applied the way partial HTML can
What they should do: Remember that the browser needs the full stylesheet before it can use the CSSOM for rendering
Why: Later CSS rules can override earlier ones, so the browser must see all rules before computing final styles

What developers do: Load critical resources from JavaScript instead of HTML
What they should do: Declare critical resources in HTML markup or use <link rel="preload">
Why: The preload scanner can only discover resources in HTML, so JS-loaded resources start fetching later

What developers do: Rely on the preload scanner for dynamically injected resources
What they should do: Use <link rel="preload"> for resources you know you'll need
Why: The preload scanner only scans the HTML that arrives over the network, not dynamically generated content

Quiz

  1. The main HTML parser is blocked by a synchronous script. What happens to resource discovery?
  2. Why does the browser need to parse ALL CSS before rendering any element?

Key Rules

  1. DOM construction is incremental: the parser builds the tree as bytes arrive, enabling progressive rendering.
  2. CSSOM construction requires the complete stylesheet: later rules can override earlier ones, so a partial CSSOM cannot be used for rendering.
  3. The preload scanner continues discovering resources even when the main parser is blocked by a synchronous script.
  4. Resources loaded from JavaScript bypass the preload scanner and start downloading later than resources declared in HTML.
  5. Inline critical CSS to eliminate the CSS network request for above-the-fold rendering.
  6. Malformed HTML never causes parsing to fail: the spec defines exact recovery behavior, though recovery can cost the parser extra work.