What Is HTML?

Pop Quiz Before We Start

Quick — what does this display in a browser?

<p>Price: 5 < 10</p>

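If you said "Price: 5 < 10", you're right, but only thanks to error recovery. A < followed by a space can't start a tag, so the parser treats it as a parse error and falls back to reading it as literal text. Put a letter right after it, as in a<b, and the parser reads <b as the start of a tag and swallows the text that follows. Write &lt; whenever you mean a literal less-than sign.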
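
For reference, a minimal sketch of the escaped form. The second paragraph is a made-up example (not part of the lesson) that shows the same idea with &amp; and &gt; as well:

```html
<!-- Literal <, >, and & written as character references -->
<p>Price: 5 &lt; 10</p>
<p>Run the check only if a&lt;b &amp;&amp; b&gt;0.</p>
```

The browser displays these as "Price: 5 < 10" and "Run the check only if a<b && b>0." without ever guessing where a tag begins.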

HTML looks simple. But browsers are incredibly forgiving parsers, and that forgiveness hides bugs that bite you in production. Let's build the real mental model.

Mental Model

Think of HTML as a set of labeled containers — like a warehouse with nested boxes. Each box has a label on the outside (the tag name) describing what's inside, and some boxes can only fit inside certain other boxes. A paragraph box fits inside a section box. A section fits inside the body. The browser reads the labels, decides what each box is, and arranges them on screen accordingly. The labels are not the content — they describe the content.

What HTML Actually Is

HTML stands for HyperText Markup Language. Let's break that down:

  • HyperText: text that links to other text. That's the web's killer feature: clicking a link takes you somewhere new.
  • Markup: annotations that describe structure. Heading, paragraph, list, emphasis: these are structural descriptions, not visual instructions.
  • Language: a formal syntax with rules the browser can parse.

HTML is not a programming language. It has no variables, no loops, no logic. It's a declarative document format — you declare what things are, and the browser figures out how to display them.

<h1>Breaking News</h1>
<p>Something <strong>important</strong> happened today.</p>

You're not telling the browser "make this text big and bold." You're saying "this is a top-level heading" and "this word has strong importance." The browser's default stylesheet, plus any CSS you add, decides what that looks like.
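
One way to see that split is to keep the markup unchanged and alter only the styling. The sketch below assumes a small inline stylesheet; the specific rules are illustrative choices, not something the lesson prescribes:

```html
<style>
  /* Appearance lives here; the markup below does not change. */
  h1 { font-size: 1.25rem; font-weight: normal; }
  strong { font-weight: normal; color: darkred; }
</style>

<h1>Breaking News</h1>
<p>Something <strong>important</strong> happened today.</p>
```

The heading is now small and the important word is red rather than bold, yet the document still says exactly the same thing: one top-level heading, one paragraph, one strongly important word.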

Quiz
What does HTML primarily describe?

How the Browser Parses HTML

When the browser receives an HTML file, it doesn't just display text. It runs a sophisticated parsing algorithm defined in the HTML specification.

Here's the simplified process, using this small document as the input:

<!DOCTYPE html>
<html>
  <head>
    <title>My Page</title>
  </head>
  <body>
    <h1>Hello</h1>
    <p>World</p>
  </body>
</html>

Execution Trace

  1. Byte stream: the browser receives raw bytes from the network. Bytes are decoded into characters using the specified encoding (usually UTF-8).
  2. Tokenizer: characters are converted into tokens. Tokens include start tags, end tags, attributes, and text content.
  3. Tree builder: tokens are assembled into a DOM tree. The tree builder maintains a stack of open elements and handles nesting rules.
  4. DOM ready: the complete DOM tree is available. JavaScript can now query and manipulate the tree.
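
To make the last stage concrete, here is a minimal sketch that adds a script to the same document and reads from the finished tree. The script is an addition for illustration; it assumes you open the page in a browser and look at the DevTools console:

```html
<!DOCTYPE html>
<html>
  <head>
    <title>My Page</title>
    <script>
      // DOMContentLoaded fires once the parser has finished building the tree.
      document.addEventListener("DOMContentLoaded", () => {
        console.log(document.title);                                   // "My Page"
        console.log(document.querySelector("h1").textContent);         // "Hello"
        console.log(document.querySelector("p").parentNode.nodeName);  // "BODY"
      });
    </script>
  </head>
  <body>
    <h1>Hello</h1>
    <p>World</p>
  </body>
</html>
```

The script sits in the head, before the h1 and p exist, which is why it waits for the event instead of querying immediately.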

The parser is shockingly forgiving. Try this:

<p>First paragraph
<p>Second paragraph

No closing tags, and it still works. The parser sees the second <p> and automatically closes the first one. This is called implicit closing and it's part of the spec, not a browser quirk.
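
Written out with explicit end tags, the tree the parser builds from those two lines looks roughly like the first snippet below (ignoring whitespace). The list that follows is an added illustration: li is another element whose end tag the HTML specification allows you to omit.

```html
<!-- Equivalent structure for the two unclosed paragraphs -->
<p>First paragraph</p>
<p>Second paragraph</p>

<!-- Same mechanism: each <li> start tag closes the previous <li> -->
<ul>
  <li>Apples
  <li>Oranges
</ul>
```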

Common Trap

Just because the browser fixes your mistakes doesn't mean you should rely on it. Implicit closing rules are complex and differ by element. A missing closing tag in one context might work fine, but in another context it creates a completely different DOM tree than you intended. Always close your tags explicitly.

Quiz
What happens when a browser encounters HTML without closing tags?

Elements, Tags, and Attributes

These three terms get mixed up constantly. Let's nail them down:

<a href="https://example.com" target="_blank">Click me</a>
  • Element — the whole thing: opening tag + content + closing tag. The a element above includes everything from <a> to </a>.
  • Tag — the markup syntax. <a> is the opening tag. </a> is the closing tag. Tags are not elements — they delimit elements.
  • Attribute — metadata on the opening tag. href and target are attributes. They configure the element's behavior.
  • Content — what's between the tags. "Click me" is the text content.
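
The same breakdown applies to any element. As a sketch, here is a second, made-up example (a button, not taken from the lesson) with its parts labeled in a comment. It also shows a boolean attribute, which needs no value:

```html
<!--
  element:    everything from the start tag through the end tag
  start tag:  <button type="submit" disabled>
  attributes: type="submit" and disabled (boolean: present means true)
  content:    Save
  end tag:    </button>
-->
<button type="submit" disabled>Save</button>
```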

Some elements are void elements — they can't have content and don't get a closing tag:

<img src="photo.jpg" alt="A photo">
<br>
<input type="text">
<hr>
<meta charset="utf-8">
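
A related detail worth a short sketch: the XML-style trailing slash. On a void element it is allowed and ignored; on any other HTML element it is also ignored, which means it does not close the element. The class name here is a placeholder:

```html
<!-- Both lines produce one br element each; the slash changes nothing -->
<br>
<br />

<!-- The slash does NOT close a non-void element: this div stays open -->
<div class="box" />
<p>This paragraph ends up inside the div.</p>
```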

Nesting Rules: What Goes Where

HTML is not a free-for-all. There are strict rules about which elements can be children of which:

<!-- Valid: inline element inside block -->
<p>This is <strong>important</strong> text.</p>

<!-- Invalid: block element inside inline -->
<span><div>This breaks things</div></span>

<!-- Invalid: paragraph inside paragraph -->
<p>Outer <p>Inner</p></p>

The HTML spec defines content models that dictate nesting. The main categories:

  • Flow content — most elements (divs, paragraphs, headings)
  • Phrasing content — inline-level elements (spans, strongs, links)
  • Sectioning content — structural sections (article, section, nav, aside)

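The golden rule: an element whose content model is phrasing content can hold only phrasing content. A span can't hold a div. A strong can't hold an h1. A p can't hold a div either: p is itself flow content, but it accepts only phrasing content inside. The main exception is a, whose content model is transparent, so a link may wrap a div when the link's own parent allows one.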
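
As a rough sketch, the two invalid snippets above can be restructured like this. Whether these match the intended design is an assumption; the point is only that the nesting is now legal:

```html
<!-- Valid: the block-level wrapper goes on the outside -->
<div><span>This works</span></div>

<!-- Valid: sibling paragraphs instead of nested ones -->
<p>Outer</p>
<p>Inner</p>

<!-- Valid: group related paragraphs with a flow-content wrapper -->
<section>
  <p>Outer</p>
  <p>Inner</p>
</section>
```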

Quiz
Why is this HTML invalid: a p element containing a div element?

Production Scenario: When Bad HTML Causes Real Bugs

Here's a bug that actually ships to production more often than you'd think:

<p>
  Please read our
  <div class="highlight-box">
    <strong>Terms of Service</strong>
  </div>
  before continuing.
</p>

A developer wraps a highlight box inside a paragraph. Looks reasonable. But the browser's parser sees the div inside the p, knows that's invalid, and auto-closes the p before the div. The actual DOM becomes:

<p>Please read our</p>
<div class="highlight-box">
  <strong>Terms of Service</strong>
</div>
before continuing.
<p></p>

Now any CSS that targets that p styles only "Please read our". The "before continuing" text is an orphaned text node outside any paragraph, an empty p trails behind it, and DevTools shows a completely different tree than your source code.
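
Two possible repairs, sketched under the assumption that highlight-box only needs to style the text and does not depend on being a div:

```html
<!-- Option 1: keep the paragraph and use a phrasing element inside it -->
<p>
  Please read our
  <span class="highlight-box">
    <strong>Terms of Service</strong>
  </span>
  before continuing.
</p>

<!-- Option 2: keep the inner div and make the wrapper a div too -->
<div>
  Please read our
  <div class="highlight-box">
    <strong>Terms of Service</strong>
  </div>
  before continuing.
</div>
```

In both versions the DOM matches the source, so selectors behave the way the markup suggests. If the highlight box needs block layout in option 1, that can be set in CSS with display: block or display: inline-block.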

What developers do, and what they should do instead:

  • What developers do: put block-level elements inside p tags. Why it hurts: the parser auto-closes p when it encounters a block element, creating unexpected DOM structure. What they should do: use span or other phrasing elements inside p, or restructure with div as the wrapper.
  • What developers do: omit alt attributes on images. Why it hurts: screen readers need alt text, and a missing alt is an accessibility violation (WCAG 1.1.1). What they should do: always include alt, using an empty alt for decorative images.
  • What developers do: use tags for visual styling instead of meaning. Why it hurts: using br tags for spacing or h1 for "big text" creates accessibility and SEO problems. What they should do: use semantic elements for structure and CSS for appearance.
  • What developers do: skip the DOCTYPE declaration. Why it hurts: without it, browsers enter "quirks mode", a legacy rendering mode with unpredictable behavior differences. What they should do: always start with the doctype declaration.
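
A short sketch that applies the last three recommendations in one page. The file names, class name, alt text, and spacing value are placeholders chosen for illustration:

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Team</title>
    <style>
      /* Spacing comes from CSS, not from stacked <br> tags */
      .intro { margin-bottom: 2rem; }
    </style>
  </head>
  <body>
    <h1>Meet the team</h1>
    <p class="intro">We answer support requests every weekday.</p>

    <!-- Informative image: alt describes what it shows -->
    <img src="team-photo.jpg" alt="The support team at their desks">

    <!-- Decorative image: empty alt tells screen readers to skip it -->
    <img src="divider.png" alt="">
  </body>
</html>
```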

Challenge: Fix the Broken HTML

This HTML has several problems. Can you spot and fix all of them?

<html>
  <body>
    <h1>Welcome to my site
    <p>This is <b>bold and <i>italic</b> text</i></p>
    <img src="logo.png">
    <p>Click <div class="btn">here</div> to continue</p>
  </body>
</html>

There are five issues:

  1. Missing DOCTYPE — add <!DOCTYPE html> at the top
  2. Missing closing h1 tag — add </h1> after "my site"
  3. Misnested tags — the b and i tags overlap. Tags must nest properly: <b><i>italic</i></b>
  4. Missing alt attribute — the img needs alt="Site logo" (or alt="" if decorative)
  5. Block element inside p — the div inside the p will cause the parser to auto-close the paragraph. Use a span instead: <span class="btn">here</span>

Fixed version:

<!DOCTYPE html>
<html>
  <body>
    <h1>Welcome to my site</h1>
    <p>This is <b>bold and <i>italic</i></b> text</p>
    <img src="logo.png" alt="Site logo">
    <p>Click <span class="btn">here</span> to continue</p>
  </body>
</html>
Key Rules
  1. HTML describes structure and meaning, not visual appearance; CSS handles styling
  2. The browser's parser is error-tolerant and will always produce a DOM tree, but its error recovery may not match your intent
  3. Elements, tags, and attributes are different things: tags delimit elements, attributes configure them
  4. Void elements like img, br, and input cannot have content or closing tags
  5. Nesting rules matter: phrasing content elements cannot contain flow content, and the parser will auto-close and restructure your DOM