What Is HTML?

Beginner · 12 min read

Pop Quiz Before We Start

Quick — what does this display in a browser?

<p>Price: 5 < 10</p>

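If you said "Price: 5 < 10", you're right, but only thanks to error recovery. A bare < in text is a parse error. Here the character after it is a space rather than a letter, so the parser gives up on the tag and emits the < as plain text. Put a letter right after it, as in a<b, and the browser starts parsing a tag: the text that follows is swallowed as a tag name and attributes, and the output is broken. The safe habit is to always write &lt; for a literal less-than sign.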
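As a small illustration of the escaped form, here is the quiz line written safely. The second and third lines are extra examples that are not part of the quiz; they show the other two characters worth escaping in text.

```html
<!-- &lt; renders as <, &gt; renders as >, &amp; renders as & -->
<p>Price: 5 &lt; 10</p>
<p>Tom &amp; Jerry</p>
<p>if a &lt; b and c &gt; d</p>
```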

HTML looks simple. But browsers are incredibly forgiving parsers, and that forgiveness hides bugs that bite you in production. Let's build the real mental model.

Mental Model

Think of HTML as a set of labeled containers — like a warehouse with nested boxes. Each box has a label on the outside (the tag name) describing what's inside, and some boxes can only fit inside certain other boxes. A paragraph box fits inside a section box. A section fits inside the body. The browser reads the labels, decides what each box is, and arranges them on screen accordingly. The labels are not the content — they describe the content.

What HTML Actually Is

HTML stands for HyperText Markup Language. Let's break that down:

HyperText — text that links to other text. That's the web's killer feature: clicking a link takes you somewhere new.
Markup — annotations that describe structure. Bold, heading, paragraph, list — these are structural descriptions, not visual instructions.
Language — a formal syntax with rules the browser can parse.

HTML is not a programming language. It has no variables, no loops, no logic. It's a declarative document format — you declare what things are, and the browser figures out how to display them.

<h1>Breaking News</h1>
<p>Something <strong>important</strong> happened today.</p>

You're not telling the browser "make this text big and bold." You're saying "this is a top-level heading" and "this word carries strong importance." The browser (and CSS) decides what that looks like.
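One way to see the split between meaning and appearance is to restyle the same markup. The sketch below assumes a hypothetical stylesheet (the rules and the page title are made up for illustration). The heading no longer looks big or bold, but it is still a top-level heading to the browser, to screen readers, and to search engines.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Structure vs. appearance</title>
    <style>
      /* Hypothetical styling: the appearance changes, the meaning does not */
      h1 { font-size: 1rem; font-weight: normal; }
      strong { font-weight: normal; color: crimson; }
    </style>
  </head>
  <body>
    <h1>Breaking News</h1>
    <p>Something <strong>important</strong> happened today.</p>
  </body>
</html>
```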

Quiz

What does HTML primarily describe?


How the Browser Parses HTML

When the browser receives an HTML file, it doesn't just display text. It runs a sophisticated parsing algorithm defined in the HTML specification.

Here's the simplified process, using this small page as the input:

<!DOCTYPE html>
<html>
  <head>
    <title>My Page</title>
  </head>
  <body>
    <h1>Hello</h1>
    <p>World</p>
  </body>
</html>

Execution Trace

1. Byte stream: the browser receives raw bytes from the network and decodes them into characters using the specified encoding (usually UTF-8).
2. Tokenizer: characters are converted into tokens. Tokens include start tags (with their attributes), end tags, and text content.
3. Tree builder: tokens are assembled into a DOM tree. The tree builder maintains a stack of open elements and handles nesting rules.
4. DOM ready: the complete DOM tree is available, and JavaScript can now query and manipulate the tree.
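To make the last step concrete, here is a sketch that adds an inline script to the same page. The script, the meta tag, and the log call are illustrative additions, not part of the original example. DOMContentLoaded is the standard event that fires once the document has been fully parsed.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My Page</title>
  </head>
  <body>
    <h1>Hello</h1>
    <p>World</p>
    <script>
      // Runs after the parser has finished building the DOM tree
      document.addEventListener("DOMContentLoaded", () => {
        const heading = document.querySelector("h1");
        console.log(heading.textContent); // logs "Hello"

        // The tree can be changed as well as read
        heading.textContent = "Hello again";
      });
    </script>
  </body>
</html>
```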

The parser is shockingly forgiving. Try this:

<p>First paragraph
<p>Second paragraph

No closing tags, and it still works. The parser sees the second <p> and automatically closes the first one. This is called implicit closing and it's part of the spec, not a browser quirk.
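Roughly speaking, the parser builds the same tree as if the end tags had been written out. The sketch below puts the two forms side by side; whitespace details are glossed over.

```html
<!-- What was written -->
<p>First paragraph
<p>Second paragraph

<!-- Equivalent explicit markup for the tree the parser builds -->
<p>First paragraph</p>
<p>Second paragraph</p>
```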

Common Trap

Just because the browser fixes your mistakes doesn't mean you should rely on it. Implicit closing rules are complex and differ by element. A missing closing tag in one context might work fine, but in another context it creates a completely different DOM tree than you intended. Always close your tags explicitly.
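Here is a hypothetical example of error recovery that produces a tree the author did not intend. The strong element is never closed, so when the parser starts the second paragraph it reopens strong inside it. The sentences are made up for illustration, and the second half shows the approximate resulting tree.

```html
<!-- Source: the strong element is never closed -->
<p>Read the <strong>warning.
<p>This paragraph was meant to be plain.

<!-- Approximate resulting tree: strong is reopened in the next paragraph -->
<p>Read the <strong>warning.</strong></p>
<p><strong>This paragraph was meant to be plain.</strong></p>
```

Nothing in the source asks for the second paragraph to be bold, yet that is how it renders.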

Quiz

What happens when a browser encounters HTML without closing tags?


Elements, Tags, and Attributes

These three terms get mixed up constantly. Let's nail them down:

<a href="https://example.com" target="_blank">Click me</a>

Element — the whole thing: opening tag + content + closing tag. The a element above includes everything from <a> to </a>.
Tag — the markup syntax. <a> is the opening tag. </a> is the closing tag. Tags are not elements — they delimit elements.
Attribute — metadata on the opening tag. href and target are attributes. They configure the element's behavior.
Content — what's between the tags. "Click me" is the text content.
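The same link, annotated with the four terms, may help them stick. The second line is an extra illustration (the note class name is hypothetical) showing that an element's content can contain other elements, not just text.

```html
<!--
  Element:    everything from <a ...> through </a>
  Start tag:  <a href="https://example.com" target="_blank">
  Attributes: href="https://example.com" and target="_blank"
  Content:    Click me
  End tag:    </a>
-->
<a href="https://example.com" target="_blank">Click me</a>

<!-- The content of this p element is text plus a nested em element -->
<p class="note">Content can include <em>other elements</em> too.</p>
```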

Some elements are void elements — they can't have content and don't get a closing tag:

<img src="photo.jpg" alt="A photo">
<br>
<input type="text">
<hr>
<meta charset="utf-8">
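A related point that often causes confusion is the XML-style trailing slash. The sketch below is an illustration, with a hypothetical box class: on a void element the slash is allowed and does nothing, while on a non-void element such as div the HTML parser ignores it, so the element stays open.

```html
<!-- Void elements: the trailing slash is allowed but has no effect -->
<br>
<br />

<!-- Not void: the slash is ignored, so this div is NOT closed here -->
<div class="box" />
<p>This paragraph ends up inside the div above.</p>
</div>
```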

Quiz

Which of these is a void element (no closing tag allowed)?


Nesting Rules: What Goes Where

HTML is not a free-for-all. There are strict rules about which elements can be children of which:

<!-- Valid: inline element inside block -->
<p>This is <strong>important</strong> text.</p>

<!-- Invalid: block element inside inline -->
<span><div>This breaks things</div></span>

<!-- Invalid: paragraph inside paragraph -->
<p>Outer <p>Inner</p></p>

The HTML spec defines content models that dictate nesting. The main categories:

Flow content — most elements (divs, paragraphs, headings)
Phrasing content — inline-level elements (spans, strongs, links)
Sectioning content — structural sections (article, section, nav, aside)

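The golden rule: elements that accept only phrasing content cannot contain flow content. A span can't hold a div. A strong can't hold an h1. A p can't hold a div either, because p itself accepts only phrasing content. A few elements, such as a, have a transparent content model and simply inherit the rules of their parent.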
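For contrast with the invalid snippets above, here is a sketch of nesting that follows the content models. The text and the link target are placeholders.

```html
<!-- Flow content can hold phrasing content -->
<div><span>This is fine</span></div>

<!-- Sibling paragraphs instead of a paragraph inside a paragraph -->
<p>Outer</p>
<p>Inner</p>

<!-- Sectioning content wrapping flow content, which wraps phrasing content -->
<article>
  <h2>Title</h2>
  <p>Body text with a <a href="https://example.com">link</a>.</p>
</article>
```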

Quiz

Why is this HTML invalid: a p element containing a div element?


Production Scenario: When Bad HTML Causes Real Bugs

Here's a bug that actually ships to production more often than you'd think:

<p>
  Please read our
  <div class="highlight-box">
    <strong>Terms of Service</strong>
  </div>
  before continuing.
</p>

A developer wraps a highlight box inside a paragraph. Looks reasonable. But the browser's parser sees the div inside the p, knows that's invalid, and auto-closes the p before the div. The actual DOM becomes:

<p>Please read our</p>
<div class="highlight-box">
  <strong>Terms of Service</strong>
</div>
before continuing.
<p></p>

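Now your CSS that styles p elements breaks. The "before continuing" text is an orphaned text node. The closing </p> in your source has no open paragraph left to close, so the parser creates an empty p element for it. Your layout is busted, and DevTools shows a completely different tree than your source code.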
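Two possible repairs are sketched below. The class name is reused from the example, and which option fits depends on the layout you want, so treat both as starting points rather than the one correct fix.

```html
<!-- Option 1: keep the paragraph and use a phrasing element for the highlight -->
<p>
  Please read our
  <span class="highlight-box">
    <strong>Terms of Service</strong>
  </span>
  before continuing.
</p>

<!-- Option 2: make a div the wrapper and keep each piece valid inside it -->
<div>
  <p>Please read our</p>
  <div class="highlight-box">
    <strong>Terms of Service</strong>
  </div>
  <p>before continuing.</p>
</div>
```

With option 1, a CSS rule such as display: inline-block on the span can still give it a box-like appearance.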

What developers do	Why it's a problem	What they should do
Putting block-level elements inside p tags	The parser auto-closes p when it encounters a block element, creating unexpected DOM structure	Use span or other phrasing elements inside p, or restructure with div as the wrapper
Omitting alt attributes on images	Screen readers need alt text. Missing alt is an accessibility violation (WCAG 1.1.1)	Always include alt; use empty alt for decorative images
Using tags for visual styling instead of meaning	Using br tags for spacing or h1 for 'big text' creates accessibility and SEO problems	Use semantic elements for structure, CSS for appearance
Skipping the DOCTYPE declaration	Without it, browsers enter 'quirks mode', a legacy rendering mode with unpredictable behavior differences	Always start with the doctype declaration
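The "should do" column can be sketched in one small page. The file names, class name, and text below are hypothetical placeholders.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Example page</title>
    <style>
      /* Spacing comes from CSS, not from stacked br tags */
      .intro { margin-bottom: 2rem; }
    </style>
  </head>
  <body>
    <!-- h1 because this is the page's main heading, not because it is big -->
    <h1>Page title</h1>
    <p class="intro">Intro text.</p>

    <!-- Informative image: describe what it shows -->
    <img src="chart.png" alt="Bar chart of monthly sales">

    <!-- Decorative image: empty alt so screen readers skip it -->
    <img src="divider.png" alt="">
  </body>
</html>
```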

Challenge: Fix the Broken HTML

This HTML has several problems. Can you spot and fix all of them?

<html>
  <body>
    <h1>Welcome to my site
    <p>This is <b>bold and <i>italic</b> text</i></p>
    <img src="logo.png">
    <p>Click <div class="btn">here</div> to continue</p>
  </body>
</html>

Answer

There are five issues:

Missing DOCTYPE: add <!DOCTYPE html> at the top
Missing closing h1 tag: add </h1> after "my site"
Misnested tags: the b and i tags overlap. Tags must nest properly, so the tag opened last closes first: <b><i>bold and italic</i></b>
Missing alt attribute: the img needs alt="Site logo" (or alt="" if decorative)
Block element inside p: the div inside the p will cause the parser to auto-close the paragraph. Use a span instead: <span class="btn">here</span>

Fixed version:

<!DOCTYPE html>
<html>
  <body>
    <h1>Welcome to my site</h1>
    <p>This is <b><i>bold and italic</i></b> text</p>
    <img src="logo.png" alt="Site logo">
    <p>Click <span class="btn">here</span> to continue</p>
  </body>
</html>

Key Rules

1. HTML describes structure and meaning, not visual appearance. CSS handles styling.
2. The browser's parser is error-tolerant and will always produce a DOM tree, but its error recovery may not match your intent.
3. Elements, tags, and attributes are different things: tags delimit elements, attributes configure them.
4. Void elements like img, br, and input cannot have content or closing tags.
5. Nesting rules matter: elements that accept only phrasing content cannot contain flow content, and in cases such as a div inside a p the parser will auto-close and restructure your DOM.