
HTTP/1.1 and Its Limitations

Intermediate · 15 min read

The Protocol You Use Without Thinking About It

Every time you load a webpage, your browser speaks HTTP. It's the language of the web — the protocol that defines how your browser asks for resources and how servers respond. HTTP/1.1 has been doing this job since 1997, and despite being largely replaced by HTTP/2 and HTTP/3 for performance, understanding its design (and its flaws) is essential.

Why? Because HTTP/2 and HTTP/3 exist specifically to fix HTTP/1.1's problems. You can't appreciate the solutions without understanding the disease.

The Mental Model

HTTP/1.1 is like a single-lane drive-through. You pull up, place your order, wait for the food, then the next car pulls up. Even if the car behind you just wants a drink (tiny request), they wait for your full meal (large file) to be prepared and handed over. This is head-of-line blocking — and it's the fundamental limitation of HTTP/1.1.

How HTTP/1.1 Works

An HTTP message is plain text. Here's what your browser actually sends when you visit a page:

The Request

GET /index.html HTTP/1.1
Host: example.com
Accept: text/html
Accept-Encoding: gzip, br
Connection: keep-alive
User-Agent: Mozilla/5.0 ...

Every request has:

  • Method — GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS
  • Path — the resource being requested
  • Headers — metadata about the request (accept types, encoding, cookies, auth)
  • Body — optional, used with POST/PUT/PATCH
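Because the format is plain text, you can assemble a request byte-for-byte yourself. A minimal sketch (the `build_request` helper is illustrative, not a standard API) that produces the method line, the headers, and the blank line that terminates the header block:

```python
def build_request(method, path, host, headers=None):
    """Assemble a plain-text HTTP/1.1 request (headers only, no body)."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    lines.append("")  # blank line: end of the header block
    lines.append("")  # yields the trailing CRLF CRLF when joined
    return "\r\n".join(lines).encode("ascii")

req = build_request("GET", "/index.html", "example.com",
                    {"Accept": "text/html", "Connection": "keep-alive"})
print(req.decode())
```

Note the `\r\n` line endings and the empty line at the end: that blank line is how the server knows the headers are finished.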

The Response

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 14832
Content-Encoding: gzip
Cache-Control: max-age=3600

<!DOCTYPE html>...

Every response has:

  • Status code — 200 (OK), 301 (redirect), 404 (not found), 500 (server error)
  • Headers — metadata about the response (content type, caching, cookies)
  • Body — the actual content

Status Codes You Need to Know

Range  Meaning        Common Codes
1xx    Informational  103 Early Hints
2xx    Success        200 OK, 201 Created, 204 No Content
3xx    Redirection    301 Moved Permanently, 302 Found, 304 Not Modified
4xx    Client Error   400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests
5xx    Server Error   500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable
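The first digit alone tells a client which class a response belongs to, which is why clients can handle even unfamiliar codes sensibly. A minimal sketch:

```python
def status_class(code):
    """Map a status code to its class using only the first digit."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes[code // 100]

print(status_class(304))  # → "redirection"
print(status_class(429))  # → "client error"
```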
Quiz
A server returns a 304 status code. What does this mean?

Keep-Alive: The First Optimization

In HTTP/1.0, every request required a brand new TCP connection. Request a page with 30 resources? That's 30 TCP handshakes. Brutal.

HTTP/1.1 introduced persistent connections (keep-alive) as the default. A single TCP connection stays open and handles multiple requests sequentially:

Connection 1:
  Request  → GET /index.html
  Response ← 200 OK (HTML)
  Request  → GET /style.css
  Response ← 200 OK (CSS)
  Request  → GET /app.js
  Response ← 200 OK (JS)
  ...all on the same connection

This was a huge improvement — no repeated handshake costs. But there's a catch: requests are sequential. The second request doesn't start until the first response completes.
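You can watch keep-alive work using only Python's standard library. This sketch spins up a throwaway local server (so no external network is needed) and checks that two sequential requests reuse the same TCP socket:

```python
import http.server
import threading
from http import client

class Handler(http.server.SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # default is HTTP/1.0, which closes after each response
    def log_message(self, *args):   # silence per-request logging
        pass

# Throwaway local server on a random free port.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/")
first = conn.getresponse()
first.read()                        # drain the body before reusing the connection
sock_before = conn.sock             # the underlying TCP socket

conn.request("GET", "/")            # second request rides the SAME connection
second = conn.getresponse()
second.read()
print(conn.sock is sock_before)     # → True: keep-alive, no second handshake
server.shutdown()
```

The second request never pays for a new handshake, but note the sequencing: the first response had to be fully read before the second request could go out.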

The Big Problem: Head-of-Line Blocking

This sequential behavior creates head-of-line (HOL) blocking. If the first response is slow (a large image, a slow database query), every request behind it is stuck waiting, even if they'd be lightning-fast:

Connection:
  GET /giant-image.jpg  → [server processing... 800ms ...............]
  GET /tiny-icon.svg    →                                             [2ms]
  GET /critical.css     →                                                   [5ms]

That critical CSS file could have loaded in 5ms, but it waited 800ms for the image to finish. Your above-the-fold content is held hostage by a below-the-fold image.
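The cost is easy to quantify. This toy model (durations are the article's illustrative numbers, not measurements) shows when each response completes on one sequential connection:

```python
# (resource, server time in ms) — processed strictly in order on one connection
queue = [("giant-image.jpg", 800), ("tiny-icon.svg", 2), ("critical.css", 5)]

clock = 0
for name, duration in queue:
    clock += duration
    print(f"{name:16} done at {clock:4d} ms")
# critical.css needed only 5 ms of work but finishes at 807 ms
```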

HTTP Pipelining: The Failed Fix

The HTTP/1.1 spec included pipelining — the ability to send multiple requests without waiting for responses:

Pipelining (in theory):
  → GET /index.html
  → GET /style.css     (sent immediately, don't wait)
  → GET /app.js        (sent immediately)
  ← Response to /index.html
  ← Response to /style.css
  ← Response to /app.js

Sounds great. In practice, it was a disaster:

  • Responses MUST arrive in order — if the first response is slow, the others still block
  • Buggy proxies — many intermediate proxies corrupted pipelined responses or couldn't handle them
  • Head-of-line blocking at the response level — you could send requests in parallel, but responses still queued

Every major browser disabled pipelining by default. Chrome never shipped it. It was the right idea at the wrong layer.

Quiz
HTTP/1.1 pipelining was supposed to solve head-of-line blocking. Why did it fail?

The Workarounds: How Developers Fought HOL Blocking

Since the protocol couldn't be fixed (too many servers, proxies, and CDNs in the wild), developers invented clever hacks:

1. Domain Sharding

Browsers limit concurrent connections per origin (typically 6 in Chrome). By spreading resources across multiple subdomains, you multiply the available connections:

images1.example.com  → 6 connections
images2.example.com  → 6 connections
images3.example.com  → 6 connections
cdn.example.com      → 6 connections
─────────────────────────────────────
Total: 24 parallel connections

The downside? Each new domain requires a separate DNS lookup and TCP+TLS handshake. And each connection goes through TCP slow start independently.

2. Concatenation (Bundling)

Instead of serving 20 small JavaScript files, combine them into one large bundle:

<!-- Before: 20 requests with HOL blocking -->
<script src="/utils.js"></script>
<script src="/helpers.js"></script>
<script src="/router.js"></script>
<!-- ...17 more -->

<!-- After: 1 request, no HOL blocking between files -->
<script src="/bundle.js"></script>

Webpack, Rollup, and other bundlers exist partly because of HTTP/1.1's per-request overhead. The tradeoff: changing one line in utils.js invalidates the entire bundle's cache.

3. Spriting

Combine multiple images into a single image and use CSS background-position to show the right portion:

.icon-home { background: url(sprite.png) -10px -20px; }
.icon-user { background: url(sprite.png) -60px -20px; }
.icon-menu { background: url(sprite.png) -110px -20px; }

One HTTP request instead of dozens. The downside: changing one icon means re-downloading the entire sprite, and unused icons waste bandwidth.

4. Inlining

Embed small resources directly in HTML or CSS:

<!-- Inline critical CSS -->
<style>body { margin: 0; } .header { ... }</style>

<!-- Inline small images as data URIs -->
<img src="data:image/svg+xml;base64,PHN2Zy..." alt="icon">

This eliminates the request entirely but sacrifices caching — the inlined resource is re-downloaded with every HTML page load.
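Inlining an image is just base64-encoding its bytes into the src attribute. A sketch using a hypothetical one-element SVG:

```python
import base64

svg = b'<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16"></svg>'
encoded = base64.b64encode(svg).decode("ascii")
tag = f'<img src="data:image/svg+xml;base64,{encoded}" alt="icon">'
print(tag)
# These bytes now ship inside every HTML response that includes the tag;
# the browser cannot cache the image separately.
```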

Common Trap

All these HTTP/1.1 workarounds (domain sharding, concatenation, spriting, inlining) are anti-patterns in HTTP/2. HTTP/2 multiplexing makes parallel requests cheap, so splitting resources into many small files is actually better — it improves caching granularity. If you're migrating to HTTP/2, undo these hacks or you'll actually hurt performance.

Quiz
Why does domain sharding help with HTTP/1.1 performance?

Connection Limits Per Origin

Every browser enforces a limit on simultaneous connections to a single origin:

Browser  Max Connections per Origin  Total Max Connections
Chrome   6                           256
Firefox  6                           256
Safari   6                           N/A
Edge     6                           256

This means with a single domain, only 6 resources can download in parallel. Resource #7 waits for one of the first 6 to complete. For a page with 80 resources (not uncommon), that's significant queuing.

This limit is why the Chrome DevTools Network tab shows a "Queueing" time for many requests — they're waiting for an available connection.

Why the 6-connection limit?

The limit exists to prevent browsers from overwhelming servers and network infrastructure. If every browser tab opened 100 connections to the same origin, a busy server would quickly run out of file descriptors. The number 6 was a practical compromise between parallelism and resource conservation. HTTP/2 eliminates this problem by multiplexing all requests over a single connection, making the connection limit irrelevant.
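The queueing behavior is exactly what a semaphore of size 6 produces. This sketch models 8 equal-cost resources contending for 6 "connections"; the last two must wait for a slot to free up:

```python
import asyncio

MAX_CONNECTIONS = 6  # per-origin limit, as in the table above

async def fetch(sem, name, completed):
    async with sem:                 # blocks while all 6 slots are busy
        await asyncio.sleep(0.01)   # stand-in for download time
        completed.append(name)

async def main():
    sem = asyncio.Semaphore(MAX_CONNECTIONS)
    completed = []
    await asyncio.gather(*(fetch(sem, f"res{i}", completed) for i in range(8)))
    return completed

order = asyncio.run(main())
print(order)  # res6 and res7 finish last: they queued for a free connection
```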

HTTP/1.1 Headers: The Hidden Bandwidth Tax

HTTP/1.1 headers are sent as uncompressed plain text with every single request and response. A typical request carries 500-800 bytes of headers. Cookies alone can add kilobytes.

GET /api/data HTTP/1.1
Host: app.example.com
Accept: application/json
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Cookie: session=abc123def456; _ga=GA1.2.12345; preferences=theme:dark; recently_viewed=item1,item2,item3,item4,item5
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...
Referer: https://app.example.com/dashboard

These headers are repeated verbatim for every request. On a page that makes 80 requests, that's potentially 60-80KB of redundant header data. HTTP/2 solves this with HPACK header compression, sending only the differences between requests.
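The arithmetic behind that estimate, using a 700-byte request-header block (an assumed midpoint of the 500-800 byte range; response headers add their own overhead on top):

```python
header_bytes = 700        # assumed per-request header size
requests_per_page = 80

total_kb = header_bytes * requests_per_page / 1024
print(f"{total_kb:.1f} KB of request headers per page load")  # → "54.7 KB ..."
```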

Common Mistakes

  • Applying HTTP/1.1 workarounds (domain sharding, spriting) on HTTP/2 sites.
    Why it hurts: HTTP/2 multiplexes all requests over a single connection; domain sharding forces multiple connections that can't share that multiplexing, and many small files cache independently, so they beat a few large bundles.
    Instead: Remove domain sharding and serve many small requests over HTTP/2 multiplexing.

  • Thinking that the 6-connection limit is a browser bug or misconfiguration.
    Why it hurts: The limit deliberately prevents resource exhaustion on servers.
    Instead: Treat the limit as intentional for HTTP/1.1. HTTP/2 makes it moot — all requests share one connection with no HOL blocking at the HTTP layer.

  • Bundling everything into one massive JS file for "fewer requests".
    Why it hurts: Any code change invalidates the entire bundle's cache, and with HTTP/2 the per-request overhead is negligible, so many small files are preferable to one giant file.
    Instead: Use code splitting for route-level bundles, with shared code in a common chunk so unchanged routes stay cached.

Key Takeaways

Key Rules
  1. HTTP/1.1 processes requests sequentially on each connection. Head-of-line blocking means a slow response delays everything behind it.
  2. Browsers limit concurrent connections to 6 per origin. This is the root cause of most HTTP/1.1 performance hacks.
  3. Pipelining was designed to fix HOL blocking but failed because responses must arrive in order and proxies couldn't handle it.
  4. Domain sharding, bundling, spriting, and inlining were workarounds for HTTP/1.1 limitations. They become anti-patterns in HTTP/2.
  5. HTTP/1.1 headers are uncompressed plain text sent with every request. Cookies and auth tokens can add kilobytes of redundant data per request.