HTTP/1.1 and Its Limitations
The Protocol You Use Without Thinking About It
Every time you load a webpage, your browser speaks HTTP. It's the language of the web — the protocol that defines how your browser asks for resources and how servers respond. HTTP/1.1 has been doing this job since 1997, and despite being largely replaced by HTTP/2 and HTTP/3 for performance, understanding its design (and its flaws) is essential.
Why? Because HTTP/2 and HTTP/3 exist specifically to fix HTTP/1.1's problems. You can't appreciate the solutions without understanding the disease.
The Mental Model
HTTP/1.1 is like a single-lane drive-through. You pull up, place your order, wait for the food, then the next car pulls up. Even if the car behind you just wants a drink (tiny request), they wait for your full meal (large file) to be prepared and handed over. This is head-of-line blocking — and it's the fundamental limitation of HTTP/1.1.
How HTTP/1.1 Works
An HTTP message is plain text. Here's what your browser actually sends when you visit a page:
The Request
```http
GET /index.html HTTP/1.1
Host: example.com
Accept: text/html
Accept-Encoding: gzip, br
Connection: keep-alive
User-Agent: Mozilla/5.0 ...
```
Every request has:
- Method — GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS
- Path — the resource being requested
- Headers — metadata about the request (accept types, encoding, cookies, auth)
- Body — optional, used with POST/PUT/PATCH
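The anatomy above maps directly onto bytes on the wire. Here's a minimal sketch in Python of how a request like the one shown gets serialized — the host and header values are illustrative, not anything a real browser would send verbatim:

```python
# Sketch: serialize an HTTP/1.1 request by hand (illustrative values only).
def build_request(method: str, path: str, host: str, headers: dict) -> bytes:
    """Request line, then headers, then a blank line — each terminated by CRLF."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The empty line (bare CRLF) marks the end of the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

raw = build_request("GET", "/index.html", "example.com",
                    {"Accept": "text/html", "Connection": "keep-alive"})
print(raw.decode("ascii"))
```

Note that the whole message is plain ASCII text — there's no framing beyond CRLF line endings and the blank line, which is exactly why HTTP/1.1 is so easy to debug with `telnet` or `curl -v`.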
The Response
```http
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 14832
Content-Encoding: gzip
Cache-Control: max-age=3600

<!DOCTYPE html>...
```
Every response has:
- Status code — 200 (OK), 301 (redirect), 404 (not found), 500 (server error)
- Headers — metadata about the response (content type, caching, cookies)
- Body — the actual content
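Because the response is also plain text, parsing it is just string splitting. A sketch, using a shortened, hypothetical response rather than the full one above:

```python
# Sketch: parse a plain-text HTTP/1.1 response (shortened sample bytes).
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=utf-8\r\n"
       b"Content-Length: 18\r\n"
       b"\r\n"
       b"<!DOCTYPE html>...")

# A blank line (CRLF CRLF) separates the header block from the body.
head, _, body = raw.partition(b"\r\n\r\n")
status_line, *header_lines = head.decode("ascii").split("\r\n")
version, code, reason = status_line.split(" ", 2)
headers = dict(line.split(": ", 1) for line in header_lines)

print(code, reason)                     # status code and reason phrase
print(headers["Content-Type"])
```

Real parsers must handle more (chunked transfer encoding, folded headers, case-insensitive names), but the basic structure really is this simple.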
Status Codes You Need to Know
| Range | Meaning | Common Codes |
|---|---|---|
| 1xx | Informational | 100 Continue, 101 Switching Protocols, 103 Early Hints |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirection | 301 Permanent, 302 Found, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests |
| 5xx | Server Error | 500 Internal Error, 502 Bad Gateway, 503 Service Unavailable |
Keep-Alive: The First Optimization
In HTTP/1.0, every request required a brand new TCP connection. Request a page with 30 resources? That's 30 TCP handshakes. Brutal.
HTTP/1.1 introduced persistent connections (keep-alive) as the default. A single TCP connection stays open and handles multiple requests sequentially:
```text
Connection 1:
  Request  → GET /index.html
  Response ← 200 OK (HTML)
  Request  → GET /style.css
  Response ← 200 OK (CSS)
  Request  → GET /app.js
  Response ← 200 OK (JS)
  ...all on the same connection
```
This was a huge improvement — no repeated handshake costs. But there's a catch: requests are sequential. The second request doesn't start until the first response completes.
The Big Problem: Head-of-Line Blocking
This sequential behavior creates head-of-line (HOL) blocking. If the first response is slow (a large image, a slow database query), every request behind it is stuck waiting, even if they'd be lightning-fast:
```text
Connection:
  GET /giant-image.jpg → [server processing... 800ms ...............]
  GET /tiny-icon.svg   →                                             [2ms]
  GET /critical.css    →                                                  [5ms]
```
That critical CSS file could have loaded in 5ms, but it waited 800ms for the image to finish. Your above-the-fold content is held hostage by a below-the-fold image.
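The arithmetic is easy to model. A back-of-the-envelope sketch using the illustrative timings from the example above, showing when each response actually completes on a single sequential connection:

```python
# Toy model of head-of-line blocking: on one keep-alive connection,
# each response waits for everything queued ahead of it.
# Service times are the illustrative numbers from the text (milliseconds).
service_ms = {"giant-image.jpg": 800, "tiny-icon.svg": 2, "critical.css": 5}

elapsed = 0
finish = {}
for name, cost in service_ms.items():   # strictly sequential processing
    elapsed += cost
    finish[name] = elapsed

print(finish)
# The CSS needs only 5 ms of work but doesn't finish until 807 ms in.
```

With true multiplexing, each resource's completion time would be close to its own service time — which is precisely what HTTP/2 delivers.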
HTTP Pipelining: The Failed Fix
The HTTP/1.1 spec included pipelining — the ability to send multiple requests without waiting for responses:
```text
Pipelining (in theory):
  → GET /index.html
  → GET /style.css   (sent immediately, don't wait)
  → GET /app.js      (sent immediately)
  ← Response to /index.html
  ← Response to /style.css
  ← Response to /app.js
```
Sounds great. In practice, it was a disaster:
- Responses MUST arrive in order — if the first response is slow, the others still block
- Buggy proxies — many intermediate proxies corrupted pipelined responses or couldn't handle them
- Head-of-line blocking at the response level — you could send requests in parallel, but responses still queued
Every major browser ended up disabling pipelining by default, and Chrome eventually removed its experimental support entirely. It was the right idea at the wrong layer.
The Workarounds: How Developers Fought HOL Blocking
Since the protocol couldn't be fixed (too many servers, proxies, and CDNs in the wild), developers invented clever hacks:
1. Domain Sharding
Browsers limit concurrent connections per origin (typically 6 in Chrome). By spreading resources across multiple subdomains, you multiply the available connections:
```text
images1.example.com → 6 connections
images2.example.com → 6 connections
images3.example.com → 6 connections
cdn.example.com     → 6 connections
─────────────────────────────────────
Total: 24 parallel connections
```
The downside? Each new domain requires a separate DNS lookup and TCP+TLS handshake. And each connection goes through TCP slow start independently.
2. Concatenation (Bundling)
Instead of serving 20 small JavaScript files, combine them into one large bundle:
```html
<!-- Before: 20 requests with HOL blocking -->
<script src="/utils.js"></script>
<script src="/helpers.js"></script>
<script src="/router.js"></script>
<!-- ...17 more -->

<!-- After: 1 request, no HOL blocking between files -->
<script src="/bundle.js"></script>
```
Webpack, Rollup, and other bundlers exist partly because of HTTP/1.1's per-request overhead. The tradeoff: changing one line in utils.js invalidates the entire bundle's cache.
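The cache-invalidation cost is easy to see once you remember that bundles are usually served under content-hashed filenames. A sketch (the file contents and naming scheme here are made up, but the pattern matches what bundlers like Webpack emit):

```python
# Sketch: why a one-line change invalidates the whole bundle's cache.
# Bundlers typically embed a content hash in the output filename.
import hashlib

def bundle_name(source: str) -> str:
    digest = hashlib.sha256(source.encode()).hexdigest()[:8]
    return f"bundle.{digest}.js"

v1 = "function add(a, b) { return a + b; }\n// ...20 files worth of code..."
v2 = v1.replace("a + b", "b + a")   # a one-line edit somewhere in the bundle

print(bundle_name(v1))
print(bundle_name(v2))  # different filename: every cached copy of v1 is now useless
```

Split into 20 separately hashed files, that same edit would invalidate only one small file while the other 19 stayed cached.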
3. Spriting
Combine multiple images into a single image and use CSS background-position to show the right portion:
```css
.icon-home { background: url(sprite.png) -10px -20px; }
.icon-user { background: url(sprite.png) -60px -20px; }
.icon-menu { background: url(sprite.png) -110px -20px; }
```
One HTTP request instead of dozens. The downside: changing one icon means re-downloading the entire sprite, and unused icons waste bandwidth.
4. Inlining
Embed small resources directly in HTML or CSS:
```html
<!-- Inline critical CSS -->
<style>body { margin: 0; } .header { ... }</style>

<!-- Inline small images as data URIs -->
<img src="data:image/svg+xml;base64,PHN2Zy..." alt="icon">
```
This eliminates the request entirely but sacrifices caching — the inlined resource is re-downloaded with every HTML page load.
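For concreteness, here's a sketch of how a data URI like the one in the snippet is produced — base64-encode the image bytes and prepend the media type. The SVG here is a minimal made-up icon, not the actual one from the example:

```python
# Sketch: build an inline data URI from image bytes (made-up SVG icon).
import base64

svg = b'<svg xmlns="http://www.w3.org/2000/svg"><circle r="4"/></svg>'
data_uri = "data:image/svg+xml;base64," + base64.b64encode(svg).decode("ascii")

print(data_uri[:40] + "...")
# Every page that inlines this URI re-ships these bytes; nothing is cached.
```

Note that base64 also inflates the payload by about 33%, which is another reason inlining only pays off for small resources.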
All these HTTP/1.1 workarounds (domain sharding, concatenation, spriting, inlining) are anti-patterns in HTTP/2. HTTP/2 multiplexing makes parallel requests cheap, so splitting resources into many small files is actually better — it improves caching granularity. If you're migrating to HTTP/2, undo these hacks or you'll actually hurt performance.
Connection Limits Per Origin
Every browser enforces a limit on simultaneous connections to a single origin:
| Browser | Max Connections per Origin | Total Max Connections |
|---|---|---|
| Chrome | 6 | 256 |
| Firefox | 6 | 900 (configurable) |
| Safari | 6 | not documented |
| Edge | 6 | 256 |

(The totals vary by browser version; the per-origin limit of 6 is the number that matters in practice.)
This means with a single domain, only 6 resources can download in parallel. Resource #7 waits for one of the first 6 to complete. For a page with 80 resources (not uncommon), that's significant queuing.
This limit is why the Chrome DevTools Network tab shows a "Queueing" time for many requests — they're waiting for an available connection.
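A toy scheduler makes the queuing effect concrete. Assume 80 equal-cost resources at 50 ms each — purely illustrative numbers — greedily assigned to whichever connection frees up first:

```python
# Toy model of the per-origin connection limit: N resources scheduled
# greedily onto a fixed number of connections ("lanes"). Illustrative numbers.
import heapq

def page_load_ms(resources: int, cost_ms: int, lanes: int) -> int:
    free_at = [0] * lanes               # when each connection next frees up
    heapq.heapify(free_at)
    for _ in range(resources):
        start = heapq.heappop(free_at)  # earliest-available connection
        heapq.heappush(free_at, start + cost_ms)
    return max(free_at)                 # when the last response completes

print(page_load_ms(80, 50, 6))   # 6 connections: long queue behind the limit
print(page_load_ms(80, 50, 80))  # unlimited parallelism, for comparison
```

With 6 lanes the 80 resources take 14 sequential "waves" (700 ms in this model); with unbounded parallelism they'd all finish in 50 ms. Real pages are messier, but the queuing shape is the same thing DevTools shows you.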
Why the 6-connection limit?
The limit exists to prevent browsers from overwhelming servers and network infrastructure. If every browser tab opened 100 connections to the same origin, a busy server would quickly run out of file descriptors. The number 6 was a practical compromise between parallelism and resource conservation. HTTP/2 eliminates this problem by multiplexing all requests over a single connection, making the connection limit irrelevant.
HTTP/1.1 Headers: The Hidden Bandwidth Tax
HTTP/1.1 headers are sent as uncompressed plain text with every single request and response. A typical request carries 500-800 bytes of headers. Cookies alone can add kilobytes.
```http
GET /api/data HTTP/1.1
Host: app.example.com
Accept: application/json
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Cookie: session=abc123def456; _ga=GA1.2.12345; preferences=theme:dark; recently_viewed=item1,item2,item3,item4,item5
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...
Referer: https://app.example.com/dashboard
```
These headers are repeated verbatim for every request. On a page that makes 80 requests, that's potentially 60-80KB of redundant header data. HTTP/2 solves this with HPACK header compression, sending only the differences between requests.
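You can estimate the tax yourself: measure one request's header block and multiply by the page's request count. A sketch using a slightly shortened version of the hypothetical headers above:

```python
# Rough estimate of the repeated-header tax (hypothetical header values).
headers = {
    "Host": "app.example.com",
    "Accept": "application/json",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Cookie": "session=abc123def456; _ga=GA1.2.12345; preferences=theme:dark",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Referer": "https://app.example.com/dashboard",
}

request_line = "GET /api/data HTTP/1.1\r\n"
block = request_line + "".join(f"{k}: {v}\r\n" for k, v in headers.items()) + "\r\n"
per_request = len(block.encode("ascii"))

print(per_request, "bytes of headers per request")
print(per_request * 80, "bytes repeated across an 80-request page")
```

Every one of those bytes travels uncompressed on HTTP/1.1 — response bodies can be gzipped, but the headers themselves cannot.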
Common Mistakes
| What developers do | Why it backfires | What they should do |
|---|---|---|
| Apply HTTP/1.1 workarounds (domain sharding, spriting) on HTTP/2 sites | HTTP/2 multiplexes all requests over a single connection; sharding forces extra connections that can't share multiplexing, and many small files cache independently | Remove domain sharding and serve many small resources over one multiplexed HTTP/2 connection |
| Treat the 6-connection limit as a browser bug or misconfiguration | The limit prevents resource exhaustion on servers; HTTP/2's single-connection multiplexing makes it moot, with no HOL blocking at the HTTP layer | Accept the limit as intentional for HTTP/1.1; HTTP/2 makes it irrelevant by using one connection |
| Bundle everything into one massive JS file for "fewer requests" | Any code change invalidates the entire bundle's cache, while per-request overhead is negligible under HTTP/2 | Use code splitting for route-level bundles, with shared code in a common chunk |
Key Takeaways
1. HTTP/1.1 processes requests sequentially on each connection. Head-of-line blocking means a slow response delays everything behind it.
2. Browsers limit concurrent connections to 6 per origin. This is the root cause of most HTTP/1.1 performance hacks.
3. Pipelining was designed to fix HOL blocking but failed because responses must arrive in order and proxies couldn't handle it.
4. Domain sharding, bundling, spriting, and inlining were workarounds for HTTP/1.1 limitations. They become anti-patterns in HTTP/2.
5. HTTP/1.1 headers are uncompressed plain text sent with every request. Cookies and auth tokens can add kilobytes of redundant data per request.