Async Iterators and Streaming
The AI Response That Froze the Page
A team built a chat interface that calls an AI API. The API streams the response token by token, but their code waited for the entire response before rendering:
const response = await fetch('/api/chat', { method: 'POST', body });
const data = await response.json();
renderMessage(data.text);
A 2000-token response takes 8 seconds to generate. The user stares at a spinner for 8 seconds, then the entire answer pops in at once. Compare that to ChatGPT, where words appear as they're generated. Same data, completely different experience.
The difference? Streaming. And async iterators are how you consume streams in JavaScript.
Async Iteration Protocol
Regular iterators produce values synchronously: you call next(), you get { value, done } immediately. Async iterators produce values over time: next() returns a promise that resolves to { value, done }.
const asyncIterable = {
  [Symbol.asyncIterator]() {
    let i = 0;
    return {
      next() {
        if (i >= 3) return Promise.resolve({ done: true, value: undefined });
        return new Promise(resolve =>
          setTimeout(() => resolve({ value: i++, done: false }), 1000)
        );
      }
    };
  }
};

for await (const value of asyncIterable) {
  console.log(value); // 0 (after 1s), 1 (after 2s), 2 (after 3s)
}
The for-await-of loop automatically calls next(), waits for the promise, extracts the value, and repeats until done: true.
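What for-await-of does under the hood can be written out by hand. A sketch, using a hypothetical two-value async generator named letters():

```javascript
// Driving an async iterator manually: the same steps for-await-of performs.
async function* letters() {
  yield 'a';
  yield 'b';
}

const iterator = letters()[Symbol.asyncIterator]();
const first = await iterator.next();  // { value: 'a', done: false }
const second = await iterator.next(); // { value: 'b', done: false }
const last = await iterator.next();   // { value: undefined, done: true }
console.log(first.value, second.value, last.done); // a b true
```

Async generators are themselves async iterators, which is why calling Symbol.asyncIterator on one returns the generator itself.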
Need to collect all values from an async iterable into an array? Array.fromAsync() (Stage 4, baseline) does it in one call:
const items = await Array.fromAsync(asyncIterable);
It's the async equivalent of Array.from() — iterates the async iterable, awaits each value, and returns a promise that resolves to the collected array.
Async Generators: The Elegant Way
Writing async iterables by hand is tedious. Async generators give you the same power with much cleaner syntax:
async function* fetchPages(url) {
  let page = 1;
  let hasMore = true;
  while (hasMore) {
    const response = await fetch(`${url}?page=${page}`);
    const data = await response.json();
    yield data.items;
    hasMore = data.hasNextPage;
    page++;
  }
}

for await (const items of fetchPages('/api/products')) {
  renderProducts(items);
}
Each yield pauses the generator until the consumer asks for the next value. This creates natural backpressure — the generator doesn't fetch page 3 until the consumer has processed page 2.
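That laziness is observable. The sketch below records when the generator actually produces a value, and shows that breaking out of the loop stops production entirely (slowCounter is a hypothetical infinite source):

```javascript
// Values are produced only when the consumer asks for them.
const produced = [];

async function* slowCounter() {
  let i = 0;
  while (true) {
    produced.push(i); // record that a value was actually generated
    yield i++;
  }
}

for await (const n of slowCounter()) {
  if (n >= 2) break; // the generator is shut down; no fourth value is produced
}

console.log(produced); // [0, 1, 2]
```

Breaking out of for-await-of calls the generator's return() method, so an infinite generator is cleaned up rather than left running.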
Async Generator as a Transform Pipeline
Generators compose beautifully into processing pipelines:
async function* fetchLines(url) {
  const response = await fetch(url);
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop();
    for (const line of lines) {
      yield line;
    }
  }
  if (buffer) yield buffer;
}

async function* parseJSON(lines) {
  for await (const line of lines) {
    if (line.trim()) {
      yield JSON.parse(line);
    }
  }
}

async function* filterValid(records) {
  for await (const record of records) {
    if (record.status === 'active') {
      yield record;
    }
  }
}

const lines = fetchLines('/api/export.ndjson');
const records = parseJSON(lines);
const active = filterValid(records);

for await (const record of active) {
  processRecord(record);
}
Each generator only processes one item at a time. Memory stays flat regardless of how large the file is. This is the async equivalent of Unix pipes.
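The same pipeline shape can be run without a network by swapping fetchLines for an in-memory source. This self-contained sketch wires three stages together end to end:

```javascript
// In-memory stand-in for fetchLines, so the pipeline runs self-contained.
async function* lines() {
  yield '{"id": 1, "status": "active"}';
  yield '{"id": 2, "status": "deleted"}';
  yield '{"id": 3, "status": "active"}';
}

async function* parseJSON(source) {
  for await (const line of source) {
    yield JSON.parse(line);
  }
}

async function* filterValid(records) {
  for await (const record of records) {
    if (record.status === 'active') yield record;
  }
}

const ids = [];
for await (const record of filterValid(parseJSON(lines()))) {
  ids.push(record.id);
}
console.log(ids); // [1, 3]
```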
ReadableStream as Async Iterable
Modern browsers support iterating over ReadableStream with for-await-of directly. This is the cleanest way to process streaming responses:
async function streamResponse(url) {
  const response = await fetch(url);
  const stream = response.body;
  const decoder = new TextDecoder();
  for await (const chunk of stream) {
    const text = decoder.decode(chunk, { stream: true });
    appendToUI(text);
  }
}
ReadableStream is async iterable in Node.js 18+ and in current Chrome and Firefox releases; Safari added support later, so check compatibility before relying on it. And if you're consuming the response body stream, remember: once you consume it (via for-await-of, .getReader(), or .json()), it's gone. You can't read it twice. If you need the data in multiple places, clone the response first with response.clone(), or tee the stream with stream.tee().
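Here's tee() in action on an in-memory stream (a sketch; Node.js 18+ and modern browsers expose ReadableStream globally):

```javascript
// tee() splits one stream into two branches that each see every chunk.
const original = new ReadableStream({
  start(controller) {
    controller.enqueue('a');
    controller.enqueue('b');
    controller.close();
  },
});

const [branch1, branch2] = original.tee(); // original is now locked

async function collect(stream) {
  const chunks = [];
  for await (const chunk of stream) chunks.push(chunk);
  return chunks;
}

const [first, second] = await Promise.all([collect(branch1), collect(branch2)]);
console.log(first, second); // ['a', 'b'] ['a', 'b']
```

Note that tee() buffers for the slower branch, so if one branch reads much faster than the other, the unread chunks accumulate in memory.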
Explicit Resource Management (Stage 4) brings await using to JavaScript, which automatically cleans up a resource when it goes out of scope. For streams, this means you can write await using reader = stream.getReader() and the reader lock is released when the block exits: no manual reader.releaseLock() or try/finally needed. This is similar to Python's async with or C#'s await using. Browser and Node.js support is still rolling out, so treat it as the direction async resource cleanup is heading rather than something to rely on in production today.
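Until that lands everywhere, the pattern await using replaces looks like this. A sketch over an in-memory stream, with the future one-liner noted in a comment:

```javascript
const stream = new ReadableStream({
  start(controller) {
    controller.enqueue('chunk');
    controller.close();
  },
});

// Today: acquire the reader and always release the lock in finally.
// With Explicit Resource Management this would shrink to:
//   await using reader = stream.getReader();
const reader = stream.getReader();
let firstChunk;
try {
  ({ value: firstChunk } = await reader.read());
} finally {
  reader.releaseLock(); // without this, the stream stays locked forever
}

console.log(firstChunk, stream.locked); // 'chunk' false
```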
Processing Streams With TransformStream
For more complex transforms, TransformStream lets you pipe streams together:
function createLineStream() {
  let buffer = '';
  const decoder = new TextDecoder(); // one decoder, reused across chunks
  return new TransformStream({
    transform(chunk, controller) {
      // { stream: true } keeps multi-byte characters that straddle a
      // chunk boundary from being garbled
      buffer += decoder.decode(chunk, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop();
      for (const line of lines) {
        controller.enqueue(line);
      }
    },
    flush(controller) {
      buffer += decoder.decode(); // emit any bytes the decoder held back
      if (buffer) controller.enqueue(buffer);
    },
  });
}
const response = await fetch('/api/logs');
const lineStream = response.body.pipeThrough(createLineStream());

for await (const line of lineStream) {
  console.log(line);
}
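To see the buffering and flush behavior in isolation, feed a string-based variant of the same transform chunks that break mid-line (a self-contained sketch; it takes strings rather than bytes to skip the decoding step):

```javascript
// Same shape as createLineStream, minus the byte decoding.
function createStringLineStream() {
  let buffer = '';
  return new TransformStream({
    transform(chunk, controller) {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the trailing partial line
      for (const line of lines) controller.enqueue(line);
    },
    flush(controller) {
      if (buffer) controller.enqueue(buffer); // emit the final unterminated line
    },
  });
}

const chunks = new ReadableStream({
  start(controller) {
    controller.enqueue('first li'); // a line split across two chunks
    controller.enqueue('ne\nsecond line\ntrailing');
    controller.close();
  },
});

const received = [];
for await (const line of chunks.pipeThrough(createStringLineStream())) {
  received.push(line);
}
console.log(received); // ['first line', 'second line', 'trailing']
```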
Server-Sent Events (SSE) Consumption
SSE is HTTP streaming with a built-in protocol. The server sends events in a specific text format, and the browser's EventSource API handles reconnection automatically:
const source = new EventSource('/api/events');

source.addEventListener('message', (event) => {
  const data = JSON.parse(event.data);
  updateUI(data);
});

source.addEventListener('error', () => {
  console.log('Connection lost, auto-reconnecting...');
});
But EventSource has limitations: no custom headers, no POST requests, no request body. For those cases, consume SSE manually with fetch streaming:
async function* consumeSSE(url, options = {}) {
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`SSE connection failed: ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const events = buffer.split('\n\n');
    buffer = events.pop();
    for (const event of events) {
      const lines = event.split('\n');
      const data = lines
        .filter(line => line.startsWith('data: '))
        .map(line => line.slice(6))
        .join('\n');
      if (data) yield JSON.parse(data);
    }
  }
}

for await (const event of consumeSSE('/api/chat', {
  method: 'POST',
  body: JSON.stringify({ message: 'Hello' }),
  headers: { 'Content-Type': 'application/json' },
})) {
  appendToken(event.token);
}
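The event-splitting logic is easy to check in isolation by feeding it pre-chunked SSE text. A string-based sketch of the same parsing loop:

```javascript
// Split on blank lines, pull out data: fields, parse JSON.
// Fed fixed string chunks instead of a socket.
function parseSSEChunks(chunks) {
  const events = [];
  let buffer = '';
  for (const chunk of chunks) {
    buffer += chunk;
    const rawEvents = buffer.split('\n\n');
    buffer = rawEvents.pop(); // keep any incomplete event
    for (const rawEvent of rawEvents) {
      const data = rawEvent
        .split('\n')
        .filter(line => line.startsWith('data: '))
        .map(line => line.slice(6))
        .join('\n');
      if (data) events.push(JSON.parse(data));
    }
  }
  return events;
}

const events = parseSSEChunks([
  'data: {"token": "Hel',   // an event split across two chunks
  'lo"}\n\ndata: {"token": " world"}\n\n',
]);
console.log(events); // [{ token: 'Hello' }, { token: ' world' }]
```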
Streaming JSON Parsing
Standard response.json() buffers the entire response in memory before parsing. For large JSON responses, stream-parse instead:
Newline-Delimited JSON (NDJSON)
The simplest streaming JSON format — one JSON object per line:
async function* parseNDJSON(response) {
  let buffer = '';
  const decoder = new TextDecoder();
  for await (const chunk of response.body) {
    buffer += decoder.decode(chunk, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop();
    for (const line of lines) {
      if (line.trim()) {
        yield JSON.parse(line);
      }
    }
  }
  if (buffer.trim()) {
    yield JSON.parse(buffer);
  }
}

const response = await fetch('/api/large-dataset');
let count = 0;
for await (const record of parseNDJSON(response)) {
  processRecord(record);
  count++;
}
console.log(`Processed ${count} records`);
This processes records one at a time. A 500MB NDJSON file uses the same memory as a 1KB file because you're never holding the entire dataset in memory.
Backpressure in the Browser
Backpressure is like a garden hose. If the sprinkler (consumer) can only handle a certain flow rate and the faucet (producer) pushes water faster, pressure builds up and something bursts. Backpressure is the signal that flows backward from consumer to producer saying "slow down, I can't keep up."
In the browser, backpressure matters in three situations:
- Stream processing — reading data faster than you can render it
- WebSocket messages — server sends events faster than the UI updates
- File processing — reading a large file chunk by chunk
The Streams API has built-in backpressure through its pull-based model:
const readable = new ReadableStream({
  async pull(controller) {
    const data = await fetchNextChunk();
    if (data) {
      controller.enqueue(data);
    } else {
      controller.close();
    }
  },
}, {
  highWaterMark: 3,
});
The pull function is only called when the consumer needs data. If the consumer is slow, pull simply isn't called. highWaterMark controls how many chunks to buffer ahead — higher means more memory but smoother throughput, lower means less memory but more waiting.
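That pull-only-when-needed behavior is observable. With highWaterMark: 3 and one chunk enqueued per pull, the stream fills its internal queue and then stops calling pull until a read frees space (a sketch using counters):

```javascript
let pulls = 0;
const readable = new ReadableStream({
  pull(controller) {
    pulls++;
    controller.enqueue(pulls);
  },
}, { highWaterMark: 3 });

// Give the stream a tick to fill its internal queue.
await new Promise(resolve => setTimeout(resolve, 0));
const pullsAfterFill = pulls; // 3: the queue hit highWaterMark, pull stopped

const reader = readable.getReader();
await reader.read(); // consuming one chunk frees queue space
await new Promise(resolve => setTimeout(resolve, 0));
const pullsAfterRead = pulls; // 4: pull ran once more to refill the queue

console.log(pullsAfterFill, pullsAfterRead); // 3 4
```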
For WebSocket-style scenarios where the producer pushes data, implement your own backpressure:
async function* throttledConsume(source) {
  for await (const item of source) {
    const start = performance.now();
    yield item; // the consumer processes the item while the generator is paused here
    // If processing blew past a frame budget (~16ms), wait for the next
    // animation frame so the browser can paint before continuing.
    if (performance.now() - start > 16) {
      await new Promise(resolve => requestAnimationFrame(resolve));
    }
  }
}
Real-World Pattern: Streaming AI Chat
Putting it all together — here's how you'd build a streaming chat interface like ChatGPT:
async function streamChat(messages, onToken) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  });
  if (!response.ok) {
    throw new Error(`Chat API error: ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let fullText = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop();
    for (const line of lines) {
      if (line.startsWith('data: ') && line !== 'data: [DONE]') {
        const parsed = JSON.parse(line.slice(6));
        const token = parsed.choices?.[0]?.delta?.content;
        if (token) {
          fullText += token;
          onToken(token, fullText);
        }
      }
    }
  }
  return fullText;
}

await streamChat(
  [{ role: 'user', content: 'Explain closures' }],
  (token, fullText) => {
    messageElement.textContent = fullText;
  }
);
| What developers do | What they should do |
|---|---|
| Use response.json() for large API responses: it buffers the entire response in memory before parsing, so a 100MB response needs 100MB of memory | Stream-parse with for-await-of on response.body for large or real-time data |
| Read a response body twice: a ReadableStream is consumed once, and attempting to read it again throws a TypeError | Clone the response first with response.clone(), or tee the stream before consuming it |
| Forget to handle partial chunks in stream parsing: network chunks don't align with logical boundaries (lines, JSON objects), so a chunk can end mid-line and cause parse errors | Always buffer incomplete data and process it on the next chunk or in flush |
| Use EventSource when authentication or POST is needed: EventSource only supports GET requests with no custom headers or request body | Use fetch with a streamed response body for authenticated or POST-based SSE |
You're building a real-time dashboard that receives 500 data points per second via WebSocket. Each data point needs to be rendered as a chart update. The browser can only repaint at 60fps. How would you handle the mismatch between data production rate and render rate? Describe your backpressure strategy.