Structured Output Parsing
The JSON You Can Actually Trust
Here's the deal with LLMs: they're incredible at generating text, but text is chaos. You ask for JSON and sometimes you get valid JSON, sometimes you get JSON wrapped in markdown code fences, sometimes you get a friendly explanation followed by JSON, and sometimes you get something that looks like JSON but has trailing commas that blow up JSON.parse. Building production UIs on "maybe-JSON" is a nightmare.
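For contrast, here is the kind of brittle extraction code this chapter is about replacing (a sketch of the anti-pattern, not a recommendation):

```typescript
// Brittle "maybe-JSON" handling: scan for the first {...} blob in the
// reply and hope it parses. Code fences, trailing commas, truncated
// output, or multiple blobs all break this in different ways.
function fragileParse(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("No JSON found");
  return JSON.parse(raw.slice(start, end + 1)); // still throws on trailing commas
}

fragileParse('Sure! Here is your data: {"answer": 42}. Let me know!');
// → { answer: 42 }, but only because this reply happened to be well-behaved
```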
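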
Structured output solves this completely. Instead of hoping the model returns valid JSON, you force it to. You define a schema upfront — "I want an object with these exact fields, these exact types, nothing else" — and the model's output is constrained to match that schema. Every single time. No parsing hacks, no regex extraction, no retry loops.
This is the difference between a demo and a product.
Think of structured output like a form with strict validation. Without structured output, you're handing someone a blank piece of paper and saying "write your address somewhere on here." With structured output, you're handing them a form with labeled fields, character limits, and required markers. They can only write in the boxes, and they can't submit until every required field is filled correctly. The LLM is the person filling out the form — the schema is the form itself.
Why This Matters for Frontend Engineers
You might think structured output is a backend concern. It's not. As a frontend engineer working with AI, you need structured output for:
- Rendering AI responses as UI components — not raw text, but cards, tables, forms, charts
- Type-safe AI data — your TypeScript types match exactly what the model returns
- Streaming partial UI — rendering fields as they arrive, not waiting for the full response
- Form generation — the model outputs a form schema, your app renders it
- Data extraction — pull structured data from unstructured user input
Without structured output, you're writing fragile parsing code that breaks on edge cases. With it, you get a typed object you can pass straight to your components.
How It Actually Works Under the Hood
When you send a schema to an LLM provider with structured output enabled, the model doesn't just "try harder" to output valid JSON. The provider modifies the token sampling process itself. At each step of generation, tokens that would make the output invalid according to the schema are masked out — their probability is set to zero. The model literally cannot produce invalid output.
This is called constrained decoding or grammar-guided generation. The provider maintains a state machine that tracks where in the JSON structure the model currently is. Writing an object? Only " (to start a key) or } (to close) are valid next tokens. Just wrote a string key? Only : is valid. Just wrote the value for the last required field? The object must close.
The result, when a provider implements this fully (as OpenAI's strict mode does): 100% schema conformance. Not 99.9%. Not "usually works." One hundred percent.
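To make the state-machine idea concrete, here is a toy character-level sketch for the tiny grammar `{"score": <digits>}`. Real systems operate on tokens and full JSON Schema grammars, but the masking principle is the same:

```typescript
// Toy constrained decoding: at each step, only characters that keep the
// output a valid prefix of {"score": <number>} are allowed. A provider
// does the same thing at the token level by zeroing invalid logits.
const LITERAL = '{"score": ';

function allowedNext(emitted: string): string[] {
  if (emitted.length < LITERAL.length) {
    return [LITERAL[emitted.length]]; // still inside the fixed skeleton
  }
  const body = emitted.slice(LITERAL.length);
  if (body.endsWith("}")) return []; // object closed: generation is done
  // First digit can't be 0 (valid JSON numbers); then digits or close.
  return body.length === 0 ? [..."123456789"] : [..."0123456789", "}"];
}

// A fake "model" that samples uniformly from the allowed set; a real
// model's probability distribution would be masked and renormalized.
function generate(): string {
  let out = "";
  for (;;) {
    const allowed = allowedNext(out);
    if (allowed.length === 0) return out;
    out += allowed[Math.floor(Math.random() * allowed.length)];
  }
}

JSON.parse(generate()); // always a valid { score: number }, by construction
```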
The Provider Landscape
Each major provider handles structured output differently. Let's break down the three approaches you'll encounter in production.
| Feature | OpenAI Strict Mode | Anthropic tool_use | Vercel AI SDK |
|---|---|---|---|
| Mechanism | json_schema in response_format with strict: true | Single tool definition where input_schema = desired output | generateObject() / streamObject() with Zod schema |
| Schema format | JSON Schema (auto-converted from Zod) | JSON Schema via tool input_schema | Zod schema (converts to JSON Schema internally) |
| Conformance | 100% guaranteed | Very high but not formally 100% | Depends on underlying provider |
| Streaming | Partial JSON via stream | Partial tool input via stream | streamObject() with partial Zod parsing |
| Refusal handling | First-class refusal field in response | Stop reason check | Built-in error types |
| Best for | Direct OpenAI API usage | Anthropic API usage | Provider-agnostic apps, Next.js |
OpenAI: Strict Mode
OpenAI's structured output uses response_format with type: "json_schema" and strict: true. This is the production approach — don't confuse it with legacy JSON mode, which only guarantees valid JSON but not schema adherence.
```typescript
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const QuizSchema = z.object({
  question: z.string(),
  options: z.array(z.string()).length(4),
  correctIndex: z.number().int().min(0).max(3),
  explanation: z.string(),
});

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "system",
      content: "Generate a quiz question about JavaScript closures.",
    },
  ],
  response_format: zodResponseFormat(QuizSchema, "quiz"),
});

const quiz = JSON.parse(response.choices[0].message.content!);
// quiz is guaranteed to match QuizSchema
```
A few things to notice here. The zodResponseFormat helper converts your Zod schema into JSON Schema automatically. The "quiz" string is just a name for the schema — it doesn't affect behavior. And strict: true is set internally by the helper.
What Strict Mode Supports
Not every JSON Schema feature works with strict mode. Here's what you can use:
- String, number, integer, boolean, null
- Objects with `properties` and `required` (all properties must be required)
- Arrays with `items`
- Enums (`z.enum(["a", "b", "c"])`)
- Union types via `anyOf`
- Recursive schemas (up to a depth limit)
And what you can't use:
- Optional properties — every field must be required (use a union with null instead)
- `additionalProperties` — not supported
- `minItems`, `maxItems` on arrays — not enforced structurally
- Complex `oneOf`/`allOf` patterns
OpenAI strict mode requires every object property to be marked as required. If you have an optional field, you can't just use z.optional(). Instead, make it required but nullable: z.string().nullable(). The model will output null when the field isn't applicable. This catches a lot of people — your Zod schema works fine for regular validation but fails when converted to strict mode JSON Schema if it has optional fields.
Handling Refusals
Sometimes the model refuses to generate content that matches your schema — maybe you asked for something that violates safety guidelines. OpenAI handles this with a dedicated refusal field:
```typescript
const message = response.choices[0].message;

if (message.refusal) {
  console.log("Model refused:", message.refusal);
} else {
  const data = JSON.parse(message.content!);
}
```
Always check for refusals before parsing. If the model refuses, content will be null and JSON.parse will throw.
Anthropic: The Tool-Use Trick
Anthropic doesn't have a dedicated structured output mode (yet). But there's a clever technique the community discovered: define a single tool whose input_schema is exactly the JSON structure you want. When the model "calls" that tool, its arguments are your structured data.
```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "output_quiz",
      description:
        "Output a structured quiz question. Always use this tool to respond.",
      input_schema: {
        type: "object" as const,
        properties: {
          question: { type: "string", description: "The quiz question" },
          options: {
            type: "array",
            items: { type: "string" },
            description: "Exactly 4 answer options",
          },
          correctIndex: {
            type: "integer",
            minimum: 0,
            maximum: 3,
            description: "0-indexed correct answer",
          },
          explanation: { type: "string", description: "Why the answer is correct" },
        },
        required: ["question", "options", "correctIndex", "explanation"],
      },
    },
  ],
  tool_choice: { type: "tool", name: "output_quiz" },
  messages: [
    {
      role: "user",
      content: "Generate a quiz question about JavaScript closures.",
    },
  ],
});

const toolBlock = response.content.find((block) => block.type === "tool_use");
if (toolBlock && toolBlock.type === "tool_use") {
  const quiz = toolBlock.input;
  // quiz matches your schema
}
```
The key trick is `tool_choice: { type: "tool", name: "output_quiz" }`. This forces the model to call that specific tool — it can't respond with plain text. The tool's `input_schema` acts as your structured output schema.
Why This Works
The model's tool calling mechanism already constrains output to match the tool's input schema. By defining a "tool" that doesn't actually do anything (you never execute it — you just read the arguments), you're hijacking that constraint mechanism for structured output. It's hacky but effective, and it's the recommended approach until Anthropic ships native structured output.
The Anthropic tool-use approach has a subtle advantage: tool input schemas support richer JSON Schema features than OpenAI's strict mode. You can use optional properties, minItems/maxItems, pattern validation, and other constraints. The trade-off is that conformance isn't formally 100% guaranteed — the model might occasionally produce output that doesn't perfectly match complex constraints. In practice, with Claude models, conformance is extremely high. But if you need absolute guarantees, validate with Zod on the client side.
Vercel AI SDK: The Frontend-First Approach
If you're building with Next.js (and you probably are), the Vercel AI SDK is the best way to work with structured output. It gives you generateObject() for one-shot generation and streamObject() for streaming — both with first-class Zod integration.
```typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const QuizSchema = z.object({
  question: z.string().describe("The quiz question"),
  options: z.array(z.string()).length(4).describe("4 answer options"),
  correctIndex: z.number().int().min(0).max(3),
  explanation: z.string(),
});

const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: QuizSchema,
  prompt: "Generate a quiz about JavaScript closures.",
});

// object is fully typed as z.infer<typeof QuizSchema>
// TypeScript knows: object.question is string, object.options is string[], etc.
```
Notice what happened: you wrote a Zod schema, passed it to generateObject, and got back a fully typed object. No JSON.parse. No type assertions. No validation step. The SDK handles schema conversion, API calls, response parsing, and type inference in one clean function call.
Streaming Structured Output
This is where the AI SDK really shines. streamObject() lets you render fields as they arrive — the user sees the UI building up in real time instead of staring at a loading spinner:
```typescript
import { streamObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const AnalysisSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  keyTopics: z.array(z.string()),
  score: z.number().min(0).max(100),
});

const { partialObjectStream } = streamObject({
  model: openai("gpt-4o"),
  schema: AnalysisSchema,
  prompt: `Analyze this customer review: "${review}"`,
});

for await (const partial of partialObjectStream) {
  // partial is a DeepPartial<Analysis>
  // Fields appear as the model generates them:
  //   First iteration: { summary: "The cust..." }
  //   Later: { summary: "The customer is...", sentiment: "positive" }
  //   Later: { summary: "...", sentiment: "positive", keyTopics: ["speed"] }
  renderPartialUI(partial);
}
```
The partialObjectStream yields increasingly complete objects. Each field appears as the model generates it. Your UI can render available fields immediately and show skeletons for fields that haven't arrived yet.
Using streamObject in a Next.js Route Handler
Here's the real-world pattern for a Next.js API route:
```typescript
// app/api/analyze/route.ts
import { streamObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const AnalysisSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  keyTopics: z.array(z.string()),
  confidence: z.number(),
});

export async function POST(req: Request) {
  const { text } = await req.json();

  const result = streamObject({
    model: openai("gpt-4o"),
    schema: AnalysisSchema,
    prompt: `Analyze: ${text}`,
  });

  return result.toTextStreamResponse();
}
```
And the client-side React component:
```tsx
// components/AnalysisCard.tsx
"use client";

import { experimental_useObject as useObject } from "ai/react";
import { z } from "zod";

const AnalysisSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  keyTopics: z.array(z.string()),
  confidence: z.number(),
});

export function AnalysisCard() {
  const { object, submit, isLoading, error } = useObject({
    api: "/api/analyze",
    schema: AnalysisSchema,
  });

  return (
    <div>
      <button onClick={() => submit({ text: "Great product!" })}>
        Analyze
      </button>

      {isLoading && !object && <Skeleton />}

      {object && (
        <div>
          {object.summary && <p>{object.summary}</p>}
          {object.sentiment && <Badge>{object.sentiment}</Badge>}
          {object.keyTopics?.map((topic) => (
            <Tag key={topic}>{topic}</Tag>
          ))}
          {object.confidence != null && (
            <Progress value={object.confidence} />
          )}
        </div>
      )}

      {error && <ErrorMessage>{error.message}</ErrorMessage>}
    </div>
  );
}
```
The useObject hook manages the streaming connection, parses partial JSON, and gives you a reactive object that updates as fields arrive. Each field is optional in the partial state (it's DeepPartial), so you conditionally render based on what's available. The UI fills in progressively — summary first, then sentiment, then topics, then the confidence score.
Zod: The Universal Schema Language
Here's something beautiful about this whole ecosystem: Zod is the bridge between frontend validation and AI structured output. The same schema you use to validate a form submission is the same schema you send to the LLM. One source of truth.
```typescript
// schemas/quiz.ts — shared between frontend and AI
import { z } from "zod";

export const QuizSchema = z.object({
  question: z.string().min(10).describe("Clear, specific question"),
  options: z
    .array(z.string().min(1))
    .length(4)
    .describe("Exactly 4 plausible options"),
  correctIndex: z
    .number()
    .int()
    .min(0)
    .max(3)
    .describe("0-indexed correct answer"),
  explanation: z
    .string()
    .min(20)
    .describe("Explains why the correct answer is right"),
});

export type Quiz = z.infer<typeof QuizSchema>;

// Used for AI generation:
const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: QuizSchema,
  prompt: "Generate a JavaScript quiz",
});

// Used for form validation:
const result = QuizSchema.safeParse(formData);
if (!result.success) {
  console.log(result.error.issues);
}

// Used for API validation:
export async function POST(req: Request) {
  const body = await req.json();
  const quiz = QuizSchema.parse(body); // throws on invalid
}
```
The .describe() calls are important — they become part of the JSON Schema sent to the model and help it understand what each field should contain. Think of descriptions as prompt engineering at the schema level.
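For intuition, the JSON Schema the model ultimately receives for `QuizSchema` looks roughly like this (shape illustrative; the exact output depends on the converter):

```typescript
// Where .describe() strings end up: as "description" fields the model
// reads alongside the type constraints. (Approximate conversion output.)
const quizJsonSchema = {
  type: "object",
  properties: {
    question: {
      type: "string",
      minLength: 10,
      description: "Clear, specific question",
    },
    options: {
      type: "array",
      items: { type: "string", minLength: 1 },
      minItems: 4,
      maxItems: 4,
      description: "Exactly 4 plausible options",
    },
    correctIndex: {
      type: "integer",
      minimum: 0,
      maximum: 3,
      description: "0-indexed correct answer",
    },
    explanation: {
      type: "string",
      minLength: 20,
      description: "Explains why the correct answer is right",
    },
  },
  required: ["question", "options", "correctIndex", "explanation"],
  additionalProperties: false,
};
```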
Schema Design Tips for AI
Not all Zod schemas work equally well with LLMs. Here are patterns that produce better results:
```typescript
// Use enums instead of open strings when you have a fixed set
const sentiment = z.enum(["positive", "negative", "neutral"]);

// Use .describe() liberally — it guides the model
const score = z
  .number()
  .min(0)
  .max(100)
  .describe("Confidence score from 0 to 100");

// Use discriminated unions for polymorphic output
const ContentBlock = z.discriminatedUnion("type", [
  z.object({
    type: z.literal("text"),
    content: z.string(),
  }),
  z.object({
    type: z.literal("code"),
    language: z.string(),
    code: z.string(),
  }),
  z.object({
    type: z.literal("quiz"),
    question: z.string(),
    options: z.array(z.string()),
    answer: z.number(),
  }),
]);

// Nullable over optional for OpenAI strict mode
const profile = z.object({
  name: z.string(),
  bio: z.string().nullable(), // model outputs null if not applicable
});
```
Error Handling in Production
Structured output isn't magic — things can still go wrong. Here's a production-grade error handling strategy:
```typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const ResultSchema = z.object({
  answer: z.string(),
  confidence: z.number(),
  sources: z.array(z.string()),
});

type GenerationResult =
  | { success: true; data: z.infer<typeof ResultSchema> }
  | { success: false; error: string; partial?: unknown };

async function generateWithFallback(
  prompt: string,
): Promise<GenerationResult> {
  try {
    const { object } = await generateObject({
      model: openai("gpt-4o"),
      schema: ResultSchema,
      prompt,
    });
    return { success: true, data: object };
  } catch (error) {
    if (error instanceof Error) {
      if (error.message.includes("refusal")) {
        return {
          success: false,
          error: "The model declined to generate this content.",
        };
      }
      if (error.message.includes("rate_limit")) {
        return {
          success: false,
          error: "Rate limited. Please try again shortly.",
        };
      }
    }
    return {
      success: false,
      error: "Failed to generate structured output.",
    };
  }
}
```
Schema Validation as a Safety Net
Even with structured output guarantees, add a Zod validation step. Why? Because you might switch providers, and not all providers guarantee 100% conformance. Defense in depth:
```typescript
async function safeGenerate<T extends z.ZodType>(
  schema: T,
  prompt: string,
): Promise<z.infer<T>> {
  const { object } = await generateObject({
    model: openai("gpt-4o"),
    schema,
    prompt,
  });

  const result = schema.safeParse(object);
  if (!result.success) {
    throw new Error(
      `Schema validation failed: ${result.error.issues.map((i) => i.message).join(", ")}`,
    );
  }
  return result.data;
}
```
This pattern costs almost nothing at runtime but saves you from silent data corruption when your AI provider changes behavior.
Real-World Patterns
Let's look at patterns you'll actually use in production.
Pattern 1: AI-Generated Form Schemas
The model generates a form definition, your app renders it dynamically:
```typescript
const FormFieldSchema = z.object({
  id: z.string(),
  label: z.string(),
  type: z.enum(["text", "email", "number", "select", "textarea"]),
  placeholder: z.string().nullable(),
  required: z.boolean(),
  options: z.array(z.string()).nullable(),
  validation: z
    .object({
      min: z.number().nullable(),
      max: z.number().nullable(),
      pattern: z.string().nullable(),
    })
    .nullable(),
});

const FormSchema = z.object({
  title: z.string(),
  description: z.string(),
  fields: z.array(FormFieldSchema),
  submitLabel: z.string(),
});

const { object: form } = await generateObject({
  model: openai("gpt-4o"),
  schema: FormSchema,
  prompt: "Create a job application form for a frontend engineer role.",
});

// form.fields is a typed array you can map over to render inputs
```
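Rendering that definition is an ordinary map over `form.fields`. Sketched here as a plain HTML-string renderer for testability; a React version would switch on `field.type` the same way. The `FormField` type mirrors `FormFieldSchema`:

```typescript
// Plain renderer for AI-generated form fields. Assumes the FormFieldSchema
// shape from above; select fields read their choices from `options`.
type FormField = {
  id: string;
  label: string;
  type: "text" | "email" | "number" | "select" | "textarea";
  placeholder: string | null;
  required: boolean;
  options: string[] | null;
};

function renderField(field: FormField): string {
  const req = field.required ? " required" : "";
  const ph = field.placeholder ? ` placeholder="${field.placeholder}"` : "";
  switch (field.type) {
    case "select": {
      const opts = (field.options ?? [])
        .map((o) => `<option>${o}</option>`)
        .join("");
      return `<label>${field.label}<select id="${field.id}"${req}>${opts}</select></label>`;
    }
    case "textarea":
      return `<label>${field.label}<textarea id="${field.id}"${req}${ph}></textarea></label>`;
    default:
      return `<label>${field.label}<input id="${field.id}" type="${field.type}"${req}${ph}></label>`;
  }
}
```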
Pattern 2: Structured Data Extraction
Pull structured data from unstructured text — invoices, emails, support tickets:
```typescript
const InvoiceSchema = z.object({
  vendor: z.string(),
  invoiceNumber: z.string(),
  date: z.string().describe("ISO 8601 date format"),
  lineItems: z.array(
    z.object({
      description: z.string(),
      quantity: z.number(),
      unitPrice: z.number(),
      total: z.number(),
    }),
  ),
  subtotal: z.number(),
  tax: z.number(),
  total: z.number(),
  currency: z.string().describe("3-letter ISO currency code"),
});

const { object: invoice } = await generateObject({
  model: openai("gpt-4o"),
  schema: InvoiceSchema,
  prompt: `Extract invoice data from this text:\n\n${rawText}`,
});
```
Pattern 3: UI Component Generation
The model outputs a component definition that your app renders:
```typescript
const ChartSchema = z.object({
  type: z.enum(["bar", "line", "pie", "scatter"]),
  title: z.string(),
  xAxis: z.object({
    label: z.string(),
    values: z.array(z.string()),
  }),
  yAxis: z.object({
    label: z.string(),
    values: z.array(z.number()),
  }),
  color: z.string().describe("CSS color value"),
});

const { object: chart } = await generateObject({
  model: openai("gpt-4o"),
  schema: ChartSchema,
  prompt: `Create a chart showing monthly revenue for 2025: ${data}`,
});

// Pass chart directly to your charting component:
// <Chart type={chart.type} data={chart} />
```
Streaming Structured Data Deep Dive
Streaming structured output is where the UX becomes genuinely impressive. Instead of a loading spinner followed by a wall of content, users see data materialize field by field. Here's how partial JSON parsing works:
When the model streams `{"summary": "The product`, you don't have valid JSON yet. But a partial JSON parser can tell that `summary` has started and that its current value is `"The product"`. As more tokens arrive — `{"summary": "The product is excellent", "sent` — the parser knows `summary` is complete and `sentiment` has started.
The Vercel AI SDK handles this internally with streamObject(). But understanding the mechanics helps you build better UIs:
```tsx
import type { DeepPartial } from "ai";

// Progressive rendering based on field availability
// (Analysis = z.infer<typeof AnalysisSchema> from earlier)
function StreamingAnalysis({ partial }: { partial: DeepPartial<Analysis> }) {
  return (
    <div>
      <div style={{ minHeight: "3rem" }}>
        {partial.summary ? (
          <p>{partial.summary}</p>
        ) : (
          <TextSkeleton lines={2} />
        )}
      </div>

      <div style={{ minHeight: "2rem" }}>
        {partial.sentiment ? (
          <SentimentBadge value={partial.sentiment} />
        ) : (
          <PillSkeleton />
        )}
      </div>

      <div style={{ minHeight: "2rem" }}>
        {partial.keyTopics && partial.keyTopics.length > 0 ? (
          <TopicList topics={partial.keyTopics.filter(Boolean) as string[]} />
        ) : (
          <TagsSkeleton count={3} />
        )}
      </div>

      <div style={{ minHeight: "1.5rem" }}>
        {partial.confidence != null ? (
          <ConfidenceBar value={partial.confidence} />
        ) : (
          <BarSkeleton />
        )}
      </div>
    </div>
  );
}
```
The fixed minHeight values prevent layout shifts as content appears — a CLS optimization that matters for Core Web Vitals even on AI-generated content.
Partial JSON parsing is trickier than it sounds. Consider this mid-stream state: `{"items": [{"name": "Wid`. Is `items` an array with one object whose `name` starts with "Wid"? Or is the stream about to produce "Widget" as the full value? The parser has to handle both cases — it gives you the partial value "Wid" for now and updates it as more tokens arrive. Libraries like `partial-json` and the AI SDK's built-in parser handle these edge cases, including nested objects, arrays mid-element, and escaped characters mid-string. You almost never need to implement this yourself, but knowing it exists helps you debug weird streaming behavior.
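The core trick can be sketched in a few lines: track which strings, objects, and arrays are still open, close them all, and parse the result. This toy version only handles truncation inside a string value or after a complete value; real parsers carry much heavier logic:

```typescript
// Toy partial-JSON reader: close every open structure, then JSON.parse.
// Not production-grade: mid-key truncation, escapes at the cut point,
// and dangling commas all need the extra handling real libraries have.
function readPartialJson(prefix: string): unknown {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (const ch of prefix) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  let completed = prefix;
  if (inString) completed += '"'; // finish the open string value
  while (closers.length) completed += closers.pop(); // close open scopes
  return JSON.parse(completed);
}

readPartialJson('{"summary": "The product');
// → { summary: "The product" }
readPartialJson('{"items": [{"name": "Wid');
// → { items: [{ name: "Wid" }] }
```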
Common Patterns and Gotchas
| What developers do | What they should do |
|---|---|
| Using legacy JSON mode (`response_format: { type: "json_object" }`) and assuming it matches their schema. JSON mode only guarantees valid JSON — the model could return any valid JSON object, like `{ "answer": 42 }` when you expected `{ "question": "...", "options": [...] }`. | Use strict mode with `json_schema` (`response_format: zodResponseFormat(schema, name)`), which constrains every field to match your schema. |
| Using `z.optional()` for fields the model might not fill. OpenAI strict mode requires every property to appear in the `required` array, so optional fields are not supported. | Use `z.string().nullable()` — the field stays required, and the model outputs `null` when a value is not applicable. |
| Skipping refusal handling and assuming `content` is always present. When a model refuses (safety filters, policy violations), `content` is `null` and `JSON.parse(null)` throws. Refusals are a normal, expected response type. | Check for refusal before parsing: `if (message.refusal)` handle it, else parse `content`. |
| Defining huge monolithic schemas with 20+ fields. Large schemas increase latency, reduce quality (the model has more constraints to satisfy simultaneously), and make streaming less useful (each field takes longer to arrive). | Break the output into smaller, focused schemas and compose them, or make separate calls. |
1. Always validate with Zod even after structured output — defense in depth against provider changes
2. Use `.describe()` on schema fields to guide the model — it reads these as instructions
3. Prefer nullable over optional for OpenAI strict mode compatibility
4. Set fixed dimensions on streaming UI containers to prevent layout shifts
5. Handle refusals as a first-class error type, not an edge case
6. Share Zod schemas between frontend validation and AI generation — single source of truth
7. Keep schemas focused and small — large schemas degrade quality and increase latency
Putting It All Together
Here's the mental framework for structured output in production:
1. Define your schema with Zod — use `.describe()`, prefer enums over open strings, make fields nullable, not optional
2. Choose your generation method — `generateObject` for one-shot, `streamObject` for progressive UI
3. Handle errors at every level — refusals, rate limits, schema validation failures, network errors
4. Build streaming UIs — render available fields immediately, show skeletons for pending fields, prevent layout shifts
5. Validate the output — even with structured output guarantees, run `safeParse` as a safety net
6. Share schemas — the same Zod schema validates forms, API inputs, and AI outputs
Structured output turns LLMs from unpredictable text generators into reliable data sources. Once you start thinking of AI responses as typed objects instead of strings, everything clicks — your components get cleaner, your error handling gets simpler, and your users get a polished experience that feels like magic.