Testing Philosophy and the Testing Trophy
The Dirty Secret of Most Test Suites
You've seen it before. A project with 95% code coverage, a green CI badge proudly displayed in the README, and a codebase that still ships bugs to production every week. How?
Because code coverage measures whether lines of code executed during tests. It says nothing about whether the tests actually verify anything useful. You can hit 100% coverage with tests that assert nothing meaningful — and many teams do exactly that.
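To make the gap between execution and verification concrete, here is a minimal sketch. The names `applyDiscount`, `coverageOnlyCheck`, and `behaviorCheck` are hypothetical and invented for illustration, and no test framework is assumed: each "test" is a plain function that throws on failure.

```typescript
// Hypothetical pricing helper with a deliberate bug:
// the discount is added instead of subtracted.
function applyDiscount(price: number, percentOff: number): number {
  return price + price * (percentOff / 100); // bug: should subtract
}

// "Test" 1: executes every line of applyDiscount (full line coverage)
// but asserts nothing, so it can never fail.
function coverageOnlyCheck(): void {
  applyDiscount(100, 10);
}

// "Test" 2: the same call plus an assertion on the result.
function behaviorCheck(): void {
  const result = applyDiscount(100, 10);
  if (result !== 90) {
    throw new Error(`expected 90, got ${result}`);
  }
}
```

Both functions produce identical coverage for `applyDiscount`. Only `behaviorCheck` fails against the buggy implementation, which is the point: coverage records that the line ran, not that anyone looked at the answer.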
The problem isn't that developers don't write tests. The problem is they write the wrong tests. Tests that are tightly coupled to implementation details. Tests that break when you refactor but pass when you introduce bugs. Tests that give you a warm fuzzy feeling while providing zero confidence that your app works.
Good testing isn't about quantity. It's about confidence per dollar spent.
Think of tests like smoke detectors in a building. You could install 500 smoke detectors in a single room and call your building "well protected." The coverage numbers look great. But if there are zero detectors in the kitchen, the server room, and the garage — the places where fires actually start — your building burns down despite having "excellent coverage." Smart testing means putting detectors where fires start, not where they're easiest to install.
The Testing Pyramid — and Why It's Wrong
In 2009, Mike Cohn introduced the testing pyramid in Succeeding with Agile. The idea was simple: write a lot of unit tests (fast, cheap, isolated), fewer integration tests (slower, touch multiple units), and very few end-to-end tests (slowest, most expensive, most brittle).
For years, this was gospel. The pyramid made sense in a world of server-rendered monoliths where "units" were standalone functions with clear inputs and outputs. But modern frontend development doesn't work that way.
Here's the uncomfortable truth about the pyramid applied to frontend code:
- Unit tests for React components are often meaningless. Testing that a component renders a button with the right class name tells you nothing about whether the user can actually complete a workflow.
- The "unit" boundary is artificial. A React component is not a standalone unit — it depends on hooks, context, state management, routing, and the DOM. Isolating it with mocks means you're testing a fiction.
- A large suite of passing unit tests creates false confidence. Each test passes in isolation, but nothing checks that the pieces work together. The integration points, where most bugs live, go completely untested.
The Testing Trophy
Kent C. Dodds proposed the testing trophy as an alternative to the pyramid. Instead of weighting toward unit tests, the trophy weights toward integration tests — and reshapes the entire testing philosophy.
The trophy flips the pyramid's priorities. The biggest layer isn't unit tests — it's integration tests. And it adds a layer the pyramid completely ignores: static analysis.
Why Integration Tests Win
Integration tests give you the best return on investment for one fundamental reason: they test the way users actually use your software.
When a user clicks "Add to Cart," they don't care whether your useCart hook correctly updates its internal state. They care that:
- The item appears in the cart
- The count updates in the header
- The price total changes
- They can proceed to checkout
An integration test renders the component tree, simulates a click, and checks that the UI reflects the right state. It exercises the hooks, the state management, the child components, the DOM updates — all the pieces working together. It tests the contract between components, not their internals.
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { ProductPage } from './ProductPage';

test('adding an item updates the cart count and total', async () => {
  const user = userEvent.setup();
  render(<ProductPage productId="abc-123" />);

  // Interact the way a user would: find the button by its accessible name.
  await user.click(screen.getByRole('button', { name: /add to cart/i }));

  // Assert on what the user sees, not on hook or store internals.
  expect(screen.getByText(/cart \(1\)/i)).toBeInTheDocument();
  expect(screen.getByText(/\$29\.99/)).toBeInTheDocument();
});
This single test covers the product display, the add-to-cart button, the cart state update, and the UI reflection: the integration points where bugs actually hide. An equivalent set of unit tests would need several separate tests, each mocking out the others, and would still miss the integration bugs.
Confidence vs Speed — The Real Tradeoff
Every testing approach sits on a spectrum between two competing values:
| Attribute | Unit Tests | Integration Tests | E2E Tests |
|---|---|---|---|
| Speed | Milliseconds | Seconds | Minutes |
| Confidence level | Low — tests isolated pieces | High — tests real interactions | Highest — tests the full system |
| Maintenance cost | Low per test, but high volume | Medium — but fewer tests needed | High — brittle, slow, flaky |
| What they catch | Logic errors in pure functions | Component interaction bugs, state bugs, rendering bugs | Full-flow regressions, auth bugs, cross-page flows |
| What they miss | Integration bugs, rendering bugs | Server-side issues, browser quirks | Very little, but too slow for a tight TDD loop |
| Best for | Utils, algorithms, data transforms | Components, hooks, features, forms | Critical user journeys only |
| Refactor resilience | Breaks easily if implementation-coupled | Survives refactors that preserve behavior | Survives anything that preserves user flow |
The key insight: integration tests occupy the sweet spot. They're fast enough to run on every commit, confident enough to catch real bugs, and resilient enough to survive refactors.
Unit tests are still valuable — but only for code that actually is a standalone unit. A pure formatCurrency(amount, locale) function is a perfect unit test candidate. A React component that uses three hooks and renders four children is not.
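As a rough illustration of what such a unit test candidate looks like, here is a sketch of `formatCurrency`. The article only names the function and its `(amount, locale)` parameters; the `Intl.NumberFormat` implementation, the extra `currency` parameter with its `'USD'` default, and the `assertEqual` helper are all assumptions made for this example.

```typescript
// Hypothetical implementation: the currency parameter and its default
// are assumptions, not part of the signature named in the text.
function formatCurrency(
  amount: number,
  locale: string,
  currency: string = "USD"
): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// Framework-free stand-in for an `expect(...).toBe(...)` assertion.
function assertEqual(actual: string, expected: string): void {
  if (actual !== expected) {
    throw new Error(`expected "${expected}", got "${actual}"`);
  }
}

// Pure input, pure output: no rendering, no mocks, no setup.
assertEqual(formatCurrency(29.99, "en-US"), "$29.99");
assertEqual(formatCurrency(1234.5, "en-US"), "$1,234.50");
assertEqual(formatCurrency(0, "en-US"), "$0.00");
```

Nothing here needs a DOM or a component tree, which is why a unit test is the right tool for it.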
What to Test: User Behavior, Not Implementation
This is the single most important shift in testing philosophy. Stop testing how your code works internally. Start testing what it does from the user's perspective.
Kent C. Dodds put it simply: "The more your tests resemble the way your software is used, the more confidence they can give you."
Here's what that means in practice:
Instead of this (testing implementation):
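import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import { test, expect, vi } from 'vitest';
import { Counter } from './Counter'; // assumed path

// Anti-pattern: replaces useState and asserts on the setter call.
test('calls setCount when button clicked', () => {
  const setCount = vi.fn();
  vi.spyOn(React, 'useState').mockReturnValue([0, setCount]);
  render(<Counter />);
  fireEvent.click(screen.getByText('Increment'));
  expect(setCount).toHaveBeenCalledWith(1);
});

Write this (testing behavior):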
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { test, expect } from 'vitest';
import { Counter } from './Counter'; // assumed path

test('increments the displayed count when button clicked', async () => {
  const user = userEvent.setup();
  render(<Counter />);
  expect(screen.getByText('Count: 0')).toBeInTheDocument();
  await user.click(screen.getByRole('button', { name: /increment/i }));
  expect(screen.getByText('Count: 1')).toBeInTheDocument();
});

The first test breaks when you switch from useState to useReducer. The second test survives because it tests what the user sees and does, which is what matters.
The guiding questions for every test:
- Can the user see or interact with this? If yes, test it through the rendered UI.
- Is this a pure function with clear inputs/outputs? If yes, unit test it.
- Am I mocking more than the network boundary? If yes, you're probably testing implementation details.
When NOT to Test
Here's something few testing tutorials tell you: some things aren't worth testing. Every test has a cost in writing time, maintenance time, CI time, and cognitive overhead. If a test doesn't provide meaningful confidence relative to its cost, skip it.
Don't test:
- Third-party library behavior. Don't test that Array.prototype.filter filters correctly or that React Router navigates. The library authors already tested that.
- Simple pass-through components. A component that just renders children with a CSS class doesn't need a test. It's a styled div.
- Generated code. GraphQL codegen output, Prisma types, API client code: test the code that uses them, not the generated code itself.
- Implementation details. Internal state shape, hook return values consumed only internally, private functions that only support the public API.
- Things TypeScript already catches. If your function takes a string and TypeScript enforces that at compile time, you don't need a runtime test asserting the argument is a string.
Always test:
- User-facing behavior. If a user can see it or interact with it, test it.
- Complex business logic. Pricing calculations, permission checks, data transformations with edge cases.
- Error states. What happens when the API fails? When the user enters invalid data? When the network drops?
- Accessibility. Can keyboard users navigate? Are ARIA attributes correct? Do screen readers announce changes?
- Critical paths. Login, checkout, onboarding, data submission — anything where a bug means lost revenue or lost users.
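For the "complex business logic" item, a table of edge cases is often the cheapest way to get coverage that actually verifies something. The sketch below uses an invented permission rule; `canEdit`, `User`, `DocumentMeta`, and the rule itself are hypothetical and only serve to show the shape of such a test.

```typescript
type Role = "viewer" | "editor" | "admin";
interface User { id: string; role: Role; suspended: boolean; }
interface DocumentMeta { ownerId: string; locked: boolean; }

// Hypothetical rule set, ordered from most to least restrictive.
function canEdit(user: User, doc: DocumentMeta): boolean {
  if (user.suspended) return false;        // suspended accounts can never edit
  if (user.role === "admin") return true;  // admins bypass locks and ownership
  if (doc.locked) return false;            // locked docs are read-only otherwise
  return user.role === "editor" || user.id === doc.ownerId;
}

// Each row is one edge case: inputs plus the expected decision.
const cases: Array<{ name: string; user: User; doc: DocumentMeta; expected: boolean }> = [
  { name: "admin on locked doc", user: { id: "a", role: "admin", suspended: false }, doc: { ownerId: "x", locked: true }, expected: true },
  { name: "suspended admin", user: { id: "a", role: "admin", suspended: true }, doc: { ownerId: "x", locked: false }, expected: false },
  { name: "editor on locked doc", user: { id: "e", role: "editor", suspended: false }, doc: { ownerId: "x", locked: true }, expected: false },
  { name: "viewer who owns the doc", user: { id: "v", role: "viewer", suspended: false }, doc: { ownerId: "v", locked: false }, expected: true },
  { name: "viewer on someone else's doc", user: { id: "v", role: "viewer", suspended: false }, doc: { ownerId: "x", locked: false }, expected: false },
];

for (const c of cases) {
  const actual = canEdit(c.user, c.doc);
  if (actual !== c.expected) {
    throw new Error(`${c.name}: expected ${c.expected}, got ${actual}`);
  }
}
```

In a real suite the loop would typically become a parameterized test, but the idea is the same: the rows document the edge cases, and adding a new one costs a single line.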
The Cost of Bad Tests
Bad tests aren't just useless — they're actively harmful. They cost you in two ways:
False positives (tests fail when the code is fine): These happen when tests are coupled to implementation details. You refactor for performance, every test turns red, and you spend a day updating tests that were "protecting" code that never had a bug. After enough false positives, teams start ignoring test failures — "oh, those tests always break, just update the snapshots."
False negatives (tests pass when the code is broken): These happen when tests mock too aggressively or test the wrong things. Everything is green, the team ships confidently, and users hit a bug that no test caught. This is worse than false positives because it creates false confidence.
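A small sketch of how aggressive mocking produces a false negative. Everything here is hypothetical: `calculateTaxCents`, `orderTotalCents`, and the unit mismatch between them are invented to show a bug that lives at the seam between two pieces of code.

```typescript
// Hypothetical tax module: returns tax in CENTS.
function calculateTaxCents(subtotalCents: number): number {
  return Math.round(subtotalCents * 0.08);
}

type TaxFn = (subtotalCents: number) => number;

// Hypothetical checkout code: wrongly assumes the tax function returns
// DOLLARS and converts again, inflating real tax by a factor of 100.
function orderTotalCents(subtotalCents: number, tax: TaxFn): number {
  return subtotalCents + tax(subtotalCents) * 100; // bug at the seam
}

// Over-mocked check: the stub returns dollars, matching the caller's
// wrong assumption, so the total looks correct.
const stubTax: TaxFn = () => 8;
const mockedTotal = orderTotalCents(10000, stubTax);

// Integrated check: the real collaborator exposes the mismatch.
const realTotal = orderTotalCents(10000, calculateTaxCents);

console.log(mockedTotal); // 10800, the value a correct checkout would produce
console.log(realTotal);   // 90000, what users would actually be charged
```

The stub encodes the same wrong assumption as the code under test, so a test built on it stays green. Only running the two real pieces together reveals the problem.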
| What developers do | Why it hurts | What they should do |
|---|---|---|
| Mock everything except the component under test | Over-mocking tests a fantasy version of your app. Real bugs live at the seams between components, and mocks hide those seams entirely. | Mock only the network boundary and let real components, hooks, and context run |
| Use snapshot tests as the primary testing strategy | Snapshots tell you something changed but not whether the change is correct. Teams update snapshots reflexively without reviewing diffs, turning snapshots into rubber stamps. | Use explicit assertions on specific elements and text content |
| Assert directly on a component's internal state values | Internal state is an implementation detail. Asserting state shape locks you into one implementation. Assert on the UI result of that state instead. | Assert on rendered output that the user can see |
| Aim for 100% code coverage as a quality goal | Coverage measures execution, not verification. The last 10% of coverage usually requires testing implementation details and edge cases that will never happen in production, all at a high maintenance cost. | Aim for high confidence in critical paths and accept lower coverage in trivial code |
The Testing Trophy Layers in Practice
Let's make each layer concrete with real examples of what belongs where.
Static Analysis
This is your first line of defense, and it costs nothing at runtime. TypeScript and ESLint catch entire categories of bugs before a single test runs, while Prettier keeps formatting consistent so that reviews focus on behavior.
function calculateDiscount(price: number, discount: number): number {
  return price * (1 - discount);
}

calculateDiscount("29.99", 0.1);
// TypeScript error: Argument of type 'string' is not
// assignable to parameter of type 'number'
That's a bug caught at zero runtime cost. No test needed. Static analysis also catches unused variables, unreachable code, incorrect hook usage (eslint-plugin-react-hooks), and accessibility issues (eslint-plugin-jsx-a11y).
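Argument types are only one category. As a further sketch, the compiler can also check that every member of a union is handled, using the common `never` exhaustiveness pattern. `PaymentStatus` and `statusLabel` are hypothetical names used only for this example.

```typescript
type PaymentStatus = "pending" | "paid" | "refunded";

function statusLabel(status: PaymentStatus): string {
  switch (status) {
    case "pending":
      return "Awaiting payment";
    case "paid":
      return "Paid";
    case "refunded":
      return "Refunded";
    default: {
      // If a new member is added to PaymentStatus without a matching case,
      // `status` is no longer `never` here and this line fails to compile.
      const unreachable: never = status;
      throw new Error(`Unhandled status: ${String(unreachable)}`);
    }
  }
}
```

Adding a fourth status such as "disputed" to the union would turn the `never` assignment into a compile error, so the missing branch is caught without writing a test for it.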
Unit Tests
Reserve these for pure logic with clear boundaries:
import { describe, it, expect } from 'vitest';
import { formatReadingTime } from './format';

describe('formatReadingTime', () => {
  it('formats minutes under 60 as "X min read"', () => {
    expect(formatReadingTime(5)).toBe('5 min read');
  });

  it('formats exactly 60 minutes as "1 hr read"', () => {
    expect(formatReadingTime(60)).toBe('1 hr read');
  });

  it('formats over 60 minutes with hours and minutes', () => {
    expect(formatReadingTime(90)).toBe('1 hr 30 min read');
  });

  it('rounds fractional minutes down', () => {
    expect(formatReadingTime(7.8)).toBe('7 min read');
  });
});
Clean inputs, clear outputs, no DOM, no React, no side effects. This is where unit tests shine.
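The implementation of `formatReadingTime` is not shown in this article. One possible version that satisfies the four cases above might look like the sketch below; the body is an assumption inferred from the test expectations, not the actual source.

```typescript
// Hypothetical implementation inferred from the test cases.
function formatReadingTime(minutes: number): string {
  const whole = Math.floor(minutes); // fractional minutes round down: 7.8 -> 7
  if (whole < 60) {
    return `${whole} min read`;
  }
  const hours = Math.floor(whole / 60);
  const rest = whole % 60;
  // Omit the minutes part when the duration is an exact number of hours.
  return rest === 0 ? `${hours} hr read` : `${hours} hr ${rest} min read`;
}
```

Behavior for negative or non-finite input is left unspecified here, since the tests in the article do not cover it.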
Integration Tests
The bulk of your test suite. Render real component trees, interact like a user:
import { render, screen, waitFor } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { http, HttpResponse } from 'msw';
import { server } from '../mocks/server';
import { LoginPage } from './LoginPage';

test('shows validation error for invalid email', async () => {
  const user = userEvent.setup();
  render(<LoginPage />);

  await user.type(screen.getByLabelText(/email/i), 'not-an-email');
  await user.click(screen.getByRole('button', { name: /sign in/i }));

  expect(screen.getByRole('alert')).toHaveTextContent('Enter a valid email');
});

test('redirects to dashboard on successful login', async () => {
  const user = userEvent.setup();
  // Override the login endpoint for this test only; everything else runs for real.
  server.use(
    http.post('/api/auth/login', () => {
      return HttpResponse.json({ token: 'abc123' });
    })
  );
  render(<LoginPage />);

  await user.type(screen.getByLabelText(/email/i), 'dev@example.com');
  await user.type(screen.getByLabelText(/password/i), 'securepass');
  await user.click(screen.getByRole('button', { name: /sign in/i }));

  await waitFor(() => {
    expect(window.location.pathname).toBe('/dashboard');
  });
});
Notice what's mocked: only the HTTP endpoint (via MSW). The component, its form logic, its validation, its routing — all real. That's the integration test sweet spot.
E2E Tests
Reserve these for the critical paths where failure means real business damage:
import { test, expect } from '@playwright/test';

test('new user can complete onboarding', async ({ page }) => {
  await page.goto('/signup');
  await page.getByLabel('Email').fill('newuser@example.com');
  await page.getByLabel('Password').fill('StrongP@ss1');
  await page.getByRole('button', { name: 'Create account' }).click();

  await expect(page.getByText('Welcome! Let\'s get started')).toBeVisible();

  await page.getByRole('button', { name: 'Choose your path' }).click();
  await page.getByText('Frontend Engineering').click();
  await page.getByRole('button', { name: 'Start learning' }).click();

  await expect(page).toHaveURL(/\/courses\//);
});
You might have 500 integration tests and only 20 E2E tests. That ratio is healthy.
The Pyramid vs The Trophy — Side by Side
| Aspect | Testing Pyramid | Testing Trophy |
|---|---|---|
| Emphasis | Lots of unit tests at the base | Integration tests as the largest layer |
| Static analysis | Not included | Foundation of the trophy — catches bugs before runtime |
| Philosophy | Isolate everything, mock dependencies | Test like a user, mock only network boundaries |
| Ideal codebase | Backend services with pure functions | Frontend apps with complex component interactions |
| Refactor safety | Low — tests coupled to internals break on refactors | High — behavior-focused tests survive refactors |
| Confidence | Many tests, modest confidence | Fewer tests, higher confidence per test |
| False positive rate | High — implementation coupling | Low — behavior coupling |
| Mocking strategy | Mock everything except the unit under test | Mock only the network boundary |
| Coverage goal | High line coverage | High use-case coverage |
The pyramid isn't wrong — it was right for its era. But frontend code in 2025 is component-based, state-driven, and deeply integrated. The trophy matches how modern frontend apps are built and how users interact with them.
Writing Your First Test the Right Way
Here's the mental checklist every time you sit down to write a test:
- What user behavior am I testing? Start with a user story: "the user types a search query and sees matching results."
- What's the smallest component tree that exercises this behavior? Render that tree — don't render the entire app, but don't isolate a single component either.
- What do I need to mock? Only the network boundary. Use MSW for API mocking. Let everything else run for real.
- How would a user interact with this? Use userEvent (not fireEvent), which simulates real user interactions including focus, keyboard events, and pointer events.
- What does the user see when it works? Assert on visible text, accessible roles, and ARIA attributes, not internal state, CSS classes, or DOM structure.
1. The testing trophy prioritizes integration tests over unit tests because they give the best confidence per dollar for frontend code.
2. Test user behavior, not implementation details. Your tests should not break when you refactor internal code.
3. Mock only the network boundary and let real components, hooks, context, and state run in your tests.
4. Static analysis (TypeScript, ESLint) is the foundation: it catches entire categories of bugs at zero runtime cost.
5. Not everything needs a test. Skip trivial pass-through components, generated code, and things TypeScript already catches.
6. False confidence from bad tests is worse than no tests. A passing test suite means nothing if the tests verify the wrong things.