Testing Philosophy and the Testing Trophy

Intermediate, 16 min read

The Dirty Secret of Most Test Suites

You've seen it before. A project with 95% code coverage, a green CI badge proudly displayed in the README, and a codebase that still ships bugs to production every week. How?

Because code coverage measures whether lines of code executed during tests. It says nothing about whether the tests actually verify anything useful. You can hit 100% coverage with tests that assert nothing meaningful — and many teams do exactly that.
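To make the gap between execution and verification concrete, here is a minimal sketch. The applyDiscount helper and both "tests" are hypothetical, written as plain functions so the example runs without a test runner. Both checks execute every line of applyDiscount, so a coverage tool would report the same number for each, but only one of them can ever fail.

```typescript
// Hypothetical pricing helper with a deliberate bug: it adds the discount
// instead of subtracting it.
function applyDiscount(price: number, percent: number): number {
  return price + price * (percent / 100);
}

// Runs every line of applyDiscount, so line coverage is 100%,
// but nothing is checked: this "test" can never fail.
function coverageOnlyTest(): boolean {
  applyDiscount(100, 10);
  return true;
}

// Same call and same coverage, but the result is actually verified.
function behaviorTest(): boolean {
  return applyDiscount(100, 10) === 90;
}

console.log(coverageOnlyTest()); // true: green, and the bug ships
console.log(behaviorTest());     // false: the bug is caught
```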

The problem isn't that developers don't write tests. The problem is they write the wrong tests. Tests that are tightly coupled to implementation details. Tests that break when you refactor but pass when you introduce bugs. Tests that give you a warm fuzzy feeling while providing zero confidence that your app works.

Good testing isn't about quantity. It's about confidence per dollar spent.

Mental Model

Think of tests like smoke detectors in a building. You could install 500 smoke detectors in a single room and call your building "well protected." The coverage numbers look great. But if there are zero detectors in the kitchen, the server room, and the garage — the places where fires actually start — your building burns down despite having "excellent coverage." Smart testing means putting detectors where fires start, not where they're easiest to install.

The Testing Pyramid — and Why It's Wrong

In 2009, Mike Cohn introduced the testing pyramid in Succeeding with Agile. The idea was simple: write a lot of unit tests (fast, cheap, isolated), fewer integration tests (slower, touch multiple units), and very few end-to-end tests (slowest, most expensive, most brittle).

For years, this was gospel. The pyramid made sense in a world of server-rendered monoliths where "units" were standalone functions with clear inputs and outputs. But modern frontend development doesn't work that way.

Here's the uncomfortable truth about the pyramid applied to frontend code:

Unit tests for React components are often meaningless. Testing that a component renders a button with the right class name tells you nothing about whether the user can actually complete a workflow.
The "unit" boundary is artificial. A React component is not a standalone unit — it depends on hooks, context, state management, routing, and the DOM. Isolating it with mocks means you're testing a fiction.
Lots of passing unit tests creates false confidence. Each test passes in isolation, but the pieces don't work together. The integration points — where most bugs live — are completely untested.

Quiz

A React component has 100% unit test coverage. All tests pass. You refactor the state management from useState to useReducer with identical behavior. What happens?


The Testing Trophy

Kent C. Dodds proposed the testing trophy as an alternative to the pyramid. Instead of weighting toward unit tests, the trophy weights toward integration tests — and reshapes the entire testing philosophy.

The trophy has four layers, from base to top: static analysis, unit tests, integration tests, and end-to-end tests.

Layer 1 of 4: Static Analysis

TypeScript, ESLint, Prettier. Catches typos, type errors, and code style issues before tests even run. Instant feedback, zero runtime cost.

The trophy flips the pyramid's priorities. The biggest layer isn't unit tests — it's integration tests. And it adds a layer the pyramid completely ignores: static analysis.

Why Integration Tests Win

Integration tests give you the best return on investment for one fundamental reason: they test the way users actually use your software.

When a user clicks "Add to Cart," they don't care whether your useCart hook correctly updates its internal state. They care that:

The item appears in the cart
The count updates in the header
The price total changes
They can proceed to checkout

An integration test renders the component tree, simulates a click, and checks that the UI reflects the right state. It exercises the hooks, the state management, the child components, the DOM updates — all the pieces working together. It tests the contract between components, not their internals.

import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { ProductPage } from './ProductPage';

test('adding an item updates the cart count and total', async () => {
  const user = userEvent.setup();
  render(<ProductPage productId="abc-123" />);

  await user.click(screen.getByRole('button', { name: /add to cart/i }));

  expect(screen.getByText(/cart \(1\)/i)).toBeInTheDocument();
  expect(screen.getByText(/\$29\.99/)).toBeInTheDocument();
});

This single test covers the product display, the add-to-cart button, the cart state update, and the UI reflection — all the integration points where bugs actually hide. An equivalent set of unit tests would need 5+ separate tests, each mocking out the others, and you'd still miss the integration bugs.

Quiz

You need to test a search feature with autocomplete. Which approach gives you the most confidence with the least maintenance cost?


Confidence vs Speed — The Real Tradeoff

Every testing approach sits on a spectrum between two competing values:

Attribute	Unit Tests	Integration Tests	E2E Tests
Speed	Milliseconds	Seconds	Minutes
Confidence level	Low — tests isolated pieces	High — tests real interactions	Highest — tests the full system
Maintenance cost	Low per test, but high volume	Medium — but fewer tests needed	High — brittle, slow, flaky
What they catch	Logic errors in pure functions	Component interaction bugs, state bugs, rendering bugs	Full-flow regressions, auth bugs, cross-page flows
What they miss	Integration bugs, rendering bugs	Server-side issues, browser quirks	Very little, but too slow for TDD
Best for	Utils, algorithms, data transforms	Components, hooks, features, forms	Critical user journeys only
Refactor resilience	Breaks easily if implementation-coupled	Survives refactors that preserve behavior	Survives anything that preserves user flow

The key insight: integration tests occupy the sweet spot. They're fast enough to run on every commit, realistic enough to catch real bugs, and resilient enough to survive refactors.

Unit tests are still valuable — but only for code that actually is a standalone unit. A pure formatCurrency(amount, locale) function is a perfect unit test candidate. A React component that uses three hooks and renders four children is not.
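As a rough illustration of what such a standalone unit could look like, here is one possible formatCurrency built on the standard Intl.NumberFormat API. The third currency parameter and its 'USD' default are assumptions added for the sketch, since a locale alone does not determine a currency.

```typescript
// Sketch of a pure formatter: same inputs always give the same output,
// with no DOM, no React, and no side effects.
// Assumption: `currency` and its 'USD' default are not part of the
// signature named in the text; they are added so the function is complete.
function formatCurrency(
  amount: number,
  locale: string,
  currency: string = 'USD'
): string {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(amount);
}

console.log(formatCurrency(29.99, 'en-US'));  // "$29.99"
console.log(formatCurrency(1234.5, 'en-US')); // "$1,234.50"
```

A function like this needs no mocks at all, which is a good signal that a unit test is the right tool.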

What to Test: User Behavior, Not Implementation

This is the single most important shift in testing philosophy. Stop testing how your code works internally. Start testing what it does from the user's perspective.

Kent C. Dodds put it simply: "The more your tests resemble the way your software is used, the more confidence they can give you."

Here's what that means in practice:

Instead of this (testing implementation):

test('calls setCount when button clicked', () => {
  const setCount = vi.fn();
  vi.spyOn(React, 'useState').mockReturnValue([0, setCount]);
  render(<Counter />);
  fireEvent.click(screen.getByText('Increment'));
  expect(setCount).toHaveBeenCalledWith(1);
});

Write this (testing behavior):

test('increments the displayed count when button clicked', async () => {
  const user = userEvent.setup();
  render(<Counter />);
  expect(screen.getByText('Count: 0')).toBeInTheDocument();
  await user.click(screen.getByRole('button', { name: /increment/i }));
  expect(screen.getByText('Count: 1')).toBeInTheDocument();
});

The first test breaks when you switch from useState to useReducer. The second test survives because it tests what the user sees and does — which is what matters.
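The same idea can be sketched without React. In the hypothetical example below, two counters expose an identical public surface (increment and render) but store state differently, one in a plain variable and one through a reducer. A check written against the surface passes for both, which is what refactor resilience means in practice. All names here are illustrative.

```typescript
// The only thing a behavior-focused check is allowed to touch.
interface Counter {
  increment(): void;
  render(): string;
}

// Implementation A: state in a plain variable (think useState).
function createStateCounter(): Counter {
  let count = 0;
  return {
    increment: () => { count += 1; },
    render: () => `Count: ${count}`,
  };
}

// Implementation B: state driven by a reducer (think useReducer).
type CounterAction = { type: 'increment' };

function counterReducer(state: { count: number }, action: CounterAction): { count: number } {
  return action.type === 'increment' ? { count: state.count + 1 } : state;
}

function createReducerCounter(): Counter {
  let state = { count: 0 };
  return {
    increment: () => { state = counterReducer(state, { type: 'increment' }); },
    render: () => `Count: ${state.count}`,
  };
}

// Behavior check: knows nothing about how the count is stored.
function incrementsDisplayedCount(makeCounter: () => Counter): boolean {
  const counter = makeCounter();
  const before = counter.render();
  counter.increment();
  return before === 'Count: 0' && counter.render() === 'Count: 1';
}

console.log(incrementsDisplayedCount(createStateCounter));   // true
console.log(incrementsDisplayedCount(createReducerCounter)); // true
```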

The guiding questions for every test:

Can the user see or interact with this? If yes, test it through the rendered UI.
Is this a pure function with clear inputs/outputs? If yes, unit test it.
Am I mocking more than the network boundary? If yes, you're probably testing implementation details.

Quiz

Which of these test assertions is testing implementation details?


When NOT to Test

Here's something few testing tutorials tell you: some things aren't worth testing. Writing tests has a cost: writing time, maintenance time, CI time, cognitive overhead. If a test doesn't provide meaningful confidence relative to its cost, skip it.

Don't test:

Third-party library and built-in behavior. Don't test that Array.prototype.filter filters correctly or that React Router navigates. The language and library authors already tested that.
Simple pass-through components. A component that just renders children with a CSS class doesn't need a test. It's a styled div.
Generated code. GraphQL codegen output, Prisma types, API client code — test the code that uses them, not the generated code itself.
Implementation details. Internal state shape, hook return values consumed only internally, private functions that only support the public API.
Things TypeScript already catches. If your function takes a string and TypeScript enforces that at compile time, you don't need a runtime test asserting the argument is a string.

Always test:

User-facing behavior. If a user can see it or interact with it, test it.
Complex business logic. Pricing calculations, permission checks, data transformations with edge cases.
Error states. What happens when the API fails? When the user enters invalid data? When the network drops?
Accessibility. Can keyboard users navigate? Are ARIA attributes correct? Do screen readers announce changes?
Critical paths. Login, checkout, onboarding, data submission — anything where a bug means lost revenue or lost users.

The Cost of Bad Tests

Bad tests aren't just useless — they're actively harmful. They cost you in two ways:

False positives (tests fail when the code is fine): These happen when tests are coupled to implementation details. You refactor for performance, every test turns red, and you spend a day updating tests that were "protecting" code that never had a bug. After enough false positives, teams start ignoring test failures — "oh, those tests always break, just update the snapshots."

False negatives (tests pass when the code is broken): These happen when tests mock too aggressively or test the wrong things. Everything is green, the team ships confidently, and users hit a bug that no test caught. This is worse than false positives because it creates false confidence.
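Here is a small sketch of how an over-eager mock produces a false negative. Everything in it is hypothetical: a price lookup that returns cents, and a formatter that wrongly treats the value as dollars. The stubbed check passes because the stub quietly returns dollars, while the check that uses the real dependency exposes the mismatch at the seam.

```typescript
// Real dependency (hypothetical): prices are stored in cents.
function getPriceInCents(sku: string): number {
  return sku === 'abc-123' ? 2999 : 0;
}

// Consumer with a seam bug: it treats the looked-up value as dollars.
function formatTotal(
  sku: string,
  quantity: number,
  getPrice: (sku: string) => number
): string {
  return `$${(getPrice(sku) * quantity).toFixed(2)}`;
}

// Over-mocked check: the stub returns dollars, so the unit mismatch is invisible.
const mockedTotal = formatTotal('abc-123', 1, () => 29.99);

// Integrated check: the real dependency runs, and the bug shows up.
const integratedTotal = formatTotal('abc-123', 1, getPriceInCents);

console.log(mockedTotal);     // "$29.99": green, but misleading
console.log(integratedTotal); // "$2999.00": the bug users would actually see
```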

What developers do	Why it backfires	What they should do
Mocking everything except the component under test	Over-mocking tests a fantasy version of your app. Real bugs live at the seams between components, and mocks hide those seams entirely.	Mock only the network boundary and let real components, hooks, and context run
Using snapshot tests as your primary testing strategy	Snapshots tell you something changed but not whether the change is correct. Teams update snapshots reflexively without reviewing diffs, turning snapshots into rubber stamps.	Use explicit assertions on specific elements and text content
Asserting directly on internal state by reaching into component internals	Internal state is an implementation detail. Asserting state shape locks you into one implementation. Assert on the UI result of that state instead.	Assert on rendered output that the user can see
Aiming for 100% code coverage as a quality goal	Coverage measures execution, not verification. The last 10% of coverage usually requires testing implementation details and edge cases that rarely happen in production, all at a high maintenance cost.	Aim for high confidence in critical paths, accept lower coverage in trivial code

The Testing Trophy Layers in Practice

Let's make each layer concrete with real examples of what belongs where.

Static Analysis

This is your first line of defense, and it costs nothing at runtime. TypeScript and ESLint catch entire categories of bugs before a single test runs, and Prettier keeps formatting consistent.

function calculateDiscount(price: number, discount: number): number {
  return price * (1 - discount);
}

calculateDiscount("29.99", 0.1);
// TypeScript error: Argument of type 'string' is not
// assignable to parameter of type 'number'

That's a bug caught at zero runtime cost. No test needed. Static analysis also catches unused variables, unreachable code, incorrect hook usage (eslint-plugin-react-hooks), and accessibility issues (eslint-plugin-jsx-a11y).
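Static analysis can also guard against a forgotten case, not just a wrong argument type. The sketch below assumes a hypothetical RequestState union. If someone later adds a fourth status and forgets to handle it, the assignment to never stops compiling, so the gap is reported before any test runs.

```typescript
// Hypothetical UI state for a data-fetching component.
type RequestState =
  | { status: 'loading' }
  | { status: 'success'; items: string[] }
  | { status: 'error'; message: string };

function describeState(state: RequestState): string {
  switch (state.status) {
    case 'loading':
      return 'Loading...';
    case 'success':
      return `${state.items.length} results`;
    case 'error':
      return `Error: ${state.message}`;
    default: {
      // If a new status is added to RequestState and not handled above,
      // `state` is no longer `never` here and this line fails to compile.
      const unreachable: never = state;
      return unreachable;
    }
  }
}

console.log(describeState({ status: 'success', items: ['a', 'b'] })); // "2 results"
```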

Unit Tests

Reserve these for pure logic with clear boundaries:

import { describe, it, expect } from 'vitest';
import { formatReadingTime } from './format';

describe('formatReadingTime', () => {
  it('formats minutes under 60 as "X min read"', () => {
    expect(formatReadingTime(5)).toBe('5 min read');
  });

  it('formats exactly 60 minutes as "1 hr read"', () => {
    expect(formatReadingTime(60)).toBe('1 hr read');
  });

  it('formats over 60 minutes with hours and minutes', () => {
    expect(formatReadingTime(90)).toBe('1 hr 30 min read');
  });

  it('rounds fractional minutes down', () => {
    expect(formatReadingTime(7.8)).toBe('7 min read');
  });
});

Clean inputs, clear outputs, no DOM, no React, no side effects. This is where unit tests shine.
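The tests above don't show the function they exercise. One possible implementation that satisfies all four cases is sketched below. It is an assumption about what './format' might contain, not the actual module.

```typescript
// Sketch of a formatter matching the four test cases above:
// fractional minutes round down, and whole hours omit the minutes part.
function formatReadingTime(minutes: number): string {
  const wholeMinutes = Math.floor(minutes);

  if (wholeMinutes < 60) {
    return `${wholeMinutes} min read`;
  }

  const hours = Math.floor(wholeMinutes / 60);
  const remainder = wholeMinutes % 60;

  return remainder === 0
    ? `${hours} hr read`
    : `${hours} hr ${remainder} min read`;
}

console.log(formatReadingTime(90)); // "1 hr 30 min read"
```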

Integration Tests

The bulk of your test suite. Render real component trees, interact like a user:

import { render, screen, waitFor } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { http, HttpResponse } from 'msw';
import { server } from '../mocks/server';
import { LoginPage } from './LoginPage';

test('shows validation error for invalid email', async () => {
  const user = userEvent.setup();
  render(<LoginPage />);

  await user.type(screen.getByLabelText(/email/i), 'not-an-email');
  await user.click(screen.getByRole('button', { name: /sign in/i }));

  expect(screen.getByRole('alert')).toHaveTextContent('Enter a valid email');
});

test('redirects to dashboard on successful login', async () => {
  const user = userEvent.setup();
  server.use(
    http.post('/api/auth/login', () => {
      return HttpResponse.json({ token: 'abc123' });
    })
  );

  render(<LoginPage />);

  await user.type(screen.getByLabelText(/email/i), 'dev@example.com');
  await user.type(screen.getByLabelText(/password/i), 'securepass');
  await user.click(screen.getByRole('button', { name: /sign in/i }));

  await waitFor(() => {
    expect(window.location.pathname).toBe('/dashboard');
  });
});

Notice what's mocked: only the HTTP endpoint (via MSW). The component, its form logic, its validation, its routing — all real. That's the integration test sweet spot.
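The same boundary can be sketched without React or MSW, which makes the shape of the idea easier to see. In this hypothetical example, the validation and submit logic are real, and the only replaced piece is the function that talks to the network. The names submitLogin, PostJson, and fakePostJson, and the 'Sign-in failed' message, are all assumptions made for illustration.

```typescript
// Hypothetical transport type: the only thing a test replaces.
type PostJson = (url: string, body: unknown) => Promise<{ ok: boolean }>;

interface LoginOutcome {
  error?: string;
  redirectTo?: string;
}

// Real validation logic, deliberately simple for the sketch.
function isValidEmail(email: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Real submit flow: validation runs before the network is touched.
async function submitLogin(
  email: string,
  password: string,
  postJson: PostJson
): Promise<LoginOutcome> {
  if (!isValidEmail(email)) {
    return { error: 'Enter a valid email' };
  }
  const response = await postJson('/api/auth/login', { email, password });
  return response.ok ? { redirectTo: '/dashboard' } : { error: 'Sign-in failed' };
}

// Fake network boundary that records calls, standing in for an MSW handler.
const requestedUrls: string[] = [];
const fakePostJson: PostJson = async (url) => {
  requestedUrls.push(url);
  return { ok: true };
};
```

Because only the transport is faked, a bug in the validation or in the order of operations would still surface, which is the property the MSW-based tests above rely on.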

E2E Tests

Reserve these for the critical paths where failure means real business damage:

import { test, expect } from '@playwright/test';

test('new user can complete onboarding', async ({ page }) => {
  await page.goto('/signup');
  await page.getByLabel('Email').fill('newuser@example.com');
  await page.getByLabel('Password').fill('StrongP@ss1');
  await page.getByRole('button', { name: 'Create account' }).click();

  await expect(page.getByText('Welcome! Let\'s get started')).toBeVisible();

  await page.getByRole('button', { name: 'Choose your path' }).click();
  await page.getByText('Frontend Engineering').click();
  await page.getByRole('button', { name: 'Start learning' }).click();

  await expect(page).toHaveURL(/\/courses\//);
});

You might have 500 integration tests and only 20 E2E tests. That ratio is healthy.

Quiz

You're testing a checkout flow. Where should the test live in the testing trophy?


The Pyramid vs The Trophy — Side by Side

Aspect	Testing Pyramid	Testing Trophy
Emphasis	Lots of unit tests at the base	Integration tests as the largest layer
Static analysis	Not included	Foundation of the trophy — catches bugs before runtime
Philosophy	Isolate everything, mock dependencies	Test like a user, mock only network boundaries
Ideal codebase	Backend services with pure functions	Frontend apps with complex component interactions
Refactor safety	Low — tests coupled to internals break on refactors	High — behavior-focused tests survive refactors
Confidence	Many tests, modest confidence	Fewer tests, higher confidence per test
False positive rate	High — implementation coupling	Low — behavior coupling
Mocking strategy	Mock everything except the unit under test	Mock only the network boundary
Coverage goal	High line coverage	High use-case coverage

The pyramid isn't wrong everywhere. It was right for its era, and it still fits backend services built from pure functions. But frontend code in 2025 is component-based, state-driven, and deeply integrated. The trophy matches how modern frontend apps are built and how users interact with them.

Quiz

Your team writes tests that mock useContext, useState, and all child components before testing a parent component. A teammate refactors the parent from class component to function component with identical behavior. What happens to the test suite?


Writing Your First Test the Right Way

Here's the mental checklist every time you sit down to write a test:

What user behavior am I testing? Start with a user story: "the user types a search query and sees matching results."
What's the smallest component tree that exercises this behavior? Render that tree — don't render the entire app, but don't isolate a single component either.
What do I need to mock? Only the network boundary. Use MSW for API mocking. Let everything else run for real.
How would a user interact with this? Use userEvent (not fireEvent) — it simulates real user interactions including focus, keyboard events, and pointer events.
What does the user see when it works? Assert on visible text, accessible roles, and ARIA attributes — not internal state, CSS classes, or DOM structure.

Key Rules

1. The testing trophy prioritizes integration tests over unit tests: they give the best confidence per dollar for frontend code
2. Test user behavior, not implementation details: your tests should not break when you refactor internal code
3. Mock only the network boundary: let real components, hooks, context, and state run in your tests
4. Static analysis (TypeScript, ESLint) is the foundation: it catches entire categories of bugs at zero runtime cost
5. Not everything needs a test: skip trivial pass-through components, generated code, and things TypeScript already catches
6. False confidence from bad tests is worse than no tests: a passing test suite means nothing if the tests verify the wrong things