E2E Visual Testing (Playwright)

Category: Quality Attributes · Areas: ui, site

Description

Areas

ui, site

Components

Test runner: Playwright (@playwright/test)
Browser engines: Chromium (default), optionally Firefox and WebKit
Screenshot snapshots: Visual regression detection via toHaveScreenshot()
Video recording: Full test execution videos for human review
Demo reel: A dedicated slow-paced Playwright script that produces a polished walkthrough video of the application for stakeholder review, README embedding, or onboarding
Trace files: Playwright traces for debugging failures
Test data: Fake/seed data that exercises every meaningful UI state

Constraints

Tests run in real browsers, not jsdom or happy-dom — no simulated DOM
Every navigable page must have at least one test that loads it and verifies key content
Every user-facing workflow must have at least one test that exercises it end-to-end
Test data must be realistic and comprehensive — seed the application with fake data that covers all UI states (empty, populated, error, edge cases)
Tests must produce visible output:
- Console logs showing what each test is doing (use test.step())
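- Screenshots on failure (Playwright's default is screenshot: 'off'; the configuration pattern in this document sets 'on', which also covers failures)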
- Full-page screenshots for visual regression
- Video recordings of every test for human review
Screenshot baselines are committed to the repo and reviewed in PRs
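Tests must be runnable locally and in CI with identical results; because rendering differs between operating systems, generate screenshot baselines in the same environment CI uses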
Playwright version pinned in package.json
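
The pinning constraint can be checked mechanically. The following is a minimal sketch, assuming the version string has already been read from devDependencies in package.json. The helper name isPinnedVersion is hypothetical and not part of Playwright or npm, and "1.2.3" in the comments is a placeholder, not a real Playwright release.

```typescript
// Hypothetical helper: returns true for an exact version such as "1.2.3"
// (optionally with a prerelease tag), and false for ranges or tags such as
// "^1.2.3", "~1.2.3", ">=1.2.0", "1.x", or "latest".
function isPinnedVersion(range: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(range.trim())
}
```

A check like this could run in CI before the tests so that an accidental caret range is caught early.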

Test Data Requirements

The application under test must have a data seeding mechanism:

Static sites (Hugo): content files ARE the test data — ensure content covers all page types, shortcodes, navigation patterns, and edge cases (empty sections, long titles, deep nesting)
Web apps: seed script or fixture that populates the database/API with realistic fake data before tests run. The seed must be deterministic (same data every run)
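Never test only against an empty application: an unseeded app tells you nothing about whether the UI works. The empty state is still one of the seeded states to cover, alongside populated, error, and edge-case states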
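
One way to meet the determinism requirement is to generate seed data from a seeded pseudo-random generator. The sketch below is illustrative only. The Item shape and the names mulberry32 and buildSeedData are assumptions made for this example, and a real seed script would go on to write the data into the application's database or API.

```typescript
// Hypothetical record shape; adapt it to the application's real data model.
interface Item {
  id: number
  title: string
  amount: number
}

// Small seeded PRNG (mulberry32 style): the same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0
  return () => {
    a = (a + 0x6d2b79f5) >>> 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Builds one data set per UI state; error states would be added the same way.
function buildSeedData(seed = 42): Record<'empty' | 'populated' | 'edgeCases', Item[]> {
  const rand = mulberry32(seed)
  const populated: Item[] = Array.from({ length: 12 }, (_, i) => ({
    id: i + 1,
    title: `Item ${i + 1}`,
    amount: Math.floor(rand() * 1000),
  }))
  const edgeCases: Item[] = [
    { id: 101, title: 'A'.repeat(300), amount: 0 }, // very long title
    { id: 102, title: '<b>Ünïcödé & "quotes"</b>', amount: -1 }, // special characters
    { id: 103, title: 'Large number', amount: Number.MAX_SAFE_INTEGER },
  ]
  return { empty: [], populated, edgeCases }
}
```

Because nothing here depends on Math.random() or the clock, two runs with the same seed produce identical data, which keeps screenshot baselines stable.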

Configuration Pattern

import { defineConfig } from '@playwright/test'

export default defineConfig({
  testDir: './e2e',
  timeout: 30000,

  // Video and screenshots for every test; traces are kept only for failed tests
  use: {
    baseURL: 'http://127.0.0.1:<port>',
    headless: true,
    video: 'on',
    trace: 'retain-on-failure',
    screenshot: 'on',
  },

  // Reporter with step-level detail
  reporter: [
    ['list'],
    ['html', { open: 'never' }],
  ],

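  // Dev server
  webServer: {
    command: '<start command>',
    url: 'http://127.0.0.1:<port>', // `port` would need a number, not the string '<port>'
    reuseExistingServer: !process.env.CI, // reuse a local dev server, always start fresh in CI
    timeout: 15000,
  },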
})

Test Structure Pattern

import { test, expect } from '@playwright/test'

test.describe('Feature Area', () => {
  test('page loads with expected content', async ({ page }) => {
    await test.step('navigate to page', async () => {
      await page.goto('/path')
    })

    await test.step('verify hero content', async () => {
      await expect(page.getByRole('heading', { name: 'Title' })).toBeVisible()
    })

    await test.step('capture screenshot', async () => {
      await expect(page).toHaveScreenshot('page-name.png', { fullPage: true })
    })
  })

  test('workflow: user completes action', async ({ page }) => {
    await test.step('start at entry point', async () => {
      await page.goto('/')
    })

    await test.step('click through workflow', async () => {
      await page.getByRole('link', { name: 'Get Started' }).click()
      await expect(page).toHaveURL(/getting-started/)
    })

    await test.step('verify end state', async () => {
      await expect(page.getByText('expected content')).toBeVisible()
    })
  })
})

Use test.step() for every meaningful action — this produces structured logs that show exactly what each test is doing.

Video and Artifact Output

Playwright generates:

test-results/ — videos, screenshots, traces per test
playwright-report/ — HTML report with embedded videos

Both directories should be gitignored. In CI, upload them as build artifacts.

Demo Reel

Every project with a web UI should include a Playwright demo reel script that produces a polished walkthrough video. This replaces manual screen recordings with a reproducible, scriptable demo.

Demo reel structure

The demo reel is a single Playwright test file (e.g., e2e/demo.spec.ts) that:

Seeds the application with comprehensive, realistic data
Navigates every major page and workflow at a human-readable pace
Uses page.waitForTimeout() between actions for viewing comfort
Produces a .webm video in test-results/ that can be converted to .mp4 or .gif for embedding

Demo reel conventions

File: e2e/demo.spec.ts (separate from test specs)
Viewport: 1280x720 (HD, suitable for embedding)
Pacing: 1-2 second pauses between navigations, 2-3 seconds on key screens
Use test.step() for structured logging of what each section shows
Narrative structure: Overview (dashboard) -> Primary workflows -> Detail views
Run with: npx playwright test e2e/demo.spec.ts
Output: test-results/demo-*/video.webm
The demo must run against seeded data — never against an empty state
Re-record the demo after major UI changes
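
The conventions above could be sketched as the following e2e/demo.spec.ts. This is an illustrative outline, not a definitive implementation. The routes, the link name, and the pause constants are assumptions, and it presumes the application has already been seeded and that video: 'on' is set in the Playwright configuration.

```typescript
import { test, expect } from '@playwright/test'

// Viewport from the conventions: 1280x720.
test.use({ viewport: { width: 1280, height: 720 } })

// Pause lengths in milliseconds, following the pacing convention (assumed values).
const SHORT_PAUSE = 1500 // between navigations
const LONG_PAUSE = 2500 // on key screens

test('demo reel', async ({ page }) => {
  // The walkthrough is deliberately slow, so raise the per-test timeout.
  test.setTimeout(120_000)

  await test.step('overview: dashboard', async () => {
    await page.goto('/') // hypothetical route
    await expect(page.getByRole('heading').first()).toBeVisible()
    await page.waitForTimeout(LONG_PAUSE)
  })

  await test.step('primary workflow', async () => {
    await page.getByRole('link', { name: 'Get Started' }).click() // hypothetical link name
    await page.waitForTimeout(SHORT_PAUSE)
  })

  await test.step('detail view', async () => {
    await page.goto('/items/1') // hypothetical route
    await page.waitForTimeout(LONG_PAUSE)
  })
})
```

Fixed pauses are used here only for viewing comfort in the recording; regular test specs are better served by web-first assertions than by page.waitForTimeout().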

Converting output

# WebM to MP4 (for README/docs)
# Assumes the glob matches exactly one directory; if several match, pass one path explicitly
ffmpeg -i test-results/demo-*/video.webm -c:v libx264 -preset slow -crf 22 demo.mp4

# WebM to GIF (for README, keep under 10MB)
ffmpeg -i test-results/demo-*/video.webm -vf "fps=10,scale=960:-1" -loop 0 demo.gif

When to use

Any project with a web UI that users interact with. This includes:

Documentation sites (Hugo/Hextra microsites)
Web applications
Admin dashboards
Any HTML output that needs visual consistency

ADR References

Practices by activity

Agents working in any of these activities inherit the practices below via the bead’s context digest.

Requirements (Frame activity)

Identify all user-facing pages and workflows that need testing
Define what “test data” means for this project — what states must the UI show?
Determine visual regression strategy: which pages get screenshot baselines?
Decide on browser matrix: Chromium only, or cross-browser?
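
The browser matrix decision maps onto Playwright projects. The fragment below is a sketch of the cross-browser option, meant to be merged into an existing playwright.config.ts. The project names are arbitrary labels chosen for this example.

```typescript
import { defineConfig, devices } from '@playwright/test'

export default defineConfig({
  projects: [
    // Default engine: always runs.
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    // Optional engines: remove these entries for a Chromium-only matrix.
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
})
```

Each project keeps its own screenshot baselines, so every added engine increases the number of images to commit and review.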

Design

Tests live in <project-root>/e2e/ or website/e2e/ for microsites
One test file per feature area or page group
Use test.describe() blocks to group related tests
Use test.step() for every meaningful action — produces structured logs
Test data is deterministic and committed (or seeded deterministically)
Screenshot baselines committed under e2e/*.spec.ts-snapshots/
Video recording enabled for all tests — video: 'on' in config

Implementation

Install: npm install -D --save-exact @playwright/test && npx playwright install (--save-exact writes an exact version, which keeps Playwright pinned)
Configure playwright.config.ts with video, trace, and reporter settings
Write tests that use real browser interactions (click, type, navigate)
Use Playwright’s locator API: getByRole(), getByText(), getByLabel()
Prefer semantic selectors over CSS selectors or test-ids
Every page test should:
1. Navigate to the page
2. Verify key content is visible
3. Capture a full-page screenshot for visual regression
Workflow tests should:
1. Start at the entry point
2. Perform the user’s action sequence
3. Verify the end state
4. Capture screenshots at key steps

Test Data

Static sites: ensure content files cover all page types, shortcodes, navigation levels, and edge cases. Add a “kitchen sink” test page if needed that exercises every component
Web apps: create a seed script (e.g., scripts/seed-test-data.sh) that populates deterministic fake data. Run before tests. Include:
- Empty states (zero items)
- Populated states (10+ items)
- Error states (invalid data, missing fields)
- Edge cases (long text, special characters, large numbers)
Never test against production data — always use controlled test data

Testing

Run locally: npx playwright test (starts dev server automatically)
Update screenshots: npx playwright test --update-snapshots
Review screenshots in PR diffs — screenshot changes must be intentional
Run in CI: same command, upload test-results/ and playwright-report/ as artifacts
After dependency updates (Playwright, browser, framework), re-baseline screenshots and review
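
Small rendering differences can make re-baselining noisy. One possible mitigation is to set screenshot comparison options once in the configuration, as in the fragment below. The 0.01 tolerance is an arbitrary placeholder, not a recommended value, and a stricter project may prefer to leave it unset.

```typescript
import { defineConfig } from '@playwright/test'

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      animations: 'disabled', // stop CSS animations and transitions before capture
      maxDiffPixelRatio: 0.01, // assumed tolerance: up to 1% of pixels may differ
    },
  },
})
```

Any tolerance trades sensitivity for stability, so changes that fall under it will not appear in PR diffs.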

Video Review

Videos are generated in test-results/<test-name>/video.webm
Review videos when:
- A test fails and the error message isn’t clear
- Updating screenshot baselines (watch the video to verify visual changes)
- Debugging flaky tests (video shows timing/race conditions)
In CI, upload test-results/ as a build artifact for post-hoc review

Demo Reel

Create e2e/demo.spec.ts — a single test that walks through the entire app
Seed with realistic data that makes the demo compelling (populated states, not empty states)
Set viewport to 1280x720 for clean video output
Pace with page.waitForTimeout(): 1-2s between clicks, 2-3s on key screens
Use test.step() to narrate each section in the test output
Structure: Dashboard overview -> key workflows -> detail pages -> settings
Run separately: npx playwright test e2e/demo.spec.ts
Output video lives in test-results/ — convert to .mp4 for embedding
Re-record after significant UI changes
The demo is documentation — keep it passing, keep it current

Quality Gates

npx playwright test passes with zero failures
All screenshot baselines are committed and up-to-date
Every navigable page has at least one test
Every user-facing workflow has at least one end-to-end test
Video recording is enabled (not disabled for speed)
Demo reel script exists and produces a watchable video
No tests are skipped or marked .only
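
Part of the last gate can be enforced by Playwright itself. The fragment below is a sketch that assumes the CI system sets a CI environment variable. It covers test.only but not skipped tests, which still need review or a lint rule.

```typescript
import { defineConfig } from '@playwright/test'

export default defineConfig({
  // Fail the run in CI if any test.only is left in the source.
  forbidOnly: !!process.env.CI,
})
```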

CI Integration

Run Playwright tests after the build step
Upload test-results/ and playwright-report/ as artifacts
Fail the build on any test failure
Cache Playwright browsers to speed up CI
For static sites: build first (hugo --gc --minify), then serve and test