tldr: Test maintenance is what kills most test suites. Tests rot when the system changes faster than the tests are updated. The fix is either a dramatic reduction in maintenance cost (for example, through AI testing) or a dramatic increase in test discipline. Most teams cannot manage the second.

Why suites rot

Three forces compound.

The system changes. Every UI change, every API change, every refactor potentially breaks tests.

Tests are not first-class. Tests live in the same repo but rarely get the same review attention as production code.

Maintenance is invisible. A passing test goes unnoticed and a failing test is an irritant, so the goal becomes a green build. Engineers patch tests until they pass without asking whether they still test what they should.

The result: a year-old test suite is often half-trustworthy at best.

Symptoms of a rotting suite

Five tells.

Tests are commented out because no one knew what they tested.

Retry logic is everywhere because tests are unreliable but no one wants to fix them.

Selectors are brittle. A small UI change breaks dozens of tests.

No one runs the suite locally. Engineers wait for CI because the suite takes too long or fails too often.

Coverage drops over time because new features ship without new tests.

If you see three or more, the suite is in trouble.

What maintenance actually costs

For traditional Selenium-style E2E suites, commonly cited industry estimates put 40 to 60% of QA time into maintenance rather than new test creation.
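To make that range concrete, here is a small arithmetic sketch. The team size and weekly hours are hypothetical inputs chosen for illustration, not figures from any study; only the 40 to 60% share comes from the estimate above.

```python
def maintenance_hours(qa_engineers: int, hours_per_week: float, share: float) -> float:
    """Weekly hours spent on test maintenance for a given share of QA time."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return qa_engineers * hours_per_week * share


# Hypothetical team: 4 QA engineers working 40 hours a week (160 hours total).
low = maintenance_hours(4, 40, 0.40)
high = maintenance_hours(4, 40, 0.60)
print(f"{low:.0f} to {high:.0f} of 160 QA hours per week go to maintenance")
```

For this hypothetical team, that is 64 to 96 hours a week spent keeping existing tests alive before anyone writes a new one.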

This is the bottleneck most teams hit. They want more E2E coverage. They cannot afford to maintain what they have, let alone more.

How to reduce maintenance cost

Two strategies.

1. Reduce coupling to brittle locators

CSS selectors and XPath expressions tied to DOM structure or styling classes break on routine UI changes. Replace them with:

Semantic locators (text content, ARIA labels, data-testid).
Visual locators (find element by appearance).
AI-driven locators (find element by intent).

The less a locator depends on DOM structure and styling, the more resilient it is.
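The difference is easy to see in a toy example. The sketch below uses only Python's standard library and is not how a real E2E framework resolves locators; the markup, the `checkout` test id, and the helper names are all hypothetical. It compares a structural path with a data-testid lookup before and after a small layout change.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Records each start tag with its attributes and its structural path."""

    def __init__(self) -> None:
        super().__init__()
        self.stack = []
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.elements.append({"path": " > ".join(self.stack), "attrs": dict(attrs)})

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


def parse(html):
    collector = ElementCollector()
    collector.feed(html)
    return collector.elements


def find_by_path(html, path):
    """Brittle: matches only if the exact chain of ancestors is unchanged."""
    return [e for e in parse(html) if e["path"] == path]


def find_by_testid(html, testid):
    """Resilient: matches on a dedicated attribute, wherever the element lives."""
    return [e for e in parse(html) if e["attrs"].get("data-testid") == testid]


# Hypothetical markup before and after a redesign wraps the form in a <section>.
BEFORE = '<div><form><button data-testid="checkout">Pay</button></form></div>'
AFTER = ('<div><section><form><button data-testid="checkout">Pay now</button>'
         '</form></section></div>')

STRUCTURAL = "div > form > button"
print(len(find_by_path(BEFORE, STRUCTURAL)), len(find_by_path(AFTER, STRUCTURAL)))
print(len(find_by_testid(BEFORE, "checkout")), len(find_by_testid(AFTER, "checkout")))
```

The structural path finds the button before the redesign and nothing after it, while the data-testid lookup finds it in both versions. A text-based locator would sit in between here: it survives the new wrapper but not the label change from "Pay" to "Pay now".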

2. Use AI testing platforms

AI testing tools remove hand-written selectors from the test. The agent finds the right element based on the goal description, so many UI changes that break selector-based tests do not break AI tests.

Bug0 and the open-source Passmark engine sit in this category. The maintenance burden drops dramatically because the tests describe what to do, not how to do it.

What makes a test maintainable

Every test should be:

Self-contained. No reliance on other tests' state.
Resilient. Survives small UI or content changes.
Clear about failure. When it fails, you know why.
Owned. Someone knows why this test exists.

Tests failing any of these are maintenance time bombs.
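As a rough illustration, the sketch below shows three of these properties (self-contained, clear about failure, owned) in a small `unittest` case. The `Cart` class and the owning team are hypothetical stand-ins; resilience to UI changes is a locator concern and is not shown here.

```python
import unittest


class Cart:
    """Hypothetical system under test, used only for illustration."""

    def __init__(self) -> None:
        self.items = {}

    def add(self, sku: str, qty: int = 1) -> None:
        if qty < 1:
            raise ValueError("qty must be at least 1")
        self.items[sku] = self.items.get(sku, 0) + qty

    def total_quantity(self) -> int:
        return sum(self.items.values())


class CartTest(unittest.TestCase):
    """Owned: checkout team (hypothetical). Guards quantity totals in the cart."""

    def setUp(self) -> None:
        # Self-contained: every test gets a fresh cart, so run order does not matter.
        self.cart = Cart()

    def test_adding_same_sku_twice_accumulates_quantity(self) -> None:
        self.cart.add("sku-1")
        self.cart.add("sku-1", qty=2)
        # Clear about failure: the message states the expected behavior.
        self.assertEqual(
            self.cart.total_quantity(),
            3,
            "adding the same SKU twice should sum the quantities",
        )

    def test_rejects_non_positive_quantity(self) -> None:
        with self.assertRaises(ValueError):
            self.cart.add("sku-1", qty=0)


# Run the two tests programmatically; `python -m unittest` works as well.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CartTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because neither test depends on state left behind by the other, either one can be run, moved, or deleted on its own.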

When to delete tests

Some tests should be deleted, not maintained.

Tests that test implementation details rather than behavior.
Tests that have been flaky for months that no one has fixed.
Tests that test deprecated features.
Tests that duplicate other tests.

Aggressive deletion improves the suite. Most teams keep too many tests.
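The criteria above can be turned into a simple triage pass. This is a minimal sketch under stated assumptions: the `SuiteEntry` fields are hypothetical metadata a team would have to record itself, and the 60-day threshold is one arbitrary reading of "flaky for months".

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: "flaky for months" is read as 60 days or more without a fix.
FLAKY_DAYS_LIMIT = 60


@dataclass(frozen=True)
class SuiteEntry:
    """Hypothetical per-test metadata used for deletion triage."""

    name: str
    tests_implementation_details: bool = False
    flaky_days_unfixed: int = 0
    feature_deprecated: bool = False
    duplicate_of: Optional[str] = None


def deletion_reasons(entry: SuiteEntry) -> List[str]:
    """Return every deletion criterion the test meets; empty means keep it."""
    reasons = []
    if entry.tests_implementation_details:
        reasons.append("tests implementation details")
    if entry.flaky_days_unfixed >= FLAKY_DAYS_LIMIT:
        reasons.append("flaky and unfixed for months")
    if entry.feature_deprecated:
        reasons.append("covers a deprecated feature")
    if entry.duplicate_of is not None:
        reasons.append(f"duplicates {entry.duplicate_of}")
    return reasons


entries = [
    SuiteEntry("test_checkout_happy_path"),
    SuiteEntry("test_legacy_export", feature_deprecated=True),
    SuiteEntry("test_login_copy", duplicate_of="test_login"),
    SuiteEntry("test_search_timing", flaky_days_unfixed=90),
]
for entry in entries:
    reasons = deletion_reasons(entry)
    verdict = ("delete: " + "; ".join(reasons)) if reasons else "keep"
    print(f"{entry.name}: {verdict}")
```

Only the first entry is kept; each of the other three meets exactly one criterion. Returning the reasons rather than a bare yes or no gives the owner something to confirm before the test is removed.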

FAQs

How often should I review tests for maintenance?

Every sprint: tag flaky tests, then fix or delete them. Every quarter: review the suite as a whole.
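Tagging can be partly automated from CI history. The sketch below assumes one working definition of flaky, a test that both passed and failed on the same commit, and a hypothetical record format of (test name, commit, passed); real CI systems expose this data in their own shapes.

```python
from collections import defaultdict


def flaky_tests(runs):
    """Return sorted names of tests with mixed results on a single commit.

    runs: iterable of (test_name, commit, passed) tuples (hypothetical format).
    """
    outcomes = defaultdict(set)
    for name, commit, passed in runs:
        outcomes[(name, commit)].add(passed)
    # Both True and False on the same commit means the code did not change
    # but the result did.
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


history = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),
    ("test_checkout", "abc123", True),
    ("test_checkout", "def456", False),
    ("test_search", "def456", True),
]
print(flaky_tests(history))  # ['test_login']
```

Here `test_checkout` is not tagged: it failed on a different commit than the one it passed on, which may be a real regression and not flakiness.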

Should engineers or QA write E2E tests?

Whoever maintains them. If QA writes the tests and engineers never look at them, they rot. Best pattern: engineering ownership with QA support.

How do I know if a test is worth maintaining?

Ask: if this test never ran again, would a bug be more likely to escape? If not, delete it.

How does Bug0 reduce maintenance?

Bug0 tests describe goals, not steps. UI changes that would break selector-based tests generally do not break AI-driven tests, so the maintenance burden for E2E suites drops dramatically.
