Levels of testing

tldr: Software testing has four levels: unit, integration, system, and acceptance. Each catches different defect classes. Teams that skip levels (or merge them sloppily) ship bugs that the missing level was meant to catch.


The four levels and what each owns

Unit testing

Tests a single function, method, or class in isolation. Dependencies are mocked. Runs in milliseconds.

Catches: logic errors, off-by-one bugs, type mismatches, edge cases in pure functions.

Misses: anything involving real network calls, real databases, real concurrency.

Owner: engineering, written alongside the code.
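A unit test in this style can be sketched as follows. The function and service names (charge_total, the tax service) are hypothetical, but the shape is the point: the dependency is mocked, so the test touches no network and runs in milliseconds.

```python
# Unit-test sketch: one function in isolation, dependency mocked.
# charge_total and the tax service are illustrative names.
from unittest.mock import Mock

def charge_total(cart, tax_service):
    """Sum item prices and add tax from an injected dependency."""
    subtotal = sum(item["price"] for item in cart)
    return subtotal + tax_service.tax_for(subtotal)

def test_charge_total_adds_tax():
    # The tax service is mocked: no real network call happens.
    tax = Mock()
    tax.tax_for.return_value = 2.0
    assert charge_total([{"price": 10.0}, {"price": 10.0}], tax) == 22.0
    tax.tax_for.assert_called_once_with(20.0)

test_charge_total_adds_tax()
```

Because the dependency is injected, the same function can later be exercised with a real tax service in an integration test.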

Integration testing

Tests how units work together. Real components where possible, stubs for external services. Runs in seconds to minutes.

Catches: contract mismatches between modules, broken initialization order, configuration errors, error-handling bugs across boundaries.

Misses: end-to-end user flows, performance under load, browser-specific behavior.

Owner: engineering, sometimes shared with QA.
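A minimal sketch of the pattern, using an in-memory SQLite database as the real component and a stub for the external email service. All names (UserStore, signup, the mailer) are illustrative:

```python
# Integration-test sketch: real database component, stubbed external service.
import sqlite3

class UserStore:
    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS users (email TEXT UNIQUE)")

    def add(self, email):
        self.conn.execute("INSERT INTO users VALUES (?)", (email,))

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

class StubMailer:
    """Stands in for the external email service."""
    def __init__(self):
        self.sent = []
    def send_welcome(self, email):
        self.sent.append(email)

def signup(store, mailer, email):
    store.add(email)            # real component: actual SQL runs
    mailer.send_welcome(email)  # external service: stubbed in tests

def test_signup_persists_and_notifies():
    store = UserStore(sqlite3.connect(":memory:"))
    mailer = StubMailer()
    signup(store, mailer, "a@example.com")
    assert store.count() == 1
    assert mailer.sent == ["a@example.com"]

test_signup_persists_and_notifies()
```

The test would catch a contract mismatch (say, signup passing the wrong arguments to the store) that a fully mocked unit test could miss.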

System testing

Tests the integrated software as a whole. Runs against a deployed environment. Verifies functional and non-functional requirements.

Catches: end-to-end flow bugs, environment-specific issues, performance regressions, security defects.

Misses: business-fit problems, usability issues, real-user behavior.

Owner: QA, often supported by engineering.

Acceptance testing

Verifies the system meets the business need. Run by users, business owners, or both. Final gate before release.

Catches: requirements misalignment, usability problems, contractual gaps.

Misses: technical defects, since it focuses on whether the right thing was built.

Owner: product, business, or end users, not QA.


The test pyramid

The classic recommendation: many unit tests, some integration tests, fewer system tests, very few full acceptance tests.

The reasoning: unit tests are fast and stable, system tests are slow and flaky. A 10,000-test suite of mostly unit tests runs faster than a 100-test suite of mostly UI tests, and it produces more useful failures.

The pyramid is still mostly right, with a modern caveat: AI-driven system testing has gotten dramatically cheaper. The trade-off is shifting. You can now afford broader system-level coverage than the original pyramid assumed.


Where teams blur the lines

The most common mistake is calling everything an "integration test."

A test that hits a real API, real database, and real browser is a system test, not an integration test. Calling it integration hides how much you depend on it and how much it costs to run.

A second common mistake: writing system tests when integration tests would do. If you can verify behavior with a real database and a stubbed UI layer, that is faster and more stable than driving Chrome.


What about end-to-end testing?

E2E testing is system testing through the user interface. It is one form of system testing, not a separate level.

E2E earned a bad reputation because it was slow and flaky in the era of brittle selectors. AI testing platforms like Bug0 and the open-source Passmark engine remove most of the brittleness, which makes E2E coverage practical at higher volumes than before.


How the levels combine in CI

A typical pipeline runs:

  1. On every push: unit tests, fast integration tests, lint, type check.
  2. On every PR: full integration suite, smoke E2E tests on the critical paths.
  3. On merge to main: full system test suite.
  4. Pre-release: acceptance tests with stakeholder sign-off.

The fastest tests run most often; the slowest run least often. Each level catches problems before the next, more expensive level has to.
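One common way to wire this up is to tag tests by level and let each pipeline stage select a marker expression. A sketch of the stage-to-selection mapping, using pytest-style marker expressions (the marker names are illustrative):

```python
# Sketch: map each CI stage to a pytest marker expression.
# Marker names (unit, integration, fast, e2e, smoke, ...) are illustrative.
STAGES = {
    "push":    "unit or (integration and fast)",
    "pr":      "integration or (e2e and smoke)",
    "main":    "system",
    "release": "acceptance",
}

def pytest_args(stage):
    """Build the pytest invocation for a given CI stage."""
    return ["pytest", "-m", STAGES[stage]]

assert pytest_args("main") == ["pytest", "-m", "system"]
```

The mapping keeps the fast/slow split explicit: a change to what runs on push is a one-line diff, not a pipeline rewrite.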


FAQs

How is unit testing different from integration testing?

Unit tests isolate one component with mocks. Integration tests let real components talk to each other. The boundary is fuzzy in practice, but the intent matters.

Are E2E tests the same as system tests?

E2E is a subset of system testing, the kind that drives the UI. System tests can also include API-level full-flow tests without driving the UI.

Should developers write all four levels of tests?

Developers usually own unit and integration. QA usually owns system. Users own acceptance. Big organizations split further; small teams often have one person writing all four.

What is the right ratio of tests at each level?

Original pyramid: 70% unit, 20% integration, 10% system. Modern reality with AI testing: more like 60-25-15. The exact numbers matter less than the principle: lean toward faster tests where they suffice.

How does Bug0 fit across levels?

Bug0 targets system testing. It does not replace unit or integration testing. It makes high-volume system testing affordable, which lets your team rely less on expensive manual regression at the system level.

Ship every deploy with confidence.

Bug0 gives you a dedicated AI QA engineer that tests every critical flow, on every PR, with zero test code to maintain. 200+ engineering teams already made the switch.

From $2,500/mo. Full coverage in 7 days.

Go on vacation.
Bug0 never sleeps.

Your AI QA engineer runs 24/7 — on every commit, every deploy, every schedule. Full coverage while you're off the grid.