Updated Jun 10, 2026

Passmark.

The open-source Playwright library for AI regression testing. The same engine that powers Bug0 Managed.

Tests are written as plain-English steps and executed as Playwright actions by AI. Successful steps are cached so runs stay fast; when your UI changes, steps re-resolve themselves. We published the source so your tests aren't a black box. Read the code your forward-deployed engineer (FDE) writes against.

Licensed FSL-1.1-Apache-2.0.

Featured in Google AI Studio's developer showcase
checkout.spec.ts
await runSteps([
  { description: "Submit the form" },
  { description: "Verify the success message appears" },
]);

✓ 2 steps replayed from cache · 0 AI calls

✓ assertions agreed: claude + gemini

Prospyr
Ferra
Genzeon

Capture intent, not steps.

Script-based tests break when you rename a button or restructure a form. Passmark steps describe the outcome, so they adapt automatically.

Traditional automation: breaks when the UI changes

await page.click('[data-testid="submit-btn"]');
await page.waitForSelector('.success-message');

Passmark: adapts because it understands intent

await runSteps([
  { description: "Submit the form" },
  { description: "Verify the success message appears" },
]);

How a step runs.

The AI is the fallback, not the hot path. That's why Passmark suites are fast, cheap, and self-healing at the same time.

  1. Cache check

    Each step is looked up by flow and description. A hit replays the proven Playwright action instantly, with no AI call.

  2. AI execution

    On a miss (or a failed replay after your UI changed), the AI reads the page and resolves the step from intent.

  3. Re-cache

    The newly resolved action is cached, so the suite heals itself once and stays fast on every run after.

  4. Consensus assertions

    Claude and Gemini verify the outcome independently; an arbiter resolves disagreements for a reliable pass/fail.

Diagram: step request → cache check → on a hit, replay the cached action (no AI call, milliseconds); on a miss or failed replay, the AI resolves the step from intent by reading the page, then the action is re-cached → consensus assertions (Claude and Gemini verify independently, an arbiter resolves disagreements) → pass / fail.
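The cache-first flow described above can be sketched in a few lines of TypeScript. This is an illustration only, not Passmark's implementation: `Action`, `Resolver`, `Executor`, `cacheKey`, and `runStep` are hypothetical names, the AI is replaced by a plain function argument, and the cache is an in-memory `Map`.

```typescript
// Hypothetical types for illustration; none of these names come from Passmark.
type Action = { kind: "click" | "fill" | "assert"; target: string };
type Resolver = (description: string) => Promise<Action>; // stands in for the AI
type Executor = (action: Action) => Promise<boolean>; // true if the action succeeded
type RunResult = { action: Action; source: "cache" | "ai" };

// Steps are keyed by flow and description, as in the "Cache check" step.
const cacheKey = (flow: string, description: string): string =>
  `${flow}::${description}`;

async function runStep(
  cache: Map<string, Action>,
  flow: string,
  description: string,
  resolveWithAI: Resolver,
  execute: Executor,
): Promise<RunResult> {
  const key = cacheKey(flow, description);
  const cached = cache.get(key);

  // 1. Cache check: a hit replays the stored action with no AI call.
  if (cached !== undefined && (await execute(cached))) {
    return { action: cached, source: "cache" };
  }

  // 2. AI execution: on a miss or a failed replay, resolve the step from intent.
  const fresh = await resolveWithAI(description);
  if (!(await execute(fresh))) {
    throw new Error(`Step failed: ${description}`);
  }

  // 3. Re-cache: later runs replay the newly resolved action directly.
  cache.set(key, fresh);
  return { action: fresh, source: "ai" };
}
```

In this sketch a stale cache entry costs one failed replay plus one resolver call, after which the step is served from the cache again.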

What's in the engine.

Natural-language execution.

Write each step as plain English. The AI translates it into Playwright actions at runtime. No selectors, no XPaths, no page objects.

Intelligent caching.

Successful steps are cached, so repeat runs skip the AI entirely. That's why suites stay fast and cheap: most steps replay in milliseconds.

Auto-healing.

When a cached step fails because your UI changed, Passmark re-resolves it with fresh AI execution and re-caches the result. No broken locators, ever.

Multi-model consensus.

Assertions run on Claude and Gemini in parallel. If they disagree, an arbiter model settles it, so a single model's hallucination never decides your pass/fail.
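As a rough sketch of the consensus idea, the function below runs two verifiers in parallel and consults an arbiter only when they disagree. The `Verifier` type and `consensus` function are hypothetical stand-ins for model calls, not Passmark's API.

```typescript
type Verdict = "pass" | "fail";

// A verifier stands in for one model judging an assertion (hypothetical).
type Verifier = (assertion: string) => Promise<Verdict>;

async function consensus(
  assertion: string,
  first: Verifier,
  second: Verifier,
  arbiter: Verifier,
): Promise<{ verdict: Verdict; arbitrated: boolean }> {
  // Both verifiers run in parallel, so neither sees the other's answer.
  const [a, b] = await Promise.all([first(assertion), second(assertion)]);

  // Agreement decides the result without involving the arbiter.
  if (a === b) {
    return { verdict: a, arbitrated: false };
  }

  // Disagreement: a third verifier settles the outcome.
  return { verdict: await arbiter(assertion), arbitrated: true };
}
```

The arbiter is called only on disagreement, so in this sketch the common case costs two verifier calls rather than three.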

Video assertions.

Records the full step sequence and evaluates assertions against the video, catching ephemeral UI like toasts and snackbars that screenshots miss.

Computer-use mode.

Opt into visual, screenshot-driven automation per step for interactions that need coordinates. Mix computer-use (CUA) and snapshot-based steps in one run.

Email verification.

A pluggable email provider system tests signup confirmations, magic links, and OTP flows end-to-end with lazy-evaluated extraction.
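To illustrate what lazy-evaluated extraction could look like, here is a minimal sketch in which a one-time code is read from the inbox only when a step first asks for it. The `EmailProvider` interface, `lazy`, and `otpFrom` are hypothetical and not taken from Passmark; the provider is synchronous for brevity, whereas a real one would likely be asynchronous.

```typescript
// Hypothetical provider interface: returns the latest message body, if any.
interface EmailProvider {
  latestBody(address: string): string | undefined;
}

// A lazy value: compute() runs on the first get() and the result is reused afterwards.
function lazy<T>(compute: () => T): { get: () => T } {
  let cached: { value: T } | undefined;
  return {
    get: () => {
      if (cached === undefined) {
        cached = { value: compute() };
      }
      return cached.value;
    },
  };
}

// Builds a lazy 6-digit code lookup; the inbox is not read until get() is called.
function otpFrom(provider: EmailProvider, address: string): { get: () => string } {
  return lazy(() => {
    const body = provider.latestBody(address);
    const match = body?.match(/\b\d{6}\b/);
    if (!match) {
      throw new Error(`No 6-digit code found for ${address}`);
    }
    return match[0];
  });
}
```

Because the lookup is deferred, the value can be declared before the signup step has triggered the email, and it is resolved at the point where a later step needs the code.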

Dynamic placeholders.

Inject runtime values with patterns like {{run.email}} and {{data.key}} for repeatable, data-driven test scenarios.
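Placeholder substitution of this kind can be sketched with a single regular-expression replace. The `fillPlaceholders` function and the `Context` shape are assumptions made for illustration; only the `{{run.email}}` and `{{data.key}}` patterns come from the text above.

```typescript
// Hypothetical context shape: a scope name ("run", "data") mapping to key/value pairs.
type Context = Record<string, Record<string, string>>;

function fillPlaceholders(template: string, context: Context): string {
  // Matches {{scope.key}}, allowing optional spaces inside the braces.
  return template.replace(
    /\{\{\s*(\w+)\.(\w+)\s*\}\}/g,
    (whole: string, scope: string, key: string) => {
      const value = context[scope]?.[key];
      if (value === undefined) {
        // Fail loudly rather than sending a literal "{{...}}" to the page.
        throw new Error(`Unknown placeholder: ${whole}`);
      }
      return value;
    },
  );
}
```

For example, `fillPlaceholders("Sign up with {{run.email}}", { run: { email: "qa@example.com" } })` yields a step description with the address filled in.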

pnpm exec playwright test checkout.spec.ts

The engine is open. The service is managed.

Bug0 Managed is a service on top of Passmark. Your dedicated forward-deployed engineer plans your coverage, writes your tests against this library, verifies every run, and gates your releases, on Bug0's infrastructure, for one flat subscription. You get the engine's speed and self-healing, plus human judgment on every failure, without operating anything.

Prefer to build on the engine yourself? Clone the repo and go. Your tests stay portable either way.

Frequently Asked Questions

Passmark is the open-source Playwright library for AI regression testing that powers Bug0 Managed. It handles step execution with AI, multi-model assertion consensus, smart caching, and auto-healing. Source code is at github.com/bug0inc/passmark.
Passmark is licensed under FSL-1.1-Apache-2.0. You can read the source, run it locally, and build on it directly.
The source is published so your tests aren't a black box. You can read the code your forward-deployed engineer (FDE) writes against, audit how every step executes and every assertion is verified, and keep your tests portable. If you ever leave Bug0, your tests still work.
Bug0 Managed is a service on top of Passmark. Your dedicated forward-deployed engineer builds and runs your tests on Passmark, on Bug0's cloud infrastructure, with human verification on every run. You get the engine's benefits without operating anything yourself.
Passmark also works as a standalone library in any Playwright project, without Bug0 Managed. Bring your own Anthropic and Google API keys, or route through a gateway. Teams that want the outcome without the operation choose Bug0 Managed.

Read the code. Or let us run it for you.

Bug0 Managed from $2,500/mo flat. Month-to-month. Cancel anytime.

Discounted 60-day pilot. Results in your first week.

Go on vacation.
Bug0 never sleeps.

The AI tests every commit, every deploy, every schedule. Your forward-deployed engineer reviews every failure and files the bugs. Coverage holds while you're off the grid.