Sandeep Panda · Updated Jun 10, 2026

Passmark.

The open-source Playwright library for AI regression testing. The same engine that powers Bug0 Managed.

Tests are written as plain-English steps and executed as Playwright actions by AI. Successful steps are cached so runs stay fast; when your UI changes, steps re-resolve themselves. We published the source so your tests aren't a black box: you can read the code your forward-deployed engineer (FDE) writes against.


Licensed under FSL-1.1-Apache-2.0.

Featured in Google AI Studio's developer showcase

checkout.spec.ts

await runSteps([
  { description: "Submit the form" },
  { description: "Verify the success message appears" },
]);

✓ 2 steps replayed from cache · 0 AI calls

✓ assertions agreed: claude + gemini

Capture intent, not steps.

Script-based tests break when you rename a button or restructure a form. Passmark steps describe the outcome, so they adapt automatically.

Traditional automation: breaks when the UI changes

await page.click('[data-testid="submit-btn"]');
await page.waitForSelector('.success-message');

Passmark: adapts because it understands intent

await runSteps([
  { description: "Submit the form" },
  { description: "Verify the success message appears" },
]);

How a step runs.

The AI is the fallback, not the hot path. That's why Passmark suites are fast, cheap, and self-healing at the same time.

01 Cache check. Each step is looked up by flow and description. A hit replays the proven Playwright action instantly, with no AI call.
02 AI execution. On a miss (or a failed replay after your UI changed), the AI reads the page and resolves the step from intent.
03 Re-cache. The newly resolved action is cached, so the suite heals itself once and stays fast on every run after.
04 Consensus assertions. Claude and Gemini verify the outcome independently; an arbiter resolves disagreements for a reliable pass/fail.
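The consensus rule in step 04 can be sketched as a small decision function. This is an illustration only, not Passmark's implementation: the `Verdict` and `Outcome` types and the `resolveConsensus` name are hypothetical, and the model calls themselves are assumed to have happened elsewhere and are passed in as plain values.

```typescript
// Hypothetical shapes for illustration; these are not Passmark's types.
type Verdict = { model: string; pass: boolean; reason: string };
type Outcome = { pass: boolean; arbitrated: boolean; verdicts: Verdict[] };

// Two independent verdicts decide the result when they agree.
// The arbiter callback runs only on disagreement, so the common case
// costs no extra model call.
function resolveConsensus(
  first: Verdict,
  second: Verdict,
  arbitrate: () => Verdict,
): Outcome {
  if (first.pass === second.pass) {
    return { pass: first.pass, arbitrated: false, verdicts: [first, second] };
  }
  const tieBreak = arbitrate();
  return {
    pass: tieBreak.pass,
    arbitrated: true,
    verdicts: [first, second, tieBreak],
  };
}

// Example: the two judges disagree, so the arbiter settles it.
const outcome = resolveConsensus(
  { model: "claude", pass: true, reason: "confirmation banner is visible" },
  { model: "gemini", pass: false, reason: "banner not found in snapshot" },
  () => ({ model: "arbiter", pass: true, reason: "banner visible in recording" }),
);
console.log(outcome.pass, outcome.arbitrated); // true true
```

Passing the arbiter as a callback keeps it lazy: when the first two verdicts agree, it is never invoked.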

What's in the engine.

Natural-language execution: Write each step as plain English. The AI translates it into Playwright actions at runtime. No selectors, no XPaths, no page objects.
Intelligent caching: Successful steps are cached, so repeat runs skip the AI entirely. That's why suites stay fast and cheap: most steps replay in milliseconds.
Auto-healing: When a cached step fails because your UI changed, Passmark re-resolves it with fresh AI execution and re-caches the result. No broken locators, ever.
Multi-model consensus: Assertions run on Claude and Gemini in parallel. If they disagree, an arbiter model settles it, so a single model's hallucination never decides your pass/fail.
Video assertions: Records the full step sequence and evaluates assertions against the video, catching ephemeral UI like toasts and snackbars that screenshots miss.
Computer-use mode: Opt into visual, screenshot-driven automation per step for interactions that need coordinates. Mix CUA and snapshot-based steps in one run.
Email verification: A pluggable email provider system tests signup confirmations, magic links, and OTP flows end-to-end with lazy-evaluated extraction.
Dynamic placeholders: Inject runtime values with patterns like {{run.email}} and {{data.key}} for repeatable, data-driven test scenarios.
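To make the placeholder idea concrete, here is a minimal sketch of how patterns like {{run.email}} and {{data.key}} could be expanded in a step description. The `Context` shape, the `lookup` and `fillPlaceholders` helpers, and the choice to throw on an unknown placeholder are all assumptions for illustration; the page does not describe how Passmark resolves these values internally.

```typescript
// Hypothetical context shape: nested string values addressed by dotted paths.
type Context = { [key: string]: string | Context };

// Walk a dotted path such as "run.email" through the context.
function lookup(context: Context, path: string): string | undefined {
  let current: string | Context | undefined = context;
  for (const part of path.split(".")) {
    if (current === undefined || typeof current === "string") return undefined;
    current = current[part];
  }
  return typeof current === "string" ? current : undefined;
}

// Replace every {{path}} with its value; fail loudly on unknown paths so a
// typo does not silently end up as literal text in a test step.
function fillPlaceholders(template: string, context: Context): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, path: string) => {
    const value = lookup(context, path);
    if (value === undefined) {
      throw new Error(`Unresolved placeholder: ${path}`);
    }
    return value;
  });
}

// Example values are made up for the demonstration.
const step = fillPlaceholders(
  "Sign up with {{run.email}} and apply code {{data.promo}}",
  { run: { email: "qa+123@example.test" }, data: { promo: "SAVE20" } },
);
console.log(step); // Sign up with qa+123@example.test and apply code SAVE20
```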

pnpm exec playwright test checkout.spec.ts

Running 1 test using 1 worker

[14:02:11.482] INFO (passmark-ai/4821): Starting step-by-step execution of 5 steps.
[14:02:11.530] DEBUG (passmark-ai/4821): Executing Cached Step: Navigate to /checkout
[14:02:11.812] DEBUG (passmark-ai/4821): Executing Cached Step: Add Acme T-Shirt to cart
[14:02:12.045] DEBUG (passmark-ai/4821): Executing Cached Step: Apply promo code SAVE20
[14:02:12.288] DEBUG (passmark-ai/4821): Error executing cached step, falling back to AI execution: locator timeout
[14:02:12.291] DEBUG (passmark-ai/4821): Executing Step: Click the Pay now button
[14:02:12.293] DEBUG (passmark-ai/4821): Using model: claude-sonnet-4-6 for step execution / gateway: vercel
[14:02:15.470] DEBUG (passmark-ai/4821): Cached step action: Click the Pay now button
[14:02:15.512] INFO (passmark-ai/4821): Running assertion: Order confirmation message is visible

  ✓  1 [chromium] › checkout.spec.ts:4:5 › happy path (4.1s)

  1 passed (4.6s)

3 steps replayed from cache · 1 step auto-healed by AI and re-cached
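The fallback visible in the log above (a cached step fails, the AI resolves it, and the new action is cached) can be sketched as a cache-first loop. This is a simplified illustration under stated assumptions: `runStep`, `Action`, and `StepResult` are hypothetical names, the cache is an in-memory `Map` keyed by flow and description, and the replay and AI calls are synchronous callbacks here, whereas real Playwright actions are asynchronous.

```typescript
// Hypothetical shapes; Passmark's real cache and executor are not shown here.
type Action = string; // stand-in for a recorded Playwright action
type StepResult = { description: string; source: "cache" | "ai"; healed: boolean };

function runStep(
  flow: string,
  description: string,
  cache: Map<string, Action>,
  replay: (action: Action) => boolean, // true if the cached action still works
  resolveWithAI: (description: string) => Action, // the slow, paid path
): StepResult {
  const key = `${flow}::${description}`;
  const cached = cache.get(key);

  // Hot path: replay the cached action without any AI call.
  if (cached !== undefined && replay(cached)) {
    return { description, source: "cache", healed: false };
  }

  // Fallback: a cache miss or a failed replay. Resolve from intent, then
  // re-cache so the next run takes the hot path again.
  const action = resolveWithAI(description);
  cache.set(key, action);
  return { description, source: "ai", healed: cached !== undefined };
}

// Example: the cached selector is stale because the button was renamed.
const stepCache = new Map<string, Action>([
  ["checkout::Click the Pay now button", "click #pay-btn"],
]);
const worksOnPage = (action: Action) => action === "click #pay-now";
const fakeAI = (_description: string) => "click #pay-now";

const firstRun = runStep("checkout", "Click the Pay now button", stepCache, worksOnPage, fakeAI);
const secondRun = runStep("checkout", "Click the Pay now button", stepCache, worksOnPage, fakeAI);
console.log(firstRun.source, firstRun.healed); // ai true
console.log(secondRun.source, secondRun.healed); // cache false
```

The second call never reaches the AI callback, which is the property that keeps repeat runs fast once a step has healed.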

The engine is open. The service is managed.

Bug0 Managed is a service on top of Passmark. Your dedicated forward-deployed engineer plans your coverage, writes your tests against this library, verifies every run, and gates your releases. It all runs on Bug0's infrastructure for one flat subscription. You get the engine's speed and self-healing, plus human judgment on every failure, without operating anything.

Prefer to build on the engine yourself? Clone the repo and go. Your tests stay portable either way.


Frequently Asked Questions

What is Passmark?
Passmark is the open-source Playwright library for AI regression testing that powers Bug0 Managed. It handles step execution with AI, multi-model assertion consensus, smart caching, and auto-healing. Source code is at github.com/bug0inc/passmark.

What license is Passmark released under?
Passmark is licensed under FSL-1.1-Apache-2.0. You can read the source, run it locally, and build on it directly.

Why did Bug0 open source it?
So your tests aren't a black box. You can read the code your FDE writes against, audit how every step executes and every assertion is verified, and your tests stay portable. If you ever leave Bug0, your tests still work.

How does Bug0 Managed relate to Passmark?
Bug0 Managed is a service on top of Passmark. Your dedicated forward-deployed engineer builds and runs your tests on Passmark, on Bug0's cloud infrastructure, with human verification on every run. You get the engine's benefits without operating anything yourself.

Can I use Passmark without Bug0 Managed?
Yes. Passmark works as a standalone library in any Playwright project. Bring your own Anthropic and Google API keys, or route through a gateway. Teams that want the outcome without the operation choose Bug0 Managed.

Read the code. Or let us run it for you.

Bug0 Managed from $2,500/mo flat. Month-to-month. Cancel anytime.


Discounted 60-day pilot. Results in your first week.
