Passmark.
The open-source Playwright library for AI regression testing. The same engine that powers Bug0 Managed.
Tests are written as plain-English steps and executed as Playwright actions by AI. Successful steps are cached so runs stay fast; when your UI changes, steps re-resolve themselves. We published the source so your tests aren't a black box. Read the code your forward-deployed engineer (FDE) writes against.
Licensed FSL-1.1-Apache-2.0.
Featured in Google AI Studio's developer showcase
await runSteps([
  { description: "Submit the form" },
  { description: "Verify the success message appears" },
]);
✓ 2 steps replayed from cache · 0 AI calls
✓ assertions agreed: claude + gemini
Capture intent, not steps.
Script-based tests break when you rename a button or restructure a form. Passmark steps describe the outcome, so they adapt automatically.
Traditional automation: breaks when the UI changes
await page.click('[data-testid="submit-btn"]');
await page.waitForSelector('.success-message');
Passmark: adapts because it understands intent
await runSteps([
  { description: "Submit the form" },
  { description: "Verify the success message appears" },
]);
How a step runs.
The AI is the fallback, not the hot path. That's why Passmark suites are fast, cheap, and self-healing at the same time.
- 01 Cache check.
Each step is looked up by flow and description. A hit replays the proven Playwright action instantly, with no AI call.
- 02 AI execution.
On a miss (or a failed replay after your UI changed), the AI reads the page and resolves the step from intent.
- 03 Re-cache.
The newly resolved action is cached, so the suite heals itself once and stays fast on every run after.
- 04 Consensus assertions.
Claude and Gemini verify the outcome independently; an arbiter resolves disagreements for a reliable pass/fail.
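The four stages above can be modelled in a few lines. This is a minimal sketch of the control flow only, not Passmark's source: every name in it (StepDeps, runStep, decide, the string-valued Action) is hypothetical, and the replay and AI calls are injected as plain functions so the logic can be read without a browser.

```typescript
// Illustrative model only. These types and names are hypothetical and are
// not Passmark's actual API.
type Action = string; // stand-in for a resolved Playwright action
type Verdict = "pass" | "fail";

interface StepDeps {
  cache: Map<string, Action>; // keyed by flow + description
  replay: (action: Action) => Promise<boolean>; // true if the cached action still works
  resolveWithAI: (description: string) => Promise<Action>; // the slow path
}

interface StepResult {
  action: Action;
  source: "cache" | "ai";
}

async function runStep(
  flow: string,
  description: string,
  deps: StepDeps,
): Promise<StepResult> {
  const key = `${flow}::${description}`;

  // 01 Cache check: a hit that replays cleanly needs no AI call.
  const cached = deps.cache.get(key);
  if (cached !== undefined && (await deps.replay(cached))) {
    return { action: cached, source: "cache" };
  }

  // 02 AI execution: on a miss or a failed replay, resolve the step from intent.
  const action = await deps.resolveWithAI(description);

  // 03 Re-cache: store the fresh action so the next run takes the fast path.
  deps.cache.set(key, action);
  return { action, source: "ai" };
}

// 04 Consensus: two independent verdicts; the arbiter is consulted only when
// they disagree. It is a synchronous callback here purely for brevity.
function decide(first: Verdict, second: Verdict, arbiter: () => Verdict): Verdict {
  return first === second ? first : arbiter();
}
```

In this model a stale cache entry costs exactly one AI call: the first run after a UI change falls through to resolveWithAI, and every later run replays the re-cached action.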
What's in the engine.
- Natural-language execution.
Write each step as plain English. The AI translates it into Playwright actions at runtime. No selectors, no XPaths, no page objects.
- Intelligent caching.
Successful steps are cached, so repeat runs skip the AI entirely. That's why suites stay fast and cheap: most steps replay in milliseconds.
- Auto-healing.
When a cached step fails because your UI changed, Passmark re-resolves it with fresh AI execution and re-caches the result. No broken locators, ever.
- Multi-model consensus.
Assertions run on Claude and Gemini in parallel. If they disagree, an arbiter model settles it, so a single model's hallucination never decides your pass/fail.
- Video assertions.
Records the full step sequence and evaluates assertions against the video, catching ephemeral UI like toasts and snackbars that screenshots miss.
- Computer-use mode.
Opt into visual, screenshot-driven automation per step for interactions that need coordinates. Mix CUA and snapshot-based steps in one run.
- Email verification.
A pluggable email provider system tests signup confirmations, magic links, and OTP flows end-to-end with lazy-evaluated extraction.
- Dynamic placeholders.
Inject runtime values with patterns like {{run.email}} and {{data.key}} for repeatable, data-driven test scenarios.
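To make the dynamic placeholders item concrete, here is a minimal sketch of how a step description containing {{run.email}} or {{data.key}} could be filled in before execution. The function name fillPlaceholders, the flat "namespace.key" lookup map, and the choice to leave unknown placeholders untouched are all assumptions made for illustration; Passmark's own resolver may behave differently.

```typescript
// Hypothetical resolver, not Passmark's implementation.
// Values are looked up by their full dotted name, e.g. "run.email".
function fillPlaceholders(
  template: string,
  values: ReadonlyMap<string, string>,
): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    // Unknown names are left as written so a typo stays visible in the step text.
    (match: string, name: string) => values.get(name) ?? match,
  );
}

// Example: one fresh address per run, plus a value from a data set.
const exampleValues = new Map<string, string>([
  ["run.email", "qa+run1042@example.com"],
  ["data.key", "SAVE20"],
]);

const exampleStep = fillPlaceholders(
  "Sign up as {{run.email}} and apply promo code {{data.key}}",
  exampleValues,
);
console.log(exampleStep);
```

Keeping the substitution separate from execution means the same step list can be replayed against different data without editing the descriptions themselves.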
Running 1 test using 1 worker
[14:02:11.482] INFO (passmark-ai/4821): Starting step-by-step execution of 5 steps.
[14:02:11.530] DEBUG (passmark-ai/4821): Executing Cached Step: Navigate to /checkout
[14:02:11.812] DEBUG (passmark-ai/4821): Executing Cached Step: Add Acme T-Shirt to cart
[14:02:12.045] DEBUG (passmark-ai/4821): Executing Cached Step: Apply promo code SAVE20
[14:02:12.288] DEBUG (passmark-ai/4821): Error executing cached step, falling back to AI execution: locator timeout
[14:02:12.291] DEBUG (passmark-ai/4821): Executing Step: Click the Pay now button
[14:02:12.293] DEBUG (passmark-ai/4821): Using model: claude-sonnet-4-6 for step execution / gateway: vercel
[14:02:15.470] DEBUG (passmark-ai/4821): Cached step action: Click the Pay now button
[14:02:15.512] INFO (passmark-ai/4821): Running assertion: Order confirmation message is visible
✓ 1 [chromium] › checkout.spec.ts:4:5 › happy path (4.1s)
1 passed (4.6s)
4 steps from cache, 0 AI calls · 1 step auto-healed and re-cached
The engine is open. The service is managed.
Bug0 Managed is a service on top of Passmark. Your dedicated forward-deployed engineer plans your coverage, writes your tests against this library, verifies every run, and gates your releases, on Bug0's infrastructure, for one flat subscription. You get the engine's speed and self-healing, plus human judgment on every failure, without operating anything.
Prefer to build on the engine yourself? Clone the repo and go. Your tests stay portable either way.
Read the code. Or let us run it for you.
Bug0 Managed from $2,500/mo flat. Month-to-month. Cancel anytime.
Discounted 60-day pilot. Results in your first week.