Why We Open Sourced Passmark, Bug0’s AI Regression Testing Tool

Sandeep Panda, Co-founder & CTO, Bug0

@Sandeepg33k

April 3, 2026 · 5 min read

Most AI testing tools get one thing right: writing tests is painful.

But they often miss the harder problem.

The real pain in regression testing is not generating the first version of a test. It is keeping that test alive as the product changes every week.

That is exactly why we built Passmark (GitHub), and why we decided to open source it. If you are evaluating it against similar tools, we have a head-to-head comparison of Passmark, Stagehand, Agent-Browser, and Expect.

The problem with AI testing today

There is no shortage of tools that can look at your app, understand a prompt, and generate some kind of browser automation. Most AI agents are built to test a single new feature or PR.

This is important. But it is not enough.

In real teams, thousands of tests need to run inside CI, across large suites, at predictable speed and cost. They need to survive UI changes. They need to avoid turning every test run into an expensive AI workflow.

This is where many AI-first testing tools break down.

If AI is in the loop on every single step of every single run, you end up with a system that is:

slower than traditional automation
more expensive at scale
harder to make deterministic
difficult to trust in CI

We wanted to solve regression testing in a way that actually works for engineering teams.

Our belief: AI should discover, Playwright should execute

Passmark is built around a simple idea:

Make AI-driven regression testing work at scale without slowing you down.

That means:

On the first run, AI agents navigate the product and understand the flow.
Each successful action gets cached when possible.
On subsequent runs, Passmark replays those cached actions using Playwright at native speed.
If the UI changes and a step breaks, AI steps back in to heal it.
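The four steps above can be sketched as a small loop. This is a minimal illustration of the cache, replay, and heal idea, not Passmark's actual implementation or API: every name here (`CachedAction`, `DiscoverFn`, `ExecuteFn`, `runStep`) is hypothetical, the "AI agent" and the "page" are simulated in memory, and the code is synchronous for brevity where real browser automation would be async.

```typescript
// Hypothetical types for illustration only; none of these names come from Passmark.
type CachedAction = { selector: string; kind: "click" | "fill"; value?: string };

// Stand-in for an AI agent that turns a plain-English step into a concrete action.
type DiscoverFn = (step: string) => CachedAction;

// Stand-in for deterministic execution (in a real setup, a Playwright call).
// Returns false when the action no longer works, e.g. the selector is gone.
type ExecuteFn = (action: CachedAction) => boolean;

type StepResult = { step: string; source: "cache" | "ai" };

function runStep(
  step: string,
  cache: Record<string, CachedAction | undefined>,
  discover: DiscoverFn,
  execute: ExecuteFn,
): StepResult {
  const cached = cache[step];
  // Fast path: replay the cached action without calling the AI.
  if (cached !== undefined && execute(cached)) {
    return { step, source: "cache" };
  }
  // First run, or the cached action broke: ask the AI to discover or heal.
  const fresh = discover(step);
  if (!execute(fresh)) {
    throw new Error("Step failed even after AI discovery: " + step);
  }
  // Cache only actions that actually succeeded.
  cache[step] = fresh;
  return { step, source: "ai" };
}

// Tiny simulation: the "page" is a single selector that currently exists.
let currentSelector = "#login-v1";
let aiCalls = 0;
const cache: Record<string, CachedAction | undefined> = {};

const discover: DiscoverFn = () => {
  aiCalls += 1;
  // Pretend the agent finds whatever login button exists right now.
  return { selector: currentSelector, kind: "click" };
};
const execute: ExecuteFn = (action) => action.selector === currentSelector;

const first = runStep("Click the login button", cache, discover, execute);
const second = runStep("Click the login button", cache, discover, execute);
currentSelector = "#login-v2"; // the UI changes between runs
const third = runStep("Click the login button", cache, discover, execute);
const fourth = runStep("Click the login button", cache, discover, execute);

console.log(first.source, second.source, third.source, fourth.source, aiCalls);
// prints: ai cache ai cache 2
```

In this simulation the AI is consulted twice across four runs: once to discover the step and once to heal it after the selector changes. Every other run replays from the cache.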

This model matters.

Instead of paying the AI tax on every run, you pay it once when discovering or repairing a flow. Everything else behaves more like standard Playwright automation.
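To make the "pay once" argument concrete, here is a rough back-of-the-envelope model. It assumes one AI call per step, which is a simplification, and every number in it is a made-up input for illustration, not a measurement from Passmark or Bug0. The names `SuiteProfile`, `aiCallsEveryRun`, and `aiCallsWithCache` are hypothetical.

```typescript
// Illustrative cost model only; all inputs are assumptions, not benchmarks.
interface SuiteProfile {
  tests: number;             // number of tests in the suite
  stepsPerTest: number;      // average steps per test
  runs: number;              // how many times the suite runs
  brokenStepsPerRun: number; // average steps needing AI healing on each later run
}

// AI in the loop on every step of every run.
function aiCallsEveryRun(p: SuiteProfile): number {
  return p.tests * p.stepsPerTest * p.runs;
}

// AI only for first-run discovery plus healing of broken steps afterwards.
function aiCallsWithCache(p: SuiteProfile): number {
  const discovery = p.tests * p.stepsPerTest;
  const laterRuns = Math.max(p.runs - 1, 0);
  const healing = p.brokenStepsPerRun * laterRuns;
  return discovery + healing;
}

const exampleSuite: SuiteProfile = {
  tests: 1000,
  stepsPerTest: 10,
  runs: 100,
  brokenStepsPerRun: 100,
};

console.log(aiCallsEveryRun(exampleSuite));  // 1000000
console.log(aiCallsWithCache(exampleSuite)); // 19900
```

Under these assumed inputs the cached model makes about 2% of the AI calls. The real ratio depends on how often the UI changes and how many steps can actually be cached.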

That gives you the best of both worlds:

natural language authoring
deterministic execution
much faster repeat runs
a practical path to scaling in CI

We think this is a better architecture for AI-powered regression testing.

Why open source?

We open sourced Passmark because the problem is too important to solve behind a black box.

Testing sits at the core of software delivery. If you are asking engineers to trust an AI system with release quality, the system should be inspectable.

Open source gives teams that.

They can understand how it works, see where AI is used, inspect the tradeoffs, and decide whether it fits their stack. They can run it in their own workflows, extend it, and build confidence over time.

We also think the future of testing needs a strong open foundation.

Developers already trust Playwright because it is flexible, composable, and works with their existing tooling. We wanted Passmark to feel the same way. Not a separate universe. Not a locked platform. A tool that fits into how modern teams already test.

That is why Passmark is designed to work inside normal Playwright tests instead of replacing the entire workflow.

Open source keeps us honest

There is a lot of hype in AI tooling right now.

A lot of products look magical in a demo and fall apart in real usage.

Open sourcing Passmark forces us to be clear about what we believe and how the system actually works.

We are not claiming that AI should replace everything.

We are saying something narrower and, in our view, more useful:

Let humans define intent in plain English
Let AI handle discovery and recovery
Let Playwright handle execution
Let caching make the whole thing practical

That is a much more grounded approach than pretending every test run should be fully agentic forever.

What Passmark is really for

Passmark is for teams that want the speed and reliability of Playwright without the burden of constantly rewriting brittle tests.

It is for teams that like the promise of AI, but do not want to bet their CI pipeline on an LLM improvising every time.

It is for teams that believe the future of testing is not hand-coded selectors everywhere, but also not uncontrolled autonomy.

It is for teams that want a middle path: intent-driven tests with deterministic execution.

Why this matters for Bug0

Bug0's broader mission is to make regression testing dramatically easier to adopt and maintain.

Passmark is the open-source core of that vision.

By open sourcing it, we are making our thinking public:

where AI helps
where deterministic systems still matter
how testing can be both intelligent and practical

We want developers to use it directly, challenge it, improve it, and push the ecosystem forward.

And for teams that want a done-for-you experience, Bug0 can build on top of that open foundation with managed workflows, QA support, and deeper service layers.

The bigger picture

We do not think the future of software testing will be won by the tool with the most AI in the loop.

We think it will be won by the tool that uses AI in the right places.

That is the bet behind Passmark.

Use AI for discovery.
Use AI for healing.
Use Playwright for execution.
Use caching to make it real.

That is why we built it.

And that is why we open sourced it.

Passmark is the mechanism. The full strategic case for why AI discovery plus deterministic execution is the new default lives in our post "Software testing strategies are obsolete in 2026. Here's what replaced them."

GitHub: https://github.com/bug0inc/passmark

Website: https://passmark.dev/

Tags: ai testing, Regression Testing, playwright, Open Source, test-automation
