Testboxes (Beta)

Testboxes are in early beta. The interface and behavior may evolve as we iterate.

Overview

Testboxes sync your local changes to a Blacksmith microVM and run commands inside a real GitHub Actions job, with access to secrets, OIDC tokens, and service containers. Output streams back to the agent. The CLI is agent-first: its intended consumer is a coding agent, not a human. This shapes the design. Commands and flags are verbose and explicit so that agents can construct invocations without ambiguity.

When to use Testboxes

Parallel agents in worktrees. Each agent warms up its own testbox and runs tests independently, without interfering with the others.
Cross-platform development. If you develop on macOS but target Linux, Testboxes run your commands on Linux microVMs that match the real target environment.
Debug flaky tests. Spin up dozens of testboxes in parallel to reproduce a non-deterministic flake, then let the agent debug once it triggers.
Fast iteration loops. After initial hydration, subsequent runs take 1-3 seconds. Only changed files sync.
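The flaky-test case above can be sketched as a small shell function that warms up several testboxes in the background and runs the same command on each. This is a sketch under assumptions, not documented behavior: it assumes `warmup` prints only the testbox ID on stdout, and `hunt_flake`, the `BLACKSMITH` override variable, and the log file names are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Hypothetical override so the CLI binary can be swapped out (for example in tests).
BLACKSMITH="${BLACKSMITH:-blacksmith}"

# hunt_flake WORKFLOW COMMAND BOXES [LOGDIR]
# Warms up BOXES testboxes in parallel and runs COMMAND once on each.
# Assumption: `warmup` prints only the testbox ID on stdout.
hunt_flake() {
  workflow="$1"
  cmd="$2"
  boxes="$3"
  logdir="${4:-.}"
  i=1
  while [ "$i" -le "$boxes" ]; do
    (
      # Each background subshell owns one testbox.
      if ! id="$("$BLACKSMITH" testbox warmup "$workflow")"; then
        echo "box $i: warmup failed"
        exit 2
      fi
      if "$BLACKSMITH" testbox run --id "$id" "$cmd" >"$logdir/flake-$i.log" 2>&1; then
        echo "box $i ($id): passed"
      else
        echo "box $i ($id): FAILED, see $logdir/flake-$i.log"
      fi
    ) &
    i=$((i + 1))
  done
  # Block until every background testbox has reported.
  wait
}
```

A box that reports FAILED keeps its testbox ID in the output, so the agent can keep issuing run commands against that same testbox while it investigates.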

How it works

Testboxes follow a two-step workflow: warmup, then run.

Warmup

The agent calls blacksmith testbox warmup <workflow>. This dispatches a GitHub Actions workflow and returns a testbox ID immediately. The runner boots and hydrates in the background: installing dependencies, starting service containers, and running any setup steps defined in the workflow.

Run

The agent calls blacksmith testbox run --id <id> "<command>". This syncs local changes to the testbox, executes the command, and streams stdout/stderr back. The agent checks the output and exit code.

Once the testbox is ready, run commands complete in 1-3 seconds since only changed files need to sync. You can run as many commands as you want against the same warm testbox.
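The warmup-then-run sequence can be sketched as a shell helper that captures the testbox ID once and reuses it for every command. This is a minimal sketch, assuming `warmup` prints only the testbox ID on stdout (the exact output format is not specified here). `warm_and_run` and the `BLACKSMITH` override variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical override so the CLI binary can be swapped out (for example in tests).
BLACKSMITH="${BLACKSMITH:-blacksmith}"

# warm_and_run WORKFLOW COMMAND...
# Warms up one testbox, then runs each COMMAND against it in order.
# Stops at the first failing command and returns its exit code.
warm_and_run() {
  workflow="$1"
  shift
  # Assumption: the testbox ID is the only thing warmup writes to stdout.
  id="$("$BLACKSMITH" testbox warmup "$workflow")" || return 1
  echo "testbox: $id" >&2
  for cmd in "$@"; do
    # run syncs local changes, executes the command, and streams output back.
    "$BLACKSMITH" testbox run --id "$id" "$cmd" || return $?
  done
}
```

Because the function returns the failing command's exit code, a caller can branch on it exactly as it would for a local test run.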

Get started

Install the CLI

curl -fsSL https://get.blacksmith.sh | sh

The CLI auto-updates in the background on every invocation.

Set up a testbox workflow for your repository

blacksmith testbox init

This command scans your repo’s workflows, selects the right job, and generates a testbox-compatible workflow file and an agent skill. If you’re not logged in yet, the CLI opens a browser to authenticate first. You only need to run it once per repo.

Agent integration

The init command generates a skill file that teaches agents to:

Warm up immediately when receiving a coding task.
Never run tests locally. Always run through Testbox.
Reuse the testbox ID for all subsequent run commands within the same task.
Fix and re-run when tests fail. Re-runs take 1-3 seconds.
Stop the testbox when done, or let the idle timeout handle cleanup.
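The reuse-and-re-run behavior in the list above might look like the following loop. It is a sketch, not the generated skill: `run_until_green`, `fix_hook`, and the `BLACKSMITH` override variable are hypothetical names, and `fix_hook` stands in for whatever edits the agent makes between attempts. The stop step is left out because its exact syntax is not shown here, so cleanup falls to the idle timeout.

```shell
#!/bin/sh
# Hypothetical override so the CLI binary can be swapped out (for example in tests).
BLACKSMITH="${BLACKSMITH:-blacksmith}"

# Hypothetical placeholder for the agent's "fix the code" step.
# It does nothing by default; redefine it to apply edits.
fix_hook() { :; }

# run_until_green ID COMMAND [MAX_ATTEMPTS]
# Reuses one warm testbox ID for every attempt.
run_until_green() {
  id="$1"
  cmd="$2"
  max="${3:-3}"
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$BLACKSMITH" testbox run --id "$id" "$cmd"; then
      echo "passed on attempt $attempt"
      return 0
    fi
    # Tests failed: let the agent change the code, then re-run on the same testbox.
    fix_hook "$attempt"
    attempt=$((attempt + 1))
  done
  echo "still failing after $max attempts"
  return 1
}
```

Capping the number of attempts keeps a stuck agent from holding a testbox open until the idle timeout.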

Pricing

Testbox execution is billed per minute at the same Blacksmith runner rates as CI. The idle timeout (default 30 minutes) ensures testboxes don’t run indefinitely.

FAQ

What happens if the testbox is still hydrating when the agent calls run?

The run command automatically waits for the testbox to reach ready status before syncing files and executing the command. The agent does not need to poll status separately.

How does file sync work?

The CLI uses rsync --delete --checksum over SSH to mirror the local working tree to the testbox. Only changed files are transferred, and files deleted locally are removed from the testbox. After the initial full sync, incremental syncs typically complete in under a second.
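To see what those two rsync flags do, the mirroring can be tried between two local directories. This is an illustration only, not the CLI's actual invocation: `mirror_tree` is a hypothetical helper, `--archive` is added here as an assumption so that the copy recurses, and the real sync targets a remote testbox over SSH with a host and path this page does not document.

```shell
#!/bin/sh
# mirror_tree SRC DST
# Makes DST an exact mirror of SRC:
#   --checksum  decides what to transfer by file content, not size and mtime
#   --delete    removes files from DST that no longer exist in SRC
#   --archive   (assumption, added for this sketch) recurses and preserves metadata
mirror_tree() {
  # The trailing slash on SRC copies its contents rather than the directory itself.
  rsync --archive --delete --checksum "$1/" "$2/"
}
```

Running `mirror_tree` twice in a row transfers nothing the second time, which is why incremental syncs stay fast once the initial copy is done.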

Can the agent run multiple commands on the same testbox?

Yes. The testbox stays warm until the idle timeout is reached or the agent calls stop. The agent can issue as many run commands as needed.

What is the idle timeout?

The default idle timeout is 30 minutes. If no run command is issued within that window, the testbox is automatically stopped. The timeout can be configured via --idle-timeout during warmup.

Do I need to set up SSH keys?

No. When --ssh-public-key is not specified during warmup, the CLI generates a keypair and caches it at ~/.blacksmith/testboxes/{id}/. You don’t need to manage SSH keys yourself.

Where are credentials stored?

Authentication tokens are stored at ~/.blacksmith/credentials. SSH keypairs are cached at ~/.blacksmith/testboxes/{id}/.
