
Overview

Monitors watch your GitHub Actions workflows and alert your team in Slack when something goes wrong. A monitor might be “alert me when this job fails 3 times in a row on main” or “notify me whenever this step is skipped.” You can also turn on VM retention, which keeps the runner VM alive when an alert fires so you can SSH in and see exactly what happened. The only setup is connecting your Slack workspace.

[Screenshot: monitor list showing active monitors and their status]

Basics

Creating a monitor

Go to Monitors in the sidebar and click New Monitor. The wizard has four steps:
  1. Pick the repository, workflow, job, and optionally a step to watch
  2. Set the alert condition: single event or consecutive events, which conclusions to match, severity, cooldown, and optional log pattern filters or VM retention
  3. Choose Slack channel(s) for notifications
  4. Review and create
[Screenshot: monitor creation wizard with job selection step]

Monitors can watch job-level or step-level conclusions. Adding a step filter narrows alerting to that step; without one, the monitor watches the job as a whole.

Job selection

Pick what the monitor watches:
  • Repository - which repo to watch
  • Workflow - which workflow(s)
  • Job name - which job(s) within a workflow
You can also narrow further with:
  • Step name - watch a specific step within a job
  • Branch - only alert on events from certain branches
  • Exclude branch - skip events from specific branches (e.g., staging or dependabot)
All filters support glob patterns:
Pattern     Matches
*           Everything
deploy*     Anything starting with “deploy”
*-test      Anything ending with “-test”
build       Exact match only
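Assuming these globs behave like standard shell-style patterns (the docs don’t specify the exact engine), the table above can be sketched with Go’s filepath.Match:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches reports whether a shell-style glob matches a name.
// Illustrative only: Blacksmith's exact glob engine is an assumption here,
// but filepath.Match agrees with the documented table for these patterns.
func matches(pattern, name string) bool {
	ok, _ := filepath.Match(pattern, name)
	return ok
}

func main() {
	fmt.Println(matches("*", "anything"))          // everything matches
	fmt.Println(matches("deploy*", "deploy-prod")) // prefix match
	fmt.Println(matches("*-test", "unit-test"))    // suffix match
	fmt.Println(matches("build", "build-docs"))    // exact match only: false
}
```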

Alert condition

Alert after

  • Single event - alert each time a matching conclusion occurs
  • Consecutive events - only alert after N matching conclusions in a row, which cuts noise from flaky jobs
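The consecutive-events mode can be pictured as a streak counter that resets on any non-matching conclusion, so isolated flakes never reach the threshold. A minimal sketch (the type and method names are hypothetical, not Blacksmith’s internals):

```go
package main

import "fmt"

// consecutiveMonitor fires only after `threshold` matching conclusions in a row.
type consecutiveMonitor struct {
	threshold int
	streak    int
}

// observe records one workflow run's conclusion and reports whether
// the alert threshold has been reached.
func (m *consecutiveMonitor) observe(matched bool) bool {
	if !matched {
		m.streak = 0 // a passing run breaks the streak
		return false
	}
	m.streak++
	return m.streak >= m.threshold
}

func main() {
	m := &consecutiveMonitor{threshold: 3}
	// fail, fail, pass, fail, fail, fail
	for i, c := range []bool{true, true, false, true, true, true} {
		fmt.Printf("run %d: alert=%v\n", i+1, m.observe(c))
	}
	// Only run 6 alerts: the earlier streak was broken by the passing run 3.
}
```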

Cooldown

Monitors have a cooldown period (default 60 minutes) to prevent alert spam. After an alert fires, subsequent matches are suppressed until the cooldown expires. Resolving the monitor resets the cooldown immediately, so the next match can alert right away.

Log pattern filters

You can require that job or step logs match specific patterns before an alert triggers. Patterns use RE2 regex syntax with an optional case-sensitivity toggle. Up to 10 patterns per monitor. Useful when you only care about specific errors (e.g., OutOfMemory, SIGKILL) rather than all failures.

Notifications

Slack alerts

When a monitor fires, Blacksmith posts to your configured Slack channel(s) with:
  • Severity badge and monitor name (links to the monitor detail page)
  • Repository, workflow, job, and branch
  • Conclusion and timestamp
  • A View Job/Step button linking to GitHub Actions
  • A Resolve button that clears the alert state and resets the cooldown
  • A Mute button to mute the monitor from Slack
[Screenshot: Slack alert message with severity badge and action buttons]

Alerts are color-coded by severity: red for critical, yellow for warning, blue for info.
Clicking a Slack button requires your Slack account to be linked to Blacksmith. If it isn’t, you’ll be prompted to connect it from your settings page.

Managing monitors

Muting

Mute a monitor to suppress its alerts and VM retention without deleting it. You can mute from the dashboard or the Slack alert, and unmute from the dashboard.

Resolving

Resolve a monitor to clear its alert state and reset the cooldown. You can resolve from the dashboard or the Slack alert.

Editing and deleting

All monitor settings (filters, conditions, notifications, retention) can be edited from the dashboard. Delete a monitor from its action menu.

VM retention

Monitors can keep VMs alive when an alert fires so you can SSH in and debug. The Slack alert includes an SSH command you can paste directly into your terminal, e.g. ssh -p 2222 [email protected].

Job-level retention keeps the VM alive for up to 8 hours after the job completes. To end retention early, go to the monitor’s page or run the following while SSH’d in:

Linux:
sudo reboot

Windows:
shutdown /s /f /t 0

Step-level retention pauses the job at the matched step for up to 6 hours. To resume the job, run the following while SSH’d in:
continue_step
VM retention requires a cooldown of at least 60 minutes. Step-level monitors do not support VM retention on cancelled or skipped steps.
For more on SSH access, see SSH Access. For VM retention billing, see Pricing.

Severity levels

Pick a severity when creating a monitor:
  • Critical - the job is broken and someone needs to look now
  • Warning - something failed and should be investigated soon
  • Info - worth knowing about, not necessarily urgent
Severity controls the color of the Slack alert (red, yellow, blue) and the badge in the dashboard.

Permissions

  • Organization admins can create and manage monitors for any repository
  • Users with write access can create monitors for repositories they can access
  • All organization members can view monitors for repositories they can access

Pricing

Monitors are free. VM retention billing depends on the type:
  • Job-level retention keeps the VM alive after the job completes. Since the VM outlives the job, retention time is billed at the same rate as runner minutes.
  • Step-level retention pauses the job at the matched step while the job is still running. Retention time is billed as part of the job’s normal runner minutes.
For all other pricing, see our pricing page.

FAQ

How do I connect Slack?

Go to Settings > Integrations and click Link Account. Once connected, you can pick Slack channels when creating monitors.

Can I monitor a specific step instead of a whole job?

Yes. Add a step name filter when creating a monitor. You can use exact matches or glob patterns (e.g., Run tests*).

How do I cut down on noisy alerts?

A few things that help:
  • Use consecutive events to only alert after N failures in a row
  • Set a longer cooldown to limit how often a monitor can fire
  • Mute monitors during known maintenance windows
  • Narrow monitors to specific workflows, jobs, and branches
  • Add log pattern filters to only trigger on specific errors

How do I debug a failed run on the runner itself?

Enable VM retention on your monitor. When it fires, the VM stays alive and the Slack alert includes an SSH command.