Agent Skill

FlakeMonster ships as an Agent Skill — a single SKILL.md file that works across Claude Code, Cursor, Codex, Copilot, Windsurf, and Antigravity.

What is an Agent Skill?

Agent Skills are instruction files that teach AI coding assistants new capabilities. Instead of requiring the developer to learn commands, flags, and workflows, a skill file contains everything the agent needs to operate the tool on the developer's behalf.

FlakeMonster's skill teaches the agent how to inject delays, run tests, analyze results, and report flaky tests — all through a single /flakemonster command. The agent handles the entire workflow: checking for stale state, determining which files to target, building the right command, parsing structured output, and cleaning up afterward.

The skill uses the open Agent Skills standard (agentskills.io), which defines a portable format for teaching capabilities to any compliant AI coding assistant.

Installation

Installing the skill takes one command per supported tool. Each command fetches the skill file directly from the FlakeMonster repository and places it in the tool-specific directory that the agent reads at startup.

Claude Code

Claude Code reads custom slash commands from the .claude/commands/ directory:

$ mkdir -p .claude/commands && \
  curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
  -o .claude/commands/flakemonster.md

After installation, type /flakemonster in Claude Code followed by your test command.

Cursor

Cursor loads skills from the .cursor/skills/ directory:

$ mkdir -p .cursor/skills/flakemonster && \
  curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
  -o .cursor/skills/flakemonster/SKILL.md

Codex

Codex reads skills from the .agents/skills/ directory:

$ mkdir -p .agents/skills/flakemonster && \
  curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
  -o .agents/skills/flakemonster/SKILL.md

Copilot

Copilot loads skills from the .github/skills/ directory:

$ mkdir -p .github/skills/flakemonster && \
  curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
  -o .github/skills/flakemonster/SKILL.md

Windsurf

Windsurf reads skills from the .windsurf/skills/ directory:

$ mkdir -p .windsurf/skills/flakemonster && \
  curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
  -o .windsurf/skills/flakemonster/SKILL.md

Antigravity

Antigravity loads skills from the .agent/skills/ directory:

$ mkdir -p .agent/skills/flakemonster && \
  curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
  -o .agent/skills/flakemonster/SKILL.md

Tip: Commit the skill file to your repository so every developer on the team gets it automatically. It works like any other dotfile config.

Usage

Once installed, type the /flakemonster command in any supported agent followed by your test command:

# Run your default test suite
$ /flakemonster npm test

# Run Playwright end-to-end tests
$ /flakemonster npx playwright test

# Run Jest with extra FlakeMonster flags
$ /flakemonster --runs 20 --mode hardcore npx jest

# Target a specific test file
$ /flakemonster npx playwright test tests/checkout.spec.ts

# Use node:test runner directly
$ /flakemonster node --test test/unit/*.test.js

The agent separates FlakeMonster flags (--runs, --mode, --seed, etc.) from the test command automatically. Everything that is not a FlakeMonster flag becomes the --cmd argument.
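As a rough illustration of that separation, here is a minimal sketch in JavaScript. The function name splitInvocation is hypothetical. It only knows the three flags named above, assumes each takes exactly one value, and assumes flags come before the test command. The real skill may recognize more flags and handle quoting, which this sketch does not.

```javascript
// Flags named in the text above. Assumption: each takes exactly one value,
// and the real skill may recognize additional flags.
const VALUE_FLAGS = new Set(["--runs", "--mode", "--seed"]);

// Split the arguments typed after /flakemonster into FlakeMonster flags
// and the test command. Hypothetical helper, not part of FlakeMonster.
function splitInvocation(input) {
  const tokens = input.trim().split(/\s+/).filter(Boolean);
  const flags = [];
  let i = 0;
  // Consume leading flag/value pairs; stop at the first unknown token.
  while (i < tokens.length && VALUE_FLAGS.has(tokens[i])) {
    if (i + 1 >= tokens.length) {
      throw new Error(`Missing value for ${tokens[i]}`);
    }
    flags.push(tokens[i], tokens[i + 1]);
    i += 2;
  }
  // Everything left over is passed through as the --cmd argument.
  return { flags, cmd: tokens.slice(i).join(" ") };
}

console.log(splitInvocation("--runs 20 --mode hardcore npx jest"));
```

For the Jest example above, this yields the flags --runs 20 --mode hardcore and the command npx jest.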

What the Skill Does

When you invoke /flakemonster, the skill instructs the agent to execute a complete flake detection workflow. The agent handles every step automatically:

Step 1: Validate Installation

The agent checks that flake-monster is available in the project. It looks for the package in devDependencies and tries npx flake-monster --version. If FlakeMonster is not installed, the agent tells you to run:

$ npm install --save-dev flake-monster
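The devDependencies half of this check can be sketched as follows. The helper name isDeclared is hypothetical, and the sketch leaves out the second probe (npx flake-monster --version), which requires the package to be resolvable on the machine.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns true when package.json lists flake-monster under devDependencies.
// Hypothetical helper for illustration; the skill's actual check may differ.
function isDeclared(projectRoot) {
  const pkgPath = path.join(projectRoot, "package.json");
  if (!fs.existsSync(pkgPath)) return false;
  const pkg = JSON.parse(fs.readFileSync(pkgPath, "utf8"));
  return Boolean(pkg.devDependencies && pkg.devDependencies["flake-monster"]);
}
```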

Step 2: Check for Stale Injections

Before injecting new delays, the agent looks for a .flake-monster/manifest.json left over from a previous (possibly interrupted) session. Its presence means injected delay code is still in the source files, so the agent runs npx flake-monster restore before proceeding and reports the stale session's seed and mode for context.

Important: Running a new injection on top of stale injected code would double-inject delays, corrupt the source, and produce meaningless results. The skill always ensures a clean starting state.
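A minimal sketch of the stale-state check is shown below. The manifest path comes from the text above. The helper name findStaleSession and the seed and mode field names are assumptions, since the manifest's exact schema is not described here.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Look for a leftover manifest from an earlier session.
// Assumption: the manifest is JSON with top-level "seed" and "mode" fields.
function findStaleSession(projectRoot) {
  const manifestPath = path.join(projectRoot, ".flake-monster", "manifest.json");
  if (!fs.existsSync(manifestPath)) return null; // clean state, nothing to restore
  try {
    const manifest = JSON.parse(fs.readFileSync(manifestPath, "utf8"));
    return { manifestPath, seed: manifest.seed, mode: manifest.mode };
  } catch {
    // An unreadable manifest still signals a stale session.
    return { manifestPath, seed: undefined, mode: undefined };
  }
}
```

A non-null result is the cue to run npx flake-monster restore before any new injection.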

Step 3: Determine Source File Globs

The agent figures out which files to target for delay injection using this priority order:

1. Config file: reads include patterns from .flakemonsterrc.json or flakemonster.config.json
2. Project structure: scans for src/, lib/, or app/ directories and picks matching glob patterns based on the file extensions present
3. Ask the developer: if the project structure is ambiguous, the agent asks which source files to target
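The priority order above can be sketched as a single fallback chain. This is an illustration under stated assumptions, not the skill's actual logic: the function name pickGlobs is hypothetical, the config key is assumed to be include, and the set of extensions treated as source code is a guess.

```javascript
const fs = require("node:fs");
const path = require("node:path");

const CONFIG_FILES = [".flakemonsterrc.json", "flakemonster.config.json"];
const SOURCE_DIRS = ["src", "lib", "app"];
// Assumption: which extensions count as injectable source is a guess.
const SOURCE_EXTS = new Set([".js", ".mjs", ".ts"]);

// Collect the source extensions found anywhere under a directory.
function extensionsUnder(dir) {
  const found = new Set();
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      for (const ext of extensionsUnder(full)) found.add(ext);
    } else if (SOURCE_EXTS.has(path.extname(entry.name))) {
      found.add(path.extname(entry.name));
    }
  }
  return found;
}

function pickGlobs(root) {
  // 1. Config file: use its include patterns if present.
  for (const name of CONFIG_FILES) {
    const file = path.join(root, name);
    if (!fs.existsSync(file)) continue;
    const config = JSON.parse(fs.readFileSync(file, "utf8"));
    if (Array.isArray(config.include) && config.include.length > 0) {
      return { source: "config", globs: config.include };
    }
  }
  // 2. Project structure: one glob per directory and extension found.
  const globs = [];
  for (const dir of SOURCE_DIRS) {
    const full = path.join(root, dir);
    if (!fs.existsSync(full) || !fs.statSync(full).isDirectory()) continue;
    for (const ext of [...extensionsUnder(full)].sort()) {
      globs.push(`${dir}/**/*${ext}`);
    }
  }
  if (globs.length > 0) return { source: "structure", globs };
  // 3. Nothing conclusive: the caller should ask the developer.
  return { source: "ask", globs: [] };
}
```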

Step 4: Build the FlakeMonster Command

The agent constructs the full CLI command with appropriate flags. It uses --format json for structured output parsing. If a config file exists, the agent defers to config defaults for --runs and --mode unless you explicitly specified values.

For Playwright test commands, the agent automatically appends --reporter=json to the test command so FlakeMonster's built-in Playwright parser can process the output.

The constructed command looks like:

$ npx flake-monster test --format json --runs 10 --mode medium --cmd "npm test" "src/**/*.js"
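Assembling that command might look like the sketch below. The function name buildArgs is hypothetical, and detecting Playwright by the substring "playwright test" is an assumption. The flags themselves are the ones shown in the example command. Returning an argument array (for use with npx) avoids shell-quoting the test command.

```javascript
// Build the argument list to pass to npx, mirroring the example command.
// Hypothetical helper; only flags shown in this document are used.
function buildArgs({ testCmd, globs, runs, mode }) {
  let cmd = testCmd;
  // Assumption: a Playwright run is recognized by this substring.
  if (/\bplaywright test\b/.test(cmd) && !cmd.includes("--reporter")) {
    cmd += " --reporter=json";
  }
  const args = ["flake-monster", "test", "--format", "json"];
  // Omit --runs and --mode when not given so config-file defaults apply.
  if (runs !== undefined) args.push("--runs", String(runs));
  if (mode !== undefined) args.push("--mode", mode);
  args.push("--cmd", cmd, ...globs);
  return args;
}

console.log(buildArgs({ testCmd: "npm test", globs: ["src/**/*.js"], runs: 10, mode: "medium" }).join(" "));
```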

Step 5: Execute Tests

The agent runs the command with a long timeout (10 minutes) since multiple test iterations can take time. It captures both stdout and stderr for complete analysis.
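In Node.js terms, running a command with a 10-minute ceiling while capturing both streams could look like this sketch. The helper name runCaptured and the buffer cap are assumptions. How the agent actually spawns processes depends on the host tool.

```javascript
const { spawnSync } = require("node:child_process");

const TEN_MINUTES_MS = 10 * 60 * 1000;

// Run a command, capture stdout and stderr, and give up after the timeout.
// Hypothetical helper for illustration.
function runCaptured(file, args, timeoutMs = TEN_MINUTES_MS) {
  const result = spawnSync(file, args, {
    encoding: "utf8",
    timeout: timeoutMs,
    maxBuffer: 64 * 1024 * 1024, // assumption: generous cap for large JSON reports
  });
  return {
    exitCode: result.status, // null when the process was killed
    timedOut: Boolean(result.error && result.error.code === "ETIMEDOUT"),
    stdout: result.stdout || "",
    stderr: result.stderr || "",
  };
}
```

Keeping stderr alongside stdout matters because test runners often print failure details there even when the structured report goes to stdout.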

Step 6: Analyze Results

The agent parses the JSON output and reports what it finds:

Outcome	What the Agent Reports
All runs passed	Tests appear stable under delay injection. No timing sensitivity detected.
Flaky tests found	Each flaky test by name, its failure rate, and which seeds caused failures. Suggests re-running with `--seed <N> --runs 1 --keep-on-fail` for debugging.
Tests always fail	These are pre-existing bugs, not flakiness. Suggests fixing them first, then re-running.
Output not parseable	Falls back to run-level analysis using exit codes. Reports pass/fail counts without per-test breakdown.
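The first three outcomes in the table reduce to a per-test tally across runs, sketched below. The input shape (an array of runs, each with a seed and a list of tests carrying name and passed) is hypothetical and chosen for illustration. FlakeMonster's real JSON output may be structured differently.

```javascript
// Hypothetical input shape: [{ seed, tests: [{ name, passed }] }, ...]
function classifyTests(runs) {
  const byName = new Map();
  for (const run of runs) {
    for (const test of run.tests) {
      const entry = byName.get(test.name) || { total: 0, failingSeeds: [] };
      entry.total += 1;
      if (!test.passed) entry.failingSeeds.push(run.seed);
      byName.set(test.name, entry);
    }
  }
  return [...byName].map(([name, { total, failingSeeds }]) => {
    const failures = failingSeeds.length;
    let verdict = "flaky"; // failed in some runs but not all
    if (failures === 0) verdict = "stable";
    else if (failures === total) verdict = "always-fails"; // likely a real bug
    return { name, verdict, failureRate: failures / total, failingSeeds };
  });
}
```

The failingSeeds list is what makes the suggested re-run with --seed <N> --runs 1 --keep-on-fail possible.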

Step 7: Clean Up

After reporting results, the agent runs npx flake-monster restore to remove all injected delay code from your source files. Your code is returned to its original state, with no traces of FlakeMonster left behind.

Why Agents Need This

AI coding agents rarely notice flaky tests. An agent runs a test once, sees it pass, and moves on. But that test might be timing-sensitive: it could fail 10% of the time in CI when async operations resolve in a different order, when the database is under load, or when a network call takes 50ms longer than usual.

Without FlakeMonster, this is the typical agent workflow:

1. Agent writes or modifies code
2. Agent runs the test suite once
3. Tests pass
4. Agent commits the change
5. CI fails intermittently: the flaky test surfaces hours or days later

With FlakeMonster, the agent can stress-test its own code changes before committing:

1. Agent writes or modifies code
2. Agent runs /flakemonster npm test
3. FlakeMonster injects async delays and runs the suite 10 times with different seeds
4. A flaky test is detected: the agent sees exactly which test failed and under which seed
5. Agent fixes the timing-sensitive code
6. Agent commits the stable change

The key insight is that many flaky tests stem from timing assumptions, and FlakeMonster deliberately violates those assumptions by injecting delays derived from a seed. Because the same seed reproduces the same delays, a probabilistic problem (the test fails sometimes) becomes a deterministic one (the test fails with seed 948271536).
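To see why a seed makes a failure reproducible, consider this small sketch. It is not FlakeMonster's actual delay algorithm. It swaps in mulberry32, a compact and widely used seeded generator, and the helper name delaysForSeed and the 50 ms ceiling are illustrative assumptions.

```javascript
// mulberry32: a small seeded pseudo-random generator, used here for
// illustration only. FlakeMonster's real generator may differ.
function mulberry32(seed) {
  let a = seed | 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Produce `count` delays in milliseconds, each in [0, maxMs).
function delaysForSeed(seed, count, maxMs = 50) {
  const next = mulberry32(seed);
  return Array.from({ length: count }, () => Math.floor(next() * maxMs));
}

const first = delaysForSeed(948271536, 5);
const second = delaysForSeed(948271536, 5);
// Same seed, same delays: a failing run can be replayed exactly.
console.log(JSON.stringify(first) === JSON.stringify(second));
```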

The skill file teaches the agent the complete workflow so that a single /flakemonster command is all a developer needs to type. The agent handles installation checks, stale state cleanup, file selection, command construction, result analysis, and source restoration automatically.

Think of it this way: FlakeMonster gives your AI coding agent the same superpower a senior engineer has — the instinct to ask "but does this pass reliably?" before calling a change done.
