Agent Skill
FlakeMonster ships as an Agent Skill — a single SKILL.md file that works across Claude Code, Cursor, Codex, Copilot, Windsurf, and Antigravity.
What is an Agent Skill?
Agent Skills are instruction files that teach AI coding assistants new capabilities. Instead of requiring the developer to learn commands, flags, and workflows, a skill file contains everything the agent needs to operate the tool on the developer's behalf.
FlakeMonster's skill teaches the agent how to inject delays, run tests, analyze results, and report flaky tests — all through a single /flakemonster command. The agent handles the entire workflow: checking for stale state, determining which files to target, building the right command, parsing structured output, and cleaning up afterward.
The skill uses the open Agent Skills standard (agentskills.io), which defines a portable format for teaching capabilities to any compliant AI coding assistant.
Installation
One command installs the skill for each supported tool. The skill file is fetched directly from the FlakeMonster repository and placed in the tool-specific directory that the agent reads at startup.
Claude Code
Claude Code reads custom slash commands from the .claude/commands/ directory:
$ mkdir -p .claude/commands && \
    curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
    -o .claude/commands/flakemonster.md
After installation, type /flakemonster in Claude Code followed by your test command.
Cursor
Cursor loads skills from the .cursor/skills/ directory:
$ mkdir -p .cursor/skills/flakemonster && \
    curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
    -o .cursor/skills/flakemonster/SKILL.md
Codex
Codex reads skills from the .agents/skills/ directory:
$ mkdir -p .agents/skills/flakemonster && \
    curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
    -o .agents/skills/flakemonster/SKILL.md
Copilot
Copilot loads skills from the .github/skills/ directory:
$ mkdir -p .github/skills/flakemonster && \
    curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
    -o .github/skills/flakemonster/SKILL.md
Windsurf
Windsurf reads skills from the .windsurf/skills/ directory:
$ mkdir -p .windsurf/skills/flakemonster && \
    curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
    -o .windsurf/skills/flakemonster/SKILL.md
Antigravity
Antigravity loads skills from the .agent/skills/ directory:
$ mkdir -p .agent/skills/flakemonster && \
    curl -sL https://raw.githubusercontent.com/growthboot/FlakeMonster/refs/heads/main/SKILL.md \
    -o .agent/skills/flakemonster/SKILL.md
Tip: Commit the skill file to your repository so every developer on the team gets it automatically. It works like any other dotfile config.
Usage
Once installed, type the /flakemonster command in any supported agent followed by your test command:
# Run your default test suite
$ /flakemonster npm test

# Run Playwright end-to-end tests
$ /flakemonster npx playwright test

# Run Jest with extra FlakeMonster flags
$ /flakemonster --runs 20 --mode hardcore npx jest

# Target a specific test file
$ /flakemonster npx playwright test tests/checkout.spec.ts

# Use node:test runner directly
$ /flakemonster node --test test/unit/*.test.js
The agent separates FlakeMonster flags (--runs, --mode, --seed, etc.) from the test command automatically. Everything that is not a FlakeMonster flag becomes the --cmd argument.
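To make the flag separation concrete, here is a minimal sketch of how the split could work. It is an illustration, not the skill's actual logic: the function name `splitInvocation` is hypothetical, the flag names are taken from this page, and which flags take a value is an assumption. The sketch only consumes flags that appear before the test command, so a runner's own options (such as `node --test`) stay inside the command.

```javascript
// Hypothetical flag tables. The flag names appear on this page, but which of
// them take a value is an assumption made for this sketch.
const FLAGS_WITH_VALUE = new Set(["--runs", "--mode", "--seed"]);
const BOOLEAN_FLAGS = new Set(["--keep-on-fail"]);

// Split the arguments typed after /flakemonster into FlakeMonster flags and
// the test command that would be passed as --cmd.
function splitInvocation(args) {
  const flags = [];
  let i = 0;
  while (i < args.length) {
    const arg = args[i];
    if (FLAGS_WITH_VALUE.has(arg) && i + 1 < args.length) {
      flags.push(arg, args[i + 1]);
      i += 2;
    } else if (BOOLEAN_FLAGS.has(arg)) {
      flags.push(arg);
      i += 1;
    } else {
      break; // the first token that is not a FlakeMonster flag starts the test command
    }
  }
  return { flags, cmd: args.slice(i).join(" ") };
}

const parsed = splitInvocation(["--runs", "20", "--mode", "hardcore", "npx", "jest"]);
console.log(parsed.flags.join(" ")); // --runs 20 --mode hardcore
console.log(parsed.cmd); // npx jest
```

Stopping at the first unknown token is a deliberate simplification: it avoids guessing whether a later `--something` belongs to FlakeMonster or to the test runner.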
What the Skill Does
When you invoke /flakemonster, the skill instructs the agent to execute a complete flake detection workflow. The agent handles every step automatically:
Step 1: Validate Installation
The agent checks that flake-monster is available in the project. It looks for the package in devDependencies and tries npx flake-monster --version. If FlakeMonster is not installed, the agent tells you to run:
$ npm install --save-dev flake-monster
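The two checks in this step can be sketched as a small decision function. This is an illustration under stated assumptions: `checkInstallation` is a hypothetical name, the result of `npx flake-monster --version` is passed in as a boolean rather than executed, and treating either check as sufficient is an assumption, since the page does not say how the two are combined.

```javascript
const INSTALL_HINT = "npm install --save-dev flake-monster";

// pkg: the parsed contents of package.json (or null if it could not be read).
// versionCheckSucceeded: whether `npx flake-monster --version` exited cleanly.
function checkInstallation(pkg, versionCheckSucceeded) {
  const devDeps = (pkg && pkg.devDependencies) || {};
  const declared = Object.prototype.hasOwnProperty.call(devDeps, "flake-monster");
  // Assumption: either signal is enough to proceed.
  const available = declared || versionCheckSucceeded;
  return available
    ? { available: true, hint: null }
    : { available: false, hint: INSTALL_HINT };
}

console.log(checkInstallation({ devDependencies: {} }, false).hint);
// npm install --save-dev flake-monster
```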
Step 2: Check for Stale Injections
Before injecting new delays, the agent looks for a leftover .flake-monster/manifest.json from a previous (possibly interrupted) session. Its presence means injected delay code is still in the source files. The agent runs npx flake-monster restore to clean up before proceeding, and reports the stale session's seed and mode so you have context.
Important: Running a new injection on top of stale injected code would double-inject delays, corrupt the source, and produce meaningless results. The skill always ensures a clean starting state.
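The stale-state check could look roughly like the sketch below. It takes the already-parsed manifest (or `null` when the file is absent) so that it stays free of file-system calls. The function name `planStaleCleanup` is hypothetical, and the `seed` and `mode` field names are assumptions; this page does not document the manifest schema.

```javascript
// manifest: parsed contents of .flake-monster/manifest.json, or null if absent.
function planStaleCleanup(manifest) {
  if (manifest === null) {
    return { restore: false, message: "No stale injection found." };
  }
  // Assumed field names; the real manifest may store these differently.
  const seed = manifest.seed ?? "unknown";
  const mode = manifest.mode ?? "unknown";
  return {
    restore: true,
    command: "npx flake-monster restore",
    message: `Stale session found (seed ${seed}, mode ${mode}); restoring before injecting.`,
  };
}

console.log(planStaleCleanup({ seed: 12345, mode: "medium" }).message);
// Stale session found (seed 12345, mode medium); restoring before injecting.
```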
Step 3: Determine Source File Globs
The agent figures out which files to target for delay injection using this priority order:
- Config file: reads include patterns from .flakemonsterrc.json or flakemonster.config.json
- Project structure: scans for src/, lib/, or app/ directories and picks matching glob patterns based on the file extensions present
- Ask the developer: if the project structure is ambiguous, the agent asks which source files to target
Step 4: Build the FlakeMonster Command
The agent constructs the full CLI command with appropriate flags. It uses --format json for structured output parsing. If a config file exists, the agent defers to config defaults for --runs and --mode unless you explicitly specified values.
For Playwright test commands, the agent automatically appends --reporter=json to the test command so FlakeMonster's built-in Playwright parser can process the output.
The constructed command looks like:
$ npx flake-monster test --format json --runs 10 --mode medium --cmd "npm test" "src/**/*.js"
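As a rough illustration of this step, the following sketch assembles the command string from its parts. The function name `buildCommand` and its input shape are hypothetical. The quoting uses `JSON.stringify` purely for display and is not shell-safe escaping, and the Playwright detection is a simple pattern match chosen for the example.

```javascript
function buildCommand({ testCmd, globs, runs, mode }) {
  // Append the JSON reporter for Playwright commands unless one is already set.
  const isPlaywright = /\bplaywright test\b/.test(testCmd);
  const cmd = isPlaywright && !testCmd.includes("--reporter")
    ? `${testCmd} --reporter=json`
    : testCmd;

  const parts = ["npx", "flake-monster", "test", "--format", "json"];
  // Pass --runs and --mode only when explicitly given, so config defaults apply otherwise.
  if (runs !== undefined) parts.push("--runs", String(runs));
  if (mode !== undefined) parts.push("--mode", mode);
  parts.push("--cmd", JSON.stringify(cmd)); // naive quoting, for display only
  for (const glob of globs) parts.push(JSON.stringify(glob));
  return parts.join(" ");
}

console.log(buildCommand({ testCmd: "npm test", globs: ["src/**/*.js"], runs: 10, mode: "medium" }));
// npx flake-monster test --format json --runs 10 --mode medium --cmd "npm test" "src/**/*.js"
```

A real implementation would more likely pass an argument array to the process API instead of building a string, which sidesteps quoting entirely.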
Step 5: Execute Tests
The agent runs the command with a long timeout (10 minutes) since multiple test iterations can take time. It captures both stdout and stderr for complete analysis.
Step 6: Analyze Results
The agent parses the JSON output and reports what it finds:
| Outcome | What the Agent Reports |
|---|---|
| All runs passed | Tests appear stable under delay injection. No timing sensitivity detected. |
| Flaky tests found | Each flaky test by name, its failure rate, and which seeds caused failures. Suggests re-running with --seed <N> --runs 1 --keep-on-fail for debugging. |
| Tests always fail | These are pre-existing bugs, not flakiness. Suggests fixing them first, then re-running. |
| Output not parseable | Falls back to run-level analysis using exit codes. Reports pass/fail counts without per-test breakdown. |
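The first three rows of the table amount to counting passes and failures per test across runs. The sketch below shows one way to do that. The input shape (an array of runs, each with a `seed` and a `tests` array of `{ name, passed }`) is an assumption for illustration only; the actual JSON emitted by FlakeMonster is not documented on this page and may differ. `classifyTests` is a hypothetical name.

```javascript
// runs: [{ seed: number, tests: [{ name: string, passed: boolean }] }] (assumed shape)
function classifyTests(runs) {
  const stats = new Map(); // test name -> { passes, failures, failingSeeds }
  for (const run of runs) {
    for (const test of run.tests) {
      const s = stats.get(test.name) ?? { passes: 0, failures: 0, failingSeeds: [] };
      if (test.passed) {
        s.passes += 1;
      } else {
        s.failures += 1;
        s.failingSeeds.push(run.seed);
      }
      stats.set(test.name, s);
    }
  }

  const report = { stable: [], flaky: [], alwaysFailing: [] };
  for (const [name, s] of stats) {
    if (s.failures === 0) {
      report.stable.push(name);
    } else if (s.passes === 0) {
      report.alwaysFailing.push(name); // fails every run: a pre-existing bug, not flakiness
    } else {
      report.flaky.push({
        name,
        failureRate: s.failures / (s.passes + s.failures),
        failingSeeds: s.failingSeeds,
      });
    }
  }
  return report;
}
```

Each entry in `flaky` carries the seeds that triggered a failure, which is what makes the suggested `--seed <N> --runs 1 --keep-on-fail` re-run possible.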
Step 7: Clean Up
After reporting results, the agent runs npx flake-monster restore to remove all injected delay code from your source files. Your code is returned to its original state, with no traces of FlakeMonster left behind.
Why Agents Need This
AI coding agents do not experience flaky tests. An agent runs a test once, sees it pass, and moves on. But that test might be timing-sensitive — it could fail 10% of the time in CI when async operations resolve in a different order, when the database is under load, or when a network call takes 50ms longer than usual.
Without FlakeMonster, this is the typical agent workflow:
- Agent writes or modifies code
- Agent runs the test suite once
- Tests pass
- Agent commits the change
- CI fails intermittently — the flaky test surfaces hours or days later
With FlakeMonster, the agent can stress-test its own code changes before committing:
- Agent writes or modifies code
- Agent runs /flakemonster npm test
- FlakeMonster injects async delays and runs the suite 10 times with different seeds
- A flaky test is detected — the agent sees exactly which test failed and under which seed
- Agent fixes the timing-sensitive code
- Agent commits the stable change
The key insight is that many flaky tests stem from timing assumptions, and FlakeMonster deliberately violates those assumptions by injecting deterministic, seed-derived delays. This turns a probabilistic problem (the test fails sometimes) into a reproducible one (the test fails with seed 948271536).
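To see why a seed makes a failure repeatable, consider this minimal sketch of seed-derived delays. It uses mulberry32, a small well-known pseudo-random generator, swapped in purely for illustration. How FlakeMonster actually derives its delays is not described on this page, and `delaysForSeed` is a hypothetical helper.

```javascript
// mulberry32: a compact 32-bit PRNG. Same seed in, same number sequence out.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Produce `count` delays in the range [0, maxMs) from a seed.
function delaysForSeed(seed, count, maxMs) {
  const next = mulberry32(seed);
  return Array.from({ length: count }, () => Math.floor(next() * maxMs));
}

const first = delaysForSeed(948271536, 5, 50);
const second = delaysForSeed(948271536, 5, 50);
console.log(first.join(",") === second.join(",")); // true
```

Because the delay schedule is a pure function of the seed, re-running with the seed that exposed a failure replays the same timing, which is what lets the agent debug a single failing run instead of waiting for the failure to recur.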
The skill file teaches the agent the complete workflow so that a single /flakemonster command is all a developer needs to type. The agent handles installation checks, stale state cleanup, file selection, command construction, result analysis, and source restoration automatically.
Think of it this way: FlakeMonster gives your AI coding agent the same superpower a senior engineer has — the instinct to ask "but does this pass reliably?" before calling a change done.