How It Works

FlakeMonster uses a source-to-source transformation approach: it parses your code into an AST to find injection targets, then splices delay statements into the original source text and writes out the modified source. No runtime hooks, no monkey-patching.

Architecture Overview

FlakeMonster is built around a strict separation between language-agnostic orchestration and language-specific code manipulation. The codebase is divided into three layers:

Core (src/core/) — The engine, workspace manager, manifest tracker, profile system, seed derivation, reporter, config loader, and test output parsers. The core is entirely language-agnostic. It orchestrates the injection/restore/test lifecycle but never touches an AST directly.
Adapters (src/adapters/) — Language-specific implementations that handle all parsing, AST walking, node injection, and code generation. Each adapter conforms to a shared contract so the core can drive any language through the same interface.
Runtime (src/runtime/) — Minimal, self-contained per-language runtime files. These get copied into the target project during injection and provide the delay function that injected code calls at runtime.

Currently JavaScript is the only adapter, but the architecture is explicitly designed for future language support — TypeScript, Python, Go, and any other language with async primitives can be added by implementing the adapter contract.

The key insight is that the core never needs to understand syntax. It asks the adapter "inject delays into this source string" and gets back a modified source string. The engine handles everything else: file discovery, workspace management, manifest tracking, test execution, and result analysis.

The Adapter Contract

Each language adapter exposes two core transformation functions (inject and remove), plus metadata properties, two helper methods (canHandle and getRuntimeInfo), and a scan method for recovery:

inject(source, options) → InjectionResult — Parse the source string, walk the AST, inject delay statements at appropriate locations, and return the modified source along with injection point metadata.
remove(source) → RemovalResult — Strip all injected code from the source string and return the original source. This must handle code that has been reformatted by linters or editors since injection.
id, displayName, fileExtensions — Metadata that identifies the adapter and the file types it handles.
canHandle(filePath) — Returns whether a given file should be processed by this adapter.
getRuntimeInfo() — Returns the runtime source path and filename for this language.
scan(source) — Scans source for remaining traces of injected code, used by recovery mode.

The engine calls these methods without knowing anything about the underlying language, AST format, or parser. This is the boundary that makes multi-language support possible: a Python adapter would use a completely different parser and AST representation, but the engine would call it the same way.

The InjectionResult contains:

{
  "source":        "...modified code...",  // the transformed source string
  "points":        [/* InjectionPoint[] */], // metadata for every injection made
  "runtimeNeeded": true                    // whether the runtime import was added
}

The RemovalResult contains:

{
  "source":       "...original code...",  // the restored source string
  "removedCount": 5                       // number of injections stripped
}
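
To make the boundary concrete, here is a minimal sketch of an object that follows the contract's shape, plus an engine-side helper that drives it. Everything in it is hypothetical: the toy adapter does no real parsing, the InjectionPoint fields are invented for illustration, injectFile is not an actual FlakeMonster function, and getRuntimeInfo and scan are omitted because their return shapes are not specified here.

```javascript
// Hypothetical toy adapter. It follows the contract's shape but does no real
// parsing: "injection" just prepends one marker line to the source.
const TOY_MARKER = "# toy-delay";

const toyAdapter = {
  id: "toy",
  displayName: "Toy Language",
  fileExtensions: [".toy"],

  canHandle(filePath) {
    return this.fileExtensions.some((ext) => filePath.endsWith(ext));
  },

  inject(source, _options) {
    return {
      source: `${TOY_MARKER}\n${source}`,
      points: [{ line: 1 }], // InjectionPoint fields are assumed here
      runtimeNeeded: false,
    };
  },

  remove(source) {
    const lines = source.split("\n");
    const kept = lines.filter((line) => line !== TOY_MARKER);
    return { source: kept.join("\n"), removedCount: lines.length - kept.length };
  },
};

// Engine-side sketch: pick the first adapter that claims the file, then
// treat inject() as an opaque string-to-string transformation.
function injectFile(adapters, filePath, source, options) {
  const adapter = adapters.find((a) => a.canHandle(filePath));
  if (!adapter) return null; // no adapter registered for this file type
  return adapter.inject(source, options);
}
```

The helper never inspects the source itself, which is the property the contract is meant to guarantee.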

AST Pipeline

The JavaScript adapter uses a five-stage pipeline to transform source code. Crucially, the injection path uses text-based string splicing rather than AST code generation, which preserves the original formatting exactly:

Acorn — Parses JavaScript source into an ESTree-compatible AST. Acorn is a fast, standards-compliant parser that handles modern JavaScript syntax including async/await, top-level await, and ES modules.
astravel — Attaches comments from the source to their corresponding AST nodes. This is used during parsing to understand the existing comment structure.
acorn-walk — Walks the AST to find injection targets: async function declarations, async arrow functions, async methods, and top-level module statements.
Compute insertions — For each target location, the injector computes a text insertion descriptor containing the character offset and the delay text to insert (a marker comment and await __FlakeMonster__(N)). No AST mutation happens at this stage.
String-splice — The insertion descriptors are applied back-to-front into the original source string, preserving all original formatting, whitespace, and comments exactly as written.

The pipeline in simplified form:

Source  →  Acorn.parse()  →  acorn-walk.simple()  →  computeInjections()  →  applyInsertions()  →  Modified Source
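
The back-to-front splice in the final stage can be sketched as follows. This is an illustration rather than the adapter's actual code: the descriptor shape ({ offset, text }) is an assumption, and the function body is a guess at what a function named applyInsertions might do.

```javascript
// Apply insertion descriptors from the highest offset to the lowest, so that
// inserting text never shifts an offset that has not been applied yet.
// The descriptor shape ({ offset, text }) is assumed for illustration.
function applyInsertions(source, insertions) {
  const ordered = [...insertions].sort((a, b) => b.offset - a.offset);
  let result = source;
  for (const { offset, text } of ordered) {
    result = result.slice(0, offset) + text + result.slice(offset);
  }
  return result;
}

const original = "async function load() {\n  return fetchData();\n}\n";

// Offset of the first character after the opening brace line.
const bodyStart = original.indexOf("{\n") + 2;

const delayText =
  "  /* @flake-monster[jt92-se2j!] v1 */\n" +
  "  await __FlakeMonster__(23);\n";

const injected = applyInsertions(original, [{ offset: bodyStart, text: delayText }]);
```

Because the original string is only sliced and concatenated, every character outside the inserted text is preserved byte for byte.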

Comment Handling Gotcha

When re-parsing injected code for removal, astravel attaches marker comments to adjacent nodes rather than the removed ones. The remover must perform a full AST walk to strip orphaned marker comments after removing delay statements. This is handled by the stripMarkerComments() function in the remover.

Injected Code Anatomy

Each injection point consists of exactly two lines inserted into the source:

/* @flake-monster[jt92-se2j!] v1 */
await __FlakeMonster__(23);

These two lines have carefully chosen properties:

The comment is the marker stamp — the string @flake-monster[jt92-se2j!] v1 is a unique, greppable identifier that enables safe removal. The stamp jt92-se2j! is deliberately unusual to avoid false matches in real code. The v1 suffix is a version tag for future-proofing the removal logic.
The await expression calls the runtime's delay function with a deterministic millisecond value. The value (23 in this example) is computed at injection time from the seed derivation chain — it is a literal number, not a runtime computation.
The __FlakeMonster__ identifier uses double-underscore naming deliberately. This resists lint rules that might strip or rename ordinary variables, ensuring the injected code survives auto-formatting.

At the top of each injected file, the adapter also inserts a runtime import:

import { __FlakeMonster__ } from "./flake-monster.runtime.js";

The import path is computed relative to each injected file so that it resolves to the runtime at the project root. The ./ form shown above applies to a file that sits in the project root itself. A file at src/api.js gets ../flake-monster.runtime.js, while a file at src/lib/utils.js gets ../../flake-monster.runtime.js.
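
One way to derive that specifier is to count how many directories deep the file sits. The helper below is a sketch under two assumptions: the name runtimeImportPath is hypothetical, and the input is a project-relative path with POSIX separators.

```javascript
// Build the runtime import specifier for a file, given its path relative to
// the project root. Assumes forward-slash separators and no leading "./".
function runtimeImportPath(relativeFilePath, runtimeFileName = "flake-monster.runtime.js") {
  const depth = relativeFilePath.split("/").length - 1; // number of parent directories
  const prefix = depth === 0 ? "./" : "../".repeat(depth);
  return prefix + runtimeFileName;
}

const examples = ["index.js", "src/api.js", "src/lib/utils.js"].map((file) => ({
  file,
  specifier: runtimeImportPath(file),
}));
```

A real implementation would probably normalize Windows separators first; that step is left out here.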

Both constants — the DELAY_OBJECT identifier (__FlakeMonster__) and the MARKER_PREFIX (@flake-monster[jt92-se2j!] v1) — are defined in injector.js so they stay in sync across injection and removal.

The Runtime

The runtime is a single-line ES module:

export const __FlakeMonster__ = (ms) => new Promise(r => setTimeout(r, ms));

This is all there is. No PRNG, no hashing, no configuration. The runtime is a pure delay function — it takes a millisecond value and returns a Promise that resolves after that many milliseconds.

Key properties of the runtime:

Zero dependencies — the file has no imports and relies only on setTimeout, which is available in every JavaScript environment.
Works in Node.js and browsers — setTimeout is a universal API. The runtime works in Node.js, browsers, Deno, Bun, and any other JavaScript environment.
Copied, not linked — the runtime file is copied to the project root during injection as flake-monster.runtime.js. It does not depend on node_modules or any package resolution. This means injected code works even if FlakeMonster itself is not installed.
Deterministic at the source level — all randomness (seed derivation, delay computation) happens at injection time. The runtime simply executes the pre-computed delay values embedded in the source.

Why Source-to-Source?

FlakeMonster chose source-to-source transformation over alternatives like runtime monkey-patching, V8 hooks, or test framework plugins. Three reasons drove this decision:

1. Stable Debugging Surface

Injected code is visible in your editor, debugger, and stack traces. When a test fails, you can open the file and see exactly which delay caused the timing shift. There is no hidden runtime magic, no indirection through proxies, and no opaque stack frames from an interception layer. You can set a breakpoint on the injected await line and step through the timing perturbation directly.

2. Works in Browsers

Source-to-source output is plain JavaScript. It survives bundling (webpack, Vite, esbuild, Rollup), tree-shaking, and minification. The injected delays execute in the browser exactly as they do in Node.js. This matters for projects that test browser code — the same FlakeMonster injection works whether your test runner is Node-based or browser-based.

3. Language Agnostic

The approach generalizes to any language with async primitives. Each language needs only a parser and a way to emit the modified source. The core engine does not depend on JavaScript-specific APIs like vm.Module, V8 inspector, or Node.js loader hooks. A Python adapter would parse Python source and inject await asyncio.sleep(N) calls: the same pattern, different syntax.

Manifest

FlakeMonster creates .flake-monster/manifest.json during injection to track the state of all injected files. The manifest serves as the single source of truth for what has been modified:

{
  "version": 1,
  "createdAt": "2026-02-24T16:00:00Z",
  "seed": 12345,
  "mode": "medium",
  "files": {
    "src/api.js": { "injections": 3, "originalHash": "abc123" },
    "src/store.js": { "injections": 2, "originalHash": "def456" }
  }
}

The manifest is used for several purposes:

Tracking which files were injected — the files map lists every file that received delays, along with the number of injections and a hash of the original content.
Stale injection detection — if a file's current hash does not match the originalHash in the manifest, FlakeMonster knows the file has been modified since injection and can warn or refuse to operate on it.
Preventing double-injection — if a manifest already exists, FlakeMonster will not inject again unless the user explicitly restores first. This prevents cascading delays that would corrupt the source.
Restoration — the restore command reads the manifest to know which files need to be cleaned up and verifies the restoration was complete.
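
As a rough illustration of how such a manifest could be built and then used to verify a restore, consider the sketch below. It rests on several assumptions: SHA-256 is a guess at the hash algorithm, and buildManifest, hashSource, and isRestored are hypothetical names, not FlakeMonster's API. It is written as CommonJS so it runs as a plain Node script.

```javascript
const { createHash } = require("node:crypto");

// Hash algorithm is an assumption; the manifest format only shows an opaque string.
function hashSource(source) {
  return createHash("sha256").update(source).digest("hex");
}

// files: array of { path, originalSource, injections } (hypothetical input shape).
function buildManifest({ seed, mode, files }) {
  const entries = {};
  for (const file of files) {
    entries[file.path] = {
      injections: file.injections,
      originalHash: hashSource(file.originalSource),
    };
  }
  return {
    version: 1,
    createdAt: new Date().toISOString(),
    seed,
    mode,
    files: entries,
  };
}

// After restore, the file content should hash back to the recorded original.
function isRestored(manifest, filePath, currentSource) {
  const entry = manifest.files[filePath];
  return Boolean(entry) && entry.originalHash === hashSource(currentSource);
}
```

Recording the hash before injection is what lets a later restore be checked without keeping a second copy of every file.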

Removal

Removal is text-based, not AST-based. This is a deliberate design choice. After injection, the source files may pass through linters (ESLint, Prettier), editors (auto-format on save), or other tools that reformat the code. An AST-based remover would need to re-parse the potentially reformatted code and match nodes — a fragile process. Instead, FlakeMonster's remover uses line-by-line text matching on three patterns:

The stamp comment: lines containing @flake-monster[jt92-se2j!]
The delay identifier: lines containing await __FlakeMonster__(
The runtime import: lines containing import { __FlakeMonster__ }

Any line matching one of these patterns is removed. The stamp comment is the primary driver: its unusual format (jt92-se2j!) makes false positives extremely unlikely in real codebases. The __FlakeMonster__ identifier provides a secondary match for cases where the comment was stripped but the delay statement survived.

After text-based removal, the remover also handles orphaned marker comments that astravel may have re-attached to adjacent nodes during any intermediate re-parsing. The stripMarkerComments() function performs a full scan to catch these edge cases.
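
The line-matching step described above can be sketched as a simple filter. This is not the actual remover: the function name removeInjected is hypothetical, counting one injection per delay statement is an assumption, and the orphaned-comment pass handled by stripMarkerComments() is not reproduced.

```javascript
// The three patterns listed above, as plain substrings.
const MARKER = "@flake-monster[jt92-se2j!]";
const DELAY_CALL = "await __FlakeMonster__(";
const RUNTIME_IMPORT = "import { __FlakeMonster__ }";

// Drop every line that contains one of the patterns.
function removeInjected(source) {
  let removedCount = 0;
  const kept = source.split("\n").filter((line) => {
    if (line.includes(DELAY_CALL)) {
      removedCount += 1; // assumption: one injection per delay statement
      return false;
    }
    return !line.includes(MARKER) && !line.includes(RUNTIME_IMPORT);
  });
  return { source: kept.join("\n"), removedCount };
}

const injectedSource = [
  'import { __FlakeMonster__ } from "../flake-monster.runtime.js";',
  "async function load() {",
  "  /* @flake-monster[jt92-se2j!] v1 */",
  "  await __FlakeMonster__(23);",
  "  return fetchData();",
  "}",
].join("\n");

const restored = removeInjected(injectedSource);
```

Substring matching tolerates re-indentation and changed quote styles. It would, however, also drop real code if a formatter ever joined a delay statement onto the same line as an original statement, so this sketch assumes injected statements stay on their own lines.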

Workspace vs In-Place

FlakeMonster supports two modes for where injection happens, each with distinct tradeoffs:

In-Place (default)

In-place mode modifies your source files directly. The injected delays are written into the actual files on disk.

Faster — no file copying overhead. Injection and restoration operate on the original files.
Better for manual debugging — you can open the injected files in your editor, set breakpoints on delay lines, and step through the timing perturbations.
Requires restoration — files must be restored to their original state after each test run. The test command handles this automatically, but if you use inject manually, you must run restore when done.

# In-place: test command handles inject + restore automatically
$ flake-monster test --cmd "npm test"

# In-place: manual workflow
$ flake-monster inject "src/**/*.js"
$ npm test
$ flake-monster restore

Workspace

Workspace mode creates isolated copies of your source files in .flake-monster/workspaces/run-N-seed-M/. The original files are never modified.

Safer — original files stay untouched. No risk of corrupted source if the process crashes mid-injection.
Isolated per run — each test run gets its own workspace directory with its own set of injected files. This allows parallel analysis of different timing profiles.
Preservable — use --keep-on-fail to preserve the workspace of failing runs. You can inspect the exact injected code that caused the failure after the run completes.
More disk space — every workspace is a full copy of the matched files. With 10 runs over a large project, this can add up.

# Workspace mode with failure preservation
$ flake-monster test --workspace --keep-on-fail --cmd "npm test"

Tip: Use workspace mode in CI environments where a crash during in-place restoration could leave your working tree in a dirty state. Use in-place mode during local development where the speed benefit matters and you have git to recover.

Implementation Note

Because .flake-monster/workspaces/ lives inside the project directory, Node's fs.cp would throw EINVAL when attempting to copy the project into its own subdirectory — even with a filter function. FlakeMonster works around this by using a manual recursive copy that explicitly skips the .flake-monster directory.
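
A manual recursive copy of this kind might look like the sketch below. The function name copyProject and its parameters are hypothetical, symlinks and other special entries are simply skipped, and the code is written as CommonJS so it runs as a plain Node script.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Recursively copy srcDir into destDir, skipping any entry named skipName.
// Skipping by name keeps the walk from ever descending into the workspace
// directory, even when destDir lives inside srcDir.
function copyProject(srcDir, destDir, skipName = ".flake-monster") {
  fs.mkdirSync(destDir, { recursive: true });
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    if (entry.name === skipName) continue;
    const from = path.join(srcDir, entry.name);
    const to = path.join(destDir, entry.name);
    if (entry.isDirectory()) {
      copyProject(from, to, skipName);
    } else if (entry.isFile()) {
      fs.copyFileSync(from, to);
    }
    // Symlinks and other entry types are ignored in this sketch.
  }
}
```

Calling copyProject(projectRoot, path.join(projectRoot, ".flake-monster", "workspaces", "run-1")) creates the destination first and then walks the project, so the skip check is what prevents the copy from recursing into itself.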

Flakiness Analysis

After all test runs complete, FlakeMonster analyzes the results and classifies each test into one of three categories:

Classification	Condition	Meaning
Flaky	Passes some runs, fails others	Timing-sensitive — likely a race condition, order dependency, or shared mutable state that behaves differently under varied async timing.
Stable	Passes all runs	No timing issues detected under the tested delay profiles. The test is resilient to the async perturbations FlakeMonster introduced.
Always-failing	Fails all runs	Pre-existing bug, not flakiness. The test fails regardless of timing, which means the failure is deterministic and unrelated to async ordering.

Flaky Rate

For each flaky test, FlakeMonster computes a flaky rate: the number of failed runs divided by the total number of runs. This gives you a quantitative measure of how sensitive the test is to timing variation.

flakyRate = failedRuns / totalRuns

For example, a test that fails 2 out of 10 runs has a 20% flaky rate. A test that fails 8 out of 10 runs has an 80% flaky rate — it almost always fails under timing perturbation, suggesting a severe race condition.

The flaky rate helps you prioritize fixes. A test with a 5% flaky rate might be tolerable in the short term, while a test with a 60% flaky rate is likely disrupting your CI pipeline regularly and should be fixed immediately.
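
The classification rules and the flaky rate formula can be expressed together in a few lines. The sketch below assumes each test's results arrive as an array of booleans (true for pass); the function name classifyTest and the return shape are hypothetical.

```javascript
// outcomes: one boolean per run, true = pass, false = fail (assumed input shape).
function classifyTest(outcomes) {
  if (outcomes.length === 0) {
    throw new Error("at least one run is required");
  }
  const totalRuns = outcomes.length;
  const failedRuns = outcomes.filter((passed) => !passed).length;

  let classification;
  if (failedRuns === 0) {
    classification = "stable"; // passes all runs
  } else if (failedRuns === totalRuns) {
    classification = "always-failing"; // fails all runs
  } else {
    classification = "flaky"; // passes some runs, fails others
  }

  return { classification, failedRuns, totalRuns, flakyRate: failedRuns / totalRuns };
}

// Two failures in ten runs, matching the 20% example above.
const result = classifyTest([true, true, false, true, true, false, true, true, true, true]);
```

The rate is computed for every test here for simplicity, although it is only meaningful for tests classified as flaky.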

Example Output

FlakeMonster v0.4.6  seed=12345  mode=medium  runs=10

Run  1/10 PASS (seed=3892047156)
Run  2/10 PASS (seed=1740283695)
Run  3/10 FAIL (seed=948271536)
Run  4/10 PASS (seed=2618493027)
Run  5/10 PASS (seed=741928365)
Run  6/10 FAIL (seed=3019482756)
Run  7/10 PASS (seed=1582937461)
Run  8/10 PASS (seed=2847193025)
Run  9/10 PASS (seed=4102938475)
Run 10/10 PASS (seed=938471625)

-- Results --

1 flaky test detected:

  cart > applies discount code
    Failed seeds: 948271536, 3019482756
    Flaky rate: 20%

14 stable tests
0 always-failing tests

The failed seeds let you reproduce each specific failure. Pass any of them back as --seed with --runs 1 to recreate the exact timing conditions that exposed the bug.
