workflow-builder
Build, track, and resume hierarchical workflows. Every workflow is a tree of nodes (Goal → Phases → Tasks → Sub-tasks). When work hits a "we need data/tool we don't have yet" moment, spawn a SUB-WORKFLOW node — and resume the parent automatically when the sub completes. State persists per-project so any session (and any future agent) can pick up exactly where the last one paused. Use when you say "start a workflow", "we hit a fork", "resume X", "show me where we are in Y", or whenever beginning a non-trivial multi-step build that may discover unknown unknowns mid-flight.
Workflow Builder
Real builds aren't linear. They look like trees: a goal at the root, with phases that decompose into tasks, that occasionally fork into entirely new sub-workflows (when a parent task discovers it needs a tool/dataset/process that doesn't exist yet).
Running example used throughout this skill: "Ship a public `/reports` REST API." Midway through building the endpoint you discover there's no rate-limiter — and you can't expose a public API without one. That's a fork: you spawn a `rate-limiter` sub-workflow, build it, return to the API work, and later realise the rate-limiter is reusable and graduate it into a shared platform skill.
This skill makes that tree explicit and persistent.
Why we need this
Three failure modes this skill prevents:
- Context loss when forks happen. "We need X to continue Y" → we build X over many turns → we forget Y was the parent. The state file pins the parent so we always know where to resume.
- Sub-workflows that turn out to be reusable get lost. The rate-limiter started as a one-off detour for the `/reports` API; it's actually shared middleware every endpoint needs. The skill captures this graduation explicitly.
- No durable "where are we" answer. Every new session asks "what's the state of project X?" — the workflow tree is the answer, in YAML, in the project's own folder.
The four shapes (taxonomy)
Not everything is a workflow. There are four distinct shapes of work, with different state-keeping needs. Pick the right shape before scaffolding anything:
| Shape | Lifecycle | State location | Examples |
|---|---|---|---|
| One-shot task | Atomic — start to finish in one shot, no decomposition | Just chat / no file state | Reply to an email, fix a typo, ad-hoc query, kill a stray process |
| Workflow | Finite — has a "done" state, may fork, can graduate | <project>/workflow-state/ (this skill) | The /reports API build, the rate-limiter sub-workflow (during its build), a single feature ship |
| Operation | Recurring — same shape each cycle, runs on cadence | <project>/runs/<YYYY-MM>/workflow-state/ (one instance per run) | A monthly release, a nightly data-sync, a weekly report pipeline |
| Development | Open-ended — product that keeps evolving, never "done" | Conventional project folder (CONTEXT.md + PROGRESS.md + code) | The main app, a long-lived service, the SaaS product itself |
Graduation ladder (work moves UP as it matures):
One-shot task  →  Workflow   →  Operation     →  Development
atomic            finite        recurring        open-ended
no state          per-build     per-instance     continuous
Crucial: an Operation IS a workflow that runs on a cadence. Same tree shape, instantiated each cycle. Phases under an operation can themselves be sub-workflows (graduated from earlier one-time builds) OR atomic tasks (a human clicking "approve" once a cycle) OR nested operations.
When does work move up the ladder?
- One-shot → workflow: when mid-flight you realise it has ≥3 steps OR might span sessions OR has discoverable unknowns
- Workflow → operation: when it finishes AND turns out to repeat on cadence — EXTRACT the pattern into a skill, leave the original workflow state as the historical record
- Operation → development: when "running it" turns into "continuously evolving the product that runs it"
Things rarely move DOWN. A retired development becomes an operation (just run, no enhance); an unused operation becomes a one-shot fallback.
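The promotion rules above can be read as a small decision function. This is only a sketch under stated assumptions: the names `Shape` and `classify_work` are hypothetical, and the caller is assumed to already know the step count and the yes/no answers.

```python
from enum import Enum


class Shape(Enum):
    ONE_SHOT = "one-shot task"
    WORKFLOW = "workflow"
    OPERATION = "operation"
    DEVELOPMENT = "development"


def classify_work(steps: int, spans_sessions: bool = False,
                  has_unknowns: bool = False, recurs_on_cadence: bool = False,
                  keeps_evolving: bool = False) -> Shape:
    """Map the ladder's promotion rules onto a shape (hypothetical helper)."""
    # Check the highest rung first so the most mature shape wins.
    if keeps_evolving:
        return Shape.DEVELOPMENT
    if recurs_on_cadence:
        return Shape.OPERATION
    # One-shot becomes a workflow at >=3 steps, multiple sessions, or unknowns.
    if steps >= 3 or spans_sessions or has_unknowns:
        return Shape.WORKFLOW
    return Shape.ONE_SHOT
```

For example, `classify_work(1)` stays a one-shot task, while `classify_work(1, has_unknowns=True)` is promoted to a workflow.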
Skill vs Workflow — the most-asked meta-question
These are orthogonal. Don't conflate them.
| | Skill | Workflow |
|---|---|---|
| Grammatical shape | Noun (a capability) | Verb (an execution) |
| Where it lives | .claude/skills/<name>/ (versioned alongside code) | <project>/workflow-state/ (per-project state) |
| Lifecycle | Stable — refined over months, same across every invocation | Has a start, has a "done", may fork, may merge |
| Reusable? | Yes — invoke from any project, any session | No — this specific instance is one-time |
| Source of truth for | "How do we do X?" | "Where are we in this particular X?" |
| Analogy | Recipe in a cookbook | Tonight's dinner (with its own "garlic ✓, no olive oil" state) |
They interact like this: workflows USE skills. A single workflow can invoke many skills as it progresses — the /reports API build pulls in a testing skill for the handlers and a UI skill for the docs page. The workflow-builder skill orchestrates STATE; domain skills do WORK.
Skills are horizontal (capabilities — same across projects). Workflows are vertical (one specific delivery — pulls horizontal skills as needed).
The extraction pattern (how the skill library grows):
Workflow runs once → done, archive the state folder
Workflow recurs → EXTRACT the pattern into a skill (Phase 5 resume step covers this)
Skill matures → other workflows discover and use it
The rate-limiter is the proof case. Started as a fork off the /reports API. Finished. Realised every public endpoint needs it. We extracted the pattern into a shared middleware skill. The original workflow state lives on as historical record; the skill is the reusable capability.
When you finish a workflow, ask: "will this recur?" If yes → extract into a skill before closing.
When to use
- Starting a new build that has more than ~3 steps and might discover gaps mid-flight → run the start sequence (phases 1+2)
- Hitting a fork mid-task ("we need something we don't have") → run the FORK phase, spawn a sub-workflow, mark parent as `blocked-by`
- Resuming after a break or in a new session → workflow.yaml is the source of truth; CONTEXT.md the human-readable index
- Graduating a sub-workflow into a permanent recurring process → MOVE the node into a phase of another long-running workflow
When NOT to use
- One-shot tasks (a single edit, a single PR) — overkill
- Workflows that are already codified as a separate skill (those have their own phase model)
- Pure data-exploration sessions where the goal IS the discovery, not a deliverable
The state lives WITH the project
Per workflow, state goes in <project>/workflow-state/:
<project-folder>/
├── CONTEXT.md              ← human-readable project overview (existing)
└── workflow-state/
    ├── workflow.yaml       ← the tree (single source of truth)
    ├── current-pointer.md  ← short "you are here" note (auto-updated)
    └── log.jsonl           ← append-only history of node transitions
This is intentional — the workflow state travels with the project, not with the skill. The skill is the grammar; each project's workflow.yaml is a sentence.
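As an illustration of that layout, here is a minimal scaffolding sketch using only the standard library. The function name `scaffold_state`, the log record keys (`at`, `node`, `to`) and the wording of the pointer note are assumptions made for this example; the skill's real templates may differ.

```python
import json
from datetime import date
from pathlib import Path


def scaffold_state(project_dir: str, goal_id: str, title: str, why: str) -> Path:
    """Create <project>/workflow-state/ with its three files (hypothetical helper)."""
    state = Path(project_dir) / "workflow-state"
    tree_file = state / "workflow.yaml"
    if tree_file.exists():
        # Never clobber existing state; a resuming session should load it instead.
        raise FileExistsError(f"{tree_file} already exists; load it instead")
    state.mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()
    quoted = json.dumps  # JSON string literals are valid double-quoted YAML scalars
    skeleton = [
        "goal:",
        f"  id: {quoted(goal_id)}",
        f"  title: {quoted(title)}",
        "  status: planning",
        f"  why: {quoted(why)}",
        "phases: []",
        "forks: []",
        "current_node: null  # no leaf exists until the Decompose phase runs",
        f"last_updated: {quoted(today)}",
    ]
    tree_file.write_text("\n".join(skeleton) + "\n", encoding="utf-8")
    (state / "current-pointer.md").write_text(
        f"You are here: defining the goal for {title}. Next: decompose into phases.\n",
        encoding="utf-8",
    )
    # Append mode keeps the log append-only even if this file already exists.
    with (state / "log.jsonl").open("a", encoding="utf-8") as log:
        log.write(json.dumps({"at": today, "node": goal_id, "to": "planning"}) + "\n")
    return state
```

Calling `scaffold_state(".", "reports-api", "Ship a public /reports REST API", "...")` from the project root would produce the folder shown above.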
The five phases
| # | Phase | When | See |
|---|---|---|---|
| 1 | Define goal | First touch on a new workflow | phases/01-define-goal.md |
| 2 | Decompose | After goal is locked, before any execution | phases/02-decompose.md |
| 2.5 | Pseudocode the leaf (optional) | Between Decompose and Execute when the leaf has non-trivial control flow — write the language-neutral logic before touching code | (no per-phase doc; see notes below) |
| 3 | Execute node | Most of the time — work one leaf at a time | phases/03-execute-node.md |
| 4 | Detect & spawn fork | When a node can't progress without something the workflow doesn't have | phases/04-detect-fork.md |
| 5 | Resume after sub-completes | When a sub-workflow finishes — return to parent | phases/05-resume.md |
Stage Contracts
Every phase is a CONTRACT. Three fixed sections — Inputs / Process / Outputs — so anyone (or any future session) can read it in 30 seconds and know what loads, what runs, what comes out.
- id: phase-2-build-endpoint
  title: Build the /reports REST handlers + auth check
  status: planned
  inputs:
    - source: L3:reference            # factory — stable across runs
      file: docs/architecture.md
      section: "## API conventions"   # ← selective section routing
      why: routing + error-envelope pattern every endpoint follows
    - source: L4:prev-phase           # product — this run's artifacts
      file: ../phase-1-schema/output/reports-schema.md
      section: full file
      why: the response shape the handlers serialise
  process: |
    1. Add GET /reports + GET /reports/{id}
    2. Wire the auth middleware + error envelope
    3. Add handler tests, run them green
  outputs:
    - artifact: endpoint handlers
      location: src/api/reports.py
      format: python
    - artifact: test report
      location: workflow-state/phase-2-execute.md
      format: markdown
The principles (why this shape works)
- L3 vs L4 — factory vs product. L3 inputs (reference material) are stable across runs and "internalised as constraints" — architecture docs, conventions, style guides, config. L4 inputs (working artifacts) are per-run and "processed as input" — previous-phase outputs, source material the user supplied. Mixing them in the same input row wastes tokens and dilutes attention. (This L3/L4 split is adapted from the Interpreted Context Methodology — the value is the discipline, not the labels.)
- Selective section routing. Don't say "read architecture.md." Say "read the API conventions section of architecture.md." Most reference docs are 80% rationale and 20% rules-for-this-task. Loading the whole file is the equivalent of bringing every cookbook to make one omelette.
- Phase N's outputs become Phase N+1's L4 inputs. This is the handoff. A human can open the output file, edit it, and the next phase picks up the edited version. No state-management layer needed — the filesystem IS the orchestration.
- One-way cross-references. A phase declares what it loads from elsewhere. The "elsewhere" never declares which phases load it. Prevents reference-tracking from blowing up as the workflow library grows.
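Selective section routing can be sketched as a small loader that returns only the requested section of a reference doc. This is one possible approach, not the skill's own loader: `extract_section` is a hypothetical name, and it assumes ATX-style (`#`) headers and backtick fences.

```python
FENCE = "`" * 3  # marker that opens and closes a fenced code block


def extract_section(markdown: str, header: str) -> str:
    """Return the text under `header`, up to the next header of equal or higher level."""
    level = len(header) - len(header.lstrip("#"))
    if level == 0:
        raise ValueError("header must start with '#', e.g. '## API conventions'")
    lines = markdown.splitlines()
    try:
        start = next(i for i, line in enumerate(lines) if line.strip() == header.strip())
    except StopIteration:
        raise KeyError(f"section not found: {header}") from None
    body, in_fence = [], False
    for line in lines[start + 1:]:
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence  # '#' lines inside code fences are not headers
        elif not in_fence and stripped.startswith("#"):
            hashes = len(stripped) - len(stripped.lstrip("#"))
            if hashes <= level and stripped[hashes:hashes + 1] == " ":
                break  # reached a sibling or parent section
        body.append(line)
    return "\n".join(body).strip()
```

With the stage contract above, a phase would call something like `extract_section(text, "## API conventions")` on the contents of docs/architecture.md rather than loading the whole file. Deeper subsections (`###`) stay inside the returned text.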
Back-compat with older workflows
Pipeline-script docstring contract
Workflow phases declare contracts in YAML. The scripts that execute those phases should declare the SAME contract in their module docstring — so when you open the file directly (without the workflow.yaml in view) you can still see in 5 seconds what loads, what runs, what writes.
The shape — first thing in the docstring, before anything else:
"""Phase N — <one-line goal>.

<2-3 sentence summary of what this pass does and why it exists.>

Reads:
    <relative path>   # L3 (reference) — what it provides
    <relative path>   # L4 (working artifact) — which prior phase produced it

Writes:
    <relative path>   — <one-line description>

Methodology:
    - <key decision 1>
    - <key decision 2>
    - <gotchas, threshold values, fallback chains>
"""
Worked example (a generic ETL leaf):
"""Phase 3 — Build the feature matrix.

Join raw events with the customer dimension, derive per-customer aggregates,
and write a single parquet the model phase consumes.

Reads:
    data/raw/events.parquet     # L4 (Phase 2 output — cleaned events)
    data/dim/customers.csv      # L3 (stable: customer dimension)
    config/feature_spec.yaml    # L3 (stable: which aggregates to compute)

Writes:
    data/output/feature_matrix.parquet   — one row per customer, model-ready
    workflow-state/phase-3-execute.md    — row counts + null-rate report

Methodology:
    - Left-join on customer_id; unmatched events dropped (logged count)
    - Aggregates: 7/30/90-day windows; missing → 0 with a presence flag
    - Fails loudly if null-rate on any key feature > 5%
"""
Why this matters. Three failure modes it prevents:
- Drift between yaml contract and code. Re-stating Reads/Writes in the docstring makes mismatches obvious during review (the yaml says "writes X" but the script writes "Y" — a diff hunk shows it).
- Path archaeology. A new contributor opening the script knows immediately where outputs land and where dependencies live — no need to grep for `Path("...")` calls scattered through 300 lines.
- Refactor safety. When restructuring a data folder, the docstring is the manifest you grep for to find every script that reads/writes the affected paths.
When is a script too small to need this? If it's <50 lines AND only reads/writes one file each AND is not part of a multi-phase pipeline, a prose docstring is fine. Everything in a scripts/ folder that's part of a numbered or named phase: structured contract.
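A review-time check for the docstring contract might look like the sketch below. It assumes the contract sections appear as unindented `Reads:`, `Writes:` and `Methodology:` lines with indented entries underneath; `parse_contract` and `missing_sections` are hypothetical names, and only the standard-library `ast` module is used.

```python
import ast

REQUIRED_SECTIONS = ("Reads", "Writes", "Methodology")


def parse_contract(source: str) -> dict[str, list[str]]:
    """Map each 'Name:' section of a module docstring to its indented entries."""
    doc = ast.get_docstring(ast.parse(source)) or ""
    sections: dict[str, list[str]] = {}
    current = None
    for line in doc.splitlines():
        stripped = line.strip()
        if not line.startswith(" ") and stripped.endswith(":"):
            current = stripped[:-1]          # e.g. "Reads:" opens the Reads section
            sections[current] = []
        elif line.startswith(" ") and current is not None and stripped:
            sections[current].append(stripped)
        elif stripped:
            current = None                   # unindented prose ends the section
    return sections


def missing_sections(source: str) -> list[str]:
    """List the required contract sections the docstring does not declare."""
    found = parse_contract(source)
    return [name for name in REQUIRED_SECTIONS if name not in found]
```

Running `missing_sections(Path(script).read_text())` over every file in a scripts/ folder would flag phase scripts whose docstring has drifted from the structured shape.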
Node anatomy (workflow.yaml schema)
goal:
  id: <short-slug>
  title: <one-line>
  status: planning | executing | blocked | completed
  why: <one-paragraph why-this-matters>
phases:
  - id: <slug>
    title: <one-line>
    status: planned | in-progress | blocked | done | skipped
    completed_at: <iso-date or null>
    # Stage contract — see "Stage Contracts" section above
    inputs:                  # what to load
      - source: L3:reference | L4:prev-phase | L4:user-upload
        file: <path>
        section: <"full file" | specific section header>
        why: <one line>
    process: |               # ordered steps
      1. ...
      2. ...
    outputs:                 # what this produces + where it lands
      - artifact: <name>
        location: <path>
        format: <markdown | yaml | json | code | ...>
    notes: <free text>       # optional — only for stuff that doesn't fit above
    tasks:
      - id: <slug>
        status: ...
        # tasks may be leaves OR may have sub-tasks (nested same shape)
forks:
  - id: <slug>                       # the fork's local id within the parent
    spawned_at: <iso-date>
    spawned_from: <parent-node-id>   # which node triggered this
    reason: <why we forked>
    sub_workflow: <path>             # link to the sub-workflow's own state folder
    status: open | merged | abandoned
    return_to: <parent-node-id>      # where we resume when sub-workflow closes
current_node: <node-id>              # where execution should pick up
last_updated: <iso-date>
Invariants
- Every node has a parent (except the goal). No orphaned tasks.
- `current_node` is always one specific leaf-or-fork. Not "somewhere in Phase 3" — a specific id.
- Forks are first-class. They go in `forks:`, not silently in a task. Every fork must list its `return_to`.
- Sub-workflows are full workflows themselves. Same schema, their own state folder, their own log. A sub-workflow can fork further.
- Status of a phase = function of its children. A phase is `done` only when all tasks under it are `done` or `skipped`. The skill auto-rolls this up — never set manually.
- Log is append-only. Every state transition writes a JSONL line. Never delete history; mark `abandoned` instead.
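Several of these invariants can be checked mechanically once the tree is loaded into a dictionary. The sketch below is illustrative only: it assumes a single level of tasks under each phase, the helper names are hypothetical, and the roll-up rules for anything other than `done` (for example, when a phase counts as `blocked`) are my own guess rather than something the skill specifies.

```python
CLOSED = {"done", "skipped"}


def rollup_status(phase: dict) -> str:
    """Derive a phase's status from its tasks instead of setting it by hand."""
    statuses = [task["status"] for task in phase.get("tasks", [])]
    if statuses and all(s in CLOSED for s in statuses):
        return "done"            # documented rule: every task done or skipped
    if "blocked" in statuses:
        return "blocked"         # assumption: one blocked task blocks the phase
    if any(s != "planned" for s in statuses):
        return "in-progress"     # assumption: any touched task means work started
    return "planned"


def check_invariants(tree: dict) -> list[str]:
    """Return human-readable problems; an empty list means the tree looks honest."""
    problems = []
    phases = tree.get("phases", [])
    forks = tree.get("forks", [])
    leaf_ids = {task["id"] for phase in phases for task in phase.get("tasks", [])}
    node_ids = leaf_ids | {phase["id"] for phase in phases}
    fork_ids = {fork["id"] for fork in forks}
    if tree.get("current_node") not in leaf_ids | fork_ids:
        problems.append("current_node must name one specific leaf or fork")
    for fork in forks:
        if fork.get("return_to") not in node_ids:
            problems.append(f"fork {fork['id']} has no valid return_to")
    for phase in phases:
        if phase.get("status") == "done" and rollup_status(phase) != "done":
            problems.append(f"phase {phase['id']} is marked done with open tasks")
    return problems
```

A session could run `check_invariants(tree)` right after loading workflow.yaml and before writing it back.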
Graduating a sub-workflow
When a sub-workflow finishes AND turns out to be reusable (recurring monthly, recurring per-feature, etc.), don't leave it as a one-off in the parent's forks: list. Graduate it into a longer-running workflow or a skill:
- Find the right host (a recurring `*-workflow` skill, or extract a brand-new skill)
- Add a phase to the host that points at the sub-workflow's outputs
- Mark the original fork in the parent as `merged → graduated_to: <host>/phase-N`
- The sub-workflow's state folder stays — it becomes the template for future runs
The rate-limiter IS this. It started as a fork off the /reports API; it's now a shared middleware skill every endpoint pulls in.
When you finish a workflow, ask: "will this recur?" If yes → extract into a skill before closing.
How to use this in conversation
- "start workflow for <project>" → run phases 1+2: scaffold `workflow-state/`, write skeleton `workflow.yaml`, ask the 4 clarifying questions (goal / why / output / first 3 phases)
- "we hit a fork: <reason>" → run phase 4: create fork node, pause parent, spawn sub-workflow folder, set new `current_node`
- "resume X" OR "where are we in X" → read `workflow.yaml`, render the tree, point at `current_node`
- "<sub> is done, back to <parent>" → run phase 5: close fork, mark `merged`, restore parent `current_node`
- "graduate <sub> to <host>" → run the graduation playbook
The skill NEVER executes the underlying work itself. It just keeps the tree honest.
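The bookkeeping behind "we hit a fork" and "<sub> is done, back to <parent>" can be sketched as two functions over the loaded tree. Everything here is an assumption made for illustration: the function names, the flat phase/task lookup, and the log record keys (`at`, `node`, `to`). The log argument is any writable text stream, such as log.jsonl opened in append mode.

```python
import json


def find_node(tree: dict, node_id: str) -> dict:
    """Look up a phase or task by id (assumes one level of tasks per phase)."""
    for phase in tree["phases"]:
        if phase["id"] == node_id:
            return phase
        for task in phase.get("tasks", []):
            if task["id"] == node_id:
                return task
    raise KeyError(node_id)


def record(log, today: str, node_id: str, status: str) -> None:
    """Append one transition as a JSONL line; history is never rewritten."""
    log.write(json.dumps({"at": today, "node": node_id, "to": status}) + "\n")


def spawn_fork(tree, log, fork_id, parent_id, reason, sub_workflow, today):
    """Phase 4: pause the parent, register the fork, move the pointer."""
    find_node(tree, parent_id)["status"] = "blocked"
    tree.setdefault("forks", []).append({
        "id": fork_id, "spawned_at": today, "spawned_from": parent_id,
        "reason": reason, "sub_workflow": sub_workflow,
        "status": "open", "return_to": parent_id,
    })
    tree["current_node"] = fork_id
    tree["last_updated"] = today
    record(log, today, parent_id, "blocked")
    record(log, today, fork_id, "open")


def close_fork(tree, log, fork_id, today):
    """Phase 5: mark the fork merged and put the parent back in progress."""
    fork = next(f for f in tree["forks"] if f["id"] == fork_id)
    fork["status"] = "merged"
    find_node(tree, fork["return_to"])["status"] = "in-progress"
    tree["current_node"] = fork["return_to"]
    tree["last_updated"] = today
    record(log, today, fork_id, "merged")
    record(log, today, fork["return_to"], "in-progress")
```

Setting the parent back to `in-progress` inside `close_fork` is deliberate: it makes it hard to close a fork while leaving the parent blocked.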
Done means (when has the workflow-builder skill actually succeeded?)
The skill's invocation is complete when:
- ✅ `<project>/workflow-state/workflow.yaml` exists with a valid tree (goal → phases → tasks)
- ✅ `current_node` points at a SPECIFIC leaf id (not "Phase 3, somewhere")
- ✅ `current-pointer.md` is a one-paragraph human note that says "you are here, doing X, next is Y"
- ✅ `log.jsonl` has at least one transition line for the current session
- ✅ Status fields on every node reflect actual state (no `in-progress` task that hasn't been touched in days)
The workflow itself is complete when:
- ✅ All phases have status `done` or `skipped`
- ✅ All forks are `merged` or `abandoned` (no `open` forks left)
- ✅ A "what shipped" note exists somewhere durable — PROGRESS.md entry, CHANGELOG, or memory file
- ✅ The "will this recur?" question has been asked. If yes → graduated to a skill; if no → state folder stays as historical record
If a workflow's `current_node` hasn't moved in 14 days, it's stuck — surface it, don't let it rot silently.
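The 14-day rule is easy to automate. A minimal sketch, assuming `last_updated` in workflow.yaml is an ISO date and is a fair proxy for when `current_node` last moved (the function name `is_stuck` is hypothetical):

```python
from datetime import date

STUCK_AFTER_DAYS = 14


def is_stuck(last_updated: str, today: date, limit_days: int = STUCK_AFTER_DAYS) -> bool:
    """True when the pointer has not moved for `limit_days` days or more."""
    idle = today - date.fromisoformat(last_updated)
    return idle.days >= limit_days
```

A session start-up routine could call `is_stuck(tree["last_updated"], date.today())` and surface the workflow in its first reply.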
Anti-patterns — habits that wreck the tree
- Treating phases as a rigid plan. They're a current best-guess. As work progresses, splits/merges/reordering is expected. Don't refuse to restructure because "the plan said X."
- Skipping the FORK phase when you hit unknown unknowns. If a task can't progress without data/tool that doesn't exist, that's a FORK — spawn a sub-workflow with `return_to` pinned. Don't silently switch to building the missing thing inside the parent — the parent's `current_node` ends up lying about what's actually happening.
- Forking too eagerly. Not every detour needs a sub-workflow. If the diversion is <30min and stays inside the same project, just do it inline. Forks are for things that have their own decomposition and might span sessions.
- Updating `workflow.yaml` from prose memory instead of reading the file first. If two sessions update the same workflow without reading the current state, you get conflicts. Always load the YAML at start of session; write back at end.
- Letting `current_node` and reality diverge. When you spend 20 min on something that isn't `current_node`, EITHER move the pointer first OR explicitly note "deferring current_node to do X". Silent drift makes "where are we?" answers wrong.
- Closing a fork without writing `return_to`'s parent back to in-progress. The fork closing implies the parent should resume. If the parent stays `blocked`, the next session won't know to pick it up.
- Building infrastructure for a one-shot. If the work is genuinely atomic (single email, single config edit, single PR), don't scaffold `workflow-state/`. Use the shape taxonomy honestly.
- Graduating prematurely. A sub-workflow that's run once is not yet a recurring pattern. Wait for the second or third instance before extracting it into a skill. False-positive graduations clutter the skill library.
- Forgetting the "why" on `goal:`. Future sessions need it. "Build the API" doesn't survive a 2-month pause; "Build the API so partners can pull reports without us exporting CSVs by hand" does.
Templates
- templates/workflow.yaml — empty skeleton for a new workflow
- templates/CONTEXT-section.md — markdown snippet to drop into the project's CONTEXT.md so the workflow tree shows up in human-readable form too
Install & Usage
mkdir -p .claude/skills
Add the configuration to .claude/skills/workflow-builder.md
/workflow-builder
Frequently Asked Questions
How to install workflow-builder?
To install workflow-builder: create the skills directory (mkdir -p .claude/skills), then add the config to .claude/skills/workflow-builder.md. Finally, run /workflow-builder in Claude Code.
What is workflow-builder best for?
workflow-builder is listed under the General category (type: other). It is designed for: agent. Created by LuckyCody.