BeClaude

agy-loop-blueprint

GitHub Trending | General | by tuanhung303

Set up a self-improvement loop that drives Google Antigravity CLI (agy -p) to iteratively improve a piece of work on a topic, with the agent rewriting the next goal based on its own reflection. Use when the user says "iterate on X with AGY", "loop until done", "self-improving research", "build a blueprint loop", or wants a multi-iteration agent that visibly improves between rounds. Do not use for one-shot AGY calls, static code review, or non-AGY agents.

First seen 6/21/2026

Summary

This skill sets up a self-improvement loop where an agent iteratively refines a piece of work, rewriting its own goals based on reflection between rounds.

  • It's ideal for deep research, iterative code improvement, or any task that benefits from multiple visible passes with persistent state on disk.

Overview

AGY Self-Improvement Loop — Blueprint

Turn a topic into a multi-iteration loop that visibly self-improves. Each iteration, the agent (AGY) reads the current goal, the next goal, and the previous reflection, then rewrites the artifact AND the next goal. The runner persists state on disk between iterations.

The full design rationale lives in `DESIGN.md`. The operating manual is `README.md`. This SKILL.md is the onboarding contract for a new agent or human.


Inputs to collect

Collect only the missing information:

  1. Topic: one sentence stating what the loop is improving.
  2. First current goal: what iteration 1 must produce (concrete, falsifiable, citable).
  3. First next goal: the pivot target for iteration 2 (a likely deepening, not a new topic).
  4. Completion criteria: what "track done" means (5–7 bullets, all mechanically checkable where possible).
  5. Optional: per-track criteria overrides that sharpen the global defaults in tracks.yaml.

If the user has a vague idea, ask 2–3 questions. Do not start a long interview.


When this skill is the right tool

Use it when ALL of these are true:

  • The work benefits from multiple passes (depth, not breadth).
  • The user wants visible evidence of improvement (history, reflections, byte counts).

  • The topic has fetchable, citable sources (docs, papers, repos).
  • AGY (/Users/__blitzzz/.local/bin/agy) is the agent available.

Do NOT use it when:

  • One AGY call is enough (no loop needed).
  • The topic is a static code review (the artifact is the diff, not a growing doc).
  • The agent is not AGY (the prompt template and exit codes assume agy -p semantics).


Procedure

1. Decide the seed

The seed is tracks/<id>/current.md (iter-1 goal) plus tracks/<id>/next.md (iter-2 pivot). Both must be specific and citable. The seed is the quality ceiling for the loop — a vague seed produces a vague loop.
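As a minimal sketch of what a seed could look like on disk, the snippet below scaffolds a track folder with the two seed files. The track id `example-topic`, the exact goal wording, and the temp directory standing in for the repo root are illustrative assumptions, not content from the repo.

```bash
# Sketch: scaffold the two seed files for a hypothetical track.
# The track id and goal text are illustrative assumptions.
set -eu

root="$(mktemp -d)"                  # stand-in for the repo root
track="$root/tracks/example-topic"   # hypothetical track id
mkdir -p "$track"

# Iteration-1 goal: concrete, falsifiable, citable.
cat > "$track/current.md" <<'EOF'
Map the major frameworks for the topic and list their distinguishing
features, with one citation (URL) per claim.
EOF

# Iteration-2 pivot: a deepening of the same topic, not a new one.
cat > "$track/next.md" <<'EOF'
Deepen the framework that the reflection identifies as least understood.
EOF

ls "$track"
```

A real track also needs meta.yaml and completion.md, which this sketch leaves out.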

2. Use the template

The repo already contains a working example under tracks/adk-effective-usage/. Two paths:

  • Fresh start. Delete tracks/adk-effective-usage/, create tracks/<your-id>/, and write fresh meta.yaml / current.md / next.md / completion.md. See `README.md`.
  • Fork and adapt. Copy tracks/adk-effective-usage/ to a new folder and rewrite the goals and completion criteria. The runner will treat it as a new track as long as it is registered in tracks.yaml.

3. Configure tracks.yaml

tracks.yaml is the single source of truth. Edit:

  • active_track — which folder to drive (exactly one).
  • loop.model — the AGY model id (default claude-sonnet-4-6).
  • loop.max_iterations — hard cap (default 5; raise for deeper research).
  • loop.print_timeout_seconds — wall-clock cap per AGY invocation.
  • tracks[] — register every track folder.
  • default_criteria.{reflection,validation,completion}: global defaults that per-track meta.yaml may override.
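To make the keys listed above concrete, here is a hedged sketch that writes a sample tracks.yaml and reads one value back. Only the key names come from this document; the nesting, the `id:` field inside `tracks`, the timeout value of 600, and the track id `example-topic` are assumptions, so check the shipped tracks.yaml for the real schema.

```bash
# Sketch: write a sample tracks.yaml and read one key back.
# The nesting and the example values are assumptions; only the key
# names are taken from the list above.
set -eu

cfg="$(mktemp -d)/tracks.yaml"
cat > "$cfg" <<'EOF'
active_track: example-topic
loop:
  model: claude-sonnet-4-6
  max_iterations: 5
  print_timeout_seconds: 600
tracks:
  - id: example-topic
default_criteria:
  reflection: []
  validation: []
  completion: []
EOF

# Read the top-level active_track key with sed (enough for a flat key).
active="$(sed -n 's/^active_track:[[:space:]]*//p' "$cfg")"
echo "active track: $active"
```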

4. Preview with --dry-run

Always preview before the first real run:

bash
./loop.sh --dry-run

The dry-run prints the prompt AGY would see, including the current goal, the next goal, all three criteria lists, the previous reflection, and the required-deliverables block. If the prompt looks wrong, fix the seed files. Do not invoke AGY on a bad prompt — that wastes time and pollutes history.

5. Run the loop

bash
./loop.sh                # uses tracks.yaml defaults
./loop.sh --max-iterations 3

loop.sh calls runner.sh repeatedly until completion, max iters, stuck, or an error. SIGINT prints a resume message; re-run to pick up where it left off. State on disk is the source of truth; the runner is idempotent.
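The control flow described above can be sketched as a loop that keeps calling a one-iteration runner until it returns non-zero. This is an illustration, not the actual loop.sh: `run_one_iteration` is a hypothetical stand-in for ./runner.sh that returns 0 twice and then 2, and the completion check (presumably based on meta.yaml) is omitted.

```bash
# Sketch of a loop wrapper: call a one-iteration runner until it stops.
# run_one_iteration is a hypothetical stub for ./runner.sh so the
# sketch runs without AGY; it succeeds twice, then reports code 2.
calls=0
run_one_iteration() {
  calls=$((calls + 1))
  if [ "$calls" -le 2 ]; then
    return 0
  fi
  return 2
}

status=0
while :; do
  run_one_iteration || status=$?
  if [ "$status" -ne 0 ]; then
    break          # stop on the first non-zero exit code
  fi
done

echo "stopped after $calls calls with exit code $status"
```

Because the stub keeps no state on disk, the sketch does not show resumption; in the real runner that role is played by the files under tracks/<id>/.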

6. Inspect

bash
# Latest iteration's audit log (full prompt + AGY stdout + extracted sections)
ls -t tracks/<id>/history/iter-*.md | head -1 | xargs cat

# Last reflection (the handoff to the next iter)
cat tracks/<id>/reflection.md

# Latest artifact
ls -t tracks/<id>/artifacts/v*.md | head -1 | xargs cat

# Track state
cat tracks/<id>/meta.yaml

The audit trail is the proof that the loop self-improved. next.md between iteration N and iteration N+1 MUST reference at least one specific claim from iteration N's ## Reflection. If it does not, the loop did not actually self-improve.
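Byte counts per artifact version are one quick way to eyeball growth between iterations. The sketch below assumes artifacts are named v1.md, v2.md, and so on without zero padding, and uses fake files in a temp directory so it runs standalone; point `dir` at tracks/<id>/artifacts in a real checkout.

```bash
# Sketch: print the byte count of each artifact version in order.
# The v1.md, v2.md naming (no zero padding) is an assumption.
set -eu

dir="$(mktemp -d)"
printf 'short draft\n' > "$dir/v1.md"
printf 'a longer, revised draft\n' > "$dir/v2.md"

n=1
while [ -f "$dir/v$n.md" ]; do
  # wc -c pads its output on some systems, so strip the spaces.
  bytes="$(wc -c < "$dir/v$n.md" | tr -d ' ')"
  echo "v$n.md $bytes bytes"
  n=$((n + 1))
done
```

Size alone does not prove improvement, so read this alongside the reflections.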

7. Iterate manually or stop

  • To force another iteration without changing goals: ./runner.sh.
  • To steer: edit tracks/<id>/current.md or next.md directly (the runner will pick up the new content on the next invocation).
  • To stop: send SIGINT once. loop.sh exits and does not continue until you re-run it.

8. Commit and ship

The runner is non-git by design (see DESIGN.md §11 open question #1). You commit snapshots of the artifacts and history at whatever cadence matches your review tolerance. The .gitignore excludes v*.md and iter-*.md by default to keep git status clean during a run.


Output contract

A successful run delivers:

  • A working loop with N iterations (1 ≤ N ≤ loop.max_iterations).
  • An artifact at tracks/<id>/artifacts/vN.md for each completed iteration.

  • A complete audit trail at tracks/<id>/history/iter-NNN.md.
  • A reflection.md capturing the last iteration's reflection.
  • A meta.yaml with status: done (or stuck / error).

The runner exits with one of seven codes (see DESIGN.md §4 / §7):

| Code | Meaning |
|------|---------|
| 0 | iteration done; more may follow |
| 1 | config error (no AGY call) |
| 2 | max iterations reached |
| 3 | stuck (last 2 iters byte-identical) |
| 4 | AGY crashed |
| 5 | AGY produced no usable output |
| 6 | timeout |

loop.sh propagates non-zero exit codes so a CI step or human caller can tell apart "track done" (0) from "track stuck" (3) from "AGY crashed" (4).
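A caller can branch on these codes with a plain `case` statement. The helper name `describe_exit` is hypothetical, and the ./loop.sh call is commented out so the sketch runs on its own; the messages are copied from the exit-code list above.

```bash
# Sketch: translate a runner exit code into its documented meaning.
# describe_exit is a hypothetical helper, not part of the repo.
describe_exit() {
  case "$1" in
    0) echo "iteration done; more may follow" ;;
    1) echo "config error (no AGY call)" ;;
    2) echo "max iterations reached" ;;
    3) echo "stuck (last 2 iters byte-identical)" ;;
    4) echo "AGY crashed" ;;
    5) echo "AGY produced no usable output" ;;
    6) echo "timeout" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# In a CI step you would capture the real code:
#   ./loop.sh; code=$?
code=3   # hard-coded here for illustration
echo "exit $code: $(describe_exit "$code")"
```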


Folder layout (the contract)

code
agy-loop-blueprint/
├── DESIGN.md                         # design rationale
├── README.md                         # human operating manual
├── SKILL.md                          # this file — onboarding for new agents
├── DEMO.md                           # 5-iter run that ships with the repo
├── tracks.yaml                       # the only root-level config
├── runner.sh                         # one iteration
├── loop.sh                           # auto-loop wrapper
├── .gitignore                        # ignores v*.md and iter-*.md by default
└── tracks/
    └── <track-id>/
        ├── meta.yaml                 # track state + per-track criteria overrides
        ├── current.md                # current goal
        ├── next.md                   # next goal (rewritten by AGY each iter)
        ├── completion.md             # track-done criteria
        ├── reflection.md             # last iteration's reflection
        ├── artifacts/                # evolving artifact, one file per iteration
        │   └── vN.md
        └── history/                  # per-iteration audit logs
            └── iter-NNN.md

tracks.yaml is the only file that may override loop-level controls (max_iterations, model, print_timeout_seconds). tracks/<id>/meta.yaml is the only file that may override the three criteria. Everything else is human prose that AGY reads.


Failure handling

When the loop stops non-zero, the failure mode is in the runner's exit code and in meta.yaml.status. Recovery path:

  • Exit 1 (config error). Fix tracks.yaml or folder paths. Re-run ./loop.sh --dry-run to confirm the prompt is now valid.
  • Exit 4 (AGY crashed). Inspect history/iter-NNN-fail.md. The counter did not advance; re-running resumes the same step. If the crash repeats, check the model id in loop.model.
  • Exit 5 (empty / malformed output). AGY succeeded technically but produced no usable work. Sharpen the validation criteria in tracks/<id>/meta.yaml (e.g. add "artifact must be >2000 bytes" or "## Reflection section must quote at least one URL"). Re-run.
  • Exit 6 (timeout). Raise loop.print_timeout_seconds in tracks.yaml. Re-run.
  • Exit 3 (stuck). The last two iterations produced byte-identical artifacts and next.md, so the loop is not making progress. Either rewrite current.md to push the agent in a new direction, OR sharpen the completion criteria if the loop is actually done and AGY just didn't realize it.

When in doubt, the audit log is in history/iter-NNN.md (or iter-NNN-fail.md for failed iterations). The runner never deletes history; you can always reconstruct what happened.
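The "stuck" condition (exit 3) can be reproduced by hand with `cmp`. This is a sketch of the idea rather than the runner's actual check: the file names and contents are stand-ins, and it compares only two artifacts, whereas the description above also requires next.md to be unchanged.

```bash
# Sketch: detect a "stuck" pair of artifacts by byte-for-byte comparison.
# File names and contents are stand-ins created in a temp directory.
set -eu

dir="$(mktemp -d)"
printf 'same content\n' > "$dir/v1.md"
printf 'same content\n' > "$dir/v2.md"

# cmp -s prints nothing and exits 0 only when the files are identical.
if cmp -s "$dir/v1.md" "$dir/v2.md"; then
  stuck=yes
else
  stuck=no
fi

echo "stuck: $stuck"
```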


Examples

Use case 1 — fresh research topic

"I want to understand the state of LLM agent frameworks in 2026. Build me a loop."

Steps:

  1. Create tracks/llm-agents-2026/.
  2. Write meta.yaml, current.md (iter-1: map the major frameworks and their distinguishing features), next.md (iter-2: deepen the framework the reflection identifies as least understood), completion.md (5–7 bullets: coverage, density, citation, etc.).
  3. Register the track in tracks.yaml, set active_track to it.
  4. ./loop.sh --dry-run to verify the prompt.
  5. ./loop.sh to run. Inspect history/ and artifacts/ after each iter. Stop when complete or when exit 2 / 3 is hit.

Use case 2 — fork from the shipped demo

The repo ships with tracks/adk-effective-usage/ already driven to status: done. To start a parallel track on a related topic:

  1. cp -R tracks/adk-effective-usage tracks/adk-vs-langgraph
  2. Rewrite current.md, next.md, completion.md, and the criteria in meta.yaml to the new topic.
  3. Register the new id in tracks.yaml:tracks[] and flip active_track to it.
  4. Re-run.

Use case 3 — re-run the shipped demo from scratch

To see the loop self-improve with your own eyes:

  1. `mavis-trash tracks/adk-effective-usage/artifacts/v*.md tracks/adk-effective-usage/history/iter-*.md tracks/adk-effective-usage/reflection.md`
  2. Reset tracks/adk-effective-usage/meta.yaml: iteration: 0, status: active, termination_reason: "".
  3. Reset tracks/adk-effective-usage/next.md to the original seed (from the first-iteration current.md's "what comes next" section, or from DESIGN.md §10.3).
  4. ./loop.sh

Self-improving mechanism (the principle)

The loop is self-improving because next.md after iteration N is written by AGY at the end of iteration `N`, derived from the ## Reflection section AGY itself produced. The runner never writes next.md. That makes the loop second-order: each iteration's target is a function of the previous iteration's self-evaluation, not a hand-written prompt.

Minimum audit evidence of self-improvement: next.md between iteration N and iteration N+1 references at least one specific claim from iteration N's ## Reflection. The ## Next.md diff section in history/iter-NNN.md is the auditable proof. If the reference is missing, the loop did not actually self-improve — it just iterated.
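One rough way to spot-check this evidence is to look for a URL that appears in both reflection.md and next.md. This is a heuristic of our own, not the repo's check: a "specific claim" need not be a URL, and the file contents and example.com address below are made up for illustration.

```bash
# Sketch: a rough audit heuristic, NOT the repo's own check.
# It counts URLs quoted in reflection.md that also appear in next.md.
set -eu

dir="$(mktemp -d)"
cat > "$dir/reflection.md" <<'EOF'
## Reflection
The section on state handling is thin; see https://example.com/docs/state.
EOF
cat > "$dir/next.md" <<'EOF'
Deepen state handling, starting from https://example.com/docs/state.
EOF

shared=0
# Extract URLs, then strip one trailing punctuation mark from each.
for url in $(grep -Eo 'https?://[^ )]+' "$dir/reflection.md" | sed 's/[.,;]$//'); do
  if grep -qF "$url" "$dir/next.md"; then
    shared=$((shared + 1))
  fi
done

echo "shared references: $shared"
```

A count of zero does not prove the loop failed to self-improve; it only means this particular signal is absent and the ## Next.md diff section deserves a manual read.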


Additional resources

  • `DESIGN.md`: full design rationale, criteria semantics, failure-mode details, self-improving mechanism paragraph, and the brainstorm decisions behind every choice.
  • `README.md`: human operating manual covering how to run, how to add a track, the exit-code table, and what success looks like.
  • `DEMO.md`: the 5-iteration run that ships with the repo, including a per-iteration timeline and an honest answer to "did the loop self-improve?".
  • tracks.yaml: the schema; copy it as a starting point for new projects.
  • runner.sh / loop.sh: the machinery; both are pure bash + python3 -c for YAML parse, no other deps.

Install & Usage

  1. Create the skills directory: mkdir -p .claude/skills
  2. Download the skill file and add the configuration to .claude/skills/agy-loop-blueprint.md
  3. Invoke in Claude Code: /agy-loop-blueprint

Use Cases

Iteratively improve a research paper draft with each round refining arguments and citations.
Refactor a codebase module across multiple iterations, with each round addressing previous issues.
Generate a comprehensive technical report by deepening analysis through successive loops.
Optimize a configuration or prompt by running multiple improvement cycles with reflection.
Build a complex artifact like a design document or blueprint that evolves based on self-critique.
Automate a multi-step debugging process where each iteration fixes found bugs and sets new goals.

Usage Examples

  1. /agy-loop-blueprint iterate on my research paper about quantum error correction with AGY
  2. Set up a self-improving loop to refactor the authentication module, aiming for 90% test coverage
  3. Loop until done: improve the API design document with AGY, deepening each iteration

Tags: code-review, agent

Security Audits

License: Unknown | Source: Warn | Repository: Pass

Frequently Asked Questions


How to install agy-loop-blueprint?

To install agy-loop-blueprint: create the skills directory (mkdir -p .claude/skills), then add the config to .claude/skills/agy-loop-blueprint.md. Finally, invoke /agy-loop-blueprint in Claude Code.

What is agy-loop-blueprint best for?

agy-loop-blueprint is a skill categorized under General and tagged code-review and agent. It was created by tuanhung303.
