Plan. Correct. Remember.

Give AI coding agents a tighter spec loop.

spec-coding-skills helps agents define done before coding, correct drift with evidence, and reuse project memory instead of starting each task cold.

What changes in practice

Planning, correction, and memory stay connected instead of being improvised task by task.

  • Define scope before implementation: turn fuzzy requests into implementation-ready specs with clear scope and acceptance criteria.
  • Correct with evidence, not guesswork: use tests, logs, CI output, and review feedback to fix drift against the current spec.
  • Reuse project memory: capture decisions and root causes so the next task starts with real context.
Install when you are ready

One command, all three skills.
npx skills add H2Sxxa/spec-coding-skills --all
Core loop

Plan -> Correct -> Remember

Three skills that reinforce one another instead of acting like disconnected prompts.

Repository control

Root SPEC.md

Repositories can override language, validation commands, and knowledge-base paths without forking the skills.
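As an illustration only, a root SPEC.md override might look like the sketch below. The section names are hypothetical, not the skills' documented schema; check the project docs for the actual format.

```markdown
# SPEC.md (hypothetical example)

## Language
TypeScript, strict mode.

## Validation commands
- npm run lint
- npm test

## Knowledge base
- docs/decisions/
- docs/pitfalls/
```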

Demo benchmark

+88.9 pts

In a small 3-prompt local demo, skill-guided outputs matched the expected workflow structure far more consistently.

Three skills

One loop for planning, correction, and durable memory.

Each skill is useful on its own, but the real value appears when they pass context to one another across the life of a task.

spec-plan

Turn vague requests into implementation-ready specs

Capture scope boundaries, assumptions, acceptance criteria, validation steps, execution guardrails, and blocking questions before code starts drifting.
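A spec produced at this stage might be structured along these lines. The headings and the scenario are illustrative, not spec-plan's exact output format:

```markdown
## Scope
- In: rate-limit middleware for the public API.
- Out: any changes to the auth flow.

## Acceptance criteria
- Requests over the limit return HTTP 429.
- Existing integration tests still pass.

## Validation steps
- Run the test suite and the new rate-limit cases.

## Blocking questions
- Should limits apply per user or per IP?
```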

Open docs
spec-crlp

Debug against evidence instead of guessing

Use lint, tests, logs, CI output, and review feedback to run a correction loop that is grounded in the current spec and current code.

Open docs
spec-index

Build project memory that stays searchable

Capture decisions, root causes, fix patterns, pitfalls, and validation rules so the next task starts with context instead of amnesia.
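A captured memory entry might look like the hypothetical sketch below; the fields and tags are made up to show the idea, not spec-index's actual entry format:

```markdown
## Root cause: flaky CI deploy job (hypothetical entry)
Tags: ci, timeouts, retries

- Symptom: intermittent 504s during deploys.
- Root cause: client timeout shorter than the service's cold-start time.
- Fix pattern: raise the timeout and add a single retry.
- Reuse when: any job calls a service with cold starts.
```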

Open docs
Default workflow

How the loop behaves in a real repository.

The flow stays lightweight in day-to-day use: read local conventions, plan against explicit done conditions, and close the gap when runtime evidence disagrees.

01

Read the repository contract

When the target repository defines a root SPEC.md, the skills use it to honor local language, validation, and knowledge-base conventions.

02

Retrieve historical context

spec-index looks up relevant decisions, root causes, and pitfalls before planning or correction work starts.

03

Define done before implementation

spec-plan turns fuzzy requests into an implementation-ready spec with testable acceptance criteria and execution guardrails.

04

Implement in the normal coding loop

The skills do not replace coding. They tighten the contract around what should happen and how the work will be verified.

05

Correct with real feedback

If reality diverges from the spec, spec-crlp runs a fix-verify-repeat loop using evidence instead of moving the goalposts.

06

Persist reusable lessons

Reusable findings flow back into spec-index so later tasks inherit project memory rather than repeating the same mistakes.

Demo benchmark

A small signal that the workflow changes agent behavior.

In a 3-prompt local Codex demo, the skill-guided outputs matched the expected planning, correction, and memory structures far more consistently than baseline generic outputs.

Overall demo result

11.1% → 100.0%

+88.9 pts

Planning an existing feature

What improved
Baseline

33.3%

With skills

100.0%

Adds scope boundaries, acceptance criteria, validation planning, and memory context before implementation starts.

Correcting a failing test

What improved
Baseline

0.0%

With skills

100.0%

Pushes the agent toward evidence-driven debugging with explicit root cause, validation, and memory capture.

Saving a reusable root cause

What improved
Baseline

0.0%

With skills

100.0%

Turns one-off debugging knowledge into searchable project memory with structure, tags, and reuse conditions.

What it measures

The benchmark checks whether the agent produced the workflow artifacts that make real development safer: clear scope, testable acceptance criteria, structured correction steps, and retrievable project memory.

What it does not claim

It is not a statistical statement about final code quality. Treat it as a compact before-and-after demo of process quality rather than a universal benchmark claim.

Learn faster

Everything you need to evaluate or adopt the project.

Start with the overview, verify the local workflow, then inspect the compact benchmark before rolling the skills into a real repository.

Overview

Understand the three skills, their responsibilities, and the default workflow they create together.

Read section

Testing

See the smoke tests, local verification steps, and the small eval workflow used to validate this project.

Read section

Demo benchmark

Review the 3-prompt local benchmark and what the current numbers do and do not claim.

Read section