AgentOps vs GSD (Get Shit Done)

GSD v1.34 is a full-featured spec-driven development framework for AI coding agents. It installs as slash commands into 7 runtimes (Claude Code, Gemini CLI, OpenCode, Codex, Copilot, Cursor, Antigravity) and solves "context rot" by spawning fresh-context subagents for each task. Trusted by engineers at Amazon, Google, Shopify, and Webflow.

Comparison updated April 2026. See the GSD repo for current features.

At a Glance

Aspect	GSD	AgentOps
Philosophy	"Ship fast — fresh context per agent"	"Operational layer for coding agents; technically a context compiler"
Core strength	Multi-agent orchestration with context isolation, multi-runtime support	Cross-session memory, validation gates, knowledge flywheel
GitHub	glittercowboy/get-shit-done	boshu2/agentops
Latest	v1.34.2 (April 2026)	v2.39.0 (April 2026)
Scale	53 commands, 46 workflows, 16 agents	50+ skills, compiled CLI, hooks, schemas
Primary use	Spec-driven development with phased execution	Ongoing codebase work with persistent memory

Three-Layer Comparison

Product Layer	GSD	AgentOps
Context Compiler (L1)	Fresh context per agent (isolation, no retrieval)	Decay-ranked retrieval + fresh context per wave (isolation AND memory)
Validation Gates (L2)	Post-build review agent	Pre-build `/pre-mortem` + post-build `/vibe` with multi-model council
Knowledge Flywheel (L3)	No persistent learning across sessions	Extract → score → promote → inject; session 50 knows what session 1 learned

What GSD Does Well

1. Fresh Context Per Agent

GSD's core innovation. Every spawned agent gets a clean 200K context window. Orchestrators stay thin, agents are disposable. This eliminates context rot — the quality degradation that happens as an AI fills its context window during long sessions.

2. Wave-Based Parallel Execution

Plans are grouped into dependency waves. Plans within a wave run in parallel (each with a fresh agent), waves run sequentially. Includes STATE.md file locking with atomic creation and spin-wait jitter.

Wave 1: [Plan A, Plan B, Plan C]  ← parallel, fresh 200K each
           ↓ (all complete)
Wave 2: [Plan D, Plan E]          ← parallel, depends on Wave 1
           ↓ (all complete)
Wave 3: [Plan F]                  ← sequential, depends on Wave 2

3. Model Cost Tiers

Four profiles (quality/balanced/budget/inherit) with per-agent model assignments. Each profile maps agents to opus/sonnet/haiku. This means routine plan-checking can run on budget models while critical execution stays on quality models.

4. Auto-Repair on Task Failure

When a task fails during execution, GSD auto-classifies the failure as RETRY (with adjustment), DECOMPOSE (break into sub-steps), or PRUNE (remove and escalate). Budget-controlled with a default of 2 attempts. This is structured recovery, not blind retries.

5. Comprehensive Validation Pipeline

Not just "human verification" anymore. GSD v1.27 has:

8-dimension plan checker (max 3 iterations)
Nyquist validation audit (test coverage mapping)
Post-execution verification agent
User acceptance testing with auto-diagnosis
Cross-phase verification debt audit

6. Seven Runtime Support

Installs to Claude Code, Gemini CLI, OpenCode, Codex, Copilot, Cursor, and Antigravity. The installer transforms file content per runtime at install time (tool name mapping, agent frontmatter, hook events, path conventions).

7. Advisor Mode and Fast Path (New in v1.34)

/gsd:advisor provides research-backed discussion without execution. /gsd:fast skips planning for trivial tasks. Multi-repo workspace support added for cross-project orchestration.

7. Deep State Management

Full .planning/ directory with 20+ artifact types:

PROJECT.md, REQUIREMENTS.md, ROADMAP.md, STATE.md
Per-phase directories with research, plans, summaries, verification, UAT
Session handoff (HANDOFF.json, continue-here.md)
Persistent threads, seeds, debug knowledge base, todos

Where GSD Falls Short

No Cross-Session Learning

GSD has persistence (state files, handoffs, threads) but no knowledge flywheel. There is no mechanism to extract what was learned in one session and inject it into the next. Every session starts with the same agent intelligence — the system does not get smarter over time.

┌─────────────────────────────────────────────────────────────────┐
│                         GSD                                     │
│                                                                 │
│  Session 1: discuss → plan → execute → verify → Done            │
│                                                   ↓             │
│                                              (state saved)      │
│                                                   ↓             │
│  Session 2: resume-work → (same state, same intelligence)       │
│                                                                 │
│  Session 100: (agents are no smarter than session 1)            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                      AGENTOPS                                   │
│                                                                 │
│  Session 1: research → plan → pre-mortem → crank → vibe → retro │
│                                              ↓                  │
│                                      (learnings extracted)      │
│                                              ↓                  │
│                                      (scored and stored)        │
│                                                                 │
│  Session 2: (inject prior knowledge) → better starting point    │
│                                                                 │
│  Session 100: (agent is a domain expert with scored knowledge)  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

No Strategic Goals or Direction

GSD executes phases within a project but has no mechanism for measuring progress toward higher-level objectives. No equivalent to GOALS.md, ao goals measure, or /evolve.

No Pre-Implementation Failure Prevention

GSD validates after execution (verify-work, UAT). The plan checker validates plan quality but does not simulate failure modes before implementation begins. AgentOps runs /pre-mortem to catch issues before code is written.

No Issue Graph or Dependency Tracking

GSD has wave-based parallelism for plans within a phase, but no cross-phase issue graph. There is no equivalent to beads (issue tracking with dependencies), which means no mechanical way to track blocked work across phases.

No Multi-Model Validation Council

GSD's review command supports cross-AI peer review (Gemini, Claude, Codex), but this is a review tool, not a multi-perspective validation council that runs adversarial analysis from different viewpoints simultaneously.

Feature Comparison

Feature	GSD	AgentOps	Winner
Multi-runtime support	✅ 7 runtimes	⚠️ 4 supported runtime paths, less broad than GSD	GSD
Fresh context per agent	✅ Core design	⚠️ Swarm workers	GSD
Model cost tiers	✅ 4 profiles	❌ Not yet	GSD
Auto-repair on failure	✅ RETRY/DECOMPOSE/PRUNE	⚠️ Crank retries	GSD
Context rot detection	✅ Hooks at 35%/25%	❌ Not yet	GSD
Prompt injection defense	✅ Advisory hook	❌ Not yet	GSD
Wave-based parallelism	✅ Built in	✅ Crank waves	Tie
Plan validation	✅ 8-dimension checker	✅ Pre-mortem + council	Tie
Human-in-loop gates	✅ Configurable gates	✅ Multiple gates	Tie
State persistence	✅ .planning/ directory	✅ .agents/ directory	Tie
Cross-session learning	❌ No flywheel	✅ Extract → score → inject	AgentOps
Knowledge maturity	❌ No scoring	✅ Maturity tracking + decay	AgentOps
Pre-mortem simulation	❌ Post-execution only	✅ Before implementation	AgentOps
Multi-model council	❌ Sequential review	✅ Multi-perspective	AgentOps
Issue graph execution	❌ Phase-scoped waves	✅ Beads + dependencies	AgentOps
Strategic goals	❌ No goal tracking	✅ GOALS.md + evolve	AgentOps
Compiled CLI	❌ Node.js tools	✅ Go binary (ao)	AgentOps

Workflow Comparison

GSD Workflow

/gsd:new-project     →  PROJECT.md, REQUIREMENTS.md, ROADMAP.md, config.json
         ↓
/gsd:discuss-phase   →  Capture decisions (/comparisons/CONTEXT)
         ↓
/gsd:plan-phase      →  Research → plan → 8-dimension verify (max 3 iterations)
         ↓
/gsd:execute-phase   →  Wave-based parallel execution (fresh agent per plan)
         ↓                  └── Node repair on failure (RETRY/DECOMPOSE/PRUNE)
/gsd:verify-work     →  UAT with auto-diagnosis
         ↓
/gsd:ship            →  Create PR from phase work
         ↓
       [next phase or complete-milestone]

AgentOps Workflow

/research     →  Explore codebase + inject prior knowledge
     ↓
/plan         →  Break into dependency-tracked issues (beads)
     ↓
/pre-mortem   →  Simulate failure modes before building
     ↓
/crank        →  Execute unblocked waves → validate → commit
     ↓
/vibe         →  Multi-aspect code validation (council optional)
     ↓
/post-mortem  →  Extract learnings → score → store for next session

Architecture Comparison

Aspect	GSD	AgentOps
Commands	53 prompt-based slash commands	50+ skill definitions
Agents	16 specialized (fresh context each)	Skill-driven (swarm for parallelism)
CLI tooling	Node.js (`gsd-tools.cjs`, 15 modules)	Go binary (`ao`, structured subcommands)
Hooks	5 JS hooks (statusline, context monitor, prompt guard, workflow guard, update check)	Shell hooks (session lifecycle, tool gates, knowledge injection)
State	`.planning/` (Markdown + JSON)	`.agents/` (Markdown + JSON)
Config	`.planning/config.json` (40+ options)	`.agentops.json` + GOALS.md
Install	npm package, 3000-line installer	Shell script + Go binary
Parallelism	Wave-based with file locking	Wave-based via crank + swarm

Overhead Comparison

                    SETUP TIME              ONGOING OVERHEAD
                    ══════════              ════════════════

GSD:                ████████░░░░░░░░        ████████░░░░░░░░
                    (npm install + init)    (moderate — .planning/ management)

AgentOps:           ████████░░░░░░░░        ████████░░░░░░░░
                    (install + init)        (moderate — hooks + .agents/)


                    SESSION VALUE           LONG-TERM VALUE
                    ═════════════           ═══════════════

GSD:                ████████████████        ████████░░░░░░░░
                    (strong execution)      (state persists, no learning)

AgentOps:           ████████████████        ████████████████
                    (strong execution)      (knowledge compounds)

Trade-off: GSD optimizes for execution quality per session. AgentOps optimizes for cumulative intelligence across sessions.

Use Case Fit

GSD is Best For

Use Case	Why
Greenfield projects	Strong project setup + phased execution
Multi-runtime teams	7 runtimes with one install
Cost-sensitive work	Model cost tiers control spend
Complex single-phase work	Wave parallelism + auto-repair
Teams standardizing process	Clear phases with configurable gates

AgentOps is Best For

Use Case	Why
Long-running codebases	Knowledge flywheel compounds value
Repeated maintenance	Learns from past sessions
Complex multi-phase work	Issue graph + dependency execution
Risk-averse engineering	Pre-mortem + council + vibe gates
Strategic direction	GOALS.md + evolve loop

When to Choose GSD

You work across multiple AI runtimes and need one workflow
You want model cost control at the per-agent level
Your work is project-scoped (clear start and end)
You value fresh context per agent for quality in long sessions
You want auto-repair when tasks fail during execution

When to Choose AgentOps

You work on the same codebase across many sessions
You want the system to get smarter over time
You want failure prevention before implementation, not just verification after
You want dependency-tracked issue execution across work phases
You value strategic goal tracking and measured progress

Can They Work Together?

Partially. GSD and AgentOps both manage state directories and workflow orchestration, so running both simultaneously would create friction. However:

GSD's fresh-context-per-agent pattern is a technique AgentOps' swarm could adopt
GSD's model cost tiers solve a problem AgentOps does not yet address
AgentOps' knowledge flywheel fills GSD's biggest gap (no cross-session learning)

The most practical combination: use GSD for greenfield projects where you need fast phased execution, then bring AgentOps in when the project enters maintenance and long-term development where accumulated knowledge matters.

The Bottom Line

Dimension	GSD	AgentOps
Philosophy	Fresh context, fast execution	Knowledge compounds
Overhead	Moderate	Moderate
Persistence	State files (no learning)	Knowledge flywheel
Validation	8-dimension plan check + UAT	Pre-mortem + council + vibe
Parallelism	Wave-based, fresh agents	Wave-based, swarm workers
Cost control	4-tier model profiles	Not yet
Best for	Strong execution per session	Cumulative intelligence across sessions

GSD is a serious framework for structured AI-assisted development. AgentOps differentiates on the knowledge flywheel — the system that makes every session smarter than the last.

The Honest Assessment

GSD is not a lightweight tool anymore. It is a comprehensive development framework with 53 commands, 16 agents, wave-based parallelism, auto-repair, model cost tiers, and deep state management. Dismissing it as "simple meta-prompting" is inaccurate.

Where GSD wins: Execution quality within a session. Fresh context per agent, cost control, auto-repair, and 7-runtime portability.

Where AgentOps wins: Intelligence across sessions. The knowledge flywheel (extract, score, inject, decay) has no equivalent in GSD. After 50 sessions on the same codebase, AgentOps is operating with accumulated domain knowledge while GSD agents start fresh every time.

Session 1:   GSD and AgentOps roughly equal
Session 10:  AgentOps has a library of scored learnings
Session 50:  AgentOps agents get injected domain expertise; GSD agents do not
Session 100: AgentOps is a domain expert; GSD is still starting from its .planning/ state

← vs. SDD · Back to Comparisons