
Lesson 6: Agent Workflows in Action — Live Demo

Course: AI-Powered Development (Dev Track) | Duration: 2 hours | Level: Intermediate

Overview

This session is entirely demonstration-driven. You will watch three complete workflows run from start to finish, see real terminal output and file contents, understand exactly where each approach breaks down, and finish with two hands-on exercises you run yourself.

By the end you will know — concretely, not theoretically — when to reach for a single agent, when to use sub-agent orchestration with GSD, and when to invest in a full team-agent methodology like BMAD.

Part 1: Demo 1 — Single Agent Bug Fix (20 min)

The scenario

A FastAPI application is returning 422 Unprocessable Entity on a POST /users endpoint. The developer cannot reproduce it locally through the interactive docs, but it happens consistently in staging.

The code under investigation

python
# app/routes/users.py
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session

from app.models import UserCreate, UserResponse
from app.database import get_db
# Note: User (the SQLAlchemy model) and hash_password are imported from
# elsewhere in the app; their modules are omitted in this demo.

router = APIRouter(prefix="/users", tags=["users"])

@router.post("/", response_model=UserResponse)
async def create_user(user: UserCreate, db: Session = Depends(get_db)):
    existing = db.query(User).filter(User.email == user.email).first()
    if existing:
        raise HTTPException(status_code=400, detail="Email already registered")
    db_user = User(
        email=user.email,
        username=user.username,
        hashed_password=hash_password(user.password)
    )
    db.add(db_user)
    db.commit()
    db.refresh(db_user)
    return db_user
python
# app/models.py
from typing import Optional

from pydantic import BaseModel, ConfigDict

class UserCreate(BaseModel):
    email: str
    username: str
    password: str
    full_name: Optional[str]   # <-- optional type, but no default

class UserResponse(BaseModel):
    # from_attributes lets FastAPI serialize the SQLAlchemy User object
    model_config = ConfigDict(from_attributes=True)

    id: int
    email: str
    username: str

The full conversation with Claude Code

Turn 1 — User prompt:

code
POST /users is returning 422 in staging. Works fine when I test with the
full payload. Failing when the frontend calls it without the full_name field.
Here's the request body from the logs:

{"email": "alice@example.com", "username": "alice", "password": "hunter2"}

Turn 2 — Agent reads files:

The agent uses the Read tool on app/models.py, app/routes/users.py, and app/database.py. It also reads requirements.txt to check the Pydantic version.

code
Reading app/models.py...
Reading app/routes/users.py...
Reading app/database.py...
Reading requirements.txt...  pydantic==2.7.1

Turn 3 — Agent finds the bug:

code
Found the issue. In app/models.py, the `full_name` field is declared as:

    full_name: Optional[str]

In Pydantic v2, `Optional[str]` means the type can be str or None, but
it does NOT automatically set a default of None. The field is still
required. You need:

    full_name: Optional[str] = None

The 422 happens because FastAPI/Pydantic v2 treats this field as required
even though Optional suggests otherwise. Pydantic v1 behaved differently:
there, Optional[str] implied an implicit default of None, which is why
code migrated from v1 often breaks exactly this way.
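The v2 behaviour is easy to verify in isolation. A minimal check (assumes Pydantic v2 is installed; the class names are illustrative):

```python
from typing import Optional

from pydantic import BaseModel, ValidationError  # assumes Pydantic v2

class WithoutDefault(BaseModel):
    full_name: Optional[str]           # no default: still a required field in v2

class WithDefault(BaseModel):
    full_name: Optional[str] = None    # explicit default: genuinely optional

try:
    WithoutDefault()                   # v2 raises: full_name is missing
except ValidationError:
    print("full_name is required when no default is set")

print(WithDefault().full_name)         # None
```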

Turn 4 — Agent writes the fix:

python
# app/models.py  (after fix)
from pydantic import BaseModel
from typing import Optional
 
class UserCreate(BaseModel):
    email: str
    username: str
    password: str
    full_name: Optional[str] = None   # explicit default added

Turn 5 — Agent writes and runs a test:

python
# tests/test_users.py  (generated)
from fastapi.testclient import TestClient
from app.main import app
 
client = TestClient(app)
 
def test_create_user_without_full_name():
    response = client.post("/users/", json={
        "email": "alice@example.com",
        "username": "alice",
        "password": "hunter2"
    })
    assert response.status_code == 200, response.json()
 
def test_create_user_with_full_name():
    response = client.post("/users/", json={
        "email": "bob@example.com",
        "username": "bob",
        "password": "hunter2",
        "full_name": "Bob Smith"
    })
    assert response.status_code == 200
    assert response.json()["email"] == "bob@example.com"

Terminal output:

code
$ pytest tests/test_users.py -v
PASSED tests/test_users.py::test_create_user_without_full_name
PASSED tests/test_users.py::test_create_user_with_full_name
2 passed in 0.34s

Where context starts degrading

This five-turn conversation consumed roughly 8,000 tokens — trivial for a 200K context window. The single-agent approach was the right choice here.

Watch for the warning signs that you are approaching the degradation zone:

Signal                                    | Approx. token range | Effect
Agent re-reads files it already read      | 40K–80K             | Context overhead growing
Earlier instructions ignored              | 80K–120K            | Attention spreading thin
Agent introduces regressions mid-session  | 120K–160K           | Original intent drifting
Agent contradicts its own earlier output  | 160K+               | Session should be reset

Rule of thumb: For a single bug fix, a single agent is fast and accurate. For a feature that requires reading 15+ files and writing 10+ new ones, you need a different pattern.
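Those thresholds can be turned into a quick self-check while you work (the ranges are the lesson's approximations, not hard limits):

```python
def session_health(tokens_used: int) -> str:
    # Maps approximate token usage to the degradation signals in the table above
    if tokens_used < 40_000:
        return "fresh"
    if tokens_used < 80_000:
        return "context overhead growing"
    if tokens_used < 120_000:
        return "attention spreading thin"
    if tokens_used < 160_000:
        return "original intent drifting"
    return "reset the session"

print(session_health(8_000))    # the five-turn bug fix above: still fresh
```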

Part 2: Demo 2 — Sub-Agent Pattern with GSD (30 min)

What GSD is

GSD (Get Shit Done) is a workflow framework for Claude Code. It orchestrates planning and execution by spawning fresh sub-agents — each with a clean 200K context window — so that task 50 in a project has identical quality to task 1.

Install via:

bash
npx get-shit-done-cc@latest

Step 1: Initialize the project

User runs:

code
/gsd:new-project

What happens:

GSD enters a deep-questioning interview. It wants to understand exactly what you are building before writing a single line of plan.

Sample exchange:

code
GSD: What are you building? Describe it in plain language.

You: A REST API for a task management app. Users can create workspaces,
     invite team members, create tasks, assign them, and track completion.
     Needs auth. Will be consumed by a React frontend.

GSD: What does "done" look like for v1? What's explicitly out of scope?

You: v1 = working CRUD for workspaces, tasks, and users. Auth via JWT.
     Out of scope: notifications, file attachments, recurring tasks.

GSD: Do you want me to spawn research agents to investigate the standard
     stack for this domain?

You: Yes

GSD then spawns 4 parallel researcher agents to investigate authentication patterns, FastAPI best practices, database design for task management, and JWT implementation. Each runs independently with a clean context.

Output files created:

code
.planning/
├── PROJECT.md
├── REQUIREMENTS.md
├── ROADMAP.md
├── STATE.md
├── config.json
└── research/
    ├── auth-patterns.md
    ├── fastapi-conventions.md
    ├── task-db-design.md
    └── jwt-implementation.md

Sample ROADMAP.md output:

markdown
# Project Roadmap
 
## Milestone 1.0.0
 
### Phase 1 — Foundation (REQ-001, REQ-002)
Database models, migrations, project scaffolding
 
### Phase 2 — Authentication (REQ-003)
JWT login/logout, token refresh, middleware
 
### Phase 3 — Workspace API (REQ-004, REQ-005)
Workspace CRUD, member invite flow
 
### Phase 4 — Task API (REQ-006, REQ-007, REQ-008)
Task CRUD, assignment, status transitions
 
### Phase 5 — Integration & Polish (REQ-009)
End-to-end tests, OpenAPI docs, error normalisation

Step 2: Plan a phase

User runs:

code
/gsd:plan-phase 1

GSD creates a concrete, atomic plan for Phase 1:

code
.planning/phases/01-foundation/01-01-PLAN.md

Sample 01-01-PLAN.md:

markdown
---
wave: 1
---
 
# Plan 01-01: Project Scaffolding and Database Foundation
 
## Goal
Stand up a working FastAPI application with SQLAlchemy models, Alembic
migrations, and a working database connection.
 
## Tasks
 
### Task 1 — Project structure
Create the directory layout:
app/, app/models/, app/routes/, app/services/, tests/, alembic/
 
### Task 2 — Dependencies
Create requirements.txt with:
- fastapi==0.111.0
- sqlalchemy==2.0.30
- alembic==1.13.1
- psycopg2-binary==2.9.9
- python-jose[cryptography]==3.3.0
- passlib[bcrypt]==1.7.4
- pydantic==2.7.1
- uvicorn==0.29.0
 
### Task 3 — Database models
Create User, Workspace, WorkspaceMember, Task models in app/models/
 
### Task 4 — Alembic migration
Generate and apply initial migration
 
### Task 5 — Health check endpoint
GET /health returns {"status": "ok", "db": "connected"}
 
## Verification
- `pytest tests/test_health.py` passes
- `alembic current` shows head migration applied
- All model tables exist in the database
 
## Success criteria
A developer can clone the repo, run `docker-compose up`, and hit
`/health` with a 200 response.

Step 3: Execute the phase

User runs:

code
/gsd:execute-phase 1

This is where sub-agent spawning happens. The executor reads all PLAN.md files in Phase 1, groups them by wave (from frontmatter), and executes plans within each wave in parallel using the Task tool. Each spawned agent receives:

  • The specific PLAN.md as its sole instruction set
  • A fresh 200K context window — no conversation history from your session
  • No knowledge of what happened in any other task
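The wave-grouping logic can be sketched in plain Python (a simplified model: `Plan` and `run_plan` are stand-ins for reading PLAN.md frontmatter and spawning a Task-tool sub-agent, not GSD's real internals):

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Plan:
    name: str
    wave: int                    # taken from the PLAN.md frontmatter

def run_plan(plan: Plan) -> str:
    # Stand-in for spawning a fresh sub-agent that sees only this PLAN.md
    return f"{plan.name} done"

def execute_phase(plans: list[Plan]) -> list[str]:
    results = []
    # Waves run sequentially; plans within a wave run in parallel
    ordered = sorted(plans, key=lambda p: p.wave)
    for _, wave_plans in groupby(ordered, key=lambda p: p.wave):
        with ThreadPoolExecutor() as pool:
            results.extend(pool.map(run_plan, list(wave_plans)))
    return results

print(execute_phase([Plan("02-01", 1), Plan("02-02", 1), Plan("02-03", 2)]))
```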

Terminal view during execution:

code
[GSD] Phase 1 — Foundation
[GSD] Wave 1: executing 01-01 in parallel
  [Task agent 01-01] Reading PLAN.md...
  [Task agent 01-01] Creating project structure...
  [Task agent 01-01] Writing requirements.txt...
  [Task agent 01-01] Creating SQLAlchemy models...
  [Task agent 01-01] Running alembic init...
  [Task agent 01-01] Running pytest tests/test_health.py... PASSED
  [Task agent 01-01] Writing SUMMARY.md...
[GSD] Phase 1 complete. Updating ROADMAP.md, STATE.md...
[GSD] Git commit: "feat(phase-1): foundation — scaffolding, models, migrations"

Each plan produces a SUMMARY.md automatically:

markdown
# Summary: 01-01 Project Scaffolding
 
## Completed
- Created full directory structure
- requirements.txt with pinned versions
- User, Workspace, WorkspaceMember, Task SQLAlchemy models
- Alembic initial migration applied
- GET /health endpoint verified passing
 
## Files Created
- app/main.py
- app/database.py
- app/models/user.py
- app/models/workspace.py
- app/models/task.py
- alembic/versions/001_initial.py
- tests/test_health.py
 
## Atomic git commit
feat(phase-1): foundation — scaffolding, models, migrations

Why task 50 has the same quality as task 1

With a single long-running session, context degrades. By the time you reach task 50, the agent is carrying the weight of 49 previous tasks — file reads, failed attempts, interim decisions, corrections. Its effective attention on task 50 is a fraction of what it gave task 1.

GSD's sub-agent architecture solves this structurally:

code
Your session (director)
│
├─ /gsd:plan-phase → planner agent (fresh context, writes PLAN.md)
│
└─ /gsd:execute-phase
   ├─ Task agent for 02-01 (fresh 200K context, reads only PLAN.md)
   ├─ Task agent for 02-02 (fresh 200K context, reads only PLAN.md)
   └─ Task agent for 02-03 (fresh 200K context, reads only PLAN.md)

Each agent starts fresh. The quality is bounded by the quality of the PLAN.md, not the accumulated baggage of the session.

Quick mode for ad-hoc tasks

For small tasks that do not belong to a phase:

code
/gsd:quick

GSD asks what you want done, spawns a planner sub-agent to write a PLAN.md into .planning/quick/NNN-slug/, then spawns an executor sub-agent to run it. The result is an atomic git commit for a single small task, with the same context-isolation guarantees as a full phase.

Add flags for extra quality gates:

code
/gsd:quick --discuss --research --full

Part 3: Demo 3 — Team Agent with BMAD (25 min)

What BMAD is

BMAD Method is a structured multi-agent framework modeled on agile development roles. Instead of one agent doing everything, you engage a cast of named specialist agents sequentially, each producing a specific artifact that the next agent consumes.

Install:

bash
npx bmad-method install

The nine core agents

Agent            | Name    | Primary output
Analyst          | Mary    | Brainstorming, research, product brief
PM               | John    | PRD.md
Architect        | Winston | architecture.md
Scrum Master     | Bob     | sprint-status.yaml, story files
Developer        | Amelia  | Implemented code, code review
QA Engineer      | Quinn   | Automated tests
Quick Flow       | Barry   | End-to-end solo dev path
UX Designer      | Sally   | UX designs, user flows
Technical Writer | Paige   | Docs, specs, diagrams

The four-phase workflow

Phase 1 — Analysis (optional)

You engage Mary (Analyst). She runs brainstorming and market/domain research. Her output is a product brief that confirms you are solving the right problem.

Phase 2 — Planning (required)

You engage John (PM) and invoke bmad-create-prd. He interviews you and produces PRD.md — a document containing functional requirements, non-functional requirements, and acceptance criteria for every feature.

Sample PRD.md excerpt:

markdown
# Product Requirements Document — TaskFlow API
 
## Functional Requirements
 
### FR-001: User Registration
- System shall accept email, username, and password
- Email must be unique across all users
- Password must be minimum 8 characters
- Acceptance criteria: POST /users returns 201 with user object,
  400 if email already exists
 
### FR-002: JWT Authentication
- System shall issue access tokens (15 min expiry) and refresh tokens (7 day)
- Acceptance criteria: POST /auth/login returns both tokens,
  401 on invalid credentials, 403 on expired access token
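To make FR-002 concrete, here is a stdlib-only sketch of what issuing the two tokens involves (illustration only: the architecture below chooses python-jose for the real implementation, and the secret and claims here are placeholders):

```python
import base64
import hashlib
import hmac
import json
import time

def make_token(sub: str, secret: bytes, ttl_seconds: int) -> str:
    # Minimal HS256 JWT: base64url(header).base64url(payload).base64url(signature)
    def b64(data: bytes) -> str:
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    header = b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64(json.dumps({"sub": sub, "exp": int(time.time()) + ttl_seconds}).encode())
    signature = b64(hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"

access_token = make_token("alice", b"dev-secret", 15 * 60)          # 15 min expiry
refresh_token = make_token("alice", b"dev-secret", 7 * 24 * 3600)   # 7 day expiry
```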

Phase 3 — Solutioning

You engage Winston (Architect) who runs bmad-create-architecture. He reads PRD.md and produces architecture.md containing Architecture Decision Records (ADRs), system design, component relationships, and technology choices.

Sample architecture.md excerpt:

markdown
# Architecture — TaskFlow API
 
## ADR-001: Database
Decision: PostgreSQL via SQLAlchemy 2.0 async
Rationale: FR-005 requires concurrent workspace updates; async ORM
prevents blocking under load. Alembic for migrations.
 
## ADR-002: Authentication
Decision: python-jose for JWT, passlib bcrypt for hashing
Rationale: Mature libraries, widely used in FastAPI ecosystem,
adequate for FR-002 token requirements.
 
## Component Diagram
┌─────────────┐    ┌──────────────┐    ┌──────────────┐
│   FastAPI   │───▶│   Services   │───▶│  PostgreSQL  │
│   Routes    │    │   Layer      │    │   (async)    │
└─────────────┘    └──────────────┘    └──────────────┘
       │                  │
       ▼                  ▼
┌─────────────┐    ┌──────────────┐
│  Pydantic   │    │    Redis     │
│  Schemas    │    │  (sessions)  │
└─────────────┘    └──────────────┘

John (PM) then runs bmad-create-epics-and-stories to break the architecture into epics and story files, informed by the technical constraints Winston identified.

Phase 4 — Implementation

Bob (Scrum Master) runs bmad-create-story to generate individual story files. Amelia (Developer) implements each story via bmad-dev-story. Quinn (QA) generates tests. Amelia runs bmad-code-review before handoff.

The handoff file chain

code
Mary produces → product-brief.md
                        │
John consumes → PRD.md
                        │
Winston consumes → architecture.md + ADRs
                        │
John consumes → epics/ (one file per epic)
                        │
Bob consumes → story-files/ (one file per story)
                        │
Amelia consumes → implemented code + tests

The project-context.md file is auto-loaded by every agent in every workflow. It holds the technology stack, naming conventions, and implementation rules — the shared memory that keeps all agents consistent.

markdown
# project-context.md
 
## Stack
- FastAPI 0.111 / Python 3.12
- SQLAlchemy 2.0 async / Alembic
- Pydantic v2
- PostgreSQL 16
 
## Naming conventions
- Routes: snake_case, plural nouns (/workspaces/, /tasks/)
- Models: PascalCase (User, WorkspaceMember)
- Services: module per domain (user_service.py, workspace_service.py)
- Tests: test_{module}_{function}.py
 
## Implementation rules
- All database operations through service layer, never directly in routes
- Never return raw SQLAlchemy models; always map to Pydantic response schemas
- All endpoints require auth except /health, /auth/login, /users (POST)

Party mode

When facing a difficult architectural decision, you invoke bmad-party-mode. Two to three agents debate the decision in the same session — for example, Winston (Architect) and John (PM) may disagree on whether to introduce a message queue in v1. Party mode surfaces trade-offs that a single agent would rationalize away.

When to use BMAD

BMAD's ceremony is its value. Use it when:

  • The project will span weeks or months
  • Multiple real humans need to review artifacts before implementation starts
  • You need auditable decisions (ADRs) for compliance or team alignment
  • The cost of a wrong architectural decision is high
  • You want to simulate a full engineering team's review process

Do not use BMAD for a bug fix, a quick feature addition, or a weekend project. The upfront investment in PRD.md and architecture.md only pays off at scale.

Part 4: Key Decision Framework (15 min)

Comparison table

Dimension          | Single Agent                 | GSD Sub-Agent                         | BMAD Team Agent
Setup time         | 0 min                        | 15–30 min                             | 2–4 hours
Ideal task size    | Minutes–hours                | Hours–days                            | Days–weeks
Context window     | Degrades over session        | Fresh per task                        | Fresh per role
Consistency        | Drifts in long sessions      | Guaranteed by PLAN.md                 | Guaranteed by handoff files
Artifacts produced | Code only                    | PLAN.md, SUMMARY.md, git commits      | PRD, arch, epics, stories, code
Best for           | Bug fixes, isolated features | Multi-phase projects, high task count | Large greenfield, team alignment
Token cost         | Low                          | Medium (planner + executor per task)  | High (many specialist agents)

Token cost comparison for a 10-feature project

Single agent (one long session):

code
Session = ~300K tokens total
Tasks 1–5: good quality (fresh context)
Tasks 6–10: degrading quality (context saturated)
Total rework estimated: 2–3 tasks re-done = ~60K extra tokens
Effective cost: ~360K tokens, inconsistent quality

GSD sub-agent:

code
/gsd:new-project session: ~15K tokens
10 × /gsd:plan-phase (planner agent): ~5K tokens each = 50K
10 × /gsd:execute-phase (executor agent): ~20K tokens each = 200K
Total: ~265K tokens, consistent quality across all 10 tasks

BMAD team agent:

code
Mary (analysis): ~30K tokens
John (PRD): ~40K tokens
Winston (architecture): ~50K tokens
John (epics/stories): ~30K tokens
Bob (story files, 10 stories): ~50K tokens
Amelia (implementation, 10 stories): ~200K tokens
Quinn (tests): ~40K tokens
Total: ~440K tokens, highest artifact quality, auditable decisions

Time comparison

Phase                 | Single Agent             | GSD                   | BMAD
Setup / planning      | 0                        | 30 min                | 3 hours
Per feature execution | Fast                     | Fast                  | Moderate
Total for 10 features | Fast start, slow finish  | Consistent throughout | Slow start, reliable finish
Debugging rework      | High (context drift)     | Low (clean execution) | Very low (clear specs)

The decision rule

code
Is the task a single bug or isolated change?
  YES → Single agent. Start typing.

Is the task a multi-step feature or a project with 5+ distinct deliverables?
  YES → GSD. Run /gsd:new-project.

Is this a greenfield product, a multi-week project, or does it require
team review and auditable architectural decisions?
  YES → BMAD. Install and start with Mary or John.
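The same rule can be captured as a tiny helper for quick triage (the thresholds are an illustrative encoding of the rule above, not from either framework's docs):

```python
def choose_workflow(deliverables: int, duration_weeks: float, needs_audit: bool) -> str:
    # Encodes the decision rule: ceremony scales with commitment
    if needs_audit or duration_weeks >= 2:
        return "BMAD"
    if deliverables >= 5 or duration_weeks >= 0.5:
        return "GSD"
    return "single agent"

print(choose_workflow(deliverables=1, duration_weeks=0.1, needs_audit=False))  # single agent
```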

Part 5: Hands-on — Run GSD Quick (20 min)

Setup

If you do not have GSD installed:

bash
npx get-shit-done-cc@latest

Verify installation:

bash
ls ~/.claude/commands/   # Claude Code slash commands are typically installed here
# or check that /gsd:help is available in Claude Code

Exercise: Run /gsd:quick on a real task

Pick a real small task from your current project, or use this exercise task:

Exercise task: Add a /version endpoint to any FastAPI or Express application that returns the app name, version from package.json or pyproject.toml, and the current UTC timestamp.

Step 1: Open Claude Code in your project directory and run:

code
/gsd:quick

Step 2: When GSD asks what you want done, describe the task:

code
Add a GET /version endpoint that returns:
{
  "app": "my-app",
  "version": "1.0.0",   // read from pyproject.toml
  "timestamp": "2024-01-15T10:30:00Z"
}
No auth required.

Step 3: Observe what GSD creates:

GSD spawns a planner sub-agent. Watch the .planning/quick/ directory:

bash
ls .planning/quick/
# 001-add-version-endpoint/
#   PLAN.md

Read the generated PLAN.md before the executor runs. Notice:

  • It is concrete and atomic
  • It specifies exact file paths to modify
  • It includes a verification step
  • It specifies what the git commit message should be

Step 4: The executor runs

GSD spawns an executor sub-agent. Observe that it:

  • Reads only the PLAN.md (not your conversation)
  • Makes the file changes
  • Runs the verification
  • Writes SUMMARY.md
  • Creates an atomic git commit

Step 5: Verify the result:

bash
git log --oneline -3
# Should show a clean commit like:
# feat: add GET /version endpoint with app name, version, timestamp
 
git show HEAD --stat
# Shows exactly the files changed
 
curl http://localhost:8000/version
# {"app": "my-app", "version": "1.0.0", "timestamp": "2024-01-15T10:30:00Z"}

What to look for

  • The sub-agent had no memory of your conversation — it worked purely from the PLAN.md
  • The commit is atomic: one logical change, one commit
  • The PLAN.md is durable: you could hand it to a human developer and they could execute it without you

Part 6: Hands-on — Spawn a Sub-Agent Task (10 min)

Exercise: Use Claude Code's Agent tool directly

This exercise gives you direct experience with how GSD's executor works under the hood. You will manually invoke the Task tool (called "Agent" in Claude Code's tool list) to spawn a sub-agent.

Step 1: In Claude Code, write this exact prompt:

code
Use the Task tool to spawn a sub-agent with these instructions:

"Read the file app/models.py and count the total number of fields
across all Pydantic models defined in that file. Write the result
to .planning/field-count.txt in the format:
  ModelName: N fields
  Total: N fields"

The sub-agent should not return results to this conversation —
it should write them to the file directly.

Step 2: Observe the execution

You will see Claude Code invoke the Agent/Task tool. The sub-agent:

  • Receives only the instruction string you gave it
  • Has no access to your conversation history
  • Has a fresh 200K context window
  • Reads app/models.py independently
  • Writes the output file
  • Returns a completion status to your session

Step 3: Verify:

bash
cat .planning/field-count.txt
# UserCreate: 4 fields
# UserResponse: 3 fields
# Total: 7 fields

Step 4: Reflect on what just happened

You are now the orchestrator. You:

  • Held the intent ("I want to know the field count")
  • Wrote a clear atomic instruction
  • Delegated execution to a fresh agent
  • Received the result

This is exactly what /gsd:execute-phase does for every plan, at scale.
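For intuition, the counting work you just delegated can be approximated in a few lines with the standard ast module (a rough sketch: it counts annotated class attributes, which is how Pydantic declares fields):

```python
import ast

def count_model_fields(source: str) -> dict:
    # Tally annotated fields (name: type [= default]) per top-level class
    counts = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            counts[node.name] = sum(
                isinstance(stmt, ast.AnnAssign) for stmt in node.body
            )
    return counts

sample = """
class UserCreate:
    email: str
    username: str
    password: str
    full_name: str = None

class UserResponse:
    id: int
    email: str
    username: str
"""
print(count_model_fields(sample))   # {'UserCreate': 4, 'UserResponse': 3}
```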

Writing good sub-agent instructions

The quality of sub-agent output is directly proportional to the quality of the instruction. Compare:

Vague instruction (poor output):

code
Look at the models and check if they are okay.

Atomic instruction (reliable output):

code
Read app/models.py. For each Pydantic BaseModel subclass:
1. List all fields with their types and defaults
2. Flag any Optional fields that lack a default value (Pydantic v2 gotcha)
3. Write findings to .planning/model-audit.txt
Format: ModelName.field_name: type [DEFAULT: value | MISSING]

The atomic instruction produces deterministic, repeatable results regardless of which agent runs it or when.

Checkpoint

Before moving to the next session, verify you can answer these questions without notes:

  • What are the three patterns for agent workflows, and what is the decision rule for each?
  • What does GSD's /gsd:execute-phase do structurally — what tool does it use, and why does each task get fresh context?
  • In BMAD, what is the handoff chain from idea to code? Name the agents and the files they produce.
  • At what approximate token range does a single-agent session start degrading, and what are the observable symptoms?
  • What is the core difference between GSD's PLAN.md and BMAD's story.md?
  • You have a task: add 3 new API endpoints to an existing service. Which pattern do you use and why?

Key Takeaways

1. Context isolation is the core engineering insight. Long-running sessions degrade. Every pattern in this lesson — GSD sub-agents, BMAD specialist agents — is ultimately a mechanism to give each discrete unit of work a fresh context window.

2. Planning artifacts are executable programs. A well-written PLAN.md or story.md is not documentation — it is a program that a sub-agent executes deterministically. The quality of the output is bounded by the quality of the plan, not the intelligence of the executor.

3. Ceremony scales with commitment. Single agent for minutes, GSD for hours to days, BMAD for days to weeks. Do not use a team-agent framework for a bug fix. Do not use a single session for a 20-phase project.

4. The atomic commit is the unit of progress. GSD's executor produces one atomic git commit per plan. This is not incidental — it is the mechanism that makes sub-agent work auditable, revertible, and composable.

5. You are always the orchestrator. None of these frameworks remove you from the loop. They formalize your role: you provide intent and review artifacts, agents provide execution and detail. The better you articulate intent (in prompts, PLAN.md files, PRD requirements), the more reliably agents execute.

Resources

  • GSD framework: npx get-shit-done-cc@latest — run /gsd:help inside Claude Code for the full command reference
  • BMAD Method: npx bmad-method install — run bmad-help for context-aware guidance on next steps
  • GSD Discord: /gsd:join-discord
  • Next lesson: Session A3.3 — Orchestrator Patterns and Failure Modes

Session A3.2 | Module 3: Agent Architecture | AI-Powered Development (Dev Track)
