Automated Workflow for AI-Assisted Coding


May 19

The real reason your local AI fails is not always the model. It is usually the workflow.

Most people ask a local model to do too much at once: understand the idea, clarify the requirements, design the architecture, write the code, test the work, fix the bugs, and decide what comes next. Then, when the result falls apart, they blame the model. The better approach is to separate the work. Use a stronger model as the Architect to create the plan, use a local or cheaper model as the Builder to execute focused tasks, and use a General Contractor layer to manage the queue, verify the output, and bring the Architect back in only when something breaks. The rest of this document breaks down how to turn that idea into an automated workflow for building with AI more reliably.

Purpose

This document describes a structured workflow for using multiple AI models in a controlled build process. The goal is to reduce wasted premium-model usage, improve output quality from local or lower-cost models, and keep the overall project moving through a repeatable system.

The workflow separates the work into three roles:

Architect: The strongest reasoning model. It clarifies the problem, creates the plan, resolves ambiguity, and fixes difficult blockers.
Builder: The execution model. It performs tightly scoped implementation tasks based on clear instructions.
General Contractor: The orchestration layer. It owns the task queue, dispatches work, verifies results, escalates problems, records progress, and keeps the project moving.

The key idea is simple:

Plan with the best model. Build with the cheaper or local model. Verify every step. Escalate only when needed.

Core Principle

Most failed AI builds are not only model failures. They are often planning failures.

A vague prompt forces the model to guess. A detailed plan reduces guessing. Smaller tasks reduce drift. Verification prevents hidden errors from accumulating. Escalation keeps premium-model usage focused on the places where it matters most.

The system should not rely on one model to understand the whole project, design the architecture, implement every feature, test the result, and decide what to do next. Those responsibilities should be separated.

Role Definitions

1. The Architect

The Architect is the high-capability model used for planning and diagnosis.

Examples:

Claude
Codex
GPT-class reasoning model
Other high-context, high-reasoning models

The Architect is responsible for:

Interviewing the user or product owner
Clarifying the goal
Identifying requirements
Surfacing edge cases
Defining architecture
Creating the build plan
Breaking the work into small tasks
Diagnosing blockers when the Builder fails
Rewriting unclear or failed tasks

The Architect should not be used for every implementation step if the goal is to preserve usage limits or reduce cost. It is most valuable during planning, correction, and escalation.

2. The Builder

The Builder is the execution model.

Examples:

Local model through Ollama
Open-weight model running locally
Open-weight model running in the cloud
Lower-cost cloud model

The Builder is responsible for:

Completing one task at a time
Following the provided task instructions
Staying within the allowed file scope
Producing code changes
Stopping after the assigned task

The Builder should not decide what task comes next. It should not rewrite the architecture. It should not broaden the scope. It should execute the work it was given.

3. The General Contractor

The General Contractor is the workflow controller.

This can begin as a human-controlled process, but the goal is to automate it as an orchestration layer inside an Agent Harness.

The GC owns:

The plan
The task queue
The current project state
The context sent to each model
The verification process
Retry decisions
Escalation decisions
Project memory and logging
Commits and checkpoints

The GC does not need to be the smartest model in the system. It needs to be disciplined, stateful, and tool-aware.

The GC’s core questions are:

What task is next?
What context does the Builder need?
Did the Builder stay within scope?
Did the output pass verification?
Should the Builder retry?
Should the Architect be called in?
What should be committed?
What should be recorded in the ledger?

The Four-Stage Workflow

Stage 1: Clarify the Problem with the Architect

The process starts with the Architect. The user does not begin by asking for a full implementation. Instead, the Architect interviews the user until the project is clear.

The Architect should ask about:

Product goal
Users
Core features
Non-goals
Technical constraints
Platform requirements
File structure
Data flow
State management
Edge cases
Testing expectations
Definition of done

The purpose of this stage is to remove ambiguity before implementation begins.

Core essence

Spend premium-model reasoning upfront to avoid wasted execution later.

Stage 2: Produce the Blueprint

Once the Architect understands the project, it produces the build materials.

The blueprint may include:

Product brief
User stories
Architecture plan
File structure
Data model
State model
Implementation plan
Testing plan
Build task list
Local versus cloud execution strategy

The most important artifact is the build plan.

The build plan should be broken into small, ordered tasks. Each task should be narrow enough for a Builder model to complete in one focused pass.

A good task includes:

Task ID
Title
Objective
Dependencies
Relevant files
Allowed files
Constraints
Implementation instructions
Definition of done
Verification commands
Stop condition

Core essence

Convert vague intent into executable instructions.

Stage 3: Dispatch Tasks to the Builder

The GC gives the Builder one task at a time.

The Builder receives only the context required to complete the task. It should not receive the entire planning conversation unless needed. It should not receive broad permission to continue through the whole backlog.

A typical Builder instruction looks like this:

Task 04: Add enemy spawn timing

Context:
- Existing game loop is in src/gameLoop.js
- Enemy model is in src/Enemy.js
- Spawn config lives in src/config.js

Instructions:
- Add timed enemy spawning every 2 seconds
- Use the spawn interval from config.js
- Do not modify player movement
- Do not change scoring
- Do not begin the next task

Definition of done:
- Enemies spawn repeatedly
- Spawn rate uses the configured interval
- Existing behavior still works
- No console errors

Stop condition:
- Summarize the changed files and stop

The GC then reviews the output and runs verification.

Core essence

Cheaper or local models work best when the scope is small, specific, and controlled.

Stage 4: Escalate Only When Needed

If the Builder gets stuck, fails verification, changes the wrong files, or misunderstands the task, the GC escalates to the Architect.

The Architect receives a focused escalation packet rather than the whole project.

A good escalation packet includes:

Architect escalation

Task:
Task 04: Add enemy spawn timing

Expected behavior:
Enemies should spawn every 2 seconds using config.spawnInterval.

Actual behavior:
The Builder added spawning inside the render loop, causing hundreds of enemies per second.

Relevant files:
- src/gameLoop.js
- src/config.js
- src/Enemy.js

Verification result:
npm test failed on enemy spawn timing.

Question:
Diagnose the issue and provide corrected implementation instructions for the Builder.

The Architect diagnoses the issue and returns corrected guidance. The GC then hands that corrected task back to the Builder.

Core essence

Use the strongest model for blockers, ambiguity, and architectural corrections, not routine execution.

Automated Workflow After the Plan Is Complete

Once the plan exists, the GC can automate the execution loop.

The automated GC becomes a state machine with tools.

The GC is not necessarily doing all the work itself, but it owns every step.

The clean distinction is:

The GC owns the workflow. The Builder performs execution. The Architect performs planning and diagnosis. Tools perform verification.

Minimum Viable Version

The first useful version of this system does not need to be fully autonomous. It only needs to reliably move through a plan one task at a time.

MVP loop

Load a completed build plan.
Split the plan into task cards.
Select the next unblocked task.
Send the task packet to the Builder model.
Receive code changes or instructions from the Builder.
Apply the changes.
Run verification commands.
Decide whether the task passed or failed.
If it passed, commit the work and move to the next task.
If it failed, allow one Builder retry.
If it fails again, escalate to the Architect.
Receive corrected guidance from the Architect.
Send the corrected task back to the Builder.
Record every step in the project ledger.
Repeat until the task queue is complete.

MVP responsibility breakdown

| Step | Owner | Performer |
| --- | --- | --- |
| Load completed build plan | GC | GC |
| Split plan into task cards | GC | GC or Architect during planning |
| Pick next task | GC | GC |
| Send task to Builder | GC | Builder executes |
| Apply code changes | GC | Builder proposes, GC applies |
| Run tests, build, and lint | GC | Local tools execute |
| Decide pass or fail | GC | GC evaluates tool results |
| Retry failed task once | GC | Builder retries |
| Escalate repeated failure | GC | Architect diagnoses |
| Hand corrected task back to Builder | GC | Builder executes |
| Save prompt, output, and test results | GC | GC |
| Commit successful work | GC | Git executes under GC control |

The GC owns the control flow even when another model or tool performs the actual work.

Core System Components

1. Plan Parser

The Plan Parser converts the Architect’s build plan into structured task cards.

The ideal task format is structured data, such as JSON or YAML, even if the Architect also produces a human-readable Markdown version.

Example:

  
{
  "id": "task-04",
  "title": "Add enemy spawn timing",
  "dependencies": ["task-01", "task-02"],
  "files_allowed": [
    "src/gameLoop.js",
    "src/Enemy.js",
    "src/config.js"
  ],
  "instructions": "Add timed enemy spawning every 2 seconds using config.spawnInterval.",
  "definition_of_done": [
    "Enemies spawn repeatedly",
    "Spawn rate uses config value",
    "No console errors",
    "Existing tests still pass"
  ],
  "verification": [
    "npm test",
    "npm run lint",
    "npm run build"
  ]
}
  

The more structured the plan, the easier it is for the GC to automate execution.
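
As a rough illustration, a Plan Parser could load a JSON task card like the one above into a typed object and fail loudly when a required field is missing. This is a minimal Python sketch, not a reference implementation: `TaskCard`, `parse_task_card`, and the choice of which fields are required are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field

# Assumption: these four fields are treated as mandatory for dispatch.
REQUIRED_FIELDS = ("id", "title", "instructions", "verification")


@dataclass
class TaskCard:
    """Hypothetical in-memory form of one task from the build plan."""
    id: str
    title: str
    instructions: str
    verification: list[str]
    dependencies: list[str] = field(default_factory=list)
    files_allowed: list[str] = field(default_factory=list)
    definition_of_done: list[str] = field(default_factory=list)


def parse_task_card(raw: str) -> TaskCard:
    """Parse one JSON task card, raising ValueError if required fields are absent."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"task card is missing fields: {missing}")
    # Unknown keys raise TypeError here, which surfaces typos in the plan early.
    return TaskCard(**data)
```

Rejecting incomplete cards at parse time keeps a vague task from ever reaching the Builder.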

2. Task Queue

The task queue tracks every task and its state.

Recommended task states:

pending
ready
in_progress
needs_review
passed
failed
retrying
blocked
escalated
complete

The GC should only dispatch tasks whose dependencies are complete.

For the first version, tasks should run sequentially. Parallel execution can come later, but it creates more risk around merge conflicts, duplicated work, and context drift.

A safe first version is:

one Builder
one task
one verification pass
one commit
repeat
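
The dependency rule above can be sketched as a small selection function. This is a Python sketch under stated assumptions: tasks are plain dictionaries shaped like the JSON task card, `statuses` maps task IDs to the state names listed above, and `next_ready_task` is a hypothetical helper name.

```python
from typing import Optional


def next_ready_task(tasks: list[dict], statuses: dict[str, str]) -> Optional[dict]:
    """Return the first pending task whose dependencies are all complete.

    Tasks are scanned in plan order, so execution stays sequential.
    A task with no recorded status is treated as pending.
    """
    for task in tasks:
        if statuses.get(task["id"], "pending") != "pending":
            continue
        deps = task.get("dependencies", [])
        if all(statuses.get(dep) == "complete" for dep in deps):
            return task
    return None  # nothing is dispatchable right now
```

Returning `None` rather than raising lets the GC distinguish "queue finished" from "queue blocked" by checking whether any task is still short of complete.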

3. Builder Runner

The Builder Runner sends task packets to the execution model.

The task packet should include:

Current task
Relevant files or file excerpts
Allowed files
Constraints
Definition of done
Verification expectations
Output format
Stop condition

The stop condition is critical.

Example:

Implement only this task.
Do not start the next task.
Do not change files outside the allowed file list.
When finished, summarize changed files and stop.

The Builder should be optimized for bounded execution, not open-ended reasoning.
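
One possible way to assemble such a packet is to render it from the task card so the stop condition is never forgotten. The Python sketch below assumes the task is a dictionary with the field names from the JSON example; `render_task_packet` and the exact layout of the text are illustrative choices, not a fixed format.

```python
def render_task_packet(task: dict, excerpts: dict[str, str]) -> str:
    """Build the text sent to the Builder for exactly one task.

    `excerpts` maps a file path to the snippet the GC chose to share.
    """
    lines = [
        f"Task {task['id']}: {task['title']}",
        "",
        "Instructions:",
        task["instructions"],
        "",
        "Allowed files:",
        *[f"- {path}" for path in task["files_allowed"]],
        "",
        "Definition of done:",
        *[f"- {item}" for item in task["definition_of_done"]],
        "",
        "Relevant context:",
    ]
    for path, text in excerpts.items():
        lines += [f"--- {path} ---", text]
    # The stop condition is appended unconditionally, so every packet ends with it.
    lines += [
        "",
        "Stop condition:",
        "Implement only this task.",
        "Do not start the next task.",
        "Do not change files outside the allowed file list.",
        "When finished, summarize changed files and stop.",
    ]
    return "\n".join(lines)
```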

4. Change Application Layer

The GC needs a safe way to apply Builder output.

Possible approaches:

Ask the Builder for unified diffs
Ask the Builder to edit files through a controlled tool
Apply patches inside a temporary branch
Require the Builder to summarize intended changes before applying them
Reject changes outside the allowed file list

For early versions, using Git branches and patch review is safer than allowing unrestricted file edits.

Recommended pattern:

create task branch
apply Builder changes
run verification
if pass: commit and merge
if fail: keep branch for retry or rollback
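
If the Builder returns unified diffs, the "reject changes outside the allowed file list" rule could be checked before anything is applied. The Python sketch below reads only the `---` and `+++` file headers; `files_in_diff` and `out_of_scope` are hypothetical names. It is deliberately simple: a content line that itself begins with `--- ` or `+++ ` would be misread as a header, so a real harness would more likely apply the patch on the task branch and ask Git which files changed (for example with `git diff --name-only`).

```python
def files_in_diff(diff_text: str) -> set[str]:
    """Collect file paths named in the '--- ' and '+++ ' headers of a unified diff."""
    paths = set()
    for line in diff_text.splitlines():
        if line.startswith(("--- ", "+++ ")):
            # Some tools append a tab and timestamp after the path.
            path = line[4:].split("\t")[0].strip()
            if path == "/dev/null":  # marks a created or deleted file
                continue
            # Git-style diffs prefix old and new paths with a/ and b/.
            if path.startswith(("a/", "b/")):
                path = path[2:]
            paths.add(path)
    return paths


def out_of_scope(diff_text: str, files_allowed: list[str]) -> set[str]:
    """Return the touched files that the task card did not allow."""
    return files_in_diff(diff_text) - set(files_allowed)
```

An empty result means the patch stays within scope; anything else is grounds to reject the patch without applying it.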

5. Verification Layer

The Verification Layer is the most important part of automation.

The GC should never rely only on the Builder saying the task is complete.

Verification may include:

Unit tests
Integration tests
Type checks
Linting
Build checks
Static analysis
Security scans
Browser smoke tests
Console error checks
Visual regression checks
Custom acceptance tests

For a web app, verification might run:

npm test
npm run lint
npm run typecheck
npm run build
npx playwright test

For a game or interactive UI, verification might include scripted browser checks:

open app
confirm canvas renders
confirm player moves
confirm enemies spawn
confirm score updates
confirm no console errors

The GC decides pass or fail based on verification results and the task definition of done.
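
A verification runner can be little more than a loop over the task's commands that records exit codes and output. The following Python sketch assumes a POSIX-style shell environment and that each command is a simple argument list with no pipes or redirects; `CheckResult` and `run_verification` are hypothetical names, and the 600-second timeout is an arbitrary default.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    command: str
    passed: bool
    output: str


def run_verification(commands: list[str], cwd: str = ".", timeout: int = 600) -> list[CheckResult]:
    """Run each command in order and stop at the first failure."""
    results = []
    for command in commands:
        try:
            proc = subprocess.run(
                shlex.split(command),
                cwd=cwd,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            passed = proc.returncode == 0
            output = proc.stdout + proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing tool or a hung command counts as a failed check.
            passed, output = False, str(exc)
        results.append(CheckResult(command, passed, output))
        if not passed:
            break
    return results
```

The captured output is what the GC would later hand to the Builder on retry or to the Architect on escalation.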

6. Escalation Router

The Escalation Router decides when to call the Architect.

Escalation should happen when:

The Builder fails the same task more than once
Tests fail in a way the Builder cannot resolve
The task conflicts with architecture
The required files or functions do not exist
The Builder changes files outside the allowed scope
The Builder hallucinates missing systems
The output works technically but violates product intent
The task is revealed to be underspecified

The Architect should receive a compact escalation packet.

It should not be asked to restart the whole project unless the architecture itself is broken.
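
Several of the triggers above are mechanical enough to check in code. The Python sketch below covers three of them (scope violations, references to things that do not exist, and repeated verification failures); the `Attempt` record, the `escalation_reason` name, and the default of two attempts are assumptions for illustration. Triggers such as "violates product intent" would still need a human or a reviewer model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attempt:
    """Hypothetical record of one Builder attempt at a task."""
    verification_passed: bool
    changed_files: list[str]
    missing_references: list[str]  # files or functions the task names but the repo lacks


def escalation_reason(
    attempts: list[Attempt], files_allowed: list[str], max_attempts: int = 2
) -> Optional[str]:
    """Return why the Architect should be called, or None to let the Builder continue."""
    latest = attempts[-1]
    stray = sorted(set(latest.changed_files) - set(files_allowed))
    if stray:
        return f"changed files outside allowed scope: {stray}"
    if latest.missing_references:
        return f"task refers to things that do not exist: {latest.missing_references}"
    failures = sum(1 for attempt in attempts if not attempt.verification_passed)
    if failures >= max_attempts:
        return f"failed verification {failures} times"
    return None
```

The returned string doubles as the "escalation reason" field in the ledger.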

7. Project Ledger

The project ledger is the GC’s memory.

Every task should record:

Task ID
Task title
Status
Prompt sent to Builder
Model used
Files provided as context
Files changed
Verification commands run
Verification output
Retry count
Escalation reason, if any
Architect response, if any
Final outcome
Commit hash
Notes for future tasks

The ledger prevents the system from losing context. It also gives the Architect clean history when deeper diagnosis is needed.
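
A simple ledger can be an append-only file with one JSON object per line, which is easy to search and hard to corrupt. The Python sketch below assumes that storage format; `append_ledger_entry`, `read_ledger`, and the `recorded_at` field are hypothetical and not something the workflow prescribes.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_ledger_entry(ledger_path: str, entry: dict) -> dict:
    """Append one task record as a single JSON line and return what was written."""
    record = {"recorded_at": datetime.now(timezone.utc).isoformat(), **entry}
    with Path(ledger_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def read_ledger(ledger_path: str) -> list[dict]:
    """Load every record, oldest first, skipping blank lines."""
    with Path(ledger_path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Because entries are only ever appended, the history of a task (attempt, retry, escalation, final outcome) can be replayed in order when building an escalation packet.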

Task Lifecycle

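A single task moves through the states listed in the task queue: it is dispatched, reviewed, and verified, then either marked complete or sent through retry and escalation.

The GC owns every state transition.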

The Builder does not decide that the task is complete. The verification layer and GC decide that.
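
One way to make that ownership explicit is a transition table that the GC consults before changing any task's status. The state names below come from the task queue section, but the specific arrows between them are an assumption made for this Python sketch, as is the `transition` helper; a real harness might permit a different set of moves.

```python
# Assumed lifecycle: which state may follow which. Only the state names
# come from the task queue list; the edges are illustrative.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "pending": {"ready", "blocked"},
    "ready": {"in_progress"},
    "in_progress": {"needs_review"},
    "needs_review": {"passed", "failed"},
    "passed": {"complete"},
    "failed": {"retrying", "escalated"},
    "retrying": {"needs_review"},
    "escalated": {"ready", "blocked"},
    "blocked": {"ready"},
    "complete": set(),
}


def transition(current: str, new: str) -> str:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```

With a table like this, a Builder reply that claims the task is done cannot move it from in_progress straight to complete; it has to pass through review and verification first.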

Retry and Escalation Policy

A simple first policy:

Attempt 1: Send task to Builder.
  If verification passes: Commit and continue.
  If verification fails: Send failure output back to Builder for one retry.

Attempt 2: Builder retries with failure context.
  If verification passes: Commit and continue.
  If verification fails again: Escalate to Architect.

Architect: Diagnoses failure and rewrites task guidance.

GC: Sends corrected task back to Builder.

This avoids wasting Architect calls on small mistakes while also preventing the Builder from getting stuck in an endless loop.
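
The policy can be expressed as a short control loop with the model calls injected as functions. This Python sketch is illustrative only: `run_task_with_policy` and its three callables are hypothetical, and the `needs_human` outcome for a corrected task that still fails is an assumption, since the policy above does not say what happens in that case.

```python
from typing import Callable


def run_task_with_policy(
    task: str,
    build: Callable[[str, str], None],       # (task, failure_context) -> applies changes
    verify: Callable[[], tuple[bool, str]],  # () -> (passed, output)
    escalate: Callable[[str, str], str],     # (task, failure_output) -> corrected task
    builder_attempts: int = 2,
) -> str:
    """Try the Builder, retry once with failure context, then escalate."""
    failure = ""
    for _ in range(builder_attempts):
        build(task, failure)  # the first call gets an empty failure context
        passed, failure = verify()
        if passed:
            return "committed"
    # Both Builder attempts failed: ask the Architect for corrected guidance.
    corrected = escalate(task, failure)
    build(corrected, "")
    passed, _ = verify()
    # Assumption: a failure after escalation is handed to a human.
    return "committed" if passed else "needs_human"
```

Keeping the loop free of any model-specific code makes it easy to test with stub functions before wiring in real Builder and Architect calls.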

Context Strategy

The GC should control context carefully.

The Architect can receive broader context because it is responsible for planning and diagnosis.

The Builder should receive narrow context because it is responsible for execution.

Architect context

The Architect may receive:

Product goals
Full architecture
Build plan
Relevant task history
Current blocker
Failed outputs
Test results
Files involved in the issue

Builder context

The Builder should receive:

Current task only
Relevant file excerpts
Allowed files
Constraints
Definition of done
Verification expectations
Stop condition

This prevents the Builder from drifting, over-editing, or trying to redesign the project.

Commit Strategy

The GC should commit after every successful task.

Recommended commit format:

Task 04: Add enemy spawn timing

- Added spawn interval handling
- Integrated enemy creation into game loop
- Preserved player movement and scoring behavior
- Verified with npm test and npm run build

Benefits:

Easy rollback
Clear project history
Better debugging
Safer automation
Cleaner escalation context

Every task should produce a checkpoint before the next task begins.
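
Because the commit format is regular, the GC can generate it from the task card and the verification results instead of asking a model to write it. A small Python sketch, with `format_commit_message` as a hypothetical helper:

```python
def format_commit_message(
    task_number: str, title: str, changes: list[str], verified_with: list[str]
) -> str:
    """Render a commit message in the recommended task format."""
    bullets = [f"- {change}" for change in changes]
    if verified_with:
        bullets.append("- Verified with " + " and ".join(verified_with))
    return f"Task {task_number}: {title}\n\n" + "\n".join(bullets)
```

Called with the changes from the example above and `["npm test", "npm run build"]`, this produces the same subject line and a final bullet reading "Verified with npm test and npm run build".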

Parallelization Strategy

Parallel execution should not be part of the first version unless tasks are highly isolated.

Safe candidates for parallel work:

Documentation
Tests for already-built modules
CSS polish in isolated components
Independent utility functions
Static data files
Non-overlapping modules

Unsafe candidates for parallel work:

Shared state changes
Core architecture changes
Routing changes
Data model changes
Global styling systems
Build configuration
Authentication
Database migrations
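
A conservative isolation check can be derived from the task cards themselves: two tasks are candidates for parallel work only if neither depends on the other and their allowed files do not overlap. The Python sketch below assumes the dictionary shape of the JSON task card; `can_run_in_parallel` is a hypothetical name, and the check cannot detect the subtler risks in the unsafe list, such as shared state reached through different files.

```python
def can_run_in_parallel(task_a: dict, task_b: dict) -> bool:
    """Return True only when two tasks are clearly isolated from each other."""
    files_a = set(task_a.get("files_allowed", []))
    files_b = set(task_b.get("files_allowed", []))
    if not files_a or not files_b:
        return False  # unknown scope is treated as unsafe
    depends = (
        task_a["id"] in task_b.get("dependencies", [])
        or task_b["id"] in task_a.get("dependencies", [])
    )
    return not depends and files_a.isdisjoint(files_b)
```

Defaulting to `False` whenever scope is unclear matches the advice to keep the first version sequential.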

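Failure Modes to Watch For

The failure modes to watch for are:

1. The Builder starts redesigning
2. The Builder skips requirements
3. The Builder loops on the same failure
4. The task is underspecified
5. Verification is too weak

GC response when verification is too weak:

Improve verification. Add task-specific acceptance checks, smoke tests, or regression tests.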

Recommended Prompt Packets

Architect planning prompt

You are the Architect for this project.
Your job is to clarify the goal before implementation begins.
Interview me until the product requirements, technical constraints, architecture, edge cases, and definition of done are clear.
Do not start implementation.
When enough detail is known, produce a structured build plan with small implementation tasks.
Each task should be executable by a Builder model in one focused pass.

Architect blueprint prompt

Create the project blueprint.
Include:
- Product brief
- User stories
- Architecture plan
- File structure
- State/data model
- Build plan
- Task list
- Testing strategy

For each task, include:
- ID
- Title
- Objective
- Dependencies
- Allowed files
- Instructions
- Constraints
- Definition of done
- Verification commands
- Stop condition

Do not write implementation code yet.

Builder task prompt

You are the Builder for this project.
Complete only the task below.
Do not begin the next task.
Do not change files outside the allowed file list.
Follow the definition of done exactly.
When finished, summarize changed files and stop.

Task:
[task packet]

Relevant context:
[file excerpts or summaries]

Builder retry prompt

The previous attempt failed verification.
Do not restart the project.
Fix only the issue described below.
Stay within the allowed files.
Do not begin the next task.

Task:
[task packet]

Failure output:
[test/build/lint output]

Required correction:
[GC summary of what failed]

Architect escalation prompt

You are the Architect.
The Builder failed this task after retry.
Diagnose the issue and provide corrected implementation guidance for the Builder.
Do not rewrite the whole project unless absolutely necessary.
Focus only on unblocking this task.

Task:
[task packet]

Expected behavior:
[expected result]

Actual behavior:
[observed failure]

Relevant files:
[file excerpts]

Verification output:
[test/build/lint output]

Return:
- Diagnosis
- Corrected task instructions
- Any changes to definition of done
- Any updated verification steps

Implementation Shape for an Agent Harness

The Agent Harness can treat the GC as the central runtime.

A practical architecture could include:

Project Workspace
  - source code
  - build plan
  - task queue
  - project ledger
  - model configs
  - verification scripts

GC Orchestrator
  - plan parser
  - task scheduler
  - model router
  - context builder
  - patch manager
  - verification runner
  - escalation router
  - ledger writer
  - git manager

Model Providers
  - Architect model
  - Builder model
  - Optional reviewer model

Tooling
  - shell
  - git
  - test runner
  - browser automation
  - static analysis
  - file system

The GC does not need to be a giant monolithic agent. It can be a deterministic workflow engine with model calls at specific decision points.

Practical First Build

A strong first version could be a local CLI or desktop workflow:

agent-harness run-plan build-plan.md

The harness would:

Parse the build plan.
Display the task queue.
Pick the first ready task.
Create a Git branch for the task.
Send the task to the Builder.
Apply changes.
Run verification.
Commit if successful.
Retry once if failed.
Escalate to Architect if still failed.
Continue until complete.

A more advanced version could add:

Visual task board
Model selection per task
Cost tracking
Token tracking
Approval gates
Diff viewer
Rollback controls
Prompt history
Context preview
Project memory search
Multi-builder execution
Human approval checkpoints

Human Role in the Automated System

Even with automation, the human remains important.

The human should be able to:

Approve the original plan
Edit task definitions
Override model choices
Pause the workflow
Review diffs
Approve risky changes
Reject bad work
Modify escalation rules
Add new verification checks
Roll back to previous commits

The goal is not to remove the human entirely. The goal is to remove repetitive coordination work while preserving human judgment where it matters.

Final Summary

The automated GC workflow turns AI-assisted building into a managed production process.

The Architect creates the plan.

The GC turns the plan into a task queue, sends one task at a time to the Builder, applies changes, runs verification, commits successful work, and records everything in a ledger.

The Builder executes bounded tasks and stops.

When the Builder fails, the GC retries once. If the failure persists, the GC escalates to the Architect for diagnosis and corrected instructions. The corrected task then goes back to the Builder.

The system is not based on one giant prompt. It is based on controlled handoffs, narrow context, verification, checkpoints, and escalation.

That is the core pattern:

Plan → Queue → Dispatch → Build → Verify → Commit → Continue
                              ↓
                            Fail
                              ↓
                           Retry
                              ↓
                          Escalate
                              ↓
                         Correct task
                              ↓
                           Continue

The GC is the key layer. It keeps the job site moving without letting the Builder wander, without wasting the Architect, and without forcing the human to manually manage every small handoff.


Paul Boutin
