An end-to-end, production-ready agentic coding loop that continuously drafts, formats, tests, and reviews code until quality gates pass. It orchestrates specialized LLM agents, local developer tooling, and git-friendly utilities so you can ship reliable patches on autopilot.
Paste in your task (Jira, GitHub, or free text) and a repository (URL or local path), and watch GPT and Claude coders collaborate to deliver a ready-to-commit patch!
Every role implements the same lightweight `Agent` protocol. The flowchart below maps the agents that participate in a single iteration and how state moves between them.
```mermaid
flowchart TD
    U[Developer Task] --> OR["AgenticCodingPipeline<br/>(Iterative Orchestrator)"]
    OR -->|state| GPT[GPT Coding Agent]
    OR -->|state| CLAUDE[Claude Coding Agent]
    GPT --> OR
    CLAUDE --> OR
    OR --> F[Ruff Formatter]
    F --> OR
    OR --> T["Claude Test Author<br/>Pytest Runner"]
    T -->|tests pass?| OR
    OR --> Q[Gemini QA Reviewer]
    Q -->|PASS?| OR
    OR -->|status| OUT[Ready-to-commit Patch]
    OR -. feedback .-> GPT
    OR -. feedback .-> CLAUDE
```
The sequence diagram highlights the concrete API calls triggered when you run the bundled CLI.
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant CLI as CLI Runner
    participant Pipe as Pipeline
    participant GPT as GPT Coder
    participant CLAUDE as Claude Coder
    participant Ruff as Ruff Formatter
    participant Pytest as Pytest Runner
    participant Gemini as Gemini QA
    Dev->>CLI: Provide coding task
    CLI->>Pipe: pipeline.run(task)
    Pipe->>GPT: run(state)
    GPT-->>Pipe: proposed_code
    Pipe->>CLAUDE: run(state)
    CLAUDE-->>Pipe: refined proposed_code
    Pipe->>Ruff: run(state)
    Ruff-->>Pipe: formatted proposed_code
    Pipe->>Pytest: run(state)
    Pytest-->>Pipe: tests_passed + test_output
    Pipe->>Gemini: run(state)
    Gemini-->>Pipe: qa_passed + qa_output
    Pipe-->>CLI: status + feedback
    CLI-->>Dev: Summarise outcome
```
The state machine makes the retry logic explicit so you can reason about success and failure conditions.
```mermaid
stateDiagram-v2
    [*] --> Drafting
    Drafting --> Drafting: Additional coder agents
    Drafting --> Formatting
    Formatting --> Testing
    Testing --> Drafting: tests_passed = false
    Testing --> QA: tests_passed = true
    QA --> Drafting: qa_passed = false
    QA --> Completed: qa_passed = true
    Drafting --> Failed: proposed_code is empty
    QA --> Failed: retries exhausted
    Testing --> Failed: retries exhausted
    Completed --> [*]
    Failed --> [*]
```
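For orientation, here is a minimal sketch of the retry loop this state machine implies. It assumes each agent exposes a `run(state)` method over the shared dict, as the diagrams above suggest; the actual `pipeline.py` may differ in details:

```python
# Sketch of the orchestrator loop implied by the state machine above.
# Assumes each agent exposes run(state) and returns the updated shared dict.
def run_pipeline(task, coders, formatters, testers, reviewers, max_iterations=3):
    state = {"task": task, "proposed_code": "", "feedback": ""}
    for _ in range(max_iterations):
        for agent in coders:                     # Drafting
            state = agent.run(state)
        if not state.get("proposed_code"):       # early failure
            return {**state, "status": "failed", "reason": "coder did not return code"}
        for agent in formatters:                 # Formatting
            state = agent.run(state)
        for agent in testers:                    # Testing
            state = agent.run(state)
        if not state.get("tests_passed"):
            state["feedback"] = state.get("test_output", "")
            continue                             # back to Drafting
        for agent in reviewers:                  # QA
            state = agent.run(state)
        if state.get("qa_passed"):
            return {**state, "status": "completed"}
        state["feedback"] = state.get("qa_output", "")
    return {**state, "status": "failed"}         # retries exhausted
```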
The flowchart below summarizes the same loop from the perspective of the shared state dictionary.

```mermaid
flowchart TD
    %% Nodes
    T[Task]
    ACP["AgenticCodingPipeline<br/>(max 3 iterations)"]
    SD["Shared dict state:<br/>{task, proposed_code, tests_passed,<br/>qa_passed, feedback, ...}"]
    CA["Coding agents<br/>(OpenAI + Claude)"]
    FA["Formatting agents<br/>(Ruff --fix)"]
    TA["Testing agents<br/>(Claude writes tests → pytest run)"]
    QA["QA agents<br/>(Gemini review)"]
    DEC{"qa_passed = true<br/>OR<br/>max iterations reached?"}
    DONE["Completion<br/>status = 'completed'"]
    %% Main flow
    T --> ACP
    ACP -. maintains .- SD
    ACP -->|produce / refine proposed_code| CA
    CA --> FA
    FA -->|normalize style before verification| TA
    TA -->|set tests_passed, attach test_output| QA
    QA -->|set qa_passed, attach qa_output| DEC
    DEC -->|Yes| DONE
    DEC -->|"No (feedback)"| CA
```
The following must be available before you run the pipeline.

Local tooling on `$PATH`:
- `ruff` for formatting.
- `pytest` for executing generated suites.
- `git` if you plan to use the `tools/git.py` commit helper.

API keys:
- `OPENAI_API_KEY` for GPT coders.
- `ANTHROPIC_API_KEY` for Claude coders/testers.
- `GOOGLE_API_KEY` for Gemini QA review.

```bash
# 1. Create an isolated environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 2. Install repo dependencies and developer tooling
pip install --upgrade pip
pip install -e .        # exposes shared agentic_ai clients
pip install ruff pytest # ensure local tools are present on PATH
```
Prefer Poetry? Swap the virtualenv commands with:
```bash
poetry install
poetry shell
```
Set credentials as environment variables (or drop them into a `.env` file loaded by your shell):
```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-...
export GOOGLE_API_KEY=sk-...

# Optional: enable GitHub/Jira issue intake
export GITHUB_TOKEN=ghp_...  # increases GitHub API rate limits
export JIRA_BASE_URL=https://yourcompany.atlassian.net
export JIRA_EMAIL=you@yourcompany.com
export JIRA_API_TOKEN=atlassian_pat_or_api_token
```
Each agent reads from these keys when instantiated, so they must be available before running the CLI or creating the pipeline. GitHub/Jira variables are only required if you plan to resolve tasks from issues/tickets.
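A quick preflight check can fail fast when keys are missing. This is an optional sketch, not part of the repo:

```python
import os

# Required by the default coder/tester/reviewer agents.
REQUIRED = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"]

missing = [key for key in REQUIRED if not os.environ.get(key)]
if missing:
    raise SystemExit(f"Missing required environment variables: {', '.join(missing)}")
```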
| Knob | Where to set | Default | Effect |
|---|---|---|---|
| `OPENAI_API_KEY` | env var | – | Authenticates the GPT-based `CodingAgent`. |
| `ANTHROPIC_API_KEY` | env var | – | Powers the Claude-based coder and testing agent. |
| `GOOGLE_API_KEY` | env var | – | Enables Gemini QA review verdicts. |
| `GITHUB_TOKEN` | env var | optional | Increases GitHub API rate limits for issue fetches. |
| `JIRA_BASE_URL` | env var | optional | Base URL for your Jira instance (e.g., `https://yourcompany.atlassian.net`). |
| `JIRA_EMAIL` | env var | optional | Account email used for Jira API authentication. |
| `JIRA_API_TOKEN` | env var | optional | Atlassian API token or personal access token. |
| `max_iterations` | `AgenticCodingPipeline(max_iterations=...)` | 3 | Caps retries before marking the run as failed. |
| `coders` / `formatters` / `testers` / `reviewers` | Constructor args | see CLI defaults | Controls which agents participate in the loop. |
```bash
cd Agentic-Coding-Pipeline

# Text task only
python run.py "Add pagination support to the API client"

# With repository (URL or local path)
python run.py --repo https://github.com/owner/repo.git "Implement caching for search"
python run.py --repo /path/to/local/repo "Refactor auth middleware"

# Resolve task from GitHub or Jira
python run.py --repo /path/to/repo --github owner/repo#123
python run.py --repo https://github.com/owner/repo.git --jira PROJ-456
```
- `--repo` accepts a Git URL or a local directory path. Git URLs are cloned shallowly to a temp workspace.
- `--github` accepts `https://github.com/owner/repo/issues/123` or `owner/repo#123` (uses `GITHUB_TOKEN` if provided).
- `--jira` accepts a Jira URL like `https://yourcompany.atlassian.net/browse/KEY-123` or a bare key `KEY-123` (requires `JIRA_BASE_URL`, `JIRA_EMAIL`, `JIRA_API_TOKEN`).

The CLI streams human-friendly progress logs and prints the final status. Failed tests or QA commentary appear inline for quick triage.
Prefer to drive the pipeline from Python? Construct it directly:

```python
from agents.coding import CodingAgent
from agents.formatting import FormattingAgent
from agents.testing import TestingAgent
from agents.qa import QAAgent
from pipeline import AgenticCodingPipeline

pipeline = AgenticCodingPipeline(
    coders=[CodingAgent(name="gpt"), CodingAgent(name="claude")],
    formatters=[FormattingAgent(name="formatter")],
    testers=[TestingAgent(name="tests")],
    reviewers=[QAAgent(name="qa")],
    max_iterations=5,
)

result = pipeline.run("Implement a prime sieve")
print(result["status"], result.get("feedback"))
```
The return value is a serializable dict suitable for downstream orchestration (CI jobs, MCP routing, etc.).
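In CI, for instance, the same dict can be persisted as a build artifact and used to gate later stages. A sketch reusing `result` from the snippet above; the file name is illustrative:

```python
import json
from pathlib import Path

# Persist the full state for later stages (commit, PR creation, review).
Path("pipeline_result.json").write_text(json.dumps(result, indent=2))

# Gate the job on the final status; surface feedback on failure.
if result["status"] != "completed":
    print(result.get("feedback") or "no feedback captured")
    raise SystemExit(1)
```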
A zero-build Vue 3 chat-style UI ships with the pipeline. Start the backend:

```bash
uvicorn agentic_ai.app:app --reload
```

Then open http://127.0.0.1:8000/coding to access the UI.

Files:
- `Agentic-Coding-Pipeline/ui/index.html` – UI markup powered by Vue via CDN.
- `Agentic-Coding-Pipeline/ui/app.js` – Markdown rendering helper.
- `Agentic-Coding-Pipeline/ui/styles.css` – Dark, minimal theme.

The backend exposes both streaming and non-streaming endpoints for the coding pipeline:
- `/api/coding/stream` – SSE-style stream useful for chat UIs.
- `/api/coding/run` – One-shot execution returning the final JSON.

Request body (both endpoints):
```
{
  "repo": "https://github.com/owner/repo.git" | "/path/to/local/repo" | null,
  "github": "https://github.com/owner/repo/issues/123" | "owner/repo#123" | null,
  "jira": "https://your.atlassian.net/browse/KEY-123" | "KEY-123" | null,
  "task": "Free-text task" | null
}
```
Streaming response events:
- `event: log` with `data: <human-friendly progress>\n`
- `event: done` with `data: {"status": "completed|failed", "repo": {...}, "task": {...}}`
Implementation: the `/coding`, `/coding/app.js`, and `/coding/styles.css` routes are served from `src/agentic_ai/app.py` and delegate to the shared services in `Agentic-Coding-Pipeline/services.py`.

Use the monorepo SDKs to call the coding endpoints directly:
```ts
import { AgenticAIClient } from "../clients/ts/src/client";

const c = new AgenticAIClient({ baseUrl: "http://127.0.0.1:8000" });
await c.codingStream({
  repo: "/path/to/repo",
  task: "Add pagination",
  onEvent: (ev) => console.log(ev.event, ev.data),
});
```
```python
from clients.python.agentic_ai_client import AgenticAIClient
import anyio

async def run():
    async with AgenticAIClient("http://127.0.0.1:8000") as c:
        await c.coding_stream(
            repo="/path/to/repo",
            github="owner/repo#123",
            on_event=lambda ev, data: print(ev, data),
        )

anyio.run(run)
```
See root README “Client SDKs” for more capabilities and examples.
Repository input: `--repo` accepts a local directory path or a Git URL (e.g., ending in `.git`); URLs are shallow-cloned to a temp folder.

Task input (priority):
1) GitHub issue (URL or `owner/repo#123`). Uses `GITHUB_TOKEN` if provided.
2) Jira ticket (URL or `KEY-123`). Requires `JIRA_BASE_URL`, `JIRA_EMAIL`, `JIRA_API_TOKEN`.
3) Free-text task. The first line is treated as the title; the rest as the description (sketched below).
Shared logic lives in `Agentic-Coding-Pipeline/services.py` and is used by both the CLI and the HTTP API, ensuring consistent behavior across interfaces.
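The free-text split is straightforward. A sketch of the equivalent behavior; the helper name is hypothetical, and the real logic lives in `services.py`:

```python
def parse_free_text_task(text: str) -> dict:
    """Hypothetical helper: first line becomes the title, the rest the description."""
    lines = text.strip().splitlines()
    title = lines[0] if lines else ""
    description = "\n".join(lines[1:]).strip()
    return {"title": title, "description": description}

assert parse_free_text_task("Add caching\nUse an LRU with TTL.") == {
    "title": "Add caching",
    "description": "Use an LRU with TTL.",
}
```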
The loop repeats until QA passes or the retry budget is exhausted (capped by `max_iterations`). The shared state dictionary evolves as agents run; understanding the keys makes it easy to plug in dashboards or custom logic (see the sketch after the table).
| Key | Producer | Consumer(s) | Description |
|---|---|---|---|
| `task` | CLI / caller | All agents | Original human request seeded at pipeline start. |
| `proposed_code` | Coding agents, formatter | Testers, reviewers | Latest candidate solution being evaluated. |
| `tests_passed` | Testing agents | Orchestrator loop | Boolean signal to continue to QA. Failures trigger iteration feedback. |
| `test_output` | Testing agents | Humans / coders | Raw pytest stdout+stderr, preserved for diagnosis or re-prompting. |
| `qa_passed` | QA agents | Orchestrator loop | Indicates whether QA cleared the change. |
| `qa_output` | QA agents | Humans / coders | Reviewer commentary (PASS or actionable issues). |
| `feedback` | Orchestrator | Coders, humans | When tests/QA fail, the orchestrator surfaces the raw output as feedback for the next iteration. |
| `status` | Orchestrator | Callers | Final lifecycle marker: `completed` or `failed`. |
| `reason` | Orchestrator | Callers | Populated when a coder agent returns no code; explains the early failure. |
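For example, a dashboard hook could fold the boolean gates into metrics after each run. A sketch; the metric names and emitter are illustrative:

```python
def record_metrics(state: dict, emit) -> None:
    """Fold the quality gates from a pipeline state dict into metrics."""
    # emit is backend-specific (StatsD client, Prometheus, or plain logging).
    emit("coding_pipeline.tests_passed", int(bool(state.get("tests_passed"))))
    emit("coding_pipeline.qa_passed", int(bool(state.get("qa_passed"))))
    emit("coding_pipeline.completed", int(state.get("status") == "completed"))

# Example: log-based emitter applied to a dict shaped like pipeline.run() output.
record_metrics(
    {"tests_passed": True, "qa_passed": False, "status": "failed"},
    emit=lambda name, value: print(f"{name}={value}"),
)
```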
```text
Agentic-Coding-Pipeline/
├── README.md            # This guide
├── __init__.py          # Package marker
├── pipeline.py          # Iterative orchestration logic
├── run.py               # CLI entry point
├── services.py          # Shared repo/task intake + streaming runner
├── ui/                  # Zero-build Vue-based chat UI
│   ├── index.html       # Mounted at /coding
│   ├── app.js           # Markdown helper
│   └── styles.css       # UI styles
├── agents/              # Role-specific LLM wrappers
│   ├── base.py          # Agent protocol + base dataclass
│   ├── coding.py        # Code synthesis agents
│   ├── formatting.py    # Ruff-backed formatter agent
│   ├── testing.py       # Test generation + execution agent
│   └── qa.py            # LLM review agent
├── tools/               # Optional utilities
│   ├── git.py           # git commit helper
│   └── test_runner.py   # Standalone pytest runner helper
└── tests/
    └── test_pipeline.py # Orchestration regression tests
```
| Role | Default LLM client | Prompt strategy | Key state inputs | Key outputs |
|---|---|---|---|---|
| `CodingAgent` | `OpenAIClient` / `ClaudeClient` | Seed or improve a Python solution depending on whether `proposed_code` already exists. | `task`, `proposed_code` | Updated `proposed_code` |
| `FormattingAgent` | Ruff CLI | Runs `ruff --fix` against a temp file and reloads the formatted contents. | `proposed_code` | Normalized `proposed_code` |
| `TestingAgent` | `ClaudeClient` | Generates pytest suites covering the solution, executes them, and stores stdout/stderr. | `proposed_code` | `tests_passed`, `test_output` |
| `QAAgent` | `GeminiClient` | Requests a PASS/FAIL verdict with commentary on issues found. | `proposed_code` | `qa_passed`, `qa_output` |
All agents share the lightweight `Agent` protocol, so custom roles (docs writers, security scanners, benchmark runners) can drop in without changing the orchestrator.
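The exact protocol lives in `agents/base.py`; based on the diagrams above it is essentially a named object with a `run(state)` method. A custom role might look like this sketch (the `Agent` shape shown here is an assumption):

```python
from typing import Protocol

class Agent(Protocol):
    """Assumed shape of the shared protocol: a name plus run(state) -> state."""
    name: str

    def run(self, state: dict) -> dict: ...

class DocsAgent:
    """Example custom role: attaches auto-generated notes about the proposed code."""
    def __init__(self, name: str = "docs"):
        self.name = name

    def run(self, state: dict) -> dict:
        code = state.get("proposed_code", "")
        state["docs"] = f"Auto-generated notes for {len(code.splitlines())} lines of code."
        return state
```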
Understanding default prompt templates helps tailor model behaviour.
Swap or augment these strings in custom agents to target different languages, frameworks, or review policies.
- Raw pytest stdout/stderr is preserved in `test_output` for debugging or as feedback to coders.
- Call `tools.test_runner.run_pytest()` if you need to reuse the execution primitive in bespoke agents or CI hooks (a standalone sketch follows the git example below).
- `tests/test_pipeline.py` mocks the LLMs to validate that the orchestrator converges on a completed state.

Capture formatted snippets with the git helpers for deterministic commits:

```python
from tools.git import commit

commit(["path/to/file.py"], "feat: apply automated patch")
```

The helper stages the provided paths and creates a commit using standard git plumbing.
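If you want the pytest execution primitive outside the bundled agents, a standalone runner could look like this sketch (the real `tools.test_runner.run_pytest()` signature may differ):

```python
import subprocess
import sys

def run_pytest_sketch(test_dir: str) -> tuple[bool, str]:
    """Run pytest in a subprocess and return (passed, combined output)."""
    proc = subprocess.run(
        [sys.executable, "-m", "pytest", test_dir, "-q"],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0, proc.stdout + proc.stderr
```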
Ideas for building on the pipeline:
- Combine `status` checks to build unattended pipelines (e.g., auto-PR bots) once QA passes.
- `run.py` demonstrates how to wire agents with explicit LLM clients, making it easy to lift into Airflow, Dagster, or bespoke schedulers.
- Wrap `pipeline.run()` and inspect `feedback` to implement exponential backoff, diff-based heuristics, or fallback model selection.
- Upload `proposed_code`, `test_output`, and `qa_output` to S3 or issue comments to give humans full context on automated changes.

This pipeline registers with the shared `mcp` package. The FastAPI-backed MCPServer exposes a unified toolbox (web search, browsing, direct LLM calls) so any pipeline in the monorepo can dispatch coding tasks remotely or as part of a larger workflow.
To plug this pipeline into the MCP network:

1. Instantiate `AgenticCodingPipeline` in your MCP task handler.
2. Call `pipeline.run(task)` and route the resulting state (status, feedback, artifacts) back to the caller or next graph node.
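A handler sketch for those two steps; the registration mechanism around it is illustrative, and only `AgenticCodingPipeline.run()` comes from this README:

```python
from pipeline import AgenticCodingPipeline

def handle_coding_task(task: str) -> dict:
    """Hypothetical MCP task handler: adapt to the shared mcp package's API."""
    pipeline = AgenticCodingPipeline()  # assumes CLI-default agents
    state = pipeline.run(task)
    # Route status, feedback, and artifacts back to the caller / next node.
    return {
        "status": state["status"],
        "feedback": state.get("feedback"),
        "proposed_code": state.get("proposed_code"),
    }
```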
Customization ideas:
- Swap in custom `LLMClient` implementations when constructing agents (e.g., Azure OpenAI, local models).
- Extend the `formatters`, `testers`, or `reviewers` lists with additional agents (docs generation, security scans, benchmarks).
- Tune `max_iterations` or inject heuristics (stop early on repeated identical feedback, escalate on critical QA failures).
- Use a richer payload for `proposed_code` (e.g., a dict of path → content) and customize formatter/tester agents accordingly.

Monitoring and cost control:
- The boolean gates (`tests_passed`, `qa_passed`) provide natural metric dimensions.
- Lower `max_iterations` when cost thresholds are hit; fall back to cheaper coders or disable QA temporarily for low-risk changes.
- Gate automated commits made via `tools.git.commit` in production pipelines.

Failure handling:
- When tests or QA fail, the raw output is surfaced as `feedback`, giving subsequent iterations or humans concrete guidance.
- When retries are exhausted, the run is marked `failed`, preserving the last known state for inspection.

| Symptom | Likely cause | Fix |
|---|---|---|
| `ModuleNotFoundError: agentic_ai.llm` | Repo dependencies not installed | Run `pip install -e .` from the repo root, or `poetry install`. |
| `ruff: command not found` | Ruff not installed in the current environment | `pip install ruff`, then `ruff --version` (or `poetry run ruff --version`) to verify. |
| Pytest exits with import errors | Generated code expects extra dependencies | Update prompts to constrain imports, or pre-install the needed packages. |
| QA always fails with "PASS" missing | Reviewer prompt/casing changed | Ensure the reviewer returns a string containing `PASS` on success, or tweak the condition accordingly. |
| Pipeline stops after the coder stage | An agent returned an empty string | Inspect `reason` ("coder did not return code") and adjust prompts or guardrails. |
| Iterations never succeed | Feedback not consumed by coders | Make coder prompts reference `feedback` so failures are incorporated when you extend the pipeline. |
| GitHub issue not resolved | Missing token or low GitHub rate limit | Set `GITHUB_TOKEN` to raise limits, or try again later. |
| Jira ticket not resolved | Missing Jira credentials | Set `JIRA_BASE_URL`, `JIRA_EMAIL`, and `JIRA_API_TOKEN`. |
**Can I run only one coder?** Yes. Provide a single `CodingAgent` (or even a custom agent) in the `coders` list.

**How do I persist generated code to disk?** Have an agent write `proposed_code` to the desired file path before QA, or call the git helper after completion.

**Can the pipeline edit multi-file projects?** The sample agents operate on a single code snippet, but the shared state supports richer payloads (e.g., a dict of file paths). Add formatters/testers that understand your structure.

**How do I integrate with CI?** Wrap `AgenticCodingPipeline.run()` inside a job that prepares credentials and tools, then treat the returned state as the artifact for subsequent stages (commit, PR, deployment).

**Where do I tweak prompts?** Each agent defines its own prompt string; modify it directly or subclass the agent to inject dynamic templates.
Happy shipping! 🚀