Agentic-AI-Pipeline

Agentic Coding Pipeline (with Multi-LLM Pair Programming)

An end-to-end, production-ready agentic coding loop that continuously drafts, formats, tests, and reviews code until quality gates pass. It orchestrates specialized LLM agents, local developer tooling, and git-friendly utilities so you can ship reliable patches on autopilot.

Paste in your task (Jira, GitHub, or free text) and a repository (URL or local path), and watch GPT and Claude coders collaborate to deliver a ready-to-commit patch!

Python · Poetry · OpenAI · Anthropic · Google Gemini · Pytest · Ruff · Git CLI · Makefile Automation · Code Quality · MCP · Open Source


What you get


Architecture

The flowchart below maps the agents that participate in a single iteration and how state moves between them.

flowchart TD
    U[Developer Task] --> OR["AgenticCodingPipeline<br/>(Iterative Orchestrator)"]
    OR -->|state| GPT[GPT Coding Agent]
    OR -->|state| CLAUDE[Claude Coding Agent]
    GPT --> OR
    CLAUDE --> OR
    OR --> F[Ruff Formatter]
    F --> OR
    OR --> T["Claude Test Author<br/>Pytest Runner"]
    T -->|tests pass?| OR
    OR --> Q[Gemini QA Reviewer]
    Q -->|PASS?| OR
    OR -->|status| OUT[Ready-to-commit Patch]
    OR -. feedback .-> GPT
    OR -. feedback .-> CLAUDE

The sequence diagram highlights the concrete API calls triggered when you run the bundled CLI.

sequenceDiagram
    participant Dev as Developer
    participant CLI as CLI Runner
    participant Pipe as Pipeline
    participant GPT as GPT Coder
    participant CLAUDE as Claude Coder
    participant Ruff as Ruff Formatter
    participant Pytest as Pytest Runner
    participant Gemini as Gemini QA

    Dev->>CLI: Provide coding task
    CLI->>Pipe: pipeline.run(task)
    Pipe->>GPT: run(state)
    GPT-->>Pipe: proposed_code
    Pipe->>CLAUDE: run(state)
    CLAUDE-->>Pipe: refined proposed_code
    Pipe->>Ruff: run(state)
    Ruff-->>Pipe: formatted proposed_code
    Pipe->>Pytest: run(state)
    Pytest-->>Pipe: tests_passed + test_output
    Pipe->>Gemini: run(state)
    Gemini-->>Pipe: qa_passed + qa_output
    Pipe-->>CLI: status + feedback
    CLI-->>Dev: Summarise outcome

The state machine makes the retry logic explicit so you can reason about success and failure conditions.

stateDiagram-v2
    [*] --> Drafting
    Drafting --> Drafting: Additional coder agents
    Drafting --> Formatting
    Formatting --> Testing
    Testing --> Drafting: tests_passed = false
    Testing --> QA: tests_passed = true
    QA --> Drafting: qa_passed = false
    QA --> Completed: qa_passed = true
    Drafting --> Failed: proposed_code is empty
    QA --> Failed: retries exhausted
    Testing --> Failed: retries exhausted
    Completed --> [*]
    Failed --> [*]

The combined overview below condenses the loop end to end: the orchestrator threads a shared state dictionary through each agent group until QA passes or the iteration budget runs out.

flowchart TD
    %% Nodes
    T[Task]
    ACP["AgenticCodingPipeline<br/>(max 3 iterations)"]
    SD["Shared dict state:<br/>{task, proposed_code, tests_passed,<br/>qa_passed, feedback, ...}"]
    CA["Coding agents<br/>(OpenAI + Claude)"]
    FA["Formatting agents<br/>(Ruff --fix)"]
    TA["Testing agents<br/>(Claude writes tests → pytest run)"]
    QA["QA agents<br/>(Gemini review)"]
    DEC{"qa_passed = true<br/>OR<br/>max iterations reached?"}
    DONE["Completion<br/>status = &quot;completed&quot;"]

    %% Main flow
    T --> ACP
    ACP -. maintains .- SD
    ACP -->|produce / refine proposed_code| CA
    CA --> FA
    FA -->|normalize style before verification| TA
    TA -->|set tests_passed, attach test_output| QA
    QA -->|set qa_passed, attach qa_output| DEC
    DEC -- Yes --> DONE
    DEC -- No (feedback) --> CA

Prerequisites


Install

# 1. Create an isolated environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 2. Install repo dependencies and developer tooling
pip install --upgrade pip
pip install -e .           # exposes shared agentic_ai clients
pip install ruff pytest    # ensure local tools are present on PATH

Prefer Poetry? Swap the virtualenv commands with:

poetry install
poetry shell

Configure

Set credentials as environment variables (or drop them into a .env loaded by your shell):

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-...
export GOOGLE_API_KEY=sk-...
# Optional: enable GitHub/Jira issue intake
export GITHUB_TOKEN=ghp_...                      # increases GitHub API rate limits
export JIRA_BASE_URL=https://yourcompany.atlassian.net
export JIRA_EMAIL=you@yourcompany.com
export JIRA_API_TOKEN=atlassian_pat_or_api_token

Each agent reads from these keys when instantiated, so they must be available before running the CLI or creating the pipeline. GitHub/Jira variables are only required if you plan to resolve tasks from issues/tickets.
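
If you script around the pipeline, a small preflight guard like the following (an illustrative sketch, not part of the repo) fails fast when required keys are absent:

# Hedged sketch: verify required credentials before any agent is instantiated.
import os

REQUIRED = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")
missing = [key for key in REQUIRED if not os.environ.get(key)]
if missing:
    raise SystemExit(f"Missing credentials: {', '.join(missing)}")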

Configuration reference

| Knob | Where to set | Default | Effect |
| --- | --- | --- | --- |
| OPENAI_API_KEY | env var | – | Authenticates the GPT-based CodingAgent. |
| ANTHROPIC_API_KEY | env var | – | Powers the Claude-based coder and testing agent. |
| GOOGLE_API_KEY | env var | – | Enables Gemini QA review verdicts. |
| GITHUB_TOKEN | env var | optional | Increases GitHub API rate limits for issue fetches. |
| JIRA_BASE_URL | env var | optional | Base URL for your Jira (e.g., https://yourcompany.atlassian.net). |
| JIRA_EMAIL | env var | optional | Account email used for Jira API authentication. |
| JIRA_API_TOKEN | env var | optional | Atlassian API token or personal access token. |
| max_iterations | AgenticCodingPipeline(max_iterations=...) | 3 | Caps retries before marking the run as failed. |
| coders/formatters/testers/reviewers | Constructor args | see CLI defaults | Controls which agents participate in the loop. |

Run

CLI (batteries included)

cd Agentic-Coding-Pipeline
# Text task only
python run.py "Add pagination support to the API client"

# With repository (URL or local path)
python run.py --repo https://github.com/owner/repo.git "Implement caching for search"
python run.py --repo /path/to/local/repo "Refactor auth middleware"

# Resolve task from GitHub or Jira
python run.py --repo /path/to/repo --github owner/repo#123
python run.py --repo https://github.com/owner/repo.git --jira PROJ-456

The CLI streams human-friendly progress logs and prints the final status. Failed tests or QA commentary appear inline for quick triage.

Programmatic usage

from agents.coding import CodingAgent
from agents.formatting import FormattingAgent
from agents.testing import TestingAgent
from agents.qa import QAAgent
from pipeline import AgenticCodingPipeline

pipeline = AgenticCodingPipeline(
    coders=[CodingAgent(name="gpt"), CodingAgent(name="claude")],
    formatters=[FormattingAgent(name="formatter")],
    testers=[TestingAgent(name="tests")],
    reviewers=[QAAgent(name="qa")],
    max_iterations=5,
)
result = pipeline.run("Implement a prime sieve")
print(result["status"], result.get("feedback"))

The return value is a serializable dict suitable for downstream orchestration (CI jobs, MCP routing, etc.).
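
For example (an illustrative continuation of the snippet above, not repo code), a caller can branch on the documented keys:

# Hedged sketch: route the final state downstream.
if result["status"] == "completed":
    print("QA verdict:", result.get("qa_output"))
    # e.g. persist result["proposed_code"] or hand it to a commit step
else:
    print("Run failed:", result.get("reason") or result.get("feedback"))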


Web UI

A zero-build Vue 3 chat-style UI ships with the pipeline.

uvicorn agentic_ai.app:app --reload

Files:


HTTP API

The backend exposes both streaming and non-streaming endpoints for the coding pipeline.

Request body (both endpoints):

{
  "repo": "https://github.com/owner/repo.git" | "/path/to/local/repo" | null,
  "github": "https://github.com/owner/repo/issues/123" | "owner/repo#123" | null,
  "jira": "https://your.atlassian.net/browse/KEY-123" | "KEY-123" | null,
  "task": "Free-text task" | null
}
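
To illustrate sending that body from Python, here is a hedged sketch of a non-streaming call; the route path /api/coding is an assumption, so check the FastAPI app for the actual paths:

# Hedged sketch -- the "/api/coding" route is an assumption, not a documented path.
import httpx

response = httpx.post(
    "http://127.0.0.1:8000/api/coding",
    json={"repo": "/path/to/repo", "task": "Add pagination", "github": None, "jira": None},
    timeout=None,  # pipeline runs can take minutes
)
print(response.json())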

Streaming response events:

Implementation:


Client SDKs

Use the monorepo SDKs to call the coding endpoints directly:

TypeScript:

import { AgenticAIClient } from "../clients/ts/src/client";

const c = new AgenticAIClient({ baseUrl: "http://127.0.0.1:8000" });
await c.codingStream({
  repo: "/path/to/repo",
  task: "Add pagination",
  onEvent: (ev) => console.log(ev.event, ev.data),
});

Python:

from clients.python.agentic_ai_client import AgenticAIClient
import anyio

async def run():
    async with AgenticAIClient("http://127.0.0.1:8000") as c:
        await c.coding_stream(
            repo="/path/to/repo",
            github="owner/repo#123",
            on_event=lambda ev, data: print(ev, data),
        )

anyio.run(run)

See root README “Client SDKs” for more capabilities and examples.

Repo & Task Intake

Repository input:

Task input (resolution priority):

  1. GitHub issue (URL or owner/repo#123). Uses GITHUB_TOKEN if provided.
  2. Jira ticket (URL or KEY-123). Requires JIRA_BASE_URL, JIRA_EMAIL, and JIRA_API_TOKEN.
  3. Free-text task. The first line is treated as the title; the rest as the description.

Shared logic lives in Agentic-Coding-Pipeline/services.py and is used by both CLI and HTTP API, ensuring consistent behavior across interfaces.


How it works (step-by-step)

  1. Intake – Resolve task (GitHub, Jira, or text) and analyze repository context (URL clone or local path).
  2. First coder pass – GPT-based agent drafts an initial solution (or improves an existing snippet).
  3. Second coder pass – Claude-based agent refines the proposal, incorporating earlier feedback or failures.
  4. Formatter pass – Ruff auto-fixes lint/style deviations before any tests run.
  5. Test synthesis – Claude drafts pytest suites tailored to the generated code.
  6. Local execution – Pytest runs in an isolated temp directory; stdout/stderr are captured for diagnostics.
  7. QA verdict – Gemini reviews the candidate patch, emitting PASS/FAIL plus commentary.
  8. Loop or finish – Failures store feedback and trigger another iteration (up to max_iterations); the sketch below makes this loop explicit.
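
To make the control flow concrete, here is a hedged pseudocode sketch of the loop; it simplifies the real pipeline.py and is not a verbatim excerpt:

# Hedged simplification of the iteration logic described above.
def run(task, coders, formatters, testers, reviewers, max_iterations=3):
    state = {"task": task, "proposed_code": "", "feedback": None}
    for _ in range(max_iterations):
        for agent in coders + formatters:        # steps 2-4: draft, refine, format
            agent.run(state)
        if not state["proposed_code"]:
            return {**state, "status": "failed", "reason": "coder did not return code"}
        for agent in testers:                    # steps 5-6: write tests, run pytest
            agent.run(state)
        if not state.get("tests_passed"):
            state["feedback"] = state.get("test_output")
            continue                             # step 8: retry with feedback
        for agent in reviewers:                  # step 7: QA verdict
            agent.run(state)
        if state.get("qa_passed"):
            return {**state, "status": "completed"}
        state["feedback"] = state.get("qa_output")
    return {**state, "status": "failed"}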

State contract

The shared state dictionary evolves as agents run. Understanding the keys makes it easy to plug in dashboards or custom logic.

| Key | Producer | Consumer(s) | Description |
| --- | --- | --- | --- |
| task | CLI / caller | All agents | Original human request seeded at pipeline start. |
| proposed_code | Coding agents, formatter | Testers, reviewers | Latest candidate solution being evaluated. |
| tests_passed | Testing agents | Orchestrator loop | Boolean signal to continue to QA; failures trigger iteration feedback. |
| test_output | Testing agents | Humans / coders | Raw pytest stdout+stderr, preserved for diagnosis or re-prompting. |
| qa_passed | QA agents | Orchestrator loop | Indicates whether QA cleared the change. |
| qa_output | QA agents | Humans / coders | Reviewer commentary (PASS or actionable issues). |
| feedback | Orchestrator | Coders, humans | When tests/QA fail, the orchestrator surfaces the raw output as feedback for the next iteration. |
| status | Orchestrator | Callers | Final lifecycle marker: completed or failed. |
| reason | Orchestrator | Callers | Populated when a coder agent returns no code; explains the early failure. |
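
As an illustration, a successful run might leave the state looking like this (values invented for the example):

# Illustrative final state after a passing iteration -- values are made up.
final_state = {
    "task": "Implement a prime sieve",
    "proposed_code": "def sieve(n):\n    ...",
    "tests_passed": True,
    "test_output": "4 passed in 0.12s",
    "qa_passed": True,
    "qa_output": "PASS - edge cases for n < 2 handled correctly.",
    "feedback": None,
    "status": "completed",
}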

Project structure

Agentic-Coding-Pipeline/
├── README.md                     # This guide
├── __init__.py                   # Package marker
├── pipeline.py                   # Iterative orchestration logic
├── run.py                        # CLI entry point
├── services.py                   # Shared repo/task intake + streaming runner
├── ui/                           # Zero-build Vue-based chat UI
│   ├── index.html                # Mounted at /coding
│   ├── app.js                    # Markdown helper
│   └── styles.css                # UI styles
├── agents/                       # Role-specific LLM wrappers
│   ├── base.py                   # Agent protocol + base dataclass
│   ├── coding.py                 # Code synthesis agents
│   ├── formatting.py             # Ruff-backed formatter agent
│   ├── testing.py                # Test generation + execution agent
│   └── qa.py                     # LLM review agent
├── tools/                        # Optional utilities
│   ├── git.py                    # git commit helper
│   └── test_runner.py            # Standalone pytest runner helper
└── tests/
    └── test_pipeline.py          # Orchestration regression tests

Agents (roles & prompts)

| Role | Default LLM client | Prompt strategy | Key state inputs | Key outputs |
| --- | --- | --- | --- | --- |
| CodingAgent | OpenAIClient / ClaudeClient | Seeds or improves a Python solution depending on whether proposed_code already exists. | task, proposed_code | Updated proposed_code |
| FormattingAgent | Ruff CLI | Runs ruff --fix against a temp file and reloads the formatted contents. | proposed_code | Normalized proposed_code |
| TestingAgent | ClaudeClient | Generates pytest suites covering the solution, executes them, and stores stdout/stderr. | proposed_code | tests_passed, test_output |
| QAAgent | GeminiClient | Requests a PASS/FAIL verdict with commentary on issues found. | proposed_code | qa_passed, qa_output |

All agents share the lightweight Agent protocol, so custom roles (docs writers, security scanners, benchmark runners) can drop in without changing the orchestrator.
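
For instance, a custom reviewer could look like this hedged sketch; it assumes the protocol is simply a name plus a run(state) method, which matches how the built-in agents are invoked:

# Hedged sketch of a drop-in custom role (not shipped with the repo).
class SecurityScanAgent:
    def __init__(self, name: str = "security"):
        self.name = name

    def run(self, state: dict) -> dict:
        code = state.get("proposed_code", "")
        # Toy heuristic standing in for a real scanner such as bandit.
        state["security_flags"] = [
            line for line in code.splitlines() if "eval(" in line
        ]
        return state

Append an instance to the reviewers list (or any stage list) when constructing AgenticCodingPipeline, and the orchestrator will call it like any other agent.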


Prompt reference

Understanding the default prompt templates makes it easier to tailor model behaviour.

Swap or augment these strings in custom agents to target different languages, frameworks, or review policies.
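
As a hedged example of such a swap, a subclass might override the template; the attribute name prompt is an assumption, so check agents/coding.py for the real template name:

# Hedged sketch: retarget the coder at TypeScript via a subclassed prompt.
# The `prompt` attribute name is an assumption, not a documented API.
from agents.coding import CodingAgent

class TypeScriptCodingAgent(CodingAgent):
    prompt = (
        "You are a senior TypeScript engineer. Produce an idiomatic, strictly "
        "typed solution for: {task}"
    )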


Test orchestration


Formatting & patch hygiene


Tooling & integration patterns


MCP integration

This pipeline registers with the shared mcp package. The FastAPI-backed MCPServer exposes a unified toolbox (web search, browsing, direct LLM calls) so any pipeline in the monorepo can dispatch coding tasks remotely or as part of a larger workflow.

To plug this pipeline into the MCP network:

  1. Import AgenticCodingPipeline in your MCP task handler.
  2. Instantiate agents with the credentials available inside the MCP worker pod (often via Kubernetes secrets).
  3. Invoke pipeline.run(task) and route the resulting state (status, feedback, artifacts) back to the caller or next graph node, as in the sketch after this list.
  4. Optionally mount shared storage so test artifacts or generated files persist across MCP tool invocations.
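
Putting those steps together, a handler might look like this hedged sketch; the handler signature and payload shape are assumptions to adapt to your MCPServer:

# Hedged sketch of an MCP task handler (registration mechanism not shown).
from agents.coding import CodingAgent
from agents.formatting import FormattingAgent
from agents.qa import QAAgent
from agents.testing import TestingAgent
from pipeline import AgenticCodingPipeline

def handle_coding_task(payload: dict) -> dict:
    pipeline = AgenticCodingPipeline(
        coders=[CodingAgent(name="gpt"), CodingAgent(name="claude")],
        formatters=[FormattingAgent(name="formatter")],
        testers=[TestingAgent(name="tests")],
        reviewers=[QAAgent(name="qa")],
    )
    result = pipeline.run(payload["task"])
    # Route the final state back to the caller or the next graph node.
    return {"status": result["status"], "feedback": result.get("feedback")}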

Extending & customization


Operations & observability


Quality control & failure handling


Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| ModuleNotFoundError: agentic_ai.llm | Repo dependencies not installed | Run pip install -e . from the repo root, or poetry install. |
| ruff: command not found | Ruff not installed in the current environment | pip install ruff, then ruff --version (or poetry run ruff --version) to verify. |
| Pytest exits with import errors | Generated code expects extra dependencies | Update prompts to constrain imports, or pre-install the needed packages. |
| QA never passes | Reviewer verdict no longer contains “PASS” (prompt or casing changed) | Ensure the reviewer returns a string containing PASS on success, or adjust the check accordingly. |
| Pipeline stops after the coder stage | An agent returned an empty string | Inspect reason ("coder did not return code") and adjust prompts or guardrails. |
| Iterations never succeed | Feedback not consumed by coders | Make coder prompts reference feedback so failures inform the next iteration when you extend the pipeline. |
| GitHub issue not resolved | Missing token / low GitHub rate limit | Set GITHUB_TOKEN to raise limits, or try again later. |
| Jira ticket not resolved | Missing Jira credentials | Set JIRA_BASE_URL, JIRA_EMAIL, and JIRA_API_TOKEN. |

FAQ

Can I run only one coder? Yes. Provide a single CodingAgent (or even a custom agent) in the coders list.

How do I persist generated code to disk? Have an agent write proposed_code to the desired file path before QA, or call the git helper after completion.

Can the pipeline edit multi-file projects? The sample agents operate on a single code snippet, but the shared state supports richer payloads (e.g., dict of file paths). Add formatters/testers that understand your structure.

How do I integrate with CI? Wrap AgenticCodingPipeline.run() inside a job that prepares credentials and tools, then treat the returned state as the artifact for subsequent stages (commit, PR, deployment).
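
A hedged sketch of such a CI wrapper (using a single coder, which the FAQ above notes is supported; the script itself is illustrative):

# Hedged sketch: turn the pipeline's final status into a CI exit code.
import json
import sys

from agents.coding import CodingAgent
from agents.formatting import FormattingAgent
from agents.qa import QAAgent
from agents.testing import TestingAgent
from pipeline import AgenticCodingPipeline

pipeline = AgenticCodingPipeline(
    coders=[CodingAgent(name="gpt")],
    formatters=[FormattingAgent(name="formatter")],
    testers=[TestingAgent(name="tests")],
    reviewers=[QAAgent(name="qa")],
)
result = pipeline.run(sys.argv[1])  # task text supplied by the CI job
print(json.dumps({k: result.get(k) for k in ("status", "qa_output", "feedback")}, indent=2))
sys.exit(0 if result["status"] == "completed" else 1)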

Where do I tweak prompts? Each agent defines its own prompt string—modify them directly or subclass the agent to inject dynamic templates.


Happy shipping! 🚀