 Local-first · Multi-agent · Programmable

A runtime. Not a chat wrapper.

Forge ships its own scheduler, sandbox, permission system, state machine, iterative tool-use executor, four-tier memory, and plugin ecosystem. You pick the model. You approve the actions. Everything is inspectable, replayable, and yours.

173 ms · doctor cold-start
89 KB · UI shell · 0 CDN
1.5 s · provider probe
6 · 41 · providers · families
tests · 100% pass
01 · Overview

What is Forge?

A TypeScript CLI runtime for local-first agentic software engineering. Every piece below lives in src/. Node 20+. Ships via npm and a multi-arch Docker image.

/ ORCHESTRATION

Agentic loop

Classify → plan → approve → execute with iterative tool-use → validate → review → complete → learn. Failures escalate to diagnose — never a silent loop.

/ STATE

Inspectable everything

Tasks JSON, sessions JSONL, events JSONL. Conversations are JSONL with O_APPEND concurrency. Prompt hashes are deterministic.

/ SAFETY

Default-deny permissions

Every tool call classified by risk × side-effect × sensitivity. Paths realpath-confined. Shell risk-rated; critical hard-blocked. Credentials in OS keychain.

/ MODELS

Bring your own LLM

Auto-detects Ollama, LM Studio, vLLM, llama.cpp on default ports. 41 model families classified for role routing; auto-substitutes when your configured model isn't installed.

02 · Features

Every capability, highlighted.

Every feature below is in src/. Grep from any claim to a file.

Forge REPL Interface
Forge Web Dashboard
/ 01

Local-first

Auto-detects Ollama, LM Studio, vLLM, llama.cpp. Hosted Anthropic / OpenAI / Azure / Groq / LocalAI / Together / Fireworks are opt-in.

ollama · lmstudio · vllm · llama.cpp
/ 02

Iterative executor

Model sees every tool result (stdout / stderr / exit) and adapts within a step. Mode-capped turn budgets.

adaptive · bounded
/ 03

Validation gate

Post-step typecheck / lint failures re-enter the loop as tool results — fixed before the next step runs.

tsc · eslint
/ 04

DAG planner

Plans have step dependencies, risk annotations, explicit tool calls. Auto-fixer repairs common issues; cycles rejected.

topo-sort
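
The cycle rejection mentioned above can be illustrated with Kahn's algorithm. This is a minimal sketch, not Forge's planner: the `PlanStep` shape and the `topoSort` name are assumptions made for the example, and the real plan schema also carries risk annotations and tool calls.

```typescript
// Hypothetical step shape; the real plan schema is not shown on this page.
interface PlanStep {
  id: string;
  dependsOn: string[];
}

// Kahn's algorithm: returns step ids in dependency order. Throws when a
// dependency names an unknown step or when the graph contains a cycle.
function topoSort(steps: PlanStep[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const s of steps) {
    indegree.set(s.id, 0);
    dependents.set(s.id, []);
  }
  for (const s of steps) {
    for (const dep of s.dependsOn) {
      const list = dependents.get(dep);
      if (list === undefined) throw new Error(`unknown dependency: ${dep}`);
      list.push(s.id);
      indegree.set(s.id, (indegree.get(s.id) ?? 0) + 1);
    }
  }
  // Steps with no unmet dependencies are ready to run.
  const ready = steps.filter((s) => indegree.get(s.id) === 0).map((s) => s.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  // Any step left unplaced sits on a cycle.
  if (order.length !== steps.length) throw new Error("plan contains a cycle");
  return order;
}
```

Steps that never reach an indegree of zero are exactly the ones on a cycle, which is why a simple length comparison is enough to reject the plan.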
/ 05

Reviewer + debugger

Reviewer gates completion. On terminal failure, debugger agent diagnoses root cause before marking failed.

diagnose
/ 06

4-tier memory

Hot (session) · warm (SQLite recent) · cold (lazy project index) · learning (patterns with decaying confidence).

SQLite · FTS5
/ 07

Default-deny permissions

Risk × side-effect × sensitivity classified at every call. --skip-permissions only waives routine prompts.

trust-calibrated
/ 08

Realpath sandbox

Every path resolved to realpath, confined to project root. Always-forbidden targets (SSH keys, AWS creds) hard-blocked.

symlink-proof
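
A minimal sketch of realpath confinement, assuming Node's `fs.realpathSync` and a hypothetical `isInsideRoot` helper (Forge's own sandbox code is not shown here). Resolving symlinks before the containment check is what stops a link inside the project from pointing outside it.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Resolve both paths to their real locations, then check containment.
// realpathSync throws for paths that do not exist; a fuller version would
// resolve the nearest existing parent instead.
function isInsideRoot(projectRoot: string, candidate: string): boolean {
  const root = fs.realpathSync(projectRoot);
  const target = fs.realpathSync(path.resolve(root, candidate));
  const rel = path.relative(root, target);
  if (rel === "") return true; // the root itself
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Demo with throwaway directories (names are illustrative only).
const root = fs.mkdtempSync(path.join(os.tmpdir(), "forge-root-"));
const outside = fs.mkdtempSync(path.join(os.tmpdir(), "forge-outside-"));
fs.writeFileSync(path.join(root, "ok.txt"), "inside");
fs.writeFileSync(path.join(outside, "secret.txt"), "outside");
// A symlink that lives inside the root but points outside it.
fs.symlinkSync(path.join(outside, "secret.txt"), path.join(root, "link.txt"));

const plainFileAllowed = isInsideRoot(root, "ok.txt");
const symlinkAllowed = isInsideRoot(root, "link.txt");
console.log({ plainFileAllowed, symlinkAllowed });
```

The plain file passes and the symlink is rejected, because the check runs on the link's resolved target rather than on the path as typed.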
/ 09

Shell risk classifier

Commands rated before execution. rm -rf /, sudo, fork bombs, curl-to-shell hard-blocked.

sandbox
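
As a rough illustration of rating commands before execution, the sketch below hard-blocks the four examples named above with regular expressions. The patterns, the `classifyShell` name, and the two-level result are assumptions for the example; a real classifier would parse the command rather than pattern-match it, and these patterns miss variants such as `rm -fr /`.

```typescript
// Illustrative patterns only; not Forge's actual rule set.
const HARD_BLOCK: RegExp[] = [
  /\brm\s+-[a-z]*r[a-z]*f[a-z]*\s+\/(\s|$)/i, // rm -rf /
  /(^|[;&|])\s*sudo\b/, // sudo at the start of any command segment
  /:\(\)\s*\{.*\};\s*:/, // classic fork bomb  :(){ :|:& };:
  /\b(curl|wget)\b[^|]*\|\s*(ba|z)?sh\b/, // curl ... | sh
];

type ShellVerdict = "hard-block" | "needs-rating";

// Anything not hard-blocked still goes on to normal risk rating.
function classifyShell(command: string): ShellVerdict {
  return HARD_BLOCK.some((re) => re.test(command)) ? "hard-block" : "needs-rating";
}
```

Note that `rm -rf ./build` is deliberately not hard-blocked here: it would be rated and, depending on the mode, sent to the user for approval.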
/ 10

OS keychain

macOS Security, libsecret, Windows DPAPI. AES-GCM encrypted fallback if unavailable.

DPAPI · libsecret
/ 11

Concurrent-writer safe

REPL + UI + subagents edit the same conversation via POSIX O_APPEND + mkdir lockfile fallback.

POSIX
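
A minimal sketch of the append-only JSONL pattern, assuming Node's `fs.appendFileSync` (its `"a"` flag opens the file with O_APPEND). The helper names, file name, and record fields are hypothetical, and the mkdir lockfile fallback is not covered.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Append one JSON object as one line. With O_APPEND the kernel positions
// each write at the current end of file, so writers do not overwrite each other.
function appendRecord(file: string, record: object): void {
  fs.appendFileSync(file, JSON.stringify(record) + "\n", { flag: "a" });
}

// Read the log back, skipping the empty string after the final newline.
function readRecords(file: string): unknown[] {
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

// Hypothetical file name and fields, for illustration only.
const logDir = fs.mkdtempSync(path.join(os.tmpdir(), "forge-jsonl-"));
const logFile = path.join(logDir, "conversation.jsonl");
appendRecord(logFile, { writer: "repl", text: "hello" });
appendRecord(logFile, { writer: "ui", text: "world" });
const records = readRecords(logFile);
console.log(records.length);
```

Writing each record with a single call and a trailing newline keeps every line independently parseable, which is what makes the log easy to inspect and replay.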
/ 12

Prompt-injection defence

Untrusted content (web / MCP) fenced as data, never instructions. Redactor scrubs secrets before logs.

fence · redact
/ 13

MCP bridge

Model Context Protocol: stdio + HTTP-stream. OAuth 2.0 + PKCE or API-key auth. Tokens in keychain.

MCP · OAuth2
/ 14

Skills & instructions

Markdown + YAML frontmatter in ~/.forge/skills/. Per-project overrides.

.md skills
/ 15

Live dashboard

HTTP + WebSocket UI. Vanilla JS, < 100 KB, zero CDN. Delta watchers ref-counted across tabs.

vanilla JS
/ 16

Router reliability

Per-provider rate limit, circuit breaker, prompt cache, USD cost ledger. 1.5 s provider probes.

breaker
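
The circuit breaker named above could look roughly like this. It is a sketch under assumptions: the class name, the failure threshold, and the cool-down are all hypothetical, since the router's actual values are not stated on this page.

```typescript
// Opens after maxFailures consecutive failures, then lets a trial request
// through once coolDownMs has elapsed (the "half-open" state).
class CircuitBreaker {
  private failures = 0;
  private openedAtMs: number | null = null;
  private readonly maxFailures: number;
  private readonly coolDownMs: number;

  constructor(maxFailures: number, coolDownMs: number) {
    this.maxFailures = maxFailures;
    this.coolDownMs = coolDownMs;
  }

  canRequest(nowMs: number): boolean {
    if (this.openedAtMs === null) return true; // closed: traffic flows
    return nowMs - this.openedAtMs >= this.coolDownMs; // open until cool-down ends
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAtMs = null;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAtMs = nowMs;
  }
}
```

Passing the clock in as `nowMs` keeps the class deterministic, so the open, half-open, and closed transitions can be exercised without real delays.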
/ 17

Release signing

Manifest signed with Ed25519. SHA-256 per artefact. npm publishes with provenance.

Ed25519 · provenance
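
For readers who want to see what "SHA-256 per artefact, Ed25519 over the manifest" amounts to, here is a sketch using Node's built-in `node:crypto`. The manifest layout, the `buildManifest` helper, and the artefact name are assumptions for illustration; the real release manifest format is not documented here.

```typescript
import { createHash, generateKeyPairSync, sign, verify } from "node:crypto";

interface Artefact {
  name: string;
  bytes: Buffer;
}

// Hypothetical manifest: artefact name -> SHA-256 hex digest, keys sorted so
// the serialized form is stable.
function buildManifest(artefacts: Artefact[]): string {
  const digests: Record<string, string> = {};
  const sorted = [...artefacts].sort((a, b) => a.name.localeCompare(b.name));
  for (const a of sorted) {
    digests[a.name] = createHash("sha256").update(a.bytes).digest("hex");
  }
  return JSON.stringify(digests);
}

// A throwaway key pair; a real release would load a protected signing key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const manifest = buildManifest([
  { name: "example-artefact.tgz", bytes: Buffer.from("example bytes") },
]);

// Ed25519 takes no separate digest algorithm, hence the null first argument.
const signature = sign(null, Buffer.from(manifest), privateKey);
const signatureValid = verify(null, Buffer.from(manifest), publicKey, signature);
const tamperedValid = verify(null, Buffer.from(manifest + " "), publicKey, signature);
console.log({ signatureValid, tamperedValid });
```

Because the signature covers the manifest and the manifest covers each artefact's digest, changing any artefact byte invalidates verification through one of the two checks.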
/ 18

Multi-arch containers

Single Dockerfile serves CLI + UI. Non-root, HEALTHCHECK, OCI labels, ~355 MB.

amd64 · arm64
03 · Agentic loop

Classify · plan · approve · execute · validate · review · complete.

Source: src/core/loop.ts. Retry cap is 3. The debugger agent runs root-cause diagnosis before marking a task failed.

---
config:
  look: handDrawn
  theme: base
  themeVariables:
    fontSize: 16px
---
flowchart LR
  IN(("USER<br/>prompt")) --> CLS["CLASSIFY<br/>intent · risk · scope"]
  CLS --> PL["PLAN<br/>DAG · steps · deps"]
  PL --> AP{"Approve?"}
  AP -- edit --> PL
  AP -- no --> CNCL(["cancelled"])
  AP -- yes --> EX["EXECUTE<br/>iterative tool-use"]
  EX --> VG{"Validate<br/>(tsc · lint)"}
  VG -- fails + budget --> EX
  VG -- fails + out --> DX["DIAGNOSE"]
  DX --> FL(["failed"])
  VG -- pass --> RV["REVIEW<br/>reviewer agent"]
  RV -- bounce --> EX
  RV -- pass --> DONE(["completed"])
  DONE --> LRN["LEARN<br/>patterns updated"]
  classDef term fill:#0a1a14,stroke:#10b981,color:#d1fae5,stroke-width:2px
  classDef fail fill:#1a0909,stroke:#f87171,color:#fee2e2,stroke-width:2px
  classDef step fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:1.8px
  classDef gate fill:#1a1634,stroke:#a78bfa,color:#ede9fe,stroke-width:1.8px
  classDef io fill:#0c1a24,stroke:#22d3ee,color:#cffafe,stroke-width:2px
  class IN io
  class CNCL,FL fail
  class DONE term
  class CLS,PL,EX,RV,DX,LRN step
  class AP,VG gate
04 · State machine

10 task statuses. Every move gated.

Enforced by LEGAL_TRANSITIONS in src/persistence/tasks.ts. Illegal moves throw state_invalid. Terminal states only re-enter via forge resume, which resets them to draft.

---
config:
  theme: base
  themeVariables:
    fontSize: 16px
---
stateDiagram-v2
  direction LR
  [*] --> draft
  draft --> planned : planner output
  draft --> cancelled : user
  planned --> approved : user approves
  planned --> blocked : missing deps
  planned --> cancelled
  approved --> scheduled
  approved --> cancelled
  scheduled --> running
  scheduled --> blocked
  scheduled --> cancelled
  running --> verifying
  running --> failed
  running --> blocked
  running --> cancelled
  verifying --> completed
  verifying --> failed
  verifying --> running : reviewer bounces
  completed --> draft : forge resume
  failed --> draft : forge resume
  blocked --> draft : forge resume
  cancelled --> draft : forge resume
  blocked --> cancelled
  completed --> [*]
  failed --> [*]
  cancelled --> [*]
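
The diagram above can be transcribed into a lookup table. This is a sketch only: the name `LEGAL_TRANSITIONS` and the `state_invalid` error come from the text, but the table's actual shape in `src/persistence/tasks.ts` and the `assertTransition` helper are assumptions.

```typescript
type TaskStatus =
  | "draft" | "planned" | "approved" | "scheduled" | "running"
  | "verifying" | "completed" | "failed" | "blocked" | "cancelled";

// Transcribed edge-for-edge from the state diagram; "-> draft" edges are
// the ones taken by `forge resume`.
const LEGAL_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  draft: ["planned", "cancelled"],
  planned: ["approved", "blocked", "cancelled"],
  approved: ["scheduled", "cancelled"],
  scheduled: ["running", "blocked", "cancelled"],
  running: ["verifying", "failed", "blocked", "cancelled"],
  verifying: ["completed", "failed", "running"],
  completed: ["draft"],
  failed: ["draft"],
  blocked: ["draft", "cancelled"],
  cancelled: ["draft"],
};

// Hypothetical guard: throws on any move the table does not list.
function assertTransition(from: TaskStatus, to: TaskStatus): void {
  if (!LEGAL_TRANSITIONS[from].includes(to)) {
    throw new Error(`state_invalid: ${from} -> ${to}`);
  }
}
```

Keeping the rules in one table means every status change goes through the same check, and adding a status forces the type checker to demand its outgoing edges.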
05 · Executor

Iterative tool use, inside each step.

Model sees every tool result — stdout, stderr, exit, error — and can adapt. Source: src/agents/executor.ts.

---
config:
  theme: base
  themeVariables:
    fontSize: 15px
    actorFontSize: 14px
    messageFontSize: 13px
---
sequenceDiagram
  autonumber
  participant L as loop.ts
  participant E as executor
  participant M as model
  participant T as tool
  participant V as validator
  L->>E: runStep(step)
  loop up to maxExecutorTurns
    E->>M: prompt + JSON schema
    M-->>E: {actions, done?}
    alt done
      E-->>L: completed
    else actions
      E->>T: execute
      T-->>E: stdout / stderr / exit
      E->>E: digest + append
    end
  end
  opt files changed
    loop up to maxValidationRetries
      E->>V: typecheck / lint
      alt pass
        E-->>L: completed
      else fail
        E->>M: VALIDATION_FAILED
        M-->>E: corrective actions
        E->>T: execute
      end
    end
  end
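
The first loop in the sequence above can be sketched as a bounded turn loop. Everything here is hypothetical and simplified: the types, the synchronous `Model` and `Tool` callbacks, and the `budget_exhausted` result are stand-ins, and the validation loop is omitted.

```typescript
interface ToolResult {
  stdout: string;
  stderr: string;
  exit: number;
}

interface ModelReply {
  actions: string[];
  done: boolean;
}

// Stand-ins for the model call and the tool runner.
type Model = (transcript: string[]) => ModelReply;
type Tool = (action: string) => ToolResult;

// Each turn: ask the model, run its actions, append the results so the
// next turn can react to them. Stops at the turn budget.
function runStep(
  model: Model,
  tool: Tool,
  maxExecutorTurns: number,
): "completed" | "budget_exhausted" {
  const transcript: string[] = [];
  for (let turn = 0; turn < maxExecutorTurns; turn++) {
    const reply = model(transcript);
    if (reply.done) return "completed";
    for (const action of reply.actions) {
      const r = tool(action);
      transcript.push(`${action} -> exit ${r.exit}: ${r.stdout}${r.stderr}`);
    }
  }
  return "budget_exhausted";
}
```

The point of the pattern is that a failing exit code is fed back as input rather than ending the step, while the budget guarantees the loop terminates.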
06 · Memory

Four tiers. Decays over time.

Planner reads top-K learning patterns before every plan.

---
config:
  theme: base
  themeVariables:
    fontSize: 15px
---
flowchart TB
  Q["query<br/>retrieve.ts"] --> H["🔥 HOT<br/>in-session facts<br/>cleared on task end"]
  Q --> W["☀️ WARM<br/>recent tasks · SQLite<br/>ages out"]
  Q --> C["❄️ COLD<br/>project files · grep · AST<br/>lazy-indexed"]
  Q --> L["🧠 LEARNING<br/>patterns + confidence<br/>decays if unused"]
  classDef t fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:2px
  classDef src fill:#0c1a24,stroke:#22d3ee,color:#cffafe,stroke-width:2px
  class Q src
  class H,W,C,L t
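
One way to read "confidence decays if unused" together with "top-K learning patterns" is an exponential half-life applied at query time. The decay law, the seven-day half-life, and the `Pattern` shape below are all assumptions; the page does not say how Forge computes decay.

```typescript
interface Pattern {
  id: string;
  confidence: number; // stored confidence at last use, 0..1
  lastUsedMs: number; // timestamp of last use
}

// Assumed decay law: confidence halves for every HALF_LIFE_MS of disuse.
const HALF_LIFE_MS = 7 * 24 * 60 * 60 * 1000;

function decayedConfidence(p: Pattern, nowMs: number): number {
  const idleMs = Math.max(0, nowMs - p.lastUsedMs);
  return p.confidence * Math.pow(0.5, idleMs / HALF_LIFE_MS);
}

// Rank by decayed confidence and keep the best k, leaving the input untouched.
function topK(patterns: Pattern[], k: number, nowMs: number): Pattern[] {
  return [...patterns]
    .sort((a, b) => decayedConfidence(b, nowMs) - decayedConfidence(a, nowMs))
    .slice(0, k);
}
```

Under this scheme a strong but stale pattern can rank below a weaker one that was used recently, which matches the intent of letting unused knowledge fade.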
07 · Providers & routing

Bring your own LLM. Forge auto-adapts.

6 providers; the four local runtimes are auto-detected on default ports. 41 model families classified for role routing.

---
config:
  theme: base
  themeVariables:
    fontSize: 15px
---
flowchart LR
  R["router<br/>resolveModel"] --> AD["adapter<br/>resolveLocalModel"]
  AD --> L1["🟢 ollama<br/>:11434"]
  AD --> L2["🔵 lmstudio<br/>:1234"]
  AD --> L3["🟠 vllm<br/>:8000"]
  AD --> L4["🟡 llama.cpp<br/>:8080"]
  R --> H1["⬛ anthropic"]
  R --> H2["⬛ openai-compat"]
  R --> RL["rate limit"]
  R --> CB["circuit breaker"]
  R --> PC["prompt cache"]
  R --> CT["USD ledger"]
  classDef route fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:2px
  classDef local fill:#0a1a14,stroke:#10b981,color:#d1fae5,stroke-width:2px
  classDef hosted fill:#1a1634,stroke:#a78bfa,color:#ede9fe,stroke-width:2px
  classDef util fill:#16121a,stroke:#f472b6,color:#fce7f3,stroke-width:1.8px
  class R,AD route
  class L1,L2,L3,L4 local
  class H1,H2 hosted
  class RL,CB,PC,CT util

Model families → preferred roles

| Role | Preferred families |
|---|---|
| architect · reviewer · debugger | Llama 3.x / 4.x, Mixtral, Command-R+, DeepSeek V3 / R1, Mistral-Large |
| planner | Qwen 2.5 / 3, Llama 3.x, DeepSeek V3, Gemma 3, Mistral-Nemo, Command-R, Phi 4 |
| executor (code) | DeepSeek-Coder, Qwen 2.5-Coder, CodeLlama, Codestral, StarCoder, Granite-Code |
| fast | Phi 3 / 4, Gemma 2, TinyLlama, SmolLM, MiniCPM |
08 · Safety model

Default-deny. Every tool call gated.

---
config:
  theme: base
  themeVariables:
    fontSize: 15px
---
flowchart TB
  REQ["tool call"] --> C["classify<br/>risk · sideEffect · sensitivity"]
  C --> S{"path in sandbox?<br/>cmd allow-listed?"}
  S -->|"no"| X["⛔ HARD-BLOCK<br/>sandbox_violation"]
  S -->|"yes"| G{"risk × sideEffect"}
  G -->|"low / read"| A["✅ auto-allow"]
  G -->|"med / write"| K["❓ ask user"]
  G -->|"high / exec"| ST["🔒 ask · strict"]
  K --> F{"session flags?"}
  F -->|"allow-* flag"| A
  F -->|"non-interactive"| D["⛔ deny silently"]
  F -->|"interactive"| P["user prompt"]
  P -->|"allow"| A
  P -->|"deny"| D
  A --> E["execute"]
  E --> TR["trust calibration<br/>auto-allow after N confirms"]
  classDef ok fill:#0a1a14,stroke:#10b981,color:#d1fae5,stroke-width:2px
  classDef bad fill:#1a0909,stroke:#f87171,color:#fee2e2,stroke-width:2px
  classDef gate fill:#1a1634,stroke:#a78bfa,color:#ede9fe,stroke-width:2px
  classDef step fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:1.8px
  class A,E,TR ok
  class X,D bad
  class S,G,F gate
  class REQ,C,K,ST,P step
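
The decision flow in the diagram can be written as a small pure function. This is a simplified sketch: the type names and the `decide` function are hypothetical, the side-effect and sensitivity axes are folded into a single `risk` value, and trust calibration is left out.

```typescript
type Risk = "low" | "medium" | "high";
type Decision = "hard-block" | "allow" | "prompt" | "prompt-strict" | "deny";

interface ToolCall {
  inSandbox: boolean; // path confined and command allow-listed
  risk: Risk;
}

interface SessionFlags {
  allowFlag: boolean; // a matching allow-* flag was passed
  interactive: boolean;
}

function decide(call: ToolCall, session: SessionFlags): Decision {
  if (!call.inSandbox) return "hard-block"; // sandbox_violation, no override
  if (call.risk === "low") return "allow"; // low / read
  if (call.risk === "high") return "prompt-strict"; // high / exec
  // medium / write: depends on session flags
  if (session.allowFlag) return "allow";
  return session.interactive ? "prompt" : "deny";
}
```

Ordering the checks this way means no flag can rescue a call that fails the sandbox test, and a non-interactive session never waits on a prompt that nobody can answer.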
09 · Modes

Nine modes. Enforceable budgets.

Each mode is a runtime cap, not a hint. Read from src/core/mode-policy.ts.

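| Mode | Executor turns | Validation retries | Mutations | Max auto-risk |
|---|---|---|---|---|
| fast | 2 | 0 | yes | low |
| balanced | 4 | 1 | yes | medium |
| heavy | 8 | 2 | yes | high |
| plan | 0 → 1 | 0 | no | low |
| execute | 4 | 1 | yes | medium |
| audit | 3 | 0 | no | low |
| debug | 6 | 2 | yes | medium |
| architect | 3 | 1 | yes | medium |
| offline-safe | 3 | 1 | yes | medium |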
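
To make "a runtime cap, not a hint" concrete, the table above can be expressed as a typed record that the executor consults on every turn. The values are copied from the table; the names `MODE_POLICY`, `ModePolicy`, and `turnAllowed` are assumptions and may not match `src/core/mode-policy.ts`.

```typescript
type Mode =
  | "fast" | "balanced" | "heavy" | "plan" | "execute"
  | "audit" | "debug" | "architect" | "offline-safe";

interface ModePolicy {
  executorTurns: number;
  validationRetries: number;
  mutations: boolean;
  maxAutoRisk: "low" | "medium" | "high";
}

const MODE_POLICY: Record<Mode, ModePolicy> = {
  fast: { executorTurns: 2, validationRetries: 0, mutations: true, maxAutoRisk: "low" },
  balanced: { executorTurns: 4, validationRetries: 1, mutations: true, maxAutoRisk: "medium" },
  heavy: { executorTurns: 8, validationRetries: 2, mutations: true, maxAutoRisk: "high" },
  // The table lists "0 → 1" for plan; the upper bound is used here.
  plan: { executorTurns: 1, validationRetries: 0, mutations: false, maxAutoRisk: "low" },
  execute: { executorTurns: 4, validationRetries: 1, mutations: true, maxAutoRisk: "medium" },
  audit: { executorTurns: 3, validationRetries: 0, mutations: false, maxAutoRisk: "low" },
  debug: { executorTurns: 6, validationRetries: 2, mutations: true, maxAutoRisk: "medium" },
  architect: { executorTurns: 3, validationRetries: 1, mutations: true, maxAutoRisk: "medium" },
  "offline-safe": { executorTurns: 3, validationRetries: 1, mutations: true, maxAutoRisk: "medium" },
};

// Hard cap: once the budget is spent, no further turn is allowed.
function turnAllowed(mode: Mode, turnsUsed: number): boolean {
  return turnsUsed < MODE_POLICY[mode].executorTurns;
}
```

Because the record is keyed by the `Mode` union, adding a mode without a policy entry is a compile-time error rather than a silent default.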
10 · CLI reference

24 subcommands. 55 slash commands in the REPL.

~ — forge --help · bash
# Core
forge                          # REPL (default)
forge init                     # create ~/.forge + ./.forge
forge run "<prompt>"           # full agentic loop
forge plan "<prompt>"          # plan-only
forge execute "<prompt>"       # auto-approve + execute
forge resume [taskId]          # resume any prior task
forge status                   # runtime state
forge doctor                   # health + role→model mapping

# State inspection
forge task list|search         # task history
forge session list|replay      # session JSONL
forge memory {hot|warm|cold}   # memory layers

# Models & config
forge model list
forge config get|set|path
forge cost                     # USD ledger

# Integrations
forge mcp list|add|remove
forge skills list|new
forge agents list
forge web {search|fetch}

# Ops
forge ui start                 # dashboard :7823
forge daemon start|stop|status
forge container up|down        # compose wrapper
forge bundle pack|unpack       # offline bundles
forge update                   # self-update
11 · Filesystem

XDG-aware. Per-project overrides.

---
config:
  theme: base
  themeVariables:
    fontSize: 14px
---
flowchart TB
  subgraph GLOBAL["global · ~/.forge"]
    G1[config.json]
    G2[instructions.md]
    G3[skills/]
    G4[agents/]
    G5[mcp/]
    G6[index.db]
    G7["projects · tasks · sessions · events"]
  end
  subgraph PROJECT["per-project · ./.forge"]
    P1[config.json]
    P2[instructions.md]
    P3[skills/]
    P4[agents/]
    P5[mcp/]
  end
12 · Skills & MCP

Extend without rebuilding.

Skill example

~/.forge/skills/conventional-commit.md · md
---
name: conventional-commit
description: Enforce Conventional Commits.
triggers: [commit, git]
---

When writing commit messages, use Conventional Commits:
  feat(scope): …
  fix(scope): …
  refactor(scope): …

Add MCP connector

forge mcp add · bash
forge mcp list
forge mcp add linear --transport stdio --command "mcp-linear-server"
forge mcp add postgres --transport http --url https://mcp.example/v1 --auth oauth2-pkce
forge mcp status
13 · Install

Three paths. Pick one.

01 / npm

global install · bash
npm i -g @hoangsonw/forge
forge doctor
forge run "…"


02 / Docker

zero local Node · bash
docker run --rm -it \
  -v forge-home:/data \
  -v "$PWD:/workspace" \
  ghcr.io/hoangsonw/forge-agentic-coding-cli:latest

03 / Compose

full stack · bash
docker compose \
  -f docker/docker-compose.yml \
  up -d
# podman-compose works
14 · Container posture

Single image. CLI + UI + daemon.

---
config:
  theme: base
  themeVariables:
    fontSize: 14px
---
flowchart LR
  subgraph BUILD["Stage 1 · builder"]
    direction TB
    B1["node:20-slim"] --> B2["npm ci<br/>tsc + copy-assets"]
    B2 --> B3["npm prune --omit=dev"]
  end
  subgraph RUN["Stage 2 · runtime · ~355 MB"]
    direction TB
    R1["node:20-slim"] --> R2["apt: git · ripgrep · tini"]
    R2 --> R3["non-root uid 10001"]
    R3 --> R4["pruned node_modules + dist"]
    R4 --> R5["HEALTHCHECK · forge doctor"]
    R5 --> R6["OCI labels"]
  end
  BUILD -.dist + prod deps.-> RUN
  classDef s fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:1.8px
  class B1,B2,B3,R1,R2,R3,R4,R5,R6 s
15 · CI/CD

9 jobs per PR. 6 release stages.

CI · every PR + push

---
config:
  theme: base
  themeVariables:
    fontSize: 14px
---
flowchart LR
  PR(["PR / push"]) --> FMT["🎨 format"]
  PR --> LINT["🧹 lint"]
  PR --> TYPE["🧠 typecheck"]
  PR --> TEST["🧪 test matrix<br/>Ubuntu · macOS<br/>Node 20 · 22"]
  TEST --> COV["📈 coverage"]
  TYPE --> BUILD["🏗️ build"]
  BUILD --> DOCKER["🐳 docker-build"]
  PR --> AUDIT["🔐 audit"]
  FMT --> S["📊 pipeline<br/>status"]
  LINT --> S
  TYPE --> S
  TEST --> S
  BUILD --> S
  DOCKER --> S
  AUDIT --> S
  COV --> S
  classDef job fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:2px
  classDef tg fill:#0c1a24,stroke:#22d3ee,color:#cffafe,stroke-width:2px
  classDef sum fill:#1a1634,stroke:#a78bfa,color:#ede9fe,stroke-width:2px
  class PR tg
  class FMT,LINT,TYPE,TEST,COV,BUILD,DOCKER,AUDIT job
  class S sum

Release · on v* tag

---
config:
  theme: base
  themeVariables:
    fontSize: 14px
---
flowchart LR
  T(["git tag v*"]) --> G["🧪 pre-release<br/>gate"]
  G --> A["📦 artifacts<br/>5 targets"]
  G --> D["🐳 docker<br/>multi-arch → GHCR"]
  A --> M["📝 manifest<br/>ed25519-signed"]
  M --> N["📤 npm publish<br/>--provenance"]
  G --> R["📊 release<br/>status"]
  A --> R
  D --> R
  M --> R
  N --> R
  classDef tg fill:#0c1a24,stroke:#22d3ee,color:#cffafe,stroke-width:2px
  classDef job fill:#0f1726,stroke:#38bdf8,color:#e0f2fe,stroke-width:2px
  classDef ship fill:#1a1409,stroke:#fb923c,color:#ffedd5,stroke-width:2px
  classDef sum fill:#1a1634,stroke:#a78bfa,color:#ede9fe,stroke-width:2px
  class T tg
  class G job
  class A,D,M,N ship
  class R sum
16 · Runtime metrics

What it actually costs to run.

All measured locally — reproducers in the table at the bottom. No synthetic benchmarks, no comparisons against straw-man tools.

doctor cold-start: 173 ms. Full provider probe + SQLite open + role→model mapping.
--help cold-start: 238 ms. Commander.js boot + command-tree discovery.
UI shell: 89 KB. Vanilla JS, zero frameworks, zero CDN fetches.
provider probe timeout: 1.5 s. Not 30 s hangs. Dead runtimes surface fast.

Startup times

cold, no state reused · lower is better · measured via `time node bin/forge.js <cmd>`

| Command | Time |
|---|---|
| forge doctor | 173 ms |
| forge --help | 238 ms |
| forge status | 190 ms |
| forge model list | 310 ms |
| full test suite | 3.3 s |

Executor turn budget per mode

hard runtime cap · from `src/core/mode-policy.ts` · higher = more iteration, more tokens

| Mode | Executor turns |
|---|---|
| plan | 1 |
| fast | 2 |
| audit · architect · offline-safe | 3 |
| balanced · execute | 4 |
| debug | 6 |
| heavy | 8 |

UI shell asset sizes

uncompressed · served from disk · no network fetches at runtime

| Asset | Size |
|---|---|
| index.html (landing) | ~67 KB |
| dashboard app.js | 89 KB |
| dashboard styles.css | 40 KB |
| dashboard index.html | 22 KB |

| Target | Measured | Reproducer |
|---|---|---|
| forge doctor cold-start | 173 ms | `time node bin/forge.js doctor --no-banner` |
| forge --help cold-start | 238 ms | `time node bin/forge.js --help` |
| full test suite | ~3.3 s | `npx vitest run` |
| UI app.js uncompressed | 89 KB | `wc -c src/ui/public/app.js` |
| container image | ~355 MB | `docker images ghcr.io/hoangsonw/forge-agentic-coding-cli` |
| CDN fetches at runtime | 0 | inspect app.js · no external URLs |
| provider probe timeout | 1.5 s | src/models/openai.ts#isAvailable |
17 · Agent-facing files

Works with every coding agent.

Context files so agents don't re-learn the repo every turn.

Ready to install?

Cold-start 173 ms. UI shell 89 KB · zero CDN. Providers probe with 1.5 s timeouts.