Production-Grade Agentic RAG Platform

RAG AI Portfolio Support System Wiki

This page documents the latest architecture and operations model for the upgraded project: a full-stack, deployment-ready, agentic RAG platform with real-time chat, API tool-chaining, progressive delivery, and infrastructure as code.

  • 3 core runtime services
  • 4 retrieval strategies
  • Progressive deployment modes: canary + blue-green
  • Cloud IaC targets: AWS + OCI

Key Features

The RAG system now includes several key features that allow it to operate in a production-grade environment.

Agentic RAG Orchestration

Query understanding now drives backend tool selection and follow-up API chaining, with traceability exposed in the UI and in responses; a sketch of the planning and trace structures follows the list below.

  • Tool planning from entities + context
  • Follow-up tool call expansion
  • API chain trace per response
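
The exact orchestrator internals live in the RAG engine; as a rough illustration, tool planning can be thought of as mapping extracted entities to backend endpoints and recording every executed call in a trace. The names below (plan_tool_calls, TOOL_ROUTES, the api_chain_trace layout) are illustrative assumptions, not the repository's actual identifiers.

# Hypothetical sketch of tool planning and chain tracing.
from typing import Any

# Assumed mapping from recognised entity types to backend tool endpoints.
TOOL_ROUTES = {
    "person":     "/api/team",
    "investment": "/api/investments",
    "sector":     "/api/sectors",
}

def plan_tool_calls(entities: dict[str, list[str]]) -> list[dict[str, Any]]:
    """Turn extracted entities into an ordered list of backend tool calls."""
    plan = []
    for entity_type, values in entities.items():
        endpoint = TOOL_ROUTES.get(entity_type)
        if endpoint:
            for value in values:
                plan.append({"endpoint": endpoint, "params": {"name": value}})
    return plan

# Each executed call is appended to a trace returned with the answer, so the
# UI can render which APIs were chained for a given response.
api_chain_trace = [
    {"endpoint": "/api/team", "params": {"name": "John Doe"}, "status": 200},
    {"endpoint": "/api/investments", "params": {}, "status": 200},
]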

Production Chat Experience

Modern React + MUI chat app with session controls, streaming with REST fallback, citation cards, and runtime telemetry; a sketch of the fallback pattern follows the list below.

  • WebSocket streaming + REST fallback
  • Session create/load/delete
  • Request and latency metadata
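
The production client is React + Socket.IO Client, but the stream-then-fallback pattern is easy to see in a few lines of Python. The event names (chat_request, chat_token) and payload fields here are assumptions for illustration only.

# Sketch of WebSocket streaming with a REST fallback.
import requests
import socketio  # pip install "python-socketio[client]"

def ask(query: str, base_url: str = "http://localhost:5000") -> None:
    sio = socketio.Client()

    @sio.on("chat_token")
    def on_token(data):
        # Print partial tokens as they stream in over the WebSocket.
        print(data.get("token", ""), end="", flush=True)

    try:
        sio.connect(base_url)
        sio.emit("chat_request", {"query": query, "strategy": "hybrid"})
        sio.sleep(30)  # wait while the streamed response arrives
        sio.disconnect()
    except socketio.exceptions.ConnectionError:
        # REST fallback when the WebSocket is unavailable.
        resp = requests.post(f"{base_url}/api/chat",
                             json={"query": query, "strategy": "hybrid"},
                             timeout=60)
        print(resp.json().get("response", ""))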

Hardened API Runtime

Flask API factory with validation, health/readiness probes, optional gateway auth, request IDs, and API rate limiting; a minimal factory sketch follows the list below.

  • /livez, /readyz, /health
  • OpenAI-compatible endpoint
  • Upload and strategy endpoints
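
A minimal sketch of what such a factory can look like, assuming Flask 2.x-style route decorators; the X-Request-ID header name and probe payloads are illustrative, and the real factory also wires validation, optional gateway auth, and rate limiting.

# Hypothetical app factory with probes and request IDs.
import uuid
from flask import Flask, g, jsonify, request

def create_app() -> Flask:
    app = Flask(__name__)

    @app.before_request
    def assign_request_id():
        # Reuse an upstream ID if the gateway set one, otherwise mint our own.
        g.request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))

    @app.after_request
    def echo_request_id(response):
        response.headers["X-Request-ID"] = g.request_id
        return response

    @app.get("/livez")
    def livez():
        return jsonify(status="alive")

    @app.get("/readyz")
    def readyz():
        # The real service would also check the model, vector store, etc.
        return jsonify(status="ready")

    @app.get("/health")
    def health():
        return jsonify(status="ok")

    return app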

Multi-Layer Retrieval

Semantic, hybrid, multi-query, and decomposed strategies with cross-encoder reranking and evidence-backed generation; a hybrid retrieval sketch follows the list below.

  • Chroma + BM25 ensemble retrieval
  • Optional reranking pipeline
  • Source citation response model
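
A condensed sketch of the hybrid path using LangChain's ensemble retriever, assuming a recent langchain/langchain-community package layout; the k values and weights are placeholders rather than the repository's tuned settings.

# Hypothetical hybrid (dense + sparse) retriever construction.
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma

def build_hybrid_retriever(chroma: Chroma, texts: list[str]) -> EnsembleRetriever:
    vector_retriever = chroma.as_retriever(search_kwargs={"k": 8})
    bm25_retriever = BM25Retriever.from_texts(texts)
    bm25_retriever.k = 8
    # Blend dense (Chroma) and sparse (BM25) results with fixed weights.
    return EnsembleRetriever(
        retrievers=[vector_retriever, bm25_retriever],
        weights=[0.6, 0.4],
    )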

Release Engineering Upgrade

Kubernetes overlays now support rolling, canary, and blue-green release workflows for AWS and OCI.

  • Argo Rollouts integration
  • Preview ingress for blue-green
  • Promote/abort runbooks

Unified Script Tooling

End-to-end automation scripts now cover setup, run, test, smoke, format, docker lifecycle, and deploy wrappers.

  • scripts/system.sh entrypoint
  • Health and chat smoke commands
  • Repo-wide format command

Current System Architecture

The diagrams below reflect the updated, production-grade platform architecture.

Service Topology

graph TB
    User[End User]

    subgraph Client
      FE[frontend\nReact + Vite + NGINX]
    end

    subgraph App
      RAG[rag-app\nFlask + Socket.IO]
      BE[backend\nExpress + Swagger]
    end

    subgraph Data
      M[(MongoDB)]
      C[(chroma_db)]
      U[(uploads)]
      L[(logs)]
    end

    User --> FE
    FE --> RAG
    RAG --> BE
    BE --> M
    RAG --> C
    RAG --> U
    RAG --> L
            

Chat Processing And Tool-Chaining Flow

sequenceDiagram
    autonumber
    participant User
    participant FE as Frontend
    participant API as RAG API
    participant ENG as RAG Engine
    participant AG as Agentic Orchestrator
    participant BE as Backend API

    User->>FE: submit query
    FE->>API: POST /api/chat or socket event
    API->>ENG: retrieve_documents(strategy)
    ENG->>AG: plan + execute backend tool calls
    AG->>BE: /api/team, /api/investments, /api/sectors...
    BE-->>AG: structured domain payloads
    AG-->>ENG: api_data + api_chain_trace
    ENG-->>API: response + sources + metadata
    API-->>FE: structured chat payload
    FE-->>User: rendered answer + citations + trace
            
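For reference, the structured chat payload handed back to the frontend can be pictured roughly as follows; field names beyond response, sources, and api_chain_trace (which appear in the flow above) are assumptions about the schema.

# Illustrative shape of the structured chat payload (not the exact schema).
chat_response = {
    "response": "The portfolio currently holds ...",
    "sources": [
        {"document": "portfolio_q3.pdf", "page": 4, "score": 0.81},
    ],
    "metadata": {
        "strategy": "hybrid",
        "latency_ms": 1840,
        "request_id": "c0ffee-...",
    },
    "api_chain_trace": [
        {"endpoint": "/api/investments", "status": 200},
        {"endpoint": "/api/sectors", "status": 200},
    ],
}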

Retrieval Strategy Router

flowchart TD
    Q[Incoming query] --> S{Selected strategy}
    S -->|semantic| A[Vector retrieval]
    S -->|hybrid| B[Vector + BM25 ensemble]
    S -->|multi_query| C[Generate query variants]
    S -->|decomposed| D[Split into sub-queries]

    C --> B
    D --> B

    A --> R{Reranking enabled?}
    B --> R

    R -->|yes| X[Cross-encoder rerank]
    R -->|no| Y[Use retrieval scores]
    X --> G[LLM response generation]
    Y --> G
            
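A compact Python sketch of this router, assuming the ensemble retriever from the previous section and sentence-transformers for the cross-encoder; generate_variants and decompose are hypothetical helpers standing in for the LLM-driven query expansion and decomposition steps.

# Hypothetical strategy dispatch with optional cross-encoder reranking.
from sentence_transformers import CrossEncoder

def generate_variants(query: str) -> list[str]:
    # Hypothetical helper: the real engine uses an LLM to paraphrase the query.
    return [query]

def decompose(query: str) -> list[str]:
    # Hypothetical helper: the real engine uses an LLM to split the question.
    return [query]

def retrieve(query: str, strategy: str, retrievers: dict, rerank: bool = False):
    if strategy == "semantic":
        docs = retrievers["vector"].invoke(query)
    elif strategy == "hybrid":
        docs = retrievers["ensemble"].invoke(query)
    elif strategy == "multi_query":
        docs = [d for q in generate_variants(query)
                for d in retrievers["ensemble"].invoke(q)]
    elif strategy == "decomposed":
        docs = [d for q in decompose(query)
                for d in retrievers["ensemble"].invoke(q)]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    if rerank:
        # Cross-encoder rescoring of (query, passage) pairs.
        scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
        scores = scorer.predict([(query, d.page_content) for d in docs])
        docs = [d for _, d in sorted(zip(scores, docs),
                                     key=lambda pair: pair[0], reverse=True)]
    return docs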

Platform Stack

Core technologies currently used in this repository.

Frontend

React 18, TypeScript, MUI, Axios, Socket.IO Client, Vite, NGINX

RAG Runtime

Flask, Flask-SocketIO, LangChain, Ollama, Hugging Face, ChromaDB, FAISS, BM25, PyTorch

Backend APIs

Node.js, Express, MongoDB, Swagger, TypeScript

Infra & Delivery

Docker, Kubernetes, Argo Rollouts, Terraform, AWS (EKS/ECR), OCI (OKE)

Quick Start Paths

Use the scripts path first for consistency, then fall back to Docker Compose or manual setup as needed.

Scripts Path

1. Setup dependencies

scripts/system.sh setup

2. Start local services

scripts/system.sh dev-up --setup
scripts/system.sh dev-status
scripts/system.sh dev-logs all -f

3. Validate platform

scripts/system.sh health
scripts/system.sh smoke

4. Format and test

scripts/system.sh format
scripts/system.sh test

Docker Compose Path

1. Bring up stack

docker compose up -d
# or
scripts/system.sh docker-up

2. Verify service endpoints

  • Frontend: http://localhost:3000
  • RAG API: http://localhost:5000
  • Backend Docs: http://localhost:3456/docs

3. Run smoke checks

scripts/system.sh health
scripts/system.sh smoke

Manual Path

1. Backend

cd backend
cp .env.example .env
npm install
npm run dev

2. RAG app

cd ..
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python run.py

3. Frontend

cd frontend
npm install
npm run dev

API Surface

Unified API layer for chat, sessions, tools, uploads, and backend domain data.

RAG API (Port 5000)

Core routes: /api/chat, /api/chat/completions, /api/session, /api/sessions, /api/upload, /api/strategies, /api/system/info, /api/tools, plus probes /health, /livez, /readyz.

Chat request (REST)

curl -s -X POST http://localhost:5000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize current portfolio opportunities and risks",
    "strategy": "hybrid"
  }'

OpenAI-compatible request

curl -s -X POST http://localhost:5000/api/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role":"user","content":"Give a portfolio health summary"}],
    "strategy": "hybrid"
  }'
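
Because the route is OpenAI-compatible, the official openai Python client can be pointed at it; the base_url, dummy api_key, and the extra_body strategy field below are assumptions about how the local server is exposed.

# Sketch: reuse the openai SDK against the local OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/api", api_key="not-needed")

completion = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Give a portfolio health summary"}],
    extra_body={"strategy": "hybrid"},  # assumed pass-through of the custom field
)
print(completion.choices[0].message.content)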

Backend API (Port 3456)

Domain routes cover team, investments, sectors, consultations, a documents ZIP export, and scrape-simulation endpoints consumed by agentic tool calls.

Get demo auth token

curl -s http://localhost:3456/auth/token

Protected team profile lookup

TOKEN="psJN7z3J9q"
curl -s "http://localhost:3456/api/team?name=John%20Doe" \
  -H "Authorization: Bearer ${TOKEN}"

API docs

open http://localhost:3456/docs
# Unified contract in repo root:
# openapi.yaml

Deployment And Operations

Deployment overlays, progressive delivery control, and cloud IaC are now first-class.

Progressive Release Control Flow

flowchart TD
    Change[Release candidate] --> Build[Build and push immutable images]
    Build --> Apply[rollout.sh apply strategy overlay]
    Apply --> Observe[rollout status + probes + metrics]
    Observe --> Smoke[smoke-test.sh]
    Smoke --> Decision{Healthy?}
    Decision -->|Yes| Promote[rollout promote]
    Decision -->|No| Abort[rollout abort + rollback]
            

Rolling

Simple, steady replacement for low-risk releases.

deploy/k8s/overlays/aws + oci

Canary

Stepwise weighted exposure with pause/promote/abort controls.

Argo Rollouts

Blue-Green

Preview stack validation before explicit traffic cutover.

preview ingress + manual promotion

Cloud IaC

AWS (EKS/ECR/VPC) and OCI (OKE/VCN) Terraform modules included.

infra/terraform

Operations Quick Commands

Use release scripts directly or through scripts/system.sh wrappers.

# rolling deploy (aws)
deploy/scripts/rollout.sh rolling aws apply

deploy/scripts/rollout.sh rolling aws status

deploy/scripts/smoke-test.sh https://rag.example.com

# canary control
deploy/scripts/rollout.sh canary aws promote frontend
deploy/scripts/rollout.sh canary aws abort frontend

Documentation Hub

Everything below reflects the upgraded, production-grade state of the project.