This page documents the current architecture and operations model for the upgraded project: a full-stack, deploy-ready, agentic RAG platform with real-time chat, API tool-chaining, progressive delivery, and infrastructure-as-code.
The RAG system now includes several key features that allow it to operate in a production-grade environment.
Query understanding now drives backend tool selection and follow-up API chaining, with traceability exposed in UI and responses.
Modern React + MUI chat app with session controls, stream fallback, citation cards, and runtime telemetry.
Flask API factory with validation, health/readiness probes, optional gateway auth, request IDs, and API rate limiting.
Semantic, hybrid, multi-query, and decomposed strategies with cross-encoder reranking and evidence-backed generation.
Kubernetes overlays now support rolling, canary, and blue-green release workflows for AWS and OCI.
End-to-end automation scripts now cover setup, run, test, smoke, format, docker lifecycle, and deploy wrappers.
Updated architecture reflecting the professionalized platform.
graph TB
User[End User]
subgraph Client
FE[frontend\nReact + Vite + NGINX]
end
subgraph App
RAG[rag-app\nFlask + Socket.IO]
BE[backend\nExpress + Swagger]
end
subgraph Data
M[(MongoDB)]
C[(chroma_db)]
U[(uploads)]
L[(logs)]
end
User --> FE
FE --> RAG
RAG --> BE
BE --> M
RAG --> C
RAG --> U
RAG --> L
sequenceDiagram
autonumber
participant User
participant FE as Frontend
participant API as RAG API
participant ENG as RAG Engine
participant AG as Agentic Orchestrator
participant BE as Backend API
User->>FE: submit query
FE->>API: POST /api/chat or socket event
API->>ENG: retrieve_documents(strategy)
ENG->>AG: plan + execute backend tool calls
AG->>BE: /api/team, /api/investments, /api/sectors...
BE-->>AG: structured domain payloads
AG-->>ENG: api_data + api_chain_trace
ENG-->>API: response + sources + metadata
API-->>FE: structured chat payload
FE-->>User: rendered answer + citations + trace
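The planning and chaining step in the sequence above can be illustrated with a small sketch. This is not the repository's orchestrator: the keyword routing table, `plan_tools`, `run_chain`, and the injected `fetch` callable are hypothetical names chosen for illustration. Only the endpoint paths and the `api_data` / `api_chain_trace` keys come from the diagram.

```python
from typing import Any, Callable, Dict, List

# Hypothetical keyword-to-endpoint routing table. Only the endpoint paths
# appear in the diagram; the keywords are illustrative assumptions.
TOOL_ROUTES: Dict[str, str] = {
    "team": "/api/team",
    "investment": "/api/investments",
    "sector": "/api/sectors",
}


def plan_tools(query: str) -> List[str]:
    """Return the backend endpoints whose keyword appears in the query."""
    lowered = query.lower()
    return [path for keyword, path in TOOL_ROUTES.items() if keyword in lowered]


def run_chain(query: str, fetch: Callable[[str], Any]) -> Dict[str, Any]:
    """Call each planned endpoint in order and record a per-step trace.

    `fetch` is injected so the sketch stays independent of any HTTP client.
    """
    api_data: Dict[str, Any] = {}
    trace: List[Dict[str, Any]] = []
    for step, path in enumerate(plan_tools(query), start=1):
        try:
            api_data[path] = fetch(path)
            trace.append({"step": step, "endpoint": path, "status": "ok"})
        except Exception as exc:  # keep going so the trace shows every attempt
            trace.append(
                {"step": step, "endpoint": path, "status": "error", "detail": str(exc)}
            )
    return {"api_data": api_data, "api_chain_trace": trace}
```

Passing a stub such as `lambda path: {"path": path}` as `fetch` is enough to exercise the planner without a running backend. A real orchestrator would likely use the LLM or a classifier for planning rather than substring matching.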
flowchart TD
Q[Incoming query] --> S{Selected strategy}
S -->|semantic| A[Vector retrieval]
S -->|hybrid| B[Vector + BM25 ensemble]
S -->|multi_query| C[Generate query variants]
S -->|decomposed| D[Split into sub-queries]
C --> B
D --> B
A --> R{Reranking enabled?}
B --> R
R -->|yes| X[Cross-encoder rerank]
R -->|no| Y[Use retrieval scores]
X --> G[LLM response generation]
Y --> G
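The routing in the flowchart above can be sketched as a single dispatch function. This is a minimal illustration under stated assumptions, not the engine's actual code: `route_retrieval`, `dedupe`, and the injected callables (`vector`, `hybrid`, `expand`, `split`, `rerank`) are hypothetical stand-ins for the real retrievers and the cross-encoder. Only the four strategy names come from the diagram.

```python
from typing import Callable, Iterable, List, Optional

Doc = str
Retriever = Callable[[str], List[Doc]]


def dedupe(docs: Iterable[Doc]) -> List[Doc]:
    """Drop repeated documents while preserving first-seen order."""
    seen = set()
    unique: List[Doc] = []
    for doc in docs:
        if doc not in seen:
            seen.add(doc)
            unique.append(doc)
    return unique


def route_retrieval(
    query: str,
    strategy: str,
    vector: Retriever,
    hybrid: Retriever,
    expand: Callable[[str], List[str]],
    split: Callable[[str], List[str]],
    rerank: Optional[Callable[[str, List[Doc]], List[Doc]]] = None,
) -> List[Doc]:
    """Pick a retrieval path by strategy name, then optionally rerank."""
    if strategy == "semantic":
        docs = vector(query)
    elif strategy == "hybrid":
        docs = hybrid(query)
    elif strategy == "multi_query":
        # Query variants feed the hybrid retriever, as in the diagram.
        docs = dedupe(d for q in expand(query) for d in hybrid(q))
    elif strategy == "decomposed":
        # Sub-queries also feed the hybrid retriever.
        docs = dedupe(d for q in split(query) for d in hybrid(q))
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # With reranking disabled, the retrieval order is used as-is.
    return rerank(query, docs) if rerank is not None else docs
```

Injecting the retrievers keeps the routing testable with plain lambdas, and leaving `rerank` as `None` models the "no" branch of the reranking decision.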
Core technologies currently used in this repository.
Prefer the automation scripts for consistency; fall back to Docker Compose or a manual setup as needed.
# Scripts (recommended)
scripts/system.sh setup
scripts/system.sh dev-up --setup
scripts/system.sh dev-status
scripts/system.sh dev-logs all -f
scripts/system.sh health
scripts/system.sh smoke
scripts/system.sh format
scripts/system.sh test
# Docker Compose
docker compose up -d
# or
scripts/system.sh docker-up
scripts/system.sh health
scripts/system.sh smoke
# Manual: backend
cd backend
cp .env.example .env
npm install
npm run dev
cd ..
# Manual: RAG app
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python run.py
# Manual: frontend
cd frontend
npm install
npm run dev
Unified API layer for chat, sessions, tools, uploads, and backend domain data.
Core endpoints: /api/chat, /api/chat/completions, /api/session, /api/sessions, /api/upload, /api/strategies, /api/system/info, /api/tools, plus the probes /health, /livez, and /readyz.
curl -s -X POST http://localhost:5000/api/chat \
-H "Content-Type: application/json" \
-d '{
"query": "Summarize current portfolio opportunities and risks",
"strategy": "hybrid"
}'
curl -s -X POST http://localhost:5000/api/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama2",
"messages": [{"role":"user","content":"Give a portfolio health summary"}],
"strategy": "hybrid"
}'
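# Request a token from the backend auth endpoint
curl -s http://localhost:3456/auth/token
# Example value; replace with the token returned above
TOKEN="psJN7z3J9q"
curl -s "http://localhost:3456/api/team?name=John%20Doe" \
-H "Authorization: Bearer ${TOKEN}"
# Open the backend API docs (macOS; use xdg-open on Linux)
open http://localhost:3456/docs
# Unified contract in repo root:
# openapi.yaml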
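For scripted clients, the same calls can be prepared in Python with only the standard library. The sketch below builds the requests without sending them. The helper names `build_chat_request` and `build_team_request` are hypothetical, and the hosts, ports, paths, and payload fields are copied from the curl examples above rather than from the OpenAPI contract.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

RAG_BASE = "http://localhost:5000"      # RAG API host from the curl examples
BACKEND_BASE = "http://localhost:3456"  # backend host from the curl examples


def build_chat_request(query: str, strategy: str = "hybrid") -> Request:
    """Prepare a POST to /api/chat with a JSON body."""
    body = json.dumps({"query": query, "strategy": strategy}).encode("utf-8")
    return Request(
        f"{RAG_BASE}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_team_request(name: str, token: str) -> Request:
    """Prepare an authenticated GET to /api/team, percent-encoding the name."""
    query_string = urlencode({"name": name}, quote_via=quote)
    return Request(
        f"{BACKEND_BASE}/api/team?{query_string}",
        headers={"Authorization": f"Bearer {token}"},
    )
```

To send a prepared request, pass it to `urllib.request.urlopen(request, timeout=10)` and decode the response body with `json.load`. Keeping construction separate from sending makes the payload shape easy to check without running services.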
Deployment overlays, progressive delivery control, and cloud IaC are now first-class.
flowchart TD
Change[Release candidate] --> Build[Build and push immutable images]
Build --> Apply[rollout.sh apply strategy overlay]
Apply --> Observe[rollout status + probes + metrics]
Observe --> Smoke[smoke-test.sh]
Smoke --> Decision{Healthy?}
Decision -->|Yes| Promote[rollout promote]
Decision -->|No| Abort[rollout abort + rollback]
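The promote-or-abort gate at the end of this flow can be expressed as a small pure function. This is a sketch under assumptions: the `Observation` fields and the `decide` and `next_command` helpers are hypothetical, and the command layout mirrors the `deploy/scripts/rollout.sh <strategy> <cloud> <action> <service>` pattern used in this page's canary examples.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """Signals gathered after applying an overlay (hypothetical field names)."""

    rollout_healthy: bool  # rollout status reports healthy
    probes_ok: bool        # health and readiness probes respond
    smoke_passed: bool     # smoke-test.sh exited successfully


def decide(obs: Observation) -> str:
    """Promote only when every signal is green; otherwise abort."""
    healthy = obs.rollout_healthy and obs.probes_ok and obs.smoke_passed
    return "promote" if healthy else "abort"


def next_command(strategy: str, cloud: str, service: str, obs: Observation) -> List[str]:
    """Build the rollout.sh argument list for the chosen action."""
    return ["deploy/scripts/rollout.sh", strategy, cloud, decide(obs), service]
```

The returned list can be passed to `subprocess.run(..., check=True)` by a release wrapper. Requiring all signals to pass is a conservative choice; a real gate might also weigh metrics thresholds.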
Simple, steady replacement for low-risk releases.
deploy/k8s/overlays/aws + oci
Stepwise weighted exposure with pause/promote/abort controls.
Argo Rollouts
Preview stack validation before explicit traffic cutover.
preview ingress + manual promotion
AWS (EKS/ECR/VPC) and OCI (OKE/VCN) Terraform modules included.
infra/terraform
Use release scripts directly or through scripts/system.sh wrappers.
# rolling deploy (aws)
deploy/scripts/rollout.sh rolling aws apply
deploy/scripts/rollout.sh rolling aws status
deploy/scripts/smoke-test.sh https://rag.example.com
# canary control
deploy/scripts/rollout.sh canary aws promote frontend
deploy/scripts/rollout.sh canary aws abort frontend
Everything below is aligned to the upgraded production project state.
Comprehensive platform overview, stack badges, lifecycle, APIs, and governance.
Deep system internals, state strategy, request flow, and release control plane.
Deterministic setup and validation flows for scripts, compose, and manual mode.
Transport fallback behavior, UI contracts, production build and NGINX runtime.
Route catalog, auth model, data schemas, container runtime, and hardening guidance.
Retrieval internals, orchestration logic, API contract, and runtime configuration matrix.
Kubernetes overlays, runbooks, and promotion workflow for production environments.
Unified automation commands including setup, dev, health, smoke, format, and deploy wrappers.
Notebook curriculum and reference assets for RAG/ML/NLP learning paths.
Repository and issue workflow