Welcome to DocuThinker! This is a full-stack application that combines an AI-powered document processing backend, blue/green and canary deployments on AWS infrastructure, and a React-based frontend. The app lets users upload documents for summarization, generate key insights, chat with an AI, and do even more with the document’s content.
The DocuThinker app is designed to provide users with a simple, AI-powered document management tool. Users can upload a wide range of file types — PDFs, Word documents, Markdown, HTML, CSV/TSV, JSON, plain text, and dozens of code/config formats — and receive summaries, key insights, and discussion points. Additionally, users can chat with an AI using the document’s content for further clarification.
DocuThinker is built on the FERN-Stack architecture — Firebase, Express, React, and Node.js. The backend is a Node.js + Express API that uses Firebase Admin for authentication and Cloud Firestore for metadata, Supabase Storage for original files and offloaded document content, Google Gemini for all AI features, and Redis for caching. The frontend is built with React 18 and Material-UI, providing a responsive and user-friendly interface. Original uploaded files (PDF/DOCX) are streamed directly from the browser to a private Supabase bucket using a backend-minted signed upload URL, so large files bypass the serverless request-body limit entirely. See the Storage & Data Model section below for the full picture.
graph LR
U[Client's Browser] -->|HTTPS| N[NGINX - SSL, Routing, Caching]
N -->|static calls| A[React Frontend]
N -->|/api/* proxy| B[Express Backend]
A -->|REST API calls| N
A -->|"direct file upload (signed URL)"| SB[Supabase Storage]
B --> C[Firebase Auth]
B --> D[Firestore - metadata + subcollections]
B --> SB
B --> F[Redis Cache]
B --> G[Google Gemini AI]
B --> P[Passkeys / WebAuthn]
A --> H[Material-UI]
A --> I[React Router]
A --> SB
Feel free to explore the app, upload documents, and interact with the AI! For architecture details, setup instructions, and more, please refer to the sections below, as well as the ARCHITECTURE.md file.
[!TIP] Access the live app at https://docuthinker.vercel.app/ by clicking on the link or copying it into your browser! 🚀
We have deployed the entire app on Vercel and AWS. You can access the live app here.
aws/ directory. It’s a one-click deployment using AWS Fargate.

[!IMPORTANT] The backend server may take a few seconds to wake up if it has been inactive for a while. The first API call may take a bit longer to respond. Subsequent calls should be faster as the server warms up.
DocuThinker offers a wide range of features to help users manage and analyze their documents effectively. Here are some of the key features of the app:
.docx), Markdown (.md/.markdown), HTML (.html/.htm), CSV/TSV, JSON, plain text (.txt/.log), and a broad set of code/config files (.xml .yaml/.yml .js/.jsx/.mjs/.ts/.tsx .py .java .c/.cpp .h .cs .go .rs .rb .php .sql .sh .css/.scss/.less .ini/.toml/.conf/.env .kt .swift .r .lua .pl) for AI-generated summaries. Text extraction happens client-side in the browser (see Multi-Format Upload & Extraction), and the AI always receives clean, plain text. Original files are uploaded directly from the browser to a private Supabase bucket via a signed upload URL, so large files bypass the serverless request-body limit.

<iframe> of the signed URL (true pages, zoom, scroll), DOCX as mammoth HTML, Markdown via react-markdown, HTML sanitized with DOMPurify, CSV/TSV as a table, JSON/code as monospace, and plain text as pre-wrapped text. Works for live uploads and re-opened history alike.

Drag-Resizable Result Columns: The Original | Summary panes on the results view are split by a draggable divider (double-click to reset), with a full-screen drag overlay that keeps resizing smooth even over the PDF iframe.

localStorage so the meter loads instantly on return visits.

DocuThinker accepts far more than PDF and Word. Text extraction runs entirely client-side in the UploadModal component before anything leaves the browser, so the AI always receives clean, plain text while the viewer keeps a rich representation for display.
| Format | Extracted for the AI | Rendered in the viewer |
|---|---|---|
| PDF (.pdf) | Text via pdf.js with line/paragraph reconstruction | Native <iframe> of the signed Supabase URL |
| Word (.docx) | Plain text via mammoth (extractRawText) | Structured HTML via mammoth.convertToHtml |
| Markdown (.md, .markdown) | Raw Markdown | Rendered with react-markdown |
| HTML (.html, .htm) | Tags stripped to plain text | Raw HTML sanitized with DOMPurify |
| CSV / TSV (.csv, .tsv) | Parsed rows | Parsed into an HTML table |
| JSON (.json) | Pretty-printed text | Pretty-printed monospace block |
| Code / config (.xml .yaml/.yml .js/.jsx/.mjs .ts/.tsx .py .java .c/.cpp .h .cs .go .rs .rb .php .sql .sh .css/.scss/.less .ini/.toml/.conf/.env .kt .swift .r .lua .pl) | File contents as text | Monospace code block |
| Plain text (.txt, .log) | File contents as text | Pre-wrapped text |
The same extracted { originalText, originalHtml } pair powers both the AI summary and the Rich Original-Document Viewer, and is offloaded to Supabase Storage as described below.
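The format table above amounts to a lookup from file extension to extraction strategy. The sketch below shows one way that dispatch could look; `pickExtractor`, `EXTRACTORS`, `CODE_EXTS`, and the returned labels are hypothetical names chosen for illustration and are not taken from the UploadModal source.

```javascript
// Hypothetical lookup tables built from the extensions listed in the table above.
const EXTRACTORS = {
  pdf: ["pdf"],
  docx: ["docx"],
  markdown: ["md", "markdown"],
  html: ["html", "htm"],
  table: ["csv", "tsv"],
  json: ["json"],
  text: ["txt", "log"],
};

const CODE_EXTS = new Set([
  "xml", "yaml", "yml", "js", "jsx", "mjs", "ts", "tsx", "py", "java",
  "c", "cpp", "h", "cs", "go", "rs", "rb", "php", "sql", "sh",
  "css", "scss", "less", "ini", "toml", "conf", "env", "kt", "swift",
  "r", "lua", "pl",
]);

// Returns a label naming the extraction strategy for a file name.
function pickExtractor(fileName) {
  const dot = fileName.lastIndexOf(".");
  // No extension, or a trailing dot: nothing to dispatch on.
  if (dot < 0 || dot === fileName.length - 1) return "unsupported";
  const ext = fileName.slice(dot + 1).toLowerCase();
  for (const [kind, exts] of Object.entries(EXTRACTORS)) {
    if (exts.includes(ext)) return kind;
  }
  return CODE_EXTS.has(ext) ? "code" : "unsupported";
}
```

Matching on the lowercased extension keeps the lookup independent of how the browser reports the MIME type, which can be empty or generic for code and config files.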
DocuThinker is built with 120+ technologies spanning frontend, backend, AI/ML, mobile, infrastructure, and DevOps. Below is the complete technology stack.
@supabase/supabase-js): Direct browser-to-bucket upload of original files via signed upload URL.

documents subcollection.

@supabase/supabase-js): Private bucket for original files and offloaded content JSON; signed upload/download URLs minted server-side with the service_role key.

@google/generative-ai): Google Gemini integration with a dynamic model list plus rotation/fallback across models to absorb 429/503 errors. The document title and today’s real date are injected into AI prompts as context, and the summary prompt now produces easy-to-read, model-decided formatting rather than forced bullets/paragraphs.

@simplewebauthn/server): Passkeys / WebAuthn registration and authentication ceremonies.

/graphql).

/api-docs).

documents subcollection (one record per document).

{ originalText, originalHtml } content JSON; accessed via signed URLs.

coralogix/coralogix provider.

For a comprehensive deep-dive into the AI/ML architecture with visual diagrams, see AI_ML.md.
DocuThinker features a clean and intuitive user interface designed to provide a seamless experience for users. The app supports both light and dark themes, responsive design, and easy navigation. Here are some screenshots of the app:
Real signed-in captures from the React Native (Expo SDK 51) build, iPhone 16 Pro / iOS 18.5 on the top row and Pixel 6 / API 34 on the bottom. Full walkthrough lives in MOBILE_APPS.md, and the mobile-only deep dive is at mobile-app/README.md.
[Mobile screenshots, iOS on the top row and Android on the bottom row: Login, Register, Forgot password; Home, Library, Profile; Upload, Summary, Chat; Account, Appearance, Connections.]
The DocuThinker app is organized into separate subdirectories for the frontend, backend, and mobile app. Each directory contains the necessary files and folders for the respective components of the app. Here is the complete file structure of the app:
DocuThinker-AI-App/
├── .beads/ # Beads task coordination system
│ ├── .status.json # Agent reservations & active bead tracking
│ ├── README.md # Beads workflow quick-reference
│ ├── active/ # Beads available for agents to pick up
│ ├── completed/ # Archive of finished beads
│ └── templates/
│ └── feature-bead.md # Template for new feature beads
├── .agent-sessions/ # Agent session history & coordination
│ ├── README.md # Session management guide
│ ├── SCHEMA.md # Session data structure specification
│ ├── config.json # Session configuration
│ ├── active/ # Sessions currently in progress
│ ├── completed/ # Archived finished sessions
│ └── templates/
│ ├── session-log.md # Standard session log template
│ ├── handoff-report.md # Agent-to-agent handoff template
│ └── escalation-report.md # Conflict / blocker escalation template
├── .claude/ # Claude Code workspace settings
├── .mcp.json # MCP server configuration
├── AGENTS.md # Agent behavior instructions
├── CLAUDE.md # Claude Code project instructions
├── ai_ml/ # AI/ML pipelines & services directory (Python)
├── orchestrator/ # Agentic orchestration layer (Node.js)
│ ├── core/
│ │ ├── supervisor.js # Intent classification, decomposition, dispatch
│ │ ├── circuit-breaker.js # Per-provider circuit breaker state machine
│ │ ├── agent-loop.js # Iterative tool-use agent loop
│ │ ├── handoff.js # Cross-agent context transfer
│ │ ├── batch-processor.js # Concurrent batch document processing
│ │ ├── cost-tracker.js # Token cost tracking with budget limits
│ │ ├── dlq.js # Dead letter queue with retry logic
│ │ ├── python-bridge.js # HTTP bridge to Python AI/ML service
│ │ ├── providers.js # Unified LLM client (Claude + Gemini)
│ │ └── tool-registry.js # Tool registration and dispatch
│ ├── context/
│ │ ├── token-budget.js # Context window management
│ │ ├── conversation-store.js # Auto-summarizing conversation memory
│ │ ├── observability.js # OTel-compatible context metrics
│ │ └── hybrid-rag.js # Keyword + semantic search with RRF
│ ├── prompts/
│ │ ├── system-prompts.js # 14 versioned system prompts
│ │ └── cache-strategy.js # 3-layer Anthropic prompt caching
│ ├── schemas/
│ │ └── ai-outputs.js # 12 Zod validation schemas
│ ├── mcp/
│ │ ├── server.js # MCP server exposing 13 tools
│ │ └── client.js # MCP client for external servers
│ ├── __tests__/
│ │ └── orchestrator.test.js # Integration tests (Jest)
│ ├── Dockerfile # Production container (node:20-alpine)
│ ├── package.json # Dependencies and scripts
│ └── index.js # Express server entry point (port 4000)
│
├── backend/
│ ├── middleware/
│ │ └── jwt.js # Authentication middleware with JWT for the app's backend
│ ├── controllers/
│ │ ├── controllers.js # Controls the flow of data and logic
│ │ └── passkeyController.js # Passkey (WebAuthn) ceremony + endpoints
│ ├── graphql/
│ │ ├── resolvers.js # Resolvers for querying data from the database
│ │ └── schema.js # GraphQL schema for querying data from the database
│ ├── models/
│ │ ├── models.js # Data models for interacting with the database
│ │ └── passkeyModel.js # Firestore access for passkeys & challenges
│ ├── services/
│ │ └── services.js # Models for interacting with database and AI/ML services
│ ├── views/
│ │ └── views.js # Output formatting for success and error responses
│ ├── redis/
│ │ └── redisClient.js # Redis client for caching data in-memory
│ ├── swagger/
│ │ └── swagger.js # Swagger documentation for API endpoints
│ ├── .env # Environment variables (git-ignored)
│ ├── firebase-admin-sdk.json # Firebase Admin SDK credentials (git-ignored)
│ ├── index.js # Main entry point for the server
│ ├── Dockerfile # Docker configuration file
│ ├── manage_server.sh # Shell script to manage and start the backend server
│ └── README.md # Backend README file
│
├── frontend/
│ ├── public/
│ │ ├── index.html # Main HTML template
│ │ └── manifest.json # Manifest for PWA settings
│ ├── src/
│ │ ├── assets/ # Static assets like images and fonts
│ │ │ └── logo.png # App logo or images
│ │ ├── components/
│ │ │ ├── ChatModal.js # Chat modal component
│ │ │ ├── Spinner.js # Loading spinner component
│ │ │ ├── UploadModal.js # Upload modal (direct Supabase upload via signed URL)
│ │ │ ├── DropboxFileSelectorModal.js # Dropbox import modal
│ │ │ ├── GoogleDriveFileSelectorModal.js # Google Drive Picker modal
│ │ │ ├── PasskeyPromptModal.js # Post-sign-up passkey enrollment prompt
│ │ │ ├── Navbar.js # Navigation bar component
│ │ │ ├── Footer.js # Footer component
│ │ │ ├── useErrorToast.js # Shared error-toast hook
│ │ │ └── GoogleAnalytics.js # Google Analytics integration component
│ │ ├── pages/
│ │ │ ├── Home.js # Upload + results + AI tools (resizable viewer, selection menu)
│ │ │ ├── DocumentsPage.js # Library: instant search, sort, type filter
│ │ │ ├── Profile.js # Profile / account / social-media management
│ │ │ ├── Passkeys.js # Add / rename / delete passkeys (WebAuthn)
│ │ │ ├── LandingPage.js # Welcome and information page
│ │ │ ├── Login.js # Login page (password + passkey)
│ │ │ ├── Register.js # Registration page
│ │ │ ├── ForgotPassword.js # Forgot password page
│ │ │ ├── HowToUse.js # Page explaining how to use the app
│ │ │ ├── PrivacyPolicy.js # Privacy policy page
│ │ │ ├── TermsOfService.js # Terms of service page
│ │ │ └── NotFoundPage.js # 404 page
│ │ ├── utils/
│ │ │ ├── auth.js # localStorage auth + event emitter
│ │ │ └── supabaseClient.js # Browser Supabase client (anon key) for direct uploads
│ │ ├── App.js # Main App component
│ │ ├── index.js # Entry point for the React app
│ │ ├── App.css # Global CSS 1
│ │ ├── index.css # Global CSS 2
│ │ ├── reportWebVitals.js # Web Vitals reporting
│ │ ├── styles.css # Custom styles for different components
│ │ └── config.js # Configuration file for environment variables
│ ├── .env # Environment variables file (e.g., REACT_APP_BACKEND_URL)
│ ├── package.json # Project dependencies and scripts
│ ├── craco.config.js # Craco configuration file
│ ├── Dockerfile # Docker configuration file
│ ├── manage_frontend.sh # Shell script for managing and starting the frontend
│ ├── README.md # Frontend README file
│ └── package.lock # Lock file for dependencies
│
├── mobile-app/ # React Native (Expo SDK 51) client
│ ├── app/ # File-based routes (expo-router)
│ │ ├── _layout.tsx # Root stack + auth gate (hydrate, redirect)
│ │ ├── login.tsx # Email/password → setAuth
│ │ ├── register.tsx # New account → /login
│ │ ├── upload.tsx # expo-document-picker + /upload
│ │ ├── summary.tsx # Renders /upload or /document-details
│ │ ├── chat.tsx # /chat round-trip with sessionId
│ │ └── (tabs)/ # Home, Library, Profile (bottom tabs)
│ ├── components/ # Screen primitives + UI kit (Card, Pill, …)
│ ├── constants/ # theme.ts, Colors.ts, static UI copy
│ ├── lib/
│ │ ├── auth.ts # AsyncStorage + emitter (mirrors web auth.js)
│ │ └── api.ts # fetch wrapper + endpoint map
│ ├── hooks/ # Custom hooks (useColorScheme)
│ ├── assets/ # Static assets (images, fonts)
│ ├── app.json # Expo config (scheme: docuthinker)
│ ├── babel.config.js # Babel configuration
│ ├── package.json # Project dependencies and scripts
│ └── tsconfig.json # TypeScript configuration
│
├── aws/ # AWS deployment assets (ECR/ECS/CloudFormation/CDK)
│ ├── README.md
│ ├── cloudformation/
│ │ └── fargate-service.yaml # Reference Fargate stack for backend + ai_ml services
│ ├── infrastructure/
│ │ ├── cdk-app.ts # CDK entrypoint
│ │ └── lib/docuthinker-stack.ts # CDK stack definition
│ └── scripts/
│ └── local-env.sh # Helper to mirror production env vars locally
│
├── kubernetes/ # Kubernetes configuration files
│ ├── manifests/ # Kubernetes manifests for deployment, service, and ingress
│ ├── backend-deployment.yaml # Deployment configuration for the backend
│ ├── backend-service.yaml # Service configuration for the backend
│ ├── frontend-deployment.yaml # Deployment configuration for the frontend
│ ├── frontend-service.yaml # Service configuration for the frontend
│ ├── firebase-deployment.yaml # Deployment configuration for Firebase
│ ├── firebase-service.yaml # Service configuration for Firebase
│ └── configmap.yaml # ConfigMap configuration for environment variables
│
├── nginx/
│ ├── nginx.conf # NGINX configuration file for load balancing and caching
│ └── Dockerfile # Docker configuration file for NGINX
│
├── images/ # Images for the README
├── .env # Environment variables file for the whole app
├── docker-compose.yml # Docker Compose file for containerization
├── jsconfig.json # JavaScript configuration file
├── package.json # Project dependencies and scripts
├── package-lock.json # Lock file for dependencies
├── postcss.config.js # PostCSS configuration file
├── tailwind.config.js # Tailwind CSS configuration file
├── render.yaml # Render configuration file
├── vercel.json # Vercel configuration file
├── openapi.yaml # OpenAPI specification for API documentation
├── manage_docuthinker.sh # Shell script for managing and starting the app (both frontend & backend)
├── .gitignore # Git ignore file
├── LICENSE.md # License file for the project
├── README.md # Comprehensive README for the whole app
└── (and many more files...) # Additional files and directories not listed here
Ensure you have the following tools installed:
GOOGLE_AI_API_KEY) for all AI features

.env file - but you should obtain your own API keys for production).

Additionally, basic fullstack development knowledge and AI/ML concepts are recommended to understand the app’s architecture and functionalities.
The backend and frontend each read from their own .env file (both git-ignored). The most important variables are listed below — see backend/.env and frontend/.env for the full set.
Backend (backend/.env)
| Variable | Purpose |
|---|---|
| FIREBASE_* | Firebase Admin service-account credentials (FIREBASE_PROJECT_ID, FIREBASE_PRIVATE_KEY, FIREBASE_CLIENT_EMAIL, FIREBASE_DATABASE_URL, …) for Auth + Firestore. |
| GOOGLE_AI_API_KEY | Google Gemini API key (model list, generation, audio). |
| AI_INSTRUCTIONS | Base system-prompt text prepended to every AI request. |
| SUPABASE_URL | Supabase project URL. |
| SUPABASE_SERVICE_ROLE_KEY | Server-side Supabase key for signing upload/download URLs and storing content (never exposed to the browser). |
| SUPABASE_BUCKET | Storage bucket name (defaults to docuthinker). |
| REDIS_* | Redis connection config for caching. |
Frontend (frontend/.env)
| Variable | Purpose |
|---|---|
| REACT_APP_SUPABASE_URL | Supabase project URL (browser client). |
| REACT_APP_SUPABASE_ANON_KEY | Public Supabase anon key for direct browser uploads. |
| REACT_APP_SUPABASE_BUCKET | Storage bucket name (must match the backend). |
| REACT_APP_GOOGLE_DRIVE_API_KEY | Google Drive Picker API key. |
| REACT_APP_GOOGLE_DRIVE_CLIENT_ID | Google OAuth client ID for Drive import. |
| REACT_APP_API_BASE_URL | Base URL of the backend API. |
[!IMPORTANT] The service-role Supabase key lives only on the backend; the browser only ever sees the public anon key plus one-time, path-scoped signed upload tokens. Keep SUPABASE_SERVICE_ROLE_KEY and firebase-admin-sdk.json out of source control.
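A startup check can surface missing backend variables before the first request fails. The following is a minimal sketch, not the project's actual startup code: `checkBackendEnv` and the choice of which variables count as required are assumptions, while the `docuthinker` bucket default comes from the table above.

```javascript
// Assumption: these three variables are treated as mandatory for this sketch.
const REQUIRED_BACKEND_VARS = [
  "GOOGLE_AI_API_KEY",
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
];

// Hypothetical helper: reports missing variables and resolves the bucket name.
function checkBackendEnv(env) {
  const missing = REQUIRED_BACKEND_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  return {
    missing,
    // SUPABASE_BUCKET falls back to "docuthinker" when unset.
    bucket: env.SUPABASE_BUCKET || "docuthinker",
  };
}
```

In a Node.js entry point this could be called as `checkBackendEnv(process.env)`, logging the `missing` list and exiting early when it is not empty.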
Clone the repository:
git clone https://github.com/hoangsonww/DocuThinker-AI-App.git
cd DocuThinker-AI-App
Navigate to the frontend directory:
cd frontend
Install dependencies:
npm install
Or npm install --legacy-peer-deps if you face any peer dependency issues.
npm start
Build the Frontend React app (for production):
npm run build
Alternatively, use yarn to install dependencies and run the app:
yarn install
yarn start
Or, for your convenience, if you have already installed the dependencies, you can directly run the app in the root directory using:
npm run frontend
This way, you don’t have to navigate to the frontend directory every time you want to run the app.
http://localhost:3000. You can now access it in your browser.

[!NOTE] Note that this is optional since we are deploying the backend on Render. However, you can (and should) run the backend locally for development purposes.
backend) directory:
cd backend
npm install
Or npm install --legacy-peer-deps if you face any peer dependency issues.
npm run server
http://localhost:3000. You can access the API endpoints in your browser or Postman.

backend directory. Feel free to explore the API endpoints and controllers.

[!CAUTION] Note: Be sure to use Node v.20 or earlier to avoid compatibility issues with Firebase Admin SDK.
The mobile app is a real React Native (Expo SDK 51) client that authenticates against the same backend as the web. Accounts created via the web work on mobile and vice versa.
cd mobile-app
npm ci
npx expo start
Metro listens on http://localhost:8081.
Press i to launch (and bundle for) the booted iOS Simulator.

Press a to launch (and bundle for) the booted Android AVD.

xcrun simctl openurl booted "exp://127.0.0.1:8081"
adb shell am start -a android.intent.action.VIEW -d "exp://10.0.2.2:8081" host.exp.exponent
sequenceDiagram
participant Dev as Developer
participant CLI as npx expo start
participant Metro as Metro :8081
participant iOS as iOS Sim
participant AVD as Android AVD
Dev->>CLI: start
CLI->>Metro: bundle entry.js
Dev->>CLI: press i
CLI->>iOS: install Expo Go (SDK 51) if missing
CLI->>iOS: openurl exp://127.0.0.1:8081
iOS->>Metro: fetch bundle
Metro-->>iOS: iOS bundle
Dev->>CLI: press a
CLI->>AVD: install Expo Go (SDK 51) if missing
CLI->>AVD: am start exp://10.0.2.2:8081
AVD->>Metro: fetch bundle
Metro-->>AVD: Android bundle
[!TIP] Expo Go pins one SDK runtime per device. If your device has Expo Go for a different SDK installed, expo start will prompt to reinstall the matching version. The swap is reversible: opening another project later prompts the swap back. Detailed troubleshooting lives in mobile-app/README.md.
The backend of DocuThinker provides several API endpoints for user authentication, document management, and AI-powered insights. These endpoints are used by the frontend to interact with the backend server:
| Method | Endpoint | Description |
|---|---|---|
| POST | /register | Register a new user in Firebase Authentication and Firestore, saving their email and creation date. |
| POST | /login | Log in a user and return a custom token along with the user ID. |
| POST | /passkey/register/options | Begin passkey registration; returns WebAuthn creation options + a flowId. |
| POST | /passkey/register/verify | Verify the authenticator attestation and store the new passkey credential. |
| POST | /passkey/authenticate/options | Begin passkey login (email-scoped or discoverable/usernameless); returns options + flowId. |
| POST | /passkey/authenticate/verify | Verify the assertion and return a Firebase custom token + user ID (same contract as /login). |
| GET | /passkeys/{userId} | List all passkeys registered to a user (public metadata only). |
| PATCH | /passkeys/{userId}/{credentialId} | Rename one of the user’s passkeys. |
| DELETE | /passkeys/{userId}/{credentialId} | Delete one of the user’s passkeys. |
| POST | /document-upload-url | Mint a one-time signed Supabase upload URL so the browser can upload the file bytes directly to the private bucket (bypasses the serverless body-size limit). |
| POST | /document-file | Through-backend multipart fallback upload (parsed with formidable); stores the file in the Supabase bucket. |
| POST | /upload | Summarize a document and, when userId is given, save the record to the user’s documents subcollection. Body: { userId, title, text, html, filePath, fileType }. |
| POST | /generate-key-ideas | Generate key ideas from the document text. |
| POST | /generate-discussion-points | Generate discussion points from the document text. |
| POST | /chat | Chat with AI using the original document text as context. |
| POST | /process-audio | Transcribe/answer over an uploaded audio file (voice chat) via Gemini. |
| POST | /refine-summary | Refine an existing summary using free-form instructions. |
| POST | /forgot-password | Reset a user’s password in Firebase Authentication. |
| POST | /verify-email | Verify if a user’s email exists in Firestore. |
| GET | /documents/{userId} | Retrieve all documents associated with the given userId (subcollection, merged with any legacy array). |
| GET | /documents/{userId}/{docId} | Retrieve a specific document by userId and docId. |
| GET | /document-details/{userId}/{docId} | Retrieve document details (title, original text/HTML, summary, signed fileUrl) by userId and docId. |
| GET | /search-documents/{userId} | Server-side search across the user’s documents. |
| DELETE | /documents/{userId}/{docId} | Delete a specific document and its Supabase objects by userId and docId. |
| DELETE | /documents/{userId} | Delete all documents (and their stored objects) for the given userId. |
| POST | /update-email | Update a user’s email in both Firebase Authentication and Firestore. |
| POST | /update-password | Update a user’s password in Firebase Authentication. |
| GET | /days-since-joined/{userId} | Get the number of days since the user associated with userId joined the service. |
| GET | /document-count/{userId} | Retrieve the number of documents associated with the given userId. |
| GET | /users/{userId} | Retrieve the email of a user associated with userId. |
| POST | /update-document-title | Update the title of a document in Firestore. |
| PUT | /update-theme | Update the theme of the app. |
| GET | /user-joined-date/{userId} | Get date when the user associated with userId joined the service. |
| GET | /social-media/{userId} | Get the social media links of the user associated with userId. |
| POST | /update-social-media | Update the social media links of the user associated with userId. |
| POST | /sentiment-analysis | Analyzes the sentiment of the provided document text. |
| POST | /bullet-summary | Generates a summary of the document text in bullet points. |
| POST | /summary-in-language | Generates a summary in the specified language. |
| POST | /content-rewriting | Rewrites or rephrases the provided document text based on a style. |
| POST | /actionable-recommendations | Generates actionable recommendations based on the document text. |
| GET | /graphql | GraphQL endpoint (GraphiQL enabled) for querying and mutating data. |
More API endpoints will be added in the future to enhance the functionality of the app. Feel free to explore the existing endpoints and test them using Postman or Insomnia.
[!NOTE] This list is not exhaustive. For a complete list of API endpoints, please refer to the Swagger or Redoc documentation of the backend server.
http://localhost:3000/api-docs (Swagger UI loaded from a CDN). Visiting the root / redirects there. The raw spec is served at http://localhost:3000/swagger.json.

http://localhost:3000/graphql.

For example, our API endpoints documentation looks like this:
Additionally, we also offer API file generation using OpenAPI. You can generate API files using the OpenAPI specification. Here is how:
npx openapi-generator-cli generate -i http://localhost:3000/swagger.json -g typescript-fetch -o ./api
This will generate TypeScript files for the API endpoints in the api directory. Feel free to replace or modify the command as needed.
documents subcollection, passkeys, and challenges.

controllers/ + passkeyController.js).

/document-upload-url), then calls /upload with the extracted text. So /upload itself takes JSON, not a multipart file.

curl --location --request POST 'http://localhost:3000/register' \
--header 'Content-Type: application/json' \
--data-raw '{
"email": "test@example.com",
"password": "password123"
}'
curl --location --request POST 'http://localhost:3000/upload' \
--header 'Content-Type: application/json' \
--data-raw '{
"userId": "<firebase-uid>",
"title": "My Report.pdf",
"text": "<extracted plain text>",
"html": "<optional display HTML>",
"filePath": "<path returned alongside the signed upload URL>",
"fileType": "application/pdf"
}'
[!TIP] To get filePath, first call POST /document-upload-url with { userId, fileName }, upload the bytes to the returned signedUrl, then pass the returned path as filePath above. Alternatively, send the raw file to POST /document-file (multipart) to upload through the backend.
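On the client, the last step of that flow is assembling the JSON body for POST /upload from the signed-upload response and the extracted content. The sketch below covers only that assembly step; `buildUploadBody` and the shape of its arguments are hypothetical, while the body field names follow the /upload example above.

```javascript
// Hypothetical helper: combines the pieces gathered during the upload flow
// into the JSON body that POST /upload expects.
function buildUploadBody({ userId, file, extracted, signed }) {
  return {
    userId,
    title: file.name,                     // e.g. "My Report.pdf"
    text: extracted.originalText,         // plain text sent to the AI
    html: extracted.originalHtml ?? "",   // optional display HTML
    filePath: signed.path,                // path returned by /document-upload-url
    fileType: file.type,                  // e.g. "application/pdf"
  };
}
```

The result can be passed to `JSON.stringify` and sent with a `Content-Type: application/json` header, mirroring the curl example above.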
The backend API uses centralized error handling to capture and log errors. Failed requests return an appropriate status code and an error message:
{
"error": "An internal error occurred",
"details": "Error details go here"
}
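A centralized handler of this kind is usually written as Express error-handling middleware, which Express recognizes by its four-argument signature. The sketch below is an illustration rather than the backend's actual handler: the `err.status` convention and the "Request failed" wording for non-500 errors are assumptions, while the response shape matches the JSON above.

```javascript
// Express identifies error-handling middleware by its four parameters.
function errorHandler(err, req, res, next) {
  // Assumption: errors may carry an HTTP status; anything else becomes a 500.
  const hasStatus =
    Number.isInteger(err.status) && err.status >= 400 && err.status <= 599;
  const status = hasStatus ? err.status : 500;

  // Log the failure with enough context to trace the request.
  console.error(`[${req.method} ${req.originalUrl}] ${err.message}`);

  res.status(status).json({
    // "Request failed" is hypothetical wording for non-500 errors.
    error: status === 500 ? "An internal error occurred" : "Request failed",
    details: err.message,
  });
}
```

In an Express app, a handler like this would be registered after all routes with `app.use(errorHandler)` so that errors passed to `next(err)` end up here.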
DocuThinker employs a two-layer agentic architecture that separates orchestration concerns (Node.js) from AI/ML execution (Python), connected by a resilient bridge with circuit breakers, cost controls, and full observability.
| Layer | Technology | Port | Responsibility |
|---|---|---|---|
| Orchestrator | Node.js 18+ / Express | 4000 | Supervisor routing, agent loops, tool dispatch, cost tracking, MCP |
| AI/ML Backend | Python / FastAPI | 8000 | LLM inference, RAG pipelines, NER, CrewAI multi-agent, vector/graph stores |
graph TB
subgraph "Clients"
WEB[React Frontend]
EXT[External Agents / MCP]
end
subgraph "Orchestrator :4000"
SUP[Supervisor<br/>classify / decompose / dispatch]
AL[Agent Loop<br/>tool-use cycle up to 10 iters]
CB[Circuit Breaker<br/>CLOSED / OPEN / HALF_OPEN]
CT[Cost Tracker<br/>daily + monthly budgets]
BP[Batch Processor<br/>concurrent doc processing]
DLQ[Dead Letter Queue<br/>retry + DLQ]
HO[Handoff Manager<br/>cross-agent context transfer]
TR[Tool Registry<br/>local + Python-bridge tools]
TB[Token Budget Manager<br/>context window guard]
CS[Conversation Store<br/>auto-summarizing history]
OBS[Context Observability<br/>OTel-compatible metrics]
PC[Prompt Cache Strategy<br/>3-layer Anthropic caching]
MCP_S[MCP Server<br/>13 tools over stdio]
MCP_C[MCP Client<br/>connect to external servers]
end
subgraph "AI/ML Backend :8000"
PY_SVC[DocumentIntelligenceService]
RAG[Agentic RAG Pipeline]
CREW[CrewAI Multi-Agent]
NLP[SpaCy NER / Sentiment]
VEC[ChromaDB Vectors]
KG[Neo4j Knowledge Graph]
end
subgraph "LLM Providers"
CLAUDE[Anthropic Claude]
GEMINI[Google Gemini]
end
WEB -->|REST| SUP
EXT -->|MCP stdio| MCP_S
SUP --> AL
SUP --> BP
AL --> TR
TR -->|Python Bridge| PY_SVC
AL --> CB
CB --> CLAUDE
CB --> GEMINI
CT -.->|budget check| SUP
TB -.->|token check| SUP
DLQ -.->|retry| SUP
HO -.->|context| AL
CS -.->|history| AL
OBS -.->|metrics| CT
PC -.->|cache hints| AL
PY_SVC --> RAG
PY_SVC --> CREW
PY_SVC --> NLP
RAG --> VEC
RAG --> KG
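The Circuit Breaker node in the diagram names three states: CLOSED, OPEN, and HALF_OPEN. The sketch below shows how such a per-provider state machine is commonly built; it is not the code in circuit-breaker.js, and the `failureThreshold`, `cooldownMs`, and injectable `now` clock are assumed parameters chosen for illustration.

```javascript
// Minimal circuit breaker sketch; thresholds and timings are assumptions.
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, which makes the breaker easy to test
    this.state = "CLOSED";
    this.failures = 0;
    this.openedAt = 0;
  }

  // Returns true when a request may be sent to the provider.
  canRequest() {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "HALF_OPEN"; // cooldown elapsed: let trial requests through
    }
    return this.state !== "OPEN";
  }

  // A success closes the circuit and clears the failure count.
  recordSuccess() {
    this.failures = 0;
    this.state = "CLOSED";
  }

  // A failed trial, or too many consecutive failures, opens the circuit.
  recordFailure() {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `canRequest()` before contacting a provider and fall back to another provider while the circuit is OPEN, which is one way to read the CB → CLAUDE / CB → GEMINI edges in the diagram.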
The orchestrator (orchestrator/) is a standalone Node.js service providing:
- … `maxIterations` (default 10), calling tools via the Tool Registry and feeding results back until the LLM produces a final response.
- … `maxRetries` (default 3) before moving to the DLQ for manual inspection.
- … `analyze_document_text`) and Python-bridged tools (e.g., `extract_entities`, `rag_search`, `vector_search`, `knowledge_graph_query`, `python_sentiment`). Tools are exposed to the Agent Loop in Anthropic tool-use format.
- … `userId:documentId`. Auto-summarizes history when messages exceed 20, evicts LRU conversations beyond 10,000, and builds context-injected message arrays with document context and summaries.
- … `cache_control: ephemeral` on system prompts, document context, and conversation history.
- … (`orchestrator/mcp/server.js`) – Exposes 13 tools over stdio transport: `document_summarize`, `document_key_ideas`, `document_sentiment`, `document_discussion_points`, `document_analytics`, `document_bullet_summary`, `document_rewrite`, `document_recommendations`, `document_chat`, `system_health`, `system_costs`, `rag_query`, `knowledge_graph_query`.
- … (`orchestrator/mcp/client.js`) – Connects to external MCP servers via stdio transport, enabling the orchestrator to consume tools from other agents.

| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/health` | System health with circuit breaker, cost, cache, DLQ, and provider status |
| `GET` | `/api/costs` | Cost usage report by provider and intent |
| `GET` | `/api/circuits` | Circuit breaker state for all providers |
| `GET` | `/api/context-metrics` | Context utilization and cache hit rate metrics |
| `GET` | `/api/dlq` | Dead letter queue stats and recent messages |
| `GET` | `/api/tools` | Registered tool definitions and count |
| `POST` | `/api/tools/execute` | Execute a registered tool by name |
| `POST` | `/api/token-check` | Check token budget for a given model/prompt/messages |
| `POST` | `/api/supervisor/process` | Route a request through the supervisor pipeline |
| `POST` | `/api/agent/run` | Run the agentic tool-use loop with a message and context |
| `POST` | `/api/batch/process` | Batch process multiple documents (summarize, keyIdeas, sentiment) |
| `POST` | `/api/conversations/:userId/:documentId/message` | Add a message to a conversation |
| `GET` | `/api/conversations/:userId/:documentId` | Retrieve conversation history |
| `DELETE` | `/api/conversations/:userId/:documentId` | Clear a conversation |
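To make the bounded tool-use loop from the component list more concrete, here is a minimal sketch. It is not the orchestrator's actual code: `callModel`, the `tools` map, and the `{ type: "final" | "tool_use" }` reply shape are hypothetical stand-ins, and only the `maxIterations` default of 10 comes from the description above.

```javascript
// Sketch of a bounded agent loop. `callModel` and `tools` are hypothetical
// stand-ins for the LLM client and the Tool Registry.
async function runAgentLoop({ callModel, tools, message, maxIterations = 10 }) {
  const messages = [{ role: "user", content: message }];

  for (let i = 0; i < maxIterations; i++) {
    const reply = await callModel(messages);

    // A final answer ends the loop.
    if (reply.type === "final") {
      return { text: reply.text, iterations: i + 1 };
    }

    // Otherwise run the requested tool and feed the result back to the model.
    const tool = tools[reply.name];
    const result = tool
      ? await tool(reply.input)
      : { error: `Unknown tool: ${reply.name}` };
    messages.push({ role: "assistant", content: reply });
    messages.push({ role: "tool", name: reply.name, content: result });
  }

  throw new Error(`No final response after ${maxIterations} iterations`);
}
```

The iteration cap is what keeps a model that keeps requesting tools from looping forever; the sketch raises an error at the cap, which is one possible policy among several.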
[!TIP] Visit the `orchestrator/README.md` for full API request/response examples and the `ai_ml/README.md` for the Python AI/ML layer.
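The retry-then-DLQ behaviour mentioned in the component list can be pictured roughly as follows. This is a sketch under assumptions: `handler` and the in-memory `deadLetters` array are illustrative, and `maxRetries` is treated here as the total number of attempts, which the source does not specify.

```javascript
// Sketch: retry a job up to `maxRetries` times, then park it in a dead letter
// queue. The in-memory array stands in for whatever store the real DLQ uses.
async function processWithRetries(job, handler, { maxRetries = 3, deadLetters = [] } = {}) {
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const value = await handler(job);
      return { ok: true, value, attempts: attempt };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }

  // All attempts failed: keep the job and the last error for manual inspection.
  deadLetters.push({
    job,
    error: lastError instanceof Error ? lastError.message : String(lastError),
    failedAt: new Date().toISOString(),
  });
  return { ok: false, attempts: maxRetries };
}
```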
DocuThinker splits each document across three stores so it scales cleanly and never bumps Firestore’s per-document size limits. Firebase Firestore holds tiny metadata records, Supabase Storage (a private bucket) holds the original file bytes and the extracted text/HTML, and Firebase Auth handles identity. Nothing heavy ever lives in Firestore.
sequenceDiagram
participant B as Browser (React)
participant API as Express Backend
participant SB as Supabase Storage (private bucket)
participant FS as Firestore
participant AI as Google Gemini
B->>API: POST /document-upload-url { userId, fileName }
API->>SB: createSignedUploadUrl(path)
API-->>B: { path, token, signedUrl }
B->>SB: PUT file bytes (direct, bypasses ~4.5MB limit)
B->>API: POST /upload { userId, title, text, html, filePath, fileType }
API->>AI: summarize(text) (title + today's date injected as context)
API->>SB: store content JSON { originalText, originalHtml } → contentPath
API->>FS: write users/{uid}/documents/{docId} (metadata only)
API-->>B: { summary, fileUrl, ... }
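The three browser-side steps in the diagram can be sketched as one function. This is an illustration, not the frontend's real code: `apiBase`, the injected `fetchImpl`, and the `file` object shape (`name`, `type`, `bytes`) are assumptions, while the endpoint paths and JSON field names follow the diagram.

```javascript
// Sketch of the direct-upload flow. `fetchImpl` is injectable so the function
// can be exercised without a network; it defaults to the global fetch.
async function uploadDocument({ apiBase, userId, file, text, html = "", fetchImpl = globalThis.fetch }) {
  // 1. Ask the backend to mint a signed upload URL for this file.
  const urlRes = await fetchImpl(`${apiBase}/document-upload-url`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId, fileName: file.name }),
  });
  if (!urlRes.ok) throw new Error("Could not obtain a signed upload URL");
  const { path, signedUrl } = await urlRes.json();

  // 2. Send the bytes straight to storage, bypassing the backend body limit.
  const putRes = await fetchImpl(signedUrl, {
    method: "PUT",
    headers: { "Content-Type": file.type },
    body: file.bytes,
  });
  if (!putRes.ok) throw new Error("Direct upload to storage failed");

  // 3. Register the document; only text and pointers travel through the API.
  const uploadRes = await fetchImpl(`${apiBase}/upload`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId, title: file.name, text, html, filePath: path, fileType: file.type }),
  });
  if (!uploadRes.ok) throw new Error("Upload registration failed");
  return uploadRes.json();
}
```

Keeping the file bytes out of step 3 is the point of the design: the JSON request stays small no matter how large the original file is.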
- … `docuthinker`) using a backend-minted signed upload URL from `POST /document-upload-url`. This bypasses Vercel’s ~4.5 MB serverless request-body limit. A through-backend multipart fallback (`POST /document-file`, parsed with `formidable`) is also available.
- … `{ originalText, originalHtml }`). Firestore stores only a tiny `contentPath` pointer to it.
- Metadata → Firestore subcollection. Each document is one Firestore document under `users/{uid}/documents/{docId}`, removing the old 1 MB-per-user `documents` array ceiling (effectively unlimited documents per user). Each record holds:
{
"id": "auto-generated-doc-id",
"title": "My Report.pdf",
"summary": "AI-generated summary text…",
"filePath": "uid/1700000000-abc-My_Report.pdf", // path in Supabase bucket
"fileType": "application/pdf",
"contentPath": "uid/content/<docId>.json", // points at the content object
"createdAt": "<server timestamp>"
}
- … `fileUrl`) for the original file. The viewer renders PDFs in a native `<iframe>`, DOCX via `mammoth.convertToHtml`, Markdown via `react-markdown`, HTML sanitized with DOMPurify, CSV/TSV as a table, JSON/code as monospace, and plain text as pre-wrapped text (see Multi-Format Upload & Extraction).
- … `service_role` key is used only by the backend. The browser receives a one-time, path-scoped upload token (and, for the frontend SDK, only the public anon key).

[!NOTE] Reads merge the new per-user `documents` subcollection with any legacy inline `documents` array, so documents created before the migration still appear. Deletes clean up both the Firestore record and the associated Supabase objects (`filePath` + `contentPath`).
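The merged read described in the note could look roughly like this. The helper name `mergeDocuments`, deduplication by `id`, and the rule that the subcollection copy wins are assumptions made for illustration; the source only states that both stores are merged.

```javascript
// Sketch: combine documents from the per-user subcollection with any legacy
// inline array. When both stores hold the same id, the subcollection copy wins
// (an assumed policy).
function mergeDocuments(subcollectionDocs, legacyDocs = []) {
  const byId = new Map();
  for (const doc of legacyDocs) byId.set(doc.id, doc);
  for (const doc of subcollectionDocs) byId.set(doc.id, doc); // overrides legacy copy
  return [...byId.values()];
}
```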
DocuThinker uses a Beads sub-architecture to coordinate work across multiple AI agents and humans operating on the same codebase. A bead is a self-contained, dependency-aware task unit that any agent can pick up, execute, and complete, which lets agents work in parallel without producing merge conflicts.
When several AI agents (or human developers) work concurrently, they risk editing the same files and producing conflicting changes. Beads solve this with:
stateDiagram-v2
[*] --> Authored: Bead created from template
Authored --> Claimed: Agent reserves files via .status.json
Claimed --> InProgress: Agent begins implementation
InProgress --> Testing: Code changes complete
Testing --> Done: Acceptance criteria pass
Testing --> InProgress: Tests fail — iterate
Done --> [*]: Reservations released
InProgress --> Blocked: Dependency not met
Blocked --> InProgress: Dependency resolved
.beads/
├── .status.json # Live agent reservations & bead counters
├── README.md # Quick-start guide for the beads workflow
└── templates/
└── feature-bead.md # Canonical bead template
The status file (`.beads/.status.json`) is the single source of truth for agent coordination:
{
"version": "1.0.0",
"agents": {},
"reservations": {},
"lastUpdated": null,
"beadsCompleted": 0,
"beadsActive": 0
}
| Field | Purpose |
|---|---|
| `agents` | Map of active agent IDs to their metadata (name, start time, current bead) |
| `reservations` | Map of file paths to the agent ID that holds the reservation |
| `beadsCompleted` | Counter of successfully finished beads |
| `beadsActive` | Counter of beads currently in progress |
Every bead follows a structured template (.beads/templates/feature-bead.md):
| Section | Description |
|---|---|
| Background | Why the work exists |
| Current State | Files to read before starting |
| Desired Outcome | Specific, testable result |
| Files to Touch | Explicit list of files to read, enhance, or create |
| Dependencies | Upstream beads that must finish first and downstream beads this unblocks |
| Acceptance Criteria | Checklist including “all existing tests still pass” |
Certain files are single-agent only — only one agent may hold a reservation at a time:
| Conflict Zone File | Reason |
|---|---|
| `docker-compose.yml` | Shared service definitions |
| `ai_ml/services/orchestrator.py` | Central AI/ML entry point |
| `ai_ml/providers/registry.py` | LLM provider configuration |
| `orchestrator/index.js` | Orchestrator entry point |
| Shared config files | Cross-service settings |
Safe parallel zones (multiple agents can work simultaneously):
- … `ai_ml/providers/` vs. `orchestrator/context/`)

sequenceDiagram
participant A as Agent
participant S as .status.json
participant C as Codebase
A->>S: 1. Check for conflicts
S-->>A: No reservation on target files
A->>S: 2. Post reservation (agent ID + file list)
A->>C: 3. Implement bead instructions
A->>C: 4. Run tests (acceptance criteria)
A->>S: 5. Release reservations
A->>S: 6. Increment beadsCompleted
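The check, reserve, and release steps in the diagram can be sketched as pure functions over the `.status.json` shape. The function names are hypothetical, and the sketch leaves out reading and writing the file itself (including any locking needed when two agents update it at the same moment).

```javascript
// Sketch: claim files for an agent if nobody else holds them (steps 1 and 2).
// Returns a new status object; the input is never mutated.
function claimFiles(status, agentId, files) {
  const conflicts = files.filter((file) => {
    const owner = status.reservations[file];
    return owner !== undefined && owner !== agentId;
  });
  if (conflicts.length > 0) return { ok: false, conflicts, status };

  const reservations = { ...status.reservations };
  for (const file of files) reservations[file] = agentId;
  return {
    ok: true,
    conflicts: [],
    status: {
      ...status,
      reservations,
      beadsActive: status.beadsActive + 1,
      lastUpdated: new Date().toISOString(),
    },
  };
}

// Sketch: release every reservation held by the agent and bump the counters
// (steps 5 and 6).
function releaseFiles(status, agentId) {
  const reservations = Object.fromEntries(
    Object.entries(status.reservations).filter(([, owner]) => owner !== agentId)
  );
  return {
    ...status,
    reservations,
    beadsActive: Math.max(0, status.beadsActive - 1),
    beadsCompleted: status.beadsCompleted + 1,
    lastUpdated: new Date().toISOString(),
  };
}
```

Returning the list of conflicting paths, rather than a bare boolean, lets an agent decide whether to wait, pick another bead, or escalate.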
Agents must:
- … `.beads/.status.json` before starting any work.
- … `agent/<agent-name>/<bead-id>`.

[!NOTE] For the full agent coordination protocol including conflict resolution and escalation, see AGENTS.md. For how beads integrate with the AI/ML pipeline, see AI_ML.md.
Our application supports a fully-featured GraphQL API that allows clients to interact with the backend using flexible queries and mutations. This API provides powerful features for retrieving and managing data such as users, documents, and related information.
- … `documents` subcollection).
- … `fileUrl`, `originalText`, `originalHtml`) are resolved on demand from Supabase only when requested.

| Type | Operation | Description |
|---|---|---|
| Query | `getUser(id)` | User profile plus their documents. |
| Query | `getUserEmail(userId)` | Email for a user. |
| Query | `getDocument(userId, docId)` | A single document (resolve `fileUrl` / `originalText` / `originalHtml` on demand). |
| Query | `listDocuments(userId)` | All of a user’s documents. |
| Query | `searchDocuments(userId, searchTerm)` | Search results (docId, title, snippet). |
| Query | `documentCount(userId)` | Number of documents. |
| Query | `daysSinceJoined(userId)` | Days since the account was created. |
| Query | `userJoinedDate(userId)` | The account’s join date. |
| Query | `getSocialMedia(userId)` | Social-media links. |
| Query | `analyzeSentiment(documentText)` | Sentiment score + description for arbitrary text. |
| Mutation | `register(email, password)` | Create a user; returns `{ userId, customToken }`. |
| Mutation | `login(email, password)` | Authenticate; returns `{ userId, customToken }`. |
| Mutation | `summarizeDocument(userId, title, text, html, filePath, fileType)` | Summarize and (with `userId`) save to the library. |
| Mutation | `deleteDocument(userId, docId)` / `deleteAllDocuments(userId)` | Delete one / all documents. |
| Mutation | `updateDocumentTitle(userId, docId, title)` | Rename a document. |
| Mutation | `updateEmail` / `updateTheme` / `updateSocialMedia` | Profile & account updates. |
| Mutation | `generateKeyIdeas` / `generateDiscussionPoints` / `generateBulletSummary` | AI generation helpers (no storage). |
| Mutation | `summaryInLanguage` / `actionableRecommendations` / `rewriteContent` / `refineSummary` | More AI helpers. |
| Mutation | `chat(sessionId, message, originalText)` | Conversational chat over a document. |
https://docuthinker-app-backend-api.vercel.app/graphql
Or, if you are running the backend locally, the endpoint will be:
http://localhost:3000/graphql
Testing the API:
You can use the built-in GraphiQL Interface to test queries and mutations. Simply visit the endpoint in your browser.
You should see the following interface:
Now you can start querying the API using the available fields and mutations. Examples are below for your reference.
This query retrieves a user’s email and their documents, including titles and summaries:
query GetUser {
getUser(id: "USER_ID") {
id
email
documents {
id
title
summary
}
}
}
Retrieve details of a document by its ID:
query GetDocument {
getDocument(userId: "USER_ID", docId: "DOCUMENT_ID") {
id
title
summary
originalText
}
}
Create a user with an email and password (returns a Firebase custom token):
mutation Register {
register(email: "example@domain.com", password: "password123") {
userId
customToken
}
}
Change the title of a specific document:
mutation UpdateDocumentTitle {
updateDocumentTitle(userId: "USER_ID", docId: "DOCUMENT_ID", title: "Updated Title.pdf") {
id
title
}
}
Delete a document from a user’s account:
mutation DeleteDocument {
deleteDocument(userId: "USER_ID", docId: "DOCUMENT_ID")
}
- … `errors` field in the response.

For more information about GraphQL, visit the official documentation. If you encounter any issues or have questions, feel free to open an issue in our repository.
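Outside GraphiQL, the same operations can be sent as a JSON `POST`. The helpers below are a small sketch of that pattern and of checking the `errors` field. The helper names are hypothetical, and the `String!` variable types in the query are an assumption about the schema rather than something confirmed here.

```javascript
// Sketch: build fetch options for a GraphQL request (JSON body with `query`
// and `variables`, the usual GraphQL-over-HTTP convention).
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

// Sketch: return `data`, or throw if the response carries an `errors` array.
function unwrapGraphQLResponse(body) {
  if (Array.isArray(body.errors) && body.errors.length > 0) {
    throw new Error(body.errors.map((e) => e.message).join("; "));
  }
  return body.data;
}

// Variable types are assumed; adjust them to match the actual schema.
const GET_DOCUMENT = `
  query GetDocument($userId: String!, $docId: String!) {
    getDocument(userId: $userId, docId: $docId) { id title summary }
  }
`;

// Usage against a locally running backend:
//   const res = await fetch("http://localhost:3000/graphql",
//     buildGraphQLRequest(GET_DOCUMENT, { userId: "USER_ID", docId: "DOCUMENT_ID" }));
//   const data = unwrapGraphQLResponse(await res.json());
```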
The DocuThinker mobile app is a real React Native client (Expo SDK 51, TypeScript, expo-router) that talks directly to the same Vercel backend the web frontend uses. It is not a shell or mock — every screen reads from real endpoints, and accounts created via the web app sign in on mobile without any extra step.
graph LR
subgraph Web["💻 docuthinker.vercel.app"]
WLogin[Login.js]
WAuth["utils/auth.js<br/>localStorage + event"]
end
subgraph Mobile["📱 Expo Go / native build"]
MLogin[login.tsx]
MAuth["lib/auth.ts<br/>AsyncStorage + emitter"]
end
subgraph Backend["☁️ docuthinker-app-backend-api.vercel.app"]
Login["/login"]
FB[(Firebase Auth)]
end
WLogin --> Login
MLogin --> Login
Login --> FB
Login -->|customToken + userId| WAuth
Login -->|customToken + userId| MAuth
- … `customToken` + `userId` in AsyncStorage; root layout hydrates on boot.
- … `app/_layout.tsx` redirects unauthed users to `/login` and authed users away from `/login` automatically.
- … `expo-document-picker` + `expo-file-system` for plain-text uploads (see Upload Limitation).
- … `Pill` accepts an `align` prop so the “Pro member” badge centers on Profile; settings rows on Profile are now real `Pressable`s with `onPress` (showing “Coming soon” Alerts where the underlying flow isn’t wired yet).
- … Alex Carter and the canned document list are gone. The only static content left is the four feature tiles on the Home screen.

| Capability | Web | Mobile (this PR) | Backend |
|---|---|---|---|
| Email + password sign-in | ✅ | ✅ | POST /login |
| Register | ✅ | ✅ | POST /register |
| Forgot password | ✅ | UI stub | POST /forgot-password |
| Google sign-in | ✅ | UI stub | n/a |
| Document list | ✅ | ✅ pull-to-refresh | GET /documents/:userId |
| Document summary | ✅ | ✅ | GET /document-details/:userId/:docId |
| Profile (email, docs, days) | ✅ | ✅ | /users/:id, /document-count/:id, /days-since-joined/:id, /user-joined-date/:id |
| Document chat | ✅ | ✅ | POST /chat |
| Upload .txt/.md | ✅ | ✅ | POST /upload |
| Upload PDF/DOCX | ✅ (client-side parse) | ❌ (see below) | POST /upload |
| Document analytics dashboard | ✅ | future | — |
| Account / appearance / notifications panes | ✅ | UI stubs | — |
stateDiagram-v2
[*] --> Hydrating: app/_layout mount
Hydrating --> Anonymous: AsyncStorage empty
Hydrating --> Authed: userId present
Anonymous --> Authed: setAuth() after POST /login
Authed --> Anonymous: clearAuth() (Sign out)
Authed --> Loading: tab focus → fetch
Loading --> Ready: 4 parallel GETs resolve
Ready --> Refreshing: pull-to-refresh
Refreshing --> Ready
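The diagram above maps naturally onto a transition table. The sketch below uses hypothetical event names (`STORAGE_EMPTY`, `SET_AUTH`, and so on) that do not come from the mobile codebase; only the states and arrows are taken from the diagram.

```javascript
// Sketch: the auth/data lifecycle as a lookup table. Event names are
// illustrative labels for the arrows in the diagram.
const TRANSITIONS = {
  Hydrating: { STORAGE_EMPTY: "Anonymous", USER_FOUND: "Authed" },
  Anonymous: { SET_AUTH: "Authed" },
  Authed: { CLEAR_AUTH: "Anonymous", TAB_FOCUS: "Loading" },
  Loading: { FETCH_RESOLVED: "Ready" },
  Ready: { PULL_TO_REFRESH: "Refreshing" },
  Refreshing: { FETCH_RESOLVED: "Ready" },
};

// Unknown state/event pairs leave the state unchanged.
function nextState(state, event) {
  const row = TRANSITIONS[state];
  return row && row[event] ? row[event] : state;
}
```

Ignoring unknown events is a deliberate simplification here; a stricter version could throw instead so that unexpected transitions are caught during development.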
The backend /upload endpoint expects {userId, title, text} JSON. The web frontend parses PDF/DOCX in the browser with pdfjs-dist + mammoth before posting plain text. The mobile app does not currently ship a comparable RN parser because:
- … `react-native-pdf`, `mammoth` + `xmldom` polyfill) require native modules and `expo prebuild`, which would drop the Expo Go workflow.

So: mobile uploads .txt/.md; web uploads PDF/DOCX. Both clients see the same documents in `/documents/:userId`, so the round-trip surface is consistent.
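One way to picture the mobile-side restriction is a small guard that only builds an `/upload` body for plain-text files. The helper names and the extension check are assumptions for illustration; the `{ userId, title, text }` shape is the one the endpoint is described as expecting.

```javascript
// Sketch: extensions the mobile client can upload without a native parser.
const MOBILE_TEXT_EXTENSIONS = [".txt", ".md"];

function canUploadFromMobile(fileName) {
  const lower = fileName.toLowerCase();
  return MOBILE_TEXT_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

// Build the JSON body for POST /upload, rejecting formats that need parsing.
function buildMobileUploadBody(userId, fileName, text) {
  if (!canUploadFromMobile(fileName)) {
    throw new Error(`Unsupported on mobile: ${fileName} (upload PDF/DOCX from the web app)`);
  }
  return { userId, title: fileName, text };
}
```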
For the full mobile architecture (screen map, API client class diagram, lifecycle, troubleshooting) see mobile-app/README.md.
The DocuThinker app can be containerized using Docker for easy deployment and scaling. The docker-compose.yml defines all services including the new agentic orchestrator.
docker compose up --build
You can also view the image in the Docker Hub repository here.
| Service | Container | Port | Description |
|---|---|---|---|
| `frontend` | `docuthinker-frontend` | `3001` | React frontend |
| `backend` | `docuthinker-backend` | `3000` | Express API server |
| `orchestrator` | `docuthinker-orchestrator` | `4000` | Agentic orchestration layer (Node.js) |
| `ai-ml` | `docuthinker-ai-ml` | `8000` | Python AI/ML services (FastAPI) |
| `redis` | `docuthinker-redis` | `6379` | In-memory cache (Redis 7 Alpine) |
| `firebase` | firebase | – | Firebase emulator |
The orchestrator container includes a health check (/health), runs as a non-root user, and depends on Redis being healthy before starting.
graph TB
A[Docker Compose] --> B[Frontend Container]
A --> C[Backend Container]
A --> O[Orchestrator Container]
A --> ML[AI/ML Container]
A --> D[Redis Container]
A --> F[Firebase Container]
B -->|Port 3001| G[React App]
C -->|Port 3000| H[Express Server]
O -->|Port 4000| I[Agentic Orchestrator]
ML -->|Port 8000| J[FastAPI AI/ML]
D -->|Port 6379| K[Redis Cache]
I -->|Python Bridge| J
I -->|Circuit Breaker| L[Claude / Gemini]
H -->|REST| I
DocuThinker now ships primarily via Kubernetes with blue/green promotion plus weighted canaries driven by the updated Jenkinsfile. Vercel/Render remain as backup endpoints, and AWS ECS Fargate is still available as an alternative target.
graph TB
GIT[GitHub Repo] --> JENKINS[Jenkins Pipeline]
JENKINS --> TEST[Install + Lint + Tests]
TEST --> BUILD[Containerize Frontend + Backend]
BUILD --> REG[Push Images to Registry]
REG --> CANARY[Canary Deploy - 10% weight]
CANARY --> BG[Promote to Blue/Green]
BG --> USERS[Live Traffic]
JENKINS --> VERCEL[Vercel Fallback Deploy]
VERCEL --> USERS
- … `backend-service`/`frontend-service` to the active track (blue by default). Canary traffic is handled by `*-canary-service` through the weighted ingress (`ingress.yaml`) using the `X-DocuThinker-Canary: always` header.
- … `${GIT_SHA}-${BUILD_NUMBER}`, pushes them to `$REGISTRY`, deploys the target color (scaled to 3 replicas), and rolls out canaries (1 replica each). Promotion is a gated manual input before the service selector flips to the new color and the previous color scales to 0.

To promote manually outside Jenkins:
TARGET=green # or blue
kubectl -n <ns> scale deployment/backend-$TARGET --replicas=3
kubectl -n <ns> scale deployment/frontend-$TARGET --replicas=3
kubectl -n <ns> patch service backend-service -p "{\"spec\": {\"selector\": {\"app\": \"backend\", \"track\": \"$TARGET\"}}}"
kubectl -n <ns> patch service frontend-service -p "{\"spec\": {\"selector\": {\"app\": \"frontend\", \"track\": \"$TARGET\"}}}"
kubectl -n <ns> scale deployment/backend-$( [ "$TARGET" = "blue" ] && echo green || echo blue ) --replicas=0
kubectl -n <ns> scale deployment/frontend-$( [ "$TARGET" = "blue" ] && echo green || echo blue ) --replicas=0
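The escaped JSON in the `patch` commands is easy to mistype, so it can help to generate the command list instead. The following is a sketch with hypothetical helper names; it mirrors the manual commands above and only prints strings, it does not run `kubectl`.

```javascript
// Sketch: derive the inactive color from the promotion target.
function otherColor(target) {
  if (target !== "blue" && target !== "green") {
    throw new Error(`Unknown track: ${target}`);
  }
  return target === "blue" ? "green" : "blue";
}

// Build the service selector patch as JSON so the quoting is always valid.
function selectorPatch(app, track) {
  return JSON.stringify({ spec: { selector: { app, track } } });
}

// Produce the same six commands as the manual promotion steps, in order.
function promotionCommands(namespace, target) {
  const previous = otherColor(target);
  const kubectl = `kubectl -n ${namespace}`;
  return [
    `${kubectl} scale deployment/backend-${target} --replicas=3`,
    `${kubectl} scale deployment/frontend-${target} --replicas=3`,
    `${kubectl} patch service backend-service -p '${selectorPatch("backend", target)}'`,
    `${kubectl} patch service frontend-service -p '${selectorPatch("frontend", target)}'`,
    `${kubectl} scale deployment/backend-${previous} --replicas=0`,
    `${kubectl} scale deployment/frontend-${previous} --replicas=0`,
  ];
}
```

For example, `promotionCommands("docs", "green").join("\n")` yields a script that scales up green, flips both selectors, and scales blue to zero.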
See kubernetes/README.md for the full rollout flow, ingress weighting, and rollback commands.
- … `vercel --prod` using the `vercel-token` credential when the main branch updates.

To deploy manually:
npm install -g vercel
vercel --prod
- … `kubernetes/backend-*.yaml`, fronted by `backend-service` and the NGINX ingress canary (`ingress.yaml`). Vercel (https://docuthinker-app-backend-api.vercel.app/) and Render (https://docuthinker-ai-app.onrender.com/) remain as backup endpoints.
- … `$REGISTRY`, deploys the next color alongside canary pods, and flips the service selector after manual approval.
- … `aws/` still provisions Fargate services if you prefer ECS over Kubernetes.

To run the new rollout flow by hand:
kubectl apply -f kubernetes/configmap.yaml
kubectl apply -f kubernetes/backend-service.yaml kubernetes/backend-canary-service.yaml
kubectl apply -f kubernetes/backend-deployment-blue.yaml kubernetes/backend-deployment-green.yaml kubernetes/backend-deployment-canary.yaml
# See kubernetes/README.md for the promotion/rollback commands
- … `nginx` directory.
- … `npm ci`) → lint/test → build → docker build/push (`$REGISTRY`) → canary deploy → manual promotion to blue/green on Kubernetes, with an optional Vercel deploy as fallback.
- `docuthinker-registry` – username/password for the container registry set in `REGISTRY`.
- `kubeconfig-docuthinker` – kubeconfig file used for all `kubectl` invocations.
- `vercel-token` – optional Vercel API token (keeps the legacy deploy available).

For local Jenkins bootstrap:
brew install jenkins-lts
brew services start jenkins-lts
open http://localhost:8080
- … `REGISTRY`, `KUBE_CONTEXT`, and `KUBE_NAMESPACE` as job/env vars, and assign the credentials above. Jenkins will run automatically on every push to main.
- … `backend-service`/`frontend-service` to the new track and scales down the previous color after approval.
- … `Jenkinsfile` for the full stage definitions and environment configuration.

If successful, you should see the Jenkins pipeline start automatically whenever changes are merged: running tests, pushing images, rolling out the canary, and promoting blue/green once the manual approval is given. Example dashboard:
In addition to Jenkins, we also have a GitHub Actions workflow set up for CI/CD. The workflow is defined in the .github/workflows/ci.yml file.
The GitHub Actions workflow includes the following steps:
- … `vercel-token` secret.
- … `dockerhub-username` and `dockerhub-password` secrets, as well as to GHCR using the `ghcr-token` secret.
DocuThinker includes a comprehensive suite of tests to ensure the reliability and correctness of the application. The tests cover various aspects of the app, including:
To run the backend tests, follow these steps:
cd backend
# Run the tests in default mode
npm run test
# Run the tests in watch mode
npm run test:watch
# Run the tests with coverage report
npm run test:coverage
This will run the unit tests and integration tests for the backend app using Jest and Supertest.
To run the frontend tests, follow these steps:
cd frontend
# Run the tests in default mode
npm run test
# Run the tests in watch mode
npm run test:watch
# Run the tests with coverage report
npm run test:coverage
This will run the unit tests and end-to-end tests for the frontend app using Jest and React Testing Library.
- … `kubernetes/*.yaml`; see `kubernetes/README.md` for promotion/rollback commands.
- … `kubernetes` directory.

graph TB
A[Kubernetes Cluster] --> B[Ingress Controller]
B --> C[Frontend Service]
B --> D[Backend Service]
C --> E[Frontend Pods]
D --> F[Backend Pods]
E --> G[Pod 1]
E --> H[Pod 2]
E --> I[Pod 3]
F --> J[Pod 1]
F --> K[Pod 2]
F --> L[Pod 3]
D --> M[ConfigMap]
D --> N[Secrets]
D --> O[Persistent Volume]
O --> P[MongoDB]
O --> Q[Redis]
The DocuThinker Viewer extension brings your document upload, summarization and insight‑extraction workflow right into VS Code.
Key Features
To install the extension, follow these steps:
For full install and development steps, configuration options, and troubleshooting, see extension/README.md.
We welcome contributions from the community! Follow these steps to contribute:
Fork the repository.
git checkout -b feature/your-feature
git commit -m "Add your feature"
git push origin feature/your-feature
Thank you for contributing to DocuThinker! 🎉
This project is licensed under the Creative Commons Attribution-NonCommercial License. See the LICENSE file for details.
[!IMPORTANT] The DocuThinker open-source project is for educational purposes only and should not be used for commercial applications. Feel free to use it for learning and personal projects!
For more information on the DocuThinker app, please refer to the following resources:
However, this README file should already provide a comprehensive overview of the project.
Here is some information about me, the project’s humble creator:
Happy Coding and Analyzing! 🚀
Created with ❤️ by Son Nguyen in 2024-2025. Licensed under the Creative Commons Attribution-NonCommercial License.