A production-ready, multi-agent RAG platform orchestrating LangGraph, LangChain, CrewAI, and multiple LLM providers for comprehensive document intelligence.
The ai_ml package is a sophisticated, production-ready Retrieval-Augmented Generation (RAG) platform that seamlessly integrates:
- **Comprehensive Document Analysis** - Extract summaries, topics, insights, and sentiments
- **Multi-Agent Reasoning** - Three specialized agents collaborate for thorough analysis
- **Semantic Search** - Vector-based retrieval with configurable embeddings
- **Q&A System** - Context-aware question answering
- **Multi-Language Translation** - Helsinki-NLP models for 7+ languages
- **Knowledge Graph Sync** - Persistent document relationships in Neo4j
- **Persistent Memory** - ChromaDB for cross-session semantic recall
- **Flexible Deployment** - CLI, API server, MCP server, or Python library
The high-level architecture:

```mermaid
graph TB
subgraph "External Interfaces"
CLI[CLI main.py]
FASTAPI[FastAPI Server<br/>server.py]
MCP[MCP Server<br/>mcp/server.py]
PYAPI[Python API<br/>backend.py]
end
subgraph "Service Layer"
SVC[DocumentIntelligenceService<br/>services/orchestrator.py]
end
subgraph "Pipeline Layer"
RAG[AgenticRAGPipeline<br/>pipelines/rag_graph.py]
CREW[CrewAI Agents<br/>agents/crew_agents.py]
end
subgraph "Tools & Utilities"
TOOLS[Document Tools<br/>tools/document_tools.py]
SEARCH[DocumentSearchTool]
INSIGHTS[InsightsExtractionTool]
end
subgraph "Providers & Models"
REGISTRY[LLMProviderRegistry<br/>providers/registry.py]
OPENAI[OpenAI<br/>GPT-4o]
ANTHROPIC[Anthropic<br/>Claude 3.5]
GEMINI[Google<br/>Gemini 1.5 Pro]
HF[HuggingFace<br/>Embeddings]
HELSINKI[Helsinki-NLP<br/>Translators]
end
subgraph "Persistence Layer"
FAISS[FAISS<br/>In-Memory Vector Store]
CHROMA[ChromaDB<br/>Persistent Vector Store]
NEO4J[Neo4j<br/>Knowledge Graph]
end
CLI --> SVC
FASTAPI --> SVC
MCP --> SVC
PYAPI --> SVC
SVC --> RAG
SVC --> CREW
SVC --> TOOLS
RAG --> REGISTRY
CREW --> REGISTRY
TOOLS --> REGISTRY
REGISTRY --> OPENAI
REGISTRY --> ANTHROPIC
REGISTRY --> GEMINI
REGISTRY --> HF
SVC --> HELSINKI
TOOLS --> FAISS
SVC --> CHROMA
SVC --> NEO4J
style SVC fill:#4CAF50,stroke:#333,stroke-width:3px,color:#fff
style RAG fill:#2196F3,stroke:#333,stroke-width:2px,color:#fff
style CREW fill:#FF9800,stroke:#333,stroke-width:2px,color:#fff
```
Configuration flow and the agentic pipeline:

```mermaid
graph LR
subgraph "Core Settings"
CFG[Settings<br/>core/settings.py]
ENV[Environment Variables]
end
subgraph "Service Orchestration"
SERVICE[DocumentIntelligenceService]
end
subgraph "Agentic Pipeline"
INGEST[Ingest Node<br/>Chunking & Embedding]
RAG_NODE[RAG Node<br/>Primary Analysis]
CREW_NODE[Crew Node<br/>Multi-Agent Validation]
FINAL[Finalize Node<br/>Report Assembly]
end
subgraph "CrewAI Agents"
ANALYST[Analyst Agent<br/>OpenAI GPT-4o]
RESEARCHER[Researcher Agent<br/>Google Gemini]
REVIEWER[Reviewer Agent<br/>Anthropic Claude]
end
ENV --> CFG
CFG --> SERVICE
SERVICE --> INGEST
INGEST --> RAG_NODE
RAG_NODE --> CREW_NODE
CREW_NODE --> FINAL
CREW_NODE --> ANALYST
CREW_NODE --> RESEARCHER
CREW_NODE --> REVIEWER
ANALYST --> FINAL
RESEARCHER --> FINAL
REVIEWER --> FINAL
style SERVICE fill:#9C27B0,stroke:#333,stroke-width:3px,color:#fff
style CREW_NODE fill:#FF5722,stroke:#333,stroke-width:2px,color:#fff
```
End-to-end request sequence:

```mermaid
sequenceDiagram
participant User
participant CLI/API
participant Service
participant Pipeline
participant RAG
participant CrewAI
participant Neo4j
participant ChromaDB
User->>CLI/API: Submit Document + Question
CLI/API->>Service: analyze_document()
Service->>Pipeline: run(document, question)
Pipeline->>Pipeline: 1. Ingest & Chunk
Pipeline->>Pipeline: 2. Create FAISS Vector Store
Pipeline->>RAG: 3. Primary RAG Pass
RAG->>RAG: Retrieve Context
RAG->>RAG: Generate JSON (overview, topics, QA)
RAG-->>Pipeline: RAG Payload
Pipeline->>CrewAI: 4. Crew Collaboration
CrewAI->>CrewAI: Analyst drafts summary
CrewAI->>CrewAI: Researcher validates with citations
CrewAI->>CrewAI: Reviewer synthesizes insights
CrewAI-->>Pipeline: Crew Payload
Pipeline->>Pipeline: 5. Finalize Report
Pipeline-->>Service: Final Output
Service->>Service: Enrich (sentiment, translation)
alt Knowledge Graph Enabled
Service->>Neo4j: Sync Document & Topics
Neo4j-->>Service: Sync Status
end
alt Vector Store Enabled
Service->>ChromaDB: Upsert Document
ChromaDB-->>Service: Upsert Status
end
Service-->>CLI/API: Complete Results
CLI/API-->>User: Display Analysis
```
Package layout:

```
ai_ml/
├── core/
│   ├── settings.py              # Runtime configuration & environment settings
│   └── __init__.py
├── services/
│   ├── orchestrator.py          # DocumentIntelligenceService facade
│   └── __init__.py
├── pipelines/
│   ├── rag_graph.py             # LangGraph agentic RAG pipeline
│   └── __init__.py
├── agents/
│   ├── crew_agents.py           # CrewAI multi-agent collaboration
│   └── __init__.py
├── providers/
│   ├── registry.py              # Multi-provider LLM & embedding registry
│   └── __init__.py
├── tools/
│   ├── document_tools.py        # Semantic search & insights tools
│   └── __init__.py
├── graph/
│   ├── neo4j_client.py          # Neo4j knowledge graph client
│   └── __init__.py
├── vectorstores/
│   ├── chroma_store.py          # ChromaDB persistent vector store
│   └── __init__.py
├── mcp/
│   ├── server.py                # MCP server for tool exposure
│   └── __init__.py
├── processing/
│   ├── summarizer.py            # Summarization utilities
│   ├── sentiment.py             # Sentiment analysis
│   ├── topic_extractor.py       # Topic extraction
│   └── translator.py            # Translation utilities
├── extended_features/
│   ├── chat_interface.py        # Conversational AI interface
│   ├── bullet_summary_generator.py
│   ├── key_ideas_extractor.py
│   ├── recommendations_generator.py
│   ├── refine_summary.py
│   ├── rewriter.py
│   └── voice_chat.py
├── models/
│   ├── hf_model.py              # HuggingFace model loaders
│   ├── model_utils.py           # Model utilities
│   └── onnx_helper.py           # ONNX conversion helpers
├── backend.py                   # High-level API facade
├── server.py                    # FastAPI REST server
├── main.py                      # CLI entry point
├── config.py                    # Legacy configuration
├── requirements.txt             # Python dependencies
└── README.md                    # This file
```
| Class | Location | Purpose |
|---|---|---|
| `DocumentIntelligenceService` | `services/orchestrator.py` | **Main facade** - Orchestrates all AI/ML capabilities |
| `AgenticRAGPipeline` | `pipelines/rag_graph.py` | **LangGraph pipeline** - Stateful RAG workflow |
| `LLMProviderRegistry` | `providers/registry.py` | **Provider registry** - Lazy-load LLMs & embeddings |
| `Neo4jGraphClient` | `graph/neo4j_client.py` | **Knowledge graph** - Neo4j operations |
| `ChromaVectorClient` | `vectorstores/chroma_store.py` | **Vector store** - Persistent semantic search |
| `DocumentSearchTool` | `tools/document_tools.py` | **Semantic search** - FAISS-backed retrieval |
| `InsightsExtractionTool` | `tools/document_tools.py` | **Topic extraction** - Heuristic-based insights |
| Feature | Description | Module |
|---|---|---|
| Comprehensive Analysis | Full document intelligence with multi-agent validation | services/orchestrator.py |
| Summarization | Narrative and bullet-point summaries | processing/summarizer.py |
| Topic Extraction | AI-powered theme identification | services/orchestrator.py:140 |
| Q&A System | Context-aware question answering | pipelines/rag_graph.py |
| Sentiment Analysis | JSON-based sentiment with confidence scores | services/orchestrator.py:211 |
| Semantic Search | Vector-based document retrieval | tools/document_tools.py |
| Translation | Multi-language support (7+ languages) | models/hf_model.py |
| Recommendations | Actionable next steps generation | extended_features/recommendations_generator.py |
| Discussion Points | Debate prompts generation | discussion/discussion_generator.py |
| Rewriting | Tone-based document rewriting | extended_features/rewriter.py |
The platform employs three specialized CrewAI agents that collaborate sequentially:
```mermaid
graph LR
A[Document Analyst<br/>OpenAI GPT-4o] -->|Draft Summary| B[Cross-Referencer<br/>Google Gemini]
B -->|Validated Findings| C[Insights Curator<br/>Anthropic Claude]
C -->|Executive Insights| D[Final Report]
style A fill:#10A37F,stroke:#333,stroke-width:2px,color:#fff
style B fill:#4285F4,stroke:#333,stroke-width:2px,color:#fff
style C fill:#D97757,stroke:#333,stroke-width:2px,color:#fff
style D fill:#9C27B0,stroke:#333,stroke-width:2px,color:#fff
```
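A minimal, self-contained sketch of this sequential hand-off with CrewAI (abbreviated relative to `agents/crew_agents.py`; the `llm` strings assume a recent CrewAI version that routes model names through LiteLLM):

```python
from crewai import Agent, Crew, Process, Task

analyst = Agent(
    role="Document Analyst",
    goal="Draft a faithful summary of the document",
    backstory="A meticulous analyst who never invents facts.",
    llm="gpt-4o",
)
researcher = Agent(
    role="Cross-Referencer",
    goal="Validate the draft against the source and attach citations",
    backstory="A skeptical researcher who checks every claim.",
    llm="gemini/gemini-1.5-pro",
)
curator = Agent(
    role="Insights Curator",
    goal="Synthesize validated findings into executive insights",
    backstory="An editor who writes for busy decision-makers.",
    llm="anthropic/claude-3-5-sonnet-20241022",
)

draft = Task(
    description="Summarize this document: {document}",
    expected_output="A draft summary",
    agent=analyst,
)
validate = Task(
    description="Validate the draft summary with citations from the source.",
    expected_output="Validated findings",
    agent=researcher,
    context=[draft],
)
synthesize = Task(
    description="Turn the validated findings into executive insights.",
    expected_output="Executive insights",
    agent=curator,
    context=[validate],
)

crew = Crew(
    agents=[analyst, researcher, curator],
    tasks=[draft, validate, synthesize],
    process=Process.sequential,  # each task feeds the next, as in the diagram
)
result = crew.kickoff(inputs={"document": "..."})
```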
The persistence layer is opt-in:

- **Vector store**: set `DOCUTHINKER_SYNC_VECTOR=true` to persist analyzed documents to ChromaDB (directory from `DOCUTHINKER_CHROMA_DIR`), embedded with the configured model (default `sentence-transformers/all-MiniLM-L6-v2`).
- **Knowledge graph**: set `DOCUTHINKER_SYNC_GRAPH=true` to sync documents and topics to Neo4j with the schema below; ad-hoc Cypher is available through the `graph_query` tool.
  - `(Document {id, title, summary, updated_at, metadata})`
  - `(Topic {name})`
  - `(Document)-[:COVERS]->(Topic)`

To install from source:

```bash
git clone https://github.com/hoangsonww/DocuThinker-AI-App.git
cd DocuThinker-AI-App/ai_ml
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
For specific features:
```bash
# For Neo4j knowledge graph
pip install neo4j

# For ChromaDB vector store
pip install chromadb

# For ONNX model optimization
pip install onnx onnxruntime optimum[onnxruntime]
```
Create a `.env` file in the `ai_ml/` directory:
```bash
# Required: At least one LLM provider
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

# Optional: Model overrides
DOCUTHINKER_OPENAI_MODEL=gpt-4o-mini
DOCUTHINKER_CLAUDE_MODEL=claude-3-5-sonnet-20241022
DOCUTHINKER_GEMINI_MODEL=gemini-1.5-pro
DOCUTHINKER_SENTIMENT_MODEL=claude-3-haiku-20240307
DOCUTHINKER_QA_MODEL=gpt-4o-mini

# Embedding configuration
DOCUTHINKER_EMBEDDING_PROVIDER=huggingface
DOCUTHINKER_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# Chunking configuration
DOCUTHINKER_CHUNK_SIZE=900
DOCUTHINKER_CHUNK_OVERLAP=120

# RAG configuration
DOCUTHINKER_RAG_QUESTION="Provide a comprehensive intelligence brief for this document."
DOCUTHINKER_BULLET_STYLE="Use concise bullet points and preserve key metrics or figures."

# Neo4j configuration (optional)
DOCUTHINKER_SYNC_GRAPH=true
DOCUTHINKER_NEO4J_URI=bolt://localhost:7687
DOCUTHINKER_NEO4J_USER=neo4j
DOCUTHINKER_NEO4J_PASSWORD=your-password
DOCUTHINKER_NEO4J_DATABASE=neo4j

# ChromaDB configuration (optional)
DOCUTHINKER_SYNC_VECTOR=true
DOCUTHINKER_CHROMA_DIR=.chroma
DOCUTHINKER_CHROMA_COLLECTION=docuthinker
DOCUTHINKER_VECTOR_TOP_K=6

# Knowledge base path (optional)
DOCUTHINKER_KB_PATH=/path/to/knowledge/base
```
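Each variable maps onto a field of the `Settings` object. As a minimal sketch of how a value such as the chunk size can be resolved against its documented default (illustrative only; the actual logic lives in `core/settings.py`):

```python
import os

# Illustrative resolution of documented defaults -- not the real settings code.
chunk_size = int(os.getenv("DOCUTHINKER_CHUNK_SIZE", "900"))
chunk_overlap = int(os.getenv("DOCUTHINKER_CHUNK_OVERLAP", "120"))
sync_graph = os.getenv("DOCUTHINKER_SYNC_GRAPH", "false").lower() == "true"
```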
For a quick start with OpenAI only:

```bash
export OPENAI_API_KEY=sk-...
```

The system automatically falls back to OpenAI for all agents if other providers are unavailable.
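A minimal sketch of that fallback policy (a hypothetical helper, not the actual `LLMProviderRegistry` code):

```python
import os

# Hypothetical illustration of the fallback policy: use an agent's preferred
# provider only when its API key is configured, otherwise route to OpenAI.
_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}

def resolve_provider(preferred: str) -> str:
    if os.getenv(_KEYS.get(preferred, "")):
        return preferred
    return "openai"  # fall back when the preferred provider is unconfigured

print(resolve_provider("anthropic"))  # "anthropic" if its key is set, else "openai"
```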
The CLI provides a comprehensive document analysis interface.
```bash
# Basic analysis
python -m ai_ml.main documents/sample.txt

# With a custom question
python -m ai_ml.main documents/sample.txt \
  --question "What are the key risks mentioned?"

# With translation
python -m ai_ml.main documents/sample.txt \
  --question "Summarize the main findings" \
  --translate_lang es

# With document metadata
python -m ai_ml.main documents/sample.txt \
  --doc_id policy-2024-001 \
  --title "Q4 Policy Whitepaper" \
  --translate_lang de
```
The CLI prints each stage of the analysis to stdout: summary, topics, Q&A answer, sentiment, and (if requested) translation.
Using the Python API:

```python
from ai_ml.services import get_document_service

# Get singleton service instance
service = get_document_service()

# Analyze document
results = service.analyze_document(
    document="Your document text here...",
    question="What are the key takeaways?",
    translate_lang="fr",
    metadata={"id": "doc-001", "title": "Sample Document"}
)

# Access results
print(results["summary"])
print(results["topics"])
print(results["qa"])
print(results["sentiment"])
print(results["translation"])
```
Individual capabilities are exposed as dedicated methods:

```python
from ai_ml.services import get_document_service

service = get_document_service()
document = "Your document text..."

# Summarization
summary = service.summarize(document)
bullet_summary = service.bullet_summary(document)

# Topic extraction
topics = service.extract_topics(document)

# Q&A
answer = service.answer_question(document, "What is the main conclusion?")

# Sentiment analysis
sentiment = service.sentiment(document)

# Translation
translation = service.translate(document, "es")

# Semantic search
results = service.semantic_search(document, "risk assessment")

# Recommendations
recommendations = service.recommendations(document)

# Discussion points
discussion = service.discussion_points(document)

# Rewriting
rewritten = service.rewrite(document, tone="executive")

# Refine summary
refined = service.refine_summary("Draft summary...", document)
```
Vector store operations:

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Upsert document to vector store
result = service.upsert_vector_document(
    document="Document text...",
    metadata={"source": "research_paper", "year": 2024},
    doc_id="doc-123"
)

# Query vector store
hits = service.query_vector_index("machine learning trends", n_results=10)
for hit in hits:
    print(f"ID: {hit['id']}, Score: {hit['distance']}")
    print(f"Snippet: {hit['document'][:200]}...")
```
Knowledge graph operations:

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Sync to knowledge graph
result = service.sync_to_knowledge_graph(
    document="Document text...",
    agentic_payload={"overview": "...", "key_topics": [...]},
    metadata={"id": "doc-456", "title": "Research Paper"}
)

# Run Cypher query
results = service.run_graph_query(
    query="""
    MATCH (d:Document)-[:COVERS]->(t:Topic)
    WHERE t.name = $topic
    RETURN d.title, d.summary
    LIMIT 10
    """,
    params={"topic": "artificial intelligence"}
)
```
Conversational interface:

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Create conversation chain
chain = service.create_conversation_chain()

# Interactive conversation
response1 = chain.run(input="What is machine learning?")
response2 = chain.run(input="Can you give me an example?")
response3 = chain.run(input="How does it relate to AI?")
```
Start the FastAPI server:

```bash
uvicorn ai_ml.server:app --reload --host 0.0.0.0 --port 8000
```
`POST /analyze`
Request Body:
```json
{
  "document": "Your document text here...",
  "question": "What are the main findings?",
  "translate_lang": "fr",
  "metadata": {
    "id": "doc-001",
    "title": "Sample Document",
    "source": "research"
  }
}
```
Response:
```json
{
  "rag": {
    "overview": "...",
    "key_topics": ["topic1", "topic2"],
    "qa_answer": "...",
    "supporting_context": ["quote1", "quote2"],
    "crew_analysis": {...},
    "citations": [...]
  },
  "summary": "...",
  "topics": ["topic1", "topic2"],
  "qa": "...",
  "discussion": "...",
  "insights": "...",
  "sentiment": {
    "label": "positive",
    "confidence": 0.85,
    "rationale": "..."
  },
  "translation": "...",
  "document_id": "doc-001",
  "metadata": {...},
  "sync": {
    "graph": {"status": "ok", "document_id": "doc-001"},
    "vector_store": {"status": "ok", "document_id": "doc-001"}
  }
}
```
Example with curl:

```bash
curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "document": "Artificial intelligence is transforming industries...",
    "question": "What industries are affected?",
    "translate_lang": "es"
  }'
```
Or from Python:

```python
import requests

response = requests.post(
    "http://localhost:8000/analyze",
    json={
        "document": "Your document text...",
        "question": "What are the key findings?",
        "translate_lang": "fr"
    }
)
results = response.json()
print(results["summary"])
```
The MCP server exposes DocuThinker capabilities as standardized tools for external consumption.
Start it with:

```bash
python -m ai_ml.mcp.server
```
| Tool | Description | Parameters |
|---|---|---|
| `agentic_document_brief` | Full agentic analysis pipeline | `document`, `question?`, `translate_lang` |
| `semantic_document_search` | Semantic search within document | `document`, `query` |
| `quick_topics` | Extract bullet topics | `document` |
| `vector_upsert` | Persist to vector store | `document`, `doc_id?`, `metadata?` |
| `vector_search` | Search vector store | `query`, `n_results` |
| `graph_upsert` | Sync to knowledge graph | `document`, `metadata?` |
| `graph_query` | Execute Cypher query | `query`, `params?` |
Add to your MCP client configuration (e.g., Claude Desktop, Cline):
```json
{
  "mcpServers": {
    "docuthinker-agentic": {
      "command": "python",
      "args": ["-m", "ai_ml.mcp.server"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "GOOGLE_API_KEY": "..."
      }
    }
  }
}
```
From an MCP-compatible client:
```text
Use the docuthinker-agentic tool agentic_document_brief with:
- document: "Quarterly earnings report shows 15% revenue growth..."
- question: "What are the key financial metrics?"
- translate_lang: "de"
```
Start Neo4j with Docker and point DocuThinker at it:

```bash
docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password \
  -v $HOME/neo4j/data:/data \
  neo4j:5.11

export DOCUTHINKER_SYNC_GRAPH=true
export DOCUTHINKER_NEO4J_URI=bolt://localhost:7687
export DOCUTHINKER_NEO4J_USER=neo4j
export DOCUTHINKER_NEO4J_PASSWORD=your-password
export DOCUTHINKER_NEO4J_DATABASE=neo4j
```
Query the graph from Python:

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Documents covering a keyword topic
results = service.run_graph_query(
    query="""
    MATCH (d:Document)-[:COVERS]->(t:Topic)
    WHERE t.name CONTAINS $keyword
    RETURN d.id, d.title, d.summary, t.name
    ORDER BY d.updated_at DESC
    LIMIT 20
    """,
    params={"keyword": "machine learning"}
)

# Documents related through shared topics
results = service.run_graph_query(
    query="""
    MATCH (d1:Document)-[:COVERS]->(t:Topic)<-[:COVERS]-(d2:Document)
    WHERE d1.id = $doc_id AND d1 <> d2
    RETURN DISTINCT d2.id, d2.title, COUNT(t) AS shared_topics
    ORDER BY shared_topics DESC
    LIMIT 10
    """,
    params={"doc_id": "doc-123"}
)

# Most-covered topics across the corpus
results = service.run_graph_query(
    query="""
    MATCH (t:Topic)<-[:COVERS]-(d:Document)
    RETURN t.name, COUNT(d) AS document_count
    ORDER BY document_count DESC
    LIMIT 20
    """
)
```
ChromaDB is file-based and requires no separate server:
```bash
export DOCUTHINKER_SYNC_VECTOR=true
export DOCUTHINKER_CHROMA_DIR=.chroma
export DOCUTHINKER_CHROMA_COLLECTION=docuthinker
```
Then use it from Python:

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Upsert multiple documents
documents = [
    {"text": "Document 1 content...", "id": "doc-1", "metadata": {"year": 2024}},
    {"text": "Document 2 content...", "id": "doc-2", "metadata": {"year": 2023}},
    {"text": "Document 3 content...", "id": "doc-3", "metadata": {"year": 2024}},
]
for doc in documents:
    service.upsert_vector_document(
        document=doc["text"],
        doc_id=doc["id"],
        metadata=doc["metadata"]
    )

# Search across all documents
results = service.query_vector_index(
    query="What are the latest trends in AI?",
    n_results=10
)
for result in results:
    print(f"Document: {result['id']}")
    print(f"Similarity Score: {result['distance']}")
    print(f"Snippet: {result['document'][:200]}...")
    print(f"Metadata: {result['metadata']}")
    print()
```
The CrewAI integration provides sophisticated multi-agent reasoning.
Agents are configured in `agents/crew_agents.py` and can be customized:

```python
from crewai import Agent, Crew, Process, Task

from ai_ml.providers.registry import LLMConfig, LLMProviderRegistry

registry = LLMProviderRegistry()

# Create a custom agent (search_tool / insights_tool are built as in the
# next example)
custom_agent = Agent(
    name="Data Analyst",
    role="Statistical data analyzer",
    goal="Extract quantitative insights and trends",
    backstory="You are a data scientist specializing in statistical analysis...",
    llm=registry.chat(LLMConfig(provider="openai", model="gpt-4o-mini")),
    tools=[search_tool, insights_tool],
    verbose=True
)

# Create a custom task
custom_task = Task(
    name="Statistical Analysis",
    description="Perform statistical analysis on the document data...",
    agent=custom_agent,
    expected_output="Statistical report with key metrics and trends"
)
```
Assemble the full crew with custom tools and models:

```python
from ai_ml.agents import build_document_crew
from ai_ml.providers.registry import LLMProviderRegistry
from ai_ml.tools import DocumentSearchTool, InsightsExtractionTool, build_vector_store

# Build tools (`document` and `chunks` come from your own ingestion step)
retriever = build_vector_store(document)
search_tool = DocumentSearchTool(retriever)
insights_tool = InsightsExtractionTool(chunks)

# Build crew with custom context
registry = LLMProviderRegistry()
crew = build_document_crew(
    registry,
    retriever_tool=search_tool,
    insights_tool=insights_tool,
    additional_context={
        "openai_model": "gpt-4o",  # Use a more powerful model
        "gemini_model": "gemini-1.5-pro",
        "claude_model": "claude-3-5-sonnet-20241022"
    }
)

# Run crew
result = crew.kickoff(inputs={
    "question": "Analyze the financial projections",
    "rag_overview": "...",
    "rag_topics": [...]
})
```
The main service facade for all document intelligence operations.
**`analyze_document(document, question=None, translate_lang='fr', metadata=None)`**

Run the full agentic pipeline with enrichments.

- Parameters: `document` (str) - document text to analyze; `question` (str, optional) - question for Q&A; `translate_lang` (str, optional) - target language code (default `'fr'`); `metadata` (dict, optional) - document metadata
- Returns: `dict` - complete analysis results

Example:

```python
results = service.analyze_document(
    document="Your text...",
    question="What are the risks?",
    translate_lang="es",
    metadata={"id": "doc-001"}
)
```

**`summarize(document, style=None)`**

Generate a narrative summary.

- Parameters: `document` (str) - document text; `style` (str, optional) - summary style hint
- Returns: `str` - summary text

**`bullet_summary(document)`**

Generate a bullet-point summary.

- Parameters: `document` (str) - document text
- Returns: `str` - bullet summary

**`extract_topics(document)`**

Extract main topics/themes.

- Parameters: `document` (str) - document text
- Returns: `list[str]` - list of topics

**`answer_question(document, question)`**

Answer a question about the document.

- Parameters: `document` (str) - document text; `question` (str) - question to answer
- Returns: `str` - answer

**`sentiment(document)`**

Analyze sentiment.

- Parameters: `document` (str) - document text
- Returns: `dict` - sentiment result with keys `label`, `confidence`, `rationale`

**`translate(document, target_lang)`**

Translate the document.

- Parameters: `document` (str) - document text; `target_lang` (str) - target language code
- Returns: `str | None` - translated text, or `None` if translation failed

**`semantic_search(document, query)`**

Perform semantic search.

- Parameters: `document` (str) - document text; `query` (str) - search query
- Returns: `list[dict]` - results with keys `source`, `snippet`, `score`, `id`

**`recommendations(document)`**

Generate actionable recommendations.

- Parameters: `document` (str) - document text
- Returns: `str` - recommendations

**`discussion_points(document)`**

Generate discussion prompts.

- Parameters: `document` (str) - document text
- Returns: `str` - discussion points

**`rewrite(document, tone='professional')`**

Rewrite the document in a specified tone.

- Parameters: `document` (str) - document text; `tone` (str, optional) - target tone (default `'professional'`)
- Returns: `str` - rewritten text

**`refine_summary(draft_summary, document)`**

Refine a draft summary.

- Parameters: `draft_summary` (str) - draft summary; `document` (str) - original document
- Returns: `str` - refined summary

**`sync_to_knowledge_graph(document, agentic_payload, metadata)`**

Sync a document to Neo4j.

- Parameters: `document` (str) - document text; `agentic_payload` (dict) - analysis results; `metadata` (dict) - document metadata
- Returns: `dict` - sync status

**`run_graph_query(query, params=None)`**

Execute a Cypher query.

- Parameters: `query` (str) - Cypher query; `params` (dict, optional) - query parameters
- Returns: `list[dict]` - query results

**`upsert_vector_document(document, metadata=None, doc_id=None)`**

Upsert a document to the vector store.

- Parameters: `document` (str) - document text; `metadata` (dict, optional) - document metadata; `doc_id` (str, optional) - document ID
- Returns: `dict` - upsert status

**`query_vector_index(query, n_results=None)`**

Query the vector store.

- Parameters: `query` (str) - search query; `n_results` (int, optional) - number of results
- Returns: `list[dict]` - search results

**`create_conversation_chain()`**

Create a conversation chain.

- Returns: `ConversationChain` - LangChain conversation chain

Settings are defined in `core/settings.py` and configurable via environment variables:
| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| **LLM Models** | | | |
| Analyst Model | `DOCUTHINKER_OPENAI_MODEL` | `gpt-4o-mini` | OpenAI model for analysis |
| Researcher Model | `DOCUTHINKER_GEMINI_MODEL` | `gemini-1.5-pro` | Google model for research |
| Reviewer Model | `DOCUTHINKER_CLAUDE_MODEL` | `claude-3-5-sonnet-20241022` | Anthropic model for review |
| Sentiment Model | `DOCUTHINKER_SENTIMENT_MODEL` | `claude-3-haiku-20240307` | Model for sentiment analysis |
| Q&A Model | `DOCUTHINKER_QA_MODEL` | Same as analyst | Model for Q&A |
| **Embeddings** | | | |
| Embedding Provider | `DOCUTHINKER_EMBEDDING_PROVIDER` | `huggingface` | Provider for embeddings |
| Embedding Model | `DOCUTHINKER_EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Embedding model name |
| **Chunking** | | | |
| Chunk Size | `DOCUTHINKER_CHUNK_SIZE` | `900` | Characters per chunk |
| Chunk Overlap | `DOCUTHINKER_CHUNK_OVERLAP` | `120` | Overlap between chunks |
| **RAG** | | | |
| Default Question | `DOCUTHINKER_RAG_QUESTION` | `"Provide a comprehensive intelligence brief..."` | Default RAG question |
| Bullet Style | `DOCUTHINKER_BULLET_STYLE` | `"Use concise bullet points..."` | Bullet summary style |
| **Neo4j** | | | |
| Enable Sync | `DOCUTHINKER_SYNC_GRAPH` | `false` | Enable Neo4j sync |
| URI | `DOCUTHINKER_NEO4J_URI` | `None` | Neo4j connection URI |
| User | `DOCUTHINKER_NEO4J_USER` | `None` | Neo4j username |
| Password | `DOCUTHINKER_NEO4J_PASSWORD` | `None` | Neo4j password |
| Database | `DOCUTHINKER_NEO4J_DATABASE` | `None` | Neo4j database name |
| **ChromaDB** | | | |
| Enable Sync | `DOCUTHINKER_SYNC_VECTOR` | `false` | Enable vector store sync |
| Directory | `DOCUTHINKER_CHROMA_DIR` | `None` | ChromaDB persist directory |
| Collection | `DOCUTHINKER_CHROMA_COLLECTION` | `docuthinker` | Collection name |
| Top K | `DOCUTHINKER_VECTOR_TOP_K` | `6` | Number of results |
| **Other** | | | |
| Knowledge Base Path | `DOCUTHINKER_KB_PATH` | `None` | Path to knowledge base |
| Fallback Summarizer | `DOCUTHINKER_FALLBACK_SUMMARIZER` | `facebook/bart-large-cnn` | HuggingFace summarizer |
Each agent model is configured as a `ProviderSpec`:

```python
ProviderSpec(
    provider="openai",   # "openai", "anthropic", "google"
    model="gpt-4o-mini",
    temperature=0.15,
    max_tokens=900,
    extra={}             # Provider-specific kwargs
)
```
Default Helsinki-NLP models by language:
| Language | Code | Model |
|---|---|---|
| French | `fr` | `Helsinki-NLP/opus-mt-en-fr` |
| German | `de` | `Helsinki-NLP/opus-mt-en-de` |
| Spanish | `es` | `Helsinki-NLP/opus-mt-en-es` |
| Italian | `it` | `Helsinki-NLP/opus-mt-en-it` |
| Portuguese | `pt` | `Helsinki-NLP/opus-mt-en-pt` |
| Chinese | `zh` | `Helsinki-NLP/opus-mt-en-zh` |
| Japanese | `ja` | `Helsinki-NLP/opus-mt-en-ja` |
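These are standard MarianMT checkpoints. As an illustration (not necessarily the exact loading path in `models/hf_model.py`), any of them can be exercised directly with the `transformers` pipeline:

```python
from transformers import pipeline

# Direct use of a Helsinki-NLP checkpoint; the in-package translator
# may cache or wrap this differently.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")
result = translator("The quarterly results exceeded expectations.")
print(result[0]["translation_text"])
```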
Typical performance metrics (M1 MacBook Pro, 16GB RAM):
| Operation | Input Size | Time | Notes |
|---|---|---|---|
| Full Analysis | 5K tokens | ~15-25s | With CrewAI collaboration |
| Summary Only | 5K tokens | ~3-5s | Single LLM call |
| Topic Extraction | 5K tokens | ~2-4s | Single LLM call |
| Semantic Search | 10K docs | ~100-200ms | FAISS in-memory |
| Vector Upsert | 1 doc | ~50-100ms | ChromaDB persist |
| Translation | 5K tokens | ~5-10s | HuggingFace model |
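These figures vary with hardware, network latency, and the models selected. A quick sketch for measuring on your own setup (the sample path is a placeholder):

```python
import time

from ai_ml.services import get_document_service

service = get_document_service()
with open("documents/sample.txt") as f:  # placeholder path
    document = f.read()

start = time.perf_counter()
service.summarize(document)
print(f"summarize took {time.perf_counter() - start:.2f}s")
```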
To reduce latency, prefer lighter models and smaller chunks:

```bash
# Faster models for lower latency
export DOCUTHINKER_OPENAI_MODEL=gpt-4o-mini               # Instead of gpt-4o
export DOCUTHINKER_CLAUDE_MODEL=claude-3-haiku-20240307   # Instead of sonnet

# Smaller chunks = faster embedding, less context
export DOCUTHINKER_CHUNK_SIZE=600
export DOCUTHINKER_CHUNK_OVERLAP=60
```
The service uses a singleton pattern, and the provider registry lazily loads and caches LLM clients and embedding models, so repeated calls reuse warm instances.
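Because `get_document_service()` returns a shared singleton, the warm-up cost is paid once per process:

```python
from ai_ml.services import get_document_service

# Both names point at the same instance, so models and clients
# initialized on first use are reused by later calls.
a = get_document_service()
b = get_document_service()
assert a is b
```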
For batch workloads, dispatch documents concurrently:

```python
# Use asyncio for concurrent document processing
import asyncio

from ai_ml.services import get_document_service

async def analyze_batch(documents):
    service = get_document_service()
    tasks = [
        asyncio.to_thread(service.analyze_document, doc)
        for doc in documents
    ]
    return await asyncio.gather(*tasks)

# Process documents concurrently
results = asyncio.run(analyze_batch(documents))
```
Convert HuggingFace models to ONNX for faster inference:
```bash
python -m ai_ml.convert_to_onnx \
  --model_name sentence-transformers/all-MiniLM-L6-v2 \
  --output_dir models/onnx/
```
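Once exported, the model can be served through ONNX Runtime. A sketch using Hugging Face Optimum (assumes `optimum[onnxruntime]` is installed; the helpers in `models/onnx_helper.py` may organize this differently):

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer

model_id = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True converts the checkpoint to ONNX on the fly.
model = ORTModelForFeatureExtraction.from_pretrained(model_id, export=True)

inputs = tokenizer("DocuThinker test sentence", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```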
A minimal Dockerfile:

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Expose ports
EXPOSE 8000

# Set environment variables
ENV PYTHONUNBUFFERED=1

# Run FastAPI server
CMD ["uvicorn", "ai_ml.server:app", "--host", "0.0.0.0", "--port", "8000"]
```
A docker-compose stack with Neo4j:

```yaml
version: '3.8'

services:
  docuthinker-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - DOCUTHINKER_SYNC_GRAPH=true
      - DOCUTHINKER_NEO4J_URI=bolt://neo4j:7687
      - DOCUTHINKER_NEO4J_USER=neo4j
      - DOCUTHINKER_NEO4J_PASSWORD=${NEO4J_PASSWORD}
      - DOCUTHINKER_SYNC_VECTOR=true
      - DOCUTHINKER_CHROMA_DIR=/data/chroma
    volumes:
      - chroma-data:/data/chroma
    depends_on:
      - neo4j

  neo4j:
    image: neo4j:5.11
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      - NEO4J_AUTH=neo4j/${NEO4J_PASSWORD}
    volumes:
      - neo4j-data:/data

volumes:
  chroma-data:
  neo4j-data:
```
Then launch the stack:

```bash
# Set environment variables
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GOOGLE_API_KEY=...
export NEO4J_PASSWORD=your-password

# Build and start
docker-compose up -d

# View logs
docker-compose logs -f docuthinker-api

# Stop
docker-compose down
```
To deploy on Heroku:

```bash
# Install Heroku CLI
brew install heroku/brew/heroku

# Login
heroku login

# Create app
heroku create docuthinker-api

# Set environment variables
heroku config:set OPENAI_API_KEY=sk-...
heroku config:set ANTHROPIC_API_KEY=sk-ant-...
heroku config:set GOOGLE_API_KEY=...

# Deploy
git push heroku main

# View logs
heroku logs --tail
```
Use AWS Lambda + API Gateway for serverless deployment:
```python
# lambda_handler.py
import json

from ai_ml.services import get_document_service

service = get_document_service()

def lambda_handler(event, context):
    body = json.loads(event['body'])
    results = service.analyze_document(
        document=body['document'],
        question=body.get('question'),
        translate_lang=body.get('translate_lang', 'fr'),
        metadata=body.get('metadata')
    )
    return {
        'statusCode': 200,
        'body': json.dumps(results),
        'headers': {'Content-Type': 'application/json'}
    }
```
Package and deploy using AWS SAM or Serverless Framework.
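Once deployed, you can smoke-test the function from Python with boto3 (`docuthinker-analyze` is a placeholder for whatever function name your SAM/Serverless config creates):

```python
import json

import boto3

client = boto3.client("lambda")
response = client.invoke(
    FunctionName="docuthinker-analyze",  # placeholder name
    # Mirrors the event shape lambda_handler expects: a JSON string under "body".
    Payload=json.dumps({"body": json.dumps({
        "document": "Artificial intelligence is transforming industries...",
        "question": "What industries are affected?",
    })}),
)
print(json.loads(response["Payload"].read()))
```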
Error:
```
MissingAPIKeyError: OPENAI_API_KEY environment variable is required
```
Solution:
```bash
export OPENAI_API_KEY=sk-...
# Or add to .env file
```
Error:
```
MissingDependencyError: Install langchain-anthropic to use the Anthropic provider
```
Solution:
```bash
pip install langchain-anthropic
```
Error:
```
Neo4jNotConfigured: Neo4j credentials are not configured
```
Solution:
```bash
# Ensure Neo4j is running
docker ps | grep neo4j

# Check connection settings
export DOCUTHINKER_NEO4J_URI=bolt://localhost:7687
export DOCUTHINKER_NEO4J_USER=neo4j
export DOCUTHINKER_NEO4J_PASSWORD=your-password
```
Error:
```
ChromaNotConfigured: Chroma persist directory is not configured
```
Solution:
```bash
# Create directory
mkdir -p .chroma

# Set environment variable
export DOCUTHINKER_CHROMA_DIR=.chroma
```
Error:
```
TimeoutError: Model loading exceeded timeout
```
Solution:
```bash
# Pre-download models
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
```
Error:
```
MemoryError: Unable to allocate array
```
Solution:
```bash
# Reduce chunk size
export DOCUTHINKER_CHUNK_SIZE=500
```

Or split the document before processing:

```python
def split_large_document(text, max_size=50000):
    return [text[i:i + max_size] for i in range(0, len(text), max_size)]

parts = split_large_document(large_document)
results = [service.analyze_document(part) for part in parts]
```
Enable verbose logging:
```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('ai_ml')
logger.setLevel(logging.DEBUG)
```
Quick health check:

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Test basic functionality
try:
    result = service.summarize("This is a test document.")
    print(f"✅ Service operational: {result[:50]}...")
except Exception as e:
    print(f"❌ Service error: {e}")
```
To set up a development environment:

```bash
# Clone repository
git clone https://github.com/hoangsonww/DocuThinker-AI-App.git
cd DocuThinker-AI-App/ai_ml

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install development dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt  # If available

# Install pre-commit hooks
pre-commit install
```
The project follows PEP 8 with Black formatting:
```bash
# Format code
black ai_ml/

# Check formatting
black --check ai_ml/

# Lint
flake8 ai_ml/

# Type checking
mypy ai_ml/
```
- `core/` - Configuration and settings
- `services/` - High-level service facades
- `pipelines/` - LangGraph workflows
- `agents/` - CrewAI agent definitions
- `providers/` - LLM provider abstractions
- `tools/` - Reusable tools for agents
- `graph/` - Knowledge graph clients
- `vectorstores/` - Vector store clients
- `processing/` - Document processing utilities
- `extended_features/` - Additional features
- `models/` - Model loading utilities

To add a new feature, wire it through each layer:

```python
# services/orchestrator.py
def new_feature(self, document: str) -> str:
    """New feature description."""
    prompt = ChatPromptTemplate.from_template("...")
    llm = self._resolve_llm(self.settings.agent_models["analyst"])
    chain = prompt | llm | StrOutputParser()
    return chain.invoke({"document": document})
```

```python
# backend.py
def new_feature(document: str) -> str:
    return SERVICE.new_feature(document)
```

```python
# mcp/server.py
@app.tool()
def new_feature_tool(document: str) -> str:
    """Tool description for MCP."""
    return SERVICE.new_feature(document)
```

```python
# main.py
print("\n=== New Feature ===")
print(backend.new_feature(document))
```
Run the test suite:

```bash
# Run all tests
pytest ai_ml/tests/

# Run specific test
pytest ai_ml/tests/test_services.py

# Run with coverage
pytest --cov=ai_ml --cov-report=html

# Test with live APIs (requires API keys)
pytest ai_ml/tests/integration/
```
Unit tests can stub the LLM with a fake chat model:

```python
from langchain_core.language_models import FakeListChatModel

from ai_ml.providers.registry import LLMProviderRegistry
from ai_ml.services import DocumentIntelligenceService

# Create mock LLM
fake_llm = FakeListChatModel(responses=["Test summary"])

# Create mock registry
class MockRegistry(LLMProviderRegistry):
    def chat(self, config):
        return fake_llm

# Test with mock
service = DocumentIntelligenceService(registry=MockRegistry())
result = service.summarize("Test document")
assert result == "Test summary"
```
Integration test against the live pipeline (requires API keys):

```python
from ai_ml.services import get_document_service

service = get_document_service()

# Test full pipeline
results = service.analyze_document(
    document="Artificial intelligence is rapidly transforming industries...",
    question="What industries are affected?"
)
assert "summary" in results
assert "topics" in results
assert "qa" in results
assert results["qa"] is not None
```
We welcome contributions! Please follow these guidelines:
1. Create a feature branch: `git checkout -b feature/amazing-feature`
2. Commit your changes: `git commit -m 'Add amazing feature'`
3. Push the branch: `git push origin feature/amazing-feature`
4. Open a pull request

This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by Son Nguyen