RAG Chatbot: Multimodal Document Q&A
Build an AI-powered chatbot that can chat with your documents, images, and videos.
Time: 60 minutes | Level: Advanced
What You’ll Build
A multimodal chatbot that can:
- Upload and chat with PDFs, text files, images, and videos
- Search documents and provide context-aware answers
- Answer general questions using web search
- Understand images and videos using AI vision
- Route different question types to specialized handlers
Why This Example?
This example demonstrates advanced Jac concepts:
| Concept | How It’s Used |
|---|---|
| Object-Spatial Programming | Node-walker architecture for clean organization |
| byLLM | AI classifies and routes user queries automatically |
| Model Context Protocol (MCP) | Build modular, reusable AI tools |
| Multimodal AI | Work with text, images, and videos together |
Prerequisites & Key Concepts
- Completed the AI Integration tutorials
- Familiar with OSP (node-walker architecture)
| Concept | Where to Learn |
|---|---|
| `by llm()` | byLLM Quickstart |
| OSP architecture | OSP Tutorial |
| Structured outputs | Structured Outputs |
| Tool calling / MCP | Agentic AI |
Architecture Overview
graph TD
    Client["Client<br/>Streamlit"] --> Router["Router<br/>(AI-based)"]
    Router --> Chat["Chat Node<br/>(Handler)"]
    Router --> MCP["MCP Server<br/>(Tools)"]
    MCP --> Chroma["ChromaDB<br/>(Docs)"]
    MCP --> Web["Web Search<br/>(Serper)"]

Project Structure
rag-chatbot/
├── client.jac         # Streamlit web interface
├── server.jac         # Main application (OSP structure)
├── server.impl.jac    # Implementation details
├── mcp_server.jac     # Tool server (doc search, web search)
├── mcp_client.jac     # Interface to tool server
└── tools.jac          # Document processing logic

Key Components
1. Chat Nodes (Query Types)
Define different types of queries the system handles:
node Router {}

"""Chat about uploaded documents."""
node DocumentChat {}

"""Answer general knowledge questions."""
node GeneralChat {}

"""Analyze and discuss images."""
node ImageChat {}

"""Analyze and discuss videos."""
node VideoChat {}

2. Intelligent Routing
The AI automatically routes queries to the right handler:
import from byllm.lib { Model }
glob llm = Model(model_name="gpt-4o-mini");
enum QueryType {
    DOCUMENT = "document",
    GENERAL = "general",
    IMAGE = "image",
    VIDEO = "video"
}

"""Classify the user's query to determine the best handler."""
def classify_query(query: str, has_documents: bool) -> QueryType by llm();

3. Walker-Based Interaction
walker interact {
    has query: str;
    has session_id: str;

    can route with Router entry {
        # Get session context
        session = get_session(self.session_id);

        # AI classifies the query
        query_type = classify_query(
            self.query,
            has_documents=len(session.documents) > 0
        );

        # Route to appropriate handler
        match query_type {
            case QueryType.DOCUMENT:
                visit [-->](?:DocumentChat);
            case QueryType.GENERAL:
                visit [-->](?:GeneralChat);
            case QueryType.IMAGE:
                visit [-->](?:ImageChat);
            case QueryType.VIDEO:
                visit [-->](?:VideoChat);
        }
    }

    can handle_document with DocumentChat entry {
        # Search documents for context
        context = search_documents(self.query, self.session_id);

        # Generate answer with RAG
        answer = generate_rag_response(self.query, context);
        report {"answer": answer, "sources": context.sources};
    }

    can handle_general with GeneralChat entry {
        # Use web search for current information
        search_results = web_search(self.query);

        # Generate answer with web context
        answer = generate_web_response(self.query, search_results);
        report {"answer": answer};
    }
}

4. Document Processing (tools.jac)
import from langchain_chroma { Chroma }
import from langchain_openai { OpenAIEmbeddings }

def process_document(file_path: str, session_id: str) -> None {
    # Load document
    content = load_file(file_path);

    # Split into chunks
    chunks = split_text(content, chunk_size=1000);

    # Store in vector database
    embeddings = OpenAIEmbeddings();
    vectorstore = Chroma(
        collection_name=session_id,
        embedding_function=embeddings
    );

    vectorstore.add_texts(chunks);
}

def search_documents(query: str, session_id: str) -> list {
    vectorstore = get_vectorstore(session_id);
    results = vectorstore.similarity_search(query, k=5);
    return results;
}

5. MCP Tool Server
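The `web_search` tool in this section hands the Serper response to a `format_web_results` helper that the page does not show. The Python sketch below shows one plausible way to turn such a response into plain text for the LLM. It assumes the hits live under an `organic` key with `title`, `link`, and `snippet` fields; treat that shape, the `max_results` parameter, and the output layout as assumptions rather than the project's actual behavior.

```python
from typing import Any


def format_web_results(payload: dict[str, Any], max_results: int = 5) -> str:
    """Render search hits as a numbered plain-text list usable as LLM context.

    Assumes hits are under payload["organic"], each with optional
    "title", "link", and "snippet" fields (an assumed response shape).
    """
    hits = payload.get("organic", [])[:max_results]
    if not hits:
        return "No results found."

    lines = []
    for index, hit in enumerate(hits, start=1):
        title = hit.get("title", "(untitled)")
        link = hit.get("link", "")
        snippet = hit.get("snippet", "")
        lines.append(f"{index}. {title}\n   {link}\n   {snippet}")
    return "\n".join(lines)
```

Returning a short fixed message for an empty result set gives the model something explicit to work with instead of an empty string.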
# mcp_server.jac
import requests;
import os;

"""Search uploaded documents for relevant information."""
@tool
def document_search(query: str, session_id: str) -> str {
    results = search_documents(query, session_id);
    return format_results(results);
}

"""Search the web for current information."""
@tool
def web_search(query: str) -> str {
    response = requests.post(
        "https://google.serper.dev/search",
        headers={"X-API-KEY": os.getenv("SERPER_API_KEY")},
        json={"q": query}
    );

    return format_web_results(response.json());
}

Running the Application
Prerequisites
pip install jaclang jac-scale jac-streamlit byllm \
    langchain langchain-community langchain-openai langchain-chroma \
    chromadb openai pypdf tiktoken requests "mcp[cli]" anyio

Set API keys:
export OPENAI_API_KEY=your-key
export SERPER_API_KEY=your-key  # Free at serper.dev

Start the Services
Terminal 1 - Tool server:
jac mcp_server.jac

Terminal 2 - Main application:
jac start server.jac

Terminal 3 - Web interface:
jac streamlit client.jac

Open http://localhost:8501 in your browser.
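If one of the services fails at startup, an unset API key is worth ruling out first. The following Python sketch checks for the two environment variables named in the setup step; the function name and the idea of running it as a separate preflight step are hypothetical and not part of the project.

```python
import os
from typing import Mapping, Optional

# The two keys exported in the setup step.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SERPER_API_KEY")


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variables that are unset or blank.

    Reads os.environ by default; a mapping can be passed in for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_keys()` in the shell where the services will run returns an empty list when both keys are exported.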
Testing the Chatbot
- Register and log in using the web interface
- Upload files: PDFs, text files, images, or videos
- Ask questions:
  - “What does the contract say about termination?” (document)
  - “What’s the weather in Tokyo?” (web search)
  - “What’s in this image?” (vision)
  - “Summarize this video” (video analysis)
API Endpoints
| Endpoint | Description |
|---|---|
| `POST /user/register` | Create account |
| `POST /user/login` | Get access token |
| `POST /walker/upload_file` | Upload documents |
| `POST /walker/interact` | Chat with the AI |
Full API docs at http://localhost:8000/docs
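The endpoints can also be called without the Streamlit client. The Python sketch below builds JSON POST requests with the standard library. The `query` and `session_id` fields come from the `interact` walker shown earlier; the bearer-token header and the shape of the login response are assumptions, so check the generated API docs for the exact schema before relying on them.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8000"


def build_request(
    path: str, payload: dict[str, Any], token: Optional[str] = None
) -> urllib.request.Request:
    """Build a JSON POST request for one of the endpoints in the table."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the access token is sent as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def post_json(
    path: str, payload: dict[str, Any], token: Optional[str] = None
) -> dict[str, Any]:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload, token)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, a chat turn would then look like `post_json("/walker/interact", {"query": "Summarize my contract", "session_id": "demo"}, token=access_token)`, where `access_token` is whatever the login endpoint returns.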
Extension Ideas
- New file types - Audio, spreadsheets, presentations
- Additional tools - Weather, databases, APIs
- Hybrid search - Combine keyword and semantic search
- Memory - Long-term conversation memory across sessions
- Custom models - Specialized LLMs for different domains
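For the hybrid search idea, one widely used way to merge a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). The Python sketch below is a generic illustration and not part of this project: it takes ranked lists of document ids, however they were produced, and the constant `k=60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), so items ranked well by several retrievers rise.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by id to keep the output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Because RRF uses only ranks, it does not require the keyword and vector scores to be on comparable scales.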
Full Source Code
Key Takeaways
- OSP organizes complexity - Nodes for query types, walkers for actions
- AI-based routing - Let the LLM decide which handler to use
- MCP for modularity - Tools are independent, reusable services
- Vector search for RAG - Semantic search finds relevant context
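To make the routing takeaway concrete, here is a plain-Python sketch of the same dispatch idea with two guards added: an unrecognized label falls back to the general handler, and a document query is downgraded when nothing has been uploaded. In the Jac version, `by llm()` returns a `QueryType` member directly, so the string parsing here only stands in for that step. The handlers, the fallback policy, and the function names are hypothetical.

```python
from enum import Enum
from typing import Callable


class QueryType(Enum):
    DOCUMENT = "document"
    GENERAL = "general"
    IMAGE = "image"
    VIDEO = "video"


def resolve_query_type(label: str, has_documents: bool) -> QueryType:
    """Map a raw classifier label to a QueryType with conservative fallbacks."""
    try:
        query_type = QueryType(label.strip().lower())
    except ValueError:
        # Unknown label: fall back to the general handler (assumed policy).
        return QueryType.GENERAL
    # A document question cannot be answered if nothing was uploaded.
    if query_type is QueryType.DOCUMENT and not has_documents:
        return QueryType.GENERAL
    return query_type


def make_handler(name: str) -> Callable[[str], str]:
    """Create a placeholder handler standing in for a walker ability."""
    return lambda query: f"[{name}] {query}"


# Dispatch table: one hypothetical handler per query type.
HANDLERS: dict[QueryType, Callable[[str], str]] = {
    query_type: make_handler(query_type.value) for query_type in QueryType
}


def route(query: str, label: str, has_documents: bool) -> str:
    """Pick the handler for the resolved query type and run it."""
    return HANDLERS[resolve_query_type(label, has_documents)](query)
```

Keeping the guard separate from the dispatch table means the fallback policy can change without touching any handler.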
Next Examples
- EmailBuddy - Agentic email assistant
- RPG Generator - AI-generated game content