## Overview

The Hybrid RAG module implements a hybrid search pipeline that combines lexical search (BM25) with semantic search (ChromaDB) using LangChain's `EnsembleRetriever`. This approach balances exact keyword matching with semantic similarity for improved retrieval.

Module: `src.rag.hybrid`
Source: `src/rag/hybrid.py`
## Configuration

### Default Models

- LLM: `gpt-4o`
- Embeddings: `text-embedding-3-small`

### Retriever Configuration

- BM25 retriever: top 5 documents
- Semantic retriever: top 5 documents
- Ensemble weights: 0.5 (BM25) / 0.5 (semantic)
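The module's documented defaults (`gpt-4o`, `text-embedding-3-small`, top 5 documents per retriever, equal 0.5/0.5 weighting) could be captured as module-level constants. A sketch with illustrative names — the identifiers below are assumptions, only the values come from this page:

```python
# Illustrative configuration constants for the hybrid pipeline.
# Names are assumptions; only the values are documented on this page.
DEFAULT_LLM_MODEL = "gpt-4o"
DEFAULT_EMBEDDING_MODEL = "text-embedding-3-small"

TOP_K_BM25 = 5        # documents returned by the BM25 retriever
TOP_K_SEMANTIC = 5    # documents returned by the semantic retriever
ENSEMBLE_WEIGHTS = [0.5, 0.5]  # [bm25_weight, semantic_weight]
```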
## Document Loading

Documents are loaded from `data/chunks/chunks_final.json`.
## Prompt Template

Uses the same medical-focused prompt template as Simple RAG.

## Functions
### load_documents

Returns: a list of LangChain `Document` objects with content and metadata
### format_docs

Parameters:
- A list of retrieved LangChain `Document` objects

Returns: a formatted string containing the content of the documents
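Since the pipeline formats documents with their source and page metadata, `format_docs` plausibly looks like the following sketch. The exact separator and the `[Source: …, page …]` label format are assumptions, not the module's confirmed output:

```python
def format_docs(docs) -> str:
    """Concatenate retrieved documents into one context string.

    Each chunk is prefixed with its source and page metadata.
    The label format here is an assumption for illustration; the
    real module may format chunks differently.
    """
    return "\n\n".join(
        f"[Source: {doc.metadata.get('source', 'unknown')}, "
        f"page {doc.metadata.get('page', '?')}]\n{doc.page_content}"
        for doc in docs
    )
```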
### process_hybrid_query

Parameters:
- The user's question
- A custom language model to use; defaults to None (uses the default `gpt-4o`)

Returns a dictionary containing:
- `answer` (str): The generated answer
- `contexts` (List[str]): List of retrieved document contents
- `retrieved_documents` (List[Document]): Full Document objects
- `metrics` (dict): Token usage and cost metrics
  - `input_tokens` (int): Number of input tokens
  - `output_tokens` (int): Number of output tokens
  - `total_tokens` (int): Total tokens used
  - `usage_source` (str): Source of usage data
  - `cost` (float): Total cost in USD
  - `cost_source` (str): Source of cost calculation
### query_for_evaluation

Parameters:
- The question to process
- Model name to use; if None, uses the default "gpt-4o"
- Pre-configured language model; takes precedence over `llm_model`

Returns a dictionary containing:
- `question` (str): The original question
- `answer` (str): The generated answer
- `contexts` (List[str]): Retrieved document contents
- `source_documents` (List[Document]): Full retrieved documents
- `metadata` (dict): Comprehensive metadata including:
  - `num_contexts` (int): Number of retrieved contexts
  - `retrieval_method` (str): "hybrid_bm25_semantic"
  - `ensemble_weights` (List[float]): [bm25_weight, semantic_weight]
  - `llm_model` (str): Model name used
  - `provider` (str): Provider (e.g., "openai")
  - `model_id` (str): Full model identifier
  - `embedding_model` (str): "text-embedding-3-small"
  - `execution_time` (float): Total execution time in seconds
  - `input_tokens` (int): Input tokens used
  - `output_tokens` (int): Output tokens generated
  - `total_cost` (float): Total cost in USD
  - `tokens_used` (int): Total tokens (input + output)
  - `usage_source` (str): Source of usage metrics
  - `cost_source` (str): Source of cost calculation
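For benchmark runs, the documented metadata keys can be flattened into a per-question metrics row. A minimal sketch of such a consumer (the helper name `eval_metrics_row` is illustrative, not part of the module):

```python
def eval_metrics_row(result: dict) -> dict:
    """Flatten a query_for_evaluation result into one metrics row.

    Uses only the return keys documented for query_for_evaluation;
    this helper itself is illustrative and not part of the module.
    """
    md = result["metadata"]
    return {
        "question": result["question"],
        "retrieval_method": md["retrieval_method"],
        "num_contexts": md["num_contexts"],
        "tokens_used": md["tokens_used"],
        "total_cost": md["total_cost"],
    }
```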
## Usage Example
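This export lost the snippet that originally accompanied this heading. A hedged reconstruction of a typical call, using the module path and return keys documented above (the question text and the summary helper are illustrative only):

```python
def summarize_hybrid_result(result: dict) -> str:
    """One-line summary of a process_hybrid_query result.

    Illustrative helper; uses only the return keys documented above.
    """
    m = result["metrics"]
    return (f"{len(result['contexts'])} contexts | "
            f"{m['total_tokens']} tokens | ${m['cost']:.4f}")

# Typical call (requires the project environment and an OpenAI API key):
# from src.rag.hybrid import process_hybrid_query
# result = process_hybrid_query("What are the risk factors for preeclampsia?")
# print(result["answer"])
# print(summarize_hybrid_result(result))
```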
## Pipeline Flow
- BM25 Retrieval: Retrieves top 5 documents using lexical/keyword matching
- Semantic Retrieval: Retrieves top 5 documents using vector similarity
- Ensemble Fusion: Combines results from both retrievers using weighted scores
- Format: Formats documents with source and page metadata
- Generate: Uses the LLM to generate an answer based on the combined context
- Track: Captures token usage and cost metrics
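The fusion step above is, per LangChain's documentation, weighted Reciprocal Rank Fusion (RRF): each document's score is the sum over result lists of `weight / (rank + c)`. A self-contained sketch of that step (`c=60` mirrors LangChain's default constant; the function name is illustrative):

```python
def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked result lists with weighted Reciprocal Rank Fusion.

    Each document's score is sum(weight / (rank + c)) over the lists
    it appears in, with rank starting at 1; c=60 mirrors LangChain's
    default constant. Returns document ids sorted by fused score.
    """
    scores = {}
    for ranked, w in zip(rankings, weights):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (rank + c)
    return sorted(scores, key=scores.get, reverse=True)

# Documents seen by both retrievers (d1, d3) outrank single-list hits.
bm25_hits = ["d1", "d2", "d3"]
semantic_hits = ["d3", "d4", "d1"]
fused = weighted_rrf([bm25_hits, semantic_hits], weights=[0.5, 0.5])
```

This is why the hybrid pipeline improves recall: a document only needs to rank well in one of the two lists to survive fusion, while documents found by both retrievers are promoted.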
## Key Features
- Combines lexical (BM25) and semantic search
- Equal weighting (0.5/0.5) between both retrieval methods
- Better handling of exact keyword matches
- Improved recall compared to semantic-only search
- Automatic cost and token tracking
- Support for custom LLMs
