Overview
The HyDE (Hypothetical Document Embeddings) RAG module implements a two-stage RAG pipeline: it first generates a hypothetical document that would ideally answer the user's query, then uses that document for semantic search. This approach can improve retrieval accuracy because the search runs over detailed generated content rather than a short query.

Module: src.rag.hyde
Source: src/rag/hyde.py
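The core idea can be shown with a toy, self-contained sketch: rank documents by similarity to a generated hypothetical passage instead of the raw query. The stub LLM and the word-overlap similarity below are illustrative stand-ins, not the module's actual embedding search.

```python
# Minimal sketch of the HyDE idea: search with a generated hypothetical
# document rather than the short query. All components are toy stand-ins.

def jaccard(a: str, b: str) -> float:
    """Toy stand-in for embedding cosine similarity (word overlap)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def fake_llm(query: str) -> str:
    """Stub for the HyDE generation call; a real LLM writes this text."""
    return ("Preeclampsia is diagnosed by new onset hypertension and "
            "proteinuria after 20 weeks of gestation.")

corpus = [
    "Preeclampsia diagnosis requires hypertension and proteinuria after 20 weeks gestation.",
    "Routine prenatal visits include weight and blood pressure checks.",
]

query = "preeclampsia criteria"
hypothetical = fake_llm(query)

# Rank corpus documents against the detailed hypothetical document,
# not against the two-word query.
best = max(corpus, key=lambda d: jaccard(hypothetical, d))
```

The hypothetical passage shares far more vocabulary with the relevant document than the query does, which is exactly the gap HyDE is designed to bridge.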
Configuration
Default Models
Vector Store
Prompt Templates
HyDE Document Generation Prompt
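The actual prompt text is not reproduced on this page; the template below is purely illustrative of the shape such a HyDE generation prompt typically takes (the wording and variable name are assumptions).

```python
# Illustrative HyDE prompt template -- NOT the module's real prompt text.
HYDE_PROMPT_TEMPLATE = (
    "You are a medical expert in obstetrics. Write a detailed, factual "
    "passage that would fully answer the following question, as if it "
    "were taken from a clinical reference.\n\n"
    "Question: {question}\n\n"
    "Passage:"
)

prompt = HYDE_PROMPT_TEMPLATE.format(question="What defines severe preeclampsia?")
```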
Answer Generation Prompt
Uses the standard medical expert prompt (same as Simple RAG).
Functions
generate_hypothetical_document
Parameters:
- The user's question

Returns a dictionary containing:
- document (str): The generated hypothetical document
- input_tokens (int): Input tokens used
- output_tokens (int): Output tokens generated
- total_tokens (int): Total tokens
- usage_source (str): Source of usage data
- cost (float): Cost in USD
- cost_source (str): Source of cost calculation
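A hedged sketch of this function's shape, based only on the return fields documented above; the stub LLM call and the per-token prices are placeholders, not the module's real values.

```python
# Sketch of generate_hypothetical_document's return shape.
# stub_llm and the prices are illustrative assumptions.

def stub_llm(prompt: str) -> tuple[str, int, int]:
    """Stand-in for the LLM call; returns (text, input_tokens, output_tokens)."""
    doc = "A detailed clinical passage that would answer: " + prompt
    return doc, len(prompt.split()), len(doc.split())

def generate_hypothetical_document(question: str) -> dict:
    document, tin, tout = stub_llm(question)
    cost = tin * 0.5e-6 + tout * 1.5e-6  # placeholder per-token prices
    return {
        "document": document,
        "input_tokens": tin,
        "output_tokens": tout,
        "total_tokens": tin + tout,
        "usage_source": "stub",
        "cost": cost,
        "cost_source": "estimated",
    }

result = generate_hypothetical_document("What are the signs of preeclampsia?")
```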
format_docs
Parameters:
- A list of retrieved LangChain Document objects

Returns:
- A formatted string containing the content of the documents
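A plausible implementation is a simple join over each retrieved document's `page_content`; the `Document` class below is a minimal stand-in for `langchain_core.documents.Document`, so the snippet is self-contained.

```python
# Sketch of format_docs: join retrieved contents with blank-line separators.
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs(docs: list[Document]) -> str:
    return "\n\n".join(d.page_content for d in docs)

docs = [Document("First passage."), Document("Second passage.")]
formatted = format_docs(docs)
```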
process_hyde_query
Parameters:
- The user's question
- Custom model for hypothetical document generation
- Custom model for answer generation

Returns a dictionary containing:
- answer (str): The final generated answer
- contexts (List[str]): Retrieved document contents
- hypothetical_document (str): The generated hypothetical document
- hyde_metrics (dict): Metrics for HyDE generation: input_tokens (int), output_tokens (int), cost (float), usage_source (str), cost_source (str)
- answer_metrics (dict): Metrics for answer generation: input_tokens (int), output_tokens (int), cost (float), usage_source (str), cost_source (str)
- total_cost (float): Combined cost
- total_input_tokens (int): Combined input tokens
- total_output_tokens (int): Combined output tokens
- usage_sources (List[str]): Sources of usage data
- cost_sources (List[str]): Sources of cost calculations
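The combined fields are presumably simple aggregates of the two per-stage metric dictionaries; the sketch below shows that aggregation with fabricated stage values (the numbers are placeholders, not real measurements).

```python
# How the combined fields could be derived from the two stages' metrics.
# Stage values below are fabricated for illustration.
hyde_metrics = {"input_tokens": 120, "output_tokens": 200, "cost": 0.0004,
                "usage_source": "api", "cost_source": "calculated"}
answer_metrics = {"input_tokens": 900, "output_tokens": 150, "cost": 0.0060,
                  "usage_source": "api", "cost_source": "calculated"}

result = {
    "hyde_metrics": hyde_metrics,
    "answer_metrics": answer_metrics,
    "total_cost": hyde_metrics["cost"] + answer_metrics["cost"],
    "total_input_tokens": hyde_metrics["input_tokens"] + answer_metrics["input_tokens"],
    "total_output_tokens": hyde_metrics["output_tokens"] + answer_metrics["output_tokens"],
    "usage_sources": [hyde_metrics["usage_source"], answer_metrics["usage_source"]],
    "cost_sources": [hyde_metrics["cost_source"], answer_metrics["cost_source"]],
}
```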
query_for_evaluation
Parameters:
- The question to process
- hyde_model: The name of the LLM model to use for HyDE generation. Defaults to "gpt-3.5-turbo"
- answer_model: The name of the LLM model to use for answer generation. Defaults to "gpt-4o"
- Pre-configured LLM for HyDE. Takes precedence over hyde_model
- Pre-configured LLM for answer. Takes precedence over answer_model

Returns a dictionary containing:
- question (str): Original question
- answer (str): Generated answer
- contexts (List[str]): Retrieved document contents
- metadata (dict): Comprehensive metadata including:
  - execution_time (float): Total execution time in seconds
  - input_tokens (int): Total input tokens (HyDE + answer)
  - output_tokens (int): Total output tokens (HyDE + answer)
  - total_cost (float): Total cost in USD
  - retrieval_method (str): "hyde"
  - llm_hyde_model (str): Model used for HyDE generation
  - llm_answer_model (str): Model used for answer generation
  - hyde_provider (str): Provider for the HyDE model
  - answer_provider (str): Provider for the answer model
  - hyde_model_id (str): Full HyDE model ID
  - answer_model_id (str): Full answer model ID
  - hyde_cost (float): Cost for HyDE generation
  - answer_cost (float): Cost for answer generation
  - usage_source (str): Combined usage sources
  - cost_source (str): Cost calculation source
Usage Example
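An illustrative call, with `query_for_evaluation` stubbed so the snippet runs standalone; in the project you would import the real function from `src.rag.hyde`, and the returned values here are placeholders matching the documented shape.

```python
# Illustrative usage of query_for_evaluation; the function body is a stub
# returning the documented shape, not the module's real implementation.

def query_for_evaluation(question: str,
                         hyde_model: str = "gpt-3.5-turbo",
                         answer_model: str = "gpt-4o") -> dict:
    return {
        "question": question,
        "answer": "(generated answer)",
        "contexts": ["(retrieved passage)"],
        "metadata": {
            "retrieval_method": "hyde",
            "llm_hyde_model": hyde_model,
            "llm_answer_model": answer_model,
        },
    }

result = query_for_evaluation("What is the first-line treatment for eclamptic seizures?")
print(result["answer"])
print(result["metadata"]["retrieval_method"])  # "hyde"
```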
Pipeline Flow
- Generate HyDE: Uses gpt-3.5-turbo (temperature=0.7) to generate a detailed hypothetical document that would answer the question
- Retrieve: Uses the hypothetical document (not the original query) to perform semantic search and retrieve the top 5 most relevant actual documents
- Format: Formats retrieved documents with metadata
- Generate Answer: Uses gpt-4o (temperature=0) to generate the final answer based on retrieved context
- Track: Captures separate metrics for both HyDE and answer generation
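The flow above can be sketched end to end with toy stand-ins for each step (no real LLM or vector store; the corpus, scoring, and generated strings are all fabricated for illustration):

```python
# Runnable sketch of the five-step HyDE flow; every component is a toy
# stand-in that mirrors the pipeline's structure, not its real code.

def generate_hyde(question: str) -> str:
    # Step 1: creative LLM call (temperature 0.7 in the real pipeline).
    return "Hypothetical passage answering: " + question

def retrieve(text: str, corpus: list[str], k: int = 5) -> list[str]:
    # Step 2: semantic-search stand-in; rank by word overlap with the
    # hypothetical document, not the original query.
    score = lambda d: len(set(text.lower().split()) & set(d.lower().split()))
    return sorted(corpus, key=score, reverse=True)[:k]

def format_docs(docs: list[str]) -> str:
    # Step 3: join retrieved contents.
    return "\n\n".join(docs)

def generate_answer(question: str, context: str) -> str:
    # Step 4: deterministic LLM call (temperature 0 in the real pipeline).
    return f"Answer to '{question}' grounded in {len(context.split())} context words."

corpus = ["Magnesium sulfate is first-line for eclamptic seizures.",
          "Folic acid supplementation is advised before conception."]
question = "How are eclamptic seizures treated?"

hyde_doc = generate_hyde(question)      # step 1
docs = retrieve(hyde_doc, corpus)       # step 2
answer = generate_answer(question, format_docs(docs))  # steps 3-4
```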
Key Features
- Two-stage retrieval: Generates hypothetical content first, then searches
- Improved semantic matching: Searches with detailed content vs. short query
- Dual model tracking: Separate metrics for HyDE and answer generation
- Creative HyDE generation: Uses higher temperature (0.7) for document generation
- Precise answer generation: Uses temperature 0 for final answer
- Comprehensive cost tracking: Tracks costs for both stages
When to Use HyDE
HyDE works best when:
- User queries are short or ambiguous
- You need to bridge vocabulary gaps between query and documents
- Documents use different terminology than typical user queries
- You want to improve recall for conceptual questions
