React to me architectural upgrade - Advanced hybrid retrieval, preprocessing pipeline, and safety system #97
Conversation
- Implement parallel execution of safety and scope check, query expansion, and language detection
- … expansion and conversation history management
- Replace SelfQueryRetriever with efficient hybrid search (BM25 + vector); add RRF (Reciprocal Rank Fusion) support for query expansion; implement parallel processing for improved performance
- Add type annotation for rrf_scores in retrieval_utils.py; fix metadata dictionary comprehension in csv_chroma.py; update retriever type annotations to use Any; add isinstance check for BM25Retriever; remove default values from TypedDict in base.py; fix TypedDict expansion in postprocess method
- …ination mitigation
GFJHogue left a comment:
Due to the wide set of modules significantly modified in this PR, it might be worth considering:
- incorporating these changes into a new React-to-Me Beta profile initially (instead of heavily modifying the Base profile), and/or
- splitting up this PR into smaller, focused parts.
As a new profile, we could gradually roll this out (i.e. test on Release) and easily revert to the original profile if need be, instead of having to swap out the entire Docker image.
As for my review here, I'm starting with more surface-level things to address (see the attached comments).
Once those are all sorted out, I can delve deeper into reviewing the code logic.
```python
) -> ReactToMeState:
    """Run preprocessing workflow."""
    result = await super().preprocess(state, config)
    return ReactToMeState(**result)
```
No need to define this preprocess() if it's just going to run the one from the superclass (BaseGraphBuilder)
```diff
  self.unsafe_answer_generator = create_unsafe_answer_generator(streaming_llm)
  self.reactome_rag: Runnable = create_reactome_rag(
-     llm, embedding, streaming=True
+     streaming_llm, embedding, streaming=True
```
Why create streaming_llm like this when we already have this?:
```python
llm = llm.model_copy(update={"streaming": True})
```
```diff
@@ -0,0 +1,60 @@
+import json
+from typing import List
```
Importing the List type from typing is deprecated; current Python just uses list directly.
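For example (the function here is purely illustrative):

```python
# Deprecated style:
# from typing import List
# def split_queries(output: str) -> List[str]: ...

# Preferred on Python 3.9+: use the built-in generic directly.
def split_queries(output: str) -> list[str]:
    return output.splitlines()
```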
```python
try:
    return json.loads(output)
except json.JSONDecodeError:
    raise ValueError("LLM output was not valid JSON. Output:\n" + output)
```
Will an LLM emitting invalid JSON crash the chatbot here?
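If that is a concern, one possible defensive pattern (a sketch only, not the PR's code; the function name and fallback behaviour are assumptions) is to fall back to the original query instead of raising:

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_expanded_queries(output: str, original_query: str) -> list[str]:
    """Hypothetical fallback: never let a malformed LLM response propagate an exception."""
    try:
        parsed = json.loads(output)
        if isinstance(parsed, list):
            return [str(query) for query in parsed]
        logger.warning("LLM output was valid JSON but not a list; falling back")
    except json.JSONDecodeError:
        logger.warning("LLM output was not valid JSON; falling back to the original query")
    return [original_query]
```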
Please exclude formatting-only and comment changes to otherwise-unmodified existing code from the diff.
```python
try:
    documents = create_documents_from_csv(csv_path)
    retriever = BM25Retriever.from_documents(documents)
```
BM25Retriever missing preprocess_func
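For reference, BM25Retriever.from_documents() accepts a preprocess_func keyword; a minimal sketch (the tokenizer below is only an example, not a recommendation):

```python
from langchain_community.retrievers import BM25Retriever
from langchain_core.documents import Document


def simple_tokenize(text: str) -> list[str]:
    # Lowercase + whitespace split before BM25 scoring; swap in a real tokenizer as needed.
    return text.lower().split()


documents = [Document(page_content="Example Reactome pathway description")]
retriever = BM25Retriever.from_documents(documents, preprocess_func=simple_tokenize)
```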
| """Main retrieval method supporting RRF and parallel processing.""" | ||
| original_query = inputs.get("input", "").strip() | ||
| if not original_query: | ||
| raise ValueError("Input query cannot be empty") |
could this edge-case crash the chatbot?
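If raising here can bubble up to the user, a hedged alternative (a sketch; the method signature is assumed) is to treat an empty query as "no results":

```python
from langchain_core.documents import Document


def retrieve(self, inputs: dict) -> list[Document]:
    """Sketch: an empty query returns no documents instead of raising."""
    original_query = inputs.get("input", "").strip()
    if not original_query:
        # Let the caller decide how to answer when there is nothing to search for.
        return []
    ...  # continue with the hybrid BM25 + vector search
```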
```python
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from pathlib import Path
from typing import Any, Dict
```
Dict type is deprecated, use dict
LangChain's EnsembleRetriever, which we're already using, implements RRF:
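For reference, a minimal sketch of weighted Reciprocal Rank Fusion via EnsembleRetriever (assuming bm25_retriever and vector_retriever have already been built elsewhere, e.g. from the CSV documents and the Chroma store):

```python
from langchain.retrievers import EnsembleRetriever

# bm25_retriever and vector_retriever are assumed to exist already.
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, vector_retriever],
    weights=[0.5, 0.5],  # equal weighting; ranks are fused with RRF internally
)
results = ensemble_retriever.invoke("example query about a Reactome pathway")
```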
```python
safety: str
reason_unsafe: str
expanded_queries: list[str]
detected_language: str
```
remove reason_unsafe, expanded_queries, & detected_language as they are never used outside of the preprocess step
This PR introduces a comprehensive architectural overhaul of the Reactome chatbot's retrieval and preprocessing systems. The changes replace the existing simple ensemble retriever with a hybrid system and implement a multi-step preprocessing pipeline with parallel execution capabilities.
1. Hybrid Retrieval System Overhaul
1.1 Core Architecture Changes
- Replaced SelfQueryRetriever with a custom HybridRetriever class

1.2 Retrieval Workflow Implementation
The new hybrid system operates on the following pipeline:
- Uses asyncio.to_thread() for non-blocking execution
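A minimal sketch of this retrieval pattern, illustrative only (the class shape and method names are assumptions, not the PR's actual implementation): the BM25 and vector searches run concurrently via asyncio.to_thread() and the two rankings are fused with RRF.

```python
import asyncio
from collections import defaultdict


class HybridRetriever:
    """Illustrative sketch: concurrent BM25 + vector search fused with Reciprocal Rank Fusion."""

    def __init__(self, bm25_retriever, vector_retriever, k: int = 60):
        self.bm25_retriever = bm25_retriever
        self.vector_retriever = vector_retriever
        self.k = k  # standard RRF smoothing constant

    async def retrieve(self, query: str, top_k: int = 10):
        # Both retrievers expose blocking .invoke() calls, so run them in worker threads.
        bm25_docs, vector_docs = await asyncio.gather(
            asyncio.to_thread(self.bm25_retriever.invoke, query),
            asyncio.to_thread(self.vector_retriever.invoke, query),
        )
        # RRF: each document scores sum(1 / (k + rank)) over the rankings it appears in.
        rrf_scores: dict[str, float] = defaultdict(float)
        docs_by_key = {}
        for ranking in (bm25_docs, vector_docs):
            for rank, doc in enumerate(ranking, start=1):
                key = doc.page_content
                rrf_scores[key] += 1.0 / (self.k + rank)
                docs_by_key[key] = doc
        ranked_keys = sorted(rrf_scores, key=rrf_scores.get, reverse=True)
        return [docs_by_key[key] for key in ranked_keys[:top_k]]
```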
2. Preprocessing Pipeline Implementation
2.1 Multi-Step Workflow Architecture
The new preprocessing system implements a fan-out pattern with the following structure (see the sketch after this list):
Sequential Step 1:
Parallel Step 2: safety and scope check, query expansion, and language detection (run concurrently)
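A minimal sketch of the fan-out, assuming (per the commit messages above) that the parallel step covers the safety and scope check, query expansion, and language detection; every function name here is an illustrative stand-in:

```python
import asyncio


# Illustrative stand-ins for the real LLM-backed preprocessing calls.
async def check_safety_and_scope(query: str) -> str:
    return "safe"


async def expand_query(query: str) -> list[str]:
    return [query]


async def detect_language(query: str) -> str:
    return "en"


async def preprocess(state: dict) -> dict:
    """Sketch of the fan-out: one sequential step, then three independent checks in parallel."""
    query = state["input"].strip()  # sequential step: normalise the incoming query
    safety, expanded_queries, detected_language = await asyncio.gather(
        check_safety_and_scope(query),
        expand_query(query),
        detect_language(query),
    )
    return {
        **state,
        "safety": safety,
        "expanded_queries": expanded_queries,
        "detected_language": detected_language,
    }
```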
3. ReactToMe Profile Enhancements
3.1 Conditional Workflow Routing
The ReactToMe profile now implements intelligent routing based on safety assessment:
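A minimal sketch of what such routing could look like with LangGraph conditional edges (the node names, state fields, and even the use of add_conditional_edges are assumptions based on the surrounding graph-builder code, not the PR's exact implementation):

```python
from typing import TypedDict

from langgraph.graph import END, StateGraph


class ReactToMeState(TypedDict, total=False):
    input: str
    safety: str
    answer: str


def route_on_safety(state: ReactToMeState) -> str:
    # Take the normal RAG path only when the safety check passed.
    return "generate_answer" if state.get("safety") == "safe" else "generate_unsafe_answer"


builder = StateGraph(ReactToMeState)
builder.add_node("preprocess", lambda state: state)            # placeholder node bodies
builder.add_node("generate_answer", lambda state: state)
builder.add_node("generate_unsafe_answer", lambda state: state)
builder.set_entry_point("preprocess")
builder.add_conditional_edges("preprocess", route_on_safety)
builder.add_edge("generate_answer", END)
builder.add_edge("generate_unsafe_answer", END)
graph = builder.compile()
```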