
@YuriNachos

🎯 Summary

Fixes #1658

📝 Description

The _get_embedding_llm_config_dict method in EmbeddingStrategy was returning a fallback OpenAI config when embedding_llm_config was None. This prevented users from using local sentence-transformers embeddings, because the get_text_embeddings utility never received None and therefore always used remote embeddings.
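
For context, the downstream dispatch this fix unblocks looks roughly like the sketch below. This is a minimal illustration that assumes get_text_embeddings takes the embedding config as an optional dict argument; the actual signature, parameter names, and default model in crawl4ai may differ:

# Hypothetical sketch of the dispatch inside get_text_embeddings;
# the real crawl4ai signature and default model may differ.
from typing import Dict, List, Optional

def get_text_embeddings(texts: List[str], llm_config: Optional[Dict] = None):
    if llm_config is None:
        # Local path: sentence-transformers, no API key required
        from sentence_transformers import SentenceTransformer
        model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model name
        return model.encode(texts)
    # Remote path: call the configured embedding provider (e.g. OpenAI)
    raise NotImplementedError("remote provider call omitted in this sketch")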

🔧 Changes

crawl4ai/adaptive_crawler.py:

  • Changed return type from Dict to Optional[Dict]
  • Removed the fallback OpenAI config; the method now returns None when no config is provided
  • Updated docstring to clarify behavior

❌ Before

def _get_embedding_llm_config_dict(self) -> Dict:
    # Always returns a dict, even when user wants local embeddings
    return {
        'provider': 'openai/text-embedding-3-small',
        'api_token': os.getenv('OPENAI_API_KEY')
    }

✅ After

def _get_embedding_llm_config_dict(self) -> Optional[Dict]:
    # Returns None to allow local sentence-transformers
    return None

🎯 Impact

Users can now use local embeddings by creating an AdaptiveConfig without embedding_llm_config (or with it set to None):

config = AdaptiveConfig(strategy="embedding", embedding_llm_config=None)
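
For comparison, a remote provider can still be selected explicitly. Below is a minimal sketch; the import path and the shape of the embedding_llm_config dict follow the snippets in this PR and may vary by crawl4ai version:

import os

from crawl4ai import AdaptiveConfig  # import path assumed

# Local embeddings: omit embedding_llm_config (or pass None) so the
# embedding strategy falls back to local sentence-transformers.
local_config = AdaptiveConfig(strategy="embedding", embedding_llm_config=None)

# Remote embeddings: pass a provider/api_token dict, matching the shape
# of the old hard-coded fallback that this PR removes.
remote_config = AdaptiveConfig(
    strategy="embedding",
    embedding_llm_config={
        "provider": "openai/text-embedding-3-small",
        "api_token": os.getenv("OPENAI_API_KEY"),
    },
)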

Co-Authored-By: Claude <noreply@anthropic.com>

…trategy

Fixes unclecode#1658

The `_get_embedding_llm_config_dict` method was returning a fallback
OpenAI config when `embedding_llm_config` was None, which prevented
the use of local sentence-transformers embeddings.

Changed the method to:
- Return None when no embedding config is provided
- Change the return type from Dict to Optional[Dict]
- Update the docstring to clarify the behavior

This allows `get_text_embeddings` utility to correctly switch to
the local sentence-transformers implementation when no LLM config
is provided.

Co-Authored-By: Claude <noreply@anthropic.com>