Adding Gemma3 Support (correctly loading tokenizer and mmproj, alongside existing GGUFs). #404
Conversation
Small fixes: update to spiece tokenizer loading; switch to pathlib.
Cool, but is mmproj.gguf necessary? That PR doesn't need it.
Yes, it is necessary for visual reasoning. The original model distributed with the ComfyUI Gemma 3 safetensor release included the mmproj tensors.
OK, I tried i2v without mmproj; it seems to work.
Well, by all means, if you don't want it for whatever reason, don't use it. But I figured they included it for a reason. This PR should work fine without it (as in it won't raise an exception), but it will issue a warning. In the ComfyUI-LTX repo's example workflows, they use a prompt "enhancer" node, which uses the image as an input to Gemma 3 and asks it to improve the prompt based on what is in the input image. You cannot do this without the mmproj model, for example. The model won't know what the image tokens are.
I know what you mean; VL image-edit models all need that. But the native ComfyUI CLIP node cannot return a string as its result, so VL models cannot be used very efficiently.
That is true, however other custom nodes may want to make use of visual reasoning using the GGUF loader, so allowing the mmproj file to be loaded, as it already does with Qwen, made sense. When it comes to making things easier for the user, it's generally better to include things by default, so they won't run into issues later if they suddenly need or want it (perhaps without even knowing what it is).
I always get this error: not enough values to unpack (expected 3, got 2)
Is #402 also needed? I was having an issue when using a Gemma GGUF with those PRs merged - it was just garbled output. I have already downloaded all of the files mentioned, including the .mmproj. I assume they just need to reside in the text_encoders folder. (I did try the fp8_e4m3n version and that did work.)
I don't think your method is right. The RAM cost is huge: a 7 GB Gemma 3 Q4_K_M costs more than 50 GB of RAM to load. The other PR does not have that problem.
This is a concise implementation for loading the Gemma3 tokenizer, with the correct configuration values and an appropriate method of loading the tokenizer (the spiece model) and mmproj alongside existing GGUF quants available on Hugging Face. I've extracted the tokenizer into a standalone model, along with the mmproj. Note that these two supplementary models are required:
https://huggingface.co/smhf72/gemma-3-12b-it-extras-comfy
These two models should be placed in the `models/clip` directory, and they are automatically loaded when a `gemma3` architecture is detected. It works as expected with these GGUFs:
and others.
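The "automatically loaded when a `gemma3` architecture is detected" step above can be sketched as a simple dispatch on the GGUF `general.architecture` metadata string. The function and return labels here are assumptions for illustration, not the PR's actual code:

```python
def detect_text_encoder(arch: str) -> str:
    """Hypothetical dispatch: map a GGUF architecture string to a loader path."""
    if arch.startswith("gemma3"):
        # New path: load the spiece tokenizer and, if present, the mmproj.
        return "gemma3"
    if arch.startswith("qwen2"):
        # Existing mmproj-capable path mentioned earlier in the thread.
        return "qwen"
    # Everything else keeps the current default behavior.
    return "default"
```

In practice the `arch` value would be read from the quantized file's metadata (e.g. via the `gguf` package's `GGUFReader`), so the user only has to drop the files into `models/clip`.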
EDIT:
If testing, this expects #399 to be merged first.