Collama is a VS Code extension that provides AI-powered coding support using local Ollama models.
This project is still highly experimental. Collama currently supports:
- auto-complete (inline, multi-line, multi-block)
- code edits via the context menu (a fixed set of edit actions)
Install the extension from the marketplace or build the VSIX yourself. You also need an Ollama instance in your local network.
See this link for instructions on installing Ollama, or this link for the Docker image.
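For example, the official Ollama Docker image can be started like this (11434 is Ollama's default port):

```sh
# Run the official Ollama image, persisting models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```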
The extension is primarily tested with the qwen-coder model series. Currently one model is used for autocomplete and one for edits:
- Default autocomplete model: qwen2.5-coder:3b
- Default instruction model: qwen2.5-coder:3b-instruct
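Assuming Ollama is already running, the default models can be pulled like this:

```sh
ollama pull qwen2.5-coder:3b
ollama pull qwen2.5-coder:3b-instruct
```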
Only the following model families are supported: qwen, starcoder, codellama.
However, codellama has no file-separator token and therefore currently cannot receive the context of all opened files.
The extension supports experimental code edits via the context menu. For this, you should configure a base or instruct model with good chat capabilities.
Testing has primarily been done with q4 quantizations. Feel free to do further testing and contribute your findings!
Currently ChatML is not implemented!
| Model | Tested | Comment | Contribution needed |
|---|---|---|---|
| codeqwen | none | should work | X |
| qwen2.5-coder | 1.5b, 3b, 7b | works well | |
| qwen3-coder | 30b | works well | |
| starcoder | 1b, 3b | should work | X |
| starcoder2 | 3b | works well | |
| codellama | 7b, 13b | works OK | X |
Completion is triggered using the keybinding for editor.action.inlineSuggest.trigger.
Set it to Alt + S or Ctrl + NumPad 1, for example.
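A minimal keybindings.json entry might look like this (the key chord itself is just an example):

```json
{
  "key": "alt+s",
  "command": "editor.action.inlineSuggest.trigger",
  "when": "editorTextFocus"
}
```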
Completion will trigger after a minimum delay of 1.5 seconds (set it higher if needed).
Click the icon to switch modes quickly.
- You can set the endpoint to a server in your local network (bearer tokens are currently not supported).
- The environment variable OLLAMA_ORIGINS=* may be needed to get a response (see the sketch below).
- Check the settings to configure the extension.
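If you start the Ollama server manually, the variable can be set inline, for example:

```sh
# Allow requests from any origin (e.g. the VS Code extension host)
OLLAMA_ORIGINS=* ollama serve
```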
Need help? Open an issue.
Please test and contribute as much as you like!

