Use the model's template (ollama show --template <model>)#53

Open
mcelrath wants to merge 1 commit into gergap:master from mcelrath:template
Conversation

@mcelrath

This is to fix #52.

I've tested it with:

starcoder2
qwen2.5-coder
deepseek-coder-v2
codegemma
codellama

Do you agree? I think it might be interesting to put some templates in the repo showing how to use FIM for larger models that don't natively support it, but may be good at following instructions.

@mcelrath

Author

This does NOT work with gemma3, which supports FIM through the tag <|fim_middle|> but doesn't mention it anywhere in its template.

The method in my patch can be detected by ollama show --template <model> | grep Suffix but I don't see how to go about doing this without a laundry list of models and templates. The existing solution with the config/ directory definitely doesn't work with a lot of models... (especially thinking models like deepseek-r1).
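The grep check above could be sketched in Python roughly as follows. This is only an illustration, not the code in the patch: the function names are hypothetical, and it assumes a template signals FIM support solely by referencing the .Suffix variable, which (as with gemma3) misses models that support FIM without declaring it.

```python
import re
import subprocess


def template_mentions_suffix(template: str) -> bool:
    """Return True if a Go-style template references the .Suffix variable."""
    # Match ".Suffix" as a whole identifier, e.g. in "{{ if .Suffix }}".
    return re.search(r"\.Suffix\b", template) is not None


def fetch_model_template(model: str) -> str:
    """Return the output of `ollama show --template <model>`.

    Requires a local ollama installation; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        ["ollama", "show", "--template", model],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def model_declares_fim(model: str) -> bool:
    """Hypothetical helper: equivalent of piping the template through grep Suffix."""
    return template_mentions_suffix(fetch_model_template(model))
```

A template consisting only of {{ .Prompt }} would be reported as having no declared FIM support by this check.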

@gergap

gergap commented Mar 15, 2025

Owner

Hi, thank you for the contribution. I was going this route initially, but had the problem that the templates in ollama were not working for some models.
There is even a suffix argument in the REST API now, where ollama is supposed to do much of the work of building a correct prompt for code completion.
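For readers unfamiliar with the suffix argument, a request to Ollama's /api/generate endpoint might look like the sketch below. The endpoint URL assumes a default local installation, and the helper names are made up for illustration; this is not how complete.py is implemented.

```python
import json
import urllib.request

# Assumption: ollama is listening on its default local address.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_suffix_request(model: str, prefix: str, suffix: str) -> dict:
    """Build a generate request that lets ollama apply the model's FIM template."""
    return {
        "model": model,
        "prompt": prefix,   # text before the cursor
        "suffix": suffix,   # text after the cursor
        "stream": False,    # ask for a single JSON object instead of a stream
    }


def request_completion(model: str, prefix: str, suffix: str) -> str:
    """Send the request and return the generated middle part."""
    body = json.dumps(build_suffix_request(model, prefix, suffix)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```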

I will need to test this, and if it works more reliably now than in the past, I can get rid of my own templates and token configurations. I can imagine keeping both, so that you can override bogus behavior with local configurations, and if a local configuration is missing it defaults to what ollama provides.
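The "keep both" idea could be sketched as a small selection function. Everything here is hypothetical (the names, the dictionary of local overrides, and the extra "unsupported" outcome are assumptions for illustration), and it does not describe the plugin's actual config handling.

```python
def choose_fim_source(model: str, local_overrides: dict, ollama_template: str) -> str:
    """Decide where the FIM prompt format for a model should come from.

    Returns "local" when a local configuration overrides the model,
    "ollama" when the model's own template declares a suffix slot,
    and "unsupported" otherwise (an assumption added for completeness).
    """
    # A local configuration always wins, so bogus templates can be overridden.
    if model in local_overrides:
        return "local"
    # Otherwise fall back to what ollama provides, if the template handles a suffix.
    if ".Suffix" in ollama_template:
        return "ollama"
    return "unsupported"


# Example with made-up data: one model has a local override, one does not.
overrides = {"codellama": {"pre": "<PRE>", "suf": "<SUF>", "mid": "<MID>"}}
fim_template = "{{ if .Suffix }}{{ .Prompt }}{{ .Suffix }}{{ end }}"
print(choose_fim_source("codellama", overrides, fim_template))
print(choose_fim_source("starcoder2", overrides, fim_template))
print(choose_fim_source("some-chat-model", overrides, "{{ .Prompt }}"))
```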

I'm also working on repo completions and on inserting complete files (or vim buffers) as context. For this I will still need to create my own prompts. However, if I can reliably read the FIM tokens out of the model's template, I would be happy to do so.

@gergap

gergap commented Mar 15, 2025

Owner

Hi, I tested it with my default starcoder2:3b completion model. Technically, it works, but the results are radically different and not really useful. I don't know why this happens with the ollama REST API.

See for yourself: first I try it with your branch. The task is trivial: it should only complete a missing 'f' for a printf call, which tests the fill-in-the-middle problem pretty well. With your solution it generates some nonsense and also garbage on the following lines instead of the required completion. Then I switch back to the master branch with my manual FIM code and it works as expected.

vim-pull-53-2025-03-15_11.06.09.mp4
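To make the test case concrete, the sketch below shows what a manually assembled FIM prompt for the missing 'f' could look like. It assumes the StarCoder-style tokens <fim_prefix>, <fim_suffix> and <fim_middle>; the C snippet and helper names are invented for illustration and are not taken from the video or from the plugin's config files.

```python
# Assumption: StarCoder-style FIM tokens; other models use different tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_manual_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle prompt; the model generates the middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def build_raw_request(model: str, prompt: str) -> dict:
    """Build a generate request that bypasses ollama's own template handling."""
    return {"model": model, "prompt": prompt, "raw": True, "stream": False}


# Hypothetical buffer: the cursor sits right after "print", before the "(".
prefix = '#include <stdio.h>\n\nint main(void)\n{\n    print'
suffix = '("Hello, world!\\n");\n    return 0;\n}\n'

prompt = build_manual_fim_prompt(prefix, suffix)
request = build_raw_request("starcoder2:3b", prompt)
```

For this buffer the desired completion is just the single character f.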

@gergap

gergap commented Mar 15, 2025

Owner

It does not look better with codellama either:

Ollama REST API result (your branch):
codellama1

Manual templates (master branch):
codellama2

I also tried to change the "Raw" option to false on your branch, but this didn't help either.
Looks to me like ollama stuff still does not work as it should.
I'm running ollama version 0.5.4.
Please let me know if you are running a newer version with better results for this simple example task.

@mcelrath

mcelrath commented Mar 15, 2025 via email

Author

@mcelrath

Author

Are you on Discord? Or do you belong to any Discord/Slack or other chat that discusses FIM usage? There are a bunch of things I want to do here and it would be good to discuss it with someone. ;-)

@gergap

gergap commented Mar 15, 2025

Owner

Hi, at the moment I don't have time for discussions, but you might find the use_model_template branch useful.
Just committed this experiment for you.

@poetaster

I wasn't sure if this was on the radar: https://github.com/ollama/ollama/pull/14154/changes#diff-457756243c9ebf0d1160d86289a5c1851a0e211b80ee468378a5c1f0fed6933f

Direct integration in ollama. Not merged yet.

@gergap

gergap commented Jun 5, 2026

Owner

> I wasn't sure if this was on the radar: https://github.com/ollama/ollama/pull/14154/changes#diff-457756243c9ebf0d1160d86289a5c1851a0e211b80ee468378a5c1f0fed6933f
>
> Direct integration in ollama. Not merged yet.

Interesting, I didn't know that.

Regarding this MR: a while ago we already added the USE_CUSTOM_TEMPLATE variable and the -t option to complete.py, which allow using Ollama's FIM API (suffix arg). While it didn't work well in the past, Ollama seems to have matured a lot since then.
You can always change the variable USE_CUSTOM_TEMPLATE in this script from true to false.

It should also be easy to extend autoload/ollama.vim to make use of the -t option, and make this configurable in the config file.
As soon as this works reliably for all important models, we can get rid of our custom template processing for FIM.

@poetaster

> > I wasn't sure if this was on the radar: https://github.com/ollama/ollama/pull/14154/changes#diff-457756243c9ebf0d1160d86289a5c1851a0e211b80ee468378a5c1f0fed6933f
> > Direct integration in ollama. Not merged yet.
>
> Interesting, I didn't know that.
>
> Regarding this MR: a while ago we already added the USE_CUSTOM_TEMPLATE variable and the -t option to complete.py, which allow using Ollama's FIM API (suffix arg). While it didn't work well in the past, Ollama seems to have matured a lot since then. You can always change the variable USE_CUSTOM_TEMPLATE in this script from true to false.
>
> It should also be easy to extend autoload/ollama.vim to make use of the -t option, and make this configurable in the config file. As soon as this works reliably for all important models, we can get rid of our custom template processing for FIM.

I took a look and it seems I'm out of luck:

ollama show --template  glm-4.7-flash:latest
{{ .Prompt }}

Or am I missing something?

In any case, using insert with qwen3-coder:30b is just not nearly as good as glm-4.7.

Development

Successfully merging this pull request may close these issues.

Don't use config files, use the model's template

3 participants