I am using a code-completion model in a tool I am building for Godot (it will be open-sourced very soon).

However, Qwen2.5-Coder 1.5B tends to repeat what has already been written, or to change it only slightly (see the video).

Is this expected behavior? I am passing the prefix and suffix correctly to Ollama, so it knows where the cursor currently is. I'm also trimming the number of lines it can see, so that the time-to-first-token isn't too long.
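
For reference, this is roughly the kind of request I mean, as a minimal Python sketch rather than my actual plugin code. It assumes Ollama's default local endpoint and the `prompt`/`suffix` fields of its generate API; the model tag, the line limits, and the helper names (`trim_context`, `build_payload`, `complete`) are all illustrative assumptions.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def trim_context(prefix, suffix, max_prefix_lines=40, max_suffix_lines=20):
    """Keep only the lines nearest the cursor so the prompt stays small."""
    prefix_lines = prefix.splitlines(keepends=True)
    suffix_lines = suffix.splitlines(keepends=True)
    # Last N lines before the cursor, first M lines after it.
    return (
        "".join(prefix_lines[-max_prefix_lines:]),
        "".join(suffix_lines[:max_suffix_lines]),
    )


def build_payload(prefix, suffix, model="qwen2.5-coder:1.5b"):
    """Build the JSON body; the model tag is an assumed example."""
    prefix, suffix = trim_context(prefix, suffix)
    return {
        "model": model,
        "prompt": prefix,   # text before the cursor
        "suffix": suffix,   # text after the cursor
        "stream": False,
        # Assumed values: a short, low-temperature completion.
        "options": {"num_predict": 64, "temperature": 0.2},
    }


def complete(prefix, suffix):
    """Send the request and return the generated text (needs a running server)."""
    data = json.dumps(build_payload(prefix, suffix)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Only `complete` touches the network; the other two functions can be checked without a server.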

Do you have a recommendation for a code model that is better suited to this?

  • lynx@sh.itjust.works · 7 days ago

    If you want inline completions, you need a model that is trained on “fill in the middle” tasks. On their Hugging Face page they even say that this is not supported and needs fine-tuning:

    We do not recommend using base language models for conversations. Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., or fill in the middle tasks on this model.

    Models that can do it include:

    • starcoder2
    • codegemma
    • codellama
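
    To illustrate what “fill in the middle” looks like in practice, here is a rough Python sketch of how the prompt is laid out for each of these model families. The sentinel tokens are the ones I understand the respective model cards to document, so treat them as assumptions and check them against the model you actually pull; `FIM_TEMPLATES` and `build_fim_prompt` are made-up names.

```python
# Assumed FIM prompt layouts per model family; verify against each model card.
FIM_TEMPLATES = {
    "starcoder2": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
    "codegemma": "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>",
    "codellama": "<PRE> {prefix} <SUF>{suffix} <MID>",
}


def build_fim_prompt(family, prefix, suffix):
    """Wrap the text before and after the cursor in the family's sentinel tokens.

    The model is expected to generate only the missing middle after the
    final sentinel.
    """
    template = FIM_TEMPLATES[family]  # raises KeyError for unknown families
    return template.format(prefix=prefix, suffix=suffix)


if __name__ == "__main__":
    print(build_fim_prompt("starcoder2", "func _ready():\n\t", "\n\tpass\n"))
```

    If you build the prompt by hand like this, you will probably want to send it with Ollama's `raw` option so that no chat template is applied on top; if you use the `suffix` field instead, Ollama is supposed to apply the model's own template for you.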

    Another option is to keep using the Qwen model but, instead of having it add only a few lines, let it rewrite the entire function each time.
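
    For that approach you first need the text of the function around the cursor. A very simplified Python sketch, assuming top-level GDScript functions only (no inner classes) and assuming that any non-blank, unindented line ends the function body; `enclosing_function` is a hypothetical helper, not part of any library:

```python
def enclosing_function(lines, cursor_line):
    """Return (start, end) indices, end exclusive, of the top-level GDScript
    function containing cursor_line, or None if the cursor is not inside one.

    Simplification: a non-blank line with no indentation ends the body, so an
    unindented comment inside a function would cut it short.
    """
    start = None
    # Walk upwards from the cursor to the nearest top-level "func" line.
    for i in range(cursor_line, -1, -1):
        if lines[i].startswith(("func ", "static func ")):
            start = i
            break
    if start is None:
        return None

    end = len(lines)
    # Walk downwards until the first unindented, non-blank line.
    for j in range(start + 1, len(lines)):
        line = lines[j]
        if line.strip() and not line[0].isspace():
            end = j
            break

    # The cursor may sit after the function, e.g. on a top-level variable.
    if cursor_line >= end:
        return None
    return start, end
```

    The slice `lines[start:end]` is then what you would hand to the model to rewrite, replacing the old function with the response.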