The model uses the Llama architecture. Could the config be updated to be Llama-compatible?

#8
by mzbac - opened

The code in modeling_yayi.py seems to be just Llama with multi-query attention. Maybe I missed something, but it looks like the Llama architecture. I have successfully fine-tuned the model on Guanaco with a Llama-compatible config and it works well. FYI: https://huggingface.co/mzbac/yayi2-30b-guanaco-gguf
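
To illustrate what I mean by a Llama-compatible setup, here is a rough sketch (not an official recipe) of loading the checkpoint through transformers' stock Llama class. It assumes the checkpoint's tensor names and config hyperparameters already line up with that implementation; the repo id and flags below are placeholders.

```python
# Rough sketch only: load the weights through the standard Llama implementation
# in transformers, assuming the tensor names already match the Llama layout.
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "wenge-research/yayi2-30b"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = LlamaForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's stored precision
    device_map="auto",    # shard across available devices
)
```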

Nice job on yayi2-30b-guanaco-gguf!
We took inspiration from decoder-only models such as LLaMA and Baichuan when designing our model architecture. We also incorporated modules such as multi-query attention and FlashAttention, which are detailed in our technical report: https://arxiv.org/abs/2312.14862.
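For readers unfamiliar with multi-query attention, here is a minimal, generic PyTorch sketch of the idea (illustrative only, not the YaYi2 implementation): every query head attends over a single shared key/value head.

```python
# Minimal causal multi-query attention sketch: many query heads, one shared
# key/value head. Illustrative only; not taken from modeling_yayi.py.
import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size)    # one projection per query head
        self.k_proj = nn.Linear(hidden_size, self.head_dim)  # single shared key head
        self.v_proj = nn.Linear(hidden_size, self.head_dim)  # single shared value head
        self.o_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bsz, seq_len, _ = x.shape
        # Queries: (bsz, num_heads, seq_len, head_dim); keys/values: (bsz, 1, seq_len, head_dim).
        q = self.q_proj(x).view(bsz, seq_len, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(bsz, seq_len, 1, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(bsz, seq_len, 1, self.head_dim).transpose(1, 2)
        # The single key/value head broadcasts across all query heads during matmul.
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), 1)
        scores = scores.masked_fill(causal, float("-inf"))
        out = (scores.softmax(dim=-1) @ v).transpose(1, 2).reshape(bsz, seq_len, -1)
        return self.o_proj(out)
```

With only one key/value head, the KV cache shrinks by roughly a factor of `num_heads` compared with standard multi-head attention, which is the usual motivation for multi-query attention at inference time.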

wenge-research changed discussion status to closed
