Shape error in the SFT Model
#1 by jieliu - opened
When I load the SFT model, generation fails with a shape error. The weights of k_proj and v_proj appear to have been altered, which contradicts the printed results. Did you manually adjust the weights in the safetensors file, or is there a bug? Also, why are there two safetensors files, so that a 7B model carries the memory burden of a 14B one?
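For reference, here is a rough sanity check for both points. It is only a sketch under assumptions I have not confirmed for this checkpoint: that the model uses a Llama-style attention layout where k_proj and v_proj have `num_key_value_heads * head_dim` output rows, and that the config values below (4096, 32, 8) are hypothetical placeholders rather than this model's real settings. The helper names are my own, not part of any library.

```python
# Sketch only: helper names and config values are hypothetical, not taken
# from this repository's config.json.

def expected_kv_proj_shape(hidden_size, num_attention_heads, num_key_value_heads):
    """Weight shape (out_features, in_features) of k_proj / v_proj,
    assuming a Llama-style layout with grouped-query attention."""
    head_dim = hidden_size // num_attention_heads
    return (num_key_value_heads * head_dim, hidden_size)


def weight_memory_gb(num_params, bytes_per_param):
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# With fewer key/value heads than query heads, k_proj and v_proj are
# narrower than q_proj, which can look like "altered" weights.
print(expected_kv_proj_shape(4096, 32, 8))   # (1024, 4096)
print(expected_kv_proj_shape(4096, 32, 32))  # (4096, 4096)

# 7B parameters at 2 bytes each (fp16/bf16) versus 4 bytes each (fp32).
print(weight_memory_gb(7e9, 2))  # 14.0
print(weight_memory_gb(7e9, 4))  # 28.0
```

If the shapes in the checkpoint match the first case while the loading code expects the second, that alone could produce a shape error at generation time. Likewise, roughly 14 GB for a 7B model is what half-precision weights would occupy, so the two files may simply be shards of one checkpoint rather than duplicated weights. I may be wrong on both counts for this model.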