two questions about model training

#1
by Forrest20231206 - opened

I have two questions for the author about model training:

  1. The original LLaVA-Med used a stage-1 pretraining step, but it seems you only used fine-tuning. Could this leave some medical concepts poorly aligned?
  2. Are there ablation results comparing full fine-tuning with full LoRA (applied to all linear layers)?

First, this project is still at an exploratory stage; the main goal is to verify whether fine-tuning alone fits the data well enough. We will further use a Chinese pretraining dataset to train from scratch.
Second, based on other researchers' experience, full fine-tuning is generally better than LoRA, so this version of the model uses full fine-tuning.
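For reference, here is a minimal sketch of what "full LoRA (all linear layers)" in question 2 means in practice, contrasted with full fine-tuning. This is not the training code behind this model; the base checkpoint `llava-hf/llava-1.5-7b-hf`, the PEFT `target_modules="all-linear"` shortcut, and the hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumes transformers with LLaVA support and PEFT >= 0.8;
# model name and hyperparameters are placeholders, not the authors' setup).
import torch
from transformers import LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

model_name = "llava-hf/llava-1.5-7b-hf"  # placeholder base model

# Option A: full fine-tuning -- load the model and leave every parameter trainable.
full_ft_model = LlavaForConditionalGeneration.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
)

# Option B: "full LoRA" -- freeze base weights and attach adapters to all linear layers.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # PEFT shortcut: every nn.Linear except the output head
)
lora_model = get_peft_model(
    LlavaForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch.bfloat16),
    lora_config,
)
lora_model.print_trainable_parameters()  # LoRA updates only a small fraction of the weights
```

Full fine-tuning updates all weights at the cost of much more GPU memory, which is the trade-off the reply above refers to.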

Thanks for the reply; looking forward to a better Chinese-LLaVA-Med.

BUAADreamer changed discussion status to closed
