two questions about model training

#1
by Forrest20231206 - opened

I have two questions for the author about model training:

  1. The original LLaVA-Med used a stage-1 pretraining step, but it seems you only used fine-tuning. Could this leave some medical concepts poorly aligned?
  2. Are there ablation results comparing full fine-tuning with full LoRA (applied to all linear layers)?

First, this project is still at an exploratory stage; the main goal is to verify whether fine-tuning alone fits the data well enough. We will further use a Chinese pretraining dataset to train from scratch.
Second, based on other researchers' experience, full fine-tuning is generally better than LoRA, so this version of the model uses full fine-tuning.
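For reference, here is a minimal sketch of what "full LoRA (all linear layers)" in question 2 means in practice, contrasted with full fine-tuning. This is not the training code behind this model; the base checkpoint `llava-hf/llava-1.5-7b-hf`, the PEFT `target_modules="all-linear"` shortcut, and the hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumes transformers with LLaVA support and PEFT >= 0.8;
# model name and hyperparameters are placeholders, not the authors' setup).
import torch
from transformers import LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

model_name = "llava-hf/llava-1.5-7b-hf"  # placeholder base model

# Option A: full fine-tuning -- load the model and leave every parameter trainable.
full_ft_model = LlavaForConditionalGeneration.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
)

# Option B: "full LoRA" -- freeze base weights and attach adapters to all linear layers.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # PEFT shortcut: every nn.Linear except the output head
)
lora_model = get_peft_model(
    LlavaForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch.bfloat16),
    lora_config,
)
lora_model.print_trainable_parameters()  # LoRA updates only a small fraction of the weights
```

Full fine-tuning updates all weights at the cost of much more GPU memory, which is the trade-off the reply above refers to.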

Thanks for the reply; looking forward to a better Chinese-LLaVA-Med.

BUAADreamer changed discussion status to closed
