Warning: this model might be overhyped

Opened by rombodawg

[Attached image: overhyped.png]

Hi. What datasets were used for the fine tuning?

@hiauiarau Like with V2.5, I didn't fine-tune; I only applied the last step of my method, which is to merge the already fine-tuned model (Llama-3.1-Nemotron-70B-Instruct-HF) with the original instruct and base models to reduce the loss introduced by fine-tuning. So my version is basically Llama-3.1-Nemotron-70B-Instruct-HF without the loss from tuning.
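For anyone trying to picture what that last step does, here is a minimal sketch of the underlying idea: plain task-vector averaging back onto the base model. It is not the exact TIES procedure mergekit runs (TIES additionally trims low-magnitude deltas and resolves sign conflicts before averaging), and the single-file paths are placeholders, since real 70B checkpoints are sharded across many files.

```python
# Minimal sketch of the "merge the fine-tune back with instruct + base" idea.
# NOT the exact TIES algorithm (which also trims small deltas and elects signs);
# paths are placeholders -- real 70B checkpoints are sharded across many files.
from safetensors.torch import load_file, save_file

base = load_file("llama-3.1-70b-base.safetensors")
instruct = load_file("llama-3.1-70b-instruct.safetensors")
nemotron = load_file("nemotron-70b-instruct.safetensors")

merged = {}
for name, base_w in base.items():
    # Task vectors: what each fine-tune changed relative to the shared base.
    delta_instruct = instruct[name] - base_w
    delta_nemotron = nemotron[name] - base_w
    # Equal 1:1 weighting of both deltas added back onto the base, mirroring the
    # weight: 1 / density: 1 settings in the mergekit config shared further down.
    merged[name] = base_w + (delta_instruct + delta_nemotron) / 2

save_file(merged, "merged-model.safetensors")
```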

Although I've learned that my method does have the downside of some instruction-following loss.

@rombodawg And what do you mean by instruction-following loss? When the instructions differ from one fine-tune to another?

No, it's mainly just a drop on "IFEval" on the Open LLM Leaderboard, which is supposed to be an instruction-following benchmark, but it could also be something else causing it.

What do you think it could be related to? And can it be overcome by preparing another LoRA adapter for the IFEval task?

The merge YAML:

models:
  - model: ./mergekit/models/llama-3.1-70b-instruct
    parameters:
      weight: 1
      density: 1
  - model: ./mergekit/models/Nemotron-70B-Instruct-HF
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: ./mergekit/models/llama-3.1-70b-base
parameters:
  weight: 1
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
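For reference, a config like this is normally run through mergekit's `mergekit-yaml` command-line entry point. A minimal sketch, assuming mergekit is installed; the config filename and output directory below are placeholders:

```python
# Runs mergekit's CLI on the config above. Assumes `pip install mergekit`;
# "merge-config.yml" and "./merged-output" are placeholder names.
import subprocess

subprocess.run(
    ["mergekit-yaml", "merge-config.yml", "./merged-output"],
    check=True,
)
```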

So it uses llama-3.1-70b (base) + llama-3.1-70b-instruct + nemotron-70b-instruct with a regular 1:1:1 weighting.

Just to be sure I'm not missing anything in the picture: if I re-run that merge and compare the parameter values from the merge and the uploaded weights, they should match 1:1, right? There is no SFT/DPO or training session of any kind, correct?
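If you do want to check that, here is a minimal sketch of the comparison, assuming both the re-run merge and the downloaded upload sit in local directories (the paths are placeholders). For a 70B model you would realistically iterate over the safetensors shards rather than loading both full models at once:

```python
# Checks whether two checkpoints are numerically identical, parameter by parameter.
# Paths are placeholders; for 70B models, stream the safetensors shards instead of
# materializing both full models in memory.
import torch
from transformers import AutoModelForCausalLM

def checkpoints_match(path_a: str, path_b: str) -> bool:
    a = AutoModelForCausalLM.from_pretrained(path_a, torch_dtype=torch.bfloat16).state_dict()
    b = AutoModelForCausalLM.from_pretrained(path_b, torch_dtype=torch.bfloat16).state_dict()
    if a.keys() != b.keys():
        return False
    # torch.equal demands exact bit-for-bit equality; swap in torch.allclose
    # if the re-run used different hardware or a different accumulation order.
    return all(torch.equal(a[k], b[k]) for k in a)

print(checkpoints_match("./rerun-merge", "./downloaded-upload"))
```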

> No, it's mainly just a drop on "IFEval" on the Open LLM Leaderboard, which is supposed to be an instruction-following benchmark, but it could also be something else causing it.

Well, you are “averaging” the weights of an instruction fine-tuned model with its base model.
It's kind of intuitive that it would lose some instruction-following capacity.
