dawg this is not a pretrained model.
#3 opened by Dampish
How do you derive a model and call it pretrained?
Basically, MiniMA-3B is distilled from LLaMA2-7B on subsampled Pile, GitHub, and WuDao data, totaling 100B+ tokens. In terms of data scale, MiniMA-3B is in some sense a (continually) pretrained model.
You are perhaps referring to MiniChat-3B (https://huggingface.co/GeneZC/MiniChat-3B), which is MiniMA-3B fine-tuned on instruction data and is indeed a fine-tuned model.
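For readers unfamiliar with what "distilled from LLaMA2-7B" means in practice, below is a minimal sketch of a generic token-level logit-distillation step in PyTorch. It is not GeneZC's actual MiniMA recipe (see the MiniMA paper for that); the `alpha` and `temperature` hyperparameters and the `kd_step` helper are illustrative. The key idea is that the 3B student shares LLaMA2's vocabulary, so its next-token distribution can be pulled toward the frozen 7B teacher's via a KL term, mixed with the usual language-modeling loss.

```python
import torch
import torch.nn.functional as F

def kd_step(student, teacher, input_ids, temperature=1.0, alpha=0.5):
    """One generic distillation step (illustrative, not MiniMA's exact recipe):
    mix the standard causal-LM loss with a KL term that pulls the student's
    next-token distribution toward the frozen teacher's."""
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits   # teacher is frozen
    out = student(input_ids, labels=input_ids)       # standard causal-LM loss
    # Soften both distributions with temperature, then take KL(teacher || student)
    s = F.log_softmax(out.logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(s, t, reduction="batchmean") * temperature ** 2
    return alpha * out.loss + (1 - alpha) * kd
```

Running such a step over 100B+ tokens of pretraining-style data (rather than a small instruction set) is why the result is closer to a (continually) pretrained model than to a fine-tuned one.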
GeneZC changed discussion status to closed