𝗔𝗿𝗰𝗲𝗲 𝗿𝗲𝗹𝗲𝗮𝘀𝗲𝘀 𝗦𝘂𝗽𝗲𝗿𝗡𝗼𝘃𝗮, 𝗯𝗲𝘁𝘁𝗲𝗿 𝗳𝗶𝗻𝗲-𝘁𝘂𝗻𝗲 𝗼𝗳 𝗟𝗹𝗮𝗺𝗮-𝟯.𝟭-𝟳𝟬𝗕!
2️⃣ versions: 70B and 8B
🧠 Trained by distilling logits from Llama-3.1-405B (rough sketch of the idea below)
🔥 Used a clever compression method to shrink the distillation dataset from 2.9 petabytes down to 50GB (they may share it in a paper)
⚖️ Not all benchmarks improve: GPQA and MUSR go down slightly
🤗 8B weights are available on HF (not the 70B)
Read their blog post 👉 https://blog.arcee.ai/arcee-supernova-training-pipeline-and-model-composition/
Model weights (8B) 👉 arcee-ai/Llama-3.1-SuperNova-Lite
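
For intuition only, here is a minimal sketch of what logit distillation with a compressed teacher-logit store could look like. Arcee has not published their pipeline or compression method, so everything below is an assumption: the top-k trick as a stand-in for their compression, the plain KL loss, and all function names are hypothetical.

```python
# Hypothetical sketch: logit distillation from stored top-k teacher logits.
# Not Arcee's actual method; top-k storage is just one plausible way to cut
# a logits dataset by orders of magnitude.
import torch
import torch.nn.functional as F

def compress_teacher_logits(teacher_logits: torch.Tensor, k: int = 64):
    """Keep only the top-k logits per token instead of the full vocab row."""
    values, indices = teacher_logits.topk(k, dim=-1)
    return values, indices  # these compact tensors are what gets stored on disk

def distillation_loss(student_logits, topk_values, topk_indices, temperature=1.0):
    """KL(teacher || student), computed only over the stored top-k vocab slots."""
    # Pick out the student's logits at the teacher's top-k token ids.
    student_topk = student_logits.gather(-1, topk_indices)
    teacher_probs = F.softmax(topk_values / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_topk / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

# Toy usage: batch of 2 sequences, 5 tokens, Llama-3.1's 128,256-token vocab.
teacher = torch.randn(2, 5, 128256)                       # would come from Llama-3.1-405B
student = torch.randn(2, 5, 128256, requires_grad=True)   # stand-in for the 70B/8B student
vals, idx = compress_teacher_logits(teacher, k=64)
loss = distillation_loss(student, vals, idx)
loss.backward()
```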