@DavidGF on Hugging Face: "Introducing Kraken-LoRA – a lightweight version of Kraken that uses…"

Post

1432

Introducing Kraken-LoRA – a lightweight version of Kraken that uses LoRA-Adapters as Experts based on the base model.

@fernandofernandes , me, @Crystalcareai , @ehartford created the Kraken-LoRA!

🔍 What’s the big deal?

✅ Size Consistency: While Kraken’s size increases with more Experts, Kraken-LoRA remains as compact as the base model (e.g., 8b if you use Meta-Llama3-8b-Instruct).
✅ VRAM Efficiency: Kraken-LoRA is highly VRAM efficient, maintaining the power of all experts without the bloat.
✅ Dynamic Adaptation: LoRA adapters are applied dynamically at runtime, following the routing process.
✅ High Efficiency: Enjoy increased efficiency without compromising performance, as long as the LoRA adapters match the base model.

💡 Conclusion: Kraken-LoRA empowers businesses to experience enhanced flexibility and performance from our architecture, enabling further scalability without sacrificing performance.

Check out the model here: VAGOsolutions/Kraken-LoRA
Explore the code here: https://github.com/cognitivecomputations/kraken/tree/main/Kraken-LoRA

Have fun with Kraken-LoRA! 🐙

Join the conversation