The integration of GaLore into the training of large language models (LLMs) marks a significant advancement in deep learning, particularly in terms of memory efficiency and the democratization of AI research. By projecting gradients into a low-rank subspace, GaLore dramatically reduces the memory footprint of optimizer states, enabling billion-parameter models to be trained on consumer-grade hardware and opening new horizons for researchers and practitioners with limited access to high-end computational resources.
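To make this concrete, here is a minimal sketch of what GaLore-style training can look like with the reference `galore-torch` package (https://github.com/jiaweizzhao/GaLore), whose 8-bit optimizer variant builds on bitsandbytes. The parameter-group keys (`rank`, `update_proj_gap`, `scale`, `proj_type`) follow that reference implementation; exact names and defaults may differ across versions:

```python
# Minimal sketch: GaLore with an 8-bit optimizer, following the usage shown
# in the GaLore reference implementation. Assumes `pip install galore-torch`
# and a CUDA device (bitsandbytes 8-bit optimizer states require a GPU).
import torch
import torch.nn as nn
from galore_torch import GaLoreAdamW8bit

model = nn.Sequential(
    nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512)
).cuda()

# GaLore projects gradients of large 2D weight matrices into a low-rank
# subspace; 1D parameters (biases, norms) are optimized as usual.
galore_params = [p for p in model.parameters() if p.dim() == 2]
regular_params = [p for p in model.parameters() if p.dim() != 2]

param_groups = [
    {"params": regular_params},
    {
        "params": galore_params,
        "rank": 128,             # rank of the gradient projection
        "update_proj_gap": 200,  # refresh the projection matrix every 200 steps
        "scale": 0.25,           # scaling applied to the projected update
        "proj_type": "std",      # standard (SVD-based) projection
    },
]
optimizer = GaLoreAdamW8bit(param_groups, lr=1e-2)

# A standard training step: weights and gradients stay full-rank, but the
# optimizer states live in the low-rank (and 8-bit quantized) space.
x = torch.randn(4, 512, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The memory saving comes from the optimizer states: instead of keeping full-size Adam moments for each weight matrix, only their low-rank projections are stored, and 8-bit quantization shrinks those further.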
Under the hood, there are many other improvements, thanks to extensive maintenance activity, community contributions from super active and knowledgeable volunteers ✨ 🚀, and the official sponsorship by Hugging Face that makes all of this possible 🤗 ❤️ 🚀
We would greatly appreciate any further community contributions, whether it's helping with refactoring, exterminating flaky tests, writing docstrings, tutorials, or new features. Don't be shy, just reach out and we'll see where it leads: https://github.com/TimDettmers/bitsandbytes/discussions