---
library_name: coreml
license: apache-2.0
tags:
- text-generation
---

# Mistral-7B-Instruct-v0.3 + CoreML

> [!IMPORTANT]
> ❗ This repo requires the macOS Sequoia (15) Developer Beta to utilize the latest and greatest CoreML has to offer!
> Sign up for the Apple Beta Software Program [here](https://beta.apple.com/en/) to get access.

> Check out the companion blog post [here](https://hf.co/blog/wwdc24) to learn more about what's new in iOS 18 & macOS 15.

This repo contains [Mistral 7B Instruct v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) converted to CoreML in both FP16 and Int4 precision.

Mistral-7B-Instruct-v0.3 is an instruct fine-tuned version of Mistral-7B-v0.3 by Mistral AI. Mistral-7B-v0.3 has the following changes compared to the v0.2 model:
- Extended vocabulary to 32768
- Supports v3 Tokenizer
- Supports function calling

To learn more about the model, we recommend looking at its documentation and the original model card.

## Download

Install `huggingface-cli`:

```bash
pip install -U "huggingface_hub[cli]"
```

To download one of the `.mlpackage` folders to the `models` directory:

```bash
huggingface-cli download \
  --local-dir models \
  --local-dir-use-symlinks False \
  apple/coreml-mistral-7b-instruct-v0.3 \
  --include "StatefulMistral7BInstructInt4.mlpackage/*"
```

To download everything, remove the `--include` argument.

## Integrate in Swift apps

The [`huggingface/swift-chat`](https://github.com/huggingface/swift-chat) repository contains a demo app to get you up and running quickly! You can integrate the model into your own Swift apps using the `preview` branch of [`huggingface/swift-transformers`](https://github.com/huggingface/swift-transformers/tree/preview).

## Limitations

The Mistral 7B Instruct model is a quick demonstration that the base model can easily be fine-tuned to achieve compelling performance. It does not have any moderation mechanisms.
We look forward to engaging with the community on ways to make the model better respect guardrails, allowing for deployment in environments that require moderated outputs.
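
As an alternative to the `huggingface-cli` command shown in the Download section, the same selective download can be done programmatically with the `huggingface_hub` Python library. This is a minimal sketch, assuming `huggingface_hub` is installed (`pip install huggingface_hub`); the repo id and package name come from this README, and the `download_package` helper is a hypothetical wrapper, not part of any library.

```python
# Sketch: fetch only the Int4 .mlpackage from the Hub, mirroring the
# `huggingface-cli download --include ...` command in the Download section.
from huggingface_hub import snapshot_download

REPO_ID = "apple/coreml-mistral-7b-instruct-v0.3"
PACKAGE = "StatefulMistral7BInstructInt4.mlpackage"


def download_package(local_dir: str = "models") -> str:
    """Download only the Int4 package into `local_dir`; returns the local path."""
    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=f"{PACKAGE}/*",  # skip the FP16 variant and other files
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(download_package())
```

Dropping the `allow_patterns` argument downloads the whole repo, matching the "remove the `--include` argument" behavior described above.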