singhsidhukuldeep posted an update Jun 12
There are 2.2 billion active @Apple devices, and all of them just got smarter thanks to Apple Intelligence (AI).

Well, almost all devices...

Your device needs:
- iPhone: A17 Pro chip or later,
- iPad: M1 chip or later,
- Mac: M1 chip or later.

Hardware requirements aside, this is probably the largest deployment of on-device LLMs to date.

Here is the technical goodness:
- Apple Intelligence runs a ~3B-parameter LLM on device (Mac, iPhone, iPad) with grouped-query attention plus activation and embedding quantization, with bit rates selected using Apple's Talaria tool, all running on the Neural Engine (see the GQA and quantization sketches after this list).
- It will use fine-tuned LoRA adapters for different tasks, and Apple claims the results outperform other 7B and 3B LLMs (LoRA layer sketch below).
- On iPhone 15 Pro: ~0.6 ms per prompt token time-to-first-token latency and a generation rate of ~30 tokens/second (worked numbers below).
- No size or architecture details for the server model.
- LoRA adapters will be dynamically loaded, cached, and swapped per task (think LoRA Land; adapter-cache sketch below).
- The on-device model has a 49K vocabulary, while the server model uses 100K.
- Rejection sampling fine-tuning and RLHF are used in post-training.
- A rejection sampling fine-tuning algorithm with a teacher committee (sketch below).
- A reinforcement learning from human feedback (RLHF) algorithm with mirror descent policy optimization and a leave-one-out advantage estimator (advantage-estimator sketch below).
- Synthetic data generated by bigger models (which ones is not mentioned) was used for tasks like summarization.
- 750 evaluation samples per production use case to evaluate summarization (dataset not released).
- No mention of multilingual support.
- Training used Apple's AXLearn framework (built on JAX) with FSDP to scale across TPUs and GPUs.
- 3B + adapter outperforms Phi-3 mini, Gemma 7B, and Mistral 7B on summarization.
- 3B + adapter achieves 78.7% on IFEval, beating Phi-3 mini, Gemma 7B, and Mistral 7B; the server model matches GPT-4-Turbo and beats Mixtral 8x22B and GPT-3.5-Turbo.
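
To make the bullets above more concrete, a few hedged sketches follow. First, grouped-query attention: several query heads share a single key/value head, which shrinks the K/V cache that has to sit in memory during decoding. Head counts, dimensions, and names here are illustrative assumptions, not Apple's actual configuration.

```python
# Minimal grouped-query attention (GQA) sketch in NumPy.
# Illustrative only: head counts and dimensions are assumptions.
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Single-sequence causal GQA: several query heads share one K/V head."""
    seq, d_model = x.shape
    head_dim = d_model // n_q_heads
    group = n_q_heads // n_kv_heads          # query heads per shared K/V head

    q = (x @ wq).reshape(seq, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)   # K/V caches are n_kv_heads wide
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)

    outputs = []
    for h in range(n_q_heads):
        kv = h // group                       # which K/V head this query head maps to
        scores = q[:, h] @ k[:, kv].T / np.sqrt(head_dim)
        mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)   # causal mask
        scores = np.where(mask, -1e9, scores)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ v[:, kv])
    return np.concatenate(outputs, axis=-1)   # (seq, d_model)

# Toy usage: 8 query heads sharing 2 K/V heads shrinks the K/V cache 4x.
d_model, seq = 64, 5
rng = np.random.default_rng(0)
x = rng.normal(size=(seq, d_model))
wq = rng.normal(size=(d_model, d_model))
wk = rng.normal(size=(d_model, (d_model // 8) * 2))
wv = rng.normal(size=(d_model, (d_model // 8) * 2))
print(grouped_query_attention(x, wq, wk, wv, n_q_heads=8, n_kv_heads=2).shape)
```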
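
Talaria itself is an Apple-internal latency/power analysis tool and is not public; the generic operation it helps configure is low-bit weight quantization. A minimal per-channel symmetric quantize/dequantize round trip, with assumed shapes and bit width:

```python
# Minimal per-channel symmetric low-bit weight quantization sketch.
# The bit width and shapes are assumptions, not Apple's actual settings.
import numpy as np

def quantize_per_channel(w, bits=4):
    """Quantize each output channel (row) of w to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 7 for 4-bit
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)         # avoid division by zero
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16)).astype(np.float32)
q, scale = quantize_per_channel(w, bits=4)
print("mean abs reconstruction error:", np.abs(w - dequantize(q, scale)).mean())
```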
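
Next, a quick back-of-the-envelope reading of the iPhone 15 Pro latency figures; the prompt and response lengths are assumptions chosen only to make the numbers tangible.

```python
# Back-of-the-envelope check of the reported iPhone 15 Pro numbers:
# ~0.6 ms per prompt token to first token, ~30 generated tokens per second.
prompt_tokens = 1000      # assumed prompt length, for illustration
output_tokens = 200       # assumed response length

time_to_first_token_s = 0.6e-3 * prompt_tokens   # ~0.6 s for a 1,000-token prompt
generation_time_s = output_tokens / 30           # ~6.7 s to generate 200 tokens
print(f"TTFT ~{time_to_first_token_s:.2f}s, generation ~{generation_time_s:.2f}s")
```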
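
For the LoRA adapter bullet, here is a minimal sketch of a LoRA-augmented linear layer, which shows why per-task adapters are cheap: the base weight stays frozen and only two small low-rank matrices are trained. Rank, scaling, and initialization are generic defaults, not Apple's.

```python
# Minimal LoRA-augmented linear layer in NumPy. Rank/alpha are illustrative.
import numpy as np

class LoRALinear:
    def __init__(self, w_base, rank=16, alpha=32, rng=None):
        rng = rng or np.random.default_rng(0)
        d_out, d_in = w_base.shape
        self.w = w_base                                        # frozen pretrained weight
        self.a = rng.normal(scale=0.01, size=(rank, d_in))     # trainable down-projection
        self.b = np.zeros((d_out, rank))                       # trainable up-projection, zero-init
        self.scale = alpha / rank

    def __call__(self, x):
        # Base path plus low-rank update: (W + scale * B @ A) x
        return x @ self.w.T + self.scale * (x @ self.a.T) @ self.b.T

rng = np.random.default_rng(1)
layer = LoRALinear(rng.normal(size=(64, 64)), rank=8)
print(layer(rng.normal(size=(4, 64))).shape)   # (4, 64)
```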
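
Dynamic adapter swapping can be pictured as a small LRU cache keyed by task: load an adapter on first use, keep the most recently used ones resident, evict the rest. The loader, cache size, and task names below are purely illustrative.

```python
# Sketch of dynamically loading, caching, and swapping per-task LoRA adapters.
# Adapter format, cache size, and task names are assumptions for illustration.
from collections import OrderedDict

class AdapterCache:
    def __init__(self, loader, max_resident=2):
        self.loader = loader            # callable: task name -> adapter weights
        self.max_resident = max_resident
        self.cache = OrderedDict()      # LRU order: oldest first

    def get(self, task):
        if task in self.cache:
            self.cache.move_to_end(task)          # mark as recently used
            return self.cache[task]
        adapter = self.loader(task)               # e.g. read from disk / mmap
        self.cache[task] = adapter
        if len(self.cache) > self.max_resident:
            self.cache.popitem(last=False)        # evict least recently used
        return adapter

# Toy usage with a fake loader.
cache = AdapterCache(loader=lambda task: {"task": task, "weights": "..."})
for task in ["summarization", "mail_reply", "summarization", "proofreading"]:
    print(task, "->", cache.get(task)["task"])
```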
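
A hedged sketch of what rejection-sampling fine-tuning with a teacher committee could look like: sample several candidate responses per prompt, keep only those the committee scores highly, and reuse the kept pairs as supervised fine-tuning data. The scoring rule, threshold, and committee here are assumptions; Apple does not publish these details.

```python
# Sketch of rejection-sampling fine-tuning data selection with a teacher committee.
# Threshold, candidate count, and scoring are assumptions, not Apple's recipe.
def rejection_sampling_round(prompts, sample_fn, committee,
                             keep_threshold=0.7, n_candidates=4):
    """Return (prompt, response) pairs whose mean committee score clears the threshold."""
    kept = []
    for prompt in prompts:
        candidates = [sample_fn(prompt) for _ in range(n_candidates)]
        for response in candidates:
            scores = [teacher(prompt, response) for teacher in committee]
            if sum(scores) / len(scores) >= keep_threshold:
                kept.append((prompt, response))
    return kept   # feed into the next supervised fine-tuning step

# Toy usage with stand-in sampler and teachers.
sample_fn = lambda p: p + " -> draft answer"
committee = [lambda p, r: 0.8, lambda p, r: 0.6, lambda p, r: 0.9]
print(rejection_sampling_round(["Summarize this email."], sample_fn, committee))
```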
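
Finally, the leave-one-out advantage estimator, as used in REINFORCE-style RLHF variants: each sampled response is baselined against the mean reward of the other samples for the same prompt. The mirror-descent policy update itself is omitted; this only shows the advantage computation.

```python
# Sketch of a leave-one-out advantage estimator for RLHF-style policy gradients.
import numpy as np

def leave_one_out_advantages(rewards):
    """rewards: (k,) rewards for k responses sampled from the same prompt."""
    rewards = np.asarray(rewards, dtype=np.float64)
    k = rewards.size
    total = rewards.sum()
    baselines = (total - rewards) / (k - 1)   # mean reward of the other k-1 samples
    return rewards - baselines

# Toy usage: 4 sampled responses scored by a reward model.
print(leave_one_out_advantages([0.2, 0.9, 0.5, 0.4]))
```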

LoRA for the win!

Blog: https://machinelearning.apple.com/research/introducing-apple-foundation-models

yeah, a 3B LLM quantized to Q4 and fine-tuned on Apple stuff... meh