---
license: apache-2.0
language:
- en
tags:
- moe
- olmo
- olmoe
co2_eq_emissions: 1
---
|
|
|
<img alt="OLMoE Logo." src="olmoe-logo.png" width="250px"> |
|
|
|
# Model Summary |
|
|
|
> OLMoE-1B-7B-Instruct is a Mixture-of-Experts LLM with 1B active and 7B total parameters, released in August 2024 (0824) and adapted via SFT and DPO from [OLMoE-1B-7B](https://hf.co/OLMoE/OLMoE-1B-7B-0824). It yields state-of-the-art performance among models with a similar cost (1B active parameters) and is competitive with much larger models like Llama2-13B-Chat. OLMoE is 100% open-source.
|
|
|
- Code: https://github.com/allenai/OLMoE |
|
- Paper: |
|
- Logs: https://github.com/allenai/OLMoE/blob/main/logs/olmoe-dpo-logs.txt |
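
The "1B active, 7B total" split comes from Mixture-of-Experts routing: each token is sent to only a few of a layer's experts, so only a fraction of the total parameters runs per token. Below is a minimal, illustrative sketch of top-k expert routing; the expert count, top-k value, and dimensions are hypothetical placeholders, not OLMoE's actual configuration (see the code and paper links above for the real architecture).

```python
import torch

# Hypothetical toy configuration -- NOT OLMoE's actual hyperparameters.
NUM_EXPERTS = 8   # experts per MoE layer; total parameters scale with this
TOP_K = 2         # experts run per token; active parameters scale with this
HIDDEN, FFN = 64, 256

experts = torch.nn.ModuleList([
    torch.nn.Sequential(torch.nn.Linear(HIDDEN, FFN), torch.nn.SiLU(), torch.nn.Linear(FFN, HIDDEN))
    for _ in range(NUM_EXPERTS)
])
router = torch.nn.Linear(HIDDEN, NUM_EXPERTS)

def moe_layer(x):
    """x: (tokens, HIDDEN) -> (tokens, HIDDEN), running only TOP_K experts per token."""
    weights, idx = torch.topk(router(x).softmax(dim=-1), TOP_K, dim=-1)
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for w, e in zip(weights[t], idx[t]):
            out[t] += w * experts[int(e)](x[t])  # only TOP_K experts execute for this token
    return out

# "Total" parameters count every expert; "active" parameters per token are the
# router plus the TOP_K experts it selects.
per_expert = sum(p.numel() for p in experts[0].parameters())
router_params = sum(p.numel() for p in router.parameters())
print("total expert params:    ", NUM_EXPERTS * per_expert)
print("active params per token:", router_params + TOP_K * per_expert)

with torch.no_grad():
    print(moe_layer(torch.randn(4, HIDDEN)).shape)  # torch.Size([4, 64])
```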
|
|
|
# Use |
|
|
|
Install the `transformers` & `torch` libraries and run: |
|
|
|
```python
from transformers import OlmoeForCausalLM, AutoTokenizer
import torch

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Load other branches by passing e.g. `revision=non-annealed` (see "Important branches" below)
model = OlmoeForCausalLM.from_pretrained("OLMoE/OLMoE-1B-7B-Instruct").to(DEVICE)
tokenizer = AutoTokenizer.from_pretrained("OLMoE/OLMoE-1B-7B-Instruct")
messages = [{"role": "user", "content": "Explain to me like I'm five what is Bitcoin."}]
# Build the prompt with the model's chat template; returns a tensor of input ids
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
out = model.generate(inputs.to(DEVICE), max_length=64)
print(tokenizer.decode(out[0]))
# > # Bitcoin is a digital currency that is created and held electronically. No one controls it. Bitcoins aren’t printed, like dollars or euros – they’re produced by people and businesses running computers all around the world, using software that solves mathematical
```
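
If you want to see the exact prompt string the chat template builds (rather than token ids), you can render it as text; this reuses the `tokenizer` and `messages` from the snippet above:

```python
# Render the chat template as a plain string instead of tokenizing it.
prompt_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt_text)
```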
|
|
|
You can list all revisions/branches by installing `huggingface-hub` & running: |
|
```python
from huggingface_hub import list_repo_refs

out = list_repo_refs("OLMoE/OLMoE-1B-7B-Instruct")
branches = [b.name for b in out.branches]
```
|
|
|
Important branches (see the loading sketch after this list):

- `main`: Preference-tuned via DPO from the `main` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0824-SFT

- `no-load-balancing`: Ablation trained without the load-balancing loss during DPO, starting from the `no-load-balancing` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0824-SFT

- `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0824-SFT, which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/OLMoE/OLMoE-1B-7B-0824)
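
As a minimal sketch, any of these branches can be loaded by passing `revision` to `from_pretrained`, assuming the same repo id as in the usage example above:

```python
from transformers import OlmoeForCausalLM, AutoTokenizer

# Example: load the DPO ablation trained without the load-balancing loss.
model = OlmoeForCausalLM.from_pretrained("OLMoE/OLMoE-1B-7B-Instruct", revision="no-load-balancing")
tokenizer = AutoTokenizer.from_pretrained("OLMoE/OLMoE-1B-7B-Instruct", revision="no-load-balancing")
```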
|
|
|
# Citation |
|
|
|
```bibtex |
|
TODO |
|
``` |