Enable LlamaTokenizerFast and AutoTokenizer to load in seconds rather than 5 minutes.

#1

Same procedure as last time: convert the tokenizer so that it loads through HF's AutoTokenizer. See https://huggingface.co/danielhanchen/open_llama_3b_600bt_preview for details.
I.e.:

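from transformers import AutoTokenizer
model_name = "openlm-research/open_llama_7b"
# Load once from the original repo, setting the pad token to "</s>"
tokenizer = AutoTokenizer.from_pretrained(model_name, pad_token = "</s>")
# Upload the resulting tokenizer files to the Hub
tokenizer.push_to_hub("danielhanchen/open_llama_7b")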
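
To check the load time on your own machine, something like the following rough sketch could be used. The helper names `time_call` and `timed_tokenizer_load` are hypothetical and made up for this illustration (they are not part of transformers). The timing you get will depend on your network and cache, so no expected numbers are given here.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def timed_tokenizer_load(repo_id: str) -> Tuple[Any, float]:
    """Hypothetical helper: load a tokenizer with AutoTokenizer and time it.

    Needs the `transformers` package and access to the Hub (or a local cache).
    The import is kept inside the function so that the timing helper above
    can be used without transformers installed.
    """
    from transformers import AutoTokenizer

    return time_call(AutoTokenizer.from_pretrained, repo_id)
```

For example, `tokenizer, seconds = timed_tokenizer_load("danielhanchen/open_llama_7b")` followed by `print(type(tokenizer).__name__, round(seconds, 1))` shows which tokenizer class was loaded and how long it took.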