Trying to use the model for training with trlx
I'm trying to load the model with trlx and am getting the following error.
Example code
import trlx
trainer = trlx.train('AI-Sweden/gpt-sw3-126m', dataset=[('dolphins', 'geese'), (1.0, 100.0)])
print("trainer: ", trainer)
OSError: AI-Sweden/gpt-sw3-126m does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
Steps to reproduce.
Install trlx following the documentation here and then run the code.
https://github.com/CarperAI/trlx
If you compare it to https://huggingface.co/EleutherAI/gpt-j-6B/tree/main, there are additional files in that model's folder.
Hey, without having looked deeply, I think it could simply be a problem with the repository/model path. Try AI-Sweden-Models/gpt-sw3-126m.
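For reference, here is the same reproduction snippet with only the repository path swapped to the renamed repo (everything else unchanged from the code above):
import trlx
# Identical to the reproduction code above, only the repository path is updated
trainer = trlx.train('AI-Sweden-Models/gpt-sw3-126m', dataset=[('dolphins', 'geese'), (1.0, 100.0)])
print("trainer: ", trainer)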
Great, that actually solved the issue, but now I'm getting a new one. It might be a trlx problem, so I'm not sure whether it's related to the model.
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [1536, 2] but got: [1536, 1].
Btw @JoeyOhman, the README says to load it like this:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
# Initialize Variables
model_name = "AI-Sweden/gpt-sw3-126m"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
prompt = "Träd är fina för att"
# Initialize Tokenizer & Model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
model.to(device)
Maybe just update the README to:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
# Initialize Variables
model_name = "AI-Sweden-Models/gpt-sw3-126m"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
prompt = "Träd är fina för att"
# Initialize Tokenizer & Model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
model.to(device)
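For what it's worth, a minimal continuation of that snippet to actually generate from the prompt (a sketch; the generation settings are my own choices, not from the README):
# Generate a short continuation of the prompt defined above
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
generated = model.generate(input_ids, max_new_tokens=20, do_sample=True, top_p=0.9)
print(tokenizer.decode(generated[0]))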
Yeah, we recently changed the repository name and will fix this as soon as possible. Thanks!
I don't know why you get that error, and I don't have access to my computer for a few days (maybe someone else will jump in here before then). However, something that concerns me is that it seems to use GPT2TokenizerFast and not GPTSw3Tokenizer. It might not be related to the error, but it will probably give you unexpected behavior later on. The config file does point to the correct tokenizer, so please let us know if you think that problem could be on our end!
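A quick way to check which tokenizer class is actually resolved (a minimal sketch; use_auth_token=True assumes you have been granted access and are logged in via huggingface-cli login):
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-126m", use_auth_token=True)
# Expecting GPTSw3Tokenizer here; GPT2TokenizerFast would confirm the fallback mentioned above
print(type(tokenizer).__name__)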
@JoeyOhman There is no tokenizer.json or tokenizer_config.json in this repository. Could it be that this makes Hugging Face fall back to a default tokenizer that somehow breaks things?
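To double-check which files the repository actually exposes, something like this should work (a sketch using huggingface_hub; a gated repo may require being logged in first):
from huggingface_hub import list_repo_files
# Shows whether tokenizer.json / tokenizer_config.json are present in the repo at all
print(list_repo_files("AI-Sweden-Models/gpt-sw3-126m"))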
Installing transformers from source and installing sentencepiece resolved the issue. I could then load the tokenizer with the following code:
self.tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-126m", use_auth_token=True)
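And for completeness, the matching model load under the same assumptions (use_auth_token=True picks up the token saved by huggingface-cli login):
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "AI-Sweden-Models/gpt-sw3-126m"
# Both the tokenizer and the model weights sit behind the access token
tokenizer = AutoTokenizer.from_pretrained(model_name, use_auth_token=True)
model = AutoModelForCausalLM.from_pretrained(model_name, use_auth_token=True)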
Sorry for the delayed answer and great that you solved it!
The READMEs now have the correct repository path and a note about how to use the access token. Installing from source is required right now, but not for much longer, as GPTSw3 should be included in the next official release of Hugging Face Transformers.