Trying to use the model for training with trlx
I'm trying to load the model with trlx and am getting the following error.
Example code
import trlx
trainer = trlx.train('AI-Sweden/gpt-sw3-126m', dataset=[('dolphins', 'geese'), (1.0, 100.0)])
print("trainer: ", trainer)
OSError: AI-Sweden/gpt-sw3-126m does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
Steps to reproduce.
Install trlx following the documentation here and then run the code.
https://github.com/CarperAI/trlx
If you compare it to https://huggingface.co/EleutherAI/gpt-j-6B/tree/main, there are additional files in that model's folder.
Hey, without having looked deeply, I think it could simply be a problem with the repository/model path. Try AI-Sweden-Models/gpt-sw3-126m.
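For reference, here is the same reproduction snippet with only the repository path swapped to the renamed repo (everything else unchanged from the code above):
import trlx
# Identical to the reproduction code above, only the repository path is updated
trainer = trlx.train('AI-Sweden-Models/gpt-sw3-126m', dataset=[('dolphins', 'geese'), (1.0, 100.0)])
print("trainer: ", trainer)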
Great, that actually solved the issue, but now I'm getting a new one. It might be a trlx problem, so I'm not sure whether it's related to the model.
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [1536, 2] but got: [1536, 1].
Btw @JoeyOhman, the README says to load it like this:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
# Initialize Variables
model_name = "AI-Sweden/gpt-sw3-126m"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
prompt = "Träd är fina för att"
# Initialize Tokenizer & Model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
model.to(device)
Maybe just update the README to:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
# Initialize Variables
model_name = "AI-Sweden-Models/gpt-sw3-126m"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
prompt = "Träd är fina för att"
# Initialize Tokenizer & Model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
model.to(device)
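For what it's worth, a minimal continuation of that snippet to actually generate from the prompt (a sketch; the generation settings are my own choices, not from the README):
# Generate a short continuation of the prompt defined above
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
generated = model.generate(input_ids, max_new_tokens=20, do_sample=True, top_p=0.9)
print(tokenizer.decode(generated[0]))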
Yeah, we recently changed the repository name and will fix this as soon as possible. Thanks!
I don't know why you get that error, and I don't have access to my computer for a few days (maybe someone else will jump in here before then). However, something that concerns me is that it seems to use GPT2TokenizerFast and not GPTSw3Tokenizer. It might not be related to the error, but it will probably give you unexpected behavior later on. The config file does point to the correct tokenizer, so please let us know if you think that problem could be on our end!
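A quick way to check which tokenizer class is actually resolved (a minimal sketch; use_auth_token=True assumes you have been granted access and are logged in via huggingface-cli login):
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-126m", use_auth_token=True)
# Expecting GPTSw3Tokenizer here; GPT2TokenizerFast would confirm the fallback mentioned above
print(type(tokenizer).__name__)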
@JoeyOhman There is no tokenizer.json or tokenizer_config.json in this repository. Could it be that this makes Hugging Face fall back to a default tokenizer that somehow breaks things?
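To double-check which files the repository actually exposes, something like this should work (a sketch using huggingface_hub; a gated repo may require being logged in first):
from huggingface_hub import list_repo_files
# Shows whether tokenizer.json / tokenizer_config.json are present in the repo at all
print(list_repo_files("AI-Sweden-Models/gpt-sw3-126m"))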
Installing transformers from source and installing sentencepiece resolved the issue. I could then load the tokenizer with the following code:
self.tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-126m", use_auth_token=True)
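And for completeness, the matching model load under the same assumptions (use_auth_token=True picks up the token saved by huggingface-cli login):
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "AI-Sweden-Models/gpt-sw3-126m"
# Both the tokenizer and the model weights sit behind the access token
tokenizer = AutoTokenizer.from_pretrained(model_name, use_auth_token=True)
model = AutoModelForCausalLM.from_pretrained(model_name, use_auth_token=True)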
Sorry for the delayed answer and great that you solved it!
The READMEs now have the correct repository path and a note about how to use the access token. Installing from source is required right now, but not for much longer, as GPTSw3 should be included in the next official release of Hugging Face Transformers.