How to finetune on Kaggle TPU
I have successfully fine-tuned large-v2 on a Kaggle T4x2 with AdaLoRA, but I want it to be faster with a TPU. Has anyone tried this?
You can try using the Flax fine-tuning script provided here: https://github.com/huggingface/transformers/tree/main/examples/flax/speech-recognition#whisper-model
Thanks for your reply. Can it be converted back to PyTorch? Because I will be using faster-whisper in the end.
Yes, you can convert any Transformers Flax model into PyTorch by passing the from_flax=True argument to from_pretrained:
from transformers import WhisperForConditionalGeneration
# load flax weights into pytorch
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2", from_flax=True)
# save pytorch weights
model.save_pretrained("./output_dir")
See the docs for more details: https://huggingface.co/docs/transformers/main_classes/model#transformers.PreTrainedModel.from_pretrained.from_flax
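Since you mentioned faster-whisper: once the PyTorch weights are saved, they can be converted to CTranslate2 format. A minimal sketch, assuming the ctranslate2 package is installed (the output directory name is just an example):

from ctranslate2.converters import TransformersConverter

# convert the saved PyTorch checkpoint to CTranslate2 format for faster-whisper
converter = TransformersConverter("./output_dir")
converter.convert("./whisper-large-v2-ct2", quantization="float16")

The resulting directory can then be loaded directly with faster_whisper.WhisperModel("./whisper-large-v2-ct2").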
Hello @sanchit-gandhi,
Thanks for your reply. I have tried the example script you referenced, but it does not execute successfully. My Kaggle notebook is below:
https://www.kaggle.com/code/lukejacob/whisper-finetune-tpu
I have tested it on GPU (T4x2) and it works fine, but when I switch to TPU v3-8, it crashes.
Hey @LukeJacob2023 - I think your notebook is private so I can't view it! What is the crash message? If you're able to make the notebook public I can take a look. If you're hitting OOMs, you might need to use the adafactor optimiser instead of adamw (1 optimiser state instead of 2): https://huggingface.co/sanchit-gandhi/large-v2-ls-ft/blob/9d49b40c54fe6e3bf458f05d2e955470ea8dc8d0/run_finetuning.py#L854
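For reference, a minimal sketch of that swap in optax (the learning rate is only a placeholder):

import optax

# adafactor keeps a single factored optimiser state, roughly halving
# optimiser memory compared to adamw's two states
optimizer = optax.adafactor(learning_rate=1e-4)
# instead of e.g.:
# optimizer = optax.adamw(learning_rate=1e-4, weight_decay=0.0)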
Sorry, it is public now:
https://www.kaggle.com/code/lukejacob/whisper-finetune-tpu. It is not an OOM; I don't think a TPU can OOM.
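One quick sanity check is to confirm that JAX actually sees the TPU cores before launching the script; a minimal sketch:

import jax

# a v3-8 should report 8 TPU devices; if this shows CPU only,
# the problem is the runtime setup rather than the training script
print(jax.devices())
print(jax.local_device_count())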
Hello @sanchit-gandhi, I am an ML student and I want to fine-tune Whisper specifically for Icelandic, and only Icelandic. I need a very good WER for it. Can you help me? I need it for my final project to become an engineer.
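As a starting point for Icelandic, a minimal sketch of loading training data with datasets, assuming the mozilla-foundation/common_voice_13_0 dataset and its "is" (Icelandic) config (you may need to accept the dataset's terms on the Hub first):

from datasets import load_dataset, Audio

# load the Icelandic ("is") train split of Common Voice and resample
# to 16 kHz, the sampling rate Whisper expects
common_voice = load_dataset("mozilla-foundation/common_voice_13_0", "is", split="train")
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16_000))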