---
language:
  - he
tags:
  - language model
datasets:
  - responsa
---

# AlephBERT-base-finetuned-for-shut

A Hebrew language model based on [alephbert-base](https://huggingface.co/onlplab/alephbert-base#alephbert).

## How to use

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

checkpoint = 'ysnow9876/alephbert-base-finetuned-for-shut'

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# if not fine-tuning, disable dropout
model.eval()
```
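
Since the checkpoint is loaded with `AutoModelForMaskedLM`, it can be used directly for masked-token prediction. Below is a minimal sketch; the Hebrew input sentence is an illustrative assumption, and it relies on the standard BERT-style `[MASK]` token inherited from alephbert-base:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

checkpoint = 'ysnow9876/alephbert-base-finetuned-for-shut'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)
model.eval()

# illustrative Hebrew sentence with one masked token (not from the model card)
text = "ירושלים היא [MASK] של ישראל"
inputs = tokenizer(text, return_tensors='pt')

with torch.no_grad():
    logits = model(**inputs).logits

# locate the [MASK] position and take the highest-scoring token
mask_index = (inputs['input_ids'][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```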

## Training Data

Approximately 26,000 responsa (rabbinic question-and-answer texts) written by various rabbis over the past few hundred years.