cointegrated/bert-char-ctc-en-ru-translit-v0

cointegrated committed on Sep 24

Commit e143da0 • 1 Parent(s): b4eed35

Update README.md

Files changed (1)

README.md +19 -1

README.md CHANGED

@@ -5,7 +5,25 @@ tags: []
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->

+This model performs learned (trainable) transliteration from Latin script (mostly English, but not only English) to Russian Cyrillic.
+How to use:
+```
+import torch
+from transformers import BertForMaskedLM, AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained("cointegrated/bert-char-ctc-en-ru-translit-v0", trust_remote_code=True)
+model = BertForMaskedLM.from_pretrained("cointegrated/bert-char-ctc-en-ru-translit-v0")
+text = 'Hello world! My name is David Dale, and yours is Schwarzenegger?'
+with torch.inference_mode():
+    batch = tokenizer(text, return_tensors='pt', spaces=1, padding=True).to(model.device)
+    logits = torch.log_softmax(model(**batch).logits, dim=-1)
+print(tokenizer.decode(logits[0].argmax(-1), skip_special_tokens=True))
+# хэло Уорлд май нэйм из дэвид дэйл энд ёрз из скУорзэнэгжэр
+```
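
The `ctc` in the model name suggests that the per-position predictions are collapsed CTC-style before being turned into text. This is an assumption, not something the README states, and the tokenizer's own `decode` may work differently. Under that assumption, a minimal sketch of greedy CTC collapsing looks like this; the `ctc_greedy_collapse` helper, the blank id of `0`, and the toy vocabulary are all hypothetical and exist only for illustration:

```python
from itertools import groupby


def ctc_greedy_collapse(ids, blank_id=0):
    """Collapse a per-position argmax sequence the CTC way.

    Step 1: merge each run of identical consecutive ids into one id.
    Step 2: drop every occurrence of the blank id.
    blank_id=0 is an assumption, not the model's documented value.
    """
    merged = [key for key, _ in groupby(ids)]
    return [i for i in merged if i != blank_id]


# Hypothetical toy vocabulary; the real tokenizer vocabulary differs.
vocab = {0: "", 1: "х", 2: "э", 3: "л", 4: "о"}

# Pretend these are argmax ids over 10 output positions.
frame_ids = [1, 1, 0, 2, 0, 3, 3, 0, 4, 4]

collapsed = ctc_greedy_collapse(frame_ids)
decoded = "".join(vocab[i] for i in collapsed)
print(decoded)  # хэло
```

A blank between two identical ids keeps them as separate characters, so `[3, 0, 3]` collapses to `[3, 3]` rather than `[3]`. That is how a CTC output can still express doubled letters.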