
TOD-XLMR

TOD-XLMR is a conversationally specialized, multilingual version of XLM-RoBERTa. It is pre-trained on English conversational corpora consisting of nine human-to-human, multi-turn task-oriented dialogue (TOD) datasets, following the procedure proposed in the paper TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue by Wu et al., and was first released in this repository.

The model is jointly trained with the two objectives proposed in TOD-BERT: masked language modeling (MLM) and response contrastive loss (RCL). Masked language modeling is a common pretraining strategy for BERT-style architectures, in which a random sample of tokens in the input sequence is replaced with the special mask token ([MASK] in BERT, <mask> in XLM-RoBERTa) and the model is trained to predict the original tokens. To further encourage the model to capture dialogic structure (i.e., the sequential order of dialog turns), the response contrastive loss is implemented with in-batch negative training via contrastive learning.
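
To make the RCL objective concrete, here is a minimal sketch of in-batch negative contrastive training; the toy context/response pairs, the use of the first-token representation as a sequence embedding, and the plain dot-product similarity are illustrative assumptions rather than the exact training setup of TOD-XLMR:

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("umanlp/TOD-XLMR")
encoder = AutoModel.from_pretrained("umanlp/TOD-XLMR")

# toy batch of dialog contexts and their gold responses (illustrative only)
contexts = ["i need a cheap hotel in the centre", "can you book a table for two at 7 pm"]
responses = ["sure, what price range do you prefer?", "of course, which restaurant would you like?"]

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # use the hidden state of the first token as the sequence embedding
    return encoder(**batch).last_hidden_state[:, 0]

ctx_emb, rsp_emb = encode(contexts), encode(responses)

# similarity of every context with every response in the batch;
# the off-diagonal entries serve as in-batch negatives
logits = ctx_emb @ rsp_emb.T
labels = torch.arange(len(contexts))
loss = F.cross_entropy(logits, labels)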

How to use

Here is how to use this model to get the features of a given text in PyTorch:

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("umanlp/TOD-XLMR")
model = AutoModelForMaskedLM.from_pretrained("umanlp/TOD-XLMR")

# prepare input
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')

# forward pass
output = model(**encoded_input)
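
If you want a quick sanity check of the MLM head, you can mask one token and inspect the prediction at that position. This is a minimal sketch that reuses the tokenizer and model loaded above; the example sentence is made up:

import torch

# mask one token and let the MLM head fill it in
masked_text = f"i am looking for a cheap {tokenizer.mask_token} in the centre."
inputs = tokenizer(masked_text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# position of the masked token and the highest-scoring replacement
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))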

Alternatively, you can use AutoModel to load the pretrained encoder and apply it to downstream tasks:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("umanlp/TOD-XLMR")
model = AutoModel.from_pretrained("umanlp/TOD-XLMR")

# prepare input
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')

# forward pass
output = model(**encoded_input)
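
Here output.last_hidden_state has shape (batch_size, sequence_length, hidden_size). One simple way to obtain a sentence-level feature, assuming you follow the TOD-BERT convention of taking the representation of the first token, is:

# sentence-level feature: hidden state of the first token (<s>)
sentence_embedding = output.last_hidden_state[:, 0]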