|
---
license: apache-2.0
datasets:
- squad_v2
language:
- en
library_name: transformers
pipeline_tag: text-classification
inference: false
---
|
# longformer-large-4096 fine-tuned on SQuAD2.0 for answerability scoring
|
This model determines whether a question is answerable given a context.

The output is a probability: values close to 0.0 indicate that the question is unanswerable, while values close to 1.0 indicate that it is answerable.
|
|
|
- Input: `question` and `context` |
|
- Output: `probability` (a single logit passed through a sigmoid)
|
|
|
## Model Details |
|
|
|
The longformer-large-4096 model is fine-tuned on the SQuAD2.0 dataset, where the input is the concatenation `question + context`.

Because SQuAD2.0 is class-imbalanced, we resample the data so that the model is trained on a 50/50 split between answerable and unanswerable examples (one way to build such a split is sketched below).
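
The exact resampling script is not part of this card; the following is a minimal sketch of one way to construct such a balanced split with the `datasets` library (the shuffling seeds and downsampling strategy are illustrative assumptions, not the training pipeline used for this model).

```python
from datasets import load_dataset, concatenate_datasets

squad = load_dataset("squad_v2", split="train")

# In SQuAD2.0, unanswerable questions have an empty gold-answer list
answerable = squad.filter(lambda ex: len(ex["answers"]["text"]) > 0)
unanswerable = squad.filter(lambda ex: len(ex["answers"]["text"]) == 0)

# Downsample the larger class so both classes contribute equally
n = min(len(answerable), len(unanswerable))
balanced = concatenate_datasets([
    answerable.shuffle(seed=0).select(range(n)),
    unanswerable.shuffle(seed=0).select(range(n)),
]).shuffle(seed=0)
```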
|
|
|
## How to Use the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
```python
>>> import torch
>>> from transformers import LongformerTokenizer, LongformerForSequenceClassification

>>> tokenizer = LongformerTokenizer.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")
>>> model = LongformerForSequenceClassification.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")

>>> context = """
... British government ministers have been banned from using Chinese-owned social media app TikTok on their work phones and devices on security grounds.
... The government fears sensitive data held on official phones could be accessed by the Chinese government.
... Cabinet Minister Oliver Dowden said the ban was a "precautionary" move but would come into effect immediately.
... """.replace("\n", " ").strip()

>>> # the input is the question, the separator token, then the context
>>> question1 = "Which application has been banned by the British government?"
>>> input_text1 = question1 + ' ' + tokenizer.sep_token + ' ' + context
>>> inputs1 = tokenizer(input_text1, max_length=4096, truncation=True, return_tensors="pt")
>>> prob1 = torch.sigmoid(model(**inputs1).logits.squeeze(-1))  # single logit -> probability
>>> print("P(answerable|question1, context) = {:.2f}%".format(prob1.item()*100))
P(answerable|question1, context) = 99.21% # highly answerable

>>> question2 = "Is Facebook popular among young students in America?"
>>> input_text2 = question2 + ' ' + tokenizer.sep_token + ' ' + context
>>> inputs2 = tokenizer(input_text2, max_length=4096, truncation=True, return_tensors="pt")
>>> prob2 = torch.sigmoid(model(**inputs2).logits.squeeze(-1))
>>> print("P(answerable|question2, context) = {:.2f}%".format(prob2.item()*100))
P(answerable|question2, context) = 2.53% # highly unanswerable
```
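
To score several questions against the same context in one forward pass, the inputs can be batched with padding. The `answerable_probs` helper below is a hypothetical convenience wrapper, not part of this repository; it applies the same `question + sep_token + context` formatting and logit -> sigmoid mapping as the example above.

```python
import torch
from transformers import LongformerTokenizer, LongformerForSequenceClassification

tokenizer = LongformerTokenizer.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")
model = LongformerForSequenceClassification.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")

def answerable_probs(questions, context):
    """Return P(answerable) for each question against the same context."""
    texts = [q + ' ' + tokenizer.sep_token + ' ' + context for q in questions]
    inputs = tokenizer(texts, max_length=4096, truncation=True,
                       padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze(-1)  # one logit per question
    return torch.sigmoid(logits).tolist()
```

Each returned value can then be thresholded (e.g. at 0.5) to make a hard answerable/unanswerable decision.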
|
|
|
## Citation |
|
|
|
```bibtex
@misc{manakul2023selfcheckgpt,
    title={SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models},
    author={Potsawee Manakul and Adian Liusie and Mark J. F. Gales},
    year={2023},
    eprint={2303.08896},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```