UnifiedQA-Reddit-SYAC

This is an abstractive title answering (TA) / clickbait spoiling model.
It is a variant of allenai/unifiedqa-t5-large, fine-tuned on the Reddit SYAC dataset.
The model was trained as part of my master's thesis:

Abstractive title answering for clickbait content
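
The following is a minimal usage sketch, not an official recipe. It assumes the checkpoint is published on the Hugging Face Hub as `marksverdhei/unifiedqa-large-reddit-syac` and that it can be loaded with the standard `transformers` seq2seq classes. The input format (title and article joined by a literal `\n` sequence, as in UnifiedQA's published examples) is also an assumption, and `build_input` and `spoil` are hypothetical helper names.

```python
def build_input(title: str, article: str) -> str:
    """Join title and article body in UnifiedQA style.

    Assumption: this fine-tuned variant expects the same
    "question \\n context" layout as the base UnifiedQA models,
    where the separator is a literal backslash followed by "n".
    """
    return f"{title.strip()} \\n {article.strip()}"


def spoil(
    title: str,
    article: str,
    model_id: str = "marksverdhei/unifiedqa-large-reddit-syac",  # assumed Hub id
    max_input_length: int = 2048,
) -> str:
    """Generate a short answer to a clickbait title from the article text."""
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Inputs longer than max_input_length tokens are truncated.
    inputs = tokenizer(
        build_input(title, article),
        return_tensors="pt",
        truncation=True,
        max_length=max_input_length,
    )
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `spoil("You won't believe what this actor did", article_text)` would download the weights on first use. As with any output from this model, the generated answer should be checked against the article.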

Disinformation

This model has a demonstrated capacity to generate and hallucinate false information.
Anyone using a TA system such as this one should be aware of this risk.

Performance

Intrinsic

The following scores are the result of intrinsic evaluation on the Reddit SYAC test set.
We used a maximum input length of 2048 tokens and truncated any input exceeding this limit.

rouge1	rouge2	rougeL	bleu	meteor
44.58	23.89	43.45	17.46	36.22
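
To give a feel for what an overlap metric such as rouge1 measures, here is a simplified unigram-overlap F1 in plain Python. It is only an illustration: it assumes whitespace tokenization and lowercasing, applies no stemming, and is not the implementation used to produce the scores above. The function name `rouge1_f1` is hypothetical, and it returns a value in [0, 1], whereas the table appears to report values scaled to 0-100.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each reference token can be matched at most once.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "the cat sat" against the reference "the cat" shares two unigrams, giving a precision of 2/3, a recall of 1, and an F1 of 0.8.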

Quality

In a human evaluation, we asked evaluators to rate, on a scale from 1 to 5, how good each model's generated answer was for a given clickbait article.

Mean quality = 4.065

Factuality

We included a factuality assessment to address the issue of generating false information.
Human raters were asked to assign each output to one of three categories: "True", "Irrelevant", or "False".

True	Irrelevant	False
85%	7.5%	7.5%
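
The aggregation behind figures like the mean quality score and the category shares above can be sketched as follows. This is an assumed procedure, not the thesis code: `summarize` is a hypothetical helper, and it expects non-empty lists of ratings and labels that you supply yourself.

```python
from collections import Counter
from statistics import mean


def summarize(quality_ratings, factuality_labels):
    """Return (mean quality, percentage share per factuality category).

    quality_ratings: non-empty list of integers from 1 to 5.
    factuality_labels: non-empty list of "True", "Irrelevant" or "False".
    """
    counts = Counter(factuality_labels)
    total = len(factuality_labels)
    shares = {
        label: 100 * counts[label] / total
        for label in ("True", "Irrelevant", "False")
    }
    return mean(quality_ratings), shares
```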

Cite

If you use this model, please cite my master's thesis:

@mastersthesis{heiervang2022AbstractiveTA,
  title={Abstractive title answering for clickbait content},
  author={Markus Sverdvik Heiervang},
  school={University of Oslo, Department of Informatics},
  year={2022}
}
