File size: 2,029 Bytes
e82063b
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
---
license: apache-2.0
language:
- en
tags:
- text-generation
- text2text-generation
pipeline_tag: text2text-generation
widget:
- text: "Summarize: You may want to stick it to your boss and leave your job, but don't do it if these are your reasons."
  example_title: "Example1"
- text: "Summarize: Jorge Alfaro drove in two runs, Aaron Nola pitched seven innings of two-hit ball and the Philadelphia Phillies beat the Los Angeles Dodgers 2-1 Thursday, spoiling Clayton Kershaw's first start in almost a month. Hitting out of the No. 8 spot in the ..."
  example_title: "Example2"
---

# MVP-summarization
The MVP-summarization model was proposed in [**MVP: Multi-task Supervised Pre-training for Natural Language Generation**](https://github.com/RUCAIBox/MVP/blob/main/paper.pdf) by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.

The detailed information and instructions can be found [https://github.com/RUCAIBox/MVP](https://github.com/RUCAIBox/MVP).

## Model Description
MVP-summarization is a prompt-based model that MVP is further equipped with prompts pre-trained using labeled summarization datasets. It is a variant (MVP+S) of our main [MVP](https://huggingface.co/RUCAIBox/mvp) model. It follows a Transformer encoder-decoder architecture with layer-wise prompts.

MVP-summarization is specially designed for summarization tasks, such as new summarization (CNN/DailyMail, XSum) and dialog summarization (SAMSum).

## Example
```python
>>> from transformers import MvpTokenizer, MvpForConditionalGeneration

>>> tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp-summarization")

>>> inputs = tokenizer(
...     "Summarize: You may want to stick it to your boss and leave your job, but don't do it if these are your reasons.",
...     return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
["Don't do it if these are your reasons"]
```

## Citation