Update README.md
README.md
@@ -11,4 +11,72 @@ base_model:
- google-t5/t5-small
pipeline_tag: summarization
library_name: transformers
---

# Model Card for t5_small Summarization Model

## Model Details

- Model Architecture: T5 (Text-to-Text Transfer Transformer)
- Variant: t5-small
- Task: Text Summarization
- Framework: Hugging Face Transformers

## Training Data

- Dataset: CNN/DailyMail
- Content: News articles and their summaries
- Size: Approximately 300,000 article-summary pairs
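
The CNN/DailyMail pairs described above can be loaded with the Hugging Face `datasets` library. A minimal sketch, assuming the standard `3.0.0` configuration on the Hub (the exact copy used for training is not stated in this card):

```python
from datasets import load_dataset

# CNN/DailyMail: news articles ("article") paired with reference summaries ("highlights")
dataset = load_dataset("cnn_dailymail", "3.0.0")

print(dataset)                                 # train / validation / test splits
print(dataset["train"][0]["article"][:300])    # start of the first article
print(dataset["train"][0]["highlights"])       # its reference summary
```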

## Training Procedure

- Fine-tuning: performed with the Hugging Face Transformers library
- Hyperparameters:
  - Learning rate: 5e-5
  - Batch size: 8
  - Number of epochs: 3
  - Optimizer: AdamW
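
This card does not include the actual training script. The sketch below only illustrates how the hyperparameters listed above could be plugged into the Transformers `Seq2SeqTrainer` API; the preprocessing choices and the output directory name are assumptions made for the example:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
dataset = load_dataset("cnn_dailymail", "3.0.0")

def preprocess(batch):
    # T5 uses a task prefix; the reference highlights become the labels
    model_inputs = tokenizer(
        ["summarize: " + article for article in batch["article"]],
        max_length=512, truncation=True,
    )
    labels = tokenizer(text_target=batch["highlights"], max_length=150, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-cnn-dailymail",  # hypothetical output directory
    learning_rate=5e-5,                   # hyperparameters as listed above
    per_device_train_batch_size=8,
    num_train_epochs=3,
    # the Trainer's default optimizer is AdamW, matching the card
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```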

## How to Use

1. Install the Hugging Face Transformers library:
```
pip install transformers
```

2. Load the model:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```

3. Generate a summary:
```python
input_text = "Your input text here"
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```
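
Steps 2 and 3 can also be collapsed into the high-level `pipeline` helper. A short sketch, reusing the same checkpoint name as the snippets above:

```python
from transformers import pipeline

# summarization pipeline over the same checkpoint used in steps 2-3
summarizer = pipeline("summarization", model="t5-small")

text = "Your input text here"
result = summarizer(text, max_length=150, min_length=40, truncation=True)
print(result[0]["summary_text"])
```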

## Evaluation

- Metric: ROUGE scores (Recall-Oriented Understudy for Gisting Evaluation)
- Exact scores not available, but typically evaluated on:
  - ROUGE-1 (unigram overlap)
  - ROUGE-2 (bigram overlap)
  - ROUGE-L (longest common subsequence)
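
The usual ROUGE evaluation can be reproduced with the `evaluate` library (it needs the `rouge_score` package installed). The strings below are placeholders, not results reported for this model:

```python
import evaluate

# placeholder outputs; substitute model-generated summaries and reference highlights
predictions = ["the court upheld the appeal on Tuesday"]
references = ["on Tuesday the appeal was upheld by the court"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum F-measures
```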

## Limitations

- Performance may be lower compared to larger T5 variants
- Optimized for news article summarization; may not perform as well on other text types
- Limited to input sequences of 512 tokens (a chunking workaround is sketched after this list)
- Generated summaries may sometimes contain factual inaccuracies
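
For documents longer than the 512-token window, one common workaround is to summarize overlapping chunks and join the partial summaries. A rough sketch only; the chunk size and stride are arbitrary choices, not values tied to this model:

```python
import torch

def summarize_long(text, tokenizer, model, chunk_tokens=480, stride=360):
    """Summarize long text by summarizing overlapping chunks and joining the results."""
    prefix_ids = tokenizer("summarize: ", return_tensors="pt").input_ids[0, :-1]  # drop </s>
    body_ids = tokenizer(text, return_tensors="pt", add_special_tokens=False).input_ids[0]
    summaries = []
    for start in range(0, len(body_ids), stride):
        chunk = torch.cat([prefix_ids, body_ids[start:start + chunk_tokens]]).unsqueeze(0)
        output = model.generate(chunk, max_length=150, min_length=40,
                                num_beams=4, early_stopping=True)
        summaries.append(tokenizer.decode(output[0], skip_special_tokens=True))
        if start + chunk_tokens >= len(body_ids):
            break
    return " ".join(summaries)
```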

## Ethical Considerations

- May inherit biases present in the CNN/DailyMail dataset
- Not suitable for summarizing sensitive or critical information without human review
- Users should be aware of potential biases and inaccuracies in generated summaries
- Should not be used as a sole source of information for decision-making processes