Longformer-base-4096 fine-tuned on SQuAD v2
Longformer-base-4096 model fine-tuned on SQuAD v2 for Q&A downstream task.
Longformer-base-4096
Longformer is a transformer model for long documents.
longformer-base-4096 is a BERT-like model initialized from the RoBERTa checkpoint and pretrained with masked language modeling (MLM) on long documents. It supports sequences of up to 4,096 tokens.
Longformer combines sliding-window (local) attention with global attention. Global attention is configured by the user according to the task, which lets the model learn task-specific representations.
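To make the attention pattern concrete, here is a minimal pure-Python sketch of which token pairs may attend to each other under a sliding window plus a few global positions. The function name `attention_pattern` and its parameters are hypothetical and chosen for illustration; this is not how the library implements attention, only a picture of the resulting pattern.

```python
def attention_pattern(seq_len, window, global_positions):
    """Build a seq_len x seq_len grid where grid[i][j] is True if token i may attend to token j."""
    half = window // 2  # each token sees window // 2 neighbours on each side
    global_set = set(global_positions)
    grid = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            is_local = abs(i - j) <= half
            # Global tokens attend everywhere, and every token attends to them
            is_global = i in global_set or j in global_set
            row.append(is_local or is_global)
        grid.append(row)
    return grid


# Toy example (assumed values): 8 tokens, window of 2, global attention on token 0
grid = attention_pattern(seq_len=8, window=2, global_positions=[0])
for row in grid:
    print("".join("x" if allowed else "." for allowed in row))
```

In transformers, the choice of global positions is passed to the Longformer models as a `global_attention_mask` tensor.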
Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
Dataset ID: squad_v2 from HuggingFace/Datasets
| Dataset | Split | # samples |
| --- | --- | --- |
| squad_v2 | train | 130319 |
| squad_v2 | valid | 11873 |
How to load it from datasets
!pip install datasets
from datasets import load_dataset
dataset = load_dataset('squad_v2')
Check out more about this dataset and others in Datasets Viewer
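SQuAD v2 mixes answerable and unanswerable questions. In the squad_v2 schema, each record has an `answers` field holding parallel `text` and `answer_start` lists, and both are empty when the question has no answer. The sketch below checks for that case; the helper name `is_unanswerable` and the two records are made up for illustration and are not taken from the dataset.

```python
def is_unanswerable(example):
    """Return True when a squad_v2-style record has no gold answer."""
    return len(example["answers"]["text"]) == 0


# Hand-written records that mimic the squad_v2 schema (not real dataset rows)
examples = [
    {
        "question": "What has Huggingface done?",
        "context": "Huggingface has democratized NLP.",
        "answers": {"text": ["democratized NLP"], "answer_start": [16]},
    },
    {
        "question": "Who founded Huggingface?",
        "context": "Huggingface has democratized NLP.",
        "answers": {"text": [], "answer_start": []},
    },
]

n_unanswerable = sum(is_unanswerable(ex) for ex in examples)
print(f"{n_unanswerable} of {len(examples)} questions have no answer")
```

The same check can be applied to the records of the `dataset` object loaded above.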
Model fine-tuning 🏋️
The training script is a slightly modified version of this one
Model in Action 🚀
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)
text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]
attention_mask = encoding["attention_mask"]
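# Recent transformers versions return an output object, so read the logits by name
with torch.no_grad():
    outputs = model(input_ids, attention_mask=attention_mask)
start_scores, end_scores = outputs.start_logits, outputs.end_logits
# Keep the tokens between the most likely start and end positions
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
answer_tokens = all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
print(answer)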
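Taking the argmax of the start and end scores independently can yield an end position that comes before the start. A common alternative is to search for the best-scoring pair with start <= end. The sketch below does this on plain Python lists; `best_span` and `max_answer_len` are hypothetical names, and the scores are toy values rather than model output.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end, score) for the highest-scoring span with start <= end."""
    best = (0, 0, float("-inf"))
    for s, s_score in enumerate(start_scores):
        # Only consider spans of at most max_answer_len tokens
        last = min(s + max_answer_len, len(end_scores))
        for e in range(s, last):
            score = s_score + end_scores[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Toy scores where independent argmax would pick start=1 and end=0
start = [0.1, 2.0, 0.3, 0.2]
end = [3.0, 0.1, 1.5, 0.4]
s, e, score = best_span(start, end)
print(s, e)
```

With the tensors from the example above, one would pass `start_scores[0].tolist()` and `end_scores[0].tolist()`.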
Usage with HF pipeline
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline
ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"
qa({"question": question, "context": text})
If, for the same context, we ask a question whose answer is not there, the no-answer output will be <s>
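Downstream code usually wants an explicit "no answer" value instead of the raw `<s>` string. A minimal sketch, assuming the decoded answer is exactly `<s>` (or empty) when the model finds no answer, as stated above; `clean_answer` and `NO_ANSWER_TOKEN` are hypothetical names.

```python
NO_ANSWER_TOKEN = "<s>"  # assumed no-answer marker, per the note above


def clean_answer(decoded):
    """Return the stripped answer text, or None when the model signalled no answer."""
    text = decoded.strip()
    if text == "" or text == NO_ANSWER_TOKEN:
        return None
    return text


for raw in [" democratized NLP", "<s>"]:
    result = clean_answer(raw)
    print("no answer" if result is None else result)
```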
Created by Manuel Romero/@mrm8488 | LinkedIn
Made with ♥ in Spain