This is an NER model for detecting and extracting citations from American legal documents.
Ignore the widget on the model card page; see below for usage.
How to Use the Model
This model outputs token-level predictions, which should be processed as follows to obtain meaningful labels for each token:
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("ss108/legal-citation-bert")
model = AutoModelForTokenClassification.from_pretrained("ss108/legal-citation-bert")
model.eval()

text = "Your example text here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Run inference without tracking gradients; take the highest-scoring label per token
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predictions = torch.argmax(logits, dim=-1)

# Map token ids back to token strings and label ids back to label names
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predicted_labels = [model.config.id2label[p.item()] for p in predictions[0]]

components = []
for token, label in zip(tokens, predicted_labels):
    components.append(f"{token} : {label}")
concat = " ; ".join(components)
print(concat)
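If you want whole citation spans rather than per-token labels, the Transformers token-classification pipeline can group adjacent tokens for you. The sketch below is not part of the original card; it assumes the checkpoint loads with a fast tokenizer (grouping needs character offsets), and the example sentence is purely illustrative.

from transformers import pipeline

# Token-classification pipeline; aggregation_strategy="simple" merges consecutive
# tokens that belong to the same entity into a single span.
ner = pipeline(
    "token-classification",
    model="ss108/legal-citation-bert",
    aggregation_strategy="simple",
)

# Hypothetical input text, used only to illustrate the output shape
for entity in ner("See Smith v. Jones, 123 F.3d 456, 458 (9th Cir. 1997)."):
    # Each aggregated entity includes the label, the matched text, and character offsets
    print(entity["entity_group"], entity["word"], entity["start"], entity["end"])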