datasets:
- rcds/MultiLegalNeg
language:
- de
- fr
- it
- en
tags:
- legal
Model Card for neg-xlm-roberta-base
This model is based on XLM-R-Base. It was fine-tuned for negation scope resolution using NegBERT (Khandelwal and Sawant 2020). For training we used the Multi Legal Neg Dataset, a multilingual dataset of legal data annotated for negation cues and scopes, together with ConanDoyle-neg (Morante and Blanco 2012), SFU Review (Konstantinova et al. 2012), BioScope (Szarvas et al. 2008), and Dalloux (Dalloux et al. 2020).
Model Details
Model Description
- Model type: Transformer-based language model (XLM-R-base)
- Languages: de, fr, it, en
- License: CC BY-SA
- Finetune Task: Negation Scope Resolution
Uses
See LegalNegBERT for details on the training process and how to use this model.
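As a rough orientation only, the sketch below shows one way the checkpoint might be loaded and its per-token output turned into scope spans. It assumes the checkpoint exposes a standard token-classification head that `transformers` can load directly, and that label `1` means "inside the negation scope". Neither assumption is confirmed by this card. The helper names `load_scope_tagger` and `scope_spans` are hypothetical, and the NegBERT-style encoding of negation cues in the input is not reproduced here.

```python
from typing import List, Sequence


def load_scope_tagger(model_id: str):
    """Load a tokenizer and token-classification model (downloads weights).

    `model_id` is the Hub ID or local path of this checkpoint; it is left
    to the caller because this sketch does not assume a specific ID.
    """
    # Imported lazily so that scope_spans() works without transformers installed.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    return tokenizer, model


def scope_spans(tokens: Sequence[str], labels: Sequence[int]) -> List[List[str]]:
    """Group consecutive tokens labelled 1 (assumed: in scope) into spans."""
    spans: List[List[str]] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == 1:
            current.append(token)
        elif current:
            # A token outside the scope closes the span collected so far.
            spans.append(current)
            current = []
    if current:
        spans.append(current)
    return spans


# Illustrative, hand-written labels (not model output).
example_tokens = ["The", "tenant", "did", "not", "pay", "the", "rent", "."]
example_labels = [0, 0, 0, 0, 1, 1, 1, 0]
print(scope_spans(example_tokens, example_labels))
```

The grouping step is kept separate from model loading so it can be reused with whatever prediction pipeline LegalNegBERT actually provides.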
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
Training Data
This model was fine-tuned on the Multi Legal Neg Dataset.
Evaluation
We evaluate neg-xlm-roberta-base on the test sets in the Multi Legal Neg Dataset.
Test Dataset | F1-score |
---|---|
fr | 92.49 |
it | 88.81 |
de (DE) | 95.66 |
de (CH) | 87.82 |
SFU Review | 88.53 |
ConanDoyle-neg | 90.47 |
BioScope | 95.59 |
Dalloux | 93.99 |
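
The card does not state how the F1-score is computed. As a minimal sketch, assuming a token-level F1 over binary in-scope labels (a common choice for scope resolution, but an assumption here), a score on the same percent scale as the table could be derived as follows. The function name `token_f1` is hypothetical.

```python
from typing import Sequence


def token_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Return the F1-score in percent for label 1 (assumed: token is in scope)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives and false negatives per token.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    if tp == 0:
        # No correctly predicted in-scope token: F1 is defined as 0 here.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100 * 2 * precision * recall / (precision + recall)


# Illustrative labels only, not taken from any of the test sets.
gold_labels = [0, 1, 1, 1, 0]
pred_labels = [0, 1, 1, 0, 1]
print(round(token_f1(gold_labels, pred_labels), 2))
```

Other conventions exist, such as exact-match scope F1, and they can give noticeably different numbers, so the table values should be compared only with results computed the same way.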
Software
PyTorch, Transformers.
Citation
Please cite the following preprint:
@misc{christen2023resolving,
title={Resolving Legalese: A Multilingual Exploration of Negation Scope Resolution in Legal Documents},
author={Ramona Christen and Anastassia Shaitarova and Matthias Stürmer and Joel Niklaus},
year={2023},
eprint={2309.08695},
archivePrefix={arXiv},
primaryClass={cs.CL}
}