datasets:
  - rcds/MultiLegalNeg
language:
  - de
  - fr
  - it
  - en
tags:
  - legal

Model Card for neg-xlm-roberta-base

This model is based on XLM-R-Base. It was fine-tuned for negation scope resolution using NegBERT (Khandelwal and Sawant 2020). For training we used the Multi Legal Neg Dataset, a multilingual dataset of legal data annotated for negation cues and scopes, as well as ConanDoyle-neg (Morante and Blanco 2012), SFU Review (Konstantinova et al. 2012), BioScope (Szarvas et al. 2008) and Dalloux (Dalloux et al. 2020).

Model Details

Model Description

  • Model type: Transformer-based language model (XLM-R-base)
  • Languages: de, fr, it, en
  • License: CC BY-SA
  • Finetune Task: Negation Scope Resolution

Uses

See LegalNegBERT for details on the training process and how to use this model.
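As a rough illustration of how a token classification checkpoint like this one is typically loaded, the sketch below uses the `pipeline` API from Transformers. The repository id `rcds/neg-xlm-roberta-base` is an assumption inferred from the model name used in the Evaluation section, and the helper names `load_tagger` and `labels_by_token` are hypothetical. The label names the model emits are not documented in this card, so the sketch only pairs each sub-word token with whatever label is returned. It also does not reproduce any cue-marking preprocessing that the NegBERT approach may require; see LegalNegBERT for the actual procedure.

```python
from typing import Any, Dict, List, Tuple

# Assumed repository id; not stated explicitly in this card.
MODEL_ID = "rcds/neg-xlm-roberta-base"


def load_tagger(model_id: str = MODEL_ID):
    """Build a token classification pipeline (downloads the checkpoint)."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def labels_by_token(predictions: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Reduce raw pipeline output to (sub-word token, predicted label) pairs."""
    return [(p["word"], p["entity"]) for p in predictions]


if __name__ == "__main__":
    tagger = load_tagger()
    sentence = "The court did not accept the appeal."
    for token, label in labels_by_token(tagger(sentence)):
        print(f"{token}\t{label}")
```

The pipeline returns one dictionary per sub-word token; how those labels map onto negation scopes depends on the label scheme used during fine-tuning.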

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Training Data

This model was fine-tuned on the Multi Legal Neg Dataset.

Evaluation

We evaluate neg-xlm-roberta-base on the test sets in the Multi Legal Neg Dataset.

| Test Dataset | F1-score |
| --- | --- |
| fr | 92.49 |
| it | 88.81 |
| de (DE) | 95.66 |
| de (CH) | 87.82 |
| SFU Review | 88.53 |
| ConanDoyle-neg | 90.47 |
| BioScope | 95.59 |
| Dalloux | 93.99 |
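For readers unfamiliar with the metric, the sketch below shows how an F1-score can be computed at the token level for binary in-scope/out-of-scope labels. This card does not state whether the reported scores are token-level or scope-level, so this is a generic illustration rather than the evaluation script behind the numbers above. The function name `token_f1` and the 0/1 label encoding are assumptions.

```python
from typing import Sequence


def token_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Token-level F1 for binary labels (1 = inside negation scope, 0 = outside)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    if tp == 0:
        # No correctly predicted in-scope token: precision or recall is zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [0, 1, 1, 1, 0]
    pred = [0, 1, 1, 0, 1]
    # Reported as a percentage, matching the scale used in the table.
    print(round(100 * token_f1(gold, pred), 2))
```

In the toy example, two of the three in-scope tokens are recovered and one out-of-scope token is wrongly marked, so precision and recall are both 2/3.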

Software

PyTorch, Transformers.

Citation

Please cite the following preprint:

@misc{christen2023resolving,
      title={Resolving Legalese: A Multilingual Exploration of Negation Scope Resolution in Legal Documents}, 
      author={Ramona Christen and Anastassia Shaitarova and Matthias Stürmer and Joel Niklaus},
      year={2023},
      eprint={2309.08695},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}