---
license: mit
language:
- fr
library_name: transformers
tags:
- linformer
- medical
- RoBERTa
- pytorch
---

# Jargon-NACHOS-4096

[Jargon](https://hal.science/hal-04535557/file/FB2_domaines_specialises_LREC_COLING24.pdf) is an efficient transformer encoder LM for French, combining the Linformer attention mechanism with the RoBERTa model architecture.

Jargon is available in several versions with different context sizes and types of pre-training corpora.

| **Model**                                                                           | **Initialised from...** |**Training Data**|
|-------------------------------------------------------------------------------------|:-----------------------:|:----------------:|
| [jargon-general-base](https://huggingface.co/PantagrueLLM/jargon-general-base)        |         scratch         |8.5GB Web Corpus|
| [jargon-general-biomed](https://huggingface.co/PantagrueLLM/jargon-general-biomed)    |   jargon-general-base   |5.4GB Medical Corpus|
| jargon-general-legal                                                                |   jargon-general-base   |18GB Legal Corpus|
| [jargon-multidomain-base](https://huggingface.co/PantagrueLLM/jargon-multidomain-base) |   jargon-general-base   |Medical+Legal Corpora|
| jargon-legal                                                                        |         scratch         |18GB Legal Corpus|
| [jargon-legal-4096](https://huggingface.co/PantagrueLLM/jargon-legal-4096)          |         scratch         |18GB Legal Corpus|
| [jargon-biomed](https://huggingface.co/PantagrueLLM/jargon-biomed)                    |         scratch         |5.4GB Medical Corpus|
| [jargon-biomed-4096](https://huggingface.co/PantagrueLLM/jargon-biomed-4096)          |         scratch         |5.4GB Medical Corpus|
| [jargon-NACHOS](https://huggingface.co/PantagrueLLM/jargon-NACHOS)                    |         scratch         |[NACHOS](https://drbert.univ-avignon.fr/)|
| [jargon-NACHOS-4096](https://huggingface.co/PantagrueLLM/jargon-NACHOS-4096)        |         scratch         |[NACHOS](https://drbert.univ-avignon.fr/)|


## Evaluation

The Jargon models were evaluated on a range of specialized downstream tasks.

## Biomedical Benchmark

Results are averaged across five runs with varying random seeds.

| |[**FrenchMedMCQA**](https://huggingface.co/datasets/qanastek/frenchmedmcqa)|[**MQC**](https://aclanthology.org/2020.lrec-1.72/)|[**CAS-POS**](https://clementdalloux.fr/?page_id=28)|[**ESSAI-POS**](https://clementdalloux.fr/?page_id=28)|[**CAS-SG**](https://aclanthology.org/W18-5614/)|[**MEDLINE**](https://huggingface.co/datasets/mnaguib/QuaeroFrenchMed)|[**EMEA**](https://huggingface.co/datasets/mnaguib/QuaeroFrenchMed)|[**E3C-NER**](https://live.european-language-grid.eu/catalogue/corpus/7618)|[**CLISTER**](https://aclanthology.org/2022.lrec-1.459/)|
|-------------------------|:-----------------------:|:-----------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|
| **Task Type**           | Sequence Classification | Sequence Classification | Token Classification | Token Classification | Token Classification | Token Classification | Token Classification | Token Classification |          STS         |
| **Metric**              |           EMR           |         Accuracy        |       Macro-F1       |       Macro-F1       |      Weighted F1     |      Weighted F1     |      Weighted F1     |      Weighted F1     | Spearman Correlation |
| jargon-general-base     |           12.9          |           76.7          |         96.6         |         96.0         |         69.4         |         81.7         |         96.5         |         91.9         |         78.0         |
| jargon-biomed           |           15.3          |           91.1          |         96.5         |         95.6         |         75.1         |         83.7         |         96.5         |         93.5         |         74.6         |
| jargon-biomed-4096      |           14.4          |           78.9          |         96.6         |         95.9         |         73.3         |         82.3         |         96.3         |         92.5         |         65.3         |
| jargon-general-biomed   |           16.1          |           69.7          |         95.1         |         95.1         |         67.8         |         78.2         |         96.6         |         91.3         |         59.7         |
| jargon-multidomain-base |           14.9          |           86.9          |         96.3         |         96.0         |         70.6         |         82.4         |         96.6         |         92.6         |         74.8         |
| jargon-NACHOS           |           13.3          |           90.7          |         96.3         |         96.2         |         75.0         |         83.4         |         96.8         |         93.1         |         70.9         |
| jargon-NACHOS-4096      |           18.4          |           93.2          |         96.2         |         95.9         |         74.9         |         83.8         |         96.8         |         93.2         |         74.9         |
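
As a rough illustration of how a score such as the EMR column can be aggregated over several seeded runs, here is a minimal sketch. It assumes that EMR stands for the exact match ratio as commonly defined for multi-answer question answering (a question counts only if the predicted answer set equals the gold set). The helper `exact_match_ratio` and all data below are hypothetical and made up for this example; this is not the evaluation code used for the table.

```python
from statistics import mean, stdev


def exact_match_ratio(predictions, references):
    """Fraction of questions whose predicted answer set equals the gold set.

    Assumed definition of EMR for this sketch (hypothetical helper).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    matches = sum(set(p) == set(r) for p, r in zip(predictions, references))
    return matches / len(references)


# Toy, made-up gold answers for three multiple-choice questions.
gold = [{"a"}, {"b", "c"}, {"d"}]

# Made-up predictions from two hypothetical runs with different seeds.
runs = [
    [{"a"}, {"b"}, {"d"}],       # the second question is only partially right
    [{"a"}, {"b", "c"}, {"d"}],  # every question matches exactly
]

# Score each run as a percentage, then aggregate across runs.
per_run = [100 * exact_match_ratio(preds, gold) for preds in runs]
print("EMR per run:", [round(score, 1) for score in per_run])
print(f"mean = {mean(per_run):.1f}, std = {stdev(per_run):.1f}")
```

Under this definition a partially correct answer set earns no credit, which may help explain why EMR values are much lower than the accuracy and F1 scores in the other columns.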

For more details, see the [paper](https://hal.science/hal-04535557/file/FB2_domaines_specialises_LREC_COLING24.pdf), accepted for publication at [LREC-COLING 2024](https://lrec-coling-2024.org/list-of-accepted-papers/).


## Using Jargon models with HuggingFace transformers

You can get started with `jargon-NACHOS-4096` using the code snippet below:

```python
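from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# trust_remote_code=True lets transformers run the custom modeling code shipped in the model repository
tokenizer = AutoTokenizer.from_pretrained("PantagrueLLM/jargon-NACHOS-4096", trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained("PantagrueLLM/jargon-NACHOS-4096", trust_remote_code=True)

# Build a fill-mask pipeline and predict candidates for the masked token
jargon_maskfiller = pipeline("fill-mask", model=model, tokenizer=tokenizer)
output = jargon_maskfiller("Il est allé au <mask> hier")
print(output)  # list of candidate fills with their scores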
```

You can also use the classes `AutoModel`, `AutoModelForSequenceClassification`, or `AutoModelForTokenClassification` to load Jargon models, depending on the downstream task in question.
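
The choice of class per task can be sketched as a small helper. This is only an illustration: the task labels and the `load_jargon` function are hypothetical names introduced for this example, and the sketch assumes that the Jargon remote code accepts the usual `from_pretrained` keyword arguments such as `num_labels`.

```python
import importlib

# Hypothetical task labels mapped to the transformers Auto classes named above.
TASK_TO_AUTO_CLASS = {
    "features": "AutoModel",
    "fill-mask": "AutoModelForMaskedLM",
    "sequence-classification": "AutoModelForSequenceClassification",
    "token-classification": "AutoModelForTokenClassification",
}


def load_jargon(task, model_id="PantagrueLLM/jargon-NACHOS-4096", **kwargs):
    """Load a Jargon checkpoint with the Auto class matching `task`.

    Hypothetical convenience wrapper, not part of the Jargon release.
    """
    if task not in TASK_TO_AUTO_CLASS:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(TASK_TO_AUTO_CLASS)}"
        )
    # Import lazily so the mapping can be inspected without transformers installed.
    transformers = importlib.import_module("transformers")
    auto_cls = getattr(transformers, TASK_TO_AUTO_CLASS[task])
    # trust_remote_code=True mirrors the loading snippet in this model card.
    return auto_cls.from_pretrained(model_id, trust_remote_code=True, **kwargs)
```

For instance, `load_jargon("sequence-classification", num_labels=2)` would load the encoder with a freshly initialised classification head, which then needs fine-tuning on the target task; the value `num_labels=2` is purely illustrative.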

- **Language(s):** French
- **License:** MIT
- **Developed by:** Vincent Segonne
- **Funded by:**
  - GENCI-IDRIS (Grant 2022 A0131013801)
  - French National Research Agency: Pantagruel grant ANR-23-IAS1-0001
  - MIAI@Grenoble Alpes ANR-19-P3IA-0003
  - PROPICTO ANR-20-CE93-0005
  - Lawbot ANR-20-CE38-0013
  - Swiss National Science Foundation (grant PROPICTO N°197864)
- **Authors:**
  - Vincent Segonne
  - Aidan Mannion
  - Laura Cristina Alonzo Canul
  - Alexandre Audibert
  - Xingyu Liu
  - Cécile Macaire
  - Adrien Pupier
  - Yongxin Zhou
  - Mathilde Aguiar
  - Felix Herron
  - Magali Norré
  - Massih-Reza Amini
  - Pierrette Bouillon
  - Iris Eshkol-Taravella
  - Emmanuelle Esperança-Rodier
  - Thomas François
  - Lorraine Goeuriot
  - Jérôme Goulian
  - Mathieu Lafourcade
  - Benjamin Lecouteux
  - François Portet
  - Fabien Ringeval
  - Vincent Vandeghinste
  - Maximin Coavoux
  - Marco Dinarelli
  - Didier Schwab


## Citation

If you use this model for your own research work, please cite as follows:

```bibtex
@inproceedings{segonne:hal-04535557,
  TITLE = {{Jargon: A Suite of Language Models and Evaluation Tasks for French Specialized Domains}},
  AUTHOR = {Segonne, Vincent and Mannion, Aidan and Alonzo Canul, Laura Cristina and Audibert, Alexandre and Liu, Xingyu and Macaire, C{\'e}cile and Pupier, Adrien and Zhou, Yongxin and Aguiar, Mathilde and Herron, Felix and Norr{\'e}, Magali and Amini, Massih-Reza and Bouillon, Pierrette and Eshkol-Taravella, Iris and Esperan{\c c}a-Rodier, Emmanuelle and Fran{\c c}ois, Thomas and Goeuriot, Lorraine and Goulian, J{\'e}r{\^o}me and Lafourcade, Mathieu and Lecouteux, Benjamin and Portet, Fran{\c c}ois and Ringeval, Fabien and Vandeghinste, Vincent and Coavoux, Maximin and Dinarelli, Marco and Schwab, Didier},
  URL = {https://hal.science/hal-04535557},
  BOOKTITLE = {{LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation}},
  ADDRESS = {Turin, Italy},
  YEAR = {2024},
  MONTH = May,
  KEYWORDS = {Self-supervised learning ; Pretrained language models ; Evaluation benchmark ; Biomedical document processing ; Legal document processing ; Speech transcription},
  PDF = {https://hal.science/hal-04535557/file/FB2_domaines_specialises_LREC_COLING24.pdf},
  HAL_ID = {hal-04535557},
  HAL_VERSION = {v1},
}
```

