Commit a55bc73 by hoan
1 parent: d0071c4

Initial model version

README.md CHANGED
@@ -1,3 +1,213 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ metrics:
6
+ - f1
7
+ - recall
8
+ - precision
9
+ tags:
10
+ - NER
11
+ pipeline_tag: token-classification
12
+ library_name: gliner
13
+ datasets:
14
+ - knowledgator/GLINER-multi-task-synthetic-data
15
+ - EmergentMethods/AskNews-NER-v0
16
+ - urchade/pile-mistral-v0.1
17
+ - MultiCoNER/multiconer_v2
18
+ - DFKI-SLT/few-nerd
19
+ base_model: knowledgator/gliner-multitask-large-v0.5
20
+ ---
21
+
22
+ ![Illustration](images/gliner_merge_model_illustration.png)
23
+
24
+ The `xomad/gliner-model-merge-large-v1.0` model is derived from the pretrained `knowledgator/gliner-multitask-large-v0.5` model to explore model merging techniques. Merging raises the average zero-shot F1 score across the benchmarks reported below from 0.6276 to 0.6601, a gain of 3.25 points.
25
+
26
+ The model is trained exclusively on datasets with commercially friendly licenses so that it can be distributed under the Apache-2.0 license. The following datasets were used for training:
27
+ - [knowledgator/GLINER-multi-task-synthetic-data](https://huggingface.co/datasets/knowledgator/GLINER-multi-task-synthetic-data)
28
+ - [EmergentMethods/AskNews-NER-v0](https://huggingface.co/datasets/EmergentMethods/AskNews-NER-v0)
29
+ - [urchade/pile-mistral-v0.1](https://huggingface.co/datasets/urchade/pile-mistral-v0.1)
30
+ - [MultiCoNER/multiconer_v2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2)
31
+ - [DFKI-SLT/few-nerd](https://huggingface.co/datasets/DFKI-SLT/few-nerd)
32
+
33
+ ### ⚙️ Finetuning process
34
+ The process begins with the base model `knowledgator/gliner-multitask-large-v0.5`. We fine-tune it separately on each of the above datasets and save multiple checkpoints along the way. We then pool all of these checkpoints and apply the [Model soups](https://arxiv.org/abs/2203.05482) technique to produce three merged models:
35
+ - `uniform_merged`
36
+ - `greedy_on_random`
37
+ - `greedy_on_sorted`
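
The two soup recipes from the Model soups paper can be sketched as follows. This is a minimal sketch, not the code used to build this model: checkpoints are plain dictionaries of floats standing in for real `state_dict()` tensors, `toy_score` is a hypothetical held-out metric, and reading `greedy_on_sorted` and `greedy_on_random` as greedy soups over a score-sorted versus a shuffled pool is an assumption based on the names alone.

```python
import random
from typing import Callable, Dict, List, Sequence

# Stand-in for a checkpoint: parameter name -> value.
# With real models the values would be tensors; the arithmetic is the same.
Weights = Dict[str, float]


def uniform_soup(checkpoints: Sequence[Weights]) -> Weights:
    """Average every parameter across all checkpoints."""
    n = len(checkpoints)
    return {name: sum(c[name] for c in checkpoints) / n for name in checkpoints[0]}


def greedy_soup(ordered: Sequence[Weights], score: Callable[[Weights], float]) -> Weights:
    """Walk the checkpoints in the given order and keep a candidate only if
    adding it to the running average does not lower the held-out score."""
    ingredients: List[Weights] = [ordered[0]]
    best = score(ordered[0])
    for candidate in ordered[1:]:
        trial_score = score(uniform_soup(ingredients + [candidate]))
        if trial_score >= best:
            ingredients.append(candidate)
            best = trial_score
    return uniform_soup(ingredients)


def toy_score(weights: Weights) -> float:
    """Hypothetical held-out metric: higher is better, peaks at w == 1."""
    return -(weights["w"] - 1.0) ** 2


# Toy pool of three one-parameter "checkpoints" (illustrative values only).
pool = [{"w": 0.8}, {"w": 1.2}, {"w": 3.0}]

uniform_merged = uniform_soup(pool)

# Assumed meaning of `greedy_on_sorted`: best-scoring checkpoint first.
greedy_on_sorted = greedy_soup(sorted(pool, key=toy_score, reverse=True), toy_score)

# Assumed meaning of `greedy_on_random`: checkpoints visited in shuffled order.
shuffled = pool[:]
random.Random(0).shuffle(shuffled)
greedy_on_random = greedy_soup(shuffled, toy_score)

print(uniform_merged, greedy_on_sorted, greedy_on_random)
```

In this toy pool the greedy soup leaves out the outlier checkpoint that the uniform soup is forced to include, which is the motivation for trying both recipes.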
38
+
39
+ Following this, we apply the [WiSE-FT](https://openaccess.thecvf.com/content/CVPR2022/html/Wortsman_Robust_Fine-Tuning_of_Zero-Shot_Models_CVPR_2022_paper.html?ref=roboflow-blog) merging technique to pairs of models selected from the three merged models above and the original model to produce the `wise_ft_merged` model. This concludes the 1st finetuning phase.
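
WiSE-FT merges two models by linearly interpolating their weights, θ = (1 − α)·θ_a + α·θ_b. The sketch below illustrates that idea under stated assumptions rather than reproducing the training code: the candidate names mirror the models mentioned in the text, while the α grid, the `toy_score` metric, and the one-parameter weights are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Sequence, Tuple

Weights = Dict[str, float]  # stand-in for a real state_dict of tensors


def interpolate(theta_a: Weights, theta_b: Weights, alpha: float) -> Weights:
    """WiSE-FT style linear interpolation: (1 - alpha) * a + alpha * b."""
    return {k: (1.0 - alpha) * theta_a[k] + alpha * theta_b[k] for k in theta_a}


def best_pairwise_merge(
    candidates: Dict[str, Weights],
    score: Callable[[Weights], float],
    alphas: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),
) -> Tuple[Optional[Tuple[str, str, float]], Optional[Weights]]:
    """Try every pair of candidates and every alpha; keep the best-scoring merge.
    Returns (None, None) if fewer than two candidates are given."""
    best_key, best_weights, best_score = None, None, float("-inf")
    for (name_a, a), (name_b, b) in combinations(candidates.items(), 2):
        for alpha in alphas:
            merged = interpolate(a, b, alpha)
            s = score(merged)
            if s > best_score:  # strict: the first best merge wins ties
                best_key, best_weights, best_score = (name_a, name_b, alpha), merged, s
    return best_key, best_weights


def toy_score(weights: Weights) -> float:
    """Hypothetical held-out metric: higher is better, peaks at w == 1."""
    return -(weights["w"] - 1.0) ** 2


# Illustrative one-parameter stand-ins for the original and merged models.
candidates = {
    "original": {"w": 0.0},
    "uniform_merged": {"w": 2.0},
    "greedy_on_sorted": {"w": 5.0},
}

choice, wise_ft_merged = best_pairwise_merge(candidates, toy_score)
print(choice, wise_ft_merged)
```

Sweeping α on held-out data is one common way to pick the mixing weight; the actual selection rule used for `wise_ft_merged` is not stated in this README.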
40
+
41
+ The process is then repeated in the 2nd finetuning phase, using the `wise_ft_merged` as the new starting point, to produce the final model. The whole finetuning flow is illustrated in the following figure:
42
+
43
+ ![Finetuning flow](images/finetune_flow.png)
44
+
45
+ The pool of fine-tuned models and the merged models are evaluated on the CrossNER and TwitterNER benchmarks, and the results are plotted in the following 2 figures (as `crossner_f1` and `other_f1` respectively).
46
+
47
+ The 1st finetuning phase plot:
48
+ ![1st finetuning phase](images/model_soups.png)
49
+
50
+ The 2nd finetuning phase plot:
51
+ ![2nd finetuning phase](images/model_soups2.png)
52
+
53
+
54
+ ### 🛠️ Installation
55
+ To use this model, you must install the [GLiNER Python library](https://github.com/urchade/GLiNER):
56
+
57
+ ```bash
58
+ pip install gliner
59
+ ```
60
+
61
+ Once you've installed the GLiNER library, you can import the `GLiNER` class and load this model with `GLiNER.from_pretrained`.
62
+
63
+ ### 💻 Usage
64
+
65
+ ```python
66
+ from gliner import GLiNER
67
+
68
+ model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")
69
+
70
+ text = """
71
+ Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
72
+ """
73
+
74
+ labels = ["founder", "computer", "software", "position", "date", "company"]
75
+
76
+ entities = model.predict_entities(text, labels)
77
+
78
+ for entity in entities:
79
+ print(entity["text"], "=>", entity["label"])
80
+ ```
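
`predict_entities` returns a list of dictionaries, and the loop above reads their `"text"` and `"label"` keys. For downstream use it can be handy to group the spans by label. The helper below is a hypothetical convenience function, not part of the GLiNER library, and the sample list copies a few text/label pairs from the usage example so that the sketch runs without downloading the model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_label(entities: Iterable[dict]) -> Dict[str, List[str]]:
    """Collect entity texts per label, dropping repeated mentions but keeping order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        if entity["text"] not in grouped[entity["label"]]:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hard-coded sample in the same shape as the entries read by the usage example.
sample = [
    {"text": "Microsoft", "label": "company"},
    {"text": "Bill Gates", "label": "founder"},
    {"text": "Paul Allen", "label": "founder"},
    {"text": "Microsoft", "label": "company"},
]

for label, texts in group_by_label(sample).items():
    print(label, "=>", texts)
```

With a loaded model, `group_by_label(model.predict_entities(text, labels))` would apply the same grouping to real predictions.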
81
+
82
+ Output:
83
+ ```
84
+ Microsoft => company
85
+ Bill Gates => founder
86
+ Paul Allen => founder
87
+ April 4, 1975 => date
88
+ BASIC => software
89
+ Altair 8800 => computer
90
+ Microsoft => company
91
+ chairman => position
92
+ chief executive officer => position
93
+ president => position
94
+ chief software architect => position
95
+ May 2014 => date
96
+ ```
97
+
98
+
99
+ ### 📊 Benchmarks
100
+
101
+ ![Model Performance](images/performance.png)
102
+
103
+ Average F1 score on zero-shot NER benchmarks (CrossNER, mit-movie and mit-restaurant). Numbers for the comparison models are as reported at https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:
104
+
105
+ | Model | F1 Score |
106
+ |---------------------------------------------------------------------------------------------------------|-------------|
107
+ | [xomad/gliner-model-merge-large-v1.0](https://huggingface.co/xomad/gliner-model-merge-large-v1.0) | **0.6601** |
108
+ | [knowledgator/gliner-multitask-large-v0.5](https://huggingface.co/knowledgator/gliner-multitask-large-v0.5) | _0.6276_ |
109
+ | [numind/NuNER_Zero-span](https://huggingface.co/numind/NuNER_Zero-span) | 0.6196 |
110
+ | [gliner-community/gliner_large-v2.5](https://huggingface.co/gliner-community/gliner_large-v2.5) | 0.6154 |
111
+ | [EmergentMethods/gliner_large_news-v2.1](https://huggingface.co/EmergentMethods/gliner_large_news-v2.1) | 0.5876 |
112
+ | [urchade/gliner_large-v2.1](https://huggingface.co/urchade/gliner_large-v2.1) | 0.5754 |
113
+
114
+
115
+ Detailed performance on different datasets:
116
+
117
+
118
+ | Model | Dataset | Precision | Recall | F1 Score | F1 Score (Decimal) |
119
+ |------------------------------------|--------------------|-----------|--------|----------|--------------------|
120
+ | xomad/gliner-model-merge-large-v1.0 | CrossNER_AI | 62.66% | 57.48% | 59.96% | 0.5996 |
121
+ | | CrossNER_literature | 73.28% | 66.42% | 69.68% | 0.6968 |
122
+ | | CrossNER_music | 74.89% | 70.67% | 72.72% | 0.7272 |
123
+ | | CrossNER_politics | 79.46% | 77.57% | 78.51% | 0.7851 |
124
+ | | CrossNER_science | 74.72% | 70.24% | 72.41% | 0.7241 |
125
+ | | mit-movie | 67.33% | 57.89% | 62.25% | 0.6225 |
126
+ | | mit-restaurant | 54.94% | 40.41% | 46.57% | 0.4657 |
127
+ | | **Average** | | | | **0.6601** |
128
+ | numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
129
+ | | CrossNER_literature| 73.53% | 58.06% | 64.89% | 0.6489 |
130
+ | | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
131
+ | | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
132
+ | | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
133
+ | | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
134
+ | | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
135
+ | | **Average** | | | | **0.6196** |
136
+ | knowledgator/gliner-multitask-large-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
137
+ | | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
138
+ | | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
139
+ | | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
140
+ | | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
141
+ | | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
142
+ | | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
143
+ | | **Average** | | | | **0.6276** |
144
+ | gliner-community/gliner_large-v2.5 | CrossNER_AI | 50.85% | 63.03% | 56.29% | 0.5629 |
145
+ | | CrossNER_literature | 64.92% | 67.21% | 66.04% | 0.6604 |
146
+ | | CrossNER_music | 70.88% | 73.10% | 71.97% | 0.7197 |
147
+ | | CrossNER_politics | 72.67% | 72.93% | 72.80% | 0.7280 |
148
+ | | CrossNER_science | 61.71% | 68.85% | 65.08% | 0.6508 |
149
+ | | mit-movie | 54.63% | 52.83% | 53.71% | 0.5371 |
150
+ | | mit-restaurant | 47.99% | 42.13% | 44.87% | 0.4487 |
151
+ | | **Average** | | | | **0.6154** |
152
+ | urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
153
+ | | CrossNER_literature| 59.33% | 56.47% | 57.87% | 0.5787 |
154
+ | | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
155
+ | | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
156
+ | | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
157
+ | | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
158
+ | | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
159
+ | | **Average** | | | | **0.5754** |
160
+ | EmergentMethods/gliner_large_news-v2.1| CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
161
+ | | CrossNER_literature| 65.41% | 56.16% | 60.44% | 0.6044 |
162
+ | | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
163
+ | | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
164
+ | | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
165
+ | | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
166
+ | | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
167
+ | | **Average** | | | | **0.5876** |
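
The columns in the detailed table are related by the usual formula F1 = 2PR / (P + R), and the **Average** row for this model is the unweighted mean of its seven per-dataset F1 scores. The short check below recomputes both for the `xomad/gliner-model-merge-large-v1.0` rows. The helper and variable names are illustrative, and because precision and recall are themselves rounded in the table, the recomputed F1 can differ from the reported one in the last digit.

```python
from statistics import mean

# (precision %, recall %, reported F1 as a decimal), copied from the table above
rows = {
    "CrossNER_AI": (62.66, 57.48, 0.5996),
    "CrossNER_literature": (73.28, 66.42, 0.6968),
    "CrossNER_music": (74.89, 70.67, 0.7272),
    "CrossNER_politics": (79.46, 77.57, 0.7851),
    "CrossNER_science": (74.72, 70.24, 0.7241),
    "mit-movie": (67.33, 57.89, 0.6225),
    "mit-restaurant": (54.94, 40.41, 0.4657),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


for name, (p, r, reported) in rows.items():
    recomputed = f1(p, r) / 100  # convert percent to decimal
    print(f"{name:20s} recomputed={recomputed:.4f} reported={reported:.4f}")

# Unweighted (macro) average over the seven datasets.
average = mean(reported for _, _, reported in rows.values())
print(f"average F1 = {average:.4f}")
```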
168
+
169
+
170
+ ### Authors
171
+
172
+ Hoan Nguyen, at xomad.com
173
+
174
+ ### Citations
175
+
176
+ ```
177
+ @misc{wortsman2022modelsoupsaveragingweights,
178
+ title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time},
179
+ author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
180
+ year={2022},
181
+ eprint={2203.05482},
182
+ archivePrefix={arXiv},
183
+ primaryClass={cs.LG},
184
+ url={https://arxiv.org/abs/2203.05482},
185
+ }
186
+
187
+ @InProceedings{Wortsman_2022_CVPR,
188
+ author = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
189
+ title = {Robust Fine-Tuning of Zero-Shot Models},
190
+ booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
191
+ month = {June},
192
+ year = {2022},
193
+ pages = {7959-7971}
194
+ }
195
+
196
+ @misc{stepanov2024gliner,
197
+ title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
198
+ author={Ihor Stepanov and Mykhailo Shtopko},
199
+ year={2024},
200
+ eprint={2406.12925},
201
+ archivePrefix={arXiv},
202
+ primaryClass={cs.LG}
203
+ }
204
+
205
+ @misc{zaratiana2023gliner,
206
+ title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
207
+ author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
208
+ year={2023},
209
+ eprint={2311.08526},
210
+ archivePrefix={arXiv},
211
+ primaryClass={cs.CL}
212
+ }
213
+ ```
added_tokens.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "<<ENT>>": 128002,
3
+ "<<SEP>>": 128003,
4
+ "[FLERT]": 128001,
5
+ "[MASK]": 128000
6
+ }
gliner_config.json ADDED
@@ -0,0 +1,134 @@
1
+ {
2
+ "class_token_index": 128002,
3
+ "dropout": 0.1,
4
+ "embed_ent_token": true,
5
+ "encoder_config": {
6
+ "_name_or_path": "microsoft/deberta-v3-large",
7
+ "add_cross_attention": false,
8
+ "architectures": null,
9
+ "attention_probs_dropout_prob": 0.1,
10
+ "bad_words_ids": null,
11
+ "begin_suppress_tokens": null,
12
+ "bos_token_id": null,
13
+ "chunk_size_feed_forward": 0,
14
+ "cross_attention_hidden_size": null,
15
+ "decoder_start_token_id": null,
16
+ "diversity_penalty": 0.0,
17
+ "do_sample": false,
18
+ "early_stopping": false,
19
+ "encoder_no_repeat_ngram_size": 0,
20
+ "eos_token_id": null,
21
+ "exponential_decay_length_penalty": null,
22
+ "finetuning_task": null,
23
+ "forced_bos_token_id": null,
24
+ "forced_eos_token_id": null,
25
+ "hidden_act": "gelu",
26
+ "hidden_dropout_prob": 0.1,
27
+ "hidden_size": 1024,
28
+ "id2label": {
29
+ "0": "LABEL_0",
30
+ "1": "LABEL_1"
31
+ },
32
+ "initializer_range": 0.02,
33
+ "intermediate_size": 4096,
34
+ "is_decoder": false,
35
+ "is_encoder_decoder": false,
36
+ "label2id": {
37
+ "LABEL_0": 0,
38
+ "LABEL_1": 1
39
+ },
40
+ "layer_norm_eps": 1e-07,
41
+ "length_penalty": 1.0,
42
+ "max_length": 20,
43
+ "max_position_embeddings": 512,
44
+ "max_relative_positions": -1,
45
+ "min_length": 0,
46
+ "model_type": "deberta-v2",
47
+ "no_repeat_ngram_size": 0,
48
+ "norm_rel_ebd": "layer_norm",
49
+ "num_attention_heads": 16,
50
+ "num_beam_groups": 1,
51
+ "num_beams": 1,
52
+ "num_hidden_layers": 24,
53
+ "num_return_sequences": 1,
54
+ "output_attentions": false,
55
+ "output_hidden_states": false,
56
+ "output_scores": false,
57
+ "pad_token_id": 0,
58
+ "pooler_dropout": 0,
59
+ "pooler_hidden_act": "gelu",
60
+ "pooler_hidden_size": 1024,
61
+ "pos_att_type": [
62
+ "p2c",
63
+ "c2p"
64
+ ],
65
+ "position_biased_input": false,
66
+ "position_buckets": 256,
67
+ "prefix": null,
68
+ "problem_type": null,
69
+ "pruned_heads": {},
70
+ "relative_attention": true,
71
+ "remove_invalid_values": false,
72
+ "repetition_penalty": 1.0,
73
+ "return_dict": true,
74
+ "return_dict_in_generate": false,
75
+ "sep_token_id": null,
76
+ "share_att_key": true,
77
+ "suppress_tokens": null,
78
+ "task_specific_params": null,
79
+ "temperature": 1.0,
80
+ "tf_legacy_loss": false,
81
+ "tie_encoder_decoder": false,
82
+ "tie_word_embeddings": true,
83
+ "tokenizer_class": null,
84
+ "top_k": 50,
85
+ "top_p": 1.0,
86
+ "torch_dtype": null,
87
+ "torchscript": false,
88
+ "type_vocab_size": 0,
89
+ "typical_p": 1.0,
90
+ "use_bfloat16": false,
91
+ "vocab_size": 128004
92
+ },
93
+ "ent_token": "<<ENT>>",
94
+ "eval_every": 1000,
95
+ "fine_tune": true,
96
+ "freeze_token_rep": false,
97
+ "fuse_layers": false,
98
+ "has_rnn": true,
99
+ "hidden_size": 512,
100
+ "labels_encoder": null,
101
+ "labels_encoder_config": null,
102
+ "log_dir": "logs_final",
103
+ "loss_alpha": 0.75,
104
+ "loss_gamma": 0,
105
+ "loss_reduction": "sum",
106
+ "lr_encoder": "5e-6",
107
+ "lr_others": "7e-6",
108
+ "max_len": 768,
109
+ "max_neg_type_ratio": 1,
110
+ "max_types": 30,
111
+ "max_width": 100,
112
+ "model_name": "microsoft/deberta-v3-large",
113
+ "model_type": "gliner",
114
+ "name": "token level gliner large",
115
+ "num_post_fusion_layers": 1,
116
+ "post_fusion_schema": "",
117
+ "random_drop": true,
118
+ "root_dir": "gliner_logs",
119
+ "save_total_limit": 10,
120
+ "scheduler_type": "linear",
121
+ "sep_token": "<<SEP>>",
122
+ "shuffle_types": true,
123
+ "size_sup": -1,
124
+ "span_mode": "token_level",
125
+ "subtoken_pooling": "first",
126
+ "train_batch_size": 8,
127
+ "transformers_version": "4.44.2",
128
+ "val_data_dir": "none",
129
+ "vocab_size": 128004,
130
+ "warmup_ratio": 0.1,
131
+ "weight_decay_encoder": 0.01,
132
+ "weight_decay_other": 0.01,
133
+ "words_splitter_type": "whitespace"
134
+ }
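
The two configuration files above are linked: `class_token_index` and `vocab_size` in `gliner_config.json` line up with the token ids in `added_tokens.json`. The check below illustrates that relationship with the relevant values copied inline. Treating `class_token_index` as the id of `ent_token` is an inference from the matching numbers, not something the diff states.

```python
import json

# Contents of added_tokens.json, copied from the diff above.
added_tokens = json.loads("""
{"<<ENT>>": 128002, "<<SEP>>": 128003, "[FLERT]": 128001, "[MASK]": 128000}
""")

# Only the fields relevant to this check, copied from gliner_config.json.
gliner_config = {
    "class_token_index": 128002,
    "ent_token": "<<ENT>>",
    "sep_token": "<<SEP>>",
    "vocab_size": 128004,
}

ent_id = added_tokens[gliner_config["ent_token"]]
highest_id = max(added_tokens.values())

print("ent token id matches class_token_index:",
      ent_id == gliner_config["class_token_index"])
print("vocab_size equals highest added id + 1:",
      gliner_config["vocab_size"] == highest_id + 1)
```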
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:089504018e1ac4f24a9f4e13f09490240430567fb863af27aecf71fe9d74ced8
3
+ size 1761030542
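
What the diff shows for `pytorch_model.bin` is a Git LFS pointer, not the weights themselves: a small text file of `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A minimal parser, with a hypothetical function name, might look like this:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:089504018e1ac4f24a9f4e13f09490240430567fb863af27aecf71fe9d74ced8
size 1761030542
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["digest"][:12], f"{info['size_bytes'] / 2**30:.2f} GiB")
```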
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
1
+ {
2
+ "bos_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "cls_token": {
10
+ "content": "[CLS]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "eos_token": {
17
+ "content": "[SEP]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "mask_token": {
24
+ "content": "[MASK]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "pad_token": {
31
+ "content": "[PAD]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ },
37
+ "sep_token": {
38
+ "content": "[SEP]",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false
43
+ },
44
+ "unk_token": {
45
+ "content": "[UNK]",
46
+ "lstrip": false,
47
+ "normalized": true,
48
+ "rstrip": false,
49
+ "single_word": false
50
+ }
51
+ }
spm.model ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
3
+ size 2464616
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,86 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "[CLS]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "[SEP]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "3": {
28
+ "content": "[UNK]",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "128000": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ },
43
+ "128001": {
44
+ "content": "[FLERT]",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "128002": {
52
+ "content": "<<ENT>>",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "128003": {
60
+ "content": "<<SEP>>",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ }
67
+ },
68
+ "bos_token": "[CLS]",
69
+ "clean_up_tokenization_spaces": true,
70
+ "cls_token": "[CLS]",
71
+ "do_lower_case": false,
72
+ "eos_token": "[SEP]",
73
+ "mask_token": "[MASK]",
74
+ "max_length": null,
75
+ "model_max_length": 1000000000000000019884624838656,
76
+ "pad_to_multiple_of": null,
77
+ "pad_token": "[PAD]",
78
+ "pad_token_type_id": 0,
79
+ "padding_side": "right",
80
+ "sep_token": "[SEP]",
81
+ "sp_model_kwargs": {},
82
+ "split_by_punct": false,
83
+ "tokenizer_class": "DebertaV2Tokenizer",
84
+ "unk_token": "[UNK]",
85
+ "vocab_type": "spm"
86
+ }
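
The `model_max_length` value in this file is not a real limit: it equals `int(1e30)`, the placeholder that the `transformers` tokenizer base class uses when no maximum length is configured. Practical length limits therefore have to come from elsewhere, such as `max_len` (768) in `gliner_config.json` or `max_position_embeddings` (512) in the encoder config; how GLiNER combines them is not shown in this diff. The equality itself is easy to confirm:

```python
# Value copied from tokenizer_config.json above.
model_max_length = 1000000000000000019884624838656

# int(1e30) is not exactly 10**30 because 1e30 is stored as a binary float.
is_placeholder = model_max_length == int(1e30)

print("model_max_length is the int(1e30) placeholder:", is_placeholder)
```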