xomad/gliner-model-merge-large-v1.0

Token Classification · GLiNER · PyTorch · English · NER

The xomad/gliner-model-merge-large-v1.0 model was developed from the pretrained model knowledgator/gliner-multitask-large-v0.5 to explore the capabilities of model-merging techniques, yielding a performance boost of 3.25 points and raising the average F1-score from 0.6276 to 0.6601.

The model was trained exclusively on datasets with commercial-friendly licenses to ensure broad applicability under the Apache-2.0 license. The following datasets were used in the training process:

βš™οΈ Finetuning process

The process begins with the base model knowledgator/gliner-multitask-large-v0.5. Our model xomad/gliner-model-merge-large-v1.0 is fine-tuned separately on each of the above datasets, and multiple checkpoints are saved along the way. All of these checkpoints are pooled, and the Model soups technique is applied to the pool (see the sketch after the list) to produce three merged models:

  • uniform_merged
  • greedy_on_random
  • greedy_on_sorted
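
For illustration, here is a minimal sketch of the uniform-soup step (averaging the weights of all pooled checkpoints), assuming every checkpoint shares the same architecture. The uniform_soup helper and the checkpoint file names are hypothetical, not the actual training code:

import torch

def uniform_soup(checkpoint_paths):
    # Uniform model soup: element-wise average of all checkpoint weights
    soup = None
    for path in checkpoint_paths:
        state = torch.load(path, map_location="cpu")
        if soup is None:
            soup = {k: v.clone().float() for k, v in state.items()}
        else:
            for k in soup:
                soup[k] += state[k].float()
    return {k: v / len(checkpoint_paths) for k, v in soup.items()}

# Hypothetical checkpoint files saved during fine-tuning
uniform_merged = uniform_soup(["ckpt_a.pt", "ckpt_b.pt", "ckpt_c.pt"])

The greedy variants (greedy_on_random, greedy_on_sorted) instead add checkpoints to the soup one at a time, in random or validation-sorted order respectively, keeping a checkpoint only if it improves the held-out score.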

Following this, the WiSE-FT merging technique is applied to pairs of models drawn from the three merged models above and the original model, producing the wise_ft_merged model. This concludes the first fine-tuning phase.
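
WiSE-FT linearly interpolates between two sets of weights, theta = (1 - alpha) * theta_base + alpha * theta_finetuned. A minimal sketch, assuming both state dicts share identical keys; the mixing coefficient alpha is a hyperparameter, not a value reported here:

def wise_ft(base_state, finetuned_state, alpha=0.5):
    # Weight-space interpolation between the base and fine-tuned models
    return {
        k: (1 - alpha) * base_state[k].float() + alpha * finetuned_state[k].float()
        for k in base_state
    }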

The process is then repeated in a second fine-tuning phase, using wise_ft_merged as the new starting point, to produce the final model. The whole fine-tuning flow is illustrated in the following figure:

[Figure: Fine-tuning flow]

The performance of the pool of fine-tuned models and of the merged models is evaluated on the CrossNER and TwitterNER benchmarks and plotted in the following two figures (as crossner_f1 and other_f1, respectively).

[Figure: 1st fine-tuning phase plot]

[Figure: 2nd fine-tuning phase plot]

πŸ› οΈ Installation

To use this model, you must install the GLiNER Python library:

pip install gliner

Once you've installed the GLiNER library, you can import the GLiNER class and load this model with GLiNER.from_pretrained.

πŸ’» Usage

from gliner import GLiNER

# Load the merged model from the Hugging Face Hub
model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""

# GLiNER is zero-shot: the label set is free-form and chosen at inference time
labels = ["founder", "computer", "software", "position", "date", "company"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

Output:

Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date
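
predict_entities also accepts a confidence threshold (it defaults to 0.5 in the GLiNER library), which can be tuned to trade recall against precision:

# Raise the threshold for higher precision, lower it for higher recall
entities = model.predict_entities(text, labels, threshold=0.5)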

πŸ“Š Benchmarks

[Figure: Model performance on different zero-shot NER benchmarks (CrossNER, mit-movie, and mit-restaurant); numbers reported from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5]

Detailed performance on different datasets:

| Model | Dataset | Precision | Recall | F1 Score | F1 Score (Decimal) |
|-------|---------|-----------|--------|----------|--------------------|
| xomad/gliner-model-merge-large-v1.0 | CrossNER_AI | 62.66% | 57.48% | 59.96% | 0.5996 |
| | CrossNER_literature | 73.28% | 66.42% | 69.68% | 0.6968 |
| | CrossNER_music | 74.89% | 70.67% | 72.72% | 0.7272 |
| | CrossNER_politics | 79.46% | 77.57% | 78.51% | 0.7851 |
| | CrossNER_science | 74.72% | 70.24% | 72.41% | 0.7241 |
| | mit-movie | 67.33% | 57.89% | 62.25% | 0.6225 |
| | mit-restaurant | 54.94% | 40.41% | 46.57% | 0.4657 |
| | **Average** | | | | **0.6601** |
| numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
| | CrossNER_literature | 73.53% | 58.06% | 64.89% | 0.6489 |
| | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
| | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
| | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
| | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
| | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
| | **Average** | | | | **0.6196** |
| knowledgator/gliner-multitask-large-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
| | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
| | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
| | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
| | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
| | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
| | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
| | **Average** | | | | **0.6276** |
| gliner-community/gliner_large-v2.5 | CrossNER_AI | 50.85% | 63.03% | 56.29% | 0.5629 |
| | CrossNER_literature | 64.92% | 67.21% | 66.04% | 0.6604 |
| | CrossNER_music | 70.88% | 73.10% | 71.97% | 0.7197 |
| | CrossNER_politics | 72.67% | 72.93% | 72.80% | 0.7280 |
| | CrossNER_science | 61.71% | 68.85% | 65.08% | 0.6508 |
| | mit-movie | 54.63% | 52.83% | 53.71% | 0.5371 |
| | mit-restaurant | 47.99% | 42.13% | 44.87% | 0.4487 |
| | **Average** | | | | **0.6154** |
| urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
| | CrossNER_literature | 59.33% | 56.47% | 57.87% | 0.5787 |
| | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
| | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
| | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
| | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
| | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
| | **Average** | | | | **0.5754** |
| EmergentMethods/gliner_large_news-v2.1 | CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
| | CrossNER_literature | 65.41% | 56.16% | 60.44% | 0.6044 |
| | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
| | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
| | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
| | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
| | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
| | **Average** | | | | **0.5876** |

Authors

Hoan Nguyen, xomad.com

Citations

@misc{wortsman2022modelsoupsaveragingweights,
      title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time}, 
      author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
      year={2022},
      eprint={2203.05482},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2203.05482}, 
}

@InProceedings{Wortsman_2022_CVPR,
    author    = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
    title     = {Robust Fine-Tuning of Zero-Shot Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {7959-7971}
}

@misc{stepanov2024gliner,
      title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks}, 
      author={Ihor Stepanov and Mykhailo Shtopko},
      year={2024},
      eprint={2406.12925},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, 
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}