English NER model for the extraction of named entities from scientific acknowledgement texts, using Flair embeddings
F1-Score: 0.79
Predicts 6 tags:
label | description | precision | recall | f1-score | support |
---|---|---|---|---|---|
GRNB | grant number | 0.93 | 0.98 | 0.96 | 160 |
IND | person | 0.98 | 0.98 | 0.98 | 295 |
FUND | funding organization | 0.70 | 0.83 | 0.76 | 157 |
UNI | university | 0.77 | 0.74 | 0.75 | 99 |
MISC | miscellaneous | 0.65 | 0.65 | 0.65 | 82 |
COR | corporation | 0.75 | 0.50 | 0.60 | 12 |
Based on Flair embeddings
Usage
Requires: Flair (pip install flair)
# import libraries
from flair.data import Sentence
from flair.models import SequenceTagger

# load the trained model
model = SequenceTagger.load("kalawinka/flair-ner-acknowledgments")

# create an example sentence
sentence = Sentence("This work was supported by State Key Lab of Ocean Engineering Shanghai Jiao Tong University and financially supported by China National Scientific and Technology Major Project (Grant No. 2016ZX05028-006-009)")

# predict the NER tags
model.predict(sentence)

# print the predicted entities as spans
for entity in sentence.get_spans('ner'):
    print(entity)
This produces the following output:
Span[5:15]: "State Key Lab of Ocean Engineering Shanghai Jiao Tong University" → UNI (0.9396)
Span[19:26]: "China National Scientific and Technology Major Project" → FUND (0.9865)
Span[29:30]: "2016ZX05028-006-009" → GRNB (0.9996)
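If you need the predictions in a structured form rather than printed spans, each span object exposes the entity text, its label, and the confidence score. The following is a minimal sketch (not part of the original example) that collects these into a list of dictionaries; it assumes the model and sentence objects from the snippet above:

# collect predicted entities into a list of dictionaries
# (assumes `model` and `sentence` from the snippet above)
entities = []
for entity in sentence.get_spans('ner'):
    label = entity.get_label('ner')
    entities.append({
        'text': entity.text,            # surface string of the entity
        'label': label.value,           # one of GRNB, IND, FUND, UNI, MISC, COR
        'score': round(label.score, 4)  # model confidence
    })

print(entities)
# e.g. [{'text': 'State Key Lab of Ocean Engineering Shanghai Jiao Tong University', 'label': 'UNI', 'score': 0.9396}, ...]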
You can try the model by copying the following acknowledgement text into the text box on the right and clicking “Compute”:
The original work was funded by the German Center for Higher Education Research and Science Studies (DZHW) via the project "Mining Acknowledgement Texts in Web of Science (MinAck)". Access to the WoS data was granted via the Competence Centre for Bibliometrics. Data access was funded by BMBF (Federal Ministry of Education and Research, Germany) under grant number 01PQ17001. Nina Smirnova received funding from the German Research Foundation (DFG) via the project "POLLUX". The present paper is an extended version of the paper "Evaluation of Embedding Models for Automatic Extraction and Classification of Acknowledged Entities in Scientific Documents" (Smirnova & Mayr, 2022) presented at the 3rd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2022).
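To run the model on this text locally instead of using the hosted widget, a paragraph can first be split into sentences and then tagged in one call. The sketch below illustrates this under one assumption: the SegtokSentenceSplitter import path shown is the one used in recent Flair releases (in older versions it lives in flair.tokenization).

from flair.data import Sentence
from flair.models import SequenceTagger
from flair.splitter import SegtokSentenceSplitter  # flair.tokenization in older Flair versions

model = SequenceTagger.load("kalawinka/flair-ner-acknowledgments")

# first sentences of the acknowledgement text above
text = (
    "The original work was funded by the German Center for Higher Education Research "
    "and Science Studies (DZHW) via the project \"Mining Acknowledgement Texts in Web of "
    "Science (MinAck)\". Access to the WoS data was granted via the Competence Centre for "
    "Bibliometrics. Data access was funded by BMBF (Federal Ministry of Education and "
    "Research, Germany) under grant number 01PQ17001."
)

# split the acknowledgement paragraph into sentences
splitter = SegtokSentenceSplitter()
sentences = splitter.split(text)

# predict tags for all sentences in one call
model.predict(sentences)

# print every predicted entity with its label
for sentence in sentences:
    for entity in sentence.get_spans('ner'):
        print(entity)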
For further examples, see our Google Colab notebook.
Citation
If you use this model, please consider citing this work:
@misc{smirnova2023embedding,
  title={Embedding Models for Supervised Automatic Extraction and Classification of Named Entities in Scientific Acknowledgements},
  author={Nina Smirnova and Philipp Mayr},
  year={2023},
  eprint={2307.13377},
  archivePrefix={arXiv},
  primaryClass={cs.DL}
}