Details: https://spacy.io/models/mk#mk_core_news_lg
Macedonian pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.
Feature | Description |
---|---|
Name | mk_core_news_lg |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | morphologizer, parser, attribute_ruler, lemmatizer, ner |
Components | morphologizer, parser, senter, attribute_ruler, lemmatizer, ner |
Vectors | 274587 keys, 274587 unique vectors (300 dimensions) |
Sources | Macedonian Corpus (Damjan Zlatinov, Melanija Gerasimovska, Borijan Georgievski, Marija Todosovska); spaCy lookups data (Explosion); Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
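A minimal usage sketch follows. It assumes spaCy (>=3.7.0,<3.8.0, per the table above) and the `mk_core_news_lg` package are already installed, for example with `python -m spacy download mk_core_news_lg`. The helper name `describe` and the sample sentence are illustrative and not part of the model package.

```python
MODEL_NAME = "mk_core_news_lg"


def describe(text, model_name=MODEL_NAME):
    """Run the pipeline on `text` and return token rows and entity pairs.

    `describe` is a hypothetical helper written for this example.
    """
    # Imported lazily so this module can be imported without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    doc = nlp(text)

    # One row per token: surface form, coarse POS tag, dependency label, lemma.
    tokens = [(t.text, t.pos_, t.dep_, t.lemma_) for t in doc]
    # One pair per named entity: entity text and its NER label.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities
```

Calling `describe("Скопје е главен град на Македонија.")` returns the per-token annotations and any entities the `ner` component finds. The exact output depends on the installed model version, so none is shown here.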
Label Scheme
54 labels for 3 components
Component | Labels |
---|---|
morphologizer | POS=PROPN, POS=AUX, POS=ADJ, POS=NOUN, POS=ADP, POS=PUNCT, POS=CONJ, POS=NUM, POS=VERB, POS=PRON, POS=ADV, POS=SCONJ, POS=PART, POS=SYM, _, POS=SPACE, POS=X, POS=INTJ |
parser | ROOT, advmod, att, aux, cc, dep, det, dobj, iobj, neg, nsubj, pobj, poss, pozm, pozv, prep, punct, relcl |
ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART |
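As a small illustration of working with the `ner` label set above, the sketch below filters `(text, label)` pairs down to a chosen subset of labels. The helper `filter_entities` and the sample pairs are hypothetical. In practice the pairs would come from `doc.ents` of a processed document.

```python
# The 18 entity labels listed for the `ner` component in the table above.
NER_LABELS = {
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE", "LAW", "LOC",
    "MONEY", "NORP", "ORDINAL", "ORG", "PERCENT", "PERSON", "PRODUCT",
    "QUANTITY", "TIME", "WORK_OF_ART",
}


def filter_entities(entities, keep):
    """Keep only the (text, label) pairs whose label is in `keep`.

    Raises ValueError if `keep` names a label outside the scheme, which
    catches typos such as "PER" instead of "PERSON".
    """
    keep = set(keep)
    unknown = keep - NER_LABELS
    if unknown:
        raise ValueError(f"Labels not in the ner scheme: {sorted(unknown)}")
    return [(text, label) for text, label in entities if label in keep]


# Hand-written sample pairs, not model output.
sample = [("Скопје", "GPE"), ("2020", "DATE"), ("Македонија", "GPE")]
places = filter_entities(sample, {"GPE", "LOC"})
```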
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 100.00 |
TOKEN_P | 100.00 |
TOKEN_R | 100.00 |
TOKEN_F | 100.00 |
SENTS_P | 70.42 |
SENTS_R | 64.94 |
SENTS_F | 67.57 |
DEP_UAS | 67.84 |
DEP_LAS | 52.98 |
ENTS_P | 75.06 |
ENTS_R | 75.06 |
ENTS_F | 75.06 |
POS_ACC | 93.09 |
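The `_F` rows are consistent with the usual F1 definition, the harmonic mean of precision and recall. The sketch below recomputes two of them from the `_P` and `_R` values in the table. It assumes that this standard formula is what was used, which the card itself does not state.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1), on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the accuracy table (percentages).
sents_f = f_score(70.42, 64.94)
ents_f = f_score(75.06, 75.06)

print(round(sents_f, 2))  # 67.57, matching SENTS_F
print(round(ents_f, 2))   # 75.06, matching ENTS_F
```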
Evaluation results
- NER Precision (self-reported): 0.751
- NER Recall (self-reported): 0.751
- NER F Score (self-reported): 0.751
- POS (UPOS) Accuracy (self-reported): 0.931
- Unlabeled Attachment Score (UAS) (self-reported): 0.678
- Labeled Attachment Score (LAS) (self-reported): 0.530
- Sentences F-Score (self-reported): 0.676