Urchade Zaratiana

urchade

AI & ML interests

None yet

Organizations

Posts 3

view post
Post
7057
**Release Announcement: gliner_multi_pii-v1**

I am pleased to announce the release of gliner_multi_pii-v1, a model developed for recognizing a wide range of Personally Identifiable Information (PII). This model is the result of fine-tuning the urchade/gliner_multi-v2.1 on synthetic dataset (urchade/synthetic-pii-ner-mistral-v1).

**Model Features:**
- Capable of identifying multiple PII types including addresses, passport numbers, emails, social security numbers, and more.
- Designed to assist with data protection and compliance across various domains.
- Multilingual (English, French, Spanish, German, Italian, Portugese)

Link: urchade/gliner_multi_pii-v1

from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_multi_pii-v1")

text = """
Harilala Rasoanaivo, un homme d'affaires local d'Antananarivo, a enregistré une nouvelle société nommée "Rasoanaivo Enterprises" au Lot II M 92 Antohomadinika. Son numéro est le +261 32 22 345 67, et son adresse électronique est [email protected]. Il a fourni son numéro de sécu 501-02-1234 pour l'enregistrement.
"""

labels = ["work", "booking number", "personally identifiable information", "driver licence", "person",  "address", "company",  "email", "passport number", "Social Security Number", "phone number"]
entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])


Harilala Rasoanaivo => person
Rasoanaivo Enterprises => company
Lot II M 92 Antohomadinika => full address
+261 32 22 345 67 => phone number
[email protected] => email
501-02-1234 => Social Security Number

view post
Post
7647
**Some updates on GLiNER**

🆕 A new commercially permissible multilingual version is available urchade/gliner_multiv2.1

🐛 A subtle bug that causes performance degradation on some models has been corrected. Thanks to @yyDing1 for raising the issue.

from gliner import GLiNER

# Initialize GLiNER
model = GLiNER.from_pretrained("urchade/gliner_multiv2.1")

text = "This is a text about Bill Gates and Microsoft."

# Labels for entity prediction
labels = ["person", "organization", "email"]

entities = model.predict_entities(text, labels, threshold=0.5)

for entity in entities:
    print(entity["text"], "=>", entity["label"])