Catherine Arnett
catherinearnett
15 followers · 1 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
AI & ML interests
multilingual NLP, tokenization
Recent Activity
upvoted an article 8 days ago
Articles
Releasing the largest multilingual open pretraining dataset • 8 days ago • 94
Detoxifying the Commons • 21 days ago • 6
wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? • Sep 27 • 35
catherinearnett's activity
liked a dataset 5 months ago
ambean/lingOly
Updated Jun 11 • 90 • 187 • 7