Zengzhi Wang

SinclairWang

AI & ML interests

Data Engineering for Generative AI

Recent Activity

updated a model about 1 hour ago
SinclairWang/fasttext

SinclairWang's activity

upvoted an article 7 days ago
Releasing the largest multilingual open pretraining dataset

94
upvoted an article about 1 month ago
Scaling AI-based Data Processing with Hugging Face + Dask

23
upvoted an article about 2 months ago
RegMix: Data Mixture as Regression for Language Model Pre-training

10