AI & ML interests

Dataset Viber is your chill repo for data collection, annotation and vibe checks.

Avoid the hype, check the vibe!

I've cooked up Dataset Viber, a cool set of tools to make your life easier when dealing with data for AI models. Dataset Viber is all about making your data prep journey smooth and fun. It isn't meant for team collaboration or production, and it isn't trying to be fancy or formal. It's just a bunch of cool tools to help you collect feedback and do vibe checks as an AI engineer or enthusiast. Want to see it in action? Just plug it in and start vibing with your data. It's that easy!

  • CollectorInterface: Lazily collect data of model interactions without human annotation.
  • AnnotatorInterface: Walk through your data and annotate it with models in the loop.
  • BulkInterface: Explore your data distribution and annotate in bulk.
  • Embedder: Efficiently embed data with ONNX-optimized speeds.

Need any tweaks or want to hear more about a specific tool? Just open an issue or give me a shout!

  • Data is logged to a local CSV or directly to the Hugging Face Hub.
  • All tools also run in .ipynb notebooks.
  • Models in the loop through fn_model.
  • Input data streamers through fn_next_input.
  • It supports various tasks for text, chat and image modalities.
  • Import and export from the Hugging Face Hub or CSV files.
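
To give a feel for the fn_next_input and fn_model hooks listed above, here is a minimal sketch in plain Python. It does not import Dataset Viber, and the signatures are assumptions for illustration only: fn_next_input is taken to be a zero-argument callable that returns the next text (or None when done), and fn_model is taken to return a (label, score) pair. The helper names make_next_input and collect, the sample texts, and the CSV columns are all hypothetical too. Check src/dataset_viber/examples for how the real interfaces expect these functions to look.

```python
import csv
import io
from typing import Callable, Optional, TextIO

# Hypothetical sample data; not part of Dataset Viber.
TEXTS = ["I love this", "This is terrible", "It is fine"]


def make_next_input(texts: list[str]) -> Callable[[], Optional[str]]:
    """Build a zero-argument streamer: one text per call, then None."""
    iterator = iter(texts)

    def fn_next_input() -> Optional[str]:
        return next(iterator, None)

    return fn_next_input


def fn_model(text: str) -> tuple[str, float]:
    """Toy stand-in for a model in the loop: a keyword rule, not a real model."""
    lowered = text.lower()
    if "love" in lowered:
        return "positive", 0.9
    if "terrible" in lowered:
        return "negative", 0.9
    return "neutral", 0.5


def collect(
    fn_next_input: Callable[[], Optional[str]],
    fn_model: Callable[[str], tuple[str, float]],
    out: TextIO,
) -> int:
    """Pull inputs until the streamer is exhausted and log suggestions as CSV."""
    writer = csv.writer(out)
    writer.writerow(["text", "label", "score"])  # assumed column layout
    count = 0
    while (text := fn_next_input()) is not None:
        label, score = fn_model(text)
        writer.writerow([text, label, score])
        count += 1
    return count


# An in-memory buffer stands in for a local CSV file here.
buffer = io.StringIO()
n_rows = collect(make_next_input(TEXTS), fn_model, buffer)
print(buffer.getvalue())
```

Swapping the buffer for `open("log.csv", "w", newline="")` would write the same rows to disk, and the keyword rule can be replaced by any callable that wraps a real model.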

Examples can be found in src/dataset_viber/examples.
