FiftyOne Computer Vision Datasets Come to the Hugging Face Hub

Community Article Published June 3, 2024

Use and Share Cutting Edge Computer Vision Datasets with Ease

Today’s cutting-edge ML models, like transformers and diffusion models, are primarily designed for unstructured data like text, audio, images, and videos. High-quality datasets built out of these unstructured components are essential for benchmarking and training state-of-the-art models.

Connecting this data to our models has always been a pain, with inhomogeneous data schemas putting the onus of data wrangling, filtering, and processing on the end user. Until now!

Introducing the integration between FiftyOne Computer Vision Datasets and the Hugging Face Hub. With this integration, you can:

Load visual datasets from the Hugging Face Hub directly into FiftyOne for streamlined data curation, visualization, and model inference/training.
Share visual datasets to the Hugging Face Hub from FiftyOne for improved transparency and reproducibility.

In short:

Hugging Face democratizes ML model distribution and application
FiftyOne brings structure to unstructured visual data
The FiftyOne 🤝 🤗 Hub integration bridges the gap between data and models

Before we dive into this integration, here’s some brief background:

What is FiftyOne?

FiftyOne is the leading open-source toolkit for curating, visualizing, and managing unstructured visual data. The library streamlines data-centric workflows, from finding low-confidence predictions to identifying poor-quality samples and uncovering hidden patterns in your data. The library supports all sorts of visual data, from images and videos to PDFs, point clouds, and meshes.

Whereas tabular data formats like a pandas DataFrame or a Parquet file consist of rows and columns, FiftyOne datasets are considerably more flexible. The atomic element of a fiftyone.Dataset is a sample, which contains all of the information related to a piece of visual data. These attributes are stored in fields, which can be elementary data types or fields with custom schemas (like object detections, keypoints, and polylines). FiftyOne datasets are efficient and flexible data structures for visual data for a few reasons:

FiftyOne datasets are logical datasets that point to media files on disk rather than storing the media file contents directly.
FiftyOne datasets are constructed from MongoDB documents, so they inherit the flexibility of MongoDB’s non-relational data model.
FiftyOne natively integrates with vector databases for efficient retrieval and semantic search at scale.

When you put it all together, FiftyOne provides the most straightforward and intuitive API for filtering, indexing, evaluating, and aggregating over visual datasets.
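
To make the data model concrete, here is a minimal sketch of building one sample by hand. The file path, label, and box values are made up, and `to_relative_box` and `make_sample` are illustrative helpers introduced here, not part of FiftyOne. The FiftyOne-specific detail worth noting is that detections store boxes as `[x, y, width, height]` relative to the image size, with values in `[0, 1]`.

```python
def to_relative_box(box_px, img_width, img_height):
    """Convert a pixel box [x, y, w, h] to coordinates relative to the image size."""
    x, y, w, h = box_px
    return [x / img_width, y / img_height, w / img_width, h / img_height]


def make_sample(filepath, label, box_px, img_width, img_height):
    """Build a sample with one primitive field and one detections field."""
    import fiftyone as fo  # lazy import: only needed when a sample is built

    # The sample stores the path to the media, not the media contents
    sample = fo.Sample(filepath=filepath)

    # A field holding an elementary data type (hypothetical field name)
    sample["weather"] = "sunny"

    # A field with a custom schema: a list of object detections
    sample["ground_truth"] = fo.Detections(
        detections=[
            fo.Detection(
                label=label,
                bounding_box=to_relative_box(box_px, img_width, img_height),
            )
        ]
    )
    return sample
```

With FiftyOne installed, something like `fo.Dataset("demo").add_sample(make_sample("/path/to/image.jpg", "car", [50, 100, 200, 100], 400, 200))` would add the sample to a new dataset. Only the path is recorded; the image itself stays on disk.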

🚀 This powerful data model allows you to apply Hugging Face transformer models directly to image or video datasets with a single line of code.
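
As a rough sketch of what that single line can look like: FiftyOne's Transformers integration is documented as accepting a Hugging Face model object in `apply_model()`, and the example below relies on that. The checkpoint is just one possible choice, and `field_name_for` and `add_predictions` are hypothetical helpers written for this post.

```python
import re

# Example image classification checkpoint on the Hugging Face Hub
CHECKPOINT = "google/vit-base-patch16-224"


def field_name_for(checkpoint):
    """Turn a checkpoint name into a field name with only letters, digits, and underscores."""
    return re.sub(r"\W", "_", checkpoint)


def add_predictions(dataset, checkpoint=CHECKPOINT):
    """Run a Transformers classifier over every sample in a FiftyOne dataset."""
    from transformers import AutoModelForImageClassification  # lazy import

    model = AutoModelForImageClassification.from_pretrained(checkpoint)

    # The "single line": predictions are stored in a new field on each sample
    dataset.apply_model(model, label_field=field_name_for(checkpoint))
```

Deriving the field name from the checkpoint keeps predictions from different models in separate fields, which makes them easy to compare side by side in the app.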

📢 The code blocks in this blog post require fiftyone>=0.24.0 and huggingface_hub>=0.24.0 (how convenient!).

Loading Visual Datasets from the 🤗 Hub

With FiftyOne’s Hugging Face Hub integration, you can load any FiftyOne dataset uploaded to the hub (see the section below), as well as most image-based datasets stored in Parquet files, which is the standard for datasets uploaded to the hub via the datasets library. The load_from_hub() function in FiftyOne’s Hugging Face utils handles both of these cases!

Loading FiftyOne Datasets from the 🤗 Hub

Any dataset pushed to the hub in one of FiftyOne’s supported common formats should have all of the necessary configuration info in its dataset repo on the hub, so you can load the dataset by specifying its repo_id. As an example, to load the VisDrone detection dataset, all you need is:

import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub

# Load the dataset from the Hub
dataset = load_from_hub("Voxel51/VisDrone2019-DET")

# Visualize it in the FiftyOne App
session = fo.launch_app(dataset)

It’s as simple as that!

You can customize the download process, including the number of samples to download, the name of the created dataset object, whether or not it is persisted to disk, and more!
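
For instance, a customized download might look like the sketch below. It uses the `max_samples`, `name`, and `persistent` keyword arguments of `load_from_hub()`; the specific values and the `load_visdrone_subset` wrapper are illustrative choices, not recommendations.

```python
# Options forwarded to load_from_hub(); the values are illustrative
HUB_LOAD_OPTIONS = {
    "max_samples": 100,         # download only the first 100 samples
    "name": "visdrone-subset",  # name of the created FiftyOne dataset
    "persistent": True,         # keep the dataset after the session ends
}


def load_visdrone_subset():
    """Load a small, persistent subset of VisDrone from the Hub."""
    from fiftyone.utils.huggingface import load_from_hub  # lazy import

    return load_from_hub("Voxel51/VisDrone2019-DET", **HUB_LOAD_OPTIONS)
```

Starting with a small `max_samples` is a cheap way to check a dataset's schema before committing to a full download.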

What Datasets Are in FiftyOne Format?

Every dataset uploaded via FiftyOne’s Hugging Face Hub integration will have a fiftyone tag. You can browse all datasets with this tag on the Hub by filtering datasets on the fiftyone tag. You can also retrieve this list programmatically using the Hugging Face Hub’s API:

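from huggingface_hub import HfApi

api = HfApi()

# list_datasets() returns an iterable of dataset info objects
for dataset_info in api.list_datasets(tags="fiftyone"):
    print(dataset_info.id)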

In fact, this is how the list of loadable datasets is populated in FiftyOne’s Hugging Face Hub plugin!

Why load FiftyOne Datasets from the 🤗 Hub?

Don’t reinvent the wheel. If the ML community has already formatted and processed a popular dataset for you, spend your time on other parts of the model development pipeline.

Loading Parquet Datasets from the 🤗 Hub with FiftyOne

You can also use the load_from_hub() function to load datasets from Parquet files, giving you access to an even wider range of computer vision and multimodal datasets. This function makes it easy to specify which features you want to convert into FiftyOne labels, which features point to media files, and which splits/subsets to download. FiftyOne will handle type conversions for you and download images from URLs if necessary.

With this functionality, you can load:

Image classification datasets like Food101 and ImageNet-Sketch
Object detection datasets like CPPE-5 and WIDER FACE
Segmentation datasets like SceneParse150 and Sidewalk Semantic
Image captioning datasets like COYO-700M and New Yorker Caption Contest
Visual question-answering datasets like TextVQA and ScienceQA

And many more!

As a simple example, we can load the first 1,000 samples from the WikiArt dataset into FiftyOne with:

import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub

dataset = load_from_hub(
    "huggan/wikiart",  # repo_id
    format="parquet",  # for Parquet format
    classification_fields=["artist", "style", "genre"],  # columns to treat as classification labels
    max_samples=1000,  # number of samples to load
    name="wikiart",  # name of the dataset in FiftyOne
)

This one-line command gives us access to FiftyOne's data visualization, analysis, and understanding capabilities, which we can use to analyze artistic styles!
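
As a small illustration of that kind of analysis, the sketch below ranks the most common styles in the loaded samples. It assumes the `style` column was loaded as a classification field, as in the call above, so its labels live at `style.label`; `top_k` and `top_styles` are hypothetical helpers, while `count_values()` is FiftyOne's aggregation for counting the distinct values of a field.

```python
def top_k(counts, k=5):
    """Return the k most frequent (value, count) pairs, most frequent first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:k]


def top_styles(dataset, k=5):
    """Rank the most common artistic styles in a FiftyOne dataset."""
    # count_values() returns a dict mapping each label to its number of occurrences
    counts = dataset.count_values("style.label")
    return top_k(counts, k)
```

Swapping `"style.label"` for `"artist.label"` or `"genre.label"` would give the same breakdown for the other two classification fields.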

Why load Parquet Datasets from the 🤗 Hub in FiftyOne?

Simplify your data processing pipelines.
Bring structure to the data with clustering, semantic search, and dimensionality reduction techniques.
Apply transformer models to your entire dataset with a single line of code.

📚 Documentation on loading from the hub

Pushing FiftyOne Datasets to the 🤗 Hub

There’s never been an easier way to share your visual datasets with the world. Whether you are developing the benchmark for a visual understanding task or creating a collection of AI-generated artwork, push_to_hub() from the FiftyOne Hugging Face utils allows you to upload image, video, or 3D datasets to the Hugging Face Hub in a single line of code.

Pushing a dataset to the hub is as simple as:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone.utils.huggingface import push_to_hub

# Load an example dataset from the FiftyOne Dataset Zoo
dataset = foz.load_zoo_dataset("quickstart")

# Push it to the Hub
push_to_hub(dataset, "my-hf-dataset")

The upload process is highly customizable: you can specify a license, tags, a description, the number of files to upload at a time, the format of the exported dataset, and more!

When you call push_to_hub(), the dataset will be uploaded to the repo with the specified repo name under your username, and the repo will be created if necessary. A Dataset Card will automatically be generated and populated with instructions for loading the dataset from the hub. You can even upload a thumbnail image/gif to appear on the Dataset Card with the preview_path argument.

Here’s an example using many of these arguments, which would upload the dataset to the private repo https://huggingface.co/datasets/username/my-video-dataset with tags, an MIT license, a description, and a preview image:

dataset = foz.load_zoo_dataset("quickstart-video", max_samples=3)

push_to_hub(
    dataset,
    "my-video-dataset",
    tags=["video", "tracking"],
    license="mit",
    description="A dataset of videos for action recognition tasks",
    private=True,
    preview_path="<path/to/preview.png>"
)

Why push FiftyOne Datasets to the 🤗 Hub?

Share your data with your friends (gated access) or the broader ML community!
Participate in our upcoming FiftyOne Dataset Curation Competition on Hugging Face!

📚 Documentation on pushing to the hub

Conclusion

FiftyOne’s Hugging Face Hub integration makes sharing, using, and structuring visual datasets easier than ever. By combining loading from the hub with pushing to it, you can create your own dataset from existing datasets, as my teammate Harpreet Sahota did with MashupVQA. You can also connect your models and data by loading your dataset from the hub into FiftyOne and using FiftyOne’s Hugging Face Transformers integration. Generate embeddings, test out augmentation techniques, and fine-tune models without worrying about data formats.

Go forth: build better datasets and train better models!

📚 Resources
