Export your data

Once your data is annotated, you'll likely want to export it. To do so, you need to create and download an export. Here's a quick refresher of what generating a release looks like in the UI:

📘

View releases

If you've already generated one or more releases in V7, you can view each of them using the following CLI command:

darwin dataset releases [DATASET_NAME]

You can generate a new release using the snippet below:

release_name = "name_of_export_here"
dataset.export(release_name)

This release will contain all of the completed images and videos within your dataset. You can also filter for specific classes within your dataset by passing the IDs of your annotation classes in the annotation_class_ids argument:

dataset.export(release_name, annotation_class_ids=[...])

From there, you can specify which release you'd like to pull:

from darwin.exceptions import NotFound  # raised when the release does not exist

release_name = "name_of_export_here"
try:
    release = dataset.get_release(release_name)
except NotFound:
    print(f"Dataset release {release_name} not found")

Once you have your release object, it's time to pull the release. This pulls all completed images and videos along with their annotations. Note that the SDK currently only supports pulling releases in the Darwin JSON 2.0 format:

dataset.pull(release=release)

🚧

Multi-Processing Errors

By default, pull() uses the Python multiprocessing library to significantly increase the speed of the download. Because the multiprocessing library imports and runs the script that invoked it when spawning processes, all code that invokes it must be protected inside an if __name__ == "__main__" block. Otherwise, processes will be spawned in a loop.

if __name__ == "__main__":
    dataset.pull(release=release, multi_processed=True)

Note that there may be a delay of a few seconds between exporting and pulling.

🚧

Waiting for Release Creation

It may be necessary to factor in the release creation time (typically a few seconds) before attempting to pull the release. Otherwise, you may run into a release not found exception.
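One way to account for this delay is to retry the lookup a few times before giving up. The helper below is a minimal sketch, not part of the SDK: wait_for_release, attempts, and delay_seconds are hypothetical names, and it assumes the lookup function raises a "not found" exception until the release is ready.

```python
import time
from typing import Any, Callable, Type


def wait_for_release(
    get_release: Callable[[str], Any],
    release_name: str,
    not_found: Type[Exception],
    attempts: int = 5,
    delay_seconds: float = 2.0,
) -> Any:
    """Call get_release(release_name) until it succeeds or attempts run out.

    Hypothetical helper: get_release is any callable that raises
    `not_found` while the release is still being created.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return get_release(release_name)
        except not_found:
            if attempt == attempts:
                raise  # give up and surface the last "not found" error
            time.sleep(delay_seconds)  # wait before the next lookup
```

With the objects from the snippets above, this could be called as wait_for_release(dataset.get_release, release_name, NotFound) right after dataset.export(release_name), and the returned release passed to dataset.pull. The attempt count and delay are arbitrary starting values, so adjust them to the size of your dataset.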

You can also copy the pre-populated command above to your clipboard by clicking the copy icon for any release from the GUI:

You can pull just the annotations by adding the only_annotations argument:

dataset.pull(release=release, only_annotations=True)

If your dataset has multiple folders, you can keep that structure by using the use_folders argument:

dataset.pull(release=release, use_folders=True)

Finally, if you're exporting video, you can choose to pull it either as a video or as individual frames. By default, the video_frames parameter is False. If it's set to True, your video will be pulled with each frame as its own image:

dataset.pull(release=release, video_frames=True)

❗️

Video frame extraction

When you upload videos to Darwin at a non-native framerate, specific video frames are extracted and displayed. When pulling videos uploaded at a non-native framerate, please pass video_frames=True to guarantee a match between your annotations and the resulting frames.

If you instead extract frames from the video files later on, you might experience a mismatch between annotations and frames.

The release can also be downloaded as a zip file using the line below instead of dataset.pull:

from pathlib import Path

release.download_zip(Path(f"./{release_name}.zip"))
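Once the zip file is on disk, you may want to check what it contains before extracting it. The following is a rough sketch using only Python's standard zipfile module. It assumes that the archive holds annotation files with a .json extension, and list_annotation_files and extract_release are hypothetical helper names, not SDK functions.

```python
import zipfile
from pathlib import Path
from typing import List


def list_annotation_files(zip_path: Path) -> List[str]:
    """Return the sorted names of .json members inside the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".json")
        )


def extract_release(zip_path: Path, target_dir: Path) -> Path:
    """Extract every member of the archive into target_dir and return it."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    return target_dir
```

For example, list_annotation_files(Path(f"./{release_name}.zip")) would return the JSON file names in the downloaded release, assuming the layout described above.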

Putting all of the above together, the full code for creating, locating, and pulling a release can be found on the recipes page below.