Upload data to a dataset (SDK)

Now that you know how to manage datasets and download data from them, this section covers the reverse operation: uploading data to a dataset.

By the end of this section, you should understand:

  • How to upload an image to a dataset

1. push

Uploads the given files to the given dataset.

from darwin.client import Client
from pathlib import Path

# Authenticate
client = Client.local()

# Get the dataset
dataset = client.get_remote_dataset('my-team-slug/my-dataset-slug')

files = [
    Path('/home/user/new_images/my-image.jpg')
]

# Push the files to the dataset
handler = dataset.push(files)

# Print handler information
print(handler.blocked_items, handler.pending_items, handler.errors)

The push method returns a handler. This handler has three important attributes:

  • blocked_items: repeated items are marked as blocked and are not uploaded
  • pending_items: items that have not yet been uploaded but are in the queue
  • errors: any errors that may have occurred when uploading the files
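
As an illustration of how these three attributes might be inspected after a push, here is a minimal sketch. The helper name summarize_upload is hypothetical and not part of the SDK, and the sketch assumes each attribute is a sized collection (something len() works on). A stand-in object is used so the example runs without credentials; in practice the handler would come from dataset.push(files).

```python
from types import SimpleNamespace


def summarize_upload(handler):
    """Build a one-line summary from the handler's three attributes.

    Hypothetical helper, not part of the SDK. Assumes blocked_items,
    pending_items and errors all support len().
    """
    blocked = len(handler.blocked_items)
    pending = len(handler.pending_items)
    errors = len(handler.errors)
    return f"blocked={blocked} pending={pending} errors={errors}"


# Stand-in for illustration only; a real handler is returned by dataset.push(files)
fake_handler = SimpleNamespace(
    blocked_items=["duplicate.jpg"],
    pending_items=["a.jpg", "b.jpg"],
    errors=[],
)

print(summarize_upload(fake_handler))  # prints: blocked=1 pending=2 errors=0
```

A summary like this makes it easy to spot files that were skipped as duplicates or that failed, before assuming the upload is complete.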

When an upload fails, it is retried every 2 seconds, up to 5 times. If it still fails after that, the file won't be uploaded.
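
The retry behaviour described above can be pictured with a generic fixed-delay retry loop. This is only a sketch of the general technique, not the SDK's internal implementation: the function name call_with_retries and its parameters are hypothetical, and it assumes that "5 times" means up to 5 retries after the first failed attempt.

```python
import time


def call_with_retries(operation, retries=5, delay_seconds=2.0, sleep=time.sleep):
    """Run operation, retrying after a fixed delay whenever it raises.

    Hypothetical helper for illustration. Makes one initial attempt plus
    up to `retries` further attempts, then re-raises the last error.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as error:
            last_error = error
            # Wait only if another attempt is still to come
            if attempt < retries:
                sleep(delay_seconds)
    raise last_error


# Demo: an operation that fails twice before succeeding
attempts_made = []


def flaky_upload():
    attempts_made.append(1)
    if len(attempts_made) < 3:
        raise IOError("temporary failure")
    return "uploaded"


# A no-op sleep keeps the demo instant; omit it to get the real 2-second waits
result = call_with_retries(flaky_upload, sleep=lambda seconds: None)
print(result, len(attempts_made))  # prints: uploaded 3
```

Once the retries are exhausted the error is raised to the caller, which mirrors the point above that a file which keeps failing is not uploaded.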