Upload files

Uploading images

Once a dataset has been created, images can be uploaded to it with push, which takes a list of file paths:

dataset.push(["/path/to/1.jpg", "/path/to/2.jpg"])

To upload an entire directory of images, pass the path of the directory itself. push() then finds and uploads the files at every depth within that directory:

dataset.push(["/path/to/directory/1", "/path/to/directory/2"])
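Before pushing a directory, it can help to preview which files a recursive search would reach. The sketch below uses only the Python standard library. preview_files is a hypothetical helper, not part of the SDK, and it does not try to reproduce any file-type filtering that push() may apply:

```python
from pathlib import Path
from typing import Iterable, List


def preview_files(paths: Iterable[str]) -> List[Path]:
    """List every file reachable from the given paths, at any depth.

    Hypothetical helper for inspecting a push list locally. It never
    calls push() and does not filter by file extension.
    """
    found: List[Path] = []
    for raw in paths:
        p = Path(raw)
        if p.is_dir():
            # rglob("*") walks the whole tree below the directory
            found.extend(sorted(f for f in p.rglob("*") if f.is_file()))
        elif p.is_file():
            found.append(p)
        else:
            raise FileNotFoundError(f"No such file or directory: {raw}")
    return found
```

Calling preview_files with the same list you intend to pass to push() gives a rough idea of the upload size, and raises early if a path is mistyped.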

To preserve the local directory structure of your data in the dataset, set the preserve_folders argument, which defaults to False:

dataset.push(["/path/to/directory/1", "/path/to/directory/2"], preserve_folders=True)

To sort images into separate directories within your dataset, set path to the name of the target directory in the dataset:

dataset.push(["/path/to/1.jpg", "/path/to/2.jpg"], path="my_directory")

Uploading videos

All of the above functionality applies to uploading video files as well, but two additional options are available for videos:

  • 1: You can control the rate at which frames are sampled from video files by setting the fps argument. To upload videos at their native framerate, leave the fps argument out.
  • 2: By default, videos are uploaded as entire video files. They can instead be uploaded as individual frames by setting the as_frames argument:

dataset.push(["/path/to/video.mp4"], fps=10, as_frames=True)

Uploading multi-file items

push supports uploading individual directories of files as multi-slotted, multi-channel, and DICOM series items. To do so, set the item_merge_mode argument to one of slots, channels, or series. For example, to upload the following directory of files as a single dataset item with 4 slots:

/path/to/directory/of/files
├── image_1.jpg
├── image_2.jpg
├── image_3.jpg
└── video_1.mp4

files_to_upload = ["/path/to/directory/of/files"]
item_merge_mode = "slots"

dataset.push(files_to_upload=files_to_upload, item_merge_mode=item_merge_mode)

When item_merge_mode is set, each path passed to push must be a directory. The files in the first level of that directory are treated as follows:

Multi-slotted items

  • Each file is assigned its own slot
  • Dataset items are named after the directory containing the files that result in the item
  • Slots are named as incrementing integers starting from 0

Multi-channel items

  • Each file is assigned its own channel
  • Dataset items are named after the directory containing the files that result in the item
  • Each slot is named after the file in the slot

DICOM series

  • All .dcm files are concatenated into the same slot
  • Other file types are ignored
  • The slot is named after the directory containing the files that result in the item

🚧

Search Scope

Unlike most push operations, when item_merge_mode is set, only files in the first level of each directory are considered for upload. Files at lower depths are ignored.
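
The grouping rules above can be sketched locally to check a directory before uploading. This is an illustration only, not the SDK's implementation. preview_merged_item is a hypothetical helper, the alphabetical ordering of files is an assumption (the order in which slots are assigned is not stated here), and so is the case-insensitive match on the .dcm extension:

```python
from pathlib import Path
from typing import Dict, List


def preview_merged_item(directory: str, item_merge_mode: str) -> Dict[str, object]:
    """Preview how first-level files would be grouped into one item.

    Hypothetical helper: it inspects the local directory only and
    never contacts the platform.
    """
    root = Path(directory)
    if not root.is_dir():
        raise ValueError(f"{directory} is not a directory")

    # Only the first level is inspected; nested files are ignored.
    # Alphabetical order is an assumption made for this sketch.
    files = sorted(f for f in root.iterdir() if f.is_file())

    slots: Dict[str, List[str]]
    if item_merge_mode == "slots":
        # One slot per file, named "0", "1", "2", ...
        slots = {str(i): [f.name] for i, f in enumerate(files)}
    elif item_merge_mode == "channels":
        # One channel per file, named after the file
        slots = {f.name: [f.name] for f in files}
    elif item_merge_mode == "series":
        # All .dcm files share one slot named after the directory
        # (case-insensitive extension match is an assumption)
        dcm = [f.name for f in files if f.suffix.lower() == ".dcm"]
        slots = {root.name: dcm}
    else:
        raise ValueError("item_merge_mode must be slots, channels, or series")

    return {"item_name": root.name, "slots": slots}
```

For the example directory shown earlier, preview_merged_item("/path/to/directory/of/files", "slots") would describe an item named files with four slots named 0 through 3, under the ordering assumption noted above.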