Upload files
An SDK how-to guide. SDK power-users can refer to our full SDK docs generated from our source code here
Uploading images
With a dataset created, images can be uploaded to it using the push command. push takes a list of paths to the files to upload:
dataset.push(["/path/to/1.jpg", "/path/to/2.jpg"])
To upload an entire directory of images, specify the path of the directory itself, and push() will find all files at all depths within this directory to be uploaded:
dataset.push(["/path/to/directory/1", "/path/to/directory/2"])
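Before pushing a large directory, it can help to preview which local files sit under it at any depth. The following is a minimal sketch using only the Python standard library; the helper name preview_directory_upload is hypothetical and is not part of the SDK, and push() may apply its own filtering (for example by file type) that this sketch does not reproduce.

```python
from pathlib import Path
from typing import List


def preview_directory_upload(directory: str) -> List[str]:
    """Return every file found under `directory`, at any depth, in sorted order.

    Hypothetical helper for illustration only. It mirrors the "all depths"
    traversal described above, not the SDK's actual implementation.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    # rglob("*") walks the whole tree; keep regular files only.
    return sorted(str(p) for p in root.rglob("*") if p.is_file())
```

The returned list can be inspected or counted before passing the directory path itself to dataset.push.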
Additionally, if you want to preserve the local directory structure of your data, you can do so by adding the preserve_folders argument, which is False by default:
dataset.push(["/path/to/directory/1", "/path/to/directory/2"], preserve_folders=True)
If you're sorting images into separate directories within your dataset, set path to the name of the target directory within the dataset:
dataset.push(["/path/to/1.jpg", "/path/to/2.jpg"], path="my_directory")
Uploading videos
All of the above functionality applies to uploading video files as well, but two additional options are available for videos:
- 1: You can control the rate at which frames are sampled from video files by setting the fps argument. To upload videos at their native framerate, simply leave the fps argument out.
- 2: By default, videos are uploaded as entire video files. They can, however, be uploaded as individual frames through the as_frames argument:
dataset.push(["/path/to/video.mp4"], fps=10, as_frames=True)
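Because the native framerate is selected by omitting fps rather than by passing a special value, it can be convenient to build the keyword arguments conditionally. The sketch below is one way to do that; video_push_options is a hypothetical helper, not an SDK function, and the check that fps is positive is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def video_push_options(fps: Optional[float] = None, as_frames: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for a video upload.

    Hypothetical helper for illustration only.
    - fps=None omits the argument, so videos keep their native framerate.
    - as_frames=False omits the argument, so videos upload as whole files.
    """
    options: Dict[str, Any] = {}
    if fps is not None:
        if fps <= 0:
            # Assumption of this sketch: a sampling rate must be positive.
            raise ValueError("fps must be a positive number")
        options["fps"] = fps
    if as_frames:
        options["as_frames"] = True
    return options


# Example usage with an existing dataset object:
# dataset.push(["/path/to/video.mp4"], **video_push_options(fps=10, as_frames=True))
```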
Uploading multi-file items
push supports uploading individual directories of files as multi-slotted, multi-channel, and DICOM series items. To do so, set the item_merge_mode argument to either slots, channels, or series. For example, to upload the following directory of files as a single dataset item with 4 slots:
/path/to/directory/of/files
├── image_1.jpg
├── image_2.jpg
├── image_3.jpg
└── video_1.mp4
files_to_upload = ["/path/to/directory/of/files"]
item_merge_mode = "slots"
dataset.push(files_to_upload=files_to_upload, item_merge_mode=item_merge_mode)
If using item_merge_mode, each file path passed to push must be a directory. The files inside the 1st level of that directory are treated as follows:
Multi-slotted items
- Each file is assigned its own slot
- Dataset items are named after the directory containing the files that result in the item
- Slots are named as incrementing integers starting from 0
Multi-channel items
- Each file is assigned its own channel
- Dataset items are named after the directory containing the files that result in the item
- Each slot is named after the file in the slot
DICOM series
- Each .dcm file is concatenated into the same slot
- Other file types are ignored
- The slot is named after the directory containing the files that result in the item
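The naming rules above can be summarised as a small local preview. This is a rough sketch under stated assumptions, not the SDK's implementation: preview_slot_names is a hypothetical helper, the sorted file order is an assumption (this guide does not state how files are ordered), and matching the .dcm extension case-insensitively is also an assumption.

```python
from pathlib import Path
from typing import List


def preview_slot_names(directory: str, item_merge_mode: str) -> List[str]:
    """Predict slot names for one directory under a given item_merge_mode.

    Hypothetical helper for illustration only.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    # Only files in the first level of the directory are considered.
    # Assumption: files are taken in sorted order.
    files = sorted(p for p in root.iterdir() if p.is_file())

    if item_merge_mode == "slots":
        # Slots are named as incrementing integers starting from 0.
        return [str(index) for index in range(len(files))]
    if item_merge_mode == "channels":
        # Each slot is named after the file it holds.
        return [p.name for p in files]
    if item_merge_mode == "series":
        # All .dcm files share one slot named after the directory;
        # other file types are ignored.
        has_dicom = any(p.suffix.lower() == ".dcm" for p in files)
        return [root.name] if has_dicom else []
    raise ValueError("item_merge_mode must be 'slots', 'channels', or 'series'")
```

For the slots and channels modes, the resulting dataset item itself is named after the directory, which corresponds to Path(directory).name in this sketch.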
Search Scope
Unlike most push operations, if item_merge_mode is set, only files in the first level of each directory will be considered for upload. Files at lower depths are ignored.
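Since nested files are skipped without being uploaded, it may be worth checking a directory for them before pushing with item_merge_mode. A minimal sketch follows; files_below_first_level is a hypothetical helper built on the standard library and is not part of the SDK.

```python
from pathlib import Path
from typing import List


def files_below_first_level(directory: str) -> List[str]:
    """Return files nested deeper than the first level of `directory`.

    Hypothetical helper for illustration only. Paths are returned relative
    to `directory`; an empty list means nothing would be left out because
    of nesting.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        # First-level files have `root` as their direct parent; skip those.
        if p.is_file() and p.parent != root
    )
```

If the list is non-empty, the nested files can be moved up a level or pushed as separate directories.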