split

Splits a LocalDataset into Training, Validation and Test partitions using two strategies:

random
stratified

With the random strategy, DatasetItems are assigned to the Training, Validation and Test partitions at random. The stratified strategy also takes the type of each item's Annotations into account, so that the resulting partitions are more balanced.

> darwin dataset split dog-dataset -v 0.1 -t 0.2
Partition lists saved at /Users/john/.darwin/datasets/v7-john/dog-dataset/releases/latest/lists/145_20_41

When the operation finishes, the partition lists for both strategies are saved under the displayed folder.
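To make the difference between the two strategies more concrete, here is a minimal Python sketch. It is an illustration only and not the implementation used by the command: the function names `random_split` and `stratified_split` are hypothetical, and it assumes each item carries a single label, whereas real DatasetItems can hold several Annotations of different types.

```python
import random
from collections import defaultdict


def random_split(items, val_pct, test_pct, seed=0):
    """Shuffle the items and cut them into (train, val, test) lists."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_pct)
    n_test = int(len(shuffled) * test_pct)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


def stratified_split(items, labels, val_pct, test_pct, seed=0):
    """Split each label group separately, then merge the partitions.

    Assumption: one label per item, which simplifies real annotation data.
    """
    groups = defaultdict(list)
    for item, label in zip(items, labels):
        groups[label].append(item)
    train, val, test = [], [], []
    for label in sorted(groups):
        g_train, g_val, g_test = random_split(groups[label], val_pct, test_pct, seed)
        train += g_train
        val += g_val
        test += g_test
    return train, val, test


# Hypothetical data: 10 "dog" items and 10 "cat" items.
items = [f"img_{i}.jpg" for i in range(20)]
labels = ["dog"] * 10 + ["cat"] * 10

r_train, r_val, r_test = random_split(items, 0.1, 0.2, seed=42)
s_train, s_val, s_test = stratified_split(items, labels, 0.1, 0.2, seed=42)
```

In this sketch both strategies produce partitions of the same sizes, but only the stratified one guarantees that every partition contains the same share of each label; the random one may, by chance, put mostly one label into a small partition.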

Positional arguments:

dataset: Local dataset name to split.

Optional arguments:

-v VAL_PERCENTAGE, --val-percentage VAL_PERCENTAGE: Validation percentage.
-t TEST_PERCENTAGE, --test-percentage TEST_PERCENTAGE: Test percentage.
-s SEED, --seed SEED: Split seed.

🚧
Mandatory arguments
Although -v and -t are listed under "Optional arguments", they are mandatory: the command fails unless you provide both.
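If you call the command from a script, it can help to check the mandatory values before invoking it. The sketch below is one possible way to do that in Python. The helper `build_split_command` is hypothetical, and the range checks are assumptions based on the fractional values (0.1, 0.2) used in the example above; only the flags documented on this page are used.

```python
import shlex


def build_split_command(dataset, val_percentage, test_percentage, seed=None):
    """Build the argument list for `darwin dataset split`.

    Assumption: percentages are fractions strictly between 0 and 1,
    as in the example `-v 0.1 -t 0.2`.
    """
    if val_percentage is None or test_percentage is None:
        raise ValueError("-v and -t are mandatory")
    for value in (val_percentage, test_percentage):
        if not 0 < value < 1:
            raise ValueError(f"expected a fraction between 0 and 1, got {value}")
    if val_percentage + test_percentage >= 1:
        raise ValueError("no items would be left for Training")

    command = ["darwin", "dataset", "split", dataset,
               "-v", str(val_percentage), "-t", str(test_percentage)]
    if seed is not None:
        command += ["-s", str(seed)]
    return command


command = build_split_command("dog-dataset", 0.1, 0.2, seed=42)
print(shlex.join(command))
# prints: darwin dataset split dog-dataset -v 0.1 -t 0.2 -s 42
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the CLI is installed and authenticated.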

📘
Understanding Divisions
The name of the output folder tells you how the dataset was divided. For example, a folder named 18_2_5 means that:

18 DatasetItems were used in Training

2 DatasetItems were used in Validation

5 DatasetItems were used in Testing
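The folder name can also be read programmatically. The following sketch assumes the Training_Validation_Testing order described above; `parse_split_folder` is a hypothetical helper, not part of the CLI. It is applied to the folder name from the example output, 145_20_41, to recover the approximate percentages that were requested.

```python
def parse_split_folder(name):
    """Return (train, val, test) item counts from a folder name like '18_2_5'."""
    train, val, test = (int(part) for part in name.split("_"))
    return train, val, test


train, val, test = parse_split_folder("145_20_41")
total = train + val + test
print(total, round(val / total, 3), round(test / total, 3))
# prints: 206 0.097 0.199
```

The fractions come out slightly below the requested 0.1 and 0.2, which is what you would expect if the partition sizes are rounded down to whole items.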
