Import annotations made outside of V7
In this guide, we'll take a look at how you can use Darwin's Python Library to import annotations made outside of V7.
Supported file formats
To import annotations to V7, make sure they're in one of the supported formats below:
- Darwin JSON
- COCO
- Pascal VOC
- Dataloop
- Labelbox
- SuperAnnotate
- CSV (tags only)
- NIfTI (polygons only)
Getting started
If this is your first time using the Darwin Python Library, check out our Getting Started Guide to make sure you have the correct version of Python and our SDK installed. This guide will also show you how to generate an API key, which you'll want to hold onto for the steps below.
In addition to an API key, you will also want to gather:
1. The dataset identifier
The dataset identifier is the slugified version of your team name and dataset name (team-name/dataset-name). You can gather this using CLI commands. Start by entering the command below:
darwin authenticate
Enter your API key when prompted (and hold onto it, you'll need it again later on).
Once authenticated, enter the following command to pull up a list of your team's datasets and their identifiers:
darwin dataset remote
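If you prefer to stay in Python, you can also list your team's datasets through the SDK instead of the CLI. This is a minimal sketch; it assumes your installed darwin-py version exposes Client.list_remote_datasets() and that the returned dataset objects carry team and slug attributes, so check these names against your version.
from darwin.client import Client

client = Client.from_api_key("YOUR_API_KEY")

# Assumed attributes: .team and .slug on each remote dataset object
for dataset in client.list_remote_datasets():
    # Prints identifiers in the team-name/dataset-name form used in this guide
    print(f"{dataset.team}/{dataset.slug}")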
2. The annotation format name
The name of the format of the annotations you will be importing to V7. We currently support the following formats:
- Darwin JSON Format: "darwin"
- COCO: "coco"
- Pascal VOC: "pascal_voc"
- Dataloop: "dataloop"
- Labelbox: "labelbox"
- SuperAnnotate: "superannotate"
- CSV (for image tags): "csv_tags"
- CSV (for video tags): "csv_tags_video"
- NIfTI (for polygons): "nifti"
3. The annotation paths
The paths to each of the annotation files you will be importing:
annotation_paths = [
"/path/to/annotation/1.json",
"/path/to/annotation/2.json",
"/path/to/annotation/3.json"
]
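If your annotation files all live in one folder, you can gather these paths programmatically instead of listing them by hand. This is just a convenience sketch using the standard library; the folder path and extension are placeholders.
from pathlib import Path

# Collect every .json annotation file in a folder (placeholder path)
annotation_paths = [str(p) for p in Path("/path/to/annotations").glob("*.json")]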
Import annotations
Now that we've gathered our API key, dataset identifier, format name, and annotation paths, it's time to import annotations.
This starts by initialising the client using the API key:
import darwin.importer as importer
from darwin.client import Client
from darwin.importer import get_importer
client = Client.from_api_key(API_KEY)
From there, our dataset identifier can be used to target the dataset in V7:
dataset = client.get_remote_dataset(dataset_identifier=dataset_identifier)
Next, we can fetch the parser object needed to import annotations in the correct format by plugging the format name into the snippet below:
parser = get_importer(format_name)
importer.import_annotations(dataset, parser, annotation_paths, append=False)
If importing annotations to dataset items that already have annotations, those existing annotations will be removed before the import. By default, you will be warned before overwriting annotations in this way. The warning can be bypassed by setting the overwrite argument to True:
importer.import_annotations(dataset, parser, annotation_paths, append=False, overwrite=True)
It's also possible to specify the append argument. When append is set to True, the annotations will be added to the target items. When append is set to False, the existing annotations on the target items are overwritten.
importer.import_annotations(dataset, parser, annotation_paths, append=True)
Appending annotations
Adding append=True to the function call above will add the imported annotations without overriding the existing ones!
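Putting it all together, a complete import script looks something like the sketch below. It only uses the calls shown above; the API key, dataset identifier, format name, and annotation paths are placeholders for your own values.
import darwin.importer as importer
from darwin.client import Client
from darwin.importer import get_importer

API_KEY = "YOUR_API_KEY"
dataset_identifier = "team-name/dataset-name"
format_name = "darwin"
annotation_paths = ["/path/to/annotation/1.json"]

client = Client.from_api_key(API_KEY)
dataset = client.get_remote_dataset(dataset_identifier=dataset_identifier)
parser = get_importer(format_name)

# append=True adds to any existing annotations; append=False replaces them
importer.import_annotations(dataset, parser, annotation_paths, append=True)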
Adding CSV Tags
In order to import tags via CSV file, you should format the .csv
file as follows:
- The first column should be the name & file path of the file you are uploading to in V7
- Additional columns should be tag names you wish to add to that file
For example, if you wanted to add the tags "Blue", "Yellow Beak", and "Winter" to the "bird.jpeg" file, and "Brown" and "Summer" to the "bear.jpeg" file, then the CSV would have the following format:
bird.jpeg,Blue,Yellow Beak,Winter
/path/to/dataset/folder/bear.jpeg,Brown,Summer
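Once the CSV is saved, it is imported with the same function as any other format, using the "csv_tags" format name. A minimal sketch, reusing the client, dataset, and importer set up earlier (the file path is a placeholder):
parser = get_importer("csv_tags")

# append=True keeps any annotations already on the items
importer.import_annotations(dataset, parser, ["/path/to/image_tags.csv"], append=True)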
To upload CSV tags to video, each tag needs to be listed in a separate row and include the starting & ending frame indexes. If importing multiple tag annotations to the same dataset item, be sure to set the append option to True. See the example below:
bird.jpeg, Blue, start_frame_index, end_frame_index
bird.jpeg, Yellow Beak, start_frame_index, end_frame_index
bird.jpeg, Winter, start_frame_index, end_frame_index
bear.jpeg, Brown, start_frame_index, end_frame_index
bear.jpeg, Summer, start_frame_index, end_frame_index
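These rows are then imported using the "csv_tags_video" format name, with append set to True so that each row adds to the same item rather than replacing the previous tag. Again a minimal sketch with a placeholder file path, reusing the objects set up earlier:
parser = get_importer("csv_tags_video")
importer.import_annotations(dataset, parser, ["/path/to/video_tags.csv"], append=True)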
Importing Annotations by Slot
If you have more than one slot, you will want to ensure that you import annotations into the correct slot. This can be done by specifying the x and y coordinates of your annotations on the file in that slot, and specifying the correct slot name in the slot_names field. In the example annotation snippet below, a keypoint is added to each slot of a two-slotted item.
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Example User"
}
],
"id": "2a71ca96-52e3-4c0f-aa0e-9bafe277240a",
"keypoint": {
"x": 712.47,
"y": 133.45
},
"name": "Location",
"reviewers": [],
"slot_names": [
"1"
],
"updated_at": "2023-03-01T16:44:44"
},
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Example User"
}
],
"id": "f4b76b06-2b21-4ab3-86d4-9dff6d8266c0",
"keypoint": {
"x": 485.6,
"y": 233.11
},
"name": "Location",
"reviewers": [],
"slot_names": [
"2"
],
"updated_at": "2023-02-28T18:33:15"
}
Note that the coordinate systems of individual slots are completely independent of each other. You will not be able to add an annotation to another slot by increasing/decreasing the coordinate values of the annotations.
You can read more about our slots functionality here.
Required Darwin JSON 2.0 Fields
To construct Darwin JSON 2.0 files for import, many of the fields in the schema are not required. Below is a sample unannotated Darwin JSON 2.0 file with only the required fields:
{
"version": "2.0",
"schema_ref": "https://darwin-public.s3.eu-west-1.amazonaws.com/darwin_json/2.0/schema.json",
"item": {
"name": "item_name.jpg",
"path": "/"
},
"annotations": []
}
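If you are generating these files from your own pipeline, a skeleton like the one above can be written out with a few lines of Python. This is only a sketch of the required fields shown above; the item name and output filename are placeholders.
import json

darwin_json = {
    "version": "2.0",
    "schema_ref": "https://darwin-public.s3.eu-west-1.amazonaws.com/darwin_json/2.0/schema.json",
    "item": {
        "name": "item_name.jpg",  # must match the item's filename in V7
        "path": "/"               # the item's folder within the dataset
    },
    "annotations": []             # fill with your annotation objects
}

with open("item_name.json", "w") as f:
    json.dump(darwin_json, f, indent=2)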
Adding Annotation Confidence Scores
Darwin-py 0.8+ supports the ability to add confidence scores to your imported annotations. To do this, you can add an inference field to the annotation payload:
"inference": {
"confidence": 0.75,
"model": {
"id": "82b533b4-2637-468b-ba7b-9c9073f4f085",
"name": "Test",
"type": "external"
}
}
All fields are required:
- confidence: the confidence score from 0 to 1
- model:
  - id: a UUID v4
  - name: the name of your model
  - type: if the model was trained outside of V7, this should be external
With annotation data, it would look something like the following:
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Mark CS"
}
],
"bounding_box": {
"h": 50.0,
"w": 35.0,
"x": 1072.0,
"y": 618.0
},
"inference": {
"confidence": 0.75,
"model": {
"id": "82b533b4-2637-468b-ba7b-9c9073f4f085",
"name": "Test",
"type": "external"
}
},
"id": "15461689-5ba8-4eea-b600-07ad250b58e8",
"name": "Sign DV",
"polygon": {
"path": [
{
"x": 1090.0,
"y": 618.0
},
{
"x": 1089.0,
"y": 619.0
},
{
"x": 1089.0,
"y": 622.0
},
{
"x": 1086.0,
"y": 624.0
},
{
"x": 1086.0,
"y": 626.0
},
{
"x": 1085.0,
"y": 627.0
},
{
"x": 1085.0,
"y": 631.0
},
{
"x": 1081.0,
"y": 634.0
},
{
"x": 1079.0,
"y": 634.0
},
{
"x": 1075.0,
"y": 636.0
},
{
"x": 1072.0,
"y": 636.0
},
{
"x": 1072.0,
"y": 647.0
},
{
"x": 1074.0,
"y": 648.0
},
{
"x": 1075.0,
"y": 650.0
},
{
"x": 1076.0,
"y": 649.0
},
{
"x": 1077.0,
"y": 650.0
},
{
"x": 1078.0,
"y": 649.0
},
{
"x": 1079.0,
"y": 650.0
},
{
"x": 1078.0,
"y": 652.0
},
{
"x": 1075.0,
"y": 652.0
},
{
"x": 1073.0,
"y": 653.0
},
{
"x": 1072.0,
"y": 655.0
},
{
"x": 1074.0,
"y": 657.0
},
{
"x": 1074.0,
"y": 663.0
},
{
"x": 1075.0,
"y": 664.0
},
{
"x": 1086.0,
"y": 663.0
},
{
"x": 1088.0,
"y": 666.0
},
{
"x": 1091.0,
"y": 666.0
},
{
"x": 1095.0,
"y": 668.0
},
{
"x": 1105.0,
"y": 668.0
},
{
"x": 1107.0,
"y": 666.0
},
{
"x": 1106.0,
"y": 664.0
},
{
"x": 1106.0,
"y": 662.0
},
{
"x": 1107.0,
"y": 661.0
},
{
"x": 1107.0,
"y": 656.0
},
{
"x": 1106.0,
"y": 655.0
},
{
"x": 1107.0,
"y": 654.0
},
{
"x": 1106.0,
"y": 640.0
},
{
"x": 1105.0,
"y": 639.0
},
{
"x": 1105.0,
"y": 636.0
},
{
"x": 1104.0,
"y": 635.0
},
{
"x": 1104.0,
"y": 632.0
},
{
"x": 1103.0,
"y": 631.0
},
{
"x": 1102.0,
"y": 626.0
},
{
"x": 1101.0,
"y": 627.0
},
{
"x": 1100,
"y": 626.0
},
{
"x": 1100,
"y": 625.0
},
{
"x": 1098.0,
"y": 623.0
},
{
"x": 1098.0,
"y": 621.0
},
{
"x": 1097.0,
"y": 621.0
},
{
"x": 1095.0,
"y": 618.0
}
]
},
"reviewers": []
}
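If your model's outputs already live in Darwin JSON files, the inference field can be added programmatically before import. The sketch below assumes one confidence score per annotation and uses placeholder file names and model details; adapt it to however your scores are stored.
import json

MODEL_INFO = {
    "id": "82b533b4-2637-468b-ba7b-9c9073f4f085",  # any UUID v4
    "name": "Test",
    "type": "external"
}

with open("annotation.json") as f:
    darwin_json = json.load(f)

# Placeholder: one confidence value per annotation, in order
confidences = [0.75 for _ in darwin_json["annotations"]]

for annotation, confidence in zip(darwin_json["annotations"], confidences):
    annotation["inference"] = {"confidence": confidence, "model": MODEL_INFO}

with open("annotation_with_scores.json", "w") as f:
    json.dump(darwin_json, f, indent=2)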
That's it! Check out the rest of our Darwin Python Library guides on how to upload images and video, create classes, and pull data.
Uploading SubAnnotations
If the class for the annotation you are trying to upload already exists, all you need to do is add the sub-annotation to the relevant annotation in the annotation file. You would then upload as normal.
Below is an example Darwin JSON snippet including all subannotation types:
"annotations": [
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Mark CS"
}
],
"attributes": [
"Example attribute"
],
"bounding_box": {
"h": 333.74,
"w": 509.9,
"x": 236.77,
"y": 52.53
},
"directional_vector": {
"angle": -2.22,
"length": 161.54
},
"id": "d44eed57-6526-46e9-88d2-3aa3c3cdb9ea",
"instance_id": {
"value": 64
},
"name": "TestBBOX",
"reviewers": [],
"slot_names": [
"0"
],
"text": {
"text": "Example text"
},
"updated_at": "2023-05-05T09:53:25"
}
]
Note on uploading SubAnnotations when the class doesn't exist
If the class does not already exist (or does not include this subannotation), you will need to perform the upload twice. The first upload will create the annotation class and ignore subtypes. The second will add the subannotations.
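In practice, this just means running the same import call twice against the same files. The sketch below reuses the dataset, importer, and annotation paths from earlier in this guide:
parser = get_importer("darwin")

# First pass: creates any missing annotation classes (subtypes are ignored)
importer.import_annotations(dataset, parser, annotation_paths, append=False)

# Second pass: the classes now exist, so the subannotations are imported.
# overwrite=True skips the prompt about replacing the annotations from the first pass.
importer.import_annotations(dataset, parser, annotation_paths, append=False, overwrite=True)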
Note on annotation types
There are small differences to note in how the path and filename fields are defined in different annotation file formats. For example, in the Darwin JSON format the path to the folder of the image files within V7 is defined in the 'path' field, while the filename itself is defined in the 'name' field. In the COCO format, however, both the filename and the path to the folder containing the files are defined within the 'file_name' field.
Examples of these fields in the Darwin JSON format and the COCO format are shown below, respectively:
Darwin JSON format
"name": "Brush_Training.jpeg",
"path": "/Set1",
COCO format
"file_name": "Set1/Brush_Training.jpeg",
Importing RLE
If you would like to import annotations directly as RLE rather than converting them to vectorized Darwin JSON polygons, you will need to have Raster mode enabled on your account. Please reach out to your CSM if you are interested.
Once enabled, you can import the RLE using Darwin JSON. The simplest way to get the required format and class mappings is to create an example annotation export using the mask format, with each pixel class you would like to include in your import. This can then be used as a template. You will then need to edit the dense_rle field in the __raster_layer__ section of your import to reflect the RLE you would like to add to the platform.
Each mask has its own unique ID, which maps to a shorter ID used in the RLE. This mapping can be found in mask_annotation_ids_mapping.
In the RLE format, values at odd positions reflect the class and values at even positions reflect the number of pixels of that class. So 41 background class pixels (class 0) followed by 52 pixels of a non-background class (class 1) would be:
"dense_rle": [
0,
41,
1,
52
]
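As a sanity check, the encoding described above can be produced with a small helper like the one below. It assumes your per-pixel class IDs have already been flattened into a single sequence (row by row is a reasonable assumption, but verify the scan order against a template export); the function is purely illustrative.
def encode_dense_rle(class_ids):
    # Encode a flat sequence of per-pixel class IDs as
    # [class, run_length, class, run_length, ...]
    dense_rle = []
    for class_id in class_ids:
        if dense_rle and dense_rle[-2] == class_id:
            dense_rle[-1] += 1
        else:
            dense_rle.extend([class_id, 1])
    return dense_rle

# 41 background pixels (class 0) followed by 52 pixels of class 1
print(encode_dense_rle([0] * 41 + [1] * 52))  # [0, 41, 1, 52]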
Below is a full example of Darwin JSON including RLE:
{
"version": "2.0",
"schema_ref": "https://darwin-public.s3.eu-west-1.amazonaws.com/darwin_json_2_0.schema.json",
"item": {
"name": "e13cb60a2bfc1c22d2524518b7444f92e37fe5d404b0144390f8c078a1ebb2_640.jpg",
"path": "/",
"source_info": {
"dataset": {
"name": "Raster_Dataset",
"slug": "raster_dataset",
"dataset_management_url": "https://darwin.v7labs.com/datasets/612356/dataset-management"
},
"item_id": "0185edd1-1fe9-e8fa-07c7-3d0fe74662b4",
"team": {
"name": "Mark Test",
"slug": "mark-test"
},
"workview_url": "https://darwin.v7labs.com/workview?dataset=612356&item=0185edd1-1fe9-e8fa-07c7-3d0fe74662b4"
},
"slots": [
{
"type": "image",
"slot_name": "0",
"width": 640,
"height": 480,
"thumbnail_url": "https://darwin.v7labs.com/api/v2/teams/mark-test/files/af7c9720-008d-4449-a520-567023908b59/thumbnail",
"source_files": [
{
"file_name": "e037b00b2af41c22d2524518b7444f92e37fe5d404b0144390f8c078a0eabd_640.jpg",
"url": "https://darwin.v7labs.com/api/v2/teams/mark-test/uploads/c560b065-4642-421f-9b30-65a60ab631e2"
}
]
}
]
},
"annotations": [
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Dave"
}
],
"id": "fcd36e48-64ea-43f3-a8b8-d37a28fcca30",
"mask": {
"sparse_rle": null
},
"name": "Grass_Mask",
"reviewers": [],
"slot_names": [
"0"
],
"updated_at": "2023-01-27T11:35:48"
},
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Dave"
}
],
"id": "9e29f364-85df-44e9-832b-f0cf5f4576f9",
"mask": {
"sparse_rle": null
},
"name": "Background_Mask",
"reviewers": [],
"slot_names": [
"0"
],
"updated_at": "2023-01-27T11:35:00"
},
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Eric"
},
{
"email": "[email protected]",
"full_name": "Dave"
}
],
"id": "bea5a84a-9c3a-4f6c-b7d1-3ec029dd0698",
"mask": {
"sparse_rle": null
},
"name": "Sheep_Mask",
"reviewers": [],
"slot_names": [
"0"
],
"updated_at": "2023-05-31T09:04:13"
},
{
"annotators": [
{
"email": "[email protected]",
"full_name": "Dave"
},
{
"email": "[email protected]",
"full_name": "Eric"
}
],
"id": "0d0de08b-f989-45ac-a6a8-34d4d7767031",
"name": "__raster_layer__",
"raster_layer": {
"dense_rle": [
0,
467,
3,
76,
0,
558,
3,
1,
0,
1,
3,
80,
0,
14
],
"mask_annotation_ids_mapping": {
"9e29f364-85df-44e9-832b-f0cf5f4576f9": 2,
"bea5a84a-9c3a-4f6c-b7d1-3ec029dd0698": 1,
"fcd36e48-64ea-43f3-a8b8-d37a28fcca30": 3
},
"total_pixels": 307200
},
"reviewers": [],
"slot_names": [
"0"
],
"updated_at": "2023-05-31T09:04:13"
}
]
}
Panoptic Annotation
You can use rasters alongside vectorized annotations like polygons in the same items.