Import annotations made outside of V7

In this guide, we'll take a look at how you can use Darwin's Python Library to import annotations made outside of V7.

📘

Supported file formats

To import annotations to V7, make sure they're in one of the supported formats below:

  • coco
  • csv_tags
  • csv_tags_video
  • darwin
  • dataloop
  • pascal_voc

Getting started

If this is your first time using the Darwin Python Library, check out our Getting Started Guide to confirm that you have the correct version of Python and our SDK installed. That guide also shows you how to generate an API key, which you'll want to hold onto for the steps below.

In addition to an API key, you will also want to gather:

1. The dataset identifier: dataset_identifier

The dataset identifier is the slugified version of your team name and dataset name (team-name/dataset-name). You can gather this using CLI commands. Start by entering the command below:

darwin authenticate

Enter your API key when prompted (and hold onto it, you'll need it again later on).

Once authenticated, enter the following command to pull up a list of your team's datasets and their identifiers:

darwin dataset remote
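Before passing the identifier on, it can help to sanity-check its shape. The sketch below assumes only what is described above, namely a team slug and a dataset slug joined by a single slash. The helper name split_dataset_identifier is hypothetical and is not part of the Darwin Python Library.

```python
from typing import Tuple


def split_dataset_identifier(identifier: str) -> Tuple[str, str]:
    """Split 'team-slug/dataset-slug' into its two parts (hypothetical helper)."""
    cleaned = identifier.strip()
    # Slugs are not expected to contain whitespace.
    if any(ch.isspace() for ch in cleaned):
        raise ValueError(f"Identifier contains whitespace: {identifier!r}")
    parts = cleaned.split("/")
    # Expect exactly one slash with a non-empty slug on each side.
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected 'team-slug/dataset-slug', got: {identifier!r}")
    team_slug, dataset_slug = parts
    return team_slug, dataset_slug
```

Calling split_dataset_identifier("team-name/dataset-name") returns the two slugs, while a value such as "dataset-name" on its own raises a ValueError before any request is made.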

2. The format name: format_name

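The format of the annotations you will be importing to V7: coco, csv_tags, csv_tags_video, darwin, dataloop or pascal_voc. Use the name exactly as written in the supported formats list, since it is used as a lookup key in the import step.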
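Since the format name is later used as a dictionary key, a typo only surfaces as a KeyError at import time. A minimal sketch of an early check follows. It assumes the underscore spellings from the Supported file formats list are the ones the library expects, and the helper name normalize_format_name is hypothetical.

```python
# Format names copied from the "Supported file formats" list in this guide.
SUPPORTED_FORMATS = (
    "coco",
    "csv_tags",
    "csv_tags_video",
    "darwin",
    "dataloop",
    "pascal_voc",
)


def normalize_format_name(format_name: str) -> str:
    """Return the underscore spelling of a format name, or raise ValueError."""
    # Assumption: hyphenated spellings such as "pascal-voc" mean "pascal_voc".
    candidate = format_name.strip().lower().replace("-", "_")
    if candidate not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported format {format_name!r}; expected one of {SUPPORTED_FORMATS}"
        )
    return candidate
```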

3. The annotation paths: annotation_paths

The paths to each of the annotation files you will be importing:

annotation_paths = [
    "/path/to/annotation/1.json",
    "/path/to/annotation/2.json",
    "/path/to/annotation/3.json",
]
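When there are many annotation files, typing each path by hand gets tedious. As a sketch, the list can be built from a folder with the standard library instead. The helper name collect_annotation_paths and the default "*.json" pattern are assumptions for illustration, so adjust the pattern to match your format (for example XML or CSV files).

```python
from pathlib import Path
from typing import List


def collect_annotation_paths(folder: str, pattern: str = "*.json") -> List[str]:
    """Return the sorted paths of files in `folder` that match `pattern`."""
    # glob() is non-recursive here: only files directly inside `folder` are returned.
    return sorted(str(path) for path in Path(folder).glob(pattern) if path.is_file())
```

The result can be assigned directly, for example annotation_paths = collect_annotation_paths("/path/to/annotation").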

Import annotations

Now that we've gathered our API key, dataset identifier, format name, and annotation paths, it's time to import annotations.

This starts by initialising the client using the API key:

import darwin.importer as importer
from darwin.client import Client
from darwin.importer import formats

# API_KEY is the key you generated by following the Getting Started Guide
client = Client.from_api_key(API_KEY)
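The snippet above expects a variable named API_KEY to exist already. One way to supply it without hard-coding the key in a script is to read it from an environment variable, sketched below. The variable name DARWIN_API_KEY and the helper load_api_key are hypothetical; the library is not claimed to read this variable on its own.

```python
import os


def load_api_key(env_var: str = "DARWIN_API_KEY") -> str:
    """Read the API key from an environment variable (the name is an assumption)."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your V7 API key")
    return api_key
```

With this in place, API_KEY = load_api_key() can precede the Client.from_api_key call.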

From there, our dataset identifier can be used to target the dataset in V7:

dataset = client.get_remote_dataset(dataset_identifier=dataset_identifier)

Next, we can fetch the parser object for the chosen format by plugging the format name into the snippet below, then pass it to import_annotations together with the dataset and the annotation paths:

parser = dict(formats.supported_formats)[format_name]
importer.import_annotations(dataset, parser, annotation_paths)

It's also possible to specify the append argument. When append is set to True, the imported annotations are added to the annotations already on the target items. When append is set to False (the default), the existing annotations on the target items are overwritten.

importer.import_annotations(dataset, parser, annotation_paths, append=True)

📘

Append

Adding append=True to the call above will add the imported annotations to any existing annotations in the destination dataset.

If you would like to overwrite any existing annotations, pass append=False (the default) instead.
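Putting the steps together, the whole flow can be wrapped in a single function. This is a sketch that uses only the calls shown in this guide. The wrapper name import_to_v7 is hypothetical, and the darwin imports are placed inside the function so that the module still loads where the SDK is not installed.

```python
from typing import List


def import_to_v7(
    api_key: str,
    dataset_identifier: str,
    format_name: str,
    annotation_paths: List[str],
    append: bool = False,
) -> None:
    """Import annotation files into a V7 dataset (hypothetical wrapper)."""
    # Lazy imports: the Darwin Python Library is only needed when the function runs.
    import darwin.importer as importer
    from darwin.client import Client
    from darwin.importer import formats

    client = Client.from_api_key(api_key)
    dataset = client.get_remote_dataset(dataset_identifier=dataset_identifier)

    # Look up the parser for the requested format, failing with a clear message.
    supported = dict(formats.supported_formats)
    if format_name not in supported:
        raise ValueError(f"Unknown format {format_name!r}; choose from {sorted(supported)}")
    parser = supported[format_name]

    # append=False overwrites existing annotations; append=True adds to them.
    importer.import_annotations(dataset, parser, annotation_paths, append=append)
```

A call might look like import_to_v7(API_KEY, "team-name/dataset-name", "coco", annotation_paths, append=True).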

That's it! Check out the rest of our Darwin Python Library guides on how to upload images and video, create classes, and pull data.