External S3 Buckets

Using V7's external S3 integration, you can keep your data stored within a private AWS domain. Check out the diagram below to see how it works, and if you're ready to get started, follow our step-by-step instructions to create the integration.

[Diagram: how the external S3 integration works]

🚧

The S3 integration is available on V7's Business and Enterprise plans. You can find out more about what each plan includes on our pricing page.

Read / Write access

To set up an external S3 bucket, first grant our AWS role (arn:aws:iam::258327614892:role/external_s3) the following access:

  • Read via GetObject
  • Write via PutObject (optional)
{
	"Version": "2012-10-17",
	"Id": "PolicyForExternalAccess",
	"Statement": [
    {
      "Sid": "Darwin Access",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::258327614892:role/external_s3"
      },
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::your-s3-bucket-name/*"
    }
  ]
}

If you don't need Darwin to process images after they are uploaded (e.g. to generate thumbnails or split videos into frames), you can leave out the write permission, "s3:PutObject":

{
	"Version": "2012-10-17",
	"Id": "PolicyForExternalAccess",
	"Statement": [
    {
      "Sid": "Darwin Access",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::258327614892:role/external_s3"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-s3-bucket-name/*"
    }
  ]
}

If you already have a policy for your bucket, you only need to add the Statement part.
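
Adding the statement to an existing policy can also be scripted. The following is a minimal sketch, not an official V7 tool: the helper names `darwin_statement` and `add_darwin_statement` are hypothetical, and the bucket name is a placeholder. It only builds the merged policy document; applying it to the bucket is left to you.

```python
import json

# Role ARN copied from the policy above.
DARWIN_ROLE_ARN = "arn:aws:iam::258327614892:role/external_s3"


def darwin_statement(bucket_name, allow_write=True):
    """Build the Darwin access statement for one bucket (hypothetical helper)."""
    actions = ["s3:GetObject"]
    if allow_write:
        actions.append("s3:PutObject")
    return {
        "Sid": "Darwin Access",
        "Effect": "Allow",
        "Principal": {"AWS": DARWIN_ROLE_ARN},
        "Action": actions,
        "Resource": f"arn:aws:s3:::{bucket_name}/*",
    }


def add_darwin_statement(policy, bucket_name, allow_write=True):
    """Return a copy of `policy` with the Darwin statement added.

    An earlier statement with the same Sid is replaced, so running this
    twice does not create duplicates. The input policy is not modified.
    """
    new_statement = darwin_statement(bucket_name, allow_write)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list.
        statements = [statements]
    kept = [s for s in statements if s.get("Sid") != new_statement["Sid"]]
    merged = dict(policy)
    merged.setdefault("Version", "2012-10-17")
    merged["Statement"] = kept + [new_statement]
    return merged


if __name__ == "__main__":
    # Placeholder for a policy that is already attached to the bucket.
    existing = {"Version": "2012-10-17", "Statement": []}
    merged = add_darwin_statement(existing, "your-s3-bucket-name")
    print(json.dumps(merged, indent=2))
```

Pass `allow_write=False` to produce the read-only variant shown above.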

CORS access

When annotators request images to annotate, the images are loaded directly from your S3 bucket via a presigned URL. Because that bucket sits on a different domain than darwin.v7labs.com, a CORS configuration is required.

You can find this under Permissions > CORS Configuration in the AWS S3 UI:

[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET"
        ],
        "AllowedOrigins": [
            "https://darwin.v7labs.com"
        ],
        "ExposeHeaders": []
    }
]

Or, in the XML format:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
	<CORSRule>
		<AllowedOrigin>https://darwin.v7labs.com</AllowedOrigin>
		<AllowedMethod>GET</AllowedMethod>
	</CORSRule>
</CORSConfiguration>
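
If you prefer to set the CORS rule from a script rather than the AWS console, it could look roughly like the sketch below. It assumes the third-party `boto3` package is installed and AWS credentials are configured; the function names are hypothetical. Note that `put_bucket_cors` replaces the bucket's entire CORS configuration, so merge in any rules you already have first.

```python
def darwin_cors_configuration(origin="https://darwin.v7labs.com"):
    """Build the CORS rule from the JSON above in the shape boto3 expects.

    The empty "ExposeHeaders" list from the JSON version is left out here.
    """
    return {
        "CORSRules": [
            {
                "AllowedHeaders": ["*"],
                "AllowedMethods": ["GET"],
                "AllowedOrigins": [origin],
            }
        ]
    }


def apply_darwin_cors(bucket_name):
    """Apply the rule to a bucket (overwrites its existing CORS rules)."""
    # Imported here so the configuration helper works without boto3.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration=darwin_cors_configuration(),
    )


if __name__ == "__main__":
    # Inspect the rule; call apply_darwin_cors("your-s3-bucket-name") to apply it.
    print(darwin_cors_configuration())
```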

Activation

When this is all set up, please message [email protected] with the following details:

  • S3 region
  • S3 bucket name
  • an optional prefix where we can upload thumbnails if needed (often /darwin/)
  • your team name

We will then turn on external access for your team.

If you encounter any issues or have any questions, feel free to contact us at [email protected].

"Uploading" new images

After the external bucket is ready and you have populated it with images, you need to tell Darwin which images it should list and for which dataset. This is done through a request to the REST API.

All REST endpoints in Darwin use API keys for authentication; you can set up your own key here.

The following Python script shows how to register a newly added image:

import requests

api_key = "your_key_here"

team_slug = "team_slug_here"
dataset_slug = "dataset_slug_here"
storage_name = "storage_name_here"

# Every request is authenticated with your API key.
headers = {
    "Content-Type": "application/json",
    "Authorization": f"ApiKey {api_key}"
}

# "key" is the object's key inside your S3 bucket.
payload = {
    "items": [
        {
            "type": "image",
            "key": "darwin/cars/2008_000074.jpg",
            "filename": "2008_000074.jpg"
        }
    ],
    "storage_name": storage_name
}

response = requests.put(
    f"https://darwin.v7labs.com/api/teams/{team_slug}/datasets/{dataset_slug}/data",
    headers=headers,
    json=payload
)

if response.status_code != 200:
    print("request failed", response.text)
else:
    print("success")
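
To register several images, the payload can be built from a list of object keys. This is a sketch under the assumption that the endpoint accepts more than one entry in "items" per request, which the list shape suggests but this page does not state. `build_payload` is a hypothetical helper, not part of any V7 library.

```python
import posixpath


def build_payload(keys, storage_name):
    """Build a registration payload for a list of S3 object keys.

    The filename is taken from the last path segment of each key,
    matching the single-image example above.
    """
    items = [
        {
            "type": "image",
            "key": key,
            "filename": posixpath.basename(key),
        }
        for key in keys
    ]
    return {"items": items, "storage_name": storage_name}


if __name__ == "__main__":
    payload = build_payload(
        ["darwin/cars/2008_000074.jpg", "darwin/cars/2008_000075.jpg"],
        "storage_name_here",
    )
    print(payload)
```

The resulting dictionary can be passed as the `json` argument of the request shown above.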

Generating thumbnails

If you are generating your own thumbnails, the payload changes slightly:

"items": [
	{
		"type": "image",
		"key": "darwin/cars/2008_000074.jpg",
		"thumbnail_key": "darwin/2008_000074_thumbnail.jpg",
		"filename": "2008_000074.jpg"
	}
]

We recommend using mogrify for thumbnail generation.

> mogrify -resize "356x200>" -format jpg -quality 50 -write thumbnail.jpg large.png
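
The same command can be run from Python when thumbnails are generated as part of an upload script. This sketch assumes ImageMagick's `mogrify` is installed and on the PATH; the function names are hypothetical. Passing the arguments as a list avoids having to quote the `>` in the resize geometry for the shell.

```python
import subprocess


def mogrify_command(source_path, thumbnail_path):
    """Return the mogrify invocation shown above as an argument list."""
    return [
        "mogrify",
        "-resize", "356x200>",
        "-format", "jpg",
        "-quality", "50",
        "-write", thumbnail_path,
        source_path,
    ]


def make_thumbnail(source_path, thumbnail_path):
    """Run mogrify; raises CalledProcessError if the command fails."""
    subprocess.run(mogrify_command(source_path, thumbnail_path), check=True)


if __name__ == "__main__":
    # Print the command; call make_thumbnail(...) to actually run it.
    print(" ".join(mogrify_command("large.png", "thumbnail.jpg")))
```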