Train a Model

V7's Models feature allows you to train and run models directly from V7 in a matter of minutes. In this series of guides, we'll show you how to train your first model, how to add model stages to your labelling workflow, and finally, how to run models from V7 through the API. Let's start with getting your first model up and running.

Getting started

As with any model, the results that we get from a model trained in V7 depend heavily on the data that we use to train it.

Each model that you train with the Models feature is based on the data in a single dataset. Before we train our model, we want to make sure that a few things are true of our dataset.

For Instance Segmentation and Object Detection models:

All the files that we'd like to use are in a Completed state.
There are enough instances (at least 100) of each class in the dataset for the model to be able to learn them, spread over at least 10 files.

For Classification models:

All the files that we'd like to use are in a Completed state.
At least two tag classes in the dataset are selected for training.
At least 100 tag instances for each class.

If you've got a dataset full of completed items with well-represented classes throughout, then you're ready to start training!
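The checklist above can be sketched as a quick pre-flight check. This is a minimal illustration, not part of V7: the function names and the input format (plain `(file_name, class_name)` pairs taken from Completed files) are hypothetical, and the code assumes that "spread over at least 10 files" applies to each class separately.

```python
from collections import Counter, defaultdict

MIN_INSTANCES = 100  # minimum instances per class, from the checklist above
MIN_FILES = 10       # assumption: each class should appear in at least 10 files


def check_detection_readiness(annotations):
    """Summarise class coverage for segmentation/detection training.

    annotations: iterable of (file_name, class_name) pairs, one pair per
    annotation on a Completed file (hypothetical input format).
    Returns {class_name: (n_instances, n_files, meets_minimums)}.
    """
    instances = defaultdict(int)
    files = defaultdict(set)
    for file_name, class_name in annotations:
        instances[class_name] += 1
        files[class_name].add(file_name)
    return {
        cls: (
            instances[cls],
            len(files[cls]),
            instances[cls] >= MIN_INSTANCES and len(files[cls]) >= MIN_FILES,
        )
        for cls in instances
    }


def check_classification_readiness(tags):
    """Check the classification rules: two or more tag classes, 100+ tags each.

    tags: iterable of (file_name, tag_name) pairs from Completed files.
    Returns (ready, counts_per_tag).
    """
    counts = Counter(tag_name for _, tag_name in tags)
    ready = len(counts) >= 2 and all(n >= MIN_INSTANCES for n in counts.values())
    return ready, counts
```

A class that reaches 100 instances but only appears in a handful of files would be flagged as not ready here, which matches the intent of spreading examples across files.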

📘
V7 Darwin currently doesn't support model training with slotted items

Create a new model

Select Models from the side menu. Hit Train a Model to select the model type that you'd like to use, and name your model.

Model Type	Description	Model Name
Instance Segmentation	This type of model encloses each individual object in a polygon, with a bounding box around the polygon. Instance segmentation models are trained on the polygons in your training data, and are only compatible with annotations made in V7 with Auto-Annotate, the Brush Tool, and the Polygon Tool.	Darwin Instance Segmentation
Object Detection	This type of model encloses each individual object in a bounding box. Object Detection models are trained on the bounding boxes and/or polygons in your training data, and are only compatible with annotations made in V7 with the Bounding Box Tool, Auto-Annotate, the Brush Tool, and the Polygon Tool.	Deformable DETR
Classification	This type of model provides the most likely tag given to an image, based on its pixel content. Classification models are trained on the tags in your training data, which can be easily added to files in bulk from the Data page, or one by one in the workview of any file. At least two tag classes must be selected for training in order to train a classification model.	Darwin Classifier

Select the dataset that you'll be using and toggle the Instances/Images options to see the distribution of classes in your dataset by number of instances and number of files. Hover over any class to get V7's take on how it will perform.

Deselect any classes that you will not be training your model on, and click Continue for a breakdown of how your data will be split between Training, Validation, and Test sets.

Check the Summary for an estimate of how long training will last, and click Start Training. V7 will email you to let you know when training is complete.

Understand your results

Once training is complete, you'll have a few important pieces of information to assess the model's performance before testing it out.

Loss - all model types

The first metric we'll focus on is Loss. Every type of model trained in V7 will display a loss figure and loss curve. The number that V7 displays is the value of the loss function at the latest training epoch. The lower the loss, the fewer mistakes the model is making. Here's what we want to see:

The loss curve of a well-trained model should follow an L-shape, decreasing sharply at first and approaching a flat line over time. This is a visual representation of your model making fewer mistakes over time.
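If you have the per-epoch loss values to hand, the L-shape can be checked roughly in code. The sketch below is a hypothetical heuristic, not something V7 computes: the function name, the 80% "early drop" rule and the `flat_tolerance` parameter are all assumptions chosen for illustration.

```python
def looks_l_shaped(losses, flat_tolerance=0.05):
    """Rough heuristic for an L-shaped loss curve (illustrative only).

    losses: list of loss values, one per epoch, in training order.
    Returns True when most of the total drop happens in the first half of
    training and the last quarter is nearly flat.
    """
    if len(losses) < 8:
        raise ValueError("need at least 8 loss values to judge the shape")
    total_drop = losses[0] - losses[-1]
    if total_drop <= 0:
        return False  # the loss never went down overall
    half = len(losses) // 2
    last_quarter_start = len(losses) - len(losses) // 4
    early_drop = losses[0] - losses[half]
    late_change = abs(losses[last_quarter_start] - losses[-1])
    # Sharp early decrease, then a tail that barely moves
    return early_drop >= 0.8 * total_drop and late_change <= flat_tolerance * total_drop
```

A curve that decreases in a straight line, or one that is still dropping quickly at the final epoch, returns `False`, which may suggest the model would benefit from more training or more data.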

mAP - Instance Segmentation and Object Detection models

Mean Average Precision (mAP) is measured by taking the mean of all average precisions (the area under a Precision vs Recall curve) across all IoU thresholds and for all classes. This metric summarizes overall model performance, independent of any manually set threshold.

mAP can be filtered by different IoU thresholds in a neural network's summary page.

mAP is a strict measurement, so don’t be discouraged if the number is lower than expected. 85% can be considered a very high mAP.
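To make the definition concrete, here is a minimal sketch of the calculation for bounding boxes. It is a simplification under stated assumptions, not V7's implementation: it handles a single image per class, uses greedy matching, and computes uninterpolated average precision, whereas benchmark-style evaluations often interpolate the curve. All function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truth, iou_threshold):
    """Area under the precision-recall curve for one class at one threshold.

    predictions: list of (confidence, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    true_positives = 0
    ap = 0.0
    prev_recall = 0.0
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    for rank, (_, box) in enumerate(ranked, start=1):
        # Find the best-overlapping ground-truth box not yet matched
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
            precision = true_positives / rank
            recall = true_positives / len(ground_truth)
            ap += precision * (recall - prev_recall)  # rectangle under the curve
            prev_recall = recall
    return ap


def mean_average_precision(predictions_by_class, truth_by_class, iou_thresholds):
    """Mean of the average precisions over all classes and IoU thresholds."""
    aps = [
        average_precision(predictions_by_class.get(cls, []), boxes, threshold)
        for cls, boxes in truth_by_class.items()
        for threshold in iou_thresholds
    ]
    return sum(aps) / len(aps) if aps else 0.0
```

Because a stricter IoU threshold turns loosely fitting boxes into false positives, averaging over many thresholds pulls the final number down, which is one reason mAP tends to look lower than other metrics.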

Accuracy - Classification models

Accuracy is the percentage of predictions our model got right in the test set. It's computed by dividing the number of correct predictions by the total number of test examples.
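As a small worked example of that division, the sketch below computes accuracy from two parallel lists of tags. The function name and input format are hypothetical and only illustrate the formula; V7 reports this figure for you.

```python
def accuracy(predicted_tags, true_tags):
    """Fraction of test examples whose predicted tag matches the true tag."""
    if len(predicted_tags) != len(true_tags):
        raise ValueError("predictions and ground truth must be the same length")
    if not true_tags:
        raise ValueError("the test set is empty")
    correct = sum(p == t for p, t in zip(predicted_tags, true_tags))
    return correct / len(true_tags)


# Three of the four predictions match, so accuracy is 0.75 (75%)
example = accuracy(["cat", "dog", "cat", "dog"], ["cat", "dog", "dog", "dog"])
```

Multiply the result by 100 to express it as the percentage described above.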

If your metrics are looking good, then it's time for the fun to begin. You can now run your model through the API, and use it to automatically label your data.