This time we’re going to assign multiple labels to an image, depending on which units appear in it. The model stays pretty much the same; what’s different is the dataset and how we use it. For single-label classification we created a dataset where the filename contained the label. Now we’ll just name each image with a number and use a CSV file to hold the labels.
The dataset looks something like this:
multi-unit/
    labels.csv
    train/
        0.png
        1.png
        2.png
        etc...
Where labels.csv looks like this:
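As a rough sketch of the layout that `label_delim=' '` implies: one row per image, with the image name in the first column and its space-separated labels in the second. The column names (`name`, `tags`) and the unit labels below are made up for illustration and are not taken from the real file.

```python
import csv
import io

# Hypothetical rows: image name (without the .png suffix) and space-separated labels.
# Multi-word unit names use underscores so they survive splitting on spaces.
rows = [
    ("0", "berserk"),
    ("1", "scout_cavalry villager"),
    ("2", "monk teutonic_knight villager"),
]

def render_labels_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "tags"])  # assumed header names
    writer.writerows(rows)
    return buffer.getvalue()

def parse_labels(csv_text, label_delim=" "):
    """Map each image name to its list of labels, splitting on the delimiter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: row["tags"].split(label_delim) for row in reader}

csv_text = render_labels_csv(rows)
labels = parse_labels(csv_text)
print(csv_text)
```

Splitting the second column on a space is what turns a single cell into several labels per image.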
You can then use the dataset with the following code:
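from fastai.vision import *  # fastai v1: untar_data, ImageList, get_transforms, ...

path = untar_data('https://joalon.se/datasets/multi-unit')
# Assumption: default augmentations, since tfms is not defined in this section
tfms = get_transforms()
src = (ImageList.from_csv(path, 'labels.csv', folder='train', suffix='.png')
       .split_by_rand_pct(0.2)           # hold out 20% of the images for validation
       .label_from_df(label_delim=' '))  # labels in the CSV are space-separated
data = (src.transform(tfms, size=224)
        .databunch().normalize(imagenet_stats))
data.show_batch(rows=3, figsize=(15,10))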
This time we’ll need a different accuracy metric. For each label, the model returns a probability that the image we’re passing to it contains that label, so we’ll say that a probability over 0.2 means the model predicted the label. Here’s how to create the model with the metric:
acc_02 = partial(accuracy_thresh, thresh=0.2)
learn = cnn_learner(data, models.resnet50, metrics=[acc_02])
Do note that metrics aren’t used during training other than for showing information.
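To make the threshold idea concrete, here is a small dependency-free sketch of what a thresholded multi-label accuracy computes: squash each raw output with a sigmoid, call the label predicted if the result is over the threshold, and count how many label slots match the targets. The function name and the numbers are invented for illustration; this is not fastai’s own code.

```python
import math

def sigmoid(x):
    """Map a raw model output to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def accuracy_thresh_sketch(logits, targets, thresh=0.2):
    """Fraction of (image, label) slots where the thresholded prediction matches the target."""
    hits = 0
    total = 0
    for row_logits, row_targets in zip(logits, targets):
        for logit, target in zip(row_logits, row_targets):
            predicted = sigmoid(logit) > thresh
            hits += int(predicted == bool(target))
            total += 1
    return hits / total

# Two made-up images with three possible labels each.
logits = [[2.0, -3.0, -1.0], [-4.0, 0.5, -2.0]]
targets = [[1, 0, 0], [0, 1, 1]]

score_02 = accuracy_thresh_sketch(logits, targets, thresh=0.2)
score_05 = accuracy_thresh_sketch(logits, targets, thresh=0.5)
print(score_02, score_05)
```

A low threshold like 0.2 lets more labels through, which helps when several units share an image but also produces more false positives.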
With the first dataset I generated, about 2,000 images, I got some pretty bad results. The model thought that a scout cavalry was a penguin, for example. So I generated up to 10,000 images, with substantially more of them containing just a single unit. I’m guessing it had trouble with there being too many units per image, as well as with the units having different animations and facing different ways. With the expanded dataset I got much better results. Let’s see the loss:
As long as the training loss stays higher than the validation loss, we’re probably not overfitting.
Now let’s try to make a prediction. I took a picture of the berserk from the AoE2 wiki:
It does classify it as a berserk, yet it’s also pretty sure about the heavy swordsman. Let’s try some others:
The model does reasonably well, but the dataset might still be a bit too small. In the Teutonic Knight image it didn’t recognize the villager, the knight or the monks, and it wrongly predicted a legionary. The dataset used in the actual lesson was 32 GB of pictures of the Amazon rainforest, so I still have some way to go before closing in on those numbers. The next part of lesson 2 is image segmentation, probably my most anticipated lesson!
Written on July 21st, 2019 by Joakim Lönnegren