Using TensorFlow (course notes in French): http://perso.univ-lemans.fr/~berger/CoursTF/CoursTF/co/tensorFlow.html
# !unzip "drive/MyDrive/Dog Vision /dog-breed-identification.zip" -d "drive/MyDrive/Dog Vision/"
# End-to-end Multi-class Dog Breed Classification
# This notebook builds an end-to-end multi-class image classifier using TensorFlow 2.0 and TensorFlow Hub.
## 1. Problem
# Identifying the breed of a dog given an image of a dog.
# import necessary tools
import tensorflow as tf
import tensorflow_hub as hub
print("TF version:", tf.__version__)
print("TF Hub version:", hub.__version__)
TF version: 2.8.2
TF Hub version: 0.12.0
# Check for GPU availability
print("GPU", "available (YESSSS!!!!!") if tf.config.list_physical_devices("GPU") else " GPU not available"
GPU available (YESSSS!!!!!
# Running this cell will provide you with a token to link your drive to this notebook
# from google.colab import drive
# drive.mount('/content/drive')
As with all machine learning models, our data has to be in numerical format. So that's what we'll do first: turn our images into numerical representations.
## Now that the data files we're working with are available on our Google Drive, we can start to check them out.
# Check out the labels of our data
import pandas as pd
labels_csv = pd.read_csv("drive/MyDrive/labels.csv")
print(labels_csv.describe())
print(labels_csv.head())
                                     id               breed
count                             10222               10222
unique                            10222                 120
top    05beb3230462b740e5c56230eb27a7a4  scottish_deerhound
freq                                  1                 126
                                 id             breed
0  000bec180eb18c7604dcecc8fe0dba07       boston_bull
1  001513dfcb2ffafc82cccf4d8bbaba97             dingo
2  001cdf01b096e06d78e9e5112d419397          pekinese
3  00214f311d5d2247d5dfe4fe24b2303d          bluetick
4  0021f9ceb3235effd7fcde7f7538ed62  golden_retriever
labels_csv.head()
 | id | breed |
---|---|---|
0 | 000bec180eb18c7604dcecc8fe0dba07 | boston_bull |
1 | 001513dfcb2ffafc82cccf4d8bbaba97 | dingo |
2 | 001cdf01b096e06d78e9e5112d419397 | pekinese |
3 | 00214f311d5d2247d5dfe4fe24b2303d | bluetick |
4 | 0021f9ceb3235effd7fcde7f7538ed62 | golden_retriever |
Looking at this, we can see there are 10222 different IDs (meaning 10222 different images) and 120 different breeds.
Let's figure out how many images there are of each breed.
# How many images are there of each breed?
labels_csv["breed"].value_counts().plot.bar(figsize=(20, 10));
# View the exact number of images per breed
labels_csv["breed"].value_counts()
scottish_deerhound      126
maltese_dog             117
afghan_hound            116
entlebucher             115
bernese_mountain_dog    114
                       ...
golden_retriever         67
komondor                 67
brabancon_griffon        67
eskimo_dog               66
briard                   66
Name: breed, Length: 120, dtype: int64
# What's the median number of images per class?
labels_csv["breed"].value_counts().median()
82.0
# Let's view an image
from IPython.display import Image
Image("drive/MyDrive/Dog Vision/train/000bec180eb18c7604dcecc8fe0dba07.jpg")
Let's get a list of all our image file pathnames.
labels_csv.head()
 | id | breed |
---|---|---|
0 | 000bec180eb18c7604dcecc8fe0dba07 | boston_bull |
1 | 001513dfcb2ffafc82cccf4d8bbaba97 | dingo |
2 | 001cdf01b096e06d78e9e5112d419397 | pekinese |
3 | 00214f311d5d2247d5dfe4fe24b2303d | bluetick |
4 | 0021f9ceb3235effd7fcde7f7538ed62 | golden_retriever |
# Get a list of image IDs (we'll turn these into full pathnames next)
filenames = [fname for fname in labels_csv['id']]
# check the first 10
filenames[:10]
['000bec180eb18c7604dcecc8fe0dba07', '001513dfcb2ffafc82cccf4d8bbaba97', '001cdf01b096e06d78e9e5112d419397', '00214f311d5d2247d5dfe4fe24b2303d', '0021f9ceb3235effd7fcde7f7538ed62', '002211c81b498ef88e1b40b9abf84e1d', '00290d3e1fdd27226ba27a8ce248ce85', '002a283a315af96eaea0e28e7163b21b', '003df8b8a8b05244b1d920bb6cf451f9', '0042188c895a2f14ef64a918ed9c7b64']
# Create pathnames from image IDs
filenames = ["drive/MyDrive/Dog Vision/train/" + fname + ".jpg" for fname in labels_csv["id"]]
# Check the first 10 filenames
filenames[:10]
['drive/MyDrive/Dog Vision/train/000bec180eb18c7604dcecc8fe0dba07.jpg', 'drive/MyDrive/Dog Vision/train/001513dfcb2ffafc82cccf4d8bbaba97.jpg', 'drive/MyDrive/Dog Vision/train/001cdf01b096e06d78e9e5112d419397.jpg', 'drive/MyDrive/Dog Vision/train/00214f311d5d2247d5dfe4fe24b2303d.jpg', 'drive/MyDrive/Dog Vision/train/0021f9ceb3235effd7fcde7f7538ed62.jpg', 'drive/MyDrive/Dog Vision/train/002211c81b498ef88e1b40b9abf84e1d.jpg', 'drive/MyDrive/Dog Vision/train/00290d3e1fdd27226ba27a8ce248ce85.jpg', 'drive/MyDrive/Dog Vision/train/002a283a315af96eaea0e28e7163b21b.jpg', 'drive/MyDrive/Dog Vision/train/003df8b8a8b05244b1d920bb6cf451f9.jpg', 'drive/MyDrive/Dog Vision/train/0042188c895a2f14ef64a918ed9c7b64.jpg']
# Check whether number of filenames matches number of actual image files
import os
if len(os.listdir("drive/MyDrive/Dog Vision/train/")) == len(filenames):
print("Filenames match actual amount of files!!! Proceed")
else:
print("Filenames do not match actual amount of files, check the target directory.")
Filenames match actual amount of files!!! Proceed
# Check an image directly from a filepath
Image(filenames[9000])
# Which breed of dog is this?
labels_csv["breed"][9000]
'tibetan_mastiff'
labels = labels_csv["breed"]
labels
0                     boston_bull
1                           dingo
2                        pekinese
3                        bluetick
4                golden_retriever
                   ...
10217                      borzoi
10218              dandie_dinmont
10219                    airedale
10220          miniature_pinscher
10221    chesapeake_bay_retriever
Name: breed, Length: 10222, dtype: object
import numpy as np
labels = labels_csv["breed"].to_numpy() # convert labels column to NumPy array
labels[:10]
array(['boston_bull', 'dingo', 'pekinese', 'bluetick', 'golden_retriever', 'bedlington_terrier', 'bedlington_terrier', 'borzoi', 'basenji', 'scottish_deerhound'], dtype=object)
len(labels)
10222
# See if number of labels matches the number of filenames
if len(labels) == len(filenames):
print("Number of labels matches number of filenames!")
else:
print("Number of labels does not match number of filenames, check data directories.")
Number of labels matches number of filenames!
# Find the unique label values
unique_breeds = np.unique(labels)
len(unique_breeds)
120
unique_breeds
array(['affenpinscher', 'afghan_hound', 'african_hunting_dog', 'airedale', 'american_staffordshire_terrier', 'appenzeller', 'australian_terrier', 'basenji', 'basset', 'beagle', 'bedlington_terrier', 'bernese_mountain_dog', 'black-and-tan_coonhound', 'blenheim_spaniel', 'bloodhound', 'bluetick', 'border_collie', 'border_terrier', 'borzoi', 'boston_bull', 'bouvier_des_flandres', 'boxer', 'brabancon_griffon', 'briard', 'brittany_spaniel', 'bull_mastiff', 'cairn', 'cardigan', 'chesapeake_bay_retriever', 'chihuahua', 'chow', 'clumber', 'cocker_spaniel', 'collie', 'curly-coated_retriever', 'dandie_dinmont', 'dhole', 'dingo', 'doberman', 'english_foxhound', 'english_setter', 'english_springer', 'entlebucher', 'eskimo_dog', 'flat-coated_retriever', 'french_bulldog', 'german_shepherd', 'german_short-haired_pointer', 'giant_schnauzer', 'golden_retriever', 'gordon_setter', 'great_dane', 'great_pyrenees', 'greater_swiss_mountain_dog', 'groenendael', 'ibizan_hound', 'irish_setter', 'irish_terrier', 'irish_water_spaniel', 'irish_wolfhound', 'italian_greyhound', 'japanese_spaniel', 'keeshond', 'kelpie', 'kerry_blue_terrier', 'komondor', 'kuvasz', 'labrador_retriever', 'lakeland_terrier', 'leonberg', 'lhasa', 'malamute', 'malinois', 'maltese_dog', 'mexican_hairless', 'miniature_pinscher', 'miniature_poodle', 'miniature_schnauzer', 'newfoundland', 'norfolk_terrier', 'norwegian_elkhound', 'norwich_terrier', 'old_english_sheepdog', 'otterhound', 'papillon', 'pekinese', 'pembroke', 'pomeranian', 'pug', 'redbone', 'rhodesian_ridgeback', 'rottweiler', 'saint_bernard', 'saluki', 'samoyed', 'schipperke', 'scotch_terrier', 'scottish_deerhound', 'sealyham_terrier', 'shetland_sheepdog', 'shih-tzu', 'siberian_husky', 'silky_terrier', 'soft-coated_wheaten_terrier', 'staffordshire_bullterrier', 'standard_poodle', 'standard_schnauzer', 'sussex_spaniel', 'tibetan_mastiff', 'tibetan_terrier', 'toy_poodle', 'toy_terrier', 'vizsla', 'walker_hound', 'weimaraner', 'welsh_springer_spaniel', 'west_highland_white_terrier', 'whippet', 'wire-haired_fox_terrier', 'yorkshire_terrier'], dtype=object)
# Example: Turn one label into an array of booleans
print(labels[0])
labels[0] == unique_breeds # use comparison operator to create boolean array
boston_bull
array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])
len(labels)
10222
# Turn every label into a boolean array
boolean_labels = [label == np.array(unique_breeds) for label in labels]
boolean_labels[:2]
[array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False]), array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])]
# Example: Turning a boolean array into integers
print(labels[0]) # original label
print(np.where(unique_breeds == labels[0])[0][0]) # index where label occurs
print(boolean_labels[0].argmax()) # index where label occurs in boolean array
print(boolean_labels[0].astype(int)) # there will be a 1 where the sample label occurs
boston_bull
19
19
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
print(labels[2])
print(boolean_labels[2].astype(int))
pekinese
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
filenames[:10]
['drive/MyDrive/Dog Vision/train/000bec180eb18c7604dcecc8fe0dba07.jpg', 'drive/MyDrive/Dog Vision/train/001513dfcb2ffafc82cccf4d8bbaba97.jpg', 'drive/MyDrive/Dog Vision/train/001cdf01b096e06d78e9e5112d419397.jpg', 'drive/MyDrive/Dog Vision/train/00214f311d5d2247d5dfe4fe24b2303d.jpg', 'drive/MyDrive/Dog Vision/train/0021f9ceb3235effd7fcde7f7538ed62.jpg', 'drive/MyDrive/Dog Vision/train/002211c81b498ef88e1b40b9abf84e1d.jpg', 'drive/MyDrive/Dog Vision/train/00290d3e1fdd27226ba27a8ce248ce85.jpg', 'drive/MyDrive/Dog Vision/train/002a283a315af96eaea0e28e7163b21b.jpg', 'drive/MyDrive/Dog Vision/train/003df8b8a8b05244b1d920bb6cf451f9.jpg', 'drive/MyDrive/Dog Vision/train/0042188c895a2f14ef64a918ed9c7b64.jpg']
Since the dataset from Kaggle doesn't come with a validation set, we're going to create our own.
# Setup X & y variables
X = filenames
y = boolean_labels
len(filenames)
10222
We're going to start off experimenting with about 1000 images and increase as needed.
# Set number of images to use for experimenting
NUM_IMAGES = 1000 #@param {type:"slider", min:1000, max:10000, step:1000}
NUM_IMAGES
1000
Now let's split our data into training and validation sets. We'll use an 80/20 split (80% training data, 20% validation data).
# Import train_test_split from Scikit-Learn
from sklearn.model_selection import train_test_split
# Split them into training and validation using NUM_IMAGES
X_train, X_val, y_train, y_val = train_test_split(X[:NUM_IMAGES],
y[:NUM_IMAGES],
test_size=0.2, # 0.2 means 20%
random_state=42)
len(X_train), len(y_train), len(X_val), len(y_val)
(800, 800, 200, 200)
# Check out the training data (image file paths and labels)
X_train[:5], y_train[:2]
(['drive/MyDrive/Dog Vision/train/00bee065dcec471f26394855c5c2f3de.jpg', 'drive/MyDrive/Dog Vision/train/0d2f9e12a2611d911d91a339074c8154.jpg', 'drive/MyDrive/Dog Vision/train/1108e48ce3e2d7d7fb527ae6e40ab486.jpg', 'drive/MyDrive/Dog Vision/train/0dc3196b4213a2733d7f4bdcd41699d3.jpg', 'drive/MyDrive/Dog Vision/train/146fbfac6b5b1f0de83a5d0c1b473377.jpg'], [array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False]), array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])])
Our labels are in numeric format but our images are still just file paths.
Since we're using TensorFlow, our data has to be in the form of Tensors.
A Tensor is a way to represent information in numbers. If you're familiar with NumPy arrays (you should be), a Tensor can be thought of as a combination of NumPy arrays, except with the special ability to be used on a GPU.
Because of how TensorFlow stores information (in Tensors), it allows machine learning and deep learning models to be run on GPUs (generally faster at numerical computing).
To preprocess our images into Tensors we're going to write a function which does a few things:
1. Takes an image filepath as input.
2. Uses TensorFlow to read the file and save it to a variable, image.
3. Turns our image (a jpeg file) into Tensors.
4. Normalizes our image (converts the colour channel values from 0-255 to 0-1).
5. Resizes the image to be of shape (224, 224).
6. Returns the modified image.
A good place to read about this type of function is the TensorFlow documentation on loading images.
You might be wondering why (224, 224), which is (height, width). It's because this is the size of input our model takes (we'll see this soon): an image of shape (224, 224, 3).
What? Where's the 3 from? We're getting ahead of ourselves but that's the number of colour channels per pixel, red, green and blue.
# Convert image to NumPy array
from matplotlib.pyplot import imread
image = imread(filenames[42]) # read in an image
image.shape
(257, 350, 3)
Notice the shape of image. It's (257, 350, 3). This is height, width, colour channel value.
image.max(), image.min()
(255, 0)
And you can easily convert it to a Tensor using tf.constant().
image[:2]
array([[[ 89, 137, 87], [ 76, 124, 74], [ 63, 111, 59], ..., [ 76, 134, 86], [ 76, 134, 86], [ 76, 134, 86]], [[ 72, 119, 73], [ 67, 114, 68], [ 63, 111, 63], ..., [ 75, 131, 84], [ 74, 132, 84], [ 74, 131, 86]]], dtype=uint8)
Ok, now let's build that function we were talking about.
tf.constant(image)[:2]
<tf.Tensor: shape=(2, 350, 3), dtype=uint8, numpy= array([[[ 89, 137, 87], [ 76, 124, 74], [ 63, 111, 59], ..., [ 76, 134, 86], [ 76, 134, 86], [ 76, 134, 86]], [[ 72, 119, 73], [ 67, 114, 68], [ 63, 111, 63], ..., [ 75, 131, 84], [ 74, 132, 84], [ 74, 131, 86]]], dtype=uint8)>
Now we've seen what an image looks like as a Tensor, let's make a function to preprocess them.
# Define image size
IMG_SIZE = 224
def process_image(image_path, img_size=IMG_SIZE):
"""
Takes an image file path and turns it into a Tensor.
"""
# Read in image file
image = tf.io.read_file(image_path)
# Turn the jpeg image into numerical Tensor with 3 colour channels (Red, Green, Blue)
image = tf.image.decode_jpeg(image, channels=3)
# Convert the colour channel values from 0-255 to 0-1
image = tf.image.convert_image_dtype(image, tf.float32)
# Resize the image to our desired size (224, 224)
image = tf.image.resize(image, size=[img_size, img_size])
return image
Wonderful. Now that we've got a function to convert our images into Tensors, we'll build one to turn our data into batches (more specifically, a TensorFlow BatchDataset).
Why turn our data into batches?
A batch (also called mini-batch) is a small portion of your data, say 32 (32 is generally the default batch size) images and their labels. In deep learning, instead of finding patterns in an entire dataset at the same time, you often find them one batch at a time.
Let's say you're dealing with 10,000+ images (which we are). Together, these files may take up more memory than your GPU has. Trying to compute on them all would result in an error.
Instead, it's more efficient to create smaller batches of your data and compute on one batch at a time.
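To put a rough number on that memory claim, here's a quick back-of-the-envelope sketch (assuming all 10,222 images were loaded at once as float32 Tensors at our 224x224x3 target size):
# Rough memory estimate for holding every image in memory at once
num_images = 10222
bytes_per_image = 224 * 224 * 3 * 4 # float32 = 4 bytes per value
print(f"~{num_images * bytes_per_image / 1e9:.1f} GB") # ~6.2 GB before training even starts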
TensorFlow is very efficient when your data is in batches of (image, label) Tensors. So we'll build a function to create those. We'll take advantage of our process_image function at the same time.
# Create a simple function to return a tuple (image, label)
def get_image_label(image_path, label):
"""
Takes an image file path name and the associated label,
processes the image and returns a tuple of (image, label).
"""
image = process_image(image_path)
return image, label
# Demo of the above
(process_image(X[42]), tf.constant(y[42]))
(<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy= array([[[0.3264178 , 0.5222886 , 0.3232816 ], [0.2537167 , 0.44366494, 0.24117757], [0.25699762, 0.4467087 , 0.23893751], ..., [0.29325107, 0.5189916 , 0.3215547 ], [0.29721776, 0.52466875, 0.33030328], [0.2948505 , 0.5223015 , 0.33406618]], [[0.25903144, 0.4537807 , 0.27294815], [0.24375686, 0.4407019 , 0.2554778 ], [0.2838985 , 0.47213382, 0.28298813], ..., [0.2785345 , 0.5027992 , 0.31004712], [0.28428748, 0.5108719 , 0.32523635], [0.28821915, 0.5148036 , 0.32916805]], [[0.20941195, 0.40692952, 0.25792548], [0.24045378, 0.43900946, 0.2868911 ], [0.29001117, 0.47937486, 0.32247734], ..., [0.26074055, 0.48414773, 0.30125174], [0.27101526, 0.49454468, 0.32096273], [0.27939945, 0.5029289 , 0.32934693]], ..., [[0.00634795, 0.03442048, 0.0258106 ], [0.01408936, 0.04459917, 0.0301715 ], [0.01385712, 0.04856448, 0.02839671], ..., [0.4220516 , 0.39761978, 0.21622123], [0.47932503, 0.45370543, 0.2696505 ], [0.48181024, 0.45828083, 0.27004552]], [[0.00222061, 0.02262166, 0.03176915], [0.01008397, 0.03669046, 0.02473482], [0.00608852, 0.03890046, 0.01207283], ..., [0.36070833, 0.33803678, 0.16216145], [0.42499566, 0.3976801 , 0.21701711], [0.4405433 , 0.4139589 , 0.23183356]], [[0.05608025, 0.06760229, 0.10401428], [0.05441074, 0.07435255, 0.05428263], [0.04734282, 0.07581793, 0.02060942], ..., [0.3397559 , 0.31265694, 0.14725602], [0.387725 , 0.360274 , 0.18714729], [0.43941984, 0.41196886, 0.23884216]]], dtype=float32)>, <tf.Tensor: shape=(120,), dtype=bool, numpy= array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])>)
Now we've got a simple function to turn our image file path names and their associated labels into tuples (we can turn these into Tensors next), we'll create a function to make data batches.
Because we'll be dealing with 3 different sets of data (training, validation and test), we'll make sure the function can accommodate each set.
We'll set a default batch size of 32 because according to Yann LeCun (one of the OGs of deep learning), friends don't let friends train with batch sizes over 32.
# Define the batch size, 32 is a good default
BATCH_SIZE = 32
# Create a function to turn data into batches
def create_data_batches(x, y=None, batch_size=BATCH_SIZE, valid_data=False, test_data=False):
"""
Creates batches of data out of image (x) and label (y) pairs.
Shuffles the data if it's training data but doesn't shuffle it if it's validation data.
Also accepts test data as input (no labels).
"""
# If the data is a test dataset, we probably don't have labels
if test_data:
print("Creating test data batches...")
data = tf.data.Dataset.from_tensor_slices(tf.constant(x)) # only filepaths, no labels
data_batch = data.map(process_image).batch(batch_size)
return data_batch
# If the data is a validation dataset, we don't need to shuffle it
elif valid_data:
print("Creating validation data batches...")
data = tf.data.Dataset.from_tensor_slices((tf.constant(x), # filepaths
tf.constant(y))) # labels
data_batch = data.map(get_image_label).batch(batch_size)
return data_batch
else:
# If the data is a training dataset, we shuffle it
print("Creating training data batches...")
# Turn filepaths and labels into Tensors
data = tf.data.Dataset.from_tensor_slices((tf.constant(x), # filepaths
tf.constant(y))) # labels
# Shuffling pathnames and labels before mapping image processor function is faster than shuffling images
data = data.shuffle(buffer_size=len(x))
# Create (image, label) tuples (this also turns the image path into a preprocessed image)
data = data.map(get_image_label)
# Turn the data into batches
data_batch = data.batch(batch_size)
return data_batch
# Create training and validation data batches
train_data = create_data_batches(X_train, y_train)
val_data = create_data_batches(X_val, y_val, valid_data=True)
Creating training data batches...
Creating validation data batches...
# Check out the different attributes of our data batches
train_data.element_spec, val_data.element_spec
((TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 120), dtype=tf.bool, name=None)), (TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 120), dtype=tf.bool, name=None)))
Look at that! We've got our data in batches, more specifically, they're in Tensor pairs of (images, labels) ready for use on a GPU.
But having our data in batches can be a bit of a hard concept to understand. Let's build a function which helps us visualize what's going on under the hood.
Batch processing is an efficient way of handling large volumes of data, where a group of transactions is collected over a period of time. The data is collected, entered and processed, and then the batch results are produced (Hadoop focuses on batch data processing). Batch processing requires separate programs for input, processing and output; payroll and billing systems are classic examples.
By contrast, real-time data processing involves continuous input, processing and output of data. The data has to be processed within a short period of time (in near real time); radar systems, customer service systems and bank ATMs are examples.
import matplotlib.pyplot as plt
# Create a function for viewing images in a data batch
def show_25_images(images, labels):
"""
Displays 25 images from a data batch.
"""
# Setup the figure
plt.figure(figsize=(10, 10))
# Loop through 25 (for displaying 25 images)
for i in range(25):
# Create subplots (5 rows, 5 columns)
ax = plt.subplot(5, 5, i+1)
# Display an image
plt.imshow(images[i])
# Add the image label as the title
plt.title(unique_breeds[labels[i].argmax()])
# Turn grid lines off
plt.axis("off")
unique_breeds[y[0].argmax()]
'boston_bull'
To make computation efficient, a batch is a tightly wound collection of Tensors.
So to view data in a batch, we've got to unwind it.
We can do so by calling the as_numpy_iterator() method on a data batch.
This will turn our data batch into something which can be iterated over.
Passing an iterable to next() will return the next item in the iterator.
In our case, next will return a batch of 32 images and label pairs.
Note: Running the cell below and loading images may take a little while.
train_data
<BatchDataset shapes: ((None, 224, 224, 3), (None, 120)), types: (tf.float32, tf.bool)>
# Visualize training images from the training data batch
train_images, train_labels = next(train_data.as_numpy_iterator())
show_25_images(train_images, train_labels)
Look at all those beautiful dogs!
# Visualize validation images from the validation data batch
val_images, val_labels = next(val_data.as_numpy_iterator())
show_25_images(val_images, val_labels)
Even more dogs!
Question: Why does running the cell above and viewing validation images return the same dogs each time? (One way to check is sketched below.)
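A quick check that hints at the answer (this cell isn't part of the original workflow): create_data_batches() only shuffles training data, and each call to as_numpy_iterator() starts a fresh pass over the unshuffled validation dataset, so the first batch it yields is identical every time.
# Two fresh passes over val_data yield the same first batch
first_images, _ = next(val_data.as_numpy_iterator())
second_images, _ = next(val_data.as_numpy_iterator())
print(np.allclose(first_images, second_images)) # True: same dogs every time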
Now our data is ready, let's prepare it for modelling. We'll use an existing model from TensorFlow Hub.
TensorFlow Hub is a resource where you can find pretrained machine learning models for the problem you're working on.
Using a pretrained machine learning model is often referred to as transfer learning.
Why use a pretrained model?
Building a machine learning model and training it from scratch on lots of data can be expensive and time consuming.
Transfer learning helps alleviate some of these costs by taking what another model has learned and using that information with your own problem.
Since we know our problem is image classification (classifying different dog breeds), we can navigate the TensorFlow Hub page by our problem domain (image).
We start by choosing the image problem domain, and then can filter it down by subdomains, in our case, image classification.
Doing this gives a list of different pretrained models we can apply to our task.
Clicking on one gives us information about the model as well as instructions for using it.
For example, clicking on the mobilenet_v2_130_224 model tells us this model takes an input of images of shape (224, 224). It also says the model has been trained in the domain of image classification.
Let's try it out.
Before we build a model, there are a few things we need to define:
- The input shape (the shape of our image Tensors) our model expects.
- The output shape (the number of image labels) our model produces.
- The URL of the model we want to use from TensorFlow Hub.
https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4
# Setup input shape to the model
INPUT_SHAPE = [None, IMG_SIZE, IMG_SIZE, 3] # batch, height, width, colour channels
# Setup output shape of the model
OUTPUT_SHAPE = len(unique_breeds) # number of unique labels
# Setup model URL from TensorFlow Hub
MODEL_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4"
Now we've got the inputs, outputs and model we're using ready to go. We can start to put them together.
There are many ways of building a model in TensorFlow but one of the best ways to get started is to use the Keras API.
Defining a deep learning model in Keras can be as straightforward as saying, "here are the layers of the model, the input shape and the output shape, let's go!"
Knowing this, let's create a function which:
- Takes the input shape, output shape and the model we've chosen as parameters.
- Defines the layers in a Keras model in a sequential fashion.
- Compiles the model (tells it how it should be evaluated and improved).
- Builds the model (tells it what input shape it'll be getting).
- Returns the model.
All of these steps can be found here: https://www.tensorflow.org/guide/keras/overview
## We'll take a look at the code first, then discuss each part.
# Create a function which builds a Keras model
def create_model(input_shape=INPUT_SHAPE, output_shape=OUTPUT_SHAPE, model_url=MODEL_URL):
print("Building model with:", MODEL_URL)
# Setup the model layers
model = tf.keras.Sequential([
hub.KerasLayer(model_url), # Layer 1 (input layer)
tf.keras.layers.Dense(units=output_shape,
activation="softmax") # Layer 2 (output layer)
])
# Compile the model
model.compile(
loss=tf.keras.losses.CategoricalCrossentropy(), # Our model wants to reduce this (how wrong its guesses are)
optimizer=tf.keras.optimizers.Adam(), # A friend telling our model how to improve its guesses
metrics=["accuracy"] # We'd like this to go up
)
# Build the model
model.build(input_shape) # Let the model know what kind of inputs it'll be getting
return model
What's happening here?
There are two ways to do this in Keras, the functional and sequential API. We've used the sequential.
Which one should you use?
The Keras documentation states the functional API is the way to go for defining complex models but the sequential API (a linear stack of layers) is perfectly fine for getting started, which is what we're doing.
The first layer we use is the model from TensorFlow Hub (hub.KerasLayer(MODEL_URL)). So our first layer is actually an entire model (many more layers). This input layer takes in our images and finds patterns in them based on the patterns mobilenet_v2_130_224 has found.
The next layer (tf.keras.layers.Dense()) is the output layer of our model. It brings all of the information discovered in the input layer together and outputs it in the shape we're after, 120 (the number of unique labels we have).
The activation="softmax" parameter tells the output layer we'd like to assign a probability value to each of the 120 labels, somewhere between 0 & 1. The higher the value, the more the model believes the input image should have that label. If we were working on a binary classification problem, we'd use activation="sigmoid".
For more on which activation function to use, see the article Which Loss and Activation Functions Should I Use?
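As a quick standalone illustration (the logits below are made-up numbers, not outputs from our model), softmax squashes any vector of scores into values between 0 and 1 that sum to 1:
# Softmax turns arbitrary scores (logits) into a probability distribution
example_logits = tf.constant([2.0, 1.0, 0.1])
example_probs = tf.nn.softmax(example_logits)
print(example_probs.numpy()) # ~[0.659 0.242 0.099]
print(example_probs.numpy().sum()) # ~1.0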
This one is best explained with a story.
Let's say you're at the international hill descending championships. You start standing on top of a hill and your goal is to get to the bottom of the hill. The catch is you're blindfolded.
Luckily, your friend Adam is standing at the bottom of the hill shouting instructions on how to get down.
At the bottom of the hill there's a judge evaluating how you're doing. They know where you need to end up so they compare how you're doing to where you're supposed to be. Their comparison is how you get scored.
Transferring this to model.compile() terminology:
- The loss function is the height of the hill: a measure of how wrong the model's guesses are (we want to minimise it).
- The optimizer is your friend Adam at the bottom of the hill, telling the model how to improve its guesses.
- The metrics are the judge's scores, reporting how the model is doing.
We use model.build() whenever we're using a layer from TensorFlow Hub to tell our model what input shape it can expect.
In this case, the input shape is [None, IMG_SIZE, IMG_SIZE, 3] or [None, 224, 224, 3] or [batch_size, img_height, img_width, color_channels].
Batch size is left as None as this is inferred from the data we pass the model. In our case, it'll be 32 since that's what we've set up our data batches as.
Now we've gone through each section of the function, let's use it to create a model.
We can call summary() on our model to get an idea of what our model looks like.
# Create a model and check its details
model = create_model()
model.summary()
Building model with: https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
keras_layer (KerasLayer)     (None, 1001)              5432713
_________________________________________________________________
dense (Dense)                (None, 120)               120240
=================================================================
Total params: 5,552,953
Trainable params: 120,240
Non-trainable params: 5,432,713
_________________________________________________________________
The non-trainable parameters are the patterns learned by mobilenet_v2_130_224 and the trainable parameters are the ones in the dense layer we added.
This means the main bulk of the information in our model has already been learned and we're going to take that and adapt it to our own problem.
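We can sanity-check the trainable parameter count by hand: the hub layer outputs a 1001-dimensional vector, and our dense layer learns one weight per input for each of its 120 units, plus one bias per unit.
# 1001 inputs x 120 outputs (weights) + 120 biases
dense_params = 1001 * 120 + 120
print(dense_params) # 120240, matching the summary above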
We've got a model ready to go but before we train it we'll make some callbacks.
Callbacks are helper functions a model can use during training to do things such as save a model's progress, check a model's progress or stop training early if a model stops improving.
The two callbacks we're going to add are a TensorBoard callback and an Early Stopping callback.
TensorBoard helps provide a visual way to monitor the progress of your model during and after training.
It can be used directly in a notebook to track the performance measures of a model such as loss and accuracy.
To set up a TensorBoard callback and view TensorBoard in a notebook, we need to do three things:
1. Load the TensorBoard notebook extension.
2. Create a TensorBoard callback which saves logs to a directory, and pass it to our model's fit() function.
3. Visualize our model's training logs with the %tensorboard magic function (we'll do this after model training).
# Load the TensorBoard notebook extension
%load_ext tensorboard
import datetime
# Create a function to build a TensorBoard callback
def create_tensorboard_callback():
# Create a log directory for storing TensorBoard logs
logdir = os.path.join("drive/MyDrive/Dog Vision/logs",
# Make it so the logs get tracked whenever we run an experiment
datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
return tf.keras.callbacks.TensorBoard(logdir)
https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping
Early stopping helps prevent overfitting by stopping a model when a certain evaluation metric stops improving. If a model trains for too long, it can do so well at finding patterns in a certain dataset that it's not able to use those patterns on another dataset it hasn't seen before (doesn't generalize).
It's basically like saying to our model, "keep finding patterns until the quality of those patterns starts to go down."
# Create early stopping (once our model stops improving, stop training)
early_stopping = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy",
patience=3) # stops after 3 rounds of no improvements
Our first model is only going to be trained on 1000 images: trained on 800 images and validated on 200 images, meaning 1000 images total, or about 10% of the full dataset.
We do this to make sure everything is working. And if it is, we can step it up later and train on the entire training dataset.
The final parameter we'll define before training is NUM_EPOCHS (also known as number of epochs).
NUM_EPOCHS defines how many passes of the data we'd like our model to do. A pass is equivalent to our model trying to find patterns in each dog image and see which patterns relate to each label.
If NUM_EPOCHS=1, the model will only look at the data once and will probably score badly because it hasn't had a chance to correct itself. It would be like you competing in the international hill descent championships and your friend Adam only being able to give you 1 single instruction to get down the hill.
What's a good value for NUM_EPOCHS?
This one is hard to say. 10 could be a good start but so could 100. This is one of the reasons we created an early stopping callback. Having early stopping set up means if we set NUM_EPOCHS to 100 but our model stops improving after 22 epochs, it'll stop training.
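To make the idea of an epoch concrete: one epoch is one full pass over the training data, so with 800 training images and a batch size of 32, each epoch works out to 25 steps (this matches the 25/25 shown in the training output further down).
# Steps per epoch = number of training images / batch size
steps_per_epoch = len(X_train) // BATCH_SIZE # 800 // 32
print(steps_per_epoch) # 25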
Along with this, let's quickly check if we're still using a GPU.
# Check again if GPU is available (otherwise computing will take a looooonnnnggggg time)
print("GPU", "available (YESSSSSSSS!!!!)" if tf.config.list_physical_devices("GPU") else "not available :(")
GPU available (YESSSSSSSS!!!!)
# How many rounds should we get the model to look through the data?
NUM_EPOCHS = 100 #@param {type:"slider", min:10, max:100, step:10}
Boom! We've got a GPU running and NUM_EPOCHS set up. Let's create a simple function which trains a model. The function will:
- Create a model using create_model().
- Set up a TensorBoard callback using create_tensorboard_callback().
- Call the fit() function on our model, passing it the training data, validation data, number of epochs to train for and the callbacks we'd like to use.
- Return the fitted model.
# Build a function to train and return a trained model
def train_model():
"""
Trains a given model and returns the trained version.
"""
# Create a model
model = create_model()
# Create a new TensorBoard session every time we train a model
tensorboard = create_tensorboard_callback()
# Fit the model to the data passing it the callbacks we created
model.fit(x=train_data,
epochs=NUM_EPOCHS,
validation_data=val_data,
validation_freq=1, # check validation metrics every epoch
callbacks=[tensorboard, early_stopping])
return model
Note: When training a model for the first time, the first epoch will take a while to load compared to the rest. This is because the model is getting ready and the data is being initialised. Using more data will generally take longer, which is why we've started with ~1000 images. After the first epoch, subsequent epochs should take a few seconds.
# Fit the model to the data
model = train_model()
Building model with: https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4
Epoch 1/100
25/25 [==============================] - 327s 13s/step - loss: 5.0637 - accuracy: 0.0428 - val_loss: 3.3795 - val_accuracy: 0.2800
Epoch 2/100
25/25 [==============================] - 4s 160ms/step - loss: 1.8686 - accuracy: 0.6474 - val_loss: 2.1698 - val_accuracy: 0.4900
Epoch 3/100
25/25 [==============================] - 4s 159ms/step - loss: 0.5774 - accuracy: 0.9565 - val_loss: 1.6924 - val_accuracy: 0.5400
Epoch 4/100
25/25 [==============================] - 4s 159ms/step - loss: 0.2593 - accuracy: 0.9809 - val_loss: 1.4848 - val_accuracy: 0.5950
Epoch 5/100
25/25 [==============================] - 4s 157ms/step - loss: 0.1456 - accuracy: 0.9999 - val_loss: 1.4178 - val_accuracy: 0.6100
Epoch 6/100
25/25 [==============================] - 4s 157ms/step - loss: 0.0990 - accuracy: 0.9989 - val_loss: 1.3732 - val_accuracy: 0.6200
Epoch 7/100
25/25 [==============================] - 4s 161ms/step - loss: 0.0740 - accuracy: 1.0000 - val_loss: 1.3470 - val_accuracy: 0.6200
Epoch 8/100
25/25 [==============================] - 4s 161ms/step - loss: 0.0597 - accuracy: 1.0000 - val_loss: 1.3261 - val_accuracy: 0.6300
Epoch 9/100
25/25 [==============================] - 4s 161ms/step - loss: 0.0490 - accuracy: 1.0000 - val_loss: 1.3060 - val_accuracy: 0.6500
Epoch 10/100
25/25 [==============================] - 4s 159ms/step - loss: 0.0414 - accuracy: 1.0000 - val_loss: 1.2912 - val_accuracy: 0.6600
Epoch 11/100
25/25 [==============================] - 4s 161ms/step - loss: 0.0354 - accuracy: 1.0000 - val_loss: 1.2808 - val_accuracy: 0.6500
Epoch 12/100
25/25 [==============================] - 4s 166ms/step - loss: 0.0308 - accuracy: 1.0000 - val_loss: 1.2737 - val_accuracy: 0.6550
Epoch 13/100
25/25 [==============================] - 4s 163ms/step - loss: 0.0277 - accuracy: 1.0000 - val_loss: 1.2628 - val_accuracy: 0.6600
Question: It looks like our model might be overfitting (getting far better results on the training set than the validation set). What are some ways to prevent model overfitting? Hint: this may involve searching something like "ways to prevent overfitting in a deep learning model".
Note: Overfitting to begin with is a good thing. It means our model is learning something.
Now our model has been trained, we can make its performance visual by checking the TensorBoard logs.
The TensorBoard magic function (%tensorboard) will access the logs directory we created earlier and visualize its contents.
%tensorboard --logdir drive/MyDrive/Dog\ Vision/logs
Thanks to our early_stopping callback, the model stopped training early (after a dozen or so epochs in my case; yours might be slightly different). This is because the validation accuracy failed to improve for 3 epochs.
But the good news is, we can definitely see our model is learning something. The validation accuracy got to around 65% in only a few minutes.
This means, if we were to scale up the number of images, hopefully we'd see the accuracy increase.
Before we scale up and train on more data, let's see some other ways we can evaluate our model. Because although accuracy is a pretty good indicator of how our model is doing, it would be even better if we could see it in action.
Making predictions with a trained model is as easy as calling predict() on it and passing it data in the same format the model was trained on.
val_data
<BatchDataset shapes: ((None, 224, 224, 3), (None, 120)), types: (tf.float32, tf.bool)>
# Make predictions on the validation data (not used to train on)
predictions = model.predict(val_data, verbose=1) # verbose shows us how long there is to go
predictions
7/7 [==============================] - 1s 108ms/step
array([[0.00909202, 0.00399168, 0.00763268, ..., 0.00131403, 0.00528854, 0.00284021], [0.00106611, 0.00093494, 0.00089609, ..., 0.0029972 , 0.00358844, 0.00169737], [0.00075154, 0.00462682, 0.00026993, ..., 0.01285275, 0.0020741 , 0.00589033], ..., [0.0032309 , 0.00312911, 0.00483632, ..., 0.00036114, 0.02114588, 0.00185815], [0.00437818, 0.00295467, 0.00232756, ..., 0.00301398, 0.01513689, 0.00617788], [0.00194496, 0.0018128 , 0.00060322, ..., 0.00239779, 0.00498244, 0.00348416]], dtype=float32)
np.sum(predictions[0])
0.9999999
np.sum(predictions[1])
1.0
# Check the shape of predictions
predictions.shape
(200, 120)
len(y_val)
200
len(unique_breeds)
120
This means we've got an array of shape (200, 120): one row of 120 prediction probabilities for each of the 200 validation images.
Making predictions with our model returns an array with a different value for each label.
In this case, making predictions on the validation data (200 images) returns an array (predictions) of arrays, each containing 120 different values (one for each unique dog breed).
These different values are the probabilities or the likelihood the model has predicted a certain image being a certain breed of dog. The higher the value, the more likely the model thinks a given image is a specific breed of dog.
Let's see how we'd convert an array of probabilities into an actual label.
# First prediction
index = 0
print(predictions[index])
print(f"Max value (probability of prediction): {np.max(predictions[index])}") # the max probability value predicted by the model
print(f"Sum: {np.sum(predictions[index])}") # because we used softmax activation in our model, this will be close to 1
print(f"Max index: {np.argmax(predictions[index])}") # the index of where the max value in predictions[0] occurs
print(f"Predicted label: {unique_breeds[np.argmax(predictions[index])]}") # the predicted label
[0.00909202 0.00399168 0.00763268 0.00547596 0.00360195 0.00308712 0.00655701 0.02783835 0.04636196 0.00163443 0.02721482 0.00172934 0.00081654 0.02432057 0.00143126 0.00296814 0.00675685 0.00127035 0.00023899 0.00035505 0.00620183 0.02435892 0.01171977 0.00091398 0.00733749 0.00192161 0.00015451 0.0062303 0.0007502 0.005833 0.00059962 0.00458026 0.00922254 0.0109583 0.00363489 0.00616283 0.00481876 0.00115611 0.00596298 0.005562 0.00867489 0.00273058 0.00782177 0.00544046 0.01745908 0.00066087 0.00190487 0.0040133 0.00045553 0.00194388 0.01497252 0.0044313 0.0011021 0.00089189 0.00031274 0.00783574 0.00235983 0.00258014 0.00031688 0.01287452 0.00898917 0.00340211 0.00414114 0.01055263 0.00147633 0.00223399 0.00186825 0.00224116 0.00302522 0.02355317 0.00141562 0.02688068 0.00497073 0.00217645 0.01817009 0.00069496 0.00277378 0.01038027 0.01406768 0.04485447 0.00890487 0.00667025 0.00568641 0.00047745 0.00274653 0.00105361 0.00402492 0.00517609 0.00168408 0.00227396 0.00179296 0.01243708 0.00832966 0.01907983 0.00637868 0.02370819 0.00586851 0.00946544 0.01048381 0.00030533 0.02710809 0.01161164 0.00960138 0.00068389 0.00422603 0.00084843 0.00165466 0.00088037 0.00334331 0.00070115 0.00412377 0.06495062 0.00343722 0.11114166 0.00099272 0.00103404 0.02056473 0.00131403 0.00528854 0.00284021]
Max value (probability of prediction): 0.11114165931940079
Sum: 0.9999998807907104
Max index: 113
Predicted label: walker_hound
unique_breeds[0]
'affenpinscher'
The max value is the probability of the top prediction. Remember, the outputs of a softmax function are between 0 and 1 and sum to 1, so the highest a single value can be is 1. Here, the model's top prediction (walker_hound) has a probability of around 11%.
# Another prediction
index = 42
print(predictions[index])
print(f"Max value (probability of prediction): {np.max(predictions[index])}") # the max probability value predicted by the model
print(f"Sum: {np.sum(predictions[index])}") # because we used softmax activation in our model, this will be close to 1
print(f"Max index: {np.argmax(predictions[index])}") # the index of where the max value in predictions[0] occurs
print(f"Predicted label: {unique_breeds[np.argmax(predictions[index])]}") # the predicted label
[0.00181275 0.00184801 0.00142292 0.0101094 0.00904594 0.00035731 0.001898 0.00187211 0.00053704 0.00663512 0.0023174 0.01757792 0.00276312 0.0026783 0.00676418 0.02983673 0.00478711 0.00055183 0.00094433 0.0008553 0.00303893 0.00478842 0.00135996 0.00765211 0.00436202 0.01908158 0.04374519 0.0002893 0.00303924 0.04326033 0.00113263 0.02623418 0.02781493 0.01186815 0.00460918 0.0109467 0.00150619 0.00538766 0.00986879 0.00264446 0.00239379 0.01859355 0.00204275 0.00825957 0.00359756 0.00301052 0.01180106 0.00392077 0.01020842 0.00070981 0.00204326 0.00109509 0.00244695 0.00347855 0.00834198 0.00118122 0.01213204 0.0081569 0.00691161 0.00191381 0.01017956 0.00926101 0.01782845 0.00226443 0.0016387 0.01333259 0.00153065 0.00031276 0.01472326 0.03672179 0.00475213 0.0049333 0.00435892 0.02290073 0.00836107 0.00297264 0.00812542 0.00433936 0.03011041 0.00425418 0.00341654 0.07128713 0.00072913 0.00844184 0.00167228 0.00576352 0.00113436 0.00094226 0.00886855 0.03384679 0.00237555 0.00601533 0.00021029 0.00136014 0.00150828 0.02214428 0.00607917 0.00381435 0.00909107 0.00161977 0.04109683 0.00259781 0.00639656 0.00242203 0.02024747 0.00340358 0.00715457 0.00061821 0.00482435 0.00468546 0.00179711 0.00340788 0.01052035 0.01010794 0.00133722 0.00146533 0.01551325 0.00168521 0.0055997 0.01241318]
Max value (probability of prediction): 0.0712871327996254
Sum: 1.0
Max index: 81
Predicted label: norwich_terrier
unique_breeds[113]
'walker_hound'
One way of using these outputs is to only accept predictions with a confidence value over 0.75. Every prediction whose max value is under 0.75 gets set aside, perhaps for a human classifier to review, while everything over 0.75 the model classifies itself. A minimal sketch of that idea follows.
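Here's that thresholding sketch (the 0.75 cutoff is an arbitrary example, not something tuned for this problem):
# Keep predictions the model is confident about; flag the rest for review
confidences = np.max(predictions, axis=1) # top probability per image
confident = confidences >= 0.75 # arbitrary example threshold
print(f"Model classifies: {confident.sum()}, for human review: {(~confident).sum()}")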
Having this information is great but it would be even better if we could compare a prediction to its true label and original image.
To help us, let's first build a little function to convert prediction probabilities into predicted labels.
Note: Prediction probabilities are also known as confidence levels.
# Turn prediction probabilities into their respective label (easier to understand)
def get_pred_label(prediction_probabilities):
"""
Turns an array of prediction probabilities into a label.
"""
return unique_breeds[np.argmax(prediction_probabilities)]
# Get a predicted label based on an array of prediction probabilities
pred_label = get_pred_label(predictions[74])
pred_label
'chihuahua'
Wonderful! Now we've got a list of all the different predictions our model has made, we'll do the same for the validation images and validation labels.
Remember, the model hasn't trained on the validation data; during the fit() function it only used the validation data to evaluate itself. So we can use the validation images to visually compare our model's predictions with the validation labels.
Since our validation data (val_data) is in batch form, to get a list of validation images and labels, we'll have to unbatch it (using unbatch()) and then turn it into an iterator using as_numpy_iterator().
Let's make a small function to do so.
val_data
<BatchDataset shapes: ((None, 224, 224, 3), (None, 120)), types: (tf.float32, tf.bool)>
# Create a function to unbatch a batched dataset
def unbatchify(data):
"""
Takes a batched dataset of (image, label) Tensors and returns separate arrays
of images and labels.
"""
images = []
labels = []
# Loop through unbatched data
for image, label in data.unbatch().as_numpy_iterator():
images.append(image)
labels.append(unique_breeds[np.argmax(label)])
return images, labels
# Unbatchify the validation data
val_images, val_labels = unbatchify(val_data)
val_images[0], val_labels[0]
(array([[[0.29599646, 0.43284872, 0.3056691 ], [0.26635826, 0.32996926, 0.22846507], [0.31428418, 0.2770141 , 0.22934894], ..., [0.77614343, 0.82320225, 0.8101595 ], [0.81291157, 0.8285351 , 0.8406944 ], [0.8209297 , 0.8263737 , 0.8423668 ]], [[0.2344871 , 0.31603682, 0.19543913], [0.3414841 , 0.36560842, 0.27241898], [0.45016077, 0.40117094, 0.33964607], ..., [0.7663987 , 0.8134138 , 0.81350833], [0.7304248 , 0.75012016, 0.76590735], [0.74518913, 0.76002574, 0.7830809 ]], [[0.30157745, 0.3082587 , 0.21018331], [0.2905954 , 0.27066195, 0.18401104], [0.4138316 , 0.36170745, 0.2964005 ], ..., [0.79871625, 0.8418535 , 0.8606443 ], [0.7957738 , 0.82859945, 0.8605655 ], [0.75181633, 0.77904975, 0.8155256 ]], ..., [[0.9746779 , 0.9878955 , 0.9342279 ], [0.99153054, 0.99772066, 0.9427856 ], [0.98925114, 0.9792082 , 0.9137934 ], ..., [0.0987601 , 0.0987601 , 0.0987601 ], [0.05703771, 0.05703771, 0.05703771], [0.03600177, 0.03600177, 0.03600177]], [[0.98197854, 0.9820659 , 0.9379411 ], [0.9811992 , 0.97015417, 0.9125648 ], [0.9722316 , 0.93666023, 0.8697186 ], ..., [0.09682598, 0.09682598, 0.09682598], [0.07196062, 0.07196062, 0.07196062], [0.0361607 , 0.0361607 , 0.0361607 ]], [[0.97279435, 0.9545954 , 0.92389745], [0.963602 , 0.93199134, 0.88407487], [0.9627158 , 0.9125331 , 0.8460338 ], ..., [0.08394483, 0.08394483, 0.08394483], [0.0886985 , 0.0886985 , 0.0886985 ], [0.04514172, 0.04514172, 0.04514172]]], dtype=float32), 'cairn')
Nailed it!
More specifically, we want to be able to view an image, its predicted label and its actual label (true label).
The first function we'll create will:
- Take an array of prediction probabilities, an array of truth labels, an array of images and an integer n.
- Convert the prediction probabilities into a predicted label.
- Plot the predicted label, its predicted probability, the truth label and the target image on a single plot.
def plot_pred(prediction_probabilities, labels, images, n=1):
"""
View the prediction, ground truth label and image for sample n.
"""
pred_prob, true_label, image = prediction_probabilities[n], labels[n], images[n]
# Get the pred label
pred_label = get_pred_label(pred_prob)
# Plot image & remove ticks
plt.imshow(image)
plt.xticks([])
plt.yticks([])
# Change the color of the title depending on if the prediction is right or wrong
if pred_label == true_label:
color = "green"
else:
color = "red"
# Change plot title to be predicted, probability of prediction and truth label
plt.title("{} {:2.0f}% ({})".format(pred_label,
np.max(pred_prob)*100,
true_label),
color=color)
# View an example prediction, original image and truth label
plot_pred(prediction_probabilities=predictions,
labels=val_labels,
images=val_images)
# View an example prediction, original image and truth label
plot_pred(prediction_probabilities=predictions,
labels=val_labels,
images=val_images,
n=77)
Nice! Making functions to help visualize your model's results is really helpful in understanding how your model is doing.
Since we're working with a multi-class problem (120 different dog breeds), it would also be good to see what other guesses our model is making. More specifically, if our model predicts a certain label with 24% probability, what else did it predict?
Let's build a function to demonstrate. The function will:
- Take an input of prediction probabilities, an array of ground truth labels and an integer n.
- Find the predicted label using get_pred_label().
- Find the top 10 prediction probability indexes, values and labels.
- Plot them on a bar chart, colouring the bar green if the truth label appears in the top 10.
def plot_pred_conf(prediction_probabilities, labels, n=1):
"""
Plots the top 10 highest prediction confidences along with
the truth label for sample n.
"""
pred_prob, true_label = prediction_probabilities[n], labels[n]
# Get the predicted label
pred_label = get_pred_label(pred_prob)
# Find the top 10 prediction confidence indexes
top_10_pred_indexes = pred_prob.argsort()[-10:][::-1]
# Find the top 10 prediction confidence values
top_10_pred_values = pred_prob[top_10_pred_indexes]
# Find the top 10 prediction labels
top_10_pred_labels = unique_breeds[top_10_pred_indexes]
# Setup plot
top_plot = plt.bar(np.arange(len(top_10_pred_labels)),
top_10_pred_values,
color="grey")
plt.xticks(np.arange(len(top_10_pred_labels)),
labels=top_10_pred_labels,
rotation="vertical")
# Change color of true label
if np.isin(true_label, top_10_pred_labels):
top_plot[np.argmax(top_10_pred_labels == true_label)].set_color("green")
plot_pred_conf(prediction_probabilities=predictions,
labels=val_labels,
n=1)
Wonderful! Now we've got some functions to help us visualize our predictions and evaluate our model, let's check out a few.
# Let's check a few predictions and their different values
i_multiplier = 20
num_rows = 3
num_cols = 2
num_images = num_rows*num_cols
plt.figure(figsize=(5*2*num_cols, 5*num_rows))
for i in range(num_images):
plt.subplot(num_rows, 2*num_cols, 2*i+1)
plot_pred(prediction_probabilities=predictions,
labels=val_labels,
images=val_images,
n=i+i_multiplier)
plt.subplot(num_rows, 2*num_cols, 2*i+2)
plot_pred_conf(prediction_probabilities=predictions,
labels=val_labels,
n=i+i_multiplier)
plt.tight_layout(h_pad=1.0)
plt.show()
After training a model, it's a good idea to save it. Saving it means you can share it with colleagues, put it in an application and more importantly, won't have to go through the potentially expensive step of retraining it.
One format for saving an entire Keras model is HDF5 (h5). So we'll make a function which takes a model as input and uses the save() method to save it as an h5 file to a specified directory.
def save_model(model, suffix=None):
"""
Saves a given model in a models directory and appends a suffix (str)
for clarity and reuse.
"""
# Create model directory with current time
modeldir = os.path.join("drive/MyDrive/Dog Vision/models",
datetime.datetime.now().strftime("%Y%m%d-%H%M%s"))
model_path = modeldir + "-" + suffix + ".h5" # save format of model
print(f"Saving model to: {model_path}...")
model.save(model_path)
return model_path
If we've got a saved model, we'd like to be able to load it. Let's create a function which can take a model path and use the tf.keras.models.load_model() function to load it into the notebook.
Because we're using a component from TensorFlow Hub (hub.KerasLayer), we'll have to let Keras know about it via the custom_objects parameter.
def load_model(model_path):
    """
    Loads a saved model from a specified path.
    """
    print(f"Loading saved model from: {model_path}")
    model = tf.keras.models.load_model(model_path,
                                       custom_objects={"KerasLayer": hub.KerasLayer})
    return model
# Save our model trained on 1000 images
save_model(model, suffix="1000-images-Adam")
Saving model to: drive/MyDrive/Dog Vision/models/20201225-13371608903425-1000-images-Adam.h5...
'drive/MyDrive/Dog Vision/models/20201225-13371608903425-1000-images-Adam.h5'
# Load our model trained on 1000 images
model_1000_images = load_model('drive/MyDrive/models/20201225-13261608902806-1000-images-Adam.h5')
Loading saved model from: drive/MyDrive/models/20201225-13261608902806-1000-images-Adam.h5
Now let's compare the two models (the original one and the loaded one). We can do so easily using the evaluate() method.
# Evaluate the pre-saved model
model.evaluate(val_data)
7/7 [==============================] - 1s 103ms/step - loss: 5.6264 - accuracy: 0.0000e+00
[5.553521156311035, 0.0]
# Evaluate the loaded model
model_1000_images.evaluate(val_data)
7/7 [==============================] - 1s 104ms/step - loss: 5.5535 - accuracy: 0.0000e+00
[5.553521156311035, 0.0]
Now that we know our model works on a subset of the data, we can start moving forward with training one on the full dataset.
Above, we saved all of the training filepaths to X and all of the training labels to y. Let's check them out.
# Remind ourselves of the size of the full dataset
len(X), len(y)
(10222, 10222)
X[:10]
['drive/MyDrive/Dog Vision/train/000bec180eb18c7604dcecc8fe0dba07.jpg', 'drive/MyDrive/Dog Vision/train/001513dfcb2ffafc82cccf4d8bbaba97.jpg', 'drive/MyDrive/Dog Vision/train/001cdf01b096e06d78e9e5112d419397.jpg', 'drive/MyDrive/Dog Vision/train/00214f311d5d2247d5dfe4fe24b2303d.jpg', 'drive/MyDrive/Dog Vision/train/0021f9ceb3235effd7fcde7f7538ed62.jpg', 'drive/MyDrive/Dog Vision/train/002211c81b498ef88e1b40b9abf84e1d.jpg', 'drive/MyDrive/Dog Vision/train/00290d3e1fdd27226ba27a8ce248ce85.jpg', 'drive/MyDrive/Dog Vision/train/002a283a315af96eaea0e28e7163b21b.jpg', 'drive/MyDrive/Dog Vision/train/003df8b8a8b05244b1d920bb6cf451f9.jpg', 'drive/MyDrive/Dog Vision/train/0042188c895a2f14ef64a918ed9c7b64.jpg']
len(X_train)
800
There we go! We've got over 10,000 images and labels in our training set.
Before we can train a model on these, we'll have to turn them into a data batch.
The beautiful thing is, we can use our create_data_batches() function from above which also preprocesses our images for us (thank you past us for writing a helpful function).
# Turn the full training data into a data batch
full_data = create_data_batches(X, y)
Creating training data batches...
full_data
<BatchDataset shapes: ((None, 224, 224, 3), (None, 120)), types: (tf.float32, tf.bool)>
Our data is in a data batch; all we need now is a model.
And surprise, we've got a function for that too! Let's use create_model() to instantiate another model.
# Instantiate a new model for training on the full dataset
full_model = create_model()
Building model with: https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4
Since we've made a new model instance, full_model, we'll need some callbacks too.
# Create full model callbacks
# TensorBoard callback
full_model_tensorboard = create_tensorboard_callback()
# Early stopping callback
# Note: No validation set when training on all the data, so we monitor training accuracy instead of validation accuracy
full_model_early_stopping = tf.keras.callbacks.EarlyStopping(monitor="accuracy",
                                                             patience=3)
Note: Since running the cell below will cause the model to train on all of the data (10,000+ images), it may take a fairly long time to get started and finish. However, thanks to our full_model_early_stopping callback, it'll stop training automatically once accuracy stops improving.
Note: Running the cell below will take a little while (maybe up to 30 minutes for the first epoch) because the GPU we're using in the runtime has to load all of the images into memory.
full_model.fit(x=full_data,
               epochs=NUM_EPOCHS,
               callbacks=[full_model_tensorboard, full_model_early_stopping])
Epoch 1/100
320/320 [==============================] - 5208s 16s/step - loss: 2.4516 - accuracy: 0.4737
Epoch 2/100
320/320 [==============================] - 50s 157ms/step - loss: 0.4136 - accuracy: 0.8790
Epoch 3/100
320/320 [==============================] - 42s 131ms/step - loss: 0.2302 - accuracy: 0.9424
Epoch 4/100
320/320 [==============================] - 43s 134ms/step - loss: 0.1424 - accuracy: 0.9700
Epoch 5/100
320/320 [==============================] - 48s 149ms/step - loss: 0.1008 - accuracy: 0.9826
Epoch 6/100
320/320 [==============================] - 49s 152ms/step - loss: 0.0786 - accuracy: 0.9863
Epoch 7/100
320/320 [==============================] - 49s 153ms/step - loss: 0.0553 - accuracy: 0.9920
Epoch 8/100
320/320 [==============================] - 48s 151ms/step - loss: 0.0428 - accuracy: 0.9956
Epoch 9/100
320/320 [==============================] - 49s 153ms/step - loss: 0.0335 - accuracy: 0.9978
Epoch 10/100
320/320 [==============================] - 47s 146ms/step - loss: 0.0323 - accuracy: 0.9959
Epoch 11/100
320/320 [==============================] - 47s 148ms/step - loss: 0.0229 - accuracy: 0.9989
Epoch 12/100
320/320 [==============================] - 48s 150ms/step - loss: 0.0227 - accuracy: 0.9976
Epoch 13/100
320/320 [==============================] - 47s 147ms/step - loss: 0.0194 - accuracy: 0.9985
Epoch 14/100
320/320 [==============================] - 48s 150ms/step - loss: 0.0136 - accuracy: 0.9994
Epoch 15/100
320/320 [==============================] - 49s 152ms/step - loss: 0.0147 - accuracy: 0.9987
Epoch 16/100
320/320 [==============================] - 49s 154ms/step - loss: 0.0159 - accuracy: 0.9981
Epoch 17/100
320/320 [==============================] - 48s 151ms/step - loss: 0.0159 - accuracy: 0.9983
<tensorflow.python.keras.callbacks.History at 0x7fefdc5043c8>
Even on a GPU, our full model took a while to train. So it's a good idea to save it.
We can do so using our save_model() function.
Challenge: It may be a good idea to incorporate the save_model() function into a train_model() function. Or look into setting up a checkpoint callback.
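As a starting point for that challenge, here's a minimal sketch of a checkpoint callback, assuming the same directory layout as save_model() (the filepath below is just an example, not one used elsewhere in this notebook). It saves the model automatically during fit(), so a crash mid-training doesn't cost the whole run.
# Sketch: a ModelCheckpoint callback which saves the model during training
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="drive/MyDrive/Dog Vision/models/full-model-checkpoint.h5",  # example path
    monitor="accuracy",   # no validation set on the full data, so monitor training accuracy
    save_best_only=True,  # only overwrite the file when accuracy improves
    verbose=1)
# It would then be passed to fit() alongside the other callbacks:
# full_model.fit(x=full_data,
#                epochs=NUM_EPOCHS,
#                callbacks=[full_model_tensorboard, full_model_early_stopping, checkpoint_callback])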
save_model(full_model, suffix="full-image-set-mobilenetv2-Adam")
Saving model to: drive/MyDrive/Dog Vision/models/20201225-16081608912480-full-image-set-mobilenetv2-Adam.h5...
'drive/MyDrive/Dog Vision/models/20201225-16081608912480-full-image-set-mobilenetv2-Adam.h5'
To monitor the model whilst it trains, we'll load TensorBoard (it should update every 30 seconds or so whilst the model trains).
# Load in the full model
loaded_full_model = load_model('drive/MyDrive/Dog Vision/models/20201225-16081608912480-full-image-set-mobilenetv2-Adam.h5')
Loading saved model from: drive/MyDrive/Dog Vision/models/20201225-16081608912480-full-image-set-mobilenetv2-Adam.h5
len(X)
10222
%tensorboard --logdir drive/MyDrive/Dog\ Vision/logs
Reusing TensorBoard on port 6007 (pid 1381), started 2:40:55 ago. (Use '!kill 1381' to kill it.)
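Aside: if the %tensorboard magic isn't registered in your runtime (Colab usually preloads it), the TensorBoard notebook extension may need loading first:
# Load the TensorBoard notebook extension (only needed if %tensorboard is unrecognised)
%load_ext tensorboard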
Since our model has been trained on images in the form of Tensor batches, to make predictions on the test data, we'll have to get it into the same format.
Luckily we created create_data_batches() earlier which can take a list of filenames as input and convert them into Tensor batches.
To make predictions on the test data, we'll:
- Get the test image filenames.
- Convert the filenames into test data batches using create_data_batches(), setting the test_data parameter to True (since there are no labels with the test images).
- Make a predictions array by passing the test data batches to the predict() method.
# Load test image filenames (since we're using os.listdir(), these already have .jpg)
test_path = "drive/MyDrive/Dog Vision/test/"
test_filenames = [test_path + fname for fname in os.listdir(test_path)]
test_filenames[:10]
['drive/MyDrive/Dog Vision/test/f4270652c14534bd1916311b24261b83.jpg', 'drive/MyDrive/Dog Vision/test/e8505566fd9116b812e306ca196b62da.jpg', 'drive/MyDrive/Dog Vision/test/eb448b29df9391fb8b84c88c3a3313e6.jpg', 'drive/MyDrive/Dog Vision/test/eda4e3377eb5ad8a73026c08c76a613d.jpg', 'drive/MyDrive/Dog Vision/test/f4a2ee1dd5542da8b0150fe8f7a2b7c3.jpg', 'drive/MyDrive/Dog Vision/test/e93dde9e36ff6a41cd0e72fb01703be6.jpg', 'drive/MyDrive/Dog Vision/test/f49dbff2463687f867bfe1bc88f0a7c3.jpg', 'drive/MyDrive/Dog Vision/test/edf164f8974510c5936ee7224f3d7d56.jpg', 'drive/MyDrive/Dog Vision/test/efdf443fb4a6ad5964c18d6676f36652.jpg', 'drive/MyDrive/Dog Vision/test/ed4c608e160f24d9cca8f560b046daa2.jpg']
# How many test images are there?
len(test_filenames)
10357
# Create test data batch
test_data = create_data_batches(test_filenames, test_data=True)
Creating test data batches...
Note: Since there are 10,000+ test images, making predictions could take a while, even on a GPU. So beware: running the cell below may take up to an hour.
# Make predictions on test data batch using the loaded full model
test_predictions = loaded_full_model.predict(test_data,
                                             verbose=1)
324/324 [==============================] - 5083s 16s/step
# Check out the test predictions
test_predictions[:10]
array([[2.0112580e-08, 2.1171239e-08, 2.4507127e-08, ..., 6.7030271e-07, 5.9114016e-09, 6.1050753e-10], [2.2136067e-10, 1.0466246e-08, 1.9926205e-09, ..., 6.7800852e-03, 1.1990259e-05, 2.4394886e-08], [8.3025817e-09, 8.6744605e-11, 8.0413814e-10, ..., 1.4587031e-08, 2.1771426e-07, 2.8993969e-07], ..., [1.3067154e-12, 9.6459181e-11, 1.8441353e-12, ..., 1.6384671e-11, 5.6399996e-13, 7.6899147e-14], [4.5346393e-10, 7.7672667e-08, 1.7834777e-11, ..., 1.1289533e-09, 3.1667000e-12, 1.3890408e-10], [4.2566360e-13, 5.7505677e-13, 6.1184621e-12, ..., 1.6115807e-07, 1.0927422e-09, 1.7085367e-13]], dtype=float32)
test_predictions.shape
(10357, 120)
10,357: that's how many test images we have. And each one of them has 120 different prediction probabilities (one per breed).
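As a quick sanity check (a small addition, assuming the model ends in a softmax layer as built above), each row of probabilities should sum to roughly 1, and the index of the largest value gives the predicted breed:
import numpy as np  # already imported earlier in the notebook
# Each row of softmax probabilities should sum to ~1.0
print(np.sum(test_predictions[0]))
# The highest-probability breed for the first test image
print(unique_breeds[np.argmax(test_predictions[0])])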
Looking at the Kaggle sample submission, it looks like they want the model's output probabilities for each label along with the image IDs.
http://www.kaggle.com/c/dog-breed-identification/overview/evaluation
To get the data in this format, we'll:
- Create a pandas DataFrame with an id column plus a column for each breed.
- Add the test image IDs (taken from the test filepaths) to the id column.
- Add the prediction probabilities to each of the breed columns.
- Export the DataFrame as a CSV ready to submit to Kaggle.
# list(unique_breeds)
# ["id"] + list(unique_breeds)
# Create pandas DataFrame with empty columns
preds_df = pd.DataFrame(columns=["id"] + list(unique_breeds))
preds_df.head()
[preds_df.head() output: an empty DataFrame, 0 rows × 121 columns (id plus the 120 breed columns, from affenpinscher through yorkshire_terrier)]
test_path
'drive/MyDrive/Dog Vision/test/'
# Append test image ID's to predictions DataFrame
test_path = "drive/MyDrive/Dog Vision/test/"
preds_df["id"] = [os.path.splitext(path)[0] for path in os.listdir(test_path)]
preds_df.head()
[preds_df.head() output: 5 rows × 121 columns; the id column is now filled with test image IDs (f4270652c14534bd1916311b24261b83, e8505566fd9116b812e306ca196b62da, ...) and every breed column is still NaN]
# Add the prediction probabilities to each dog breed column
preds_df[list(unique_breeds)] = test_predictions
preds_df.head()
[preds_df.head() output: 5 rows × 121 columns; each test image ID now sits alongside its 120 prediction probabilities (e.g. the first row assigns ~0.0145 to appenzeller, the fifth row ~0.9785 to sealyham_terrier)]
# Save our predictions DataFrame to CSV for submission to Kaggle
preds_df.to_csv("drive/MyDrive/Dog Vision/full_submission_1_mobilienetV2_adam.csv",
                index=False)
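As an aside, the same submission DataFrame could be built in one step by passing the predictions array straight to the pandas DataFrame constructor. This is a hedged alternative to the cells above (not what was run here, and it uses pd and os as imported earlier); it assumes os.listdir() returns the files in the same order as when test_filenames was created:
# One-step alternative: build the submission DataFrame directly from the predictions array
preds_df_alt = pd.DataFrame(test_predictions, columns=list(unique_breeds))
# Insert the image IDs (filenames without the .jpg extension) as the first column
preds_df_alt.insert(0, "id", [os.path.splitext(fname)[0] for fname in os.listdir(test_path)])
preds_df_alt.to_csv("drive/MyDrive/Dog Vision/full_submission_alt.csv", index=False)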
It's great being able to make predictions on a test dataset already provided for us.
But how could we use our model on our own images?
The premise remains the same: if we want to make predictions on our own custom images, we have to pass them to the model in the same format the model was trained on.
To do so, we'll:
- Get the filepaths of our own images.
- Turn the filepaths into data batches using create_data_batches(), setting the test_data parameter to True (since our custom images won't have labels).
- Pass the custom image data batch to our model's predict() method.
- Convert the prediction probabilities to prediction labels.
- Compare the predicted labels to the custom images.
Note: To make predictions on custom images, I've uploaded pictures of my own to a directory located at drive/MyDrive/Dog Vision/mes-chiens/ (as seen in the cell below). In order to make predictions on your own images, you will have to do something similar.
# Get custom image filepaths
custom_path = "drive/MyDrive/Dog Vision/mes-chiens/"
custom_image_paths = [custom_path + fname for fname in os.listdir(custom_path)]
custom_image_paths
['drive/MyDrive/Dog Vision/mes-chiens/images-w479.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/images-w478.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/images-w542.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/images-w477.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432441.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432430.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432429.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432485.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432482.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432504.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432499.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/1607427432445.jpeg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (5).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (4).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (3).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (2).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (8).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (9).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (6).jpg', 'drive/MyDrive/Dog Vision/mes-chiens/Download (7).jpg']
# Turn custom images into a batch (set to test data because there are no labels)
custom_data = create_data_batches(custom_image_paths, test_data=True)
Creating test data batches...
custom_data
<BatchDataset shapes: (None, 224, 224, 3), types: tf.float32>
# Make predictions on the custom data
custom_preds = loaded_full_model.predict(custom_data)
custom_preds
array([[7.2157842e-11, 2.2818722e-09, 1.6608503e-08, ..., 3.5622949e-11, 6.0952758e-08, 2.1087553e-05], [9.9273289e-05, 8.9125633e-06, 6.6407611e-06, ..., 2.7269151e-04, 3.4167135e-07, 4.5934615e-07], [3.6818055e-09, 3.9134576e-09, 1.0759997e-07, ..., 1.9939009e-10, 3.6704992e-07, 7.6268343e-09], ..., [1.4935681e-06, 2.5248237e-10, 2.8038405e-09, ..., 1.9543399e-10, 5.8166666e-11, 2.1809663e-08], [2.6793785e-07, 1.6586291e-11, 2.1111300e-11, ..., 5.4462781e-09, 1.8247366e-10, 4.3379192e-10], [6.6590631e-09, 4.1176083e-11, 4.3010560e-12, ..., 1.4219248e-11, 7.9193110e-11, 1.9993893e-10]], dtype=float32)
custom_preds.shape
(20, 120)
Now that we've got some prediction arrays, let's convert them to labels and compare them with each image.
# Get custom image prediction labels
custom_pred_labels = [get_pred_label(custom_preds[i]) for i in range(len(custom_preds))]
custom_pred_labels
['rottweiler', 'great_dane', 'siberian_husky', 'labrador_retriever', 'german_shepherd', 'german_shepherd', 'german_shepherd', 'german_shepherd', 'german_shepherd', 'german_shepherd', 'german_shepherd', 'german_shepherd', 'pug', 'pug', 'pug', 'pug', 'pug', 'pug', 'pug', 'pug']
# Get custom images (our unbatchify() function won't work since there aren't labels)
custom_images = []
# Loop through unbatched data
for image in custom_data.unbatch().as_numpy_iterator():
    custom_images.append(image)
# Check custom image predictions
plt.figure(figsize=(10, 10))
for i, image in enumerate(custom_images):
plt.subplot(5, 5, i+1)
plt.xticks([])
plt.yticks([])
plt.title(custom_pred_labels[i])
plt.imshow(image)
Woah! What an effort. If you've made it this far, you've just gone end-to-end on a multi-class image classification problem.
This is the same style of problem self-driving cars have, except with different data.
If you're looking for where to go next, you've got plenty of options.
You could try to improve the full model we trained in this notebook in a few ways. Since our early experiment (using only 1000 images) hinted at our model overfitting (the results on the training set far outperformed the results on the validation set), one goal going forward would be to try and prevent it. A few options:
- Trying another model from TensorFlow Hub: perhaps a different model would perform better on our dataset. One option would be to experiment with a different pretrained model from TensorFlow Hub, or to look into the tf.keras.applications module.
- Data augmentation: take the training images and manipulate them (crop, resize) or distort them (flip, rotate) to create even more training data for the model. Check out the TensorFlow images documentation for a whole bunch of functions you can use on images. A good idea would be to try to replicate the techniques from the cats vs. dogs image classification example notebook on our dog breeds problem.
- Fine-tuning: the model we used in this notebook came straight from TensorFlow Hub; we took what it had already learned from another dataset (ImageNet) and applied it to our own. Another option is to take what the model already knows and adapt it to our own dataset (photos of dogs). This would mean all of the model's patterns get updated to be more specific to photos of dogs rather than general images. A minimal sketch of this idea follows.
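Here's that sketch of the fine-tuning idea, assuming the same MobileNetV2 handle and input shape used by create_model() above (a hypothetical cell, not run in this notebook). Setting trainable=True on the hub.KerasLayer unfreezes the pretrained weights, and a much lower learning rate nudges them rather than overwriting them:
# Hedged sketch: fine-tuning the TensorFlow Hub backbone (not run in this notebook)
MODEL_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4"

fine_tune_model = tf.keras.Sequential([
    hub.KerasLayer(MODEL_URL, trainable=True),        # trainable=True unfreezes the pretrained weights
    tf.keras.layers.Dense(120, activation="softmax")  # 120 = number of unique dog breeds
])
fine_tune_model.build([None, 224, 224, 3])  # batches of 224x224 RGB images

fine_tune_model.compile(loss="categorical_crossentropy",
                        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),  # far below the 0.001 default
                        metrics=["accuracy"])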
If you're ever after more, one of the best ways to find out something is to search for something like:
- "How to improve a TensorFlow 2.x image classification model?"
- "TensorFlow 2.x image classification best practices"
- "Transfer learning for image classification with TensorFlow 2.x"
And when you see an example you think might be beyond your reach (because it looks too complicated), remember, if in doubt, run the code. Try and reproduce what you see. This is the best way to get hands-on and build your own knowledge.
No one starts out knowing how to do every single thing. They just get better at knowing what to look for.