TensorFlow usage guide (in French): http://perso.univ-lemans.fr/~berger/CoursTF/CoursTF/co/tensorFlow.html

Project: https://github.com/paloukari/NIH-Chest-X-rays-Classification/blob/master/src/v2-train-simple-xray-cnn-multi-binarizer.ipynb

Get our workspace ready

Getting our data ready (turning into Tensors)

With all machine learning models, our data has to be in numerical format. So that's what we'll be doing first (numerical representations).

Accessing the data and checking out the labels.

Looking at this, we can see there are 10222 different IDs (meaning 10222 different images) and 120 different breeds.

Let's figure out how many images there are of each breed.

Getting images and their labels

Let's get a list of all our image file pathnames.

Since we've got our training image filepaths in a list, let's prepare our labels.

convert labels column to NumPy array
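One way to turn breed names into numbers is to compare each label against the array of unique breeds, which gives one boolean row per image. This is a minimal sketch; the four labels are made up and the names labels, unique_breeds and boolean_labels are assumptions rather than names confirmed by these notes.

```python
import numpy as np

# Hypothetical stand-in for the "breed" column of the labels CSV.
labels = np.array(["boxer", "pug", "boxer", "beagle"])

# Sorted array of the distinct breeds (120 in the real dataset).
unique_breeds = np.unique(labels)

# One boolean row per image: True only in the column of its breed.
boolean_labels = np.array([label == unique_breeds for label in labels])

print(unique_breeds)
print(boolean_labels.astype(int))
```

Calling argmax() on a row gives back the index of that image's breed in unique_breeds.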

Creating our own validation set

Since the dataset from Kaggle doesn't come with a validation set, we're going to create our own.

The most important concept in machine learning: the three sets (training set, validation set, test set).

We're going to start off experimenting with about 1000 images and increase as needed.

Note to self: this is a very important function.

Now let's split our data into training and validation sets. We'll use an 80/20 split (80% training data, 20% validation data).
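The split can be sketched with scikit-learn's train_test_split. Using scikit-learn is an assumption (these notes don't name the tool), and X, y and NUM_IMAGES below are hypothetical placeholders for the real file paths, labels and subset size.

```python
from sklearn.model_selection import train_test_split

NUM_IMAGES = 1000  # size of the experimental subset

# Hypothetical stand-ins for the real file paths (X) and labels (y).
X = [f"train/image_{i}.jpg" for i in range(10222)]
y = [i % 120 for i in range(10222)]

# 80% of the subset for training, 20% for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X[:NUM_IMAGES], y[:NUM_IMAGES], test_size=0.2, random_state=42
)

print(len(X_train), len(X_val))  # 800 200
```

Fixing random_state keeps the split reproducible between runs.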

Preprocessing images (turning images into Tensors)

Our labels are in numeric format but our images are still just file paths.

Since we're using TensorFlow, our data has to be in the form of Tensors.

A Tensor is a way to represent information in numbers. If you're familiar with NumPy arrays (you should be), a Tensor can be thought of as a combination of NumPy arrays, except with the special ability to be used on a GPU.

Because of how TensorFlow stores information (in Tensors), it allows machine learning and deep learning models to be run on GPUs (generally faster at numerical computing).

To preprocess our images into Tensors we're going to write a function which does a few things:

  1. Takes an image filename as input.
  2. Uses TensorFlow to read the file and save it to a variable, image.
  3. Turns our image (a jpeg file) into Tensors.
  4. Resizes the image to be of shape (224, 224).
  5. Returns the modified image.

A good place to read about this type of function is the TensorFlow documentation on loading images.

You might be wondering why (224, 224), which is (height, width). It's because this is the size of input our model (we'll see this soon) takes, an image which is (224, 224, 3).

What? Where's the 3 from? We're getting ahead of ourselves but that's the number of colour channels per pixel, red, green and blue.

Notice the shape of image. It's (257, 350, 3). This is height, width, colour channel value.

And you can easily convert it to a Tensor using tf.constant().

Ok, now let's build that function we were talking about.

Now we've seen what an image looks like as a Tensor, let's make a function to preprocess them.

Creating a function for preprocessing images
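A minimal sketch of such a preprocessing function follows. The scaling of pixel values to the 0-1 range is an assumption added here (it is not one of the listed steps), and the synthetic JPEG written to a temporary folder only exists so the sketch can run without the real dataset.

```python
import os
import tempfile

import tensorflow as tf

IMG_SIZE = 224  # height and width the model expects


def process_image(image_path, img_size=IMG_SIZE):
    """Read a JPEG file and return a (img_size, img_size, 3) float32 Tensor."""
    image = tf.io.read_file(image_path)                      # raw file bytes
    image = tf.image.decode_jpeg(image, channels=3)          # uint8 Tensor, 3 colour channels
    image = tf.image.convert_image_dtype(image, tf.float32)  # assumption: scale 0-255 to 0-1
    image = tf.image.resize(image, size=[img_size, img_size])
    return image


# Demo on a synthetic (257, 350, 3) image, since the real files aren't available here.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jpg")
fake_pixels = tf.cast(
    tf.random.uniform((257, 350, 3), maxval=256, dtype=tf.int32), tf.uint8
)
tf.io.write_file(demo_path, tf.io.encode_jpeg(fake_pixels))

print(process_image(demo_path).shape)  # (224, 224, 3)
```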

Creating data batches

Turning our data into batches

Wonderful. Now we've got a function to convert our images into Tensors, we'll now build one to turn our data into batches (more specifically, a TensorFlow BatchDataset).

Why turn our data into batches?

A batch (also called mini-batch) is a small portion of your data, say 32 (32 is generally the default batch size) images and their labels. In deep learning, instead of finding patterns in an entire dataset at the same time, you often find them one batch at a time.

Let's say you're dealing with 10,000+ images (which we are). Together, these files may take up more memory than your GPU has. Trying to compute on them all would result in an error.

Instead, it's more efficient to create smaller batches of your data and compute on one batch at a time.

TensorFlow is very efficient when your data is in batches of (image, label) Tensors. So we'll build a function to create those first. We'll take advantage of our process_image function at the same time.

Now we've got a simple function to turn our image file path names and their associated labels into tuples (we can turn these into Tensors next), we'll create a function to make data batches.

Because we'll be dealing with 3 different sets of data (training, validation and test), we'll make sure the function can accommodate each set.

We'll set a default batch size of 32 because according to Yann LeCun (one of the OGs of deep learning), friends don't let friends train with batch sizes over 32.
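Here is one possible shape for these two functions, built on tf.data. It is a sketch under assumptions: the parameter names valid_data and test_data, the decision to shuffle only the training set, and the tiny synthetic JPEG files are illustrative choices, and process_image is repeated so the block runs on its own.

```python
import os
import tempfile

import tensorflow as tf

BATCH_SIZE = 32


def process_image(path, img_size=224):
    """Read a JPEG and return a (img_size, img_size, 3) float32 Tensor."""
    image = tf.image.decode_jpeg(tf.io.read_file(path), channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    return tf.image.resize(image, [img_size, img_size])


def get_image_label(path, label):
    """Turn a (file path, label) pair into an (image Tensor, label) pair."""
    return process_image(path), label


def create_data_batches(x, y=None, batch_size=BATCH_SIZE,
                        valid_data=False, test_data=False):
    """Build batches of (image, label) pairs, or images only for test data."""
    if test_data:  # no labels available
        data = tf.data.Dataset.from_tensor_slices(tf.constant(x))
        return data.map(process_image).batch(batch_size)
    data = tf.data.Dataset.from_tensor_slices((tf.constant(x), tf.constant(y)))
    if not valid_data:  # shuffle training data only, before loading any images
        data = data.shuffle(buffer_size=len(x))
    return data.map(get_image_label).batch(batch_size)


# Demo with four synthetic JPEGs and hypothetical one-hot labels for 3 breeds.
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(4):
    pixels = tf.cast(tf.random.uniform((50, 60, 3), maxval=256, dtype=tf.int32), tf.uint8)
    path = os.path.join(tmp_dir, f"dog_{i}.jpg")
    tf.io.write_file(path, tf.io.encode_jpeg(pixels))
    paths.append(path)
labels = [[True, False, False], [False, True, False],
          [False, False, True], [True, False, False]]

train_data = create_data_batches(paths, labels, batch_size=2)
test_batches = create_data_batches(paths, batch_size=2, test_data=True)
print(train_data.element_spec)
```

Shuffling file paths rather than decoded images keeps the shuffle cheap, and leaving validation and test data unshuffled keeps their order stable between runs.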

Look at that! We've got our data in batches, more specifically, they're in Tensor pairs of (images, labels) ready for use on a GPU.

But having our data in batches can be a bit of a hard concept to understand. Let's build a function which helps us visualize what's going on under the hood.

Visualizing data batches

Our data is now in batches. However, these can be a little hard to understand, so let's visualize them!

What are data batches?

Batch processing is an efficient way to handle large volumes of data, where a group of transactions is collected over a period of time. The data is collected, entered and processed, and then the batch results are produced (Hadoop focuses on batch data processing). Batch processing requires separate programs for input, processing and output. Payroll and billing systems are examples.

In contrast, real-time data processing involves continuous input, processing and output of data. The data must be processed within a short period of time (or near real time). Radar systems, customer services and bank ATMs are examples.

To make computation efficient, a batch is a tightly wound collection of Tensors.

So to view data in a batch, we've got to unwind it.

We can do so by calling the as_numpy_iterator() method on a data batch.

This will turn our data batch into something which can be iterated over.

Passing an iterator to next() will return the next item in the iterator.

In our case, next() will return a batch of 32 image and label pairs.
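To see the unwinding in isolation, here is a small sketch on a synthetic dataset (all-zero images and labels, purely illustrative) with the same shapes as ours.

```python
import tensorflow as tf

# Synthetic stand-in: 64 blank "images" with 120-way boolean labels.
images = tf.zeros((64, 224, 224, 3))
labels = tf.zeros((64, 120), dtype=tf.bool)
data = tf.data.Dataset.from_tensor_slices((images, labels)).batch(32)

# as_numpy_iterator() yields NumPy arrays; next() pulls out the first batch.
batch_images, batch_labels = next(data.as_numpy_iterator())
print(batch_images.shape, batch_labels.shape)  # (32, 224, 224, 3) (32, 120)
```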

Note: Running the cell below and loading images may take a little while.

Look at all those beautiful dogs!

Question: Rerun the cell above. Why do you think a different set of images is displayed each time you run it?

Even more dogs!

Question: Why does running the cell above and viewing validation images return the same dogs each time?

Creating and training a model

Now our data is ready, let's prepare it for modelling. We'll use an existing model from TensorFlow Hub.

TensorFlow Hub is a resource where you can find pretrained machine learning models for the problem you're working on.

Using a pretrained machine learning model is often referred to as transfer learning.

Why use a pretrained model?

Building a machine learning model and training it on lots of data from scratch can be expensive and time consuming.

Transfer learning helps alleviate some of these costs by taking what another model has learned and using that information with your own problem.

How do we choose a model?

Since we know our problem is image classification (classifying different dog breeds), we can navigate the TensorFlow Hub page by our problem domain (image).

We start by choosing the image problem domain, and then can filter it down by subdomains, in our case, image classification.

Doing this gives a list of different pretrained models we can apply to our task.

Clicking on one gives us information about the model as well as instructions for using it.

For example, clicking on the mobilenet_v2_130_224 model tells us this model takes an input of images in the shape (224, 224). It also says the model has been trained in the domain of image classification.

Let's try it out.

Building a model

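Before we build a model, there are a few things we need to define: the input shape (our images, in the form of Tensors), the output shape (our image labels, in the form of Tensors) and the URL of the model we want to use: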

https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4

Now we've got the inputs, outputs and model we're using ready to go. We can start to put them together.

There are many ways of building a model in TensorFlow but one of the best ways to get started is to use the Keras API.

Defining a deep learning model in Keras can be as straightforward as saying, "here are the layers of the model, the input shape and the output shape, let's go!"

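Knowing this, let's create a function which:

  1. Takes the input shape, output shape and the model we've chosen as parameters.
  2. Defines the layers in a Keras model in sequential fashion.
  3. Compiles the model (says how it should be evaluated and improved).
  4. Builds the model (tells it the input shape it'll be getting).
  5. Returns the model.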

All of these steps can be found here: https://www.tensorflow.org/guide/keras/overview
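A hedged sketch of such a function is below. To keep it runnable without downloading anything, the pretrained network is passed in as a feature_extractor argument and replaced by a small pooling layer in the demo; that argument, the stand-in layer and the choice of categorical cross-entropy as the loss are assumptions. In the notebook the first layer would be hub.KerasLayer(MODEL_URL).

```python
import tensorflow as tf

IMG_SIZE = 224
INPUT_SHAPE = [None, IMG_SIZE, IMG_SIZE, 3]  # batch, height, width, colour channels
OUTPUT_SHAPE = 120                           # number of unique breeds
MODEL_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4"


def create_model(feature_extractor, input_shape=INPUT_SHAPE, output_shape=OUTPUT_SHAPE):
    """Stack a feature extractor and a softmax output layer, then compile and build."""
    model = tf.keras.Sequential([
        feature_extractor,  # layer 1: the pretrained model (or a stand-in)
        tf.keras.layers.Dense(units=output_shape, activation="softmax"),  # layer 2: output
    ])
    model.compile(
        loss=tf.keras.losses.CategoricalCrossentropy(),  # assumption: one-hot labels
        optimizer=tf.keras.optimizers.Adam(),
        metrics=["accuracy"],
    )
    model.build(input_shape)  # tell the model what input shape to expect
    return model


# In the notebook the first layer would be the pretrained network, for example:
#   import tensorflow_hub as hub
#   model = create_model(hub.KerasLayer(MODEL_URL))
# A tiny stand-in layer keeps this sketch runnable offline.
stand_in = tf.keras.layers.GlobalAveragePooling2D()
model = create_model(stand_in)
model.summary()
```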

What's happening here?

Setting up the model layers

There are two ways to do this in Keras, the functional and sequential API. We've used the sequential.

Which one should you use?

The Keras documentation states the functional API is the way to go for defining complex models but the sequential API (a linear stack of layers) is perfectly fine for getting started, which is what we're doing.

The first layer we use is the model from TensorFlow Hub (hub.KerasLayer(MODEL_URL)). So our first layer is actually an entire model (many more layers). This input layer takes in our images and finds patterns in them based on the patterns mobilenet_v2_130_224 has found.

The next layer (tf.keras.layers.Dense()) is the output layer of our model. It brings all of the information discovered in the input layer together and outputs it in the shape we're after, 120 (the number of unique labels we have).

The activation="softmax" parameter tells the output layer we'd like to assign a probability value between 0 and 1 to each of the 120 labels, with all 120 values summing to 1. The higher the value, the more the model believes the input image should have that label. If we were working on a binary classification problem, we'd use activation="sigmoid".
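To make the softmax behaviour concrete, here is a small NumPy sketch on three made-up scores. It illustrates the maths only and is not how Keras implements the layer internally.

```python
import numpy as np


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw outputs for 3 labels
probs = softmax(scores)
print(probs, probs.sum())
```

The largest score keeps the largest probability, and every value lands between 0 and 1.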

For more on which activation function to use, see the article Which Loss and Activation Functions Should I Use?

Compiling the model

This one is best explained with a story.

Let's say you're at the international hill descending championships. You start standing on top of a hill and your goal is to get to the bottom of the hill. The catch is you're blindfolded.

Luckily, your friend Adam is standing at the bottom of the hill shouting instructions on how to get down.

At the bottom of the hill there's a judge evaluating how you're doing. They know where you need to end up so they compare how you're doing to where you're supposed to be. Their comparison is how you get scored.

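Transferring this to model.compile() terminology:

  1. loss: the height of the hill is the loss function. The model's goal is to minimize it, and getting to the bottom of the hill means the model is predicting well.
  2. optimizer: your friend Adam is the optimizer. He's the one telling you how to navigate the hill (lower the loss) based on what you've done so far.
  3. metrics: the judge at the bottom of the hill is the evaluation metric, scoring how well you're doing.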

Building the model

We use model.build() whenever we're using a layer from TensorFlow Hub to tell our model what input shape it can expect.

In this case, the input shape is [None, IMG_SIZE, IMG_SIZE, 3] or [None, 224, 224, 3] or [batch_size, img_height, img_width, color_channels].

Batch size is left as None as this is inferred from the data we pass the model. In our case, it'll be 32 since that's what we've set up our data batches as.

Now we've gone through each section of the function, let's use it to create a model.

We can call summary() on our model to get an idea of what our model looks like.

Create a model and check its details

The non-trainable parameters are the patterns learned by mobilenet_v2_130_224 and the trainable parameters are the ones in the dense layer we added.

This means the main bulk of the information in our model has already been learned and we're going to take that and adapt it to our own problem.

Creating callbacks (tensorflow keras model callbacks)

We've got a model ready to go but before we train it we'll make some callbacks.

Callbacks are helper functions a model can use during training to do things such as save a model's progress, check a model's progress or stop training early if a model stops improving.

The two callbacks we're going to add are a TensorBoard callback and an Early Stopping callback.

TensorBoard Callback

TensorBoard helps provide a visual way to monitor the progress of your model during and after training.

It can be used directly in a notebook to track the performance measures of a model such as loss and accuracy.

To set up a TensorBoard callback and view TensorBoard in a notebook, we need to do three things:

  1. Load the TensorBoard notebook extension.
  2. Create a TensorBoard callback which is able to save logs to a directory and pass it to our model's fit() function.
  3. Visualize our model's training logs using the %tensorboard magic function (we'll do this later on).
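The second step can be sketched as a small helper that builds a time-stamped log folder, so each training run gets its own logs. The helper name, the log_root parameter and the temporary folder are assumptions; the notebook magics appear only as comments because they work in a notebook, not in plain Python.

```python
import datetime
import os
import tempfile

import tensorflow as tf

# Step 1, in a notebook cell:  %load_ext tensorboard


def create_tensorboard_callback(log_root):
    """Return a TensorBoard callback that logs to a time-stamped subfolder."""
    logdir = os.path.join(
        log_root, datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    )
    return tf.keras.callbacks.TensorBoard(log_dir=logdir)


log_root = tempfile.mkdtemp()  # hypothetical location; use your own logs folder
tensorboard = create_tensorboard_callback(log_root)
print(tensorboard.log_dir)

# Step 3, later in a notebook cell:  %tensorboard --logdir <your logs folder>
```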

Early Stopping Callback

https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping

Early stopping helps prevent overfitting by stopping a model when a certain evaluation metric stops improving. If a model trains for too long, it can do so well at finding patterns in a certain dataset that it's not able to use those patterns on another dataset it hasn't seen before (doesn't generalize).

It's basically like saying to our model, "keep finding patterns until the quality of those patterns starts to go down."
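A minimal sketch of the callback, assuming we monitor validation accuracy and allow 3 epochs without improvement before stopping:

```python
import tensorflow as tf

# Stop training once validation accuracy hasn't improved for 3 epochs in a row.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy",
    patience=3,
)
print(early_stopping.monitor, early_stopping.patience)
```

A larger patience gives the model more chances to recover from a noisy epoch, at the cost of longer training.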

Training a model (on a subset of data)

Our first model is only going to be trained on 1000 images: trained on 800 images and then validated on 200, meaning 1000 images total or about 10% of the total data.

We do this to make sure everything is working. And if it is, we can step it up later and train on the entire training dataset.

The final parameter we'll define before training is NUM_EPOCHS (also known as number of epochs).

NUM_EPOCHS defines how many passes of the data we'd like our model to do. A pass is equivalent to our model trying to find patterns in each dog image and see which patterns relate to each label.

If NUM_EPOCHS=1, the model will only look at the data once and will probably score badly because it hasn't had a chance to correct itself. It would be like you competing in the international hill descent championships and your friend Adam only being able to give you 1 single instruction to get down the hill.

What's a good value for NUM_EPOCHS?

This one is hard to say. 10 could be a good start but so could 100. This is one of the reasons we created an early stopping callback. Having early stopping set up means if we set NUM_EPOCHS to 100 but our model stops improving after 22 epochs, it'll stop training.

Along with this, let's quickly check if we're still using a GPU.

Boom! We've got a GPU running and NUM_EPOCHS set up. Let's create a simple function which trains a model. The function will:
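A hedged sketch of a training function is shown here. It takes an already-created model as a parameter, which may differ from the notebook's version, and the demo uses a tiny stand-in model on random data so it finishes in seconds; the demo names and sizes are assumptions.

```python
import numpy as np
import tensorflow as tf

NUM_EPOCHS = 100


def train_model(model, train_data, val_data, callbacks, epochs=NUM_EPOCHS):
    """Fit the model on the training batches, validating after every epoch."""
    model.fit(
        x=train_data,
        epochs=epochs,
        validation_data=val_data,
        callbacks=callbacks,
        verbose=0,
    )
    return model


# Tiny synthetic demo: 40 random 8x8 "images" across 3 hypothetical classes.
rng = np.random.default_rng(0)
images = rng.random((40, 8, 8, 3)).astype("float32")
labels = tf.keras.utils.to_categorical(rng.integers(0, 3, size=40), num_classes=3)
train_data = tf.data.Dataset.from_tensor_slices((images[:32], labels[:32])).batch(8)
val_data = tf.data.Dataset.from_tensor_slices((images[32:], labels[32:])).batch(8)

demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 8, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3, activation="softmax"),
])
demo_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

early_stopping = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3)
trained = train_model(demo_model, train_data, val_data, [early_stopping], epochs=5)
print(trained.predict(val_data, verbose=0).shape)  # (8, 3)
```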

Note: When training a model for the first time, the first epoch will take a while to load compared to the rest. This is because the model is getting ready and the data is being initialised. Using more data will generally take longer, which is why we've started with ~1000 images. After the first epoch, subsequent epochs should take a few seconds.

Question: It looks like our model might be overfitting (getting far better results on the training set than the validation set). What are some ways to prevent model overfitting? Hint: this may involve searching something like "ways to prevent overfitting in a deep learning model?".

Note: Overfitting to begin with is a good thing. It means our model is learning something.

Checking the TensorBoard logs

Now our model has been trained, we can make its performance visual by checking the TensorBoard logs.

The TensorBoard magic function (%tensorboard) will access the logs directory we created earlier and visualize its contents.

Thanks to our early_stopping callback, the model stopped training after 26 or so epochs (in my case, yours might be slightly different). This is because the validation accuracy failed to improve for 3 epochs.

But the good news is, we can definitely see our model is learning something. The validation accuracy got to 65% in only a few minutes.

This means, if we were to scale up the number of images, hopefully we'd see the accuracy increase.

Making and evaluating predictions using a trained model

Before we scale up and train on more data, let's see some other ways we can evaluate our model. Because although accuracy is a pretty good indicator of how our model is doing, it would be even better if we could see it in action.

Making predictions with a trained model is as simple as calling predict() on it and passing it data in the same format the model was trained on.

This means we've got an array of shape (200, 120): one row for each of the 200 validation images and one column for each of the 120 breeds.

Making predictions with our model returns an array with a different value for each label.

In this case, making predictions on the validation data (200 images) returns an array (predictions) of arrays, each containing 120 different values (one for each unique dog breed).

These different values are the probabilities or the likelihood the model has predicted a certain image being a certain breed of dog. The higher the value, the more likely the model thinks a given image is a specific breed of dog.

Let's see how we'd convert an array of probabilities into an actual label.

The maximum any single prediction value can be is 1, because the outputs of a softmax function lie between 0 and 1 and sum to 1. Here, the model is predicting its highest-scoring label with a 28.19% prediction probability.

Max index: 26

Predicted label: cairn

So one methodology for evaluating a model is to keep only the predictions that have a confidence value over 0.75. Every prediction whose max value is under 0.75 gets pushed to the wayside. Those are things you might want to pass to a human classifier, but everything over 0.75 you might let the model predict by itself.
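The thresholding idea can be sketched with NumPy. The three prediction rows are made up, and the 0.75 cut-off comes from the paragraph above.

```python
import numpy as np

THRESHOLD = 0.75

# Hypothetical prediction probabilities for 3 images over 3 labels.
predictions = np.array([
    [0.80, 0.15, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.05, 0.85],
])

confidence = predictions.max(axis=1)   # highest probability per image
auto_mask = confidence > THRESHOLD     # confident enough to trust the model
auto_labels = predictions[auto_mask].argmax(axis=1)
needs_review = np.where(~auto_mask)[0]  # indexes to pass to a human

print(auto_labels, needs_review)
```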

Having this information is great but it would be even better if we could compare a prediction to its true label and original image.

To help us, let's first build a little function to convert prediction probabilities into predicted labels.

Note: Prediction probabilities are also known as confidence levels.

A little function to convert prediction probabilities into predicted labels
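A minimal sketch of such a function, assuming a unique_breeds array like the one used for the labels (the three breeds and the probabilities here are made up):

```python
import numpy as np

# Hypothetical label array; the real one holds 120 breeds.
unique_breeds = np.array(["beagle", "cairn", "pug"])


def get_pred_label(prediction_probabilities, labels=unique_breeds):
    """Return the label with the highest prediction probability."""
    return labels[np.argmax(prediction_probabilities)]


prediction = np.array([0.1, 0.7, 0.2])
print(np.max(prediction), np.argmax(prediction), get_pred_label(prediction))
```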

Wonderful! Now we've got a list of all different predictions our model has made, we'll do the same for the validation images and validation labels.

Remember, the model hasn't trained on the validation data, during the fit() function, it only used the validation data to evaluate itself. So we can use the validation images to visually compare our models predictions with the validation labels.

Since our validation data (val_data) is in batch form, to get a list of validation images and labels, we'll have to unbatch it (using unbatch()) and then turn it into an iterator using as_numpy_iterator().

Let's make a small function to do so.

Create a function to unbatch a batched dataset
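One way to write it is sketched below. The function name, the unique_labels parameter and the synthetic five-image dataset are assumptions for illustration.

```python
import numpy as np
import tensorflow as tf


def unbatchify(batched_data, unique_labels):
    """Split a batched (image, label) dataset into separate image and label lists."""
    images, labels = [], []
    for image, label in batched_data.unbatch().as_numpy_iterator():
        images.append(image)
        labels.append(unique_labels[np.argmax(label)])  # one-hot row -> label name
    return images, labels


# Synthetic demo: 5 blank 8x8 images with one-hot labels over 3 hypothetical breeds.
unique_breeds = np.array(["beagle", "cairn", "pug"])
label_indexes = [0, 2, 1, 1, 0]
one_hot = np.eye(3, dtype=bool)[label_indexes]
val_data = tf.data.Dataset.from_tensor_slices(
    (np.zeros((5, 8, 8, 3), dtype="float32"), one_hot)
).batch(2)

val_images, val_labels = unbatchify(val_data, unique_breeds)
print(len(val_images), val_labels)
```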

Nailed it!

Now we've got ways to get the prediction labels, the validation labels (truth labels) and the validation images.

Let's make some functions to make these all a bit more visual.

More specifically, we want to be able to view an image, its predicted label and its actual label (true label).

The first function we'll create will:

Well, this is very exciting: we're now seeing our model's predictions compared to the actual images. This is the kind of functionality we wanted from the very start, to be able to pass our model an image and have it make a prediction. Dog vision is officially coming to life, and we've got one function to visualize what our model is predicting.

Nice! Making functions to help visualize your model's results is really helpful in understanding how your model is doing.

Since we're working with a multi-class problem (120 different dog breeds), it would also be good to see what other guesses our model is making. More specifically, if our model predicts a certain label with 24% probability, what else did it predict?

Let's build a function to demonstrate. The function will:

Plot the top 10 prediction probability values and labels, coloring the true label green.

These are the top 10 predictions our model has made. pomeranian is the predicted label because it has the highest prediction value, and the true label is the one in green, scotch_terrier.
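The selection of the top 10 values, leaving the plotting aside, can be sketched with NumPy's argsort. The label names and the random probabilities are made up for illustration.

```python
import numpy as np


def top_k_predictions(pred_probs, labels, k=10):
    """Return the k highest probabilities and their labels, best first."""
    top_indexes = np.argsort(pred_probs)[::-1][:k]  # sort ascending, reverse, keep k
    return labels[top_indexes], pred_probs[top_indexes]


rng = np.random.default_rng(42)
labels = np.array([f"breed_{i}" for i in range(120)])  # hypothetical breed names
pred_probs = rng.dirichlet(np.ones(120))               # 120 values summing to 1

top_labels, top_values = top_k_predictions(pred_probs, labels)
print(list(zip(top_labels, np.round(top_values, 3))))
```

To colour the true label green in a bar plot, check whether it appears in top_labels and change the colour of the bar at that position.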

Wonderful! Now we've got some functions to help us visualize our predictions and evaluate our model, let's check out a few.

Saving and reloading a trained model

After training a model, it's a good idea to save it. Saving it means you can share it with colleagues, put it in an application and more importantly, won't have to go through the potentially expensive step of retraining it.

One format for saving an entire Keras model is HDF5 (.h5). So we'll make a function which can take a model as input and utilise the save() method to save it as an h5 file to a specified directory.

Once we've got a saved model, we'd like to be able to load it. Let's create a function which can take a model path and use the tf.keras.models.load_model() function to load it into the notebook.

Because we're using a component from TensorFlow Hub (hub.KerasLayer), we'll have to pass it in via the custom_objects parameter.

Create a function to load a trained model
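Both helpers can be sketched as below. The suffix parameter, the temporary folder and the tiny uncompiled stand-in model are assumptions that keep the example quick and offline; with the real model, the loader would be called with custom_objects={"KerasLayer": hub.KerasLayer}.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def save_model(model, directory, suffix="demo"):
    """Save a model as an h5 file inside `directory` and return its path."""
    os.makedirs(directory, exist_ok=True)
    model_path = os.path.join(directory, f"{suffix}.h5")
    model.save(model_path)
    return model_path


def load_model(model_path, custom_objects=None):
    """Load a saved model, passing any custom layers it depends on."""
    return tf.keras.models.load_model(model_path, custom_objects=custom_objects)


# Tiny stand-in model (not the dog breed model), left uncompiled for simplicity.
demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
path = save_model(demo_model, tempfile.mkdtemp())
loaded_model = load_model(path)

sample = np.ones((2, 4), dtype="float32")
print(np.allclose(demo_model(sample), loaded_model(sample)))
```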

Compare the two models (the original one and loaded one). We can do so easily using the evaluate() method.

Training a big dog model on the full data

Now we know our model works on a subset of the data, we can start to move forward with training one on the full data.

Above, we saved all of the training filepaths to X and all of the training labels to y. Let's check them out.

There we go! We've got over 10,000 images and labels in our training set.

Before we can train a model on these, we'll have to turn them into a data batch.

The beautiful thing is, we can use our create_data_batches() function from above which also preprocesses our images for us (thank you past us for writing a helpful function).

Create a data batch with the full data set

Our data is in a data batch, all we need now is a model.

And surprise, we've got a function for that too! Let's use create_model() to instantiate another model.

Create a model for the full dataset

Create full model callbacks

Since we've made a new model instance, full_model, we'll need some callbacks too.

Fit the full model to the full data

Note: Since running the cell below will cause the model to train on all of the data (10,000+ images), it may take a fairly long time to get started and finish. However, thanks to our full_model_early_stopping callback, it'll stop before it starts going too long.

Note: Running the cell below will take a little while (maybe up to 30 minutes for the first epoch) because the GPU we're using in the runtime has to load all of the images into memory.

Saving and reloading the full model

Even on a GPU, our full model took a while to train. So it's a good idea to save it.

We can do so using our save_model() function.

Challenge: It may be a good idea to incorporate the save_model() function into a train_model() function. Or look into setting up a checkpoint callback.

To monitor the model whilst it trains, we'll load TensorBoard (it should update every 30 seconds or so whilst the model trains).

Load in the full model

Making predictions on the test dataset

Since our model has been trained on images in the form of Tensor batches, to make predictions on the test data, we'll have to get it into the same format.

Luckily we created create_data_batches() earlier which can take a list of filenames as input and convert them into Tensor batches.

To make predictions on the test data, we'll:

  1. Get the test image filenames.
  2. Convert the filenames into test data batches using create_data_batches() and setting the test_data parameter to True (since there are no labels with the test images).
  3. Make a predictions array by passing the test data batches to the predict() function.

So now we've got the file names for our test data.

Create test data batch

Make predictions on test data batch using the loaded full model

Note: Since there are 10,000+ test images, making predictions could take a while, even on a GPU. So beware running the cell below may take up to an hour.

So these are the prediction probabilities for all of the test images. There are 10,357 test images, and each one of them has 120 different prediction probabilities.

Preparing test dataset predictions for Kaggle

Looking at the Kaggle sample submission, it looks like they want the model's output probabilities for each label along with the image IDs.

http://www.kaggle.com/c/dog-breed-identification/overview/evaluation

To get the data in this format, we'll:

Create pandas DataFrame with empty columns

Boom! Let's now export our predictions DataFrame to CSV so we can submit it to Kaggle.

Save our predictions DataFrame to CSV for submission to Kaggle
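A hedged sketch of building and exporting the submission table follows. It fills the DataFrame in one step rather than starting from empty columns, and the file names, breeds, probabilities and output path are all made up.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real test file paths, breeds and predictions.
unique_breeds = np.array(["beagle", "cairn", "pug"])
test_filenames = ["test/img_a.jpg", "test/img_b.jpg"]
test_predictions = np.array([[0.1, 0.7, 0.2],
                             [0.6, 0.3, 0.1]])

# The image ID is the file name without its folder or extension.
test_ids = [os.path.splitext(os.path.basename(path))[0] for path in test_filenames]

# One column per breed, plus an "id" column in first position.
preds_df = pd.DataFrame(test_predictions, columns=list(unique_breeds))
preds_df.insert(0, "id", test_ids)

csv_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
preds_df.to_csv(csv_path, index=False)  # index=False keeps the row numbers out of the file
print(preds_df)
```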

Making predictions on custom images

It's great being able to make predictions on a test dataset already provided for us.

But how could we use our model on our own images?

The premise remains, if we want to make predictions on our own custom images, we have to pass them to the model in the same format the model was trained on.

To do so, we'll:

Note: To make predictions on custom images, I've uploaded pictures of my own to a directory located at drive/My Drive/Data/dogs/ (as seen in the cell below). In order to make predictions on your own images, you will have to do something similar.

Create data batches

Make predictions on the custom data

The custom predictions cover 20 images, so we've got 120 prediction probabilities for each one.

Now we've got some predictions arrays, let's convert them to labels and compare them with each image.

What's next?

Woah! What an effort. If you've made it this far, you've just gone end-to-end on a multi-class image classification problem.

This is the same style of problem self-driving cars have, except with different data.

If you're looking on where to go next, you've got plenty of options.

You could try to improve the full model we trained in this notebook in a few ways (there are a fair few options). Since our early experiment (using only 1000 images) hinted at our model overfitting (the results on the training set far outperformed the results on the validation set), one goal going forward would be to try and prevent it.

  1. Trying another model from TensorFlow Hub - Perhaps a different model would perform better on our dataset. One option would be to experiment with a different pretrained model from TensorFlow Hub or look into the tf.keras.applications module.
  2. Data augmentation - Take the training images and manipulate (crop, resize) or distort them (flip, rotate) to create even more training data for the model to learn from. Check out the TensorFlow images documentation for a whole bunch of functions you can use on images. A great idea would be to try and replicate the techniques in this example cat vs. dog image classification notebook for our dog breeds problem.
  3. Fine-tuning - The model we used in this notebook was directly from TensorFlow Hub, we took what it had already learned from another dataset (ImageNet) and applied it to our own. Another option is to use what the model already knows and fine-tune this knowledge to our own dataset (pictures of dogs). This would mean all of the patterns within the model would be updated to be more specific to pictures of dogs rather than general images.

If you're ever after more, one of the best ways to find out something is to search for something like:

  1. "How to improve a TensorFlow 2.x image classification model"
  2. "TensorFlow 2.x image classification best practices"
  3. "Transfer learning for image classification with TensorFlow 2.x"

And when you see an example you think might be beyond your reach (because it looks too complicated), remember, if in doubt, run the code. Try and reproduce what you see. This is the best way to get hands-on and build your own knowledge.

No one starts out knowing how to do every single thing. They just get better at knowing what to look for.