{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python for Signal Processing\n", "Danilo Greco, PhD - danilo.greco@uniparthenope.it - University of Naples Parthenope" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "9WjsBPzEFKMW" }, "source": [ "# Deep Learning Example\n", "\n", "This example is inspired by [MIT Deep Learning](https://deeplearning.mit.edu).\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "0SyO72xOFKMX" }, "source": [ "## Pre-requisites:\n", "\n", "You can run this notebook locally on a Jupyter installation, but we recommend running it in the cloud on Google Colab, where you can test the execution on both the standard CPU runtime and a GPU-powered one.\n", "\n", "You can switch between them by selecting the following menu items: Edit $\\rightarrow$ Notebook settings $\\rightarrow$ Hardware accelerator\n", "\n", "Local execution requires you to [install TensorFlow](https://www.tensorflow.org/install/), which is not always an easy task. The installation includes [tf.keras](https://www.tensorflow.org/guide/keras), which\n", "is TensorFlow's high-level API for building and training deep learning models. 
\n", "\n", "You may find detailed documentation at [Keras Guide](https://www.tensorflow.org/guide/keras).\n", "\n", "## Required imports:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 34 }, "colab_type": "code", "executionInfo": { "elapsed": 9048, "status": "ok", "timestamp": 1588525634389, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "gHoRYxKGFKMe", "outputId": "da845f03-5c00-4788-9936-c213e58a21d5" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.5.0\n" ] } ], "source": [ "# TensorFlow and tf.keras\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense\n", "\n", "# Commonly used modules\n", "import numpy as np\n", "import os\n", "import sys\n", "\n", "# Images, plots, display, and visualization\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "\n", "print(tf.__version__)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "YRGrsd9uFKOb" }, "source": [ "## Example: classification of MNIST dataset with Convolutional Neural Networks\n", "\n", "We are now going to build a convolutional neural network (CNN) classifier to classify images of handwritten digits in the MNIST dataset.\n", "\n", "The MNIST dataset contains 70,000 grayscale images of handwritten digits at a resolution of 28 by 28 pixels. The task is to take one of these images as input and predict the most likely digit contained in the image (along with a relative confidence in this prediction):\n", "\n", "\n", "\n", "The images are 28x28 NumPy arrays, with pixel values ranging between 0 and 255. 
\n", "\n", "The *labels* are an array of integers, ranging from 0 to 9.\n", "\n", "The dataset can be loaded directly through `keras.datasets`, which downloads it on first use:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 51 }, "colab_type": "code", "executionInfo": { "elapsed": 1588, "status": "ok", "timestamp": 1588525640927, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "34UQqKpyFKOk", "outputId": "67910638-9af9-4c86-a654-791b627dbddc" }, "outputs": [], "source": [ "(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()\n", "\n", "# add a trailing channel axis: the images are single-channel (grayscale)\n", "train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)\n", "test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "CHsgfXYLFKOs" }, "source": [ "Before feeding the images to the neural network model, let's rescale the pixel values to the range 0 to 1 by dividing the values of both the *training set* and the *testing set* by 255." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": {}, "colab_type": "code", "id": "NLeTnDZWFKOv" }, "outputs": [], "source": [ "def preprocess_images(imgs): # works for a single image (28x28 or 28x28x1) and for a batch of images\n", " single = imgs.ndim == 2 or imgs.shape == (28, 28, 1)\n", " sample_img = imgs if single else imgs[0]\n", " assert sample_img.shape in [(28, 28, 1), (28, 28)], sample_img.shape # make sure images are 28x28 and single-channel (grayscale)\n", " return imgs / 255.0\n", "\n", "train_images = preprocess_images(train_images)\n", "test_images = preprocess_images(test_images)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ftNdQX_ZFKPD" }, "source": [ "Display the first *count* images from the *training set*, with the class label below each image." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 95 }, "colab_type": "code", "executionInfo": { "elapsed": 2599, "status": "ok", "timestamp": 1588525661367, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "WvQDj48KFKPH", "outputId": "a8e27fed-bf30-496b-8e9d-ebd2d34f9ba6" }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "count = 15\n", "plt.figure(figsize=(16,1))\n", "for i in range(count):\n", " plt.subplot(1,count,i+1)\n", " plt.xticks([])\n", " plt.yticks([])\n", " plt.grid(False)\n", " plt.imshow(train_images[i].reshape(28, 28), cmap=plt.cm.binary)\n", " plt.xlabel(train_labels[i])" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "EqYEqUJxFKPS" }, "source": [ "### Build the model\n", "\n", "Building the neural network requires configuring the layers of the model, then compiling the model. Here follows one possible configuration that produces good results on the MNIST dataset:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 442 }, "colab_type": "code", "executionInfo": { "elapsed": 1066, "status": "ok", "timestamp": 1588526094624, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "oWOa5LOVFKPT", "outputId": "f810f4ed-ffce-4d53-b6d5-6015747ae0e4" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "conv2d (Conv2D) (None, 26, 26, 32) 320 \n", "_________________________________________________________________\n", "conv2d_1 (Conv2D) (None, 24, 24, 64) 18496 \n", "_________________________________________________________________\n", "max_pooling2d (MaxPooling2D) (None, 12, 12, 64) 0 \n", "_________________________________________________________________\n", "dropout (Dropout) (None, 12, 12, 64) 0 \n", "_________________________________________________________________\n", "flatten (Flatten) (None, 9216) 0 \n", 
"_________________________________________________________________\n", "dense (Dense) (None, 128) 1179776 \n", "_________________________________________________________________\n", "dropout_1 (Dropout) (None, 128) 0 \n", "_________________________________________________________________\n", "dense_1 (Dense) (None, 10) 1290 \n", "=================================================================\n", "Total params: 1,199,882\n", "Trainable params: 1,199,882\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "None\n" ] } ], "source": [ "model = keras.Sequential()\n", "# 32 convolution filters used each of size 3x3\n", "model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))\n", "# 64 convolution filters used each of size 3x3\n", "model.add(Conv2D(64, (3, 3), activation='relu'))\n", "# choose the best features via pooling\n", "model.add(MaxPooling2D(pool_size=(2, 2)))\n", "# randomly turn neurons on and off to improve convergence\n", "model.add(Dropout(0.25))\n", "# flatten since too many dimensions, we only want a classification output\n", "model.add(Flatten())\n", "# fully connected to get all relevant data\n", "model.add(Dense(128, activation='relu'))\n", "# one more dropout\n", "model.add(Dropout(0.5))\n", "# output a softmax to squash the matrix into output probabilities\n", "model.add(Dense(10, activation='softmax'))\n", "print(model.summary())" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "fUHo8mPbFKPb" }, "source": [ "Before the model is ready for training, it needs a few more settings. These are added during the model's *compile* step:\n", "\n", "* *Loss function* - measures how accurate the model is during training, we want to minimize this with the optimizer.\n", "* *Optimizer* - how the model is updated based on the data it sees and its loss function.\n", "* *Metrics* - used to monitor the training and testing steps. 
\"accuracy\" is the fraction of images that are correctly classified." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab": {}, "colab_type": "code", "id": "gPDJn26KFKPe" }, "outputs": [], "source": [ "model.compile(optimizer=tf.optimizers.Adam(), \n", " loss='sparse_categorical_crossentropy',\n", " metrics=['accuracy'])" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "CzydQVk1FKPm" }, "source": [ "### Train the model\n", "\n", "Being in a context of *supervised learning*, training the neural network model requires to call the `model.fit` method on the training data `train_images` and the relevant ground truth `train_labels`.\n", "\n", "To make predictions we use a test set `test_images` and check the predictions against the `test_labels` array. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 204 }, "colab_type": "code", "executionInfo": { "elapsed": 26577, "status": "ok", "timestamp": 1588526136887, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "UuerNpGCFKPn", "outputId": "02b987bf-c11b-4b23-a167-df97dc79b190" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(60000, 28, 28, 1)\n", "Epoch 1/5\n", "1730/1875 [==========================>...] 
- ETA: 19s - loss: 0.2041 - accuracy: 0.9383" ] } ], "source": [ "print(train_images.shape)\n", "history = model.fit(train_images, train_labels, epochs=5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 34 }, "colab_type": "code", "executionInfo": { "elapsed": 4267, "status": "ok", "timestamp": 1588526145251, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "QQWml8C1Hh2j", "outputId": "4777244a-6854-4538-9e00-fde981cb64a0" }, "outputs": [], "source": [ "# accuracy reported at the end of the last training epoch\n", "acc = history.history['accuracy'][-1]\n", "'Current accuracy on the training data is {:.2f} %'.format(acc*100)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "lMJhdLVsFKP4" }, "source": [ "### Evaluate generalization accuracy\n", "\n", "Model performance is assessed on the test dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 68 }, "colab_type": "code", "executionInfo": { "elapsed": 1589, "status": "ok", "timestamp": 1588526157021, "user": { "displayName": "alberto cabri", "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GiK1udzlVWHuhhAmhv--XV0kKVt6Vo722O-iPJD=s64", "userId": "13479437405885655509" }, "user_tz": -120 }, "id": "AaC6oI7iFKP5", "outputId": "4cbd4542-10a9-4b02-9a65-e18d69489ca3" }, "outputs": [], "source": [ "print(test_images.shape)\n", "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", "\n", "'Test accuracy: {:.2f} %'.format(test_acc*100)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "VnEM-SL5FKQA" }, "source": [ "Most often, the accuracy on the test dataset is a little lower than the accuracy on the training dataset. This gap between training accuracy and test accuracy is a symptom of *overfitting*. 
In our case, the test accuracy is better than the training accuracy: the Dropout layers regularize the model successfully, and because they are active only during training, the reported training accuracy is measured with dropout applied." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "name": "HW_Deep_Learning.ipynb", "provenance": [ { "file_id": "https://github.com/lexfridman/mit-deep-learning/blob/master/tutorial_deep_learning_basics/deep_learning_basics.ipynb", "timestamp": 1587905678271 } ], "toc_visible": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 1 }