Fruit Drawing Classification Web-App

Gurkanwal Singh Kang
Mar 20, 2021

Written by: Gurkanwal, Nandini and Ritika

Introduction

CNNs have been the state-of-the-art computer vision technique for a long time now. Among the various sorts of neural networks (others include recurrent neural networks (RNN), long short-term memory networks (LSTM), artificial neural networks (ANN), etc.), CNNs are easily the most popular. These convolutional neural network models are ubiquitous in the image data space and work great on computer vision tasks such as image classification, image recognition, object detection, and many more. Building machine learning models has become a common task that seems uncomplicated to many, but deploying these models so that other people can use what you have built should be the next step. That is why, in this project, we not only train a CNN model to classify fruits based on drawings made by the user, but also create a web-app for it, so that everyone can use it and the whole process stays intuitive. Our primary focus with this project was to create a complete data science project, covering the frontend as well as the backend.

Related Work

One of the famous datasets in the machine learning field used for image-based classification/processing is the MNIST handwritten digit dataset [2]. MNIST (Modified National Institute of Standards and Technology) is a well-known dataset used in computer vision that was built by Yann LeCun et al. It is composed of images of handwritten digits (0–9), split into a training set of 60,000 images and a test set of 10,000, where each image is 28 x 28 pixels in height and width. This dataset is often used for practicing any algorithm made for image classification, because it is fairly easy to achieve good accuracy on.

There have been various datasets made along similar lines to MNIST, such as Fashion MNIST [3], Sign Language MNIST [4] and Chinese MNIST [5]. Various machine learning and deep learning models have been developed using these datasets to carry out different lines of research.

The dataset used in this project, the quickdraw dataset, was made just like the MNIST dataset: it likewise consists of 28x28 grayscale bitmaps in NumPy .npy format that can be ingested using np.load(). The 28x28 configuration makes the quickdraw dataset a perfect drop-in replacement for any existing code that processes MNIST data. So, if you're looking for something fancier than 10 handwritten digits, you can try processing over 300 different classes of doodles. William Malone [6] created a drawing app using HTML5 canvas and JavaScript that helped us design the frontend of our project when deploying it to the local server.

O. Rakhmanov [7] tested the classification accuracy of hand-drawn sketches with SVM and ANN, without using image feature extraction algorithms, and compared the results with the findings of a number of important state-of-the-art studies. Their findings show that the existing methods are reasonable to accept, while their own experiments also produced some valuable results. J. Li [8] introduced the fuzzy support vector machine (FSVM), in which a membership function is defined to classify images that are unclassifiable by a conventional SVM. As the input vector for both SVM and FSVM, they used a combined image feature histogram. Compared with the traditional SVM, FSVM shows equivalent results for images in the classifiable regions, and for those in the unclassifiable regions, it generates better results than SVM. D. Ciregan [9] proposed biologically plausible, wide and deep artificial neural network architectures that were the first to achieve near-human performance on the very competitive MNIST handwriting benchmark. On a traffic sign recognition benchmark, their method outperforms humans by a factor of two.

TOOLSET USED:

1.1. Python 3:

It is a programming language with many built-in libraries for deep learning, computer vision and many other applications, which makes building projects easier and more fun.

1.2. Keras (2.2.4):

It is an open-source library in Python used for building neural networks [11]. Keras speeds up the related processes to a great extent: it reduces the number of user actions needed for common use cases, and it gives clear, comprehensible and actionable error messages. This library is mainly used for convolutional networks, recurrent networks, or a combination of both [11].

1.3. TensorFlow (1.13.1):

It is a free and open-source software library for numerical computation, used extensively in the domain of machine learning. TensorFlow can be used across a variety of tasks but has a particular focus on training and inference of deep neural networks [12].

1.4. Flask:

Flask is a web framework that provides the tools, libraries and technologies needed to build a web application. Such an application can be anything from a blog or a wiki up to a web-based calendar application or a commercial website [13].

flask-ngrok (0.0.25): an easy way to demo Flask apps from your machine. It makes Flask apps running on localhost available over the web via the excellent ngrok tool [14].

1.5. Google Colab:

It is a product from Google Research that is especially suited to machine learning, data analysis and education, as it provides free GPU support to everyone. It allows anybody to write and execute arbitrary Python code through the browser [15].

DATASET DESCRIPTION

The dataset used for this project is the quickdraw dataset [16], which is open source. The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors and tagged with metadata such as what the player was asked to draw and in which country the player was located.

There are 4 formats in this dataset. First and foremost are the raw files, stored in (.ndjson) format. These files encode the complete set of data for every doodle and contain the timing information for each stroke of every picture drawn. There is also a simplified version, stored in the same format (.ndjson), which has some preprocessing applied to normalize the data. The simplified version is additionally available as a binary format for more efficient storage and transfer, and there are examples of how to read the files using both Python and NodeJS. The last format uses the simplified data and renders it into a 28x28 grayscale bitmap in NumPy .npy format. These files can be ingested using np.load().

We have used this .npy format for our CNN model. Our dataset consists of .npy files covering 5 classes of fruits: Apple, Banana, Grapes, Pineapple and Strawberry; sample drawings for these classes are shown in Fig. 1.

Fig. 1 Sample drawings from the dataset used
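As a minimal sketch of how these .npy files can be loaded (the file name below is illustrative, following the naming of the downloadable numpy_bitmap files):

import numpy as np

# Each .npy file holds one class: an array of shape (num_samples, 784),
# i.e. flattened 28x28 grayscale bitmaps.
apples = np.load('apple.npy')
print(apples.shape)                        # (num_samples, 784)
first_drawing = apples[0].reshape(28, 28)  # one doodle as a 28x28 grid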

PROPOSED METHOD

Fig. 2 shows the workflow of our project; each step is discussed in the upcoming sections.

Fig. 2 Project Workflow

Along with this, we imported the dependencies required in our project for the backend (model generation and training) and the frontend.

Fig. 3 Dependencies required while implementing the backend
Fig. 4 Dependencies required while implementing the frontend

2.1. Data Preprocessing:

The images in our quickdraw dataset are already preprocessed to a uniform 28 x 28 pixel size. We needed to combine our data so we could use it for training and testing; we only used 10,000 samples for this model. We then split the features and labels (X and y). Finally, we split the data between train and test in an 80–20 ratio. As the pixels of a grayscale image lie between 0 and 255, we normalized the values to between 0 and 1 (X/255). These steps are shown in Figs. 5, 6, 7 and 8.

Fig. 5 Loading Dataset
Fig. 6 Adding column for label for each class
Fig. 7 Merging arrays and splitting the feature and labels
Fig. 8 Encoding and Reshaping dataset
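Since the code figures are not reproduced here, the following is a minimal sketch of these preprocessing steps. The file paths, the per-class sample count and the channels-last reshape are assumptions, not the exact notebook code:

import numpy as np
from sklearn.model_selection import train_test_split
from keras.utils import np_utils

classes = ['apple', 'banana', 'grapes', 'pineapple', 'strawberry']

# Load a fixed number of samples per class (Fig. 5)
data = [np.load(name + '.npy')[:10000] for name in classes]

# Append a label column to each class array (Fig. 6)
labeled = [np.c_[arr, np.full(len(arr), i)] for i, arr in enumerate(data)]

# Merge the arrays and split features from labels (Fig. 7)
merged = np.concatenate(labeled)
X, y = merged[:, :-1], merged[:, -1]

# 80-20 train/test split, then normalization, reshaping and
# one-hot encoding of the labels (Fig. 8)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)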

2.2. Model development:

The CNN model used to classify fruit drawings is designed as:

• Convolutional Layer: Filters = 30, Kernel size = (3 * 3)

• Max Pooling Layer: Pool size = (2 * 2)

• Convolutional Layer: Filters = 15, Kernel size = (3 * 3)

• Max Pooling Layer: Pool size = (2 * 2)

• DropOut Layer: Dropping 20% of neurons.

• Flatten Layer

• Dense/Fully Connected Layer: Neurons = 128, Activation function = Relu

• Dense/Fully Connected Layer: Neurons = 50, Activation function = Softmax

Input shape (channels x height x width): 1 x 28 x 28

We ran our model for 15 epochs with a batch size of 200, as shown in Fig. 9.

Fig. 9 Model Training
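As the training code is only shown as an image, here is a hedged reconstruction of the architecture listed above. Two assumptions are made: Keras's default channels-last input shape (28, 28, 1) is used instead of the channels-first 1 x 28 x 28 given above, and the output layer has 5 units to match the 5 fruit classes (the list above says 50, which appears to be a typo):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential([
    Conv2D(30, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(15, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.2),                    # drop 20% of neurons
    Flatten(),
    Dense(128, activation='relu'),
    Dense(5, activation='softmax'),  # one output per fruit class
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# 15 epochs with a batch size of 200, as described above
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          epochs=15, batch_size=200)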

2.3. Results and Discussion:

After training the above CNN model, the classifier reached 93.96% accuracy after 15 epochs, which is sufficient for the recognition app.

Fig. 10 Final Accuracy

The final accuracy is shown in Fig. 10 and the confusion matrix in Fig. 11.

Fig. 11 Confusion Matrix
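A short sketch of how such a confusion matrix can be computed from the trained model (the post does not show this code):

import numpy as np
from sklearn.metrics import confusion_matrix

# Predicted class = index of the highest softmax probability;
# np.argmax also undoes the one-hot encoding of the test labels.
y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)
print(confusion_matrix(y_true, y_pred))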

As we can see, most of the drawings were classified correctly. However, some classes seem to be harder to differentiate than others: Apple with Grapes, since both have round figures, or Pineapple with Strawberry, as both have patterns inside their body. This could have happened because of the similarities in the shapes of the drawings. Some images that were misclassified by the model are shown in Fig. 12.

Fig. 12 Misclassified Images

2.4. Saving the Model:

Now that our model is ready, we would like to embed it into a Flask web-app. To do so, it is convenient to save (serialize) our model using pickle, as shown in Fig. 13.

Fig. 13 Saving the model using pickle
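A minimal sketch of this step, assuming the model object pickles cleanly in the environment used; Keras models backed by TensorFlow sessions sometimes refuse to pickle, in which case the commented HDF5 alternative is the usual fallback:

import pickle

# Serialize the trained model so the Flask app can load it later
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Fallback if the model object cannot be pickled:
# model.save('model.h5')   # reload with keras.models.load_model('model.h5')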

DEVELOPING OUR DRAWING WEB-APP WITH FLASK

3.1. Flask

Flask is a web micro-framework written in Python. It allows you to design a solid and professional web application. In order to create the frontend of a project, it is advised to follow this kind of structure:

· app.py: It consists of the main code which will run the Flask application. It contains the various routes for our application, answers HTTP requests and chooses what to display in the templates. In our case, it also calls our CNN classifier, runs the pre-processing steps on our input and makes the prediction.

· Templates folder: A template is an HTML file which can receive Python objects and is linked to the Flask application. Our HTML pages are therefore stored in this folder.

· Static folder: Style sheets, scripts, images and other static elements are usually stored in this folder. We did not create this folder, since we used inline JavaScript and CSS in the HTML files.

We made use of the flask-ngrok library to run our application from Google Colab. For the deployment of our project, we required these files (a minimal skeleton is sketched after the list):

· Frontend.ipynb file

· Saved model .pkl file

· draw.html and results.html in the templates folder
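As an illustration of how these pieces fit together, here is a minimal skeleton of what frontend.ipynb sets up; the route bodies are elided here and filled in later in this post:

import flask
from flask_ngrok import run_with_ngrok

app = flask.Flask(__name__, template_folder='templates')
run_with_ngrok(app)  # exposes the Colab-hosted app via a public ngrok URL

@app.route('/')
def draw():
    # Default route: serve the drawing page
    return flask.render_template('draw.html')

@app.route('/predict', methods=['POST'])
def predict():
    # Decode the submitted drawing, run the model, return results.html
    ...

app.run()  # with flask-ngrok, this also starts the ngrok tunnel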

3.2. Get the user Input

The second part of this project is to take input from the user in the form of a drawing, which will then be classified by our trained model. To do so, we first design the drawing area using JavaScript and HTML5.

· We used inline JavaScript and CSS in these HTML Files.

· We set our drawing area with the <canvas> tag.

· Then the drawCanvas() JavaScript function, contained in draw.js, is called.

· We initialize our form in order to make use of the POST method to send data to our flask instance/app.py.

· action = "{{ url_for('predict') }}" is again Jinja syntax; it specifies the route in app.py that the form is submitted to.

· We add an extra hidden field to our form which will be used to transfer the image: <input type="hidden" id="url" name="url" value="">

This JavaScript code allows us to style and interact with our drawing area.

· The drawCanvas() initialises the canvas’ main functions that will allow interactions with the user’s mouse.

· When the user clicks on the canvas, addClick() saves the cursor’s position.

· redraw() is used to clear and redraw the canvas.

The web-app looks as shown in Fig. 14.

Fig. 14 Web-App frontend

Before being sent through the form, the image drawn in the box is encoded in base64 and placed in the hidden input field set earlier. frontend.ipynb reverses this encoding process.

save() is called when the user clicks on the 'Submit your drawing' button. It sends the base64-encoded image through the form.

3.3. Make predictions

The images drawn in the frontend are fed to our saved model with the help of frontend.ipynb, where the Flask implementation is done.

Main points of code written in frontend.ipynb:

• Initializing the app and specifying the template folder, which is done using this line of code: app = flask.Flask(__name__, template_folder='templates')

• Define the routes (only two for our app):

· @app.route('/') : It is our default path and returns the draw.html template.

· @app.route('/predict') : It is called when clicking on the 'Submit your drawing' button. It returns the results.html template after processing the user input.

• The POST action triggers the predict function from the form, which then proceeds as follows (a consolidated code sketch is given after the list):

• Firstly, access the base64 encoded drawing input with request.form['url'], where 'url' is the name of the hidden input field in the form which contains the encoded image.

• Then decode the image and set it into an array.

• Followed by resizing and reshaping the image to get a 28 * 28 input for our model.

• Perform the prediction using our CNN classifier.

• Since model.predict() returns a probability for each class in a single array, we must find the array's highest probability and get the corresponding class from a pre-defined dictionary.

• In the end, return the results.html template and pass the previously made prediction as parameter:

return render_template('results.html', prediction=final_pred)
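Putting these steps together, here is a hedged sketch of the predict route. The use of PIL, the regex stripping of the data-URL prefix, and the grayscale/normalization details are assumptions, since the post does not show this code; model and app are the objects defined earlier:

import base64
import io
import re

import numpy as np
from PIL import Image
from flask import request, render_template

# Pre-defined dictionary mapping output indices to fruit names
classes = {0: 'apple', 1: 'banana', 2: 'grapes', 3: 'pineapple', 4: 'strawberry'}

@app.route('/predict', methods=['POST'])
def predict():
    # The base64-encoded drawing arrives in the hidden 'url' field
    data_url = request.form['url']
    img_bytes = base64.b64decode(re.sub('^data:image/.+;base64,', '', data_url))

    # Decode to grayscale and resize to the model's 28x28 input
    img = Image.open(io.BytesIO(img_bytes)).convert('L').resize((28, 28))
    arr = np.array(img).astype('float32') / 255
    arr = arr.reshape(1, 28, 28, 1)

    # Predict and take the class with the highest probability
    probs = model.predict(arr)
    final_pred = classes[int(np.argmax(probs))]

    return render_template('results.html', prediction=final_pred)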

3.4. Display the results

Finally, to display the prediction computed in the frontend.ipynb file, we use results.html. The last step is to launch our web-app and test our project using flask-ngrok, as shown in Fig. 15.

Fig. 15 Running app using flask-ngrok

Here are the test results, as shown in Figs. 16, 17 and 18.

Fig. 16 Test Result 1
Fig. 17 Test Result 2
Fig. 18 Test Result 3

CONCLUSION

In this project, we have seen how to develop a Flask-based drawing web-app that uses a previously built CNN model to classify drawings made by the user. This is just one of the many possible use cases of Flask for deploying machine learning models; in fact, countless others can be found. This project could be used in association with an Android application made for kids so that they could learn shapes, fruits, etc. in an intuitive way. It included every step required to make a complete data science project.
