
First Name Based Gender Prediction with Convolutional Neural Network (CNN)

If you remember one of my previous articles about our ongoing research at ShoutOUT into enriching customers' contacts with Machine Learning techniques (I know you don't remember, so here's the link), I talked about guessing a person's gender from their name. Now here I am with the results of that research 🙂.


First of all, why did I use a Convolutional Neural Network (CNN) for this? Well, I just wanted to see how a CNN would perform 🙂. A CNN may not be the optimal solution here, but the results I got are amazing. As we all know (I hope you know about CNNs), CNNs are mainly used in domains like image classification and voice recognition. You will find comparatively few articles about text classification with CNNs.


There’s one more thing: I used Deep Learning Studio, provided by DeepCognition FREE OF CHARGE (yes, it is FOC). Kudos to them for making my life easier and easier.


These results are the outcome of many days of hard work and plenty of trial and error. Finally, I am here to present the final fine-tuned model, which I built with Deep Learning Studio. The dataset I used is the US Census database.


The following is my model.

Figure: CNN model for gender classification


First of all, I wrote a preprocessor that keeps only unique records whose names are at most 10 characters long. Each character of the name is then replaced by its index in the English alphabet (a = 1, ..., z = 26). If the name is shorter than 10 characters, the remaining positions are padded with 0 until the length is 10. After preprocessing, I had 96236 records, which I split 80/10/10 into training, validation and test sets, respectively.

Ex: John – 10;15;8;14;0;0;0;0;0;0
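To make the preprocessing concrete, here is a minimal Python sketch of the encoding and the 80/10/10 split. It is not the actual preprocessor: the function names `encode_name` and `split_80_10_10` are made up for illustration, and shuffling before the split (with a fixed seed) is an assumption, since the shuffling strategy is not described above.

```python
import random

MAX_LEN = 10  # maximum name length kept by the preprocessor


def encode_name(name, max_len=MAX_LEN):
    """Map letters a-z to 1-26 and right-pad with 0 up to max_len.

    Returns None for names that are too long or contain characters
    outside the English alphabet (such records are simply skipped here).
    """
    name = name.lower()
    if len(name) > max_len or not name.isascii() or not name.isalpha():
        return None
    codes = [ord(ch) - ord("a") + 1 for ch in name]
    return codes + [0] * (max_len - len(codes))


def split_80_10_10(records, seed=0):
    """Shuffle (an assumption) and split into train/validation/test."""
    records = list(records)
    random.Random(seed).shuffle(records)
    n_train = len(records) * 8 // 10
    n_val = len(records) // 10
    train = records[:n_train]
    val = records[n_train:n_train + n_val]
    test = records[n_train + n_val:]
    return train, val, test


print(encode_name("John"))  # [10, 15, 8, 14, 0, 0, 0, 0, 0, 0]
```

With 96236 records, this integer split gives 76988 training, 9623 validation and 9625 test records; the exact counts in the real run may differ by a record or two depending on rounding.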



Each input is converted by an embedding layer into vectors of output dimension 280. Three convolutional layers with filter sizes 1, 2 and 3 and tanh activation are then applied. The output of each convolutional layer passes through a global max pooling layer, and the results are merged. The merged result goes through a fully connected layer with sigmoid activation and a dropout layer with a ratio of 0.2. Finally, it goes through another fully connected layer with output dimension 2 and sigmoid activation.
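For readers who prefer code to diagrams, the architecture can be roughly sketched in Keras as follows. This is only an approximation of the Deep Learning Studio model, not an export of it. The number of filters per convolution (`n_filters`), the size of the first fully connected layer (`hidden`), and the vocabulary size of 27 (26 letters plus the padding index 0) are assumptions that are not stated above, and the three convolutions are read as parallel branches because their outputs are merged.

```python
from tensorflow.keras import layers, Model


def build_model(max_len=10, vocab_size=27, embed_dim=280,
                n_filters=128, hidden=64):
    """Sketch of the CNN; n_filters and hidden are assumed values."""
    inp = layers.Input(shape=(max_len,), dtype="int32")
    # Each character index becomes a 280-dimensional vector.
    x = layers.Embedding(vocab_size, embed_dim)(inp)

    # Three convolutions with filter sizes 1, 2 and 3, each followed
    # by global max pooling over the character positions.
    branches = []
    for kernel_size in (1, 2, 3):
        conv = layers.Conv1D(n_filters, kernel_size, activation="tanh")(x)
        branches.append(layers.GlobalMaxPooling1D()(conv))

    merged = layers.Concatenate()(branches)
    h = layers.Dense(hidden, activation="sigmoid")(merged)
    h = layers.Dropout(0.2)(h)
    out = layers.Dense(2, activation="sigmoid")(h)
    return Model(inputs=inp, outputs=out)


model = build_model()
model.summary()
```

Filter sizes 1, 2 and 3 let the network look at single letters, letter pairs and letter triples, and global max pooling keeps the strongest response of each filter regardless of where in the name it occurs.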



Figure: training accuracy

The training accuracy is around 93%.

Figure: training loss

The training loss is as low as 0.169.

Figure: validation accuracy

The maximum validation accuracy I got is around 90%.

Figure: validation loss

The validation loss is around 0.25.

As you may have noticed, the validation performance is a little lower than the training results. The model may be slightly overfitted; that is something I still have to figure out. Anyway, I am really happy with the results I got. 🙂

Your comments are welcome. 🙂

If you are interested, I can share my Deep Learning Studio model for you to try out. Just comment below. Thanks again to Deep Learning Studio for providing such a valuable product FOC.