Deep Neural Networks Detect Sexual Orientation From Faces
A recent paper by Stanford researchers Yilun Wang and Michal Kosinski (due to be published in the Journal of Personality and Social Psychology) on detecting people’s sexual orientation from a few facial images has caused a stir in two communities, delighting one and dismaying the other. For data scientists it is welcome news, another sign that machine learning algorithms keep getting more perceptive. For gay and lesbian people it raises serious concerns, because any such analysis threatens to compromise their privacy.
Using facial features derived from just a single image, the model correctly distinguishes between gay and heterosexual men in 81% of cases and between gay and heterosexual women in 71% of cases. With five images per person, the accuracy rises to 91% for men and 83% for women. The higher accuracies in the five-image case can be misleading: because the train-test split was randomised, having more images per person increases the chance that images of the same person land in both the training and the test set. Even allowing for that, the results are a large improvement over what human judges achieved (61% accuracy for men and 53% for women) in a similar, independent study conducted in 2009.
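One common way to guard against the same person appearing on both sides of a split is to split by person rather than by image. The following is a minimal sketch of that idea, not the procedure used in the paper; the function name `split_by_person` and the toy `person_ids` are assumptions made for illustration.

```python
import random

def split_by_person(person_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no person appears in both sets.

    person_ids[i] is the (hypothetical) identifier of the person in image i.
    """
    people = sorted(set(person_ids))
    rng = random.Random(seed)
    rng.shuffle(people)
    # Hold out whole people, not individual images.
    n_test = max(1, round(len(people) * test_fraction))
    test_people = set(people[:n_test])
    train_idx = [i for i, p in enumerate(person_ids) if p not in test_people]
    test_idx = [i for i, p in enumerate(person_ids) if p in test_people]
    return train_idx, test_idx

# Toy data: five images each for ten made-up people.
person_ids = [f"person_{k}" for k in range(10) for _ in range(5)]
train_idx, test_idx = split_by_person(person_ids)
```

With this kind of split, adding more images per person cannot leak a test person's face into the training set, so any accuracy gain has to come from the extra information in the images.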
The algorithm is a cross-validated logistic regression that takes facial features extracted by VGG-Face as its input. VGG-Face is a deep neural network pre-trained on a different dataset for a different purpose (face recognition); using it as a fixed feature extractor reduces the risk of overfitting.
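To make the idea of cross-validated logistic regression on fixed feature vectors concrete, here is a small self-contained sketch. It is not the authors' pipeline: the feature extraction step is not shown, random synthetic vectors stand in for face embeddings, and all function names and hyperparameters (`fit_logistic`, `lr`, `epochs`, `l2`, `k`) are assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=200, l2=0.01):
    """Fit weights and bias by batch gradient descent on L2-penalised log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, X):
    # Predict class 1 when the estimated probability is at least 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

def cross_validated_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds, each fold held out once for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        preds = predict(w, b, [X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return sum(scores) / k

# Synthetic stand-in for embedding vectors: two Gaussian clusters in 4 dimensions.
rng = random.Random(42)
X, y = [], []
for label in (0, 1):
    centre = -1.0 if label == 0 else 1.0
    for _ in range(50):
        X.append([rng.gauss(centre, 1.0) for _ in range(4)])
        y.append(label)

accuracy = cross_validated_accuracy(X, y)
```

Because the classifier only learns a handful of weights on top of features that were fixed in advance, there are far fewer parameters to overfit than if a deep network were trained from scratch on the same images.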
The dataset used for training and validating this algorithm consisted of 35,326 images of 14,776 different individuals. It consisted largely of White Americans, which makes it fairly local in nature, so any generalization beyond that population looks unwarranted. The images were also sourced from a dating website, where cues to sexual orientation are presumably especially prominent, which makes the general validity of the study debatable. The researchers, as discussed in the paper, took a few measures to guard against these weaknesses. They had human judges classify images from the same dataset and verified that the judges’ accuracy was consistent with earlier studies. They also sourced another set of images from social networking sites such as Facebook and checked that the results held up. In addition, they used AUC (Area Under the Curve) as the evaluation metric; because AUC is sensitive to performance on both classes, it addresses the problem of class imbalance in the dataset.
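The following sketch illustrates why AUC is more informative than plain accuracy when classes are imbalanced. It computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The toy labels and scores are made up for illustration and are not data from the paper.

```python
def auc(scores, labels):
    """AUC as P(score of random positive > score of random negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Imbalanced toy set: 2 positives and 8 negatives.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.05]

# A model that always answers "negative" is right 80% of the time here...
always_negative_accuracy = labels.count(0) / len(labels)
# ...but a constant score carries no ranking information, so its AUC is 0.5.
constant_auc = auc([0.0] * len(labels), labels)
# The toy scores above rank 15 of the 16 positive-negative pairs correctly.
ranking_auc = auc(scores, labels)
```

In this example the trivial always-negative model reaches 80% accuracy yet scores only 0.5 on AUC, the same as random guessing, whereas a model has to rank examples of both classes well to get an AUC close to 1.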
This research gives a considerable push to the field of data science and also adds to our understanding of social psychology. Nonetheless, there is no denying that it could become a weapon in the hands of radical groups that promote fear of and hatred towards an already marginalized community. The authors address this issue in the discussion section of the paper, stating clearly that they do not want to propagate the idea that sexuality can be, and more importantly should be, determined from facial features. Yet the fact that the algorithm was built with openly available libraries, and that it would not be too difficult for someone less responsible to build something similar, makes the situation worrisome. In that light, the paper serves to make the concerned and affected communities aware that something like this could exist.
This research also plunges us back into the age-old debate over physiognomy (determining a person’s character traits from their facial features). In our honest personal opinion, it only weakens the arguments of the better side of that debate. Although the scientific community is not yet fully convinced of the validity of the finding, most of the criticism the paper has received is unrelated to its validity. The more important debate is whether such research puts too much power in the hands of oppressors, leaving the already underprivileged exposed to even greater threat. If it does, then how and where do we stop?
Read the preprint version of the paper at https://psyarxiv.com/hv28a/.