From an internet dating website, the authors selected photographs of nearly 8000 men and nearly 7000 women, equally distributed between heterosexual and homosexual orientations. They find that not only can their algorithm predict sexual orientation better than chance; it can also outperform human judgment. Their deep neural network does a better job of guessing sexual orientation based on facial photographs than do humans solicited via Amazon Mechanical Turk.

Based on this result, the authors make two further claims. First, they infer that the computer is picking up on features that humans are unable to detect. Second, they suggest that these features are the result of prenatal hormone exposure.

As illustrated in the diagram below, there are typically several steps that people go through when they construct quantitative arguments. First, they collect data. Second, they feed these data into some analytic machinery, represented below by the black box. Third, this machinery (ANOVAs, computational Bayesian analyses, deep learning algorithms, etc.) returns results of some sort.

While the initial news reports did a dismal job of explaining how accurate the algorithm is, the original research paper is pretty clear, and we do not intend to question the authors' methods here. In other words, we will assume that their input data are reasonable and that the black box is functioning as it should. Therefore, all we need to do is to clearly and carefully examine what is going into the black box and what comes out. In the present case study, we look at the output side: what interpretations or conclusions are reasonable from a given set of results?

We thank Michal Kosinski, whose work we address in this case study, for constructive feedback on the original text.