Sneaky AI: Specification Gaming and the Shortcomings of Machine Learning

March 7th, 2019

This article is by Sydney Firmin and originally appeared on the Alteryx Data Science Blog here:


Artificial Intelligence is a very exciting field of study. It has always seemed like the stuff of science fiction. However, Artificial Intelligence (AI) is becoming more and more prevalent and ingrained in our society. Machine Learning, a sub-field of AI where computers learn how to solve a task by incrementally improving their performance, has become commonplace in a wide variety of industries and applications.

Examples of machine learning in business include the well-known filtering of spam emails or product reviewscredit card fraud detection, and even programming barbies to have interactive conversations.

Machine learning can be a very powerful way to solve problems. However, it is important to remember that machine learning and AI solutions can only be as good as the parameters and data it is given. There are many ways in which current machine learning and AI techniques are limited.

Examples of unexpected machine learning outcomes and AI behavior are everywhere.

Published in 2018, The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities (link) is a collection of (verified) anecdotes from researchers in artificial life and evolutionary computation on surprising adaptations and behaviors from evolutionary algorithms. It is another interesting and entertaining read that is well worth your time.

There is a fun blog called AI Weirdness dedicated to the unexpected behaviors and shortcomings of neural networks. If you have a chance to check it out, I would highly recommend it. A couple of my favorite experiments are Skyknit and Naming Guinea Pigs.

A very popular example of specification gaming is the deep neural network that was trained to identify potential skin cancer. The neural network was given a training set of thousands of images of benign and malignant skin lesions (moles). Instead of leveraging the features of the skin lesions to determine how to categorize an image, the neural network learned that images with a ruler in the frame were more likely to be malignant. This makes sense because malignant skin lesions are more likely to be photographed with a ruler for future documentation.

A similar example comes from the field of digital evolution algorithms, where researchers were investigating the problem of catastrophic forgetting in neural networks; where a neural network will learn a new skill at the cost of forgetting an old one. The researchers presented the neural network with food items one at a time, where half of the items were nutritious and the other half were poisonous (link). They found that high-performing neural networks were able to correctly identify edible food with almost no internal connections. Puzzled, they found after some investigation that the neural networks were exploiting the pattern in which edible food was presented to them, which was every other item. This issue was easily solved by randomizing the order of food options.