Successful AI Comes From Diversity and Teamwork

May 23rd, 2018

AIs are Individuals, Just Like People

You’ve heard that everyone’s fingerprints are unique. Even the fingerprints of identical twins are different from each other! Just like their fingerprints, every human is unique. People are born into this world with their own personalized set of DNA, followed by a life of unique experiences. One person’s weakness can be another person’s strength. This is valuable because when people interact and build relationships, they learn to collaborate. It’s like the individual pieces of a giant puzzle coming together to create something new.




And, believe it or not, any instance of artificial intelligence (AI) can also be considered a unique “individual.” AIs are built on different algorithms, which contain the specific steps and instructions for the AI to function, similar to how DNA contains the specific traits of a person. The life experiences of humans are mirrored in the different datasets that are used to train the algorithms. Each algorithm has its own strengths and weaknesses, which determine the types of problems, datasets, and industries it is best suited to. For example, an algorithm that is best at processing text data (such as that used in medical diagnoses) may not be as strong at complex numeric data (such as that used for assessing financial risk).

Different datasets and different business problems are best solved using different algorithms. Every dataset contains unique information that reflects the individual events and characteristics of a business. Each dataset tells its own story and reveals its own patterns, giving businesses insights into which areas are doing well and which areas need help. Because situations and conditions vary so much, no single algorithm can successfully solve every business problem or fit every dataset. As a result, we don’t know in advance which algorithm will work best for a particular use case, and the clever people who build AIs end up spending a lot of time building, tuning, and comparing models.


AIs Can Be Formed Into Teams, Just Like People

When building a team of people, diversity is a vital element for the team’s success and growth. Various studies from around the world have shown that diversity in the workplace results in more accurate group thinking, a more careful and detailed approach to problem solving, and more innovation. A 2015 McKinsey report on 366 public companies found that those in the top quartile for ethnic and racial diversity in management were 35% more likely to have financial returns above their national industry median, and those in the top quartile for gender diversity were 15% more likely to have returns above that median.

Unique experiences and perspectives promote critical thinking and allow individuals to challenge each other and prevent personal biases from eliminating the best solution. A recent Harvard Business Review article said, “Diverse teams are more likely to constantly reexamine facts and remain objective. They may also encourage greater scrutiny of each member’s actions, keeping their joint cognitive resources sharp and vigilant.”




By avoiding groupthink, diverse teams of people are more innovative. A team of people with the same background and experiences is less likely to try new things or take new risks because its scope of knowledge is so narrow. Diverse teams widen the scope of knowledge and experience, allowing team members to learn from each other and join forces to create the new opportunities that more often lead to innovation.

Diversity is not only important to people, but is also important to technology. In data science jargon, teams of algorithms are called “ensembles” or “blenders.” Ensembles are similar to a project team: each algorithm’s strengths balance out the weaknesses of another, just as brainstorming among diverse people overcomes biases that narrow their ability to find the best solutions. Ensemble models typically outperform individual AIs because of their diversity, and the more data that is fed into the models, the more they “learn” and the stronger they can become when blended together.
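To make the idea of strengths balancing out weaknesses concrete, here is a minimal sketch of the simplest kind of blender: averaging the predictions of two models row by row. The numbers are toy values invented for illustration (one model that tends to predict too high, one that tends to predict too low), and the function names are hypothetical rather than taken from any particular library.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors (lower is better)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def blend(*prediction_lists):
    """Average the predictions of several models, row by row."""
    return [mean(row) for row in zip(*prediction_lists)]


# Toy numbers, invented for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
model_a = [12.0, 23.0, 31.0, 44.0]  # tends to predict too high
model_b = [9.0, 16.0, 28.0, 37.0]   # tends to predict too low

blended = blend(model_a, model_b)

print("model A error:", mean_absolute_error(actual, model_a))
print("model B error:", mean_absolute_error(actual, model_b))
print("blended error:", mean_absolute_error(actual, blended))
```

In this toy case each model is off by 2.5 on average, while the blend is off by only 0.5, because the two models err in opposite directions. Two models that made the same mistakes would gain nothing from being averaged, which is why diversity matters.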


Benchmarking AI Diversity vs. Accuracy

I decided to test the benefits of diversity by training dozens of algorithms using 27 different datasets from a range of industries such as healthcare, finance, real estate, telecommunications, and energy. The algorithms ranged from simple linear models all the way up to deep learning. First, I wanted to see whether there was a “best” algorithm that works for many business problems. Then, I built ensembles of algorithms and compared their accuracy to that of individual algorithms, to see whether teams of AIs outperform individual AIs.
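The bookkeeping behind this kind of benchmark can be sketched in a few lines. This is not the code used for the experiment; it is a minimal illustration that assumes each algorithm has already been scored on each dataset with an error metric where lower is better. The dataset names, algorithm names, and scores are placeholders, not results from the benchmark.

```python
from collections import Counter


def count_wins(scores_by_dataset):
    """Count how often each algorithm has the lowest error on a dataset.

    scores_by_dataset maps dataset name -> {algorithm name: error}.
    """
    wins = Counter()
    for scores in scores_by_dataset.values():
        best = min(scores, key=scores.get)  # algorithm with the lowest error
        wins[best] += 1
    return wins


# Placeholder scores for illustration only, NOT results from the benchmark.
scores = {
    "dataset_1": {"linear": 0.30, "tree": 0.25, "boosting": 0.22},
    "dataset_2": {"linear": 0.18, "tree": 0.21, "boosting": 0.20},
    "dataset_3": {"linear": 0.40, "tree": 0.33, "boosting": 0.35},
    "dataset_4": {"linear": 0.27, "tree": 0.29, "boosting": 0.24},
}

wins = count_wins(scores)
for algorithm, n in wins.most_common():
    print(algorithm, n)
```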

Below is the count of how many times a particular algorithm was the most accurate individual algorithm for any of the datasets:


More than half of the algorithms that were able to rank first on a dataset were not the most accurate for any other dataset!


The diversity of algorithms earning top accuracy rankings demonstrates the need to test as many different algorithms as possible to find the best one for your data.


You may have heard a lot of buzz about deep learning, and I’ve met people who believe the hype and want to use deep learning to solve every business problem. Here’s how deep learning algorithms ranked on typical business problems:


The best ranking a deep learning algorithm earned was 7th place. Now, this does not mean that deep learning is worthless, but rather that it tends to perform best on specialist data types such as images.


Moving on to ensembles, I looked at their accuracy rankings versus individual algorithms:


Ensembles were almost always more accurate than individual algorithms, and when they weren’t in first place, they came second by only the slightest of margins.

How diverse were these ensembles? We can see the results below:


Most commonly, the top-ranked ensembles were composed of unique algorithms. For any given row of data, a different algorithm may be the most accurate one. Allocating optimal voting powers to each algorithm ensures that if one algorithm gets a particular row wrong, the other algorithms can balance it out, improving accuracy.
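Here is a minimal sketch of how voting powers can let the rest of an ensemble outvote one algorithm's mistake on a given row. The model names, weights, and predictions are hypothetical, and the weights are simply hand-picked here; finding optimal weights is a separate fitting step that this sketch does not attempt.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total voting power for one row.

    predictions maps model name -> predicted label for the row.
    weights maps model name -> voting power.
    """
    totals = defaultdict(float)
    for model, label in predictions.items():
        totals[label] += weights[model]
    return max(totals, key=totals.get)


# Hypothetical, hand-picked voting powers (not optimized).
weights = {"model_a": 0.40, "model_b": 0.35, "model_c": 0.25}

# Each dict holds the three models' predictions for one row of data.
rows = [
    {"model_a": "yes", "model_b": "yes", "model_c": "no"},
    {"model_a": "no", "model_b": "yes", "model_c": "yes"},
    {"model_a": "no", "model_b": "no", "model_c": "yes"},
]

results = [weighted_vote(row, weights) for row in rows]
print(results)
```

In the second row, the model with the most voting power disagrees with the other two, and their combined weight (0.60 against 0.40) decides the outcome.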


Why Creating AI Diversity Is Difficult

There are several obstacles to effectively creating AI diversity. To start, machine learning libraries are notoriously difficult to install on a single computer because of incompatible system requirements and versions. Even if we can get all of the libraries onto a single computer, these models are typically manually designed and constructed, which is time-consuming and error-prone.

As a result, staff don’t have the time to look for the best algorithm for each problem. Staff members often develop a favorite, go-to algorithm and use it even when it is not the best fit for the dataset being analyzed. This algorithm bias is a convenient time-saver, but it means other algorithms are less likely to be considered, which reduces diversity on any project that person works on. Without diversity, accuracy suffers, which can translate into lower sales and profit.

If your staff can’t find the time to find the best algorithm for each problem, there is no way that they will have the time to also ensemble diverse models. Ensemble models take 10 times longer to build than a single model. There is too much to do, not enough time, and not nearly enough people.


Why Creating AI Diversity Is Easy

There is a way past all of these barriers that doesn’t require waiting and hoping that more staff are hired. The solution is efficient and accurate: replace manual AI building with automated machine learning.

With just one click, DataRobot’s automated machine learning platform automatically:

  • Chooses a short list of AI algorithms to try on your data.
  • Virtualizes each AI library along with the appropriately supported system versions.
  • Designs and trains all of those algorithms.
  • Objectively compares the candidate algorithms in order to find the most accurate one.
  • Creates ensemble models that combine a diverse set of algorithms.
  • Explains the models and ensembles in plain language, showing:
    • which inputs are the most important,
    • the patterns found for each input, and
    • worked examples of AI decisions, including the detailed reasons (which data values led to that decision).
  • Puts the AI models into production and makes them available through a simple REST API.
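The "train a short list, then compare objectively" steps in the list above can be illustrated with a generic sketch. This is not DataRobot's implementation or API. It assumes three deliberately simple, hypothetical candidate models, a tiny synthetic dataset (y = 2x + 1), and a holdout set that the candidates never see during training.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline: always predict the training average."""
    avg = mean(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Ordinary least-squares straight line."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys):
    """Predict the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def leaderboard(candidates, train, holdout):
    """Fit each candidate on train, score it on holdout, sort best first."""
    (train_x, train_y), (hold_x, hold_y) = train, holdout
    results = []
    for name, fit in candidates.items():
        model = fit(train_x, train_y)
        error = mean(abs(model(x) - y) for x, y in zip(hold_x, hold_y))
        results.append((name, error))
    return sorted(results, key=lambda r: r[1])


# Synthetic data for illustration only: y = 2x + 1.
train = ([0, 1, 2, 3, 4, 5], [1, 3, 5, 7, 9, 11])
holdout = ([6, 7], [13, 15])

candidates = {"mean": fit_mean, "line": fit_line, "nearest": fit_nearest}
board = leaderboard(candidates, train, holdout)
for name, error in board:
    print(name, error)
```

Scoring on held-out rows is what makes the comparison objective: every candidate is judged on data it was not trained on, so a favorite algorithm gets no special treatment.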




DataRobot has hundreds of algorithms ready to fit all kinds of data sets and industries. This wide selection of algorithms allows the platform to personalize and accurately address each and every problem.

When you hire new staff, you read the CVs and select a short list of candidates, hold job interviews on the short list, and ultimately choose the best human candidate for that job. Similarly, DataRobot selects a short list of AI algorithms and requires each algorithm to compete against other algorithms to prove its worth on your data.

DataRobot also removes the limitations of time and resources by automating the most tedious manual steps. There is no better way to ensure that your AI team is just as diverse and works together just as well as your human team than with DataRobot.