Bias Versus Variance

This article is by Sydney Firmin and originally appeared on the Alteryx Data Science Blog here:


There are two types of model error when making an estimate: bias and variance. Understanding both types of error, as well as how they relate to one another, is fundamental to understanding model overfitting, underfitting, and complexity.



Most models make assumptions about the functional relationships between variables, and these assumptions are what allow the model to estimate a target variable. Not all models make the same assumptions, which is why a data scientist or analyst needs to determine the best possible assumptions for a given data set.


Bias is the difference between a model’s average estimated value and the “true” value for a variable. Bias can be thought of as error caused by incorrect assumptions in the learning algorithm. Bias can also be introduced through the training data, if the training data is not representative of the population it was drawn from. In the fields of science and engineering, the related concept is accuracy: an estimate with low bias is an accurate one.
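To make the definition concrete, here is a minimal sketch that estimates bias by simulation. Everything in it is an assumption chosen for illustration and is not from the original article: the "true" relationship is y = x^2, the noise level is arbitrary, and the model is deliberately too simple (it predicts the mean of the training targets regardless of x). The bias at a point is taken as the average prediction over many training sets minus the true value.

```python
import random
import statistics


def true_f(x):
    # Hypothetical "true" relationship, chosen only for illustration.
    return x ** 2


def sample_training_set(rng, n=30, noise_sd=0.1):
    # Draw n noisy observations of true_f on the interval [0, 1).
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    # An over-simple model: it assumes the target does not depend on x.
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def estimate_bias(x0, n_sets=200, seed=0):
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sets):
        xs, ys = sample_training_set(rng)
        model = fit_constant(xs, ys)
        preds.append(model(x0))
    # Bias at x0: average prediction minus the true value.
    return statistics.fmean(preds) - true_f(x0)


bias_at_1 = estimate_bias(1.0)
print(f"estimated bias at x = 1.0: {bias_at_1:.3f}")
```

Because the average of x^2 over [0, 1) is 1/3 while the true value at x = 1 is 1, the estimate should come out clearly negative, near -2/3. Collecting more training data would not remove this error, because it comes from the model's assumption rather than from noise.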



Variance can be described as the error caused by sensitivity to small fluctuations in the training data set, or how much an estimate for a given data point will change if a different training data set is used. High variance can cause an algorithm to base estimates on the random noise found in a training data set, as opposed to the true relationship between variables. In the fields of science and engineering, the related concept is precision: an estimate with low variance is a precise one.
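The idea of "how much an estimate for a given data point will change if a different training data set is used" can be measured directly by simulation. The sketch below is illustrative only: the data-generating function, the noise level, and the two predictors (a mean predictor and a 1-nearest-neighbour predictor) are assumptions, not something the original article specifies. Both predictors see the same sequence of simulated training sets, and the variance of their predictions at one fixed point is compared.

```python
import random
import statistics


def true_f(x):
    # Hypothetical "true" relationship, chosen only for illustration.
    return x ** 2


def sample_training_set(rng, n=30, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    # Ignores x0 entirely: a rigid, low-variance predictor.
    return statistics.fmean(ys)


def predict_nearest(xs, ys, x0):
    # 1-nearest-neighbour: copy the target of the closest training point,
    # including whatever noise that single point carries.
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def prediction_variance(predict, x0, n_sets=300, seed=1):
    # Refit on many independent training sets and measure how much the
    # prediction at x0 moves around.
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sets):
        xs, ys = sample_training_set(rng)
        preds.append(predict(xs, ys, x0))
    return statistics.pvariance(preds)


var_mean = prediction_variance(predict_mean, 0.5)
var_nearest = prediction_variance(predict_nearest, 0.5)
print(f"variance of mean predictor:    {var_mean:.4f}")
print(f"variance of nearest neighbour: {var_nearest:.4f}")
```

The nearest-neighbour predictor echoes the noise of a single observation, so its predictions should vary far more from one training set to the next than those of the mean predictor, which averages the noise away.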

Some variance is expected when training a model with different subsets of data. However, the hope is that the machine learning algorithm will be able to distinguish between noise and the true relationship between variables. Small training data sets often lead to high-variance models. A model with low variance will be relatively stable when the training data is altered (e.g., if you add or remove a point of training data).
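The effect of training set size can be sketched with a simple least-squares line. The setup below is hypothetical (a true line y = 1 + 2x with an arbitrary noise level); it refits the line on many simulated training sets and reports how much the fitted slope varies for a small and a large training set.

```python
import random
import statistics


def fit_line(xs, ys):
    # Closed-form ordinary least squares for y = a + b * x.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def slope_spread(n, n_sets=300, noise_sd=0.5, seed=2):
    # Standard deviation of the fitted slope across many training sets
    # of size n, all drawn from the same assumed process.
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sets):
        xs = [rng.random() for _ in range(n)]
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, noise_sd) for x in xs]
        slopes.append(fit_line(xs, ys)[1])
    return statistics.pstdev(slopes)


spread_small = slope_spread(n=5)
spread_large = slope_spread(n=100)
print(f"slope spread with   5 points: {spread_small:.3f}")
print(f"slope spread with 100 points: {spread_large:.3f}")
```

With only five points, a single noisy observation can swing the fitted line noticeably, so the slope should vary much more than it does with one hundred points.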

High variance is associated with overfitting, where a model performs well on its training data set but generalizes poorly to unseen observations. Overfitting happens when a model captures and describes the random noise in the training data set, as well as the underlying pattern in the data.
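One way to see overfitting is to compare error on the training data with error on fresh data. The following sketch uses a k-nearest-neighbours predictor as a stand-in for "a model"; the sine-shaped target, the noise level, and the choices k = 1 and k = 10 are all assumptions made for illustration. With k = 1 the model memorizes every training point, noise included.

```python
import math
import random
import statistics


def true_f(x):
    # Hypothetical underlying pattern, chosen only for illustration.
    return math.sin(2 * math.pi * x)


def make_data(rng, n, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def knn_predict(train_xs, train_ys, x0, k):
    # Average the targets of the k training points closest to x0.
    order = sorted(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x0))
    return statistics.fmean([train_ys[i] for i in order[:k]])


def mse(train_xs, train_ys, xs, ys, k):
    # Mean squared error of the k-NN predictor on the points (xs, ys).
    errors = [
        (knn_predict(train_xs, train_ys, x, k) - y) ** 2
        for x, y in zip(xs, ys)
    ]
    return statistics.fmean(errors)


rng = random.Random(3)
train_xs, train_ys = make_data(rng, 200)
test_xs, test_ys = make_data(rng, 500)

results = {}
for k in (1, 10):
    train_err = mse(train_xs, train_ys, train_xs, train_ys, k)
    test_err = mse(train_xs, train_ys, test_xs, test_ys, k)
    results[k] = (train_err, test_err)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

The k = 1 model reproduces its training targets exactly, so its training error is zero, yet it should do worse on the unseen points than the smoother k = 10 model. That gap between training and test error is the usual signature of overfitting.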