Based on these classifications, the methods for aggregating and visualizing the data need to adjust accordingly. For example, if you were to chart car manufacturing data like the image below, and your data set included year-to-year manufacturing figures, it makes more sense to stick to chronological order. If you sort by highest value instead, your readers will have trouble following the order of the years (1978, 1979, 1980, etc.). Ideally, ordinal data should be sorted by its inherent order rather than alphabetically by the names of the values (month names, for example, if you were charting month by month).
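To make the sorting point concrete, here is a minimal Python sketch comparing three ways to order month-by-month data. The figures and the names `monthly_output` and `MONTH_ORDER` are made up for illustration and do not come from the car manufacturing data set.

```python
# Calendar order of the months: the inherent order of this ordinal attribute.
MONTH_ORDER = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Hypothetical month-by-month figures (illustrative values only).
monthly_output = {"March": 130, "January": 120, "April": 90, "February": 150}

# Alphabetical sort scrambles the calendar: April comes first.
alphabetical = sorted(monthly_output)

# Sorting by highest value ranks the months but hides the time sequence.
by_value = sorted(monthly_output, key=monthly_output.get, reverse=True)

# Sorting by position in MONTH_ORDER keeps the order readers expect.
chronological = sorted(monthly_output, key=MONTH_ORDER.index)

print("alphabetical: ", alphabetical)
print("by value:     ", by_value)
print("chronological:", chronological)
```

Only the last ordering lets a reader follow the months as a sequence. The same idea applies to the yearly figures: keep the years in order on the axis and let the bar heights carry the values.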
There is much more to cover but hopefully this post offers a basic guideline to help you determine what type of data you are trying to visualize. In my next installment of the Three Pillars of Data Series, I will address visual encoding and how to determine what markers to use in order to accurately display these data attributes.
This is just one way to classify data attributes, and there are more advanced schemes out there that may be even better to use. For example, it is hard to classify data expressed as percentages under this scheme. But I still believe this post is a good start and easy to remember. So now you can start to think about your data and what you can do with it, but also what you shouldn't do! Just following some of these guidelines will help you avoid some basic mistakes in your visualizations.
In the next post I will also show how we can use the step of classifying the data to better select the appropriate method to represent the data.
For more detailed reading on data attributes, I would recommend:
Mosteller, F. & Tukey, J. W. (1977). Data Analysis and Regression: A Second Course in Statistics, ch. 5. Addison-Wesley Series in Behavioral Science: Quantitative Methods. Reading, Mass.: Addison-Wesley.
Velleman, P. F. & Wilkinson, L. (1993). "Nominal, Ordinal, Interval, and Ratio Typologies are Misleading." The American Statistician, vol. 47, no. 1, pp. 65-72.