This article is by Sydney Firmin and originally appeared on the Alteryx Data Science Blog here: https://community.alteryx.com/t5/Engine-Works-Blog/How-to-Enrich-Your-Geographical-Trade-Areas-with-Demographic/ba-p/340489 https://community.alteryx.com/t5/Data-Science-Blog/Embedding-a-Model-in-a-Python-SDK-Tool/ba-p/330602
To maximise the accuracy of our outputs, we should obtain demographic information at the finest level of geographic detail available, and use the most recent release we can find.
In the UK, the Office for National Statistics releases mid-year population estimates, broken down by age and gender, at the level of Lower Super Output Areas (commonly referred to as LSOAs). LSOAs are geographical areas that contain, on average, around 1,000 residents and 650 households; other characteristics, such as social homogeneity, also play a role in defining their boundaries.
When I say ‘loose method’ what exactly do I mean? The process is simple:
- Take the station trade area and spatially match it against our population area
- For each matched population area, create an intersect object between itself and the trade area
- Identify the size of the original population area
- Identify the size of the associated intersect polygon area
- From these two values, calculate the percentage of the population area that falls within the trade area
- Multiply this value by the demographic variables we have
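
The arithmetic behind the last two steps can be sketched in a few lines of Python. This is only an illustration of the idea, not the Alteryx workflow itself: the function name `apportion`, the area sizes, and the demographic figures are all made up for the example.

```python
def apportion(population_area_size, intersect_size, demographics):
    """Scale each demographic count by the share of the population area
    that lies inside the trade area."""
    if population_area_size <= 0:
        raise ValueError("population area size must be positive")
    share = intersect_size / population_area_size
    return {name: count * share for name, count in demographics.items()}


# Hypothetical LSOA: 2.0 sq km in total, 0.5 sq km of which is inside the trade area.
estimate = apportion(2.0, 0.5, {"female_20_29": 120, "male_20_29": 100})
print(estimate)  # {'female_20_29': 30.0, 'male_20_29': 25.0}
```

The method is 'loose' because it assumes residents are spread evenly across each population area, which is why smaller areas such as LSOAs give a better estimate.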
In this case, I have created a 5-mile boundary around each store using the Trade Area tool. I have also checked the option to ‘Eliminate Overlap’, as I anticipate our customers will only ever travel to the closest store.
Next, we need to bring in the demographic information that we want to map against our trade areas. In this case, I have used the LSOA data mentioned earlier, with population numbers broken down by age and gender.
Once both sources are on the canvas, we need to perform a spatial match of the two datasets. Here, I have configured the spatial match to return objects that ‘touched or intersected’ the second object.
For each of your trade areas, you will now have a list of the population areas that are, at least in part, within your trade areas.
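
To make the 'touches or intersects' rule concrete, here is a minimal Python sketch. It uses axis-aligned rectangles as a stand-in for real polygons, so it is an assumption-laden simplification rather than what the spatial match does internally; `Box`, `touches_or_intersects`, `spatial_match`, and the sample IDs are all hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a real polygon."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def touches_or_intersects(a: Box, b: Box) -> bool:
    # <= rather than < so that boxes sharing only an edge or corner still match.
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def spatial_match(trade_areas, population_areas):
    """Return (trade area id, population area id) pairs for every match."""
    return [
        (t_id, p_id)
        for t_id, t_box in trade_areas.items()
        for p_id, p_box in population_areas.items()
        if touches_or_intersects(t_box, p_box)
    ]


trade_areas = {"store_A": Box(0, 0, 10, 10)}
population_areas = {
    "lsoa_1": Box(2, 2, 4, 4),      # fully inside the trade area
    "lsoa_2": Box(8, 8, 12, 12),    # partly inside
    "lsoa_3": Box(10, 0, 14, 4),    # shares an edge only
    "lsoa_4": Box(20, 20, 22, 22),  # nowhere near
}
matches = spatial_match(trade_areas, population_areas)
print(matches)
```

Note that `lsoa_3` is matched even though it only touches the boundary. Its intersect area is zero, so it contributes nothing once the overlap percentage is applied.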
Now that we know which population objects are linked to which trade areas, and have both the spatial objects and the intersect polygon for each match, we can work out how much of each population area overlaps a trade area.
To do this, I will use two Spatial Info tools, but you can also use a Formula tool, which supports spatial functions.
Once we have these values, it is just a case of identifying the demographic attributes we wish to bring in, joining these to this file, and multiplying them by the percentage overlap.
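
For readers who want to see the calculation outside Alteryx, the sketch below computes both areas with the shoelace formula and divides one by the other. It assumes simple polygons in flat (projected) coordinates, which is a simplification compared with the geographic areas the tools report; `polygon_area`, `overlap_rate`, and the coordinates are illustrative only.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def overlap_rate(population_polygon, intersect_polygon):
    """Share of the population area covered by the intersect polygon."""
    return polygon_area(intersect_polygon) / polygon_area(population_polygon)


population_polygon = [(0, 0), (4, 0), (4, 2), (0, 2)]  # 8 square units
intersect_polygon = [(0, 0), (1, 0), (1, 2), (0, 2)]   # 2 square units
rate = overlap_rate(population_polygon, intersect_polygon)
print(rate)  # 0.25
```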
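
As a rough sketch of that join-and-multiply step, the Python below joins a demographic table to the matched rows by LSOA code and scales each attribute by the overlap. Summing the results per trade area is my assumption about how the figures would be used afterwards; the field names and numbers are invented for the example.

```python
from collections import defaultdict

# Hypothetical output of the earlier steps: one row per (trade area, LSOA) match.
overlaps = [
    {"trade_area": "store_A", "lsoa": "lsoa_1", "overlap": 1.0},
    {"trade_area": "store_A", "lsoa": "lsoa_2", "overlap": 0.25},
    {"trade_area": "store_B", "lsoa": "lsoa_2", "overlap": 0.5},
]

# Hypothetical demographic table keyed by LSOA code.
demographics = {
    "lsoa_1": {"female": 800, "male": 760},
    "lsoa_2": {"female": 600, "male": 640},
}

totals = defaultdict(lambda: defaultdict(float))
for row in overlaps:
    # Inner join: skip matches that have no demographic record.
    attributes = demographics.get(row["lsoa"])
    if attributes is None:
        continue
    for name, count in attributes.items():
        totals[row["trade_area"]][name] += count * row["overlap"]

result = {area: dict(values) for area, values in totals.items()}
print(result)
```

Here `store_A` ends up with all of `lsoa_1` plus a quarter of `lsoa_2`, while `store_B` receives half of `lsoa_2`.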
A complete workflow can be found here, and a sample visualisation that I have built with this data can be found here.