Helping the World, One Piece of Data at a Time

July 2nd, 2019

This article is by Zeba Hasan and originally appeared on the Alteryx Engine Works Blog.


Working with non-profits in underprivileged areas

Something I always had a hard time understanding was how I could be of help to nonprofit organizations and individuals in need using my skillset. I can’t help people while sitting behind my computer screen, can I? Well folks, there is a workaround! Your favorite nonprofit organization might need an extra set of hands on their administrative side. Do they have unmet reporting needs? Are they unable to accurately identify what the needs in their communities are or understand what their data is telling them? NPOs often need assistance in these areas, and you might be able to help.

Every nonprofit has financial reports that need to be run for accounting purposes. Unfortunately, many nonprofits lack the resources to meet these needs. I’ve found some great examples of ways Alteryx can be used to support back-end functions in companies. These use cases can be replicated to lessen the administrative burden on nonprofits and give them more time to make a difference in the communities they serve.

One use case I found for helping local nonprofits is cleansing their data and layering it onto interactive reports that help them understand what it is telling them. In this example, I used publicly available crime data for the city of Chicago from Data.Gov. In Alteryx, I imported the dataset and highlighted which community areas have the most crime reports, which types of crime occur most frequently, and where these crimes are occurring.
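For readers without Alteryx, the same kind of summary can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow I built: the column names (`community_area`, `primary_type`, `year`) and the sample rows are made up for the example and may not match the actual export.

```python
import csv
import io
from collections import Counter

# Made-up sample rows standing in for the Chicago crime export.
# The column names are assumptions for illustration only.
SAMPLE_CSV = """community_area,primary_type,year
25,THEFT,2018
25,BATTERY,2018
8,THEFT,2018
25,THEFT,2017
8,NARCOTICS,2017
"""


def summarize(rows):
    """Count crime reports per community area and per crime type."""
    by_area = Counter()
    by_type = Counter()
    for row in rows:
        by_area[row["community_area"]] += 1
        by_type[row["primary_type"]] += 1
    return by_area, by_type


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_area, by_type = summarize(rows)

# Areas and crime types with the most reports, highest first.
print(by_area.most_common(3))
print(by_type.most_common(3))
```

To try this on a real file, you would swap the `io.StringIO(SAMPLE_CSV)` stand-in for an opened CSV file and adjust the column names to whatever the download uses.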

This in itself is a great insight for nonprofits trying to pinpoint a community’s struggle points. Does the community struggle with drug crime, theft, vandalism, or something else? Without knowing where those struggle points are, there’s no real way to know how best to serve a community. To make the results easier for others to see, I created a few simple visualizations in Tableau using my Alteryx .tde output.

The resulting charts show a high frequency of theft throughout the city. They also show how drastically crime rates have dropped from the early 2000s to today. With this information quickly accessible, nonprofits can shift their focus to where communities really need help.

The next step is to bring in demographic data for these same communities, such as the age ranges most common in them, average household income, graduation rates, and other relevant fields, to see where connections to crime rates can be drawn. We have seen many inverse correlations between crime rates and graduation rates or employment rates. That extra bit of insight may make the case for more after-school activities or free tutoring for students in high-crime areas. This may push crime rates down even further and allow nonprofits to truly cater to the communities they serve.
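As a rough sketch of what checking for an inverse correlation looks like, here is a small Pearson correlation function in Python. The graduation and crime figures are invented for the example, and the variable names are hypothetical. A value near -1 would suggest that crime reports fall as graduation rates rise, though correlation alone says nothing about cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return cov / sqrt(var_x * var_y)


# Invented numbers, one entry per community area.
graduation_rate = [60.0, 70.0, 80.0, 90.0]
crime_reports_per_1000 = [90.0, 75.0, 50.0, 40.0]

r = pearson(graduation_rate, crime_reports_per_1000)
print(f"correlation: {r:.2f}")
```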

Analyzing correlation between environment and health impacts

I’ve always had an interest in seeing how our physical environment can affect our health. In this example, I selected a health issue (stomach ailments) and searched for a correlation between that issue and a variety of factors in an individual’s environment and habits. I began by collecting anonymous data from participating individuals on their stomach ailments via Google Sheets. The Google Sheets Input tool within Alteryx really streamlines this process by letting me connect directly to the survey I created in Google Sheets.

With my survey results brought in through the Google Sheets Input tool, I began thinking about which demographic variables might play a role in our stomach health. I connected to the Experian database available through Alteryx as my dataset of choice and selected relevant variables for my test.

To see whether there was a strong correlation between the presence of a stomach condition and these environmental factors, I used a few tools within the Predictive section – the Logistic Regression and Score tools.

Starting with the Logistic Regression tool, I specified my Target variable and the variables I wanted to test it against. The Target variable in my dataset was the answer to the survey question ‘Do you currently have a stomach condition (Yes or No)’. I then brought in the Score tool to attach a predicted probability to both the ‘Yes’ and ‘No’ responses. In my case, I looked at whether the ‘Yes’ response was scored high. If it was, I flagged this as a High correlation and a High chance of experiencing a stomach condition.
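To give a feel for what the model-then-score step does, here is a minimal Python sketch. It is not how the Alteryx tools work internally: it fits a logistic regression with plain gradient descent, the spending figures and labels are invented, and the `Score_Yes` / `Score_No` names and the `flag` helper are hypothetical labels chosen for the example. The 0.60 cutoff mirrors the threshold I used for the High rank.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(X)
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            p = sigmoid(bias + sum(w * f for w, f in zip(weights, features)))
            err = p - label
            for j, f in enumerate(features):
                grad_w[j] += err * f
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def score(weights, bias, features):
    """Return a probability for each response, like a scoring step."""
    p_yes = sigmoid(bias + sum(w * f for w, f in zip(weights, features)))
    return {"Score_Yes": p_yes, "Score_No": 1.0 - p_yes}


def flag(scores, threshold=0.60):
    """Label a respondent High when the 'Yes' score exceeds the threshold."""
    return "High" if scores["Score_Yes"] > threshold else "Not high"


# Invented data: [take-out spend, tobacco spend], each scaled to 0..1.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
y = [0, 0, 0, 1, 1, 1]  # 1 = answered 'Yes' to having a stomach condition

weights, bias = fit_logistic(X, y)
high_spender = score(weights, bias, [0.9, 0.9])
low_spender = score(weights, bias, [0.1, 0.1])
print(flag(high_spender), flag(low_spender))
```

With only six invented rows this proves nothing about stomach health, of course. It only shows the shape of the process: fit on labeled survey responses, score each respondent, then flag the ones above the cutoff.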

One thing I found was that those living in zip codes 83704 and 86535 tend to spend more on take-out food, tobacco, and alcoholic beverages. This led to a score greater than 0.60, which I classified as a High Correlation rank.

Of course, this study is preliminary, and I hope to extend it by asking volunteers to provide more information on their own personal spending in areas like alcohol, tobacco, sugar, fruits/veggies, etc. For now, I provided participating volunteers with these results, showing them how their zip code demographics may play into their own stomach ailments.

The cool thing is that once you know which fields correlate most strongly with an individual having a stomach issue, you can gather responses from individuals who may have an undiagnosed stomach condition and predict their likelihood of having stomach ailments later in life. This could be super useful for a variety of sectors, including health groups and the insurance industry. It could be a fantastic way to watch symptoms a bit more closely for individuals who have a higher likelihood of certain medical conditions, while also allowing insurance companies to arrive at more accurate estimates of an individual’s medical costs. Given the impact studies like this can have on patient care, I look forward to furthering my use case and sharing these findings with healthcare organizations to help drive change in the industry and in my community.

Even the smallest bit of insight can better our communities and show just how powerful data can be. Go out there and discover where your curiosities lie and what issues you see in your community or the world today, then take a shot at seeing where that path will take you. Helping even one individual is like helping all of humanity, and I’m sure you have it in you to better humanity through the power of data.