This article was written by Sarah Khatry and originally appeared on the DataRobot Blog here: https://www.datarobot.com/blog/trusted-ai-cornerstones-ethics-overview/
In two recent blog posts, I’ve talked about two cornerstones of trusted AI: evaluating model performance and setting up proper operational oversight. In this post, I’ll review the third and perhaps most important step: creating models that are ethical and principled.
When it comes to ethical AI, high-minded principles get you only so far. There is no one-size-fits-all approach. AI systems and the data they use can cross national borders and diverse cultures. They can become part of a web of complex social interactions that require the perspective of multiple stakeholders to fully understand. So how do you navigate AI pragmatically but still ethically? You have to think about privacy, bias and fairness, interpretability and transparency, and ultimately what the impact of a system will be.
Best Practices for Ensuring Privacy
Because consumer privacy is of the utmost importance, your data team must be aware of data that might contain personally identifiable information (PII), even if it’s as simple as an email address. This data should not be used to train a machine learning model. DataRobot has automatic PII detection available for on-premise installations.
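As a sketch of what automated PII screening can look like (a minimal illustration, not DataRobot's actual implementation), the following scans tabular records for email-like values so that flagged columns can be excluded before training. The column names and sample rows are hypothetical:

```python
import re

# A permissive email pattern; real PII detection would cover many more
# identifier types (phone numbers, SSNs, street addresses, ...).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def flag_pii_columns(rows, columns):
    """Return the column names whose values look like email addresses."""
    flagged = set()
    for row in rows:
        for col in columns:
            if EMAIL_RE.search(str(row.get(col, ""))):
                flagged.add(col)
    return sorted(flagged)

rows = [
    {"age": 34, "contact": "jane.doe@example.com", "spend": 120.0},
    {"age": 52, "contact": "j.smith@example.org", "spend": 80.5},
]
print(flag_pii_columns(rows, ["age", "contact", "spend"]))  # ['contact']
```

A check like this belongs at data-ingestion time, before any feature reaches the training pipeline.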
You must also understand the particular risks that AI poses to privacy. For example, consumer purchasing behavior can be mined for sensitive information about the health, housing, employment, and even the marital status of customers. That information might help with ad targeting, but using it carries a substantial risk of backlash.
Similarly, AI systems can be subject to attacks designed to extract information about your enterprise or customers. For example, a model inversion attack uses white-box information about a model to reconstruct the data it was trained on.
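To illustrate the flavor of such an attack, here is a toy sketch with entirely hypothetical model weights: given white-box access to a logistic regression's parameters, gradient ascent on the *input* synthesizes a feature vector the model scores as strongly positive, revealing what a "typical" positive example looks like to the model — the core idea behind class-representative model inversion:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical white-box parameters of a deployed logistic regression,
# as an attacker might obtain them.
w = [0.5, -1.0, 0.25, 2.0, -0.75]
b = 0.1

# Gradient ascent on log p(y=1 | x) with respect to the input x:
# d/dx log sigmoid(w.x + b) = (1 - p) * w
x = [0.0] * len(w)
for _ in range(500):
    p = sigmoid(dot(w, x) + b)
    x = [xi + 0.1 * (1.0 - p) * wi for xi, wi in zip(x, w)]

# The synthesized x is now scored as near-certainly positive.
print(round(sigmoid(dot(w, x) + b), 3))
```

Defenses such as limiting confidence-score precision or adding differentially private noise during training are aimed at exactly this kind of leakage.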
Fair and Unbiased AI
Algorithmic bias is discussed a lot in AI, and it is a tough topic to navigate. We must deal not only with the mathematically complex task of reducing bias in the data, but also with the challenge of determining what's fair. Fairness is situationally dependent: it's a reflection of your values, ethics, and legal regulations.
Training data is the largest source of bias in an AI system. It might contain historical patterns of bias. Bias might also be a result of data collection or sampling methods that misrepresent ground truth. Ultimately, machine learning learns from data, but that data comes from us—our decisions and systems.
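To make fairness measurable, one common starting point is a simple group-level metric such as the demographic parity gap — the difference in positive-prediction rates between groups. A minimal sketch, using made-up predictions and group labels:

```python
def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rate between any two groups.

    preds: binary predictions (0/1); groups: group label per prediction.
    """
    rates = {}
    for g in set(groups):
        members = [p for p, gg in zip(preds, groups) if gg == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())

preds  = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
# Group "a" gets a positive prediction 3/4 of the time, group "b" 1/4.
print(demographic_parity_gap(preds, groups))  # 0.5
```

Demographic parity is only one of several competing fairness definitions (equalized odds and predictive parity are others), and which one applies is exactly the situational, values-driven judgment described above.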
AI, on the other hand, implements ethics in hard quantities you can measure and behaviors and practices you can control or override. With the right modeling, you can hold a system to even more stringent requirements across the dimensions of trust outlined in these blog posts than you would any employee. The challenge is to think ahead, identify the desired behavior of a system that reflects your values, and then take proactive steps to guarantee it.
Assessing AI's Impact

An impact assessment is a powerful tool for your organization because it brings all AI stakeholders, not just your data science team and end business users, to the table.
Conduct the first impact assessment before modeling begins, once you've identified and evaluated the initial data sources. Revisit it at later stages, such as model evaluation and after the model is deployed to production. Check system behavior against your original understanding of the use case regularly, and as the use case evolves, at the cadence of any other system-level review in your business.
We establish trust all the time in our interactions with people. We know how to recognize the importance of eye contact or a firm handshake. Some of us establish trust through punctuality or our professional credentials.
The trust signals found in AI aren’t eye contact or a diploma on the wall, but they serve the same purpose. By understanding how to build trust in AI, you can begin to find maximum value in its day-to-day use.