How to Build and Govern Trusted AI Systems: Process

This article was written by Scott Reed and originally appeared on the DataRobot Blog.


Trusted AI as a culture and practice is difficult at any level, from an individual data scientist trying to understand data disparity in a vacuum to an organization trying to govern multiple models in production.

However, difficult does not mean unattainable. There is a path forward: a framework that revolves around people, process, and technology. In our first joint blog post, we learned about the different stakeholders in an AI system’s lifecycle and how their collaboration is crucial to implementing effective processes and building the technological guardrails that collectively support an ethical system. Our focus today is on the processes those stakeholders use to create structure, repeatability, and standardization.

Not all AI-supported decisions are equal. Using a risk assessment matrix, we can decide where to draw the boundary between relying on the model’s input and calling for human intervention. One solution is to use a decision system with ascending levels of risk, plausibility, and mitigation strategy. Once the type of AI-supported decision is determined, we can conduct an impact assessment that lets stakeholders maintain control and gives them a failsafe way to override the system if necessary.
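As a rough illustration of how such a matrix might be encoded, the sketch below scores a decision type by severity and likelihood and maps the score to a level of human oversight. Everything in it is an assumption made for the example rather than something prescribed by the article: the three-point scale, the multiplication of the two ratings, the cutoffs, and the names `Level`, `DecisionType`, `risk_score`, and `oversight_for` are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point rating scale (an assumption for this sketch)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class DecisionType:
    name: str
    severity: Level    # how harmful a wrong decision would be
    likelihood: Level  # how plausible a wrong decision is


def risk_score(decision: DecisionType) -> int:
    """Combine the two ratings into one score (possible values: 1, 2, 3, 4, 6, 9)."""
    return int(decision.severity) * int(decision.likelihood)


def oversight_for(decision: DecisionType) -> str:
    """Map a risk score to a mitigation strategy; the cutoffs are illustrative."""
    score = risk_score(decision)
    if score >= 6:
        return "human decides, model advises"
    if score >= 3:
        return "model decides, human reviews"
    return "model decides automatically"


if __name__ == "__main__":
    examples = [
        DecisionType("marketing email", Level.LOW, Level.MEDIUM),
        DecisionType("inventory reorder", Level.MEDIUM, Level.MEDIUM),
        DecisionType("loan denial", Level.HIGH, Level.MEDIUM),
    ]
    for decision in examples:
        print(f"{decision.name}: {oversight_for(decision)}")
```

In practice, the ratings and cutoffs would come out of the stakeholders' own discussion; the point of writing them down is that the boundary between automation and human intervention becomes explicit and reviewable.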

There are many steps to building an AI system. First, a business sponsor will champion an idea. Then a data scientist might gather data and work with business analysts to understand the context. Next, if machine learning is a feasible solution, a model is built and validated. Finally, a model may be put into production and predictions will be made on new data. At each step, there are different stakeholders and perspectives. An impact assessment can be an effective tool for unifying stakeholders’ opinions and fully understanding the risks at each step. This collaborative, diversity-centered approach yields a true impact analysis of the AI system, covering stakeholders’ points of view, data provenance, model building, bias and fairness, and model deployment.

The trick to ensuring that a model continues providing value in deployment is to support it with strong lifecycle management and governance. By continuously monitoring our models in production, we can quickly identify issues, such as data drift or prediction latency during high traffic, and take action. We can even build humility into the system by letting users define triggers and the actions to take when certain criteria are met, such as a prediction falling near the decision threshold. These guardrails allow stakeholders to remain confident in the AI system and establish trust.
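To make the idea of a humility trigger concrete, here is a minimal sketch of a rule that routes predictions near the decision threshold to a human reviewer instead of acting on them automatically. It is not any particular product's API: the names `HumilityRule` and `route_prediction`, the 0.5 threshold, the 0.05 margin, and the three action labels are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HumilityRule:
    """Hypothetical trigger: flag predictions too close to the threshold."""
    threshold: float = 0.5  # assumed classification threshold
    margin: float = 0.05    # assumed half-width of the "uncertain" band

    def is_uncertain(self, probability: float) -> bool:
        return abs(probability - self.threshold) <= self.margin


def route_prediction(probability: float, rule: HumilityRule) -> str:
    """Return the action to take for one predicted probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if rule.is_uncertain(probability):
        # The trigger fired: defer to a person rather than act automatically.
        return "send_to_human_review"
    return "approve" if probability >= rule.threshold else "decline"


if __name__ == "__main__":
    rule = HumilityRule()
    for p in (0.10, 0.52, 0.90):
        print(f"{p:.2f} -> {route_prediction(p, rule)}")
```

The same pattern extends to other criteria, for example inputs outside the range seen in training, with each trigger paired to an action that stakeholders have agreed on in advance.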