Responsible AI in action, Part 2: Complete an impact assessment

Kate B
Data Science at Microsoft
8 min read · Dec 12, 2023


This is the second in a series of Responsible AI (RAI) articles:

Image generated with Bing Image Creator.

Introduction

A Responsible AI (RAI) impact assessment is the process a product team follows to identify and assess the potential risks and harms of an AI system. It is a new process, and some organizations may be reluctant to consider it, giving reasons such as:

  • It’s too early in the AI technology life cycle to do impact assessments. RAI is still mostly academic.
  • Concerns about AI are so new. How can we expect product teams to know about potential risks and harms from AI?
  • It seems like RAI discussions will only devolve into messy disagreements and take time away from design/development/deployment timelines.

This article takes a proactive and pragmatic approach to RAI impact assessments. It introduces the concept of an RAI impact assessment, describes its core elements, and includes recommendations for completing it.

The following suggestions are based on experience and learnings from a Microsoft organization comprising 1,700 employees and multiple product teams delivering both internal tools and external applications. Adapt the suggestions below to fit your organization, products, and circumstances.

What is a Responsible AI impact assessment?

An RAI impact assessment is the primary method for guiding a team through the process of examining an AI system and aligning it to responsible AI principles and standards. The questions it examines include: What are the use cases for the AI system? Who are the stakeholders? How do we monitor and measure AI? Who might be harmed and how? How do we prevent these harms?

To help you get started, the documentation and templates below are available for download:

  • Microsoft Responsible AI Standard, v2: The foundation of an impact assessment. It documents values organized around six principles: Accountability, Transparency, Fairness, Reliability & Safety, Privacy & Security, and Inclusiveness, and articulates goals and requirements for each principle. For a product to align with these AI principles, it must meet all relevant goals described in the standard.
  • Microsoft Responsible AI Impact Assessment Template: This document should be completed by each team responsible for the AI system.
  • Microsoft Responsible AI Impact Assessment Guide: This companion guide to the RAI impact assessment template helps frame conversations about RAI, and includes FAQs, examples, activities, and a case study. Consider it a primary reference for mapping out how to conduct an RAI impact assessment.

Core concepts and activities

For more detail about the concepts and activities introduced here, see the Microsoft Responsible AI Impact Assessment Guide.

Identify system use cases

The assessment starts with a compilation of system use cases:

  • Intended use cases: Uses the system is being explicitly built to support.
  • Unsupported use cases: Uses that are not being tested for or supported.
  • Misuses: Uses that are explicitly disallowed or malicious.

This list of use cases becomes the foundation for the impact assessment.
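For teams that keep assessment artifacts next to their code, the use-case catalog can also be captured as structured data. The sketch below is illustrative only: the UseCase class and the hospital-admissions entries are assumptions for the running example used later in this article, not part of the Microsoft template.

```python
from dataclasses import dataclass
from enum import Enum


class UseCaseType(Enum):
    INTENDED = "intended"        # uses the system is explicitly built to support
    UNSUPPORTED = "unsupported"  # uses that are not tested for or supported
    MISUSE = "misuse"            # uses that are explicitly disallowed or malicious


@dataclass
class UseCase:
    name: str
    type: UseCaseType
    description: str


# Illustrative entries for a hospital-admissions prediction system.
use_cases = [
    UseCase("Admission volume forecasting", UseCaseType.INTENDED,
            "Predict next week's admission volumes to help plan staffing."),
    UseCase("Patient-level triage decisions", UseCaseType.UNSUPPORTED,
            "The system is not validated for individual clinical decisions."),
    UseCase("Denying care based on predicted cost", UseCaseType.MISUSE,
            "An explicitly disallowed use of the system's outputs."),
]

# The intended uses typically seed the rest of the impact assessment.
intended_uses = [u for u in use_cases if u.type is UseCaseType.INTENDED]
```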

Identify stakeholders

Stakeholders are people, groups, or roles with an interest in the AI system. The RAI impact assessment describes two types:

  • Direct stakeholders include anyone who works hands-on with the system or is directly affected by it. Examples include end users, website administrators, operations teams, product teams (developers, designers, product managers, testers), downstream applications, systems integrators, or malicious users.
  • Indirect stakeholders include those who will not interact with a system directly but may be affected by an AI system’s downstream effects, such as bystanders, or individuals and communities that may be harmed by the short- or long-term use of the system. Other types of indirect stakeholders may include regulators or civil society organizations.

The value of a stakeholder-identification activity is that it expands the team’s idea of who could be affected by the AI system. The following is an example of stakeholder analysis for an application built to predict hospital admissions:

Table 1: Stakeholders
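The table itself isn't reproduced here, but as a rough sketch, a stakeholder register for such a system might be kept alongside the use-case catalog above. The entries below are illustrative assumptions for the hospital-admissions example, not the contents of Table 1.

```python
# Illustrative stakeholder register for a hospital-admissions prediction system.
stakeholders = {
    "direct": [
        "Hospital operations staff who act on the predictions",
        "Product team: developers, designers, product managers, testers",
        "Systems integrators and downstream applications",
    ],
    "indirect": [
        "Patients whose admissions are being predicted",
        "Communities affected by staffing and capacity decisions",
        "Regulators and civil society organizations",
    ],
}
```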

Identify potential harms

An RAI impact assessment is a type of risk analysis: Risks are identified and assessed based on an organization’s responsible AI principles and goals for the AI system. With the system’s use cases and stakeholders described, the team has the necessary context to find potential harms. Harms can originate from a variety of sources, such as stakeholder expectations, UI design, internal or external dependencies, cybersecurity threats, or operational issues. When the product team involved in the assessment includes multiple disciplines, roles, and experiences, the broader perspective can generate a wider range of potential harms to consider. Learn more about the importance of team diversity in the earlier article in this series, Part 1: Get started.

The RAI Impact Assessment Guide includes a list of potential harms to help get a brainstorming activity started. This is an example of what output from this activity might look like:

Table 2: Harms

Describe system failure scenarios

Another risk dimension to consider: If the AI system fails within the context of its intended use cases, what does that look like, and what are the consequences? One way failure can manifest is when model accuracy or performance degrades and false positives or false negatives go undetected and unhandled. Real-world examples of failures include a facial recognition system biased toward certain facial characteristics, or a brittle image recognition system that can't distinguish a school bus from a snowplow.
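As a sketch of how a team might catch this kind of silent degradation, the snippet below compares false positive and false negative rates on a labeled monitoring sample against alerting thresholds. The threshold values and the check_for_degradation helper are illustrative assumptions; appropriate limits come out of the harms analysis for the specific use case.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative alerting thresholds; real values depend on the use case and its harms analysis.
MAX_FALSE_POSITIVE_RATE = 0.05
MAX_FALSE_NEGATIVE_RATE = 0.10


def check_for_degradation(y_true: np.ndarray, y_pred: np.ndarray) -> list[str]:
    """Return alerts when error rates on a labeled monitoring sample exceed agreed thresholds."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    false_negative_rate = fn / (fn + tp) if (fn + tp) else 0.0

    alerts = []
    if false_positive_rate > MAX_FALSE_POSITIVE_RATE:
        alerts.append(f"False positive rate {false_positive_rate:.1%} exceeds {MAX_FALSE_POSITIVE_RATE:.0%}")
    if false_negative_rate > MAX_FALSE_NEGATIVE_RATE:
        alerts.append(f"False negative rate {false_negative_rate:.1%} exceeds {MAX_FALSE_NEGATIVE_RATE:.0%}")
    return alerts
```

Wiring a check like this into routine monitoring turns the failure scenario from a vague worry into a measurable, reviewable signal.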

Mitigate harms

Once harms and risks are identified, the team decides on priorities and proposes measures to mitigate the harms: What can be done to minimize a risk or harm? The Microsoft Responsible AI Standard, v2 includes some ideas for mitigations. Applying multiple, layered controls to mitigate a given harm is a recommended practice. An example of a layered approach is illustrated below:

Table 3: Layered mitigations
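To make the idea of layering concrete, here is a minimal code sketch of multiple controls applied in sequence for a text-generation feature. Every function here is a hypothetical placeholder; in practice the layers might map to platform safety systems, prompt design, a content-moderation service, and UX that discloses AI-generated content.

```python
def violates_input_policy(prompt: str) -> bool:
    """Layer 1: block prompts that match known misuse patterns (placeholder logic)."""
    blocked_patterns = ["<example blocked pattern>"]
    return any(pattern in prompt.lower() for pattern in blocked_patterns)


def violates_output_policy(response: str) -> bool:
    """Layer 2: screen model output before it reaches the user (placeholder logic)."""
    return False  # e.g., call a content-moderation service here


def generate_with_mitigations(prompt: str, model) -> str:
    """Apply layered mitigations around a generic `model` callable."""
    if violates_input_policy(prompt):
        return "This request can't be processed."
    response = model(prompt)
    if violates_output_policy(response):
        return "The generated response was withheld by a content filter."
    # Layer 3: disclose that the content is AI generated.
    return response + "\n\n(AI-generated content; please verify before relying on it.)"
```

No single layer is expected to catch everything; the point of defense in depth is that a harm slipping past one control can still be caught by another.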

Tailor the approach for the RAI impact assessment

Consider modifying the RAI template and process to fit your circumstances. For example, our profiled organization implemented modifications as described below.

Annotate the RAI Impact Assessment Template

Consider annotating the RAI Impact Assessment Template to add more context and clarification for readers and reviewers. Some annotations to consider:

  • Include the target number of users expected to be supported at each stage of the product release cycle (Section 1.2, System lifecycle stage).
  • Include a system architecture diagram (Section 1.3, System description). This can be particularly helpful for reviewers to better understand overall data flow and workflow.
  • Fit for purpose is a set of goals that rolls up to the Accountability principle described in the Microsoft Responsible AI Standard, v2. It can be a subjective concept, so add context and provide an example, even if the example shows what it means to not be fit for purpose (Section 2.1, Assessment of fitness for purpose). For instance, an AI system trained on one region’s data and then deployed to other regions where regional differences are significant may not perform correctly. Another example is an AI system that claims to make a process more efficient without the metrics or measurements to support the claim.
  • For a large organization with multiple product teams, it can be helpful to pre-populate the template with a starter set of known harms and risks (Section 3, Adverse impact). For instance, if your organization develops applications on top of large language models (LLMs), include known risks in the template such as: users exposed to inappropriate content; inaccurate outputs; quality-of-service disparity, such as a language translation service that performs better for some languages and dialects than others; fabricated, non-factual content or responses; users trusting inaccurate outputs without verifying them; and lack of disclosure or citation of AI-generated content. A starter set like this can also be kept as structured data, as sketched after this list.
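Here is a sketch of what such a starter set might look like when maintained as structured data that teams copy into Section 3. The field names and the mapping of each risk to a principle are illustrative assumptions, not part of the template.

```python
# Illustrative starter set of known LLM-related risks for Section 3 (Adverse impact).
starter_llm_risks = [
    {"risk": "Users exposed to inappropriate or offensive content",
     "principle": "Reliability & Safety"},
    {"risk": "Outputs are inaccurate, fabricated, or not factual",
     "principle": "Reliability & Safety"},
    {"risk": "Quality-of-service disparity across languages and dialects",
     "principle": "Fairness"},
    {"risk": "Users trust inaccurate outputs without verifying them",
     "principle": "Transparency"},
    {"risk": "AI-generated content is not disclosed or cited",
     "principle": "Transparency"},
]
```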

Align RAI goals to product release stages

The foundation for the RAI impact assessment is the Microsoft Responsible AI Standard, v2, which documents 17 goals that align to the AI principles: fairness; reliability and safety; privacy and security; inclusiveness; accountability; and transparency. Each of the goals has a set of requirements. Consider phasing these requirements into the existing product release stages.

In our profiled organization, a product goes through three release phases: Private preview, public preview, and general availability (GA). Private preview is a release to a limited set of customers, public preview expands the customer base, and a GA release is a full product release. While you may have different terminology or release criteria, a phased release approach can help ease a team’s transition into an RAI practice, allowing for learning along the way as team members become familiar with the concepts and process. Here’s an example of how to phase in the RAI goals:

Table 4: RAI aligned to release stages

A note about team collaboration

Product teams today operate in complex environments: across cultures, across time zones, and across functions and organizations. To protect the integrity of the process and its outcomes, understand how the team works and communicates; that understanding can make the difference between a solid assessment and one that is not. If the team is small and co-located, in-person meetings may work best. When team members are geographically dispersed, collaboration may be more effective through asynchronous communication channels.

Wrapping up

The purpose of a good RAI impact assessment is to identify the potential risks and harms of an AI system and introduce mitigations to reduce negative consequences. The templates and guidance introduced in this article can help a team put responsible AI principles to work. Consider adjustments to better align with organizational requirements and product team processes.

Plan to evaluate your AI system on an ongoing basis: Use cases change, system updates affect functionality, new technology is introduced, and data drifts. As Sam Altman said recently in a keynote, “We believe that gradual iterative deployment is the best way to address safety challenges of AI.”

AI systems have the potential to affect many people directly and indirectly, in positive and negative ways. Responsible AI can help teams build and deploy AI products in a way that minimizes harms. If you have the talent and passion to develop AI solutions, please consider these recommendations and join us in the commitment to innovate responsibly.

Useful links

Here are resources to help:

Acknowledgments

Special thanks to Stanley Lin, Kris Bock, Mickey Vorvoreanu, and Kathy Walker; this article would not have happened without their many helpful hours of collaboration.

Kate Baroni is on LinkedIn.

See the other articles in this series:
