Auditing Algorithmic Risk

How do we know whether algorithmic systems are working as intended? A set of simple frameworks can help even nontechnical organizations check the functioning of their AI tools.

Reading Time: 21 min 


Permissions and PDF

Paul Garland

Artificial intelligence, large language models (LLMs), and other algorithms are increasingly taking over bureaucratic processes traditionally performed by humans, whether it’s deciding who is worthy of credit, a job, or admission to college, or compiling a year-end review or hospital admission notes.

But how do we know that these systems are working as intended? And who might they be unintentionally harming?

Given the highly sophisticated and stochastic nature of these new technologies, we might throw up our hands at such questions. After all, not even the engineers who build these systems claim to understand them entirely or to know how to predict or control them. But given their ubiquity and the high stakes in many use cases, it is important that we find ways to answer questions about the unintended harms they may cause. In this article, we offer a set of tools for auditing and improving the safety of any algorithm or AI tool, regardless of whether those deploying it understand its inner workings.

Algorithmic auditing is based on a simple idea: Identify failure scenarios for people who might get hurt by an algorithmic system, and figure out how to monitor for them. This approach relies on knowing the complete use case: how the technology is being used, by and for whom, and for what purpose. In other words, each algorithm in each use case requires separate consideration of the ways it can be used for — or against — someone in that scenario.

This applies to LLMs as well, which require an application-specific approach to harm measurement and mitigation. LLMs are complex, but it’s not their technical complexity that makes auditing them a challenge; rather, it’s the myriad use cases to which they are applied. The way forward is to audit how they are applied, one use case at a time, starting with those in which the stakes are highest.

The auditing frameworks we present below require input from diverse stakeholders, including affected communities and domain experts, through inclusive, nontechnical discussions to address the critical questions of who could be harmed and how. Our approach works for any rule-based system that affects stakeholders, including generative AI, big data risk scores, or bureaucratic processes described in a flowchart. This kind of flexibility is important, given how quickly new technologies are being developed and applied.



1. The Ethical Matrix is based on a bioethical framework originally conceived by bioethicist Ben Mepham for the sake of running ethical experiments. For a detailed presentation, see C. O’Neil and H. Gunn, “Near-Term Artificial Intelligence and the Ethical Matrix,” chap. 8 in “Ethics of Artificial Intelligence,” ed. S.M. Laio (New York: Oxford University Press, 2020).

2. C. O’Neil, H. Sargeant, and J. Appel, “Explainable Fairness in Regulatory Algorithmic Auditing,” West Virginia Law Review, forthcoming.

3. See N. Bhutta, A. Hizmo, and D. Ringo, “How Much Does Racial Bias Affect Mortgage Lending? Evidence From Human and Algorithmic Credit Decisions,” Finance and Economics Discussion Series 2022-067, PDF file (Washington, D.C.: Board of Governors of the Federal Reserve System, October 2022), Table A6 is particularly relevant.

4. M. Leonhardt, “Black and Hispanic Americans Often Have Lower Credit Scores — Here’s Why They’re Hit Harder,” CNBC, Jan. 28, 2021,

5. B. Luthi, “How to Add Rent Payments to Your Credit Reports,” myFICO, Dec. 14, 2022,

6. The four-fifths rule is not a law but a rule of thumb from the U.S. Equal Employment Opportunity Commission, saying that selection rates between groups of candidates for a job or promotion (such as people of different ethnicities) cannot be too different. In particular, the rate for the group with the lowest selection rate must be at least four-fifths that of the group with the highest selection rate. See more at “Select Issues: Assessing Adverse Impact in Software, Algorithms, and Artificial Intelligence Used in Employment Selection Procedures Under Title VII of the Civil Rights Act of 1964,” U.S. Equal Employment Opportunity Commission, May 18, 2023,

7.SB21-169 — Protecting Consumers From Unfair Discrimination in Insurance Practices,” Colorado Department of Regulatory Agencies, Division of Insurance, accessed April 24, 2024,

8.“3 CCR 702-10 Unfair Discrimination Draft Proposed New Regulation 10-2-xx,” Colorado Department of Regulatory Agencies Division of Insurance, accessed April 24, 2024,

9. The draft regulations also define these terms. “Statistically significant” means having a p-value of <0.05, and “substantial” means a difference in approval rates, or in price per $1,000 of face amount, of >5 percentage points. The details of the further tests are beyond the scope of this article, but the main idea is to inspect whether “external consumer data and information sources” (that is, nontraditional rating variables, such as cutting-edge risk scores, which insurers often purchase from third-party vendors) used in underwriting and pricing are correlated with race in a way that contributes to the observed differences in denial rates or prices. If inspection shows they are, then the insurer must “immediately take reasonable steps developed as part of [its] risk management framework to remediate the unfairly discriminatory outcome.”

10.10. P. Liang, R. Bommasani, T. Lee, et al., “Holistic Evaluation of Language Models,” Transactions on Machine Learning Research, published online Aug. 23, 2023,

11. D. Hendrycks, C. Burns, S. Basart, et al., “Measuring Massive Multitask Language Understanding,” arXivLabs, published online Sept. 7, 2020,

12. Liang et al., “Holistic Evaluation of Language Models.”

13. B. Edwards, “Anthropic’s Claude 3 Causes Stir by Seeming to Realize When It Was Being Tested,“ Ars Technica, March 5, 2024,

14. A. Zou, Z. Wang, N. Carlini, et al., “Universal and Transferable Adversarial Attacks on Aligned Language Models,” arXivLabs, published online July 27, 2023,; and D. Ganguli, L. Lovitt, J. Kernion, et al., “Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned,” arXivLabs, published online Aug. 23, 2022,

15. L. McCarthy, “A Wellness Chatbot Is Offline After Its ‘Harmful’ Focus on Weight Loss,” The New York Times, June 8, 2023,

16. K. Wells, “National Eating Disorders Association Phases Out Human Helpline, Pivots to Chatbot,” NPR, May 31, 2023,

17. By “sketch,” we mean we are imagining the stakeholders and their concerns. Truly creating an Ethical Matrix for this use case would entail interviewing real representatives of these stakeholder groups. In this article, we approach it as a thought experiment.

18. C. Lecher, “NYC’s AI Chatbot Tells Businesses to Break the Law,” The Markup, March 29, 2024,

19. G.A. Fowler, “TurboTax and H&R Block Now Use AI for Tax Advice. It’s Awful,” The Washington Post, March 4, 2024,

Reprint #:


More Like This

Add a comment

You must to post a comment.

First time here? Sign up for a free account: Comment on articles and get access to many more articles.

Comments (2)
Thank you for pointing this out—you are correct, an editing error had us citing the wrong Dr. Mepham. We have corrected the reference.
Imho, it must be "Ben" Mepham in first ref not "John".