AI and Statistics: Perfect Together

Many companies develop AI models without a solid foundation on which to base predictions — leading to mistrust and failures. Here’s how statistics can help improve results.

Reading Time: 9 min 

Topics

Permissions and PDF

Carolyn Geason-Beissel/MIT SMR | Getty Images

People are often unsure why artificial intelligence and machine learning algorithms work. More importantly, people can’t always anticipate when they won’t work. Ali Rahimi, an AI researcher at Google, received a standing ovation at a 2017 conference when he referred to much of what is done in AI as “alchemy,” meaning that developers don’t have solid grounds for predicting which algorithms will work and which won’t, or for choosing one AI architecture over another. To put it succinctly, AI lacks a basis for inference: a solid foundation on which to base predictions and decisions.

This makes AI decisions tough (or impossible) to explain and hurts trust in AI models and technologies — trust that is necessary for AI to reach its potential. As noted by Rahimi, this is an unsolved problem in AI and machine learning that keeps tech and business leaders up at night because it dooms many AI models to fail in deployment.

Fortunately, help for AI teams and projects is available from an unlikely source: classical statistics. This article will explore how business leaders can apply statistical methods and statistics experts to address the problem.

Holdout Data: A Tempting but Flawed Approach

Some AI teams view a trained AI model as the basis for inference, especially when that model predicts well on a holdout set of the original data. It’s tempting to make such an argument, but it’s a stretch. Holdout data is nothing more than a sample of the data collected at the same time, and under the same circumstances, as the training data. Thus, a trained AI model, in and of itself, does not provide a trusted basis for inference for predictions on future data observed under different circumstances.

What’s worse, many teams working on AI models fail to clearly define the business problem to be solved. This means that the team members are hard-pressed to tell business leaders whether the training data is the right data. Any one of these three issues (bad foundation, wrong problem, or wrong data) can prove disastrous in deployment — and statistics experts on AI teams can help prevent them.

Many IT leaders and data scientists feel that statistics is an old technology that is no longer needed in a big data and AI era.

Topics

Reprint #:

65413

More Like This

Add a comment

You must to post a comment.

First time here? Sign up for a free account: Comment on articles and get access to many more articles.