How Generative AI Can Support Advanced Analytics Practice

Large language models can enhance data and analytics work by helping humans prepare data, improve models, and understand results.

The glare of attention on generative AI threatens to overshadow advanced analytics. Companies pouring resources into much-hyped large language models (LLMs) such as ChatGPT risk neglecting advanced analytics and their proven value for improving business decisions and processes, such as predicting the next best offer for each customer or optimizing supply chains.

The consequences for resource allocation and value creation are significant. Data and analytics teams we work with report that generative AI initiatives, often pushed by senior leaders afraid of missing out on the next big thing, are siphoning funds from their budgets. This reallocation could undermine projects aimed at delivering value across the organization, even as most enterprises are still searching for convincing business cases for LLMs.

However, advanced analytics and LLMs have vastly different capabilities, and leaders should not think in terms of choosing one over the other. These technologies can work in concert, combining, for example, the reliable predictive power of machine learning-based advanced analytics with the natural language capabilities of LLMs.

Considering these complementary capabilities, we see opportunities for generative AI to tackle challenges in the development and deployment phases of advanced analytics — for both predictive and prescriptive applications. LLMs can be particularly useful in helping users incorporate unstructured data sources into analyses, translate business problems into analytical models, and understand and explain models’ results.

In this article, we’ll describe some experiments we have conducted with LLMs to boost advanced analytics use cases. We’ll also provide guidance on monitoring and verifying that output, which remains a best practice when working with LLMs, given that they are known to sometimes produce unreliable or incorrect results.

Applying LLMs in Predictive Analytics

Predictive analytics lies at the heart of processes that are increasingly data-driven for many companies. It’s rare to find a marketing department that isn’t discussing shifts in customer churn predictions and how to react, or commercial teams that aren’t considering how to boost next month’s sales in response to a dip forecast by predictive analytics. We see opportunities to expand the impact of such approaches by tapping LLMs in two ways: to increase the variety of data used to train and execute models, and to communicate better with the business stakeholders who use predictive analytics outputs in decision-making.

Incorporating complex data types. In the development phase of predictive projects, challenges arise when decision makers regularly consult and monitor data sources that are difficult to incorporate into predictive algorithms. For instance, customer reviews detailing negative experiences are complex to use directly in churn models, yet they hold valuable predictive power. To use such data in predictive models, significant time must be invested in distilling and structuring the relevant information from each source. This creates a trade-off between the investment required to make the data usable and the anticipated improvement in the predictive model’s performance. Of course, natural language software already exists to help with data structuring, but its use is usually limited to particular cases, such as sentiment analysis.

LLMs can significantly reduce the time invested in data wrangling and make it easier to analyze complex data types. A precise prompt can instruct an LLM to review a given data set for key themes and return its answer as data formatted with standard labels, ready for use by predictive models. (See “Labeling Unstructured Data.”) This ability of LLMs to expedite the processing of complex data types (from weeks to mere days or hours) may seem quite simple, but it represents a notable leap forward in the practice of advanced analytics.
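As a rough illustration of this labeling step, the sketch below sends a customer review to an LLM and asks for a standardized label and a sentiment score. The client library, model name, and label set are our own illustrative assumptions, not the exact setup from the sidebar.

```python
# Minimal sketch: turning a free-text customer review into a standardized
# label that a churn model can consume. Assumes the OpenAI Python client
# (>= 1.0); the model name and label set are illustrative placeholders.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["billing_issue", "service_outage", "poor_support", "pricing", "other"]

def label_review(review_text: str) -> dict:
    prompt = (
        "Classify the customer review below into exactly one of these labels: "
        f"{', '.join(LABELS)}. Also rate its sentiment from -1 (very negative) "
        "to 1 (very positive). Respond as JSON with keys 'label' and 'sentiment'.\n\n"
        f"Review: {review_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # ask for parseable output
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

# Each labeled review becomes a structured row for the predictive model.
print(label_review("Third outage this month, and support never called back."))
```

In our experience, the value lies less in any single label than in running such a step over thousands of documents in hours rather than weeks.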

Since the rise of LLMs, the ease with which unstructured data sources can be incorporated into analyses has substantially increased the share of this type of data in predictive applications. In a recent project with a telecommunications company focused on predicting the next best action (NBA) in a debt collection and recovery process, there was an untapped data source: written complaints made by customers, often linked to this very process. Because there was no certainty that a thorough analysis of this information would yield a substantial benefit for the NBA project, it was left unused. However, once the team understood that LLMs could accurately filter and categorize the complaints related to debt collection and recovery, it started considering this data source, which ultimately had a substantial influence on the actions the project considered for improving the business process.

Explaining predictions. During the deployment of predictive projects, such as the telco project mentioned above, we’ve often faced communication challenges in decoding and explaining the inner workings and outputs of machine learning models. One tool data science teams often use to understand and communicate the relevance of different input variables to a predicted output is Shapley additive explanations (SHAP) analysis. This analysis can be translated into a visualization that describes the relevance and directional impact of the input variables on a given outcome. For example, it may show that purchase frequency is the most relevant variable when predicting churn: Customers who purchase less frequently tend to churn more than others. However, explaining the findings of a SHAP analysis to colleagues who aren’t data scientists is often tricky because of the technical knowledge required.

LLMs may help tackle this communication gap. We’ve noticed that the data sets scraped from the web to train generative AI models, particularly LLMs, encompass extensive knowledge about machine learning models and the analyses used to explain them. (See “Explaining Prediction Results.”) Consequently, LLMs can provide useful output in response to a well-crafted prompt that specifies the prediction topic, the analytical model employed, the results of analyses, and the technique used to understand the results (such as a SHAP visualization). This information allows LLMs to articulate a plausible explanation for changes in predictions for decision makers and highlights the main contributing factors.

Sticking to our churn example, we experimented by giving the information listed above to the LLM and prompting it to explain, in simple terms, the most relevant variables. It returned accurate output: “NumOfProducts and Age are consistently the most impactful features across all iterations. This suggests that the number of products a customer has and their age are strong indicators of potential churn. For example, if the distribution of these features changes (like offering more products to customers or the demographics of the customer base aging), it could significantly impact the model’s predictions.” This output may still sound too technical for some, so we were excited to see that the LLM kept expanding upon the topic and eventually mentioned the business impact of the results of our predictive (banking) churn model: “The bank should consider examining their product offerings and customer engagement strategies, particularly for older customers, as these seem to be significant factors in predicting churn.”
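To make the pattern concrete, here is a minimal sketch of the workflow we describe: fit a classifier, compute SHAP values, condense them into a ranked summary, and hand that summary to an LLM for a plain-language explanation. The toy data, feature names, and model choice are our illustrative assumptions.

```python
# Minimal sketch: summarize a SHAP analysis and ask an LLM to explain it in
# plain language. Assumes scikit-learn, shap, and the OpenAI client; the data
# and feature names are toy stand-ins for a real churn data set.
import numpy as np
import pandas as pd
import shap
from openai import OpenAI
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "NumOfProducts": rng.integers(1, 5, 500),
    "Age": rng.integers(18, 80, 500),
    "Balance": rng.normal(50_000, 20_000, 500),
})
y = (X["NumOfProducts"] < 2).astype(int)  # toy churn target

model = RandomForestClassifier(random_state=0).fit(X, y)
sv = shap.TreeExplainer(model).shap_values(X)
sv = sv[1] if isinstance(sv, list) else sv[..., 1]  # class-1 attributions

# Rank features by mean absolute SHAP value and build a compact summary.
importance = np.abs(sv).mean(axis=0)
summary = ", ".join(
    f"{name}: {value:.3f}"
    for name, value in sorted(zip(X.columns, importance), key=lambda t: -t[1])
)

client = OpenAI()
explanation = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{
        "role": "user",
        "content": (
            "We predict bank customer churn with a random forest. "
            f"Mean absolute SHAP value per feature: {summary}. "
            "Explain in plain, nontechnical terms which factors drive churn "
            "and what the business implications might be."
        ),
    }],
).choices[0].message.content
print(explanation)
```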

Applying LLMs in Prescriptive Analytics

Prescriptive analytics is typically employed for business problems involving limited resources and multiple decision options, such as in supply chain management. Mathematical programming and optimization techniques are the go-to approaches for solving complex decision-making problems, such as production and distribution plans with myriad possible decisions constrained by finite resources like production and transportation capacities. Analytics teams can use LLMs to support and streamline the development and deployment phases in the following ways.

Crafting model mechanics. In our experience, mathematically representing a business problem with all of its nuances is a formidable challenge. It requires an understanding of decision makers’ precise goals. For example, when planning store assortments, do category managers give precedence to profitability or to market share, or do they try to balance both? Defining the boundaries of the underlying decisions is also very hard: In the store assortment problem, is the shelf space assigned to a category a hard constraint, or can this limit be overruled in some situations? Missing these pieces of information in the development phase usually results in ineffective models. It’s not uncommon for data scientists to miss something important; posing the right questions and translating the corresponding answers requires a rare combination of business acumen and analytics expertise.

Recently, we have started experimenting with LLMs to help design the mechanics of optimization models, augmenting the capabilities of the analytics translators responsible for these tasks. With a carefully designed prompt, it is possible to instruct the LLM to engage in a conversation that can effectively elicit decision makers’ understanding of the business problem and write the first version of the prescriptive model. (See “Developing Model Mechanics.”) An example prompt to the LLM is: “Objective Clarification: What is the primary goal of the supply chain optimization? Is it to minimize costs, maximize profits, ensure timely delivery, or something else?” The prompt guides the LLM to identify and define the decision variables, the objective function, and any constraints of the business problem so that it can generate the mathematical formulation of the challenge.

Alongside the dialogue, the LLM may explain in plain English its understanding of the problem and ask the decision maker for clarifications to sort out missing information. We asked: “We’re creating a linear programming model that decides on the optimal number of units of Product A and Product B to produce. … Before proceeding to write the model, could you please clarify the time required for Product B in both the cutting and finishing departments?” These interactions with the LLM resulted in rapid and accurate output.
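As an illustration of how such a guided dialogue might be scaffolded, the sketch below embeds staged clarification questions, including the objective clarification question quoted above, in a system prompt and runs the conversation one turn at a time. The prompt wording and client code are our assumptions, not the sidebar’s exact setup.

```python
# Minimal sketch: a system prompt that steers an LLM to interview a decision
# maker and draft a first optimization model. Assumes the OpenAI client; the
# prompt scaffold is illustrative.
from openai import OpenAI

SYSTEM_PROMPT = """You are an analytics translator designing a prescriptive model.
Proceed in stages, asking one question at a time:
1. Objective Clarification: What is the primary goal of the optimization?
   Is it to minimize costs, maximize profits, ensure timely delivery, or
   something else?
2. Decision Variables: Which quantities can the decision maker actually set?
3. Constraints: What limits (capacity, budget, service levels) apply?
Once everything is clear, restate the problem in plain English, ask for any
missing data, and then write the full mathematical formulation."""

client = OpenAI()
messages = [{"role": "system", "content": SYSTEM_PROMPT}]

def ask(user_reply: str) -> str:
    """Run one turn of the model-design dialogue."""
    messages.append({"role": "user", "content": user_reply})
    answer = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=messages,
    ).choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer

print(ask("We need a weekly production plan for Products A and B."))
```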

Understanding model results. Even if LLMs can help teams craft rigorous, applicable prescriptive models to support decision-making, there is another barrier that, in our experience, is even more severe: the difficulty of deciphering the solutions such models produce. This complicated task often leads business users to distrust the results. Moreover, we have frequently seen technical teams spend many more hours than budgeted going back and forth with business teams to explain the results. There are numerous underlying reasons, but one rather obvious one is that decision makers take a more subtle approach to decision-making than algorithms, which often get stuck on corner solutions. For example, if a prescriptive algorithm finds a small savings in changing a supply chain network, it will recommend the change regardless of the change management effort such a move may entail.

LLMs can be an interesting aid when deploying prescriptive analytics to help teams understand model results. Analytics teams can feed into generative AI the mathematical notation representing the prescriptive model as well as the internal metrics, such as the unused capacity (or slack) in constraints that did not limit the solution and the opportunity costs associated with each limiting condition. (See “Provide Insight Into Model Workings.”)

Decision makers can then ask questions to understand the results that interest them, and the LLM can explain these results in plain English until the user is satisfied. This conversation enables generative AI to identify and explain areas where the model’s trade-offs may seem counterintuitive to decision makers. Moreover, this method facilitates the collection of additional feedback from decision makers, which can be used to refine the model’s mechanics.
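A minimal sketch of this pattern: solve a toy production-planning linear program, extract each constraint’s slack and shadow price, and pass them to the LLM as context for follow-up questions. The PuLP formulation, all coefficients, and the prompt are invented for illustration.

```python
# Minimal sketch: solve a toy production LP with PuLP, then feed the slack
# and shadow prices to an LLM so decision makers can interrogate the result.
# All coefficients are invented for illustration.
import pulp
from openai import OpenAI

prob = pulp.LpProblem("production_plan", pulp.LpMaximize)
a = pulp.LpVariable("product_A", lowBound=0)
b = pulp.LpVariable("product_B", lowBound=0)

prob += 40 * a + 30 * b, "profit"                 # objective: total profit
prob += 2 * a + 1 * b <= 100, "cutting_hours"     # finite cutting capacity
prob += 1 * a + 2 * b <= 80, "finishing_hours"    # finite finishing capacity
prob.solve(pulp.PULP_CBC_CMD(msg=False))

# Internal metrics: slack (unused capacity) and pi (the shadow price).
metrics = "; ".join(
    f"{name}: slack={c.slack:.1f}, shadow_price={c.pi:.2f}"
    for name, c in prob.constraints.items()
)
context = (
    f"Solution: A={a.value():.0f} units, B={b.value():.0f} units, "
    f"profit={pulp.value(prob.objective):.0f}. Constraints: {metrics}."
)

client = OpenAI()
answer = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{
        "role": "user",
        "content": context + " Explain in plain English which resources limit "
                             "profit and what relaxing each one would be worth.",
    }],
).choices[0].message.content
print(answer)
```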

Such an approach can be beneficial for all company stakeholders responsible for the success of analytics initiatives. In our experiments with this use case, we found that increasing business owners’ autonomy markedly strengthened their sense of empowerment and of control over the quality of the prescriptive methods. For technical teams, no longer having to hold repeated conversations with decision makers about topics that an automated system could explain was a huge relief.

A classic example of necessary interaction between analytics and business teams on prescriptive analytics projects is the need to understand why a given algorithmic run does not yield a solution that respects all defined constraints. Consider how smooth that interaction can be, judging by the answer we obtained after asking about the source of infeasibility in a retail distribution problem: “The primary constraint leading to the infeasible state is the insufficient supply to meet the combined demand and minimum stock requirements. The total available supply from all warehouses is not enough to satisfy the demands of the retail stores.”
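In code, the same pattern can flag an infeasible run automatically and hand the LLM the context it needs for a diagnosis. This is a minimal sketch in which the supply and demand figures are invented so that demand deliberately exceeds supply.

```python
# Minimal sketch: detect an infeasible prescriptive run and ask an LLM to
# explain the likely cause. Supply (90) is deliberately below demand (120).
import pulp
from openai import OpenAI

prob = pulp.LpProblem("retail_distribution", pulp.LpMinimize)
ship = pulp.LpVariable("warehouse_to_store", lowBound=0)

prob += 1 * ship, "transport_cost"
prob += ship <= 90, "warehouse_supply"   # total available supply
prob += ship >= 120, "store_demand"      # demand plus minimum stock
prob.solve(pulp.PULP_CBC_CMD(msg=False))

if pulp.LpStatus[prob.status] != "Optimal":
    description = "; ".join(f"{n}: {c}" for n, c in prob.constraints.items())
    client = OpenAI()
    diagnosis = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": "This retail distribution model is infeasible. "
                       f"Constraints: {description}. Explain the source of "
                       "the infeasibility in plain English.",
        }],
    ).choices[0].message.content
    print(diagnosis)
```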

Monitoring Model Quality and Business Impact

Companies should already be employing processes to monitor the performance of their advanced analytics models to detect errors and drift resulting from, for example, changes in variables or in the business environment that deviate from the model’s original assumptions. However, the quality of LLMs’ output (especially in the absence of additional techniques that further constrain or check that output) is, by design, somewhat unpredictable. Integrating LLMs thus introduces additional opportunities for errors, given the potential for random or objectively false responses.

The best approach for controlling the output quality of the LLM will depend on which of these integration opportunities is pursued. When incorporating complex data types into predictive models, the approach can be relatively straightforward: Companies can examine the end result and note whether the prediction task’s accuracy metric improves after the unstructured data sources are added. When using LLMs to explain predictions and to interpret prescriptive results, technical teams must thoroughly test the prompt and the answers generated across the range of questions business stakeholders may pose. Finally, for crafting model mechanics, an opportunity mainly focused on augmenting the development phase of prescriptive models, the outputs of LLMs will always have to be supervised by modeling experts, who must review them critically.
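For the first case, that check can be as simple as an A/B comparison of a holdout metric with and without the LLM-derived features. This is a minimal sketch on synthetic data, where the complaint-label column stands in for features produced by LLM labeling.

```python
# Minimal sketch: verify that LLM-derived features actually help by comparing
# a holdout AUC with and without them. All data here is synthetic.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2_000
base = pd.DataFrame({
    "tenure_months": rng.integers(1, 60, n),
    "monthly_spend": rng.normal(70, 25, n),
})
# Stand-in for a column produced by LLM labeling of complaint text.
complaint_label = rng.integers(0, 2, n)
y = ((complaint_label == 1) & (base["tenure_months"] < 12)).astype(int)

def holdout_auc(features: pd.DataFrame) -> float:
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, y, test_size=0.3, random_state=0
    )
    model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

with_llm = base.assign(complaint_label=complaint_label)
print(f"AUC without LLM features: {holdout_auc(base):.3f}")
print(f"AUC with LLM features:    {holdout_auc(with_llm):.3f}")
```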

Despite these caveats, bear in mind that the success of any organizational analytics effort depends on how relevant it is to the enterprise, as measured by usage frequency, adoption rate, and internal customer satisfaction. By allowing decision makers to interact with analytics models, the integration of generative AI could lower barriers to adoption, ease change management, and promote trust in outputs. As a result, we would expect to see improvements in these metrics.


Using LLMs together with advanced analytics tools can increase efficiency by streamlining the labor-intensive processes of explaining the validity of predictions and developing prescriptive models. LLMs can also make analytics more effective by aiding in the incorporation of complex data sets for predictive modeling and in understanding prescriptive model outputs. By leveraging the potential of unstructured data sources and identifying opportunities for model refinement, companies can enhance the quality of their outcomes.

We know from current AI and analytics practice that creating multidisciplinary teams that involve both business owners and data science specialists is essential to take full advantage of these opportunities. Thanks to the accessibility provided by LLMs’ natural language capabilities, integrating generative AI into analytics should empower business users to take a more active role in the development and monitoring of analytics applications. 
