Who Profits the Most From Generative AI?

Unpacking what it takes to build and deploy a large language model reveals which players stand to gain the most — and where newer entrants might have the best prospects.

In the months since the public launch of ChatGPT, investment in generative AI has surged: Venture capital firms have plowed money into startups, and corporations have ramped up spending on the technology in hopes of automating elements of their workflows. The excitement is merited. Early studies have shown that generative AI can deliver significant productivity gains.1 Some of those gains will come from augmenting human effort, and some from substituting for it.

But key questions remain: Who will capture the value of this exploding market, and what determines who captures it? To answer these questions, we analyzed the generative AI stack — broadly categorized as computing infrastructure, data, foundation models, fine-tuned models, and applications — to identify points ripe for differentiation. Although there are generative AI models for text, images, audio, and video, we use text (large language models, or LLMs) as the illustrative context throughout our discussion.

Computing infrastructure. At the base of the generative AI stack is specialized computing infrastructure powered by high-performance graphics processing units (GPUs) on which machine learning models are trained and run. To build a new generative AI model or service, a company could purchase GPUs and related hardware and train and run an LLM on its own premises. For most companies, however, this would be cost-prohibitive and impractical, given that the same infrastructure is available on demand from major cloud vendors, including Amazon Web Services (AWS), Google Cloud, and Microsoft Azure.
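
To see why, consider a rough back-of-the-envelope estimate. The Python sketch below prices a training run using the common approximation that training requires about 6 × parameters × tokens floating-point operations; the GPU throughput, utilization rate, and hourly price are illustrative assumptions, not quoted vendor figures.

```python
# Back-of-the-envelope cost estimate for training an LLM on rented cloud GPUs.
# Every constant below is an illustrative assumption, not a quoted vendor price.

def training_cost_usd(
    params: float,                  # model parameters (e.g., 175e9 for a GPT-3-scale model)
    tokens: float,                  # training tokens (e.g., 300e9)
    gpu_tflops: float = 312.0,      # assumed peak FP16 throughput per GPU, in teraFLOPs
    utilization: float = 0.4,       # assumed fraction of peak throughput actually sustained
    usd_per_gpu_hour: float = 3.0,  # assumed on-demand price per GPU-hour
) -> float:
    """Estimate training cost using the common ~6 * N * D FLOPs approximation."""
    total_flops = 6 * params * tokens
    sustained_flops_per_gpu = gpu_tflops * 1e12 * utilization
    gpu_hours = total_flops / sustained_flops_per_gpu / 3600
    return gpu_hours * usd_per_gpu_hour

# A GPT-3-scale run: ~175B parameters trained on ~300B tokens.
print(f"${training_cost_usd(175e9, 300e9):,.0f}")  # roughly $2 million under these assumptions
```

Even under these generous assumptions, a single GPT-3-scale run lands in the millions of dollars, before counting failed runs, data pipelines, and engineering time, which is why most builders rent capacity from cloud vendors rather than buy hardware.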

Data. Generative AI models are trained on massive, internet-scale data sets. For example, training data for OpenAI’s GPT-3 included Common Crawl, a publicly available repository of web crawl data, as well as Wikipedia, online books, and other sources. Because data sets like Common Crawl are built from broad crawls of the web, data from many websites, such as those of the New York Times and Reddit, was ingested during the training process. Foundation models also incorporate domain-specific data that is crawled from the web, licensed from partners, or purchased from data marketplaces such as Snowflake Marketplace. And while AI model developers release information about how their models were trained, they do not provide detailed information about the provenance of their training data.
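
The provenance question is easy to probe at a small scale. Common Crawl exposes a public CDX index that reports which pages were captured in each crawl snapshot, and the short sketch below queries it for a given site. The snapshot ID is a placeholder assumption, since the set of available crawls changes over time.

```python
# Query Common Crawl's public CDX index to see whether pages from a site
# appear in a crawl snapshot. Requires the third-party `requests` package.
import json

import requests

# Snapshot ID is a placeholder; the list of available crawls changes over time.
CRAWL_ID = "CC-MAIN-2023-50"
INDEX_URL = f"https://index.commoncrawl.org/{CRAWL_ID}-index"

def sample_captures(site: str, limit: int = 5) -> list[dict]:
    """Return up to `limit` index records for pages captured under `site`."""
    resp = requests.get(
        INDEX_URL,
        params={"url": f"{site}/*", "output": "json", "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    # The index responds with one JSON record per line.
    return [json.loads(line) for line in resp.text.splitlines()]

for record in sample_captures("nytimes.com"):
    print(record["timestamp"], record["status"], record["url"])
```

Running this against a publisher’s domain surfaces capture records in seconds, which is why claims that a given site’s content was ingested are straightforward to check at the crawl level even when model developers disclose little.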

References

1. S. Peng, E. Kalliamvakou, P. Cihon, et al., “The Impact of AI on Developer Productivity: Evidence From GitHub Copilot,” arXiv, submitted Feb. 13, 2023, https://arxiv.org; and S. Noy and W. Zhang, “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence,” Science 381, no. 6654 (July 13, 2023): 187-192.

2. A. Vaswani, N. Shazeer, N. Parmar, et al., “Attention Is All You Need” (paper presented at the 31st Annual Conference on Neural Information Processing Systems, Long Beach, California, Dec. 6, 2017).

3. For a catalog of some of these models, see X. Amatriain, A. Sankar, J. Bing, et al., “Transformer Models: An Introduction and Catalog,” arXiv, revised May 25, 2023, https://arxiv.org.

4. S. Wu, O. Irsoy, S. Lu, et al., “BloombergGPT: A Large Language Model for Finance,” arXiv, revised Dec. 21, 2023, https://arxiv.org.

5. X. Li, S. Chan, X. Zhu, et al., “Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks,” arXiv, revised Oct. 10, 2023, https://arxiv.org.
