Ultimate List of Data Licensing Deals for AI


Generative AI has made already data-hungry tech companies insatiable for content to train their AI models. This phenomenon has sparked high-profile legal disputes, such as the recent case News Corp brought against Perplexity AI for alleged 'content kleptocracy'. Such cases demonstrate the importance of a data monetization environment in which content creators are fairly remunerated for their intellectual property.

In such a healthy data-sharing economy, content creators can generate net-new revenue by monetizing their data assets, and AI companies can train more reliable models on diverse, human-generated datasets.

We've compiled an ongoing list to track licensing deals between companies for AI training purposes. Only publicly disclosed deals are included.

Undoubtedly, more licensing deals are happening behind the scenes, and many more will be reported in the coming months, which we'll add here. As long as the AI revolution continues, AI models will need data to train on, so we expect more household-name companies to strike data licensing deals with AI firms, like those listed below:

1. Reddit

The social media platform, Reddit

Status: Confirmed

Deal Details: Ongoing and historical data access with Google.

Financial Information: Reported to be $60M/year; the S-1 reports $66.4M between 2021 and 2022.

Source: SEC

2. Shutterstock

The image distribution platform, Shutterstock

Status: Confirmed

Deal Details: Deals with Meta, OpenAI, Amazon, Apple.

Financial Information: Estimated $25-50M deals with Amazon, Apple. Data labeled and prepared by Shutterstock's contributors.

Source: Shutterstock Press Release

3. Yelp

The consumer review site, Yelp

Status: Confirmed

Deal Details: Licensed review and location data to Perplexity AI and Neeva, plus other LLM companies.

Financial Information: Revenue in the "Other" category, which includes AI data licensing, grew to $47M in 2023.

Source: The Verge

4. Reuters

The news reporting and media company, Reuters

Status: Confirmed

Deal Details: Licenses its news content to help train large AI models.

Financial Information: Added $22M in the Reuters News Segment, increasing their overall AI-related revenue.

Source: Reuters

5. Wiley

The academic book publisher, Wiley

Status: Confirmed

Deal Details: The major academic publisher licensed previously published academic papers for use in AI model training.

Financial Information: One-time revenue of $23M for previously published academic papers.

Source: The Bookseller

6. Axel Springer

The media conglomerate, Axel Springer

Status: Confirmed

Deal Details: Multi-year contracts giving OpenAI historical and ongoing access to news content.

Financial Information: OpenAI reportedly offers $1-5M per corpus; Apple reportedly offers $50M over a multi-year period.

Source: The Information

7. Condé Nast

The magazine publishing group, Condé Nast

Status: Confirmed

Deal Details: The media company and publisher partnered with OpenAI to license its data.

Financial Information: No specific financial information available.

Source: WIRED

8. X (Formerly Twitter)

The social media platform, X

Status: Confirmed

Deal Details: Provides third parties with firehose access to users' data, unless users opt out.

Financial Information: $42K/month or $2.5M/year.

Source: TechCrunch

9. The Financial Times

The newspaper group, the Financial Times

Status: Confirmed

Deal Details: Signed a deal with OpenAI that both parties refer to as a “strategic partnership and licensing agreement.” The deal licenses the FT’s content for training AI models and for display in generative AI responses produced by tools like ChatGPT.

Financial Information: No specifics disclosed, but the arrangement is understood to be a non-exclusive license, and OpenAI is not taking any stake in the FT Group.

Source: TechCrunch

10. Associated Press (AP)

The publishing group, Associated Press (AP)

Status: Confirmed

Deal Details: The license gives OpenAI access to AP news stories going back to 1985.

Financial Information: No specific financial information available.

Source: APNews

