DaaS is stickier than any other business service - that’s why it’s valuable

Lauren Cascio on the complexity of data valuation, in conversation with Thani Shamsi

Lauren Cascio, founding partner of Gulp Data, sat down with Monda’s CEO and co-founder, Thani Shamsi, to discuss the factors affecting a dataset’s value, the demand shift towards proprietary data, and why a lack of transparency in data pricing may not be such a bad thing.

---

Does Lauren Cascio agree with Ilya Sutskever, co-founder of OpenAI, that data is the fossil fuel for AI, and that we have achieved peak data? As Sutskever put it, there will “be no more”. In short, no, she does not. Cascio agrees that AI companies may have reached “peak data” in terms of available web data, but she highlights the emerging shift toward leveraging untapped, private datasets.

Cascio explains that AI has dramatically expanded the ability of organizations to tap into and process large datasets. “AI has essentially set the foundation for organizations to be able to use data,” she says. This foundational role of AI allows businesses to more effectively analyze, organize, and extract insights from their data, creating opportunities for smarter decision-making across various industries. However, she emphasizes that AI is only as good as the data it’s trained on, and in its current state, much of the AI technology available today lacks true differentiation.

A major issue, according to Cascio, is the over-reliance on publicly available data, particularly web-scraped information. “Most AI is not differentiated today because they’ve all trained their tools on the same open-source, public information,” she explains. This widespread use of identical datasets across AI models leads to several problems, particularly the amplification of inherent biases in the data. AI models trained on the same sources perpetuate the same issues, recycling biases and producing results that may not be as accurate or nuanced as they could be.

In addition to these biases, Cascio points out the dangers of “poisoned” data—AI-generated content that feeds back into the system, further skewing outcomes. She underscores that new, high-quality, and ethically sourced data is essential to build AI models that are not only differentiated but capable of producing truly useful and profitable products. “You need new, proprietary, different data that is deep and correct, factual and ethically sourced,” she says. Without this shift, AI will remain limited in its ability to create value in the long term.

Proprietary data can fix the ‘poisonous’ effects of overusing public data

This shift toward proprietary data is already underway, Cascio observes. “We’re going to see more deals between AI companies and companies that can provide proprietary data,” she predicts. As AI companies recognize the limitations of public web data, they will increasingly turn to private organizations with valuable datasets. This, she believes, will drive a new wave of partnerships and data-sharing agreements, ultimately enabling AI to thrive with more specialized, unique sources of information.

Cascio also highlights the vast untapped potential of proprietary data, particularly that which resides within the walls of corporations. While much of this data has been underutilized in the past, companies are now beginning to realize its value. “We have started to see a huge uptick in corporations that are like, ‘Hey, take the exhaust data of our business and tell us what to do with it,’” she explains. These businesses are waking up to the realization that even seemingly minor operational data can be a goldmine, offering new insights, revenue opportunities, and competitive advantages. Cascio notes that as more organizations explore how to monetize their “exhaust data,” there will be an increasing number of partnerships between data-rich corporations and AI companies looking to unlock new value.

This shift is also driven by high-profile licensing deals between AI companies and platforms that own large datasets. Cascio points to the deals involving Reddit and Shutterstock, among others, as examples of how the data market is evolving. These agreements showcase the potential of proprietary data to be bought, sold, and shared for profit, further encouraging companies to think strategically about their data assets.

While the notion of “peak data” might hold true when it comes to publicly available web data, Cascio argues that the true potential of AI lies in tapping into the vast reserves of proprietary data that are still largely untapped. As companies begin to recognize the value of their own data, the future of AI may no longer be defined by what is publicly available on the internet, but by the rich, proprietary datasets that businesses have locked away.

In the coming years, AI development will likely be shaped by a growing reliance on proprietary, differentiated, and ethically sourced data. As companies unlock and monetize their data, and as more AI companies secure access to these exclusive data streams, the industry will continue to evolve, driven not by the open web, but by the vast, untapped treasure troves of data within organizations themselves.

Gulp Data was created to help companies value their data

Lauren Cascio, a serial entrepreneur with a deep-rooted focus on data, shared the journey that led her to found Gulp Data—a company dedicated to recognizing data as a tangible asset. Cascio's obsession with data began early, rooted in her background in healthcare. She started her first company in 2014, a health tech venture that aimed to use healthcare data to help individuals better understand their risk profiles for diseases. Managing clinical data for hospitals, doctors, and labs, Cascio’s role in data strategy eventually led her to ask a fundamental question: “What can we do with our underlying data assets?”

As her company shifted towards a DaaS model, Cascio became increasingly frustrated by the lack of recognition that data could be monetized as a valuable asset. “Data had become our leading source of revenue, but the firms that were backing us weren’t recognizing it as an asset,” she recalls. Despite data being a revenue driver, it wasn’t being valued on the company’s balance sheet like other intangible assets such as patents or copyrights. This disconnect became the catalyst for launching Gulp Data. “Data is still that company’s main source of revenue, and it seems wild that standard valuations don’t attribute data as an asset,” she notes, adding that this oversight prompted her to build a company that could change how data is perceived and valued in the business world.

Cascio’s vision for Gulp Data began with a simple but ambitious idea: to demonstrate that data could be an asset class in its own right. The company’s first initiative was to create a system that could lend on data assets, thereby showing the market that data has intrinsic value. “We first had to build a valuation engine that could value data,” she explains. This was no small task, but it set Gulp Data on a path to revolutionize how businesses understand the worth of their data.

Today, the core mission of Gulp Data remains focused on transforming how businesses view data as an asset. Cascio describes the company's North Star as “changing generally accepted accounting principles (GAAP)”. This shift in accounting principles is something she believes will have a lasting impact on business valuations and investment strategies, especially as companies begin to treat data with the same reverence as intellectual property or physical assets.

Data valuation goes mainstream - or is beginning to 

In the past year, Cascio has seen a tangible shift in how businesses are approaching data. More companies are now incorporating data into their valuations, signaling a broader acceptance of data as a key asset. While it’s not yet a universally accepted practice, the change is happening. “We are seeing more private equity firms, banks, and financial backers accepting the possibility that data is indeed an asset that belongs in company consideration,” she says. This shift, once driven primarily by data teams within companies, is now being led by executives. “CEOs, CFOs, and boards are now coming to us asking for help valuing data,” Cascio explains, marking a significant change in the stakeholders seeking data valuation services.

While valuing data is still far from standard practice, Cascio is optimistic about the future. She believes 2024 marked a turning point. “2024 was a year of companies coming to us saying, ‘Hey, help us value this,’” she says. With more CEOs and CFOs becoming aware of the value of their data, Cascio sees the groundwork being laid for widespread acceptance of data as a financial asset.

Gulp Data’s success has been driven largely by its B2B relationships, particularly with companies looking to better understand and quantify the value of their data in the context of mergers and acquisitions (M&A). Cascio anticipates that this trend will continue into 2025, especially with M&A activity picking up in the coming years. “We think we’ll continue to see this in 2025, particularly with the M&A action picking up this year,” she adds, suggesting that more companies will recognize the importance of properly valuing data during acquisition processes.

As Gulp Data continues to build on this momentum, Cascio is confident that the way businesses approach data is changing for good. The shift towards viewing data not as an afterthought but as a core business asset is already underway, and with it, the financial value of data will continue to grow, transforming industries and investment strategies in the years to come.

There’s no one method for sticking a price tag on a dataset

Data valuation is a complex and multifaceted process, particularly in the context of AI development and the broader data market. According to Lauren Cascio, the value of data is not solely about assigning a price tag to it based on quantity or attributes. While basic questions about the size and quality of a dataset—such as how many attributes it has, how often it’s updated, or how big the database is—are important, they are only part of the picture. True data valuation, she explains, depends heavily on market demand and the specific use cases for that data.

At Gulp Data, Cascio emphasizes that the company doesn’t rely on traditional valuation methods like cost-based approaches; instead, they use a market comparison model. “We take a company’s data assets, we assess the overall quality, and we compare how similar they are to our massive comps database, and that’s how it receives a price tag,” she says. But the real value goes beyond the price tag. While the price is useful for M&A transactions, business valuations, or raising capital, it doesn’t help companies effectively bring their data to market. The real value of a comps-based data valuation, Cascio points out, is understanding how to market the data, determine its pricing, and identify the right buyers. Without this knowledge, companies may find themselves “throwing spaghetti at the wall” when attempting to monetize their data.

A key part of the process is understanding the target audience and the most marketable attributes of the data. For example, businesses need to ask questions like: “Who is buying my data?” and “How can I package it effectively?” Cascio explains that the focus should be on identifying valuable and relevant data attributes that will appeal to potential buyers. She describes a scenario in which a healthcare company might collect extensive patient information, such as blood panels, but could potentially increase the marketability of its data by adding insurance and demographic details. The right combination of data attributes can significantly enhance the value and appeal of a dataset, provided that it aligns with what buyers are actively seeking.
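
As a rough illustration of that kind of packaging, the sketch below joins a clinical table with insurance and demographic attributes to produce a richer, more marketable product. It is a hypothetical example, not Gulp Data’s tooling; the column names and records are invented for the illustration.

```python
import pandas as pd

# Hypothetical clinical records: blood-panel results keyed by patient ID.
blood_panels = pd.DataFrame({
    "patient_id": [101, 102, 103],
    "ldl_mg_dl": [160, 95, 210],
    "hdl_mg_dl": [38, 60, 29],
})

# Hypothetical enrichment tables: insurance coverage and demographics.
insurance = pd.DataFrame({
    "patient_id": [101, 102, 103],
    "payer": ["Acme Health", "Beta Mutual", "Acme Health"],
    "plan_type": ["PPO", "HMO", "PPO"],
})
demographics = pd.DataFrame({
    "patient_id": [101, 102, 103],
    "age_band": ["50-59", "30-39", "60-69"],
    "region": ["Northeast", "South", "West"],
})

# Join the attribute groups into a single, richer product. Each added
# attribute group broadens the set of buyers the dataset can be marketed to
# (e.g. payers, pharma outreach teams, actuarial analysts).
enriched = (
    blood_panels
    .merge(insurance, on="patient_id")
    .merge(demographics, on="patient_id")
)
print(enriched.head())
```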

When discussing data valuation, Cascio highlights the importance of the market context—the “neighborhood” where the data resides. She likens it to real estate valuation: “A house with three bedrooms can go for $500K in one place, but if you have the same house in Boston or San Francisco, the context changes everything.” Just as location and other factors influence real estate pricing, the market conditions surrounding a dataset, including its volume, quality, and the specific use cases it serves, play a critical role in determining its value.

To assess this, Gulp Data looks closely at the specifics of the data. Cascio uses the example of Zillow’s property valuation approach, which compares the characteristics of a house—such as the number of bedrooms, the size of the lot, or recent renovations—to similar properties in the same neighborhood. In the same way, Gulp Data benchmarks datasets against others in the market, taking into account factors like the number of records, the data’s age, and how much duplication exists in the dataset. The deeper and more comprehensive the data, the higher its potential value. This concept of the "neighborhood" is key to understanding data valuation in a dynamic, market-driven way.
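
Gulp Data has not published its valuation engine, but the Zillow analogy suggests a comps-style approach. The sketch below is a minimal, hypothetical illustration of that idea: it profiles a dataset on a few of the features Cascio mentions (record count, age, duplication) and prices it against the most similar comparables. The comps table, weights, and similarity measure are all invented for the example.

```python
from dataclasses import dataclass
import math

@dataclass
class DatasetProfile:
    records: int          # number of records
    age_years: float      # average age of the data
    duplication: float    # fraction of duplicate records, 0..1

# Hypothetical comparables: profiles of previously priced datasets in the
# same "neighborhood" (similar domain and use case), with their deal prices.
COMPS = [
    (DatasetProfile(records=2_000_000, age_years=1.0, duplication=0.05), 120_000),
    (DatasetProfile(records=500_000,   age_years=0.5, duplication=0.02),  60_000),
    (DatasetProfile(records=5_000_000, age_years=3.0, duplication=0.20),  90_000),
]

def similarity(a: DatasetProfile, b: DatasetProfile) -> float:
    """Crude similarity score: closer profiles score higher (max 1.0)."""
    d_records = abs(math.log10(a.records) - math.log10(b.records))
    d_age = abs(a.age_years - b.age_years)
    d_dup = abs(a.duplication - b.duplication)
    return 1.0 / (1.0 + d_records + d_age + d_dup)

def comp_based_estimate(target: DatasetProfile) -> float:
    """Price the target as a similarity-weighted average of comparable deals."""
    weights = [similarity(target, comp) for comp, _ in COMPS]
    weighted_sum = sum(w * price for w, (_, price) in zip(weights, COMPS))
    return weighted_sum / sum(weights)

estimate = comp_based_estimate(
    DatasetProfile(records=1_200_000, age_years=1.5, duplication=0.08)
)
print(f"Estimated value: ${estimate:,.0f}")
```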

However, as Cascio points out, the context of data use also extends beyond the dataset itself. The brand, reputation, and marketability of the data play significant roles in determining its value. “Data valuation has an inside view on the data itself—its quality, quantity, and completeness—but there’s also this other piece: the context of the use case, the marketability, even the brand of the data,” she says. The data’s intended use and the specific buyer’s needs are critical factors in shaping its price. A dataset may be highly valuable in one industry or application, while in another, it may be far less relevant.

In the AI space, this context-driven valuation is crucial. For AI companies looking to acquire datasets to train their models, understanding the precise market fit for that data is just as important as its technical specifications. Without a clear understanding of demand and use cases, AI companies may struggle to source the right data or accurately assess its value. For companies looking to monetize their data, this means focusing not just on the size and quality of their datasets, but on how those datasets can be packaged and marketed to the right buyers.

Ultimately, Cascio stresses that data valuation is not a one-size-fits-all process. It requires a deep understanding of both the data itself and the broader market dynamics. By focusing on the use case, the buyer’s needs, and the market context, companies can more effectively price and sell their data. This shift toward a more nuanced, market-driven approach to data valuation is one of the key factors that will drive the continued growth of the data economy, particularly in AI development and other data-centric industries.

Same dataset, different buyers, different pricing

Cascio observes that the pricing landscape is driven by customer needs and usage models, making it distinct from the more standardized pricing typically seen in software. “Data pricing is often customer-driven, which makes it so interesting,” she explains. “You have purchasing models and usage models that vary greatly depending on the specific buyer or use case.”

To illustrate this, Cascio provides an example from the healthcare sector. Imagine a dataset containing analytics about heart disease and patients with heart disease indicators. This dataset could be consumed by three different buyers in three distinct ways, each demanding a different pricing model. The first buyer might be a pharmaceutical company or healthcare provider using an interactive dashboard to target at-risk populations. This would likely involve a subscription model, where the customer pays a monthly or annual fee for ongoing access to the data and its analysis.

“Some companies even throw these datasets into tools like Tableau, productize them, and sell them,” Cascio notes. This highlights a typical DaaS model—subscription-based, with pricing tied to the value and usability of the product.

The second use case could involve unstructured clinical notes being used to train AI models. This kind of data could be priced on a consumption basis, where buyers pay for the amount of data they use. This model is becoming increasingly common in AI-driven industries, where customers are more focused on the quantity of data rather than its packaged analysis.

A third buyer—say, a financial institution—might be interested in data about the products covered by insurance companies. This use case is driven by how much value the buyer derives from the dataset, which changes the pricing model entirely. “That’s where DaaS differs from SaaS,” says Cascio. “It’s driven by the customer’s unique needs. Just like Salesforce, which tailors pricing based on company size or usage, DaaS is highly customizable.”

Cascio predicts that data pricing models are moving toward greater flexibility, much like what’s seen in the software industry. Companies are starting to offer more varied options, such as selling raw data at one price or offering more complex products like dashboards and enriched data at a higher cost. “Some will say, ‘We can provide you raw data for this price, but if you need a product or a dashboard, it’s this much more. And if you want us to enrich it, there’s an additional cost,’” Cascio explains.
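
A hypothetical sketch of how those three models for the same heart-disease dataset might be expressed follows. The tier names, rates, and thresholds are invented for the example, not figures from the interview.

```python
def subscription_price(months: int, monthly_fee: float = 2_500.0) -> float:
    """Dashboard or productized access: flat recurring fee (typical DaaS model)."""
    return months * monthly_fee

def consumption_price(gb_consumed: float, rate_per_gb: float = 40.0) -> float:
    """Raw or unstructured data for model training: pay for the volume used."""
    return gb_consumed * rate_per_gb

def value_based_price(buyer_annual_value: float, capture_rate: float = 0.10) -> float:
    """Value-based deal: price as a share of the value the buyer derives."""
    return buyer_annual_value * capture_rate

# The same dataset, priced three different ways for three different buyers.
print(subscription_price(months=12))                    # healthcare provider dashboard
print(consumption_price(gb_consumed=800))               # AI training corpus
print(value_based_price(buyer_annual_value=1_000_000))  # financial institution
```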

However, she acknowledges that this flexibility can sometimes lead to frustration for data buyers. The common refrain of “Talk to sales for a custom quote” can feel opaque to those looking for straightforward pricing. Cascio admits that the lack of transparency in the data market is a problem, though she doesn’t view it as an intentional effort to be opaque. Instead, she believes it’s a reflection of the maturity of the market.

“It’s sales,” says Cascio. “It comes down to selling a product and service to another entity. Yes, there’s a lack of transparency in the data market, but that’s not the core issue.” She points out that this lack of transparency is part of the broader challenge of creating a more efficient and established market for data. “We’re still trying to create a market for data,” she says. “Data has been around for a long time, but it’s still figuring out how to be a well-established market. This is part of the economic model. Once we gain transparency and more market efficiency, that’s when it becomes a standard.”

What will 2025 bring for data valuation and monetization?

Cascio’s vision is that, as the market matures, data pricing will become more transparent and standardized, just as we’ve seen in other industries. Until then, the flexibility in pricing models is likely to remain a defining feature of the data market.

What else makes data valuation different from valuing other assets, services or commodities? Cascio explains, “I believe that DaaS is stickier than any other business service, including SaaS, because once you embed a source into your operations, it is expensive and difficult to remove it.”

When data is sold to AI companies, however, the implications shift dramatically. Cascio compared the traditional provision of data via APIs or dashboards to selling data for derivative use, describing the latter as akin to “giving away the cow.” In such transactions, companies risk empowering their customers to become competitors. The difficulty of proving and rectifying unauthorized derivative use, she explained, poses significant challenges, even when contractual safeguards are in place.

Cascio highlighted the importance of ensuring proper usage rights before monetizing data. Many companies eager to enter the data market, she noted, underestimate this step, often lacking clarity on intellectual property (IP) rights. Her advice: create derivative products first. Enriching and adding value to raw data not only differentiates it but also bolsters protection by creating something proprietary.

Reflecting on recent high-profile licensing deals, Cascio expressed optimism about the opportunities for companies entering the data market. “We saw the $60 million Reddit deal [with Google to license content for AI training]. And then Axel Springer, Shutterstock and Yelp. All in 2024.” However, she cautioned against viewing such deals as the standard, noting that these represent the "tip of the iceberg."

Looking ahead to 2025, Cascio predicted a surge of new entrants into the data market. Interestingly, she expects a shift from traditional DaaS providers to corporations leveraging their "data exhaust"—the byproducts of their core operations. While these companies possess valuable data assets, Cascio underscored the steep learning curve involved in monetizing them, including decisions about infrastructure, pricing, and sales strategies. Aggregators, she believes, are well-positioned to capitalize on this trend by helping these corporations refine and package their data for broader market appeal.

“It’s a raw material. One of our executives, Nate [Nathan Whigam], calls it data smelting. He wrote an article using the Kingsford charcoal example. Ford had all of this exhaust wood waste. And so, instead of disposing of the wood waste, they created charcoal. Kingsford charcoal was born out of waste materials. And so we use that example a lot with corporations that have never thought about selling data.”

But for many companies, Cascio continues, data is still viewed as a cost center, requiring a shift in mindset to recognize its potential as a revenue-generating asset. Cascio also forecasted an uptick in mergers and acquisitions (M&A) activity centered on data, driven by a growing recognition of its strategic importance. This presents exciting opportunities but also demands realistic expectations. Companies new to the market must navigate a complex terrain to turn raw data into a refined, marketable product.

Cascio provided a forward-looking perspective on the evolving demand for data, highlighting key trends expected to shape 2025. While traditional industries like healthcare, financial services, and consumer behavior remain the cornerstone of data demand, Cascio noted a growing appetite for unstructured data. This trend, which emerged in late 2024, aligns with advancements in AI technologies that make it easier to integrate and analyze diverse data formats.

According to Cascio, a major catalyst for the increased demand is the availability of tools that simplify the integration of data into products, services, and operations. She credited AI advancements for reducing barriers to working with new information, including unstructured data, and fostering broader adoption across sectors.

Despite the AI-driven surge, Cascio emphasized that established industries would continue to dominate data consumption. "The demand for healthcare, financial services, and consumer behavior data isn’t going anywhere," she explained, adding that these sectors are foundational to the data economy.

Cascio also pointed to a shift in corporate behavior as a factor driving demand. With data valuations becoming more widely understood, more companies are expected to embrace external data sources as strategic assets. This growing comfort with purchasing and utilizing data, she suggested, will fuel demand across both established and emerging industries.

While the landscape evolves, Cascio highlighted a balanced outlook. AI companies are a rising force, but their growth is unlikely to eclipse the enduring needs of core industries. Instead, the confluence of innovation and traditional demand promises a robust and diverse data market in the coming year. “I think more companies will be comfortable purchasing information, certainly after data valuations become mainstream,” concludes Cascio, summarizing her outlook on 2025 as a founder with skin in the game when it comes to making data valuation not an exceptional practice, but a business norm.

---

Lauren Cascio is the founding partner of Gulp Data, a company that provides non-dilutive funding using data assets as collateral. She also co-founded abartysHealth, a health-tech company where she led product, data, and development for six years. Additionally, Lauren is a proven angel investor and an active tech ecosystem builder, successfully advising and mentoring numerous companies through go-to-market strategies, data monetization, and fundraising. 

Learn more about Lauren and Gulp Data:

LinkedIn: Lauren Cascio

Web: Gulp Data
