Data lakehouses give enterprises analytics edge

Cyber3ra
2 min read · Aug 4, 2022

Data warehouses, for their part, enable companies to store large amounts of structured data with well-defined schemas. They are designed to handle many concurrent queries and to return results quickly to a large number of users.

Data lakes, on the other hand, enable companies to collect raw, unstructured data in many formats for data analysts to hunt through. These vast pools of data have grown in prominence of late thanks to the flexibility they provide enterprises to store vast streams of data without first having to define the purpose of doing so.

The market for these two types of big data repositories is “converging in the middle, at the lakehouse concept,” Ronthal says, with established data warehouse vendors adding the ability to manage unstructured data, and data lake vendors adding structure to their offerings.

For example, on AWS, enterprises can now pair Amazon Redshift, a data warehouse, with Amazon Redshift Spectrum, which enables Redshift to reach into unstructured data held in Amazon S3 data lakes. Meanwhile, Snowflake, best known as a cloud data warehouse, can now support unstructured data with external tables, Ronthal says.
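To make the idea of a warehouse reaching into a lake more concrete, here is a toy sketch in Python. It does not use Redshift Spectrum or Snowflake; the `customers` table, the `RAW_EVENTS` records, and both helper functions are hypothetical stand-ins. An in-memory SQLite table plays the warehouse, and a block of raw JSON lines plays files sitting in object storage. The query reads the raw records in place and joins them to the structured table, rather than loading them into the warehouse first.

```python
import json
import sqlite3

# Hypothetical "warehouse": a structured table with a fixed schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# Hypothetical "lake": raw JSON lines, as they might sit in object storage.
RAW_EVENTS = """\
{"customer_id": 1, "event": "click", "amount": 0}
{"customer_id": 2, "event": "purchase", "amount": 40.0}
{"customer_id": 1, "event": "purchase", "amount": 25.5}
"""


def scan_raw_events(raw):
    """Parse raw records at query time instead of loading them up front."""
    for line in raw.splitlines():
        if line.strip():
            yield json.loads(line)


def purchases_by_customer(conn, raw):
    """Join warehouse rows with lake records without copying the lake data."""
    names = dict(conn.execute("SELECT customer_id, name FROM customers"))
    totals = {}
    for event in scan_raw_events(raw):
        if event["event"] != "purchase":
            continue
        name = names.get(event["customer_id"], "unknown")
        totals[name] = totals.get(name, 0.0) + event["amount"]
    return totals


result = purchases_by_customer(warehouse, RAW_EVENTS)
print(result)
```

Real platforms do this with SQL over external tables and at far larger scale, but the shape is the same: the structured side keeps its schema, and the raw side is interpreted only when a query needs it.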

When companies have separate lakes and warehouses, and data needs to move from one to the other, it introduces latency and costs time and money, Ronthal adds. Combining the two in one platform reduces effort and data movement, thereby accelerating the pace of uncovering data insights.

And, depending on the platform, a data lakehouse can also offer other features, such as support for data streaming, machine learning, and collaboration, giving enterprises additional tools for making the most of their data.

Data lakehouses store data in such a way that it is readily available for use by a wide array of technologies, from traditional business intelligence and reporting systems to machine learning and artificial intelligence, Karan adds. “Other benefits include reduced data redundancy, simplified IT operations, a simplified data schema to manage, and easier to enable data governance.”

One particularly valuable use case for data lakehouses is helping companies get value from data previously trapped in legacy or siloed systems. For example, one Capgemini enterprise customer, which had grown through acquisitions over a decade, couldn’t access valuable data related to resellers of its products.
