Saturday, January 2, 2021

Unify Data Warehousing and Advanced Analytics

Most data warehouses today still deal only with structured data. Some of them also use unstructured data from a data lake or a landing layer in front of the warehouse. According to the CIDR paper referenced at the end of this post, the data warehouse architecture as we know it today will wither in the coming years and be replaced by a new architectural pattern, the Lakehouse, which will (i) be based on open direct-access data formats, such as Apache Parquet, (ii) have first-class support for machine learning and data science, and (iii) offer state-of-the-art performance. Lakehouses can help address several major challenges with data warehouses, including data staleness, reliability, total cost of ownership, data lock-in, and limited use-case support. The paper discusses how the industry is already moving toward Lakehouses and how this shift may affect work in data management. It also reports results from a Lakehouse system using Parquet that is competitive with popular cloud data warehouses on TPC-DS.
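To make the idea of an open, direct-access format more concrete, here is a minimal sketch in Python using only the standard library. It is a toy, not Parquet: each column is stored in its own JSON file as a stand-in for a columnar format, and the table, column names, and helper functions (write_table, read_columns, total_by_region, feature_rows) are all made up for illustration. The point it tries to show is that a warehouse-style aggregate and an ML-style feature read can both work directly on the same files, without copying the data into a separate system first.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def write_table(root: Path, columns: dict[str, list]) -> None:
    """Store each column in its own file (toy stand-in for a columnar format such as Parquet)."""
    root.mkdir(parents=True, exist_ok=True)
    for name, values in columns.items():
        (root / f"{name}.json").write_text(json.dumps(values))


def read_columns(root: Path, names: list[str]) -> dict[str, list]:
    """Read only the requested columns, directly from the files."""
    return {name: json.loads((root / f"{name}.json").read_text()) for name in names}


def total_by_region(root: Path) -> dict[str, float]:
    """Warehouse-style aggregate: touches only the 'region' and 'amount' columns."""
    cols = read_columns(root, ["region", "amount"])
    totals: dict[str, float] = {}
    for region, amount in zip(cols["region"], cols["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def feature_rows(root: Path) -> list[tuple[float, int]]:
    """ML-style read: pulls feature columns from the same files, with no export step."""
    cols = read_columns(root, ["amount", "items"])
    return list(zip(cols["amount"], cols["items"]))


# Hypothetical sample data, written once and then read by both consumers.
table_dir = Path(tempfile.mkdtemp()) / "sales"
write_table(table_dir, {
    "region": ["east", "west", "east"],
    "amount": [10.0, 20.0, 5.0],
    "items": [1, 4, 2],
})
print(total_by_region(table_dir))
print(feature_rows(table_dir))
```

In a real Lakehouse the files would be in a format such as Apache Parquet and would be read by a query engine or a data science library; this sketch only mirrors the single-copy, direct-access idea.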


Please refer to the architecture below for the evolution of data warehouses. With the increased focus now on data science and machine learning, the Lakehouse platform is the future.


Reference: This article draws on the CIDR paper. For more details, please refer to the following link.

http://cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf?utm_source=bambu&utm_medium=social&utm_campaign=advocacy&blaid=1066676
