Cloud Data Lake Best Practices: Data Lake vs Data Warehouse
Why an open, flexible, and agile data lake architecture makes the difference between success and swamp

Before we jump into best practices around lake formation, architecture, analytics, and other aspects of data lakes, we need to establish precisely “what is a data lake?” As we have detailed in a prior post, there are numerous misconceptions and myths about data lakes. To set a baseline, here is how James Dixon, the Pentaho co-founder and CTO who coined the term, frames it:

This situation is similar to the way that old school business intelligence and analytic applications were built. End users listed out the questions they want to ask of the data, the attributes necessary to answer those questions were skimmed from the data stream, and bulk loaded into a data mart. This method works fine until you have a new question to ask. The Data Lake approach solves this problem. You store all of the dat