Data Lake Definition: Velocity, Agility, and Openness By Design
Defining data lakes in terms of velocity, agility, and openness delivers successful business outcomes

Data lake definitions take many shapes, largely because different vendors promote definitions that align with their own product offerings. With so many competing definitions, confusion is common when people ask, “What is a data lake?” Establishing a baseline definition helps build a common vocabulary for conversations that can otherwise be overly technical, abstract, and vendor-driven.

Define “Data Lake”

Rather than rely on an AWS, Google, or Azure data lake definition, here are a few essentials to set some baselines. Pentaho co-founder and CTO James Dixon framed it this way:

This situation is similar to the way that old school business intelligence and analytic applications were built. End users listed out the questions they want to ask of the data, the attributes necessary to answer those questions were skimmed from the data stream, and bulk loaded into a d