Data Engineering

Introduction to Data Lake

Senthil Nayagan

Writing in progress: If you have any suggestions for improving the content or notice any inaccuracies, please email me at [email protected]. Thanks!

Data lake trade-offs

Moving from traditional data storage and processing platforms, such as databases and data warehouses, to a data lake involves trade-offs. By migrating to a data lake, we have sacrificed the following capabilities in favor of others:

  • We have given up durability and consistency guarantees, such as ACID transactions, in return for the ability to process data on a highly scalable platform.
  • We have traded performance features such as indexing and caching for the capacity to handle data in multiple formats.
  • We have given up features like versioning and auditing in exchange for the ability to decouple storage from compute.
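
To make the first trade-off concrete, here is a minimal sketch of what losing transactional guarantees can look like when a table is just a directory of files. It assumes a local temporary directory standing in for lake storage, and the names `write_batch`, `read_table`, `demo`, and the `orders` directory are hypothetical, chosen only for illustration. A writer that crashes partway through a batch leaves some files behind, and a reader has no commit marker to tell it the batch is incomplete.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_batch(table_dir: Path, parts: list, fail_after: Optional[int] = None) -> None:
    """Write each part of a batch as its own JSON-lines file.

    fail_after simulates a writer crash once that many files have been written.
    Nothing here groups the files into a single all-or-nothing unit.
    """
    table_dir.mkdir(parents=True, exist_ok=True)
    for i, rows in enumerate(parts):
        if fail_after is not None and i >= fail_after:
            raise RuntimeError("simulated writer crash")
        path = table_dir / f"part-{i:05d}.jsonl"
        path.write_text("\n".join(json.dumps(r) for r in rows) + "\n")


def read_table(table_dir: Path) -> list:
    """Read every file currently present; there is no notion of a committed batch."""
    rows = []
    for path in sorted(table_dir.glob("part-*.jsonl")):
        for line in path.read_text().splitlines():
            if line:
                rows.append(json.loads(line))
    return rows


def demo() -> int:
    """Return how many rows a reader sees after a half-finished batch of 4 rows."""
    parts = [[{"id": 1}, {"id": 2}], [{"id": 3}, {"id": 4}]]
    with tempfile.TemporaryDirectory() as tmp:
        table_dir = Path(tmp) / "orders"  # hypothetical table location
        try:
            write_batch(table_dir, parts, fail_after=1)
        except RuntimeError:
            pass  # the writer is gone; its partial output stays on disk
        return len(read_table(table_dir))


if __name__ == "__main__":
    print(demo())
```

In a database, the same failed batch would typically be rolled back so that readers see either all four rows or none. Here the reader sees only the rows from the file that happened to be written before the crash.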
