In this session, we will discuss the need for data lakes and what it takes to build and scale a data lake holding hundreds of petabytes of data at Uber. We will look at the challenges of managing data at this scale and discuss solutions to them. Finally, we will cover some techniques for optimizing your data lake to reduce costs.
- What the requirements for a data lake are
- How to decide which data lake technologies to use
- What pitfalls come with massive-scale data lakes
- What operational challenges arise when managing large-scale data lakes