Data volumes are growing exponentially, and so is cloud storage. Price, performance, scalability, and ecosystem integration are some of the reasons why more organizations choose cloud data warehouse (DWH) solutions for their data.
“Apart from that, the solution should be cost-effective and provide you with options so you can tailor the experience you’re giving to your users and applications with different pricing and compute tiers,” says Luka Lovosevic, Solution Architect at Firebolt. Firebolt is a cloud data warehouse purpose-built to provide sub-second analytics performance on massive, terabyte-scale data sets, and it is one of the exhibitors at this year’s Data Innovation Summit.
In the interview, he shared more about the advantages and disadvantages of cloud data warehouses compared with on-premises data warehouses, the development of modern cloud data warehousing, and the importance of speed and performance at scale for the users of cloud data warehouses.
Hyperight: It’s an honour to have Firebolt as one of the companies exhibiting at this year’s Data Innovation Summit. Please tell us more about yourself, your professional background and your current focus at the company.
Luka Lovosevic: Thanks for having us, we’re really excited to be at the Data Innovation Summit this year. I am a solution architect at Firebolt, working with prospects and customers and supporting them in the journey from evaluating Firebolt to moving their workloads into production with us. My background is at the intersection of application development and big data analytics, having spent more than 10 years at Microsoft in various customer facing roles. I’m based out of Croatia and cover the EMEA region.
Hyperight: Firebolt provides cloud data warehouse solutions for data engineers, development teams and organizations with large amounts of data. Cloud data warehouses have gained a lot of popularity in the last few years. Tell us a bit about the company, the product you are providing on the market and the unique selling points of your product?
Luka Lovosevic: The data space overall has been growing quite fast in recent years. Cloud computing has unlocked new scenarios where the scale of data growth and the performance needs of end users can be addressed in an efficient way. Firebolt was founded with the mission of enabling businesses to deliver next-gen analytics experiences, by making it easy to deliver sub-second analytics at huge scales.
Hyperight: Firebolt is, as you call it, “the third generation of cloud warehousing”. How do you compare to other cloud data warehouse vendors?
Luka Lovosevic: When you look at the industry, we really are standing on the shoulders of giants. There have been numerous advancements in this space, ever since Redshift was announced as the first data warehouse as a service offering. Since then, we have seen BigQuery with their serverless approach, Snowflake with decoupled storage and compute, Databricks with massive batch workloads, and so on. Still, we see room for improvement.
Our offering is combining a lot of these features and architectures, and we bring in our own special sauce – like efficient data storage, lots of indexing options, multiple compute or engine sizes, and an advanced SQL query optimiser. Our focus today is to give you the power to tune the system and make it really really fast, and this concept is very close to a data engineering mindset. Firebolt is focused on building a data warehouse that helps modern data engineering and dev teams deliver customer-facing data apps, and go beyond traditional internal BI.
Hyperight: You already said that the emphasis of the product is on speed and performance at scale. How is this achieved, and how important is it for the users?
Luka Lovosevic: Speed has always been important for users of analytics, but it’s getting even more important today. Unfortunately, we’ve gotten too used to slow dashboards and wait times of dozens of seconds, but nowadays users expect a consumer-grade experience from analytics: experiences that are just as fast as the other services we consume. This is why Firebolt is focused on helping deliver sub-second analytics at large data volumes.
Firebolt is built on a combination of architectural building blocks that are different from other data warehouses. Firebolt uses a storage format called TripleF, that is tightly coupled with a type of index called sparse indexing. Every table that is ingested into Firebolt is sparsely indexed, which results in highly efficient data pruning and very little data scanning compared to other technologies. There are additional types of indexes users can then add to accelerate performance even further. And the most fun part is that this is all delivered over a natively decoupled storage/compute architecture, so you can scale up or down with a click.
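To make the idea of sparse indexing and data pruning more concrete, here is a minimal Python sketch of the general technique. It is not Firebolt's implementation and says nothing about the internals of the TripleF format; the names `SparseIndex`, `build_sparse_index` and `blocks_to_scan`, as well as the block size, are assumptions made for illustration only. The sketch keeps one index entry per block of sorted rows rather than one per row, and uses it to decide which blocks a range query has to read.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class SparseIndex:
    """Hypothetical sparse index: one entry per block, not one per row."""
    block_size: int
    first_keys: list  # smallest key stored in each block
    last_keys: list   # largest key stored in each block


def build_sparse_index(sorted_keys, block_size):
    """Index a sorted key column by recording each block's key range."""
    starts = range(0, len(sorted_keys), block_size)
    first_keys = [sorted_keys[i] for i in starts]
    last_keys = [sorted_keys[min(i + block_size, len(sorted_keys)) - 1]
                 for i in starts]
    return SparseIndex(block_size, first_keys, last_keys)


def blocks_to_scan(index, low, high):
    """Return the block numbers that may hold keys in [low, high]."""
    # Skip every block whose largest key is still below the range.
    start = bisect_left(index.last_keys, low)
    # Stop before the first block whose smallest key is above the range.
    end = bisect_right(index.first_keys, high)
    return list(range(start, end))


# 1,000 sorted keys split into 10 blocks of 100 rows each.
keys = list(range(1000))
index = build_sparse_index(keys, block_size=100)
print(blocks_to_scan(index, 250, 260))  # [2]
```

In this toy example, a query for keys 250 to 260 reads one block out of ten, while the index itself holds only two values per block. That is the sense in which a sparse index stays small while still allowing most of the data to be skipped.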
Performance has been core to our offering from the start, and we’re constantly working on pushing the boundaries of low-latency analytics. This is why there are many other exciting things going on behind the scenes, from an amazing query optimizer, to vectorized processing, and more.
Hyperight: Many organizations have already successfully migrated their data to the cloud, but a significant number have not. What are the main benefits of modern cloud data warehouses compared to on-premises solutions, and why are these two probably incomparable?
Luka Lovosevic: The truth is that even though we’re all excited about the benefits of the cloud, there’s still a huge on-prem footprint in the market. But it is shrinking rapidly. The benefits of the cloud in general and cloud data warehouses, in particular, are widely accepted. Thanks to the advancements in recent years, data warehouses are now dramatically easier to implement. Delivered as SaaS, users don’t need to worry about hardware. If in the past you needed an experienced DWH architect to implement a DWH over months, today a college graduate can experiment with multiple cloud DWHs within a few days.
Scaling has been dramatically simplified. Whether through serverless architectures like you see in BigQuery or Athena, or through the decoupled storage and compute architectures you see in Snowflake and Firebolt, scaling has become something the platform takes care of, bringing true elasticity to users without the need to understand distributed processing or worry about storage management.
Another benefit is that modern cloud data warehouses can handle a broader variety of workloads compared to traditional warehouses. You see that especially in ELT becoming popular at the expense of ETL, with the data warehouses taking care of transforming data as well.
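The ELT pattern mentioned above can be sketched in a few lines of Python. This is an illustration only: SQLite's in-memory database stands in for a cloud data warehouse, and the table names `raw_events` and `daily_revenue` and the sample rows are made up for the example. The point is the order of operations: raw data is loaded as-is, and the transformation then runs as SQL inside the warehouse rather than in a separate tool beforehand.

```python
import sqlite3

# Hypothetical raw records, loaded untransformed (amounts are still text).
raw_events = [
    ("2023-01-01", "eu", "12.50"),
    ("2023-01-01", "us", "7.00"),
    ("2023-01-02", "eu", "3.25"),
]

# SQLite stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")

# Extract + Load: land the data in a raw table without reshaping it.
conn.execute("CREATE TABLE raw_events (day TEXT, region TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", raw_events)

# Transform: the warehouse itself casts and aggregates the raw data.
conn.execute(
    """
    CREATE TABLE daily_revenue AS
    SELECT day, SUM(CAST(amount AS REAL)) AS revenue
    FROM raw_events
    GROUP BY day
    """
)

daily_revenue = conn.execute(
    "SELECT day, revenue FROM daily_revenue ORDER BY day"
).fetchall()
print(daily_revenue)
```

In an ETL pipeline, the cast and the aggregation would happen before the load, and the raw rows would typically never reach the warehouse.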
And finally, consumption-based pricing models make it easier to reduce barriers to entry and reduce financial risk.
Hyperight: If an organization decides to migrate their data warehouse from on-premises to the cloud or from another cloud warehouse provider to you, what does the process look like? What should the organization pay attention to while making this change?
Luka Lovosevic: There are lots of considerations to think about while choosing a cloud data warehouse solution, and the top ones are usually around price, performance, scalability and integration or ecosystem support. Data today is growing rapidly, so supporting high volumes of data is a must for a modern cloud data warehouse solution. Apart from that, the solution should be cost effective and provide you with options so you can tailor the experience you’re giving to your users and applications with different pricing and compute tiers.
Performance is a must if you are doing anything around data applications, embedded analytics or supporting high concurrency, and then finally, integration and tooling support is critical so that you can easily feed data into your warehouse, and also consume it once it’s there – this can be anything from ELT tooling, reverse ETL, SDKs or BI tools support.
Hyperight: What would your final advice and recommendation be to organizations that have just migrated to a cloud data warehouse or are thinking of doing so?
Luka Lovosevic: My main advice would be – focus on the business value and find a technology vendor that can be your partner throughout the path of realizing that business value. At Firebolt, we pride ourselves not only on our cloud data warehousing technology, but also on how we work with our customers and how we help them in this journey – from understanding their use case, to bringing massive amounts of data into Firebolt and finally, getting the insights and the value out of our platform. A big mindset switch in the cloud should be not to look for a large monolithic solution for everything. The cloud makes it easy to combine multiple technologies around the data lake, each optimized for a different workload or use case. With SaaS and consumption-based pricing, it’s extremely fast and easy to experiment and try new technologies, without a lot of financial risk. Data really is the new oil, so my final advice is – don’t wait too long, check out Firebolt today.
Featured image credits: Joshua Sortino at Unsplash