Databricks' data lake solution paves the way for BI and AI
Databricks has launched SQL Analytics, which enables cloud data warehousing on data lakes and offers improved price and performance over traditional cloud data warehouses.
The new solution allows data analysts to run workloads on a data lake that previously required a data warehouse, the company states.
This expands the traditional scope of the data lake from data science and machine learning to cover all data workloads, including business intelligence (BI) and SQL.
Now, organisations can allow data teams across data engineering, data science, and data analytics to work on a single source of truth for data, the company states.
Key benefits of lakehouse architecture
According to Databricks, SQL Analytics realises the company's vision for a lakehouse architecture that combines data warehousing performance with data lake economics, resulting in up to 9x better price/performance than traditional cloud data warehouses.
Furthermore, Databricks states that a lakehouse architecture simplifies data and AI for organisations.
In the past, data teams had to maintain proprietary data warehouses for BI workloads and data lakes for data science and machine learning workloads, because no single data platform could meet the performance needs of BI and the flexibility needs of data science.
Expensive and complicated to maintain, this coexistence of legacy architectures has created data silos that slow innovation and stifle data team productivity, Databricks states.
A lakehouse addresses this by running all workloads through a single architecture.
The specifics of SQL Analytics
SQL Analytics is built on Delta Lake, an open-format data engine that adds reliability, quality, and security to a customer's existing data lake.
Customers can avoid both storing multiple copies of data and locking data up in proprietary formats.
To deliver BI performance on a data lake, SQL Analytics makes use of two innovations.
First, it provides auto-scaling endpoints that keep query latency consistently low under high user load.
Second, it uses Delta Engine, Databricks' polymorphic query execution engine, to complete queries against both large and small data sets.
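To illustrate the general idea behind an auto-scaling endpoint, the following is a minimal Python sketch. It is not Databricks' implementation. The function name `clusters_needed`, the per-cluster capacity of 10 queries, and the minimum and maximum cluster counts are all assumptions chosen for illustration.

```python
import math


def clusters_needed(concurrent_queries: int,
                    queries_per_cluster: int = 10,
                    min_clusters: int = 1,
                    max_clusters: int = 8) -> int:
    """Return how many clusters a hypothetical endpoint would run.

    All parameters and defaults are illustrative assumptions, not
    values taken from SQL Analytics.
    """
    if queries_per_cluster <= 0:
        raise ValueError("queries_per_cluster must be positive")
    # Enough clusters to serve the current load at the assumed capacity...
    wanted = math.ceil(concurrent_queries / queries_per_cluster)
    # ...clamped to the configured lower and upper bounds.
    return max(min_clusters, min(max_clusters, wanted))


if __name__ == "__main__":
    for load in (0, 25, 500):
        print(load, "concurrent queries ->", clusters_needed(load), "cluster(s)")
```

In this sketch, capacity grows with the number of concurrent queries so that each cluster stays within its assumed limit, which is one simple way to keep latency steady as user load rises. The upper bound caps cost.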
With native connectors for all major BI tools, including Tableau and Microsoft Power BI, customers can integrate SQL Analytics into their existing BI workflows to conduct analytics on fresher, more complete data.
SQL Analytics also provides a SQL-native query and visualisation interface to allow analysts, data scientists, and developers without access to traditional BI tools to build dashboards and reports that can be shared within their organisation.
A statement from the CEO
Databricks CEO and co-founder Ali Ghodsi says, "It is no longer a matter of if organisations will move their data to the cloud, but when.
"A lakehouse architecture built on a data lake is the ideal data architecture for data-driven organisations and this launch gives our customers a far superior option when it comes to their data strategy.
"We've worked with thousands of customers to understand where they want to take their data strategy, and the answer is overwhelmingly in favor of data lakes.
"The fact is that they have massive amounts of data in their data lakes and with SQL Analytics, they now can actually query that data by connecting directly to their BI tools like Tableau."