Understanding Databricks SQL
Databricks SQL is an integrated query execution tool for running ad hoc queries and building dashboards on data stored in the Data Lake.
Databricks SQL contains two main components:
- SQL endpoint
- Query
They are explained as follows:
- SQL endpoint: A SQL endpoint is the compute service used to execute SQL queries. It provides the computation that runs SQL commands on data objects within the Databricks environment. A classic SQL endpoint (the default) uses computational resources in the AWS Cloud. We can also create Databricks-managed SQL endpoints, called Serverless SQL endpoints, which use compute resources in the Databricks cloud account. Using serverless endpoints simplifies SQL endpoint management and accelerates launch times.
- Query: This component is part of the Databricks SQL interface. Think of it as similar to creating a notebook: when you create a new notebook, you are prompted to attach it to a cluster. Similarly, when you select a query, a SQL editor opens and prompts you to select the SQL endpoint (the compute) to attach to the editor, after which you can run ad hoc queries.
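The query flow described above (open an editor, attach a SQL endpoint, run an ad hoc query) can also be sketched in code. The following is a minimal Python sketch, not the Databricks SQL editor itself. It assumes a DB-API 2.0 style connection, which is the interface the Databricks SQL Connector for Python exposes through `databricks.sql.connect`. So that the example runs anywhere, an in-memory `sqlite3` database stands in for the SQL endpoint; the `trips` table, its rows, and the helper name `run_ad_hoc_query` are hypothetical.

```python
import sqlite3
from contextlib import closing


def run_ad_hoc_query(connection, query):
    """Run one query on a DB-API 2.0 connection and return all rows."""
    with closing(connection.cursor()) as cursor:
        cursor.execute(query)
        return cursor.fetchall()


# Stand-in for a SQL endpoint connection so the sketch runs locally.
# Against Databricks, the connection would instead come from something like:
#   from databricks import sql
#   connection = sql.connect(server_hostname=..., http_path=..., access_token=...)
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE trips (city TEXT, fare REAL)")
connection.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Austin", 12.5), ("Austin", 7.5), ("Boston", 20.0)],
)

# The ad hoc query: total fare per city.
rows = run_ad_hoc_query(
    connection,
    "SELECT city, SUM(fare) AS total_fare FROM trips GROUP BY city ORDER BY city",
)
print(rows)
connection.close()
```

Because the helper depends only on the connection's `cursor()`, `execute()`, and `fetchall()` methods, the same function should work unchanged once the stand-in connection is replaced with one that points at a real SQL endpoint.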
The key features are as follows:
- Enables customers to work with the multi-cloud lakehouse architecture, which is cheaper and faster than the conventional data warehouse and/or Data Lake architecture.
- Use SQL queries on the lakehouse with data warehousing capabilities and performance.
- Less complex administration and configuration to set up SQL analytics: the Databricks platform itself determines the instance types and configurations to reduce the cost and improve the performance of SQL queries.
- Databricks SQL has built-in features to easily manage user access, data, and resources, including monitoring, query history, and fine-grained access management.
- Use your preferred BI tool to perform analytics on top of the data available in the Databricks Lakehouse platform.
- Efficient data discovery and access using SQL queries, and quick sharing of new insights through the visualization capabilities.
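To illustrate the point about simpler administration, here is a hedged sketch of what a SQL endpoint definition might look like when prepared as a request body for the SQL endpoints REST API. The field names (`name`, `cluster_size`, `auto_stop_mins`, `enable_serverless_compute`) are assumptions modeled on that API and should be checked against the current documentation. The helper `build_endpoint_spec` and the endpoint name `adhoc-analytics` are hypothetical. The sketch only builds and prints the definition; it does not call any service.

```python
import json


def build_endpoint_spec(name, cluster_size="Small", auto_stop_mins=30, serverless=False):
    """Build a request body for creating a SQL endpoint (assumed field names)."""
    return {
        "name": name,
        # A T-shirt size rather than an instance type; the platform maps it to hardware.
        "cluster_size": cluster_size,
        # Idle minutes before the endpoint stops, to limit cost.
        "auto_stop_mins": auto_stop_mins,
        # True requests a Serverless SQL endpoint instead of a classic one.
        "enable_serverless_compute": serverless,
    }


spec = build_endpoint_spec("adhoc-analytics", serverless=True)
print(json.dumps(spec, indent=2))
```

The definition names only a size and a few policies, with no instance types, which matches the feature described above: the platform chooses the underlying configuration.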
Hope this was helpful.