Why do Power BI users need Dremio? It unlocks your data lake!

This article highlights how Power BI users can benefit from Dremio and shows how easily the two integrate. Dremio is a SQL Lakehouse Platform that lets companies run interactive analytics and high-performing dashboards directly on data lake storage, which eliminates the need for proprietary and expensive data warehouses.

To understand why you need Dremio, think about how you currently access data from the data lake. Are you a data analyst who relies on the imports provided by your data engineers? If so, how many copies of the same data do you currently have? Many Power BI users work with different copies of the same data. In situations like this, how can you make sure everyone shares the same definition of the data? How would you determine which semantic layer to trust and use for analysis and decision-making?

Before we explore those questions in further detail, let’s talk about how data becomes accessible in the first place. Many organizations use data lakes like Amazon S3, Google Cloud Storage, and Azure Data Lake Storage (ADLS) to store their data. This data is then brought into a data warehouse, where complex ETL processes turn its dimensions and measures into queryable components. Even then, dashboarding is not yet possible: data engineers must create further copies of the data that can be used for Power BI imports.

If you are a data analyst using such a Power BI extract, you are now restricted to this subset of data. What if you want to combine it with something that lives outside your BI extract? In many scenarios, this means either asking the data engineers for more data or hoping a similar Power BI extract is available. If the additional data you requested is not in the data warehouse, the data engineers must go back to the bottom of the pyramid and perform complex ETL to get it from the data lake. If you join against a similar Power BI extract instead, you are limited to the scope of that extract’s data definition. How do you make sure that your definition of data matches the data you are looking for?

This is frustrating for different data users:

  • Data analysts using Power BI (or other BI tools) cannot discover/curate and analyze the data in a self-service manner. They spend much of their time creating ETL tickets for data engineers, while also being restricted to individual definitions of data in isolated BI tools.
  • Data engineers must respond to every data request from data analysts and data scientists, and most of their time goes into creating and maintaining data pipelines.
  • Business leaders want to use data to make strategic decisions. With so many definitions of data, it’s difficult to know which definition to trust.

Dremio users can analyze data as it lives in the data lake. You can also create multiple engines to manage your workloads. For example, you can have a large engine for data science workloads and a medium-sized engine for marketing. These engines are physically isolated and do not affect one another’s performance.

All that sounds exciting, but how difficult is the integration with BI tools?

With just a few clicks, you can create a live connection to the data lakehouse and use it for Power BI dashboards.

Dremio and Power BI Integration

The Dremio Driver for ODBC is available here: https://www.dremio.com/drivers/odbc

Import Data in Power BI from Dremio

  1. After installing the Dremio connector, click on “Get Data” in Power BI.

If you do not have the Dremio connector installed, Power BI will display an error message.

As a bonus, the last section of this article talks about how to successfully configure the Dremio connector.

  2. Next, Power BI will ask if you are connecting to Dremio Cloud in the United States or Europe.

In the Server field, enter sql.dremio.cloud for the United States or sql.eu.dremio.cloud for Europe, and select DirectQuery as the data connectivity mode.

  3. You now have the option to enter either your Microsoft credentials or an access token. You can create an access token in your user profile and decide its lifetime.
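The choice between the two server names can be captured in a small lookup. The following Python sketch is only an illustration: the function name `dremio_cloud_server` and the region keys `us` and `eu` are hypothetical, while the two host names are the ones given in the steps above.

```python
# Host names as given in the connection steps; the region keys are illustrative.
DREMIO_CLOUD_SERVERS = {
    "us": "sql.dremio.cloud",
    "eu": "sql.eu.dremio.cloud",
}


def dremio_cloud_server(region: str) -> str:
    """Return the server name to enter in Power BI's Server field."""
    key = region.strip().lower()
    if key not in DREMIO_CLOUD_SERVERS:
        expected = ", ".join(sorted(DREMIO_CLOUD_SERVERS))
        raise ValueError(f"Unknown region {region!r}; expected one of: {expected}")
    return DREMIO_CLOUD_SERVERS[key]


print(dremio_cloud_server("EU"))
```

Failing loudly on an unknown region is a deliberate choice here, since silently defaulting to one region would point the connection at the wrong control plane.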

Voila! You now have a live connection to Dremio and can access your data as it lives in the data lake.

Export Data from Dremio to Power BI

You can now query the data in its logical form and create Virtual Datasets that can be exported for analysis in Power BI.

Configuring Dremio Connector for Power BI Integration

After installing the Dremio Connector:

  1. Open ODBC Data Sources and select “Dremio Connector.” Select Configure.
  2. Use the following configuration in your Dremio Connector DSN Setup:

Username: $token

Password: This will be your access token. You can create access tokens on Dremio in your user profile.

  3. Enter your project_id in the advanced properties and test the connection.
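Once the DSN is configured, the same credentials can in principle be reused from a script. The Python sketch below only assembles a DSN-based ODBC connection string and rests on several assumptions: the DSN is assumed to be named "Dremio Connector", the function name `build_dremio_odbc_string` is hypothetical, and the project_id is assumed to stay in the DSN's advanced properties as configured above. `DSN`, `UID`, and `PWD` are the standard ODBC connection-string keywords.

```python
def build_dremio_odbc_string(token: str, dsn: str = "Dremio Connector") -> str:
    """Build a DSN-based ODBC connection string from a Dremio access token.

    The username is the literal text "$token"; the password is the access
    token created in your Dremio user profile.
    """
    if not token:
        raise ValueError("An access token is required")
    if ";" in token:
        # A semicolon would split the connection string into extra fields.
        raise ValueError("Access token must not contain ';'")
    return f"DSN={dsn};UID=$token;PWD={token}"


# Hypothetical usage with the third-party pyodbc package (not part of this article):
#   import os, pyodbc
#   conn = pyodbc.connect(build_dremio_odbc_string(os.environ["DREMIO_TOKEN"]),
#                         autocommit=True)
```

Reading the token from an environment variable (here the made-up name `DREMIO_TOKEN`) keeps it out of source files.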

Dremio makes data analysts and data scientists more self-sufficient and provides faster analytics while ensuring consistent semantics for decision-making.

Interested in talking to our team or scheduling a personalized demo? Send us an email at info@capitalize.com