Why Should You Blend When You Can Already Join In Tableau?

Last updated on Nov 29,2022 19K Views
Research Analyst, Tech Enthusiast, Currently working on Azure IoT & Data Science... Research Analyst, Tech Enthusiast, Currently working on Azure IoT & Data Science with previous experience in Data Analytics & Business Intelligence.

Why Should You Blend When You Can Already Join In Tableau?

edureka.co

In a world that generates and consumes 2.5 quintillion bytes of data, a day, organizations are bound to look for new methods to transform and combine data in order to attain optimum efficiency. One such method of combining data is Data Blending in Tableau.

Now, because this serves such an important purpose in the data cycle of any given organization, it makes for a very essential module in most Tableau Training curriculum. In this blog, we shall discuss the following concepts: 

Why do you need Data Blending in Tableau?

Suppose, you are a Tableau Developer who has transactional data stored in Salesforce and quota data stored in Access. The data you want to combine is stored in different databases, and the granularity of the data captured in each table is different in the two data sources, so data blending is the best way to combine this data.

Data blending is useful under the following conditions:

  1. You want to combine data from different databases that are not supported by cross-database joins.

    Cross-database joins do not support connections to cubes (for example, Oracle Essbase) or to some extract-only connections (for example, Google Analytics). In this case, set up individual data sources for the data you want to analyze, and then use data blending to combine the data sources on a single sheet.

  2. Data is at different levels of detail.

    Sometimes one data set captures data using various levels of detail i.e, greater or lesser granularity than the other data set.

    For example, suppose you are analyzing transactional data and quota data. Transactional data might capture all transactions. However, quota data might aggregate transactions at the quarter level. Because the transactional values are captured at different levels of detail in each data set, you should use data blending to combine the data.

What is Data Blending in Tableau?

Data Blending is a very powerful feature in Tableau. It is used when there is related data in multiple data sources, which you want to analyze together in a single view. It is a method for combining data that supplements a table of data from one data source with columns of data from another data source.

Usually, you use joins to perform this kind of data combining, but there are times, depending on factors like the type of data and its granularity, when it’s better to use data blending. 

How is it Different From Data Joining?

Data blending simulates a traditional left join. The main difference between the two is when the join is performed with respect to aggregation.

Left join

When you use a left join to combine data, a query is sent to the database where the join is performed. Using a left join returns all rows from the left table and any rows from the right table that has a corresponding row match in the left table. The results of the join are then sent back to and aggregated by Tableau.

For example, suppose you have the following tables. If the common columns are User ID, a left join takes all the data from the left table, as well as all the data from the right table because each row has a corresponding row match in the left table.

Data Blending

When you use data blending to combine data, a query is sent to the database for each data source that is used on the sheet. The results of the queries, including the aggregated data, are sent back and combined by Tableau. The view uses all rows from the primary data source, the left table, and the aggregated rows from the secondary data source, the right table, based on the dimension of the linking fields.

You can change the linking field or add more linking fields to include different or additional rows of data from the secondary data source in the blend, changing the aggregated values.

For example, suppose you have the following tables. If the linking fields are User ID in both the tables blending your data takes all of the data from the left table, and supplements the left table with the data from the right table. In this case, not all values can be a part of the resulting table because of the following:

Suppose you have the same tables as above, but the secondary data source contains a new field called Fines. Again, if the linking field is User ID, blending your data takes all of the data from the left table, and supplements it with data from the right table. In this case, you see the same null value and asterisks in the previous example in addition to the following:

When to Substitute Joining for Blending

1. Data needs cleaning.

If your tables do not match up with each other correctly after a join, set up data sources for each table, make any necessary customizations (that is, rename columns, change column data types, create groups, use calculations, etc.), and then use data blending to combine the data.

2. Joins cause duplicate data.

Duplicate data after a join is a symptom of data at different levels of detail. If you notice duplicate data, instead of creating a join, use data blending to blend on a common dimension instead.

3. You have lots of data.

Typically joins are recommended for combining data from the same database. Joins are handled by the database, which allows joins to leverage some of the database’s native capabilities. However, if you’re working with large sets of data, joins can put a strain on the database and significantly affect performance. In this case, data blending might help. Because Tableau handles combining the data after the data is aggregated, there are fewer data to combine. When there are fewer data to combine, generally, performance improves.

Blending your Data in Tableau 

You can use data blending when you have data in separate data sources that you want to analyze together on a single sheet. Tableau has two inbuilt data sources named Sample-superstore and Sample coffee chain.mdb which will be used to illustrate data blending.

Step 1: Connect to your data and set up the data sources

Step 2: Designate a primary data source

Step 3: Designate a secondary data source

Step 4: Blend Data

Limitations of Data Blending in Tableau

  1. There are some data blending limitations around non-additive aggregates, such as MEDIAN, and RAWSQLAGG.
  2. Data Blending compromises the speed of Query in high Granularity.
  3. When you try to sort by a calculated field that uses blended data, the calculated field is not listed in the Field drop-down list of the Sort dialog box.
  4. Cube data sources can only be used as the primary data source for blending data in Tableau. They cannot be used as secondary data sources.

I hope you all, now, have a fair idea about Data Blending in Tableau from this blog. Hungry for more knowledge? Don’t worry, this video will give you a better understanding of the concept.

 

Upcoming Batches For Tableau Certification Training Course
Course NameDateDetails
Tableau Certification Training Course

Class Starts on 4th May,2024

4th May

SAT&SUN (Weekend Batch)
View Details
Tableau Certification Training Course

Class Starts on 18th May,2024

18th May

SAT&SUN (Weekend Batch)
View Details
BROWSE COURSES
REGISTER FOR FREE WEBINAR ChatGPT for SEO in 60 Minutes