Cluster Analysis Steps in Business Analytics with R

Last updated on Sep 22, 2023

Cluster analysis is a fundamental modelling technique for grouping similar observations together. The steps involved are the same for all clustering techniques.

Here are the steps for Cluster Analysis:

1. Choose the Right Variables – Identify which attributes are relevant and how much each one is worth including. Select the variables that you believe are important for identifying and understanding differences among groups of observations within the data.

2. Scale the Data – Data samples from different sources may be recorded on different scales. For example, personal data might include age ranging from 0 to 100, weight between 40 and 180, and height between 1 and 6 feet. When the variables in the analysis vary in range, the variable with the largest range will have the greatest impact on the results, so the variables should first be brought to a common scale.

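3. Calculate Distances – Clusters are formed by measuring how far apart the observations are. The distances should be computed on scaled data, because if the variables in the analysis vary in range, the variable with the largest range will have the greatest impact on the results.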

A point to note is that each attribute may be measured on a different scale. Before the attributes are combined in a single equation, such as a distance formula, they must be normalized, that is, brought to a common scale. For example, suppose we analyze weather data sampled from India and the US. The scales differ because one country uses the metric system and the other uses the US customary system, so our first objective is to bring both to the same standard. This matters because the basic operation in cluster analysis is calculating distances.
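The effect of scaling on distances can be illustrated with a small sketch. It is written in Python using only the standard library; in R, `scale()` and `dist()` cover the same ground. The four sample rows, the weight units and the helper name `zscore_columns` are hypothetical, and z-score standardization is only one of several possible normalizations.

```python
import math
import statistics

# Hypothetical sample: (age in years, weight in arbitrary units, height in feet).
people = [
    (25, 60.0, 5.5),
    (30, 150.0, 5.6),
    (60, 62.0, 5.4),
    (62, 155.0, 5.7),
]

def zscore_columns(rows):
    """Rescale every column to mean 0 and sample standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]

scaled = zscore_columns(people)

# Person 0 vs person 2: very different age, almost the same weight.
# Person 0 vs person 1: almost the same age, very different weight.
raw_age_gap = math.dist(people[0], people[2])
raw_weight_gap = math.dist(people[0], people[1])
scaled_age_gap = math.dist(scaled[0], scaled[2])
scaled_weight_gap = math.dist(scaled[0], scaled[1])

print("raw:   ", round(raw_age_gap, 2), round(raw_weight_gap, 2))
print("scaled:", round(scaled_age_gap, 2), round(scaled_weight_gap, 2))
```

On the raw data the weight column, which has the largest range, dominates the Euclidean distance. After standardization the two gaps are of similar size, so age and weight contribute on comparable terms.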

Calculation of Distance between Points in a Cluster

Here, the objective is to group similar points together into one cluster. To decide which clusters are close to each other, the distance between two clusters can be measured in several ways:

1) Take the center of each cluster and calculate the distance between the two centers.

2) Take the closest pair of points, one from each cluster, and calculate the distance between them.

3) Take the farthest pair of points, one from each cluster, and calculate the distance between them.

Single linkage – the shortest distance between a point in one cluster and a point in the other cluster. It tends to produce elongated clusters.

Complete linkage – the longest distance between a point in one cluster and a point in the other cluster.

Average linkage – the average distance between each point in one cluster and each point in the other cluster.

Centroid – the distance between the centroids (mean vectors over the variables) of the two clusters.

Ward – combines the clusters whose merger leads to the smallest within-cluster sum of squares, summed over all variables.
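To make the definitions concrete, the following is a minimal Python sketch that computes each measure for two tiny clusters. The points and the function names are hypothetical and chosen only for illustration. For Ward, the sketch reports the increase in the within-cluster sum of squares that merging the two clusters would cause, which is one common way of scoring a candidate merge.

```python
import math
from itertools import product

# Two hypothetical clusters of 2-D points.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

def pairwise_distances(a, b):
    """Euclidean distance for every (point in a, point in b) pair."""
    return [math.dist(p, q) for p, q in product(a, b)]

def centroid(points):
    """Mean vector over the variables."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def within_ss(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(math.dist(p, c) ** 2 for p in points)

def single_linkage(a, b):
    return min(pairwise_distances(a, b))

def complete_linkage(a, b):
    return max(pairwise_distances(a, b))

def average_linkage(a, b):
    distances = pairwise_distances(a, b)
    return sum(distances) / len(distances)

def centroid_linkage(a, b):
    return math.dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """Growth in within-cluster sum of squares if a and b were merged."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)

for name, measure in [
    ("single", single_linkage),
    ("complete", complete_linkage),
    ("average", average_linkage),
    ("centroid", centroid_linkage),
    ("ward increase", ward_increase),
]:
    print(name, measure(cluster_a, cluster_b))
```

Single and complete linkage look only at one pair of points each, which is why they react differently to stretched or outlying clusters, while the average and centroid measures use every point.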

Note: These concepts may be applied to multiple techniques, and each technique offers several options to choose from. In cluster analysis, this approach is called hierarchical cluster analysis, and it can be run with any of the linkage methods above. Each method has its own advantages, disadvantages and properties.
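As a rough illustration of hierarchical clustering with a selectable method, here is a naive agglomerative sketch in Python. It starts with every point as its own cluster and repeatedly merges the closest pair. The function names `linkage_distance` and `agglomerate`, the `n_clusters` stopping rule and the sample points are assumptions made for this example. In R, `hclust()` provides this kind of analysis with a `method` argument.

```python
import math
from itertools import combinations, product

def linkage_distance(a, b, method):
    """Distance between clusters a and b under the chosen linkage."""
    distances = [math.dist(p, q) for p, q in product(a, b)]
    if method == "single":
        return min(distances)
    if method == "complete":
        return max(distances)
    if method == "average":
        return sum(distances) / len(distances)
    raise ValueError(f"unknown linkage method: {method}")

def agglomerate(points, n_clusters, method="single"):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > n_clusters:
        # combinations() yields index pairs with i < j.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage_distance(
                clusters[pair[0]], clusters[pair[1]], method
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters

# Hypothetical data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

for method in ("single", "complete", "average"):
    print(method, agglomerate(points, n_clusters=2, method=method))
```

On well-separated data such as this, all three methods recover the same two groups. The methods start to disagree when clusters are elongated, overlapping or of very different sizes, which is where their individual properties matter.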
