Data has been a buzzword for years now. Whether it is generated by large-scale enterprises or by an individual, data has to be analyzed before you can benefit from it. But how do we do it? Well, that’s where the term ‘Data Analytics’ comes in. In this blog on ‘What is Data Analytics?’, you will get an insight into this term, along with a hands-on example.
Let’s look into the topics that will be discussed in this article:
As an enormous amount of data gets generated, extracting useful insights from it becomes a must for a business enterprise. Data Analytics has a key role in improving your business. Here are 4 main factors which signify the need for Data Analytics:
Now that you know why Data Analytics is needed, let me quickly explain what Data Analytics is.
As the term suggests, Data Analytics refers to the techniques used to analyze data in order to enhance productivity and business gain. Data is extracted from various sources, then cleaned and categorized so that different behavioral patterns can be analyzed. The techniques and tools used vary according to the organization or individual.
So, in short, if you understand business administration and can perform Exploratory Data Analysis to gather the required information, then you are good to go with a career in Data Analytics.
So, now that you know what Data Analytics is, let me quickly cover the top tools used in this field.
With the increasing demand for Data Analytics in the market, many tools have emerged with various functionalities for this purpose. Whether open-source or user-friendly, the top tools in the data analytics market are as follows.
Now that you know all this about Data Analysis, let me tell you what you can become by gaining knowledge of this field.
Well, you can become a well-renowned Data Analyst. Now, if you ask me who a Data Analyst is, my answer would be that a Data Analyst is a professional who analyzes data by applying various tools and techniques and gathers the required insights.
So, let’s now discuss how you can become a Data Analyst.
Data Analysts translate numbers into plain English. They deliver value to their companies by taking information about specific topics and then interpreting, analyzing, and presenting the findings in comprehensive reports. So, if you can collect data from various sources, analyze the data, gather hidden insights, and generate reports, then you can become a Data Analyst. Refer to the image below:
Fig 1: Process of Data Analysis – What is Data Analytics
Apart from the above-mentioned capabilities, a Data Analyst should also possess skills such as Statistics, Data Cleaning, Exploratory Data Analysis, and Data Visualization. Also, if you have knowledge of Machine Learning, that will make you stand out from the crowd.
On average, a Data Analyst can expect a salary of ₹404,660 (IND) or $83,878 (US). As experts, Data Analysts are often called on to use their skills and tools to provide competitive analysis and identify trends within industries. If you wish to know more about the salary trends of a Data Analyst, you can read our full article on Data Analyst Salary.
So, now that you know a fair bit about Data Analytics, let me show you a hands-on example in R, where we will analyze a data set and gather some insights.
For this hands-on, we will be using sample census data. The dataset has columns such as House_number, Husband_Age, Wife_Age, Husband_Income, Wife_Income, Number_Of_Bedrooms, Electricity_Units, Gas, Number_Of_Children, Internet_Connection, Mode, House_Owned/Rented, Speaking_Language, and Decade_Of_House_Built. Refer to the snapshot below.
To find out the following:
The steps we are going to follow are as follows:
Step 1: To import the data set, you can use the read.csv command and mention the path of the CSV file to be read. After that, assign the result to a variable for future use. Here I have assigned it to sampledata.
#Importing Dataset
sampledata <- read.csv("C:/Users/Sahiti/Desktop/Census_Data.csv")
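If you do not have the CSV file at hand, the import step can still be tried on a few made-up rows. The sketch below is only an illustration: the three rows are invented (they are not the census data), the name toydata is hypothetical, and the text argument of read.csv stands in for the file path.

```r
# Hypothetical stand-in for Census_Data.csv: three invented rows, not real census data
csv_text <- "House_number,Husband_Age,Wife_Age,Number_Of_Children
1,45,42,2
2,38,NA,1
3,51,49,3"

# read.csv accepts the CSV content through `text` instead of a file path
toydata <- read.csv(text = csv_text)

# Inspect the structure: column names, column types, and the first values
str(toydata)
```

The literal NA in the second row is read as a missing value, which is the kind of entry handled in Step 3.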
Step 2: Now, you can view your data set by using the View command with the variable’s name, that is, View(sampledata).
Step 3: Now, if you look through your data set, you will find NA values. So, I am going to replace the NA values with 0. To do that, you can use the is.na function. This function finds all the NA values, and the assignment then replaces them with 0 in the same variable. Refer below.
#Replacing NA Values with 0
sampledata[is.na(sampledata)] <- 0
Now, if you view the dataset again, you will find that all the NA values have been replaced with 0. You can use View(sampledata) to do so.
Step 4: Now that all the NA values have been replaced, use the summary function to get a summary of the dataset.
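One thing to keep in mind is that a 0 is treated as a real value in later calculations, so replacing NA with 0 can shift statistics such as the mean. As a hedged alternative, and not the approach used in this article, you could keep the NA values and skip them with na.rm = TRUE. The small vector below is invented purely for illustration.

```r
# Invented ages with one missing value (hypothetical, not from the census file)
ages <- c(40, 50, NA, 60)

# Approach used in this article: replace NA with 0
ages_zero <- ages
ages_zero[is.na(ages_zero)] <- 0
mean_with_zero <- mean(ages_zero)            # the 0 is counted as a real age

# Alternative: keep NA and ignore it during the calculation
mean_without_na <- mean(ages, na.rm = TRUE)

print(mean_with_zero)    # 37.5
print(mean_without_na)   # 50
```

Which option suits you depends on what a missing entry means in your data, so treat this as a sketch rather than a rule.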
#summary of dataset
summary(sampledata)
Step 5: After this, to calculate the minimum, maximum and the average value of Wife_Age, you can use the inbuilt functions (min, max, mean) as follows.
#To calculate the min, max, and mean of Wife_Age
min(sampledata$Wife_Age)
max(sampledata$Wife_Age)
mean(sampledata$Wife_Age)
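If you want the same three statistics for more than one column, you do not have to repeat each call. A minimal sketch, assuming a small invented data frame (the name toydata and its values are hypothetical), applies min, max, and mean to every column at once with sapply.

```r
# Invented ages for illustration only (hypothetical values)
toydata <- data.frame(
  Husband_Age = c(45, 38, 51, 29),
  Wife_Age    = c(42, 35, 49, 30)
)

# Apply min, max, and mean to every column; the result is a matrix
# with one row per statistic and one column per variable
age_stats <- sapply(toydata, function(x) c(min = min(x), max = max(x), mean = mean(x)))

print(age_stats)
```

A single value can then be picked by name, for example age_stats["mean", "Wife_Age"].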
Step 6: Now, to calculate the median, quantiles, variance, and standard deviation of Husband_Income, you can use the inbuilt functions (median, quantile, var, sd) as follows.
#To calculate the median, quantile, variance, standard deviation of Husband_Income
median(sampledata$Husband_Income)
quantile(sampledata$Husband_Income)
var(sampledata$Husband_Income)
sd(sampledata$Husband_Income)
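To see how these functions relate to each other, here is a small sketch on invented incomes (hypothetical values, not the census data). By default, quantile returns the 0%, 25%, 50%, 75%, and 100% points, the 50% point equals the median, and the standard deviation is the square root of the variance.

```r
# Invented incomes for illustration only (hypothetical values)
income <- c(20000, 30000, 40000, 50000, 60000)

q <- quantile(income)       # named vector: 0%, 25%, 50%, 75%, 100%
iqr_value <- IQR(income)    # distance between the 75% and 25% points

# TRUE when sd() matches the square root of var()
sd_matches_var <- isTRUE(all.equal(sd(income), sqrt(var(income))))

print(q)
print(iqr_value)
print(sd_matches_var)
```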
Step 7: Now, to plot the number of children and number of bedrooms, you can use the histogram and bar plot.
#To plot the number of children
hist(sampledata$Number_Of_Children)
As you can clearly see in the above plot, households with 0 to 2 children are the most frequent.
Now, to plot the number of bedrooms, I have built a frequency table of Number_Of_Bedrooms with the table function and assigned it to the variable countofbedrooms.
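Because the number of children only takes whole values, the default bins chosen by hist can group several values together. A minimal sketch, assuming invented values, puts one bin on each whole number through the breaks argument and reads the counts with plot = FALSE instead of drawing the chart.

```r
# Invented counts of children for illustration only (hypothetical values)
children <- c(0, 1, 1, 2, 2, 2, 3, 4)

# One bin per whole number: bin edges at -0.5, 0.5, ..., 4.5
h <- hist(children, breaks = seq(-0.5, 4.5, by = 1), plot = FALSE)

# Counts for 0, 1, 2, 3, and 4 children, in that order
print(h$counts)
```

Dropping plot = FALSE draws the same histogram with these bins.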
#To plot the number of bedrooms.
countofbedrooms <- table(sampledata$Number_Of_Bedrooms)
barplot(countofbedrooms, main="Bedrooms Distribution", xlab="Number of Bedrooms")
As you can see in the above plot, houses with 3 bedrooms are the most common.
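The most common value can also be read directly from the frequency table rather than from the chart. The sketch below uses an invented vector of bedroom counts (hypothetical values) and picks the category with the highest count using which.max.

```r
# Invented bedroom counts for illustration only (hypothetical values)
bedrooms <- c(2, 3, 3, 4, 3, 2, 5)

# Frequency table: how many houses have each number of bedrooms
countofbedrooms <- table(bedrooms)

# Name of the category with the highest count
most_common <- names(countofbedrooms)[which.max(countofbedrooms)]

print(countofbedrooms)
print(most_common)   # "3"
```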
So, folks, with this, we come to the end of this ‘What is Data Analytics?’ article. If you are someone looking to get into an interesting career, now would be the right time to up-skill and take advantage of the Data Analysis career opportunities that come your way.
Edureka has a specially curated Data Analytics Master Program that will make you proficient in tools and systems used by Data Analytics Professionals. It includes in-depth training on Statistics, Data Analytics with R, SAS, and Tableau. The curriculum has been determined by extensive research on 5000+ job descriptions across the globe.
Got a question for us? Please mention it in the comments section of the “What is Data Analytics?” article and we will get back to you.