As a fresher, how should I begin learning to become a data scientist?
Aug 7, 2018 734 views

## 2 answers to this question.

Data Science is a vast domain. It requires a mix of skills at the intersection of mathematics, statistics, computer science, communication, and business. Here’s a little cheat sheet on who the modern Data Scientist really is:

MATH & STATISTICS

1. Machine Learning
2. Statistical Modelling
3. Experimental Design
4. Bayesian Inference
5. Optimization

PROGRAMMING & DATABASE

1. Computer Science fundamentals
2. Scripting Language: Python
3. Statistical Computing Package: R
4. NoSQL and SQL Databases
5. Algebra

DOMAIN KNOWLEDGE AND SOFT SKILLS

3. Influence without authority
4. Hacker Mindset
5. Problem Solving Techniques

COMMUNICATION AND VISUALIZATION

1. Able to engage with senior management
2. Storytelling skills
3. Translate data-driven insights into decisions

Your first steps towards becoming a top performer
Your first step towards becoming a top-performing data scientist is mastering the foundations:

• data visualization
• data manipulation
• exploratory data analysis

Have you mastered these? Have you memorized the syntax to accomplish these? Are you “fluent” in the foundations?

If not, you need to go back and practice. Believe me. You’ll thank me later. (You’re welcome.)

The reason is that these skills are used in almost every part of the data science workflow, particularly in the earlier parts of your career.

Given almost any data task, you’ll almost certainly need to clean your data, visualize it, and do some exploratory data analysis.
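To make that concrete, here is a minimal sketch of the clean, explore, visualize loop in plain Python. The dataset, column names, and helper functions (`load_rows`, `clean`, `summarize`, `text_bar_chart`) are all made up for illustration. In real work you would more likely reach for a library such as pandas in Python or the tidyverse in R; the standard library is used here only to keep the example self-contained.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical toy data, invented purely for this illustration.
RAW_CSV = """name,age,city
Asha,29,Pune
Ben,,Delhi
Chen,41,Pune
Dev,35,Mumbai
Ben,,Delhi
Esha,abc,Pune
"""


def load_rows(text):
    """Parse CSV text into a list of dicts (one dict per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Data manipulation: drop exact duplicates and rows with a bad age."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row.items())
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        age = (row["age"] or "").strip()
        if not age.isdigit():
            continue  # missing or non-numeric age
        cleaned.append({"name": row["name"], "age": int(age), "city": row["city"]})
    return cleaned


def summarize(rows):
    """Exploratory analysis: basic summary statistics and group counts."""
    ages = [r["age"] for r in rows]
    return {
        "count": len(ages),
        "mean_age": statistics.mean(ages),
        "median_age": statistics.median(ages),
        "by_city": Counter(r["city"] for r in rows),
    }


def text_bar_chart(counts):
    """Visualization: a crude text bar chart, one '#' per observation."""
    return "\n".join(f"{label:<8}{'#' * n}" for label, n in sorted(counts.items()))


rows = clean(load_rows(RAW_CSV))
summary = summarize(rows)
print(summary)
print(text_bar_chart(summary["by_city"]))
```

Notice that half of the raw rows get dropped before any analysis happens. That is typical, and it is why cleaning comes first.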

Moreover, they are also important as you move into more advanced topics. Do you want to start doing machine learning, artificial intelligence, and deep learning? You had better know how to clean and explore a dataset. If you can’t, you’ll basically be lost.

