When should data binning be used in data processing?

0 votes

In data pre-processing, data binning is a technique for converting the continuous values of a feature into categorical ones. For example, the values of an age feature in a dataset are sometimes replaced with one of several intervals, such as:

[10,20),
[20,30),
[30,40].

When is the best time to use data binning? Does it (always) lead to better results in a prediction system, or is it a matter of trial and error?

Apr 5 in Machine Learning by Dev
• 6,000 points
36 views

1 answer to this question.

0 votes
Mostly by trial and error. Binning a continuous variable always discards some information. Many algorithms prefer to predict from a continuous input, and many bin continuous inputs internally anyway. If your continuous variable is noisy, meaning its values were not recorded accurately, binning can be a good idea because it can lessen the effect of that noise. Two common binning procedures are equal-width binning and equal-frequency binning. When your continuous variable is unevenly distributed (for example, heavily skewed), I would avoid equal-width binning, since most values can end up in just a few bins.
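To make the difference between the two procedures concrete, here is a minimal sketch in plain Python. The helper names `equal_width_bins` and `equal_frequency_bins` and the sample `ages` list are made up for illustration; they do not come from any particular library. The rank-based equal-frequency version is one simple way to do it and may split tied values across bins.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything goes into bin 0.
        return [0 for _ in values]
    # The maximum value would fall just past the last interval,
    # so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins holding roughly equal counts."""
    # Rank the values, then cut the ranks into n_bins consecutive groups.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


# Hypothetical skewed sample: nine small values and one large outlier.
ages = [18, 19, 20, 21, 22, 23, 24, 25, 26, 95]

width_counts = Counter(equal_width_bins(ages, 3))
freq_counts = Counter(equal_frequency_bins(ages, 3))

print("equal width:    ", sorted(width_counts.items()))
print("equal frequency:", sorted(freq_counts.items()))
```

On this sample, equal-width binning puts nine of the ten values into the first bin and leaves the middle bin empty, while equal-frequency binning spreads the values across all three bins. That is the reason to be careful with equal-width binning on skewed data.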
answered Apr 7 by Nandini
• 5,480 points
