When should Data Binning be used in data processing?

0 votes

In data pre-processing, data binning is a technique that converts the continuous values of a feature into categorical ones. For example, the values of an age feature in a dataset are sometimes replaced with intervals such as:

[10,25),
[25,40),
[40,55].
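
To make the example concrete, here is a minimal sketch of how ages could be mapped to those three intervals using only the Python standard library. The names `EDGES`, `LABELS` and `bin_age` are made up for illustration, and returning `None` for out-of-range ages is an assumption, not something the question specifies.

```python
from bisect import bisect_right

# Bin edges taken from the example intervals: [10,25), [25,40), [40,55]
EDGES = [10, 25, 40, 55]
LABELS = ["[10,25)", "[25,40)", "[40,55]"]

def bin_age(age):
    """Return the interval label for an age, or None if it is out of range."""
    if age < EDGES[0] or age > EDGES[-1]:
        return None
    # bisect_right counts how many edges are <= age. The last interval is
    # closed on the right, so an age equal to the final edge is clamped
    # into the last bin.
    idx = min(bisect_right(EDGES, age) - 1, len(LABELS) - 1)
    return LABELS[idx]

ages = [10, 24, 25, 39, 40, 55, 60]
binned = [bin_age(a) for a in ages]
print(binned)
```

After binning, 24 and 10 become indistinguishable, which is exactly the information loss the question is asking about.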

When is the best time to use data binning? Does it always lead to better results in a prediction system, or is it a matter of trial and error?

Mar 3, 2022 in Machine Learning by Dev
• 6,000 points
3,015 views

1 answer to this question.

0 votes

Mostly by trial and error. When you bin a continuous variable, you discard some of the information it carries. Many algorithms prefer to make predictions from a continuous input, and many bin continuous data themselves. Binning is a good idea when your continuous variable is noisy, meaning the values were not recorded precisely; in that case binning can help reduce the noise. Equal-width binning and equal-frequency binning are two examples of binning strategies. When your continuous variable is unevenly distributed (for example, heavily skewed), I would avoid equal-width binning, since most values would fall into only a few bins.
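
As a rough illustration of the difference between the two strategies, here is a small pure-Python sketch. The function names and the `incomes` sample are hypothetical, and the code assumes the values are not all identical (otherwise the bin width would be zero). Note that this simple equal-frequency version splits by rank, so tied values can end up in different bins.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k  # assumes hi > lo
    # The maximum value would compute to bin index k, so clamp it to k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins

# Hypothetical skewed sample: most values are small, one is very large.
incomes = [20, 22, 25, 27, 30, 33, 35, 40, 500]

width_bins = equal_width_bins(incomes, 3)
freq_bins = equal_frequency_bins(incomes, 3)
print(width_bins)
print(freq_bins)
```

With this sample, equal-width binning puts every value except the outlier into the first bin and leaves the middle bin empty, while equal-frequency binning spreads the values three per bin. That is why equal-width binning tends to be a poor fit for skewed variables.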


answered Mar 3, 2022 by Nandini
• 5,480 points
