Which machine learning classifier to choose in general

0 votes

Suppose I'm working on some classification problem. (Fraud detection and comment spam are two problems I'm working on right now, but I'm curious about any classification task in general.)

How do I know which classifier I should use?

  1. Decision tree
  2. SVM
  3. Bayesian
  4. Neural network
  5. K-nearest neighbors
  6. Q-learning
  7. Genetic algorithm
  8. Markov decision processes
  9. Convolutional neural networks or others
In which cases is one of these the "natural" first choice, and what are the principles for choosing that one?
Feb 21 in Machine Learning by Nandini
• 5,480 points

1 answer to this question.

0 votes
The choice of classifier depends on the data set. When a substantial amount of training data is available, boosting is generally effective. Random trees (random forests) are also frequently effective, and they can be used for regression as well.
K-nearest neighbors is about the simplest thing you can try, but prediction can be slow and it needs a lot of memory because it keeps the whole training set.
Neural networks are slow to train but very fast to run, and they are still among the best performers for letter recognition.
With limited training data, SVM is often one of the best choices.
With categorical or binomial features, a Bayesian classifier (such as naive Bayes) often works well.
Complex non-linear classification can be handled with neural networks or with SVMs that use a non-linear kernel.
In short, there is no benchmark or predefined answer. It comes down to iterating over candidate models, improving the results with hyperparameter tuning, and finalizing the model after those rounds of improvement.
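To make the "iterate and tune" idea concrete, here is a minimal sketch in plain Python. It tunes one hyperparameter (the k of a k-nearest-neighbors classifier) by cross-validation. Everything in it is an assumption made for illustration: the synthetic two-blob data, the helper names (make_data, knn_predict, cross_val_accuracy, tune_k), the candidate values of k, and the choice of 5 folds. It is not a recommendation for your fraud or spam data.

```python
import random
from collections import Counter


def make_data(n=200, seed=0):
    """Synthetic two-class data: two Gaussian blobs in 2D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Classify `point` by majority vote among its k closest training points."""
    def sq_dist(item):
        (x, y), _ = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2

    # Every prediction scans the full training set, which is why
    # k-NN is slow at prediction time and memory-hungry.
    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=5):
    """Mean accuracy of k-NN over `folds` held-out splits."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, p, k) == y for p, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def tune_k(data, candidates=(1, 3, 5, 9, 15)):
    """Return the candidate k with the best cross-validated accuracy."""
    results = {k: cross_val_accuracy(data, k) for k in candidates}
    best_k = max(results, key=results.get)
    return best_k, results


if __name__ == "__main__":
    data = make_data()
    best_k, results = tune_k(data)
    for k, acc in results.items():
        print(f"k={k}: mean CV accuracy {acc:.3f}")
    print("best k:", best_k)
```

The same loop works for comparing different model families: score each candidate on held-out folds and keep the one that generalizes best. In practice a library such as scikit-learn provides this through cross_val_score and GridSearchCV, so you would rarely write the loop by hand.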
answered Feb 21 by Dev
• 6,000 points
