Overfitting vs Underfitting

I am new to Machine Learning, and I want to understand the difference between these two terms.
Jul 11, 2018 in Data Analytics by darklord

1 answer to this question.

In statistics and machine learning, one of the most common tasks is to fit a model to a set of training data so that it can make reliable predictions on new data it was not trained on.

In overfitting, a statistical model describes random error or noise instead of the underlying relationship. Overfitting occurs when a model is excessively complex, for example when it has too many parameters relative to the number of observations. An overfitted model has poor predictive performance because it overreacts to minor fluctuations in the training data: it typically scores well on the training data but poorly on new data.
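To make this concrete, here is a minimal sketch of overfitting, assuming NumPy is available. The sine target, the noise level, the sample sizes and the polynomial degrees are all made up for illustration; they are not taken from any particular dataset. A degree-9 polynomial has 10 parameters, one for each of the 10 training points, so it can chase the noise almost perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical underlying relationship, chosen only for illustration.
    return np.sin(2 * np.pi * x)

# Ten noisy training observations and a separate noisy test set.
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_fn(x_train) + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(0.05, 0.95, 50)
y_test = true_fn(x_test) + rng.normal(scale=0.2, size=x_test.size)

def mse(coeffs, x, y):
    # Mean squared error of the polynomial with the given coefficients.
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Fit a modest model (degree 3) and an excessively complex one (degree 9).
results = {}
for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree {degree}: train MSE = {train_err:.4f}, test MSE = {test_err:.4f}")
```

The degree-9 fit passes through nearly every training point, so its training error is close to zero, yet its error on the test points is clearly larger. That gap between training and test error is the usual symptom of overfitting.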

Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data, for example when a linear model is fitted to non-linear data. Such a model also has poor predictive performance, on the training data as well as on new data.
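The linear-model case can be sketched in a few lines, again assuming NumPy. The quadratic target, the noise level and the even/odd split are illustrative choices, not part of any real dataset. A straight line is fitted to data that follows a parabola and compared with a quadratic fit on the same data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Non-linear data: y = x^2 plus a little noise (illustrative values).
x = np.linspace(-3.0, 3.0, 200)
y = x ** 2 + rng.normal(scale=0.5, size=x.size)

# Simple split: even-indexed points for training, odd-indexed for testing.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def fit_and_score(degree):
    # Fit a polynomial of the given degree, return (train MSE, test MSE).
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

linear_train, linear_test = fit_and_score(1)  # too simple: underfits
quad_train, quad_test = fit_and_score(2)      # matches the true shape

print(f"linear:    train MSE = {linear_train:.3f}, test MSE = {linear_test:.3f}")
print(f"quadratic: train MSE = {quad_train:.3f}, test MSE = {quad_test:.3f}")
```

The linear model has a high error on both the training and the test points, because no straight line can follow the curve. Unlike an overfitted model, an underfitted one does badly even on the data it was trained on.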

answered Jul 11, 2018 by CodingByHeart77
