What are the ways of detecting outliers in Python?

+1 vote

I want to set the outlier values to NaN. Here is the code I am using right now. Can someone explain how to do this?

import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(1000) + 5.0
plt.plot(data)
plt.xlabel('observation number')
plt.ylabel('recorded value')
plt.show()
Aug 24, 2018 in Python by bug_seeker
• 15,520 points

edited Aug 24, 2018 by Vardhan 1,101 views

3 answers to this question.

0 votes

An implementation for the N-dimensional case is available in the code for a paper here: https://github.com/joferkington/oost_paper_code/blob/master/utilities.py
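For readers who cannot open the link, here is a minimal sketch of what a median-based check for N-dimensional data might look like. It is not a verbatim copy of the linked file: the function name is_outlier, the 3.5 threshold, and the handling of a zero MAD are assumptions made for this example. It computes each observation's distance from the per-dimension median and scales it by the median of those distances (a modified z-score).

```python
import numpy as np

def is_outlier(points, thresh=3.5):
    """Return a boolean mask, True where an observation looks like an outlier.

    points : array of shape (n_obs,) or (n_obs, n_dims)
    thresh : modified z-score cutoff (3.5 is an assumed default)
    """
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]          # treat 1-D input as one column
    median = np.median(points, axis=0)    # per-dimension median
    # Euclidean distance of each observation from the median point
    dist = np.sqrt(np.sum((points - median) ** 2, axis=-1))
    mad = np.median(dist)                 # median absolute deviation
    if mad == 0:
        # Assumption of this sketch: if more than half the points sit on
        # the median, flag everything that does not.
        return dist > 0
    return 0.6745 * dist / mad > thresh

# Data shaped like the question's, with ten artificial outliers injected
rng = np.random.default_rng(0)
data = rng.random(1000) + 5.0
data[::100] = 50.0

cleaned = data.copy()
cleaned[is_outlier(data)] = np.nan        # replace flagged values with NaN
```

Functions such as np.nanmean(cleaned) can then be used to compute statistics that skip the masked values.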

answered Aug 24, 2018 by Priyaj
• 58,090 points
0 votes

There are a huge number of ways to test for outliers, and you should give some thought to how you classify them. Ideally, you should use a priori information (e.g. "anything above/below this value is unrealistic because...").
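As a rough illustration of the a priori approach, the sketch below replaces every value outside a known plausible range with NaN. The bounds LOWER and UPPER and the two injected bad readings are hypothetical, chosen only to match the shape of the data in the question; in practice they would come from knowledge of the measurement.

```python
import numpy as np

# Hypothetical bounds: assume valid readings must lie in [5.0, 6.0]
LOWER, UPPER = 5.0, 6.0

rng = np.random.default_rng(0)
data = rng.random(1000) + 5.0
data[[10, 500]] = [-3.0, 42.0]            # two implausible readings

# Boolean mask of values outside the plausible range
outside = (data < LOWER) | (data > UPPER)

# Keep in-range values, replace the rest with NaN
cleaned = np.where(outside, np.nan, data)

print(int(outside.sum()), "values replaced with NaN")
print("mean ignoring NaN:", np.nanmean(cleaned))
```

np.where leaves the original array untouched, which makes it easy to compare the raw and cleaned data afterwards.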

answered Aug 24, 2018 by Archana
• 4,170 points
0 votes

This code is from http://eurekastatistics.com/using-the-median-absolute-deviation-to-find-outliers. It uses the L1 distance instead of the L2 distance, and it supports asymmetric distributions.

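import numpy as np

def doubleMADsfromMedian(y, thresh=3.5):
    # Warning: this function does not check for NAs,
    # nor does it address issues when
    # more than 50% of your data have identical values
    # (a MAD of zero then causes a division by zero).
    y = np.asarray(y, dtype=float)
    m = np.median(y)
    abs_dev = np.abs(y - m)
    # Separate MADs for the points below and above the median
    left_mad = np.median(abs_dev[y <= m])
    right_mad = np.median(abs_dev[y >= m])
    # Each point is scaled by the MAD of its own side
    y_mad = left_mad * np.ones(len(y))
    y_mad[y > m] = right_mad
    modified_z_score = 0.6745 * abs_dev / y_mad
    modified_z_score[y == m] = 0
    # Boolean mask: True where the point is flagged as an outlier
    return modified_z_score > thresh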
answered Aug 24, 2018 by eatcodesleeprepeat
• 4,710 points

reshown Aug 24, 2018 by Priyaj
