UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 10: invalid start byte

Question

I am unable to import this file; it raises the error in the title. My code was:

import pandas as pd
a = pd.read_csv("filename.csv")

Kalgi · Answer 1 · Jul 11, 2019

Best answer

You need to read this file with the latin1 encoding, because it contains some special characters that are not valid UTF-8. Try this:

import pandas as pd
data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='latin1')
print(data.head())
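
If the file's real encoding is unknown, one option is to try a few candidate encodings in order and pass the one that works to pd.read_csv. This is only a stdlib sketch: decode_with_fallback and its candidate list are hypothetical names chosen for illustration, not part of pandas.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin1")):
    """Return (text, encoding) for the first candidate that decodes raw."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample containing the 0x87 byte from the error message.
sample = b"price\x87"
text, used = decode_with_fallback(sample)
print(used)
```

Keep latin1 last: it maps every possible byte to a character, so it never fails, which also means a successful decode does not prove the characters are the right ones. The returned name can then be passed as encoding=used to pd.read_csv.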

Hope it helps!!


Thanks!

answered Jul 11, 2019 by Ritu

selected Dec 11, 2019 by Kalgi


Thank you so much you saved me

commented Feb 22, 2023 by anonymous

edited Mar 6

It's really helpful, it worked. Thank you.

commented Apr 30, 2023 by krishanu

edited Mar 6

THANK YOU! I got stuck here for 2.5hrs

commented Aug 2, 2023 by sunilregu

edited Mar 6

It worked thank you so much

commented Jan 20 by anonymous

edited Mar 6

it worked...thank you

commented Feb 4 by anonymous

edited Mar 6

Roshni · Answer 2 · Dec 11, 2020

tl;dr / quick fix

Don't decode/encode willy nilly.
Don't assume your strings are UTF-8 encoded.
Try to convert strings to Unicode strings as soon as possible in your code.
Fix your locale (see "How to solve UnicodeDecodeError in Python 3.6?").
Don't be tempted to use quick reload hacks (such as reload(sys) followed by sys.setdefaultencoding() in Python 2).
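
As a rough illustration of the "convert to Unicode as soon as possible" advice in Python 3: decode bytes once where they enter the program, work on str in between, and encode once on the way out. The function name shout and the UTF-8 default are assumptions made for this sketch.

```python
def shout(raw: bytes, encoding: str = "utf-8") -> bytes:
    text = raw.decode(encoding)      # bytes -> str at the input boundary
    result = text.upper()            # all processing happens on str
    return result.encode(encoding)   # str -> bytes at the output boundary


print(shout("café".encode("utf-8")))
```

With this layout there is exactly one place where a UnicodeDecodeError can occur, which makes a wrong encoding assumption easier to find.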


answered Dec 11, 2020 by Roshni
• 10,440 points

Gitika · Answer 3 · Dec 11, 2020

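# Python 2 only: unicode() does not exist in Python 3.
# raw is the byte string to convert (avoid naming a variable str).
text = unicode(raw, errors='replace')

or

text = unicode(raw, errors='ignore')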

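Note: errors='ignore' strips out the offending characters and returns the string without them, while errors='replace' substitutes a replacement character for each one. unicode() exists only in Python 2; in Python 3, call decode() on the bytes object with the same errors argument.

For me this is the ideal case, since I'm using it as protection against non-ASCII input, which my application does not allow.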
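
A small Python 3 sketch of the difference between the two handlers, using a made-up byte string that contains the 0x87 byte from the question:

```python
# Hypothetical input: valid ASCII with one byte that is not valid UTF-8.
raw = b"caf\x87 ok"

# 'replace' keeps a visible marker (U+FFFD) where the bad byte was.
replaced = raw.decode("utf-8", errors="replace")

# 'ignore' silently drops the bad byte.
ignored = raw.decode("utf-8", errors="ignore")

print(repr(replaced))
print(repr(ignored))
```

Both handlers lose the original byte, so they suit cases where the damaged characters do not matter; they do not recover the intended text.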

Alternatively, use the open function from the codecs module to read in the file:

import codecs
# file_name is the path of the file to read
with codecs.open(file_name, 'r', encoding='utf-8',
                 errors='ignore') as fdata:
    text = fdata.read()
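
In Python 3 the built-in open accepts the same encoding and errors arguments, so codecs.open is not required. A minimal sketch, assuming a throwaway temporary file; read_text_lenient and the sample contents are hypothetical:

```python
import os
import tempfile


def read_text_lenient(path, encoding="utf-8"):
    """Read a text file, replacing undecodable bytes with U+FFFD."""
    with open(path, "r", encoding=encoding, errors="replace") as fh:
        return fh.read()


# Write a small demo file containing one byte that is invalid in UTF-8.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"name,city\nAna,Z\x87rich\n")
    demo_path = tmp.name

demo_text = read_text_lenient(demo_path)
os.remove(demo_path)
print(repr(demo_text))
```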

answered Dec 11, 2020 by Gitika
• 65,730 points

Rajiv · Answer 4 · Dec 11, 2020

The bytes decode() method converts a bytes object to a string, and the str encode() method does the reverse. Both accept an errors argument that selects the error handling scheme. The default is 'strict', meaning that a failure raises a UnicodeError: UnicodeDecodeError from decode() and UnicodeEncodeError from encode().

A UnicodeDecodeError normally happens when decoding a byte string with a particular codec. Since each encoding maps only a limited set of byte sequences to Unicode characters, an illegal byte sequence causes the codec-specific decode() to fail.
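
To see what the default 'strict' handler reports, the exception can be caught and inspected. This is a sketch only; describe_decode_failure is a hypothetical helper, and the sample bytes are made up to match the error in the question.

```python
def describe_decode_failure(raw, encoding="utf-8"):
    """Return a short description of the first decode failure, or None."""
    try:
        raw.decode(encoding)  # errors='strict' is the default
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first offending byte.
        return f"byte 0x{raw[exc.start]:02x} at position {exc.start}: {exc.reason}"
    return None


print(describe_decode_failure(b"0123456789\x87"))
```

The position in the message is a byte offset into the data, which helps locate the offending character in the file.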

score 0 · Answer 5 · May 5, 2021

My code

def decode(self, input, final=False):
    # decode input (taking the buffer into account)
    data = self.buffer + input
    (result, consumed) = self._buffer_decode(data, self.errors, final)
    # keep undecoded input until the next call
    self.buffer = data[consumed:]
    return result

I am getting a similar error and I am quite new to this. How can I fix it?

Error

  File "./load_dap_templates_dave.py", line 284, in <module>
    data = pickle.load(neFile)
  File "/usr/local/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Thanks in advance
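
The traceback passes through codecs.py during pickle.load, which suggests that neFile was opened in text mode. This is an assumption, since the open() call is not shown. Pickle data is binary, and streams written with protocol 2 or later begin with the byte 0x80, which matches the error. A minimal sketch of reading a pickle in binary mode, with a hypothetical load_pickle helper and a temporary demo file:

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load a pickle file; 'rb' keeps Python from decoding it as text."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Create a small pickle file to demonstrate the round trip.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    pickle.dump({"template": "dap"}, tmp)
    demo_path = tmp.name

restored = load_pickle(demo_path)
os.remove(demo_path)
print(restored)
```

If the file was opened with open(path) or open(path, "r"), changing the mode to "rb" is likely the whole fix. The codecs.py function quoted above is part of the standard library and does not need to be changed.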