Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 10: invalid start byte

+4 votes

I can't load this CSV file; pandas raises the error in the title. My code was:

import pandas as pd
a = pd.read_csv("filename.csv")
Jul 11, 2019 in Python by Yadu
131,497 views

4 answers to this question.

+10 votes
Best answer

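The file contains special characters that are not valid UTF-8, so pass encoding='latin1' when reading it. Latin-1 maps every possible byte to a character, so the read cannot fail, although the characters will only look right if the file really is in Latin-1 or a close relative such as cp1252. Try this: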

import pandas as pd
data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='latin1')
print(data.head())
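
To see why the default read fails and latin1 succeeds, here is a minimal sketch that decodes a few bytes directly instead of reading a CSV. The sample bytes are made up for illustration, since the question does not show the file's contents.

```python
# Hypothetical sample: byte 0xA0 is a non-breaking space in Latin-1,
# but in UTF-8 it is a continuation byte and cannot start a character.
raw = b"Customer 1\xa0,Anna\n"

try:
    raw.decode("utf-8")            # same codec pandas uses by default
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Inspect where and why the UTF-8 decode failed.
print(utf8_error.reason, "at byte offset", utf8_error.start)

# Latin-1 assigns a character to every byte value, so this never raises.
text = raw.decode("latin1")
print(repr(text))
```

The same reasoning applies to pd.read_csv: the encoding argument only chooses which codec turns the file's bytes into text.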


Hope it helps!!

Thanks!

answered Jul 11, 2019 by Ritu

selected Dec 11, 2019 by Kalgi
worked for me also thanks
OMG! Thank you so much. I was stuck on this for a while!! :D
Thank you!!! that additional encoding='latin1' entirely fixes this issue

Thanks a lot. I was really stuck with this problem. 

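Alternatively, encoding='unicode_escape' also reads the file without raising. Be aware that it additionally interprets backslash sequences such as \n or \t found in the data, so the result matches latin1 only when the file contains no backslashes.

import pandas as pd
data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='unicode_escape')
print(data.head())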
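
A small sketch of how the two codecs can differ, using made-up bytes that contain backslashes (an assumption for illustration, not taken from the original file):

```python
# Hypothetical bytes: a Windows-style path followed by a 0xA0 byte.
raw = b"C:\\temp\\new\xa0file"

# latin1 keeps every byte as one character, backslashes included.
latin = raw.decode("latin1")

# unicode_escape also maps 0xA0 to U+00A0, but it turns the two-byte
# sequences \t and \n into a real tab and a real newline.
escaped = raw.decode("unicode_escape")

print(repr(latin))
print(repr(escaped))
```

If the data may contain backslashes, latin1 (or cp1252) is the safer choice of the two.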

worked for me as well. thanks very much indeed
0 votes

tl;dr / quick fix

  1. Don't decode/encode willy nilly.
  2. Don't assume your strings are UTF-8 encoded.
  3. Try to convert strings to Unicode strings as soon as possible in your code.
  4. Fix your locale: How to solve UnicodeDecodeError in Python 3.6?
  5. Don't be tempted to use quick reload hacks.
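
Points 1 to 3 can be sketched as follows: decode bytes once where they enter the program, work with str everywhere in between, and encode once on the way out. The function names load_text and save_text are hypothetical helpers, and cp1252 is only an assumed input encoding.

```python
def load_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode once, at the point where bytes enter the program."""
    return raw.decode(encoding)


def save_text(text: str, encoding: str = "utf-8") -> bytes:
    """Encode once, at the point where text leaves the program."""
    return text.encode(encoding)


# Assumed input: bytes that an upstream system wrote as cp1252.
incoming = "café".encode("cp1252")

text = load_text(incoming, encoding="cp1252")  # state the encoding explicitly
shouted = text.upper()                         # all processing happens on str
outgoing = save_text(shouted)                  # written back out as UTF-8
```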
answered Dec 11, 2020 by Roshni
• 10,480 points
+1 vote
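In Python 2 you can decode the byte string with an explicit error handler:

str = unicode(str, errors='replace')

or

str = unicode(str, errors='ignore')

Python 3 has no unicode() built-in; the equivalent there is raw_bytes.decode('utf-8', errors='replace') or the same call with errors='ignore', where raw_bytes stands for your bytes object.

Note: errors='ignore' strips out the offending bytes and returns the string without them, while errors='replace' substitutes the replacement character U+FFFD for each one.

For me ignoring is the ideal case, since I'm using it as protection against non-ASCII input, which my application does not allow.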
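
A minimal Python 3 sketch of the two error handlers, using made-up sample bytes:

```python
# Hypothetical input containing one byte (0xA0) that is invalid in UTF-8.
raw = b"New\xa0York"

# 'replace' keeps a visible marker (U+FFFD) where the bad byte was.
replaced = raw.decode("utf-8", errors="replace")

# 'ignore' silently drops the bad byte.
ignored = raw.decode("utf-8", errors="ignore")

print(repr(replaced))
print(repr(ignored))
```

Both handlers lose information, so they fit best where the offending characters are genuinely unwanted.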

Alternatively: Use the open method from the codecs module to read in the file:

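import codecs

file_name = "filename.csv"  # path to the file being read
with codecs.open(file_name, 'r', encoding='utf-8',
                 errors='ignore') as fdata:
    content = fdata.read()  # undecodable bytes are silently dropped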

answered Dec 11, 2020 by Gitika
• 65,950 points
0 votes

Python's bytes.decode() converts bytes to a str object, and str.encode() does the reverse. Both accept an errors argument that selects the error-handling scheme. The default is 'strict', meaning that a decoding failure raises UnicodeDecodeError and an encoding failure raises UnicodeEncodeError.

A UnicodeDecodeError normally happens when decoding bytes with a particular codec. Since each codec maps only certain byte sequences to Unicode characters, an illegal byte sequence causes that codec's decode() to fail. Here, byte 0xa0 is a non-breaking space in Latin-1 and cp1252, but in UTF-8 it can only appear as a continuation byte, never at the start of a character.
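
As a rough illustration of the 'strict' default in both directions, with sample values chosen only for the demonstration:

```python
# str -> bytes: 'é' has no ASCII representation, so strict encoding fails.
try:
    "café".encode("ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# bytes -> str: 0xE9 is 'é' in Latin-1 but not a complete UTF-8 sequence.
try:
    b"caf\xe9".decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Both exceptions record the offending input and the position of the failure.
print(type(encode_error).__name__, encode_error.start)
print(type(decode_error).__name__, decode_error.start)

# A non-strict handler trades the exception for a substitute character.
fallback = "café".encode("ascii", errors="replace")
```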

answered Dec 11, 2020 by Rajiv
• 8,890 points
