Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 10: invalid start byte

+4 votes

I am unable to import this file; it shows an error. My code was:

import pandas as pd
a = pd.read_csv("filename.csv")
Jul 11, 2019 in Python by Yadu
320,023 views

4 answers to this question.

+11 votes
Best answer

You have to read this file with encoding='latin1', because it contains some special characters that are not valid UTF-8, which is the encoding pandas assumes by default. Try the code snippet below:

import pandas as pd
data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='latin1')
print(data.head())
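
To see why latin1 gets past the error, here is a minimal sketch that uses only the standard library. The sample bytes are made up for illustration (they are not taken from Customers.csv) and place 0xa0 at offset 10, as in the error message. Note that latin1 never fails, because it maps every possible byte to a character, so it can also silently produce wrong characters if the file was really saved in some other encoding.

```python
# Hypothetical sample data: 0xa0 sits at byte offset 10, as in the error message.
raw = b"first_name\xa0,city\n"

failure = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the first byte that could not be decoded.
    failure = (exc.start, raw[exc.start:exc.end])

# latin1 maps every byte 0x00-0xFF to a code point, so this cannot raise.
text = raw.decode("latin1")

print(failure)
print(repr(text))
```

In latin1 the byte 0xa0 is a non-breaking space, which is a common leftover in files exported from spreadsheets.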


Hope it helps!!

Thanks!

answered Jul 11, 2019 by Ritu

selected Dec 11, 2019 by Kalgi
worked for me also thanks
OMG! Thank you so much. I was stuck on this for a while!! :D
Thank you!!! that additional encoding='latin1' entirely fixes this issue

Thanks a lot. I was really stuck with this problem. 

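Alternatively, encoding='unicode_escape' also gets past this error, because it decodes non-ASCII bytes the same way latin1 does. It is not identical, though: it also interprets backslash sequences such as \n in the data as escapes, so latin1 is the safer choice for ordinary CSV files.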

import pandas as pd
data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='unicode_escape')
print(data.head())
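
A small sketch of how the two codecs differ, using made-up bytes rather than a real file: both accept the 0xa0 byte, but unicode_escape additionally turns a backslash followed by n into a newline.

```python
# Hypothetical bytes: C : \ n e w 0xa0 d i r (one literal backslash).
raw = b"C:\\new\xa0dir"

as_latin1 = raw.decode("latin1")          # backslash is kept as-is
as_escape = raw.decode("unicode_escape")  # backslash + n becomes a newline

print(repr(as_latin1))
print(repr(as_escape))
```

If the data can contain backslashes (Windows paths, for example), the two results will not match.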

worked for me as well. thanks very much indeed
0 votes

tl;dr / quick fix

  1. Don't decode/encode willy nilly.
  2. Don't assume your strings are UTF-8 encoded.
  3. Try to convert strings to Unicode strings as soon as possible in your code.
  4. Fix your locale: How to solve UnicodeDecodeError in Python 3.6?
  5. Don't be tempted to use quick reload hacks.
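
Points 2 and 3 can be sketched as a single decode step at the boundary where bytes enter the program. The helper name decode_bytes and the fallback order (UTF-8, then cp1252, then latin1) are assumptions chosen for illustration, not a standard recipe; pick the candidate encodings that match where your files come from.

```python
def decode_bytes(raw, encodings=("utf-8", "cp1252")):
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # latin1 accepts every byte, so it serves as a last resort.
    return raw.decode("latin1"), "latin1"


text, used = decode_bytes(b"caf\xc3\xa9")   # valid UTF-8
print(used, text)

text, used = decode_bytes(b"price\xa0100")  # not valid UTF-8
print(used, repr(text))
```

Returning the encoding that was used makes it easy to log which files were not UTF-8.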

answered Dec 11, 2020 by Roshni
• 10,520 points
+1 vote
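In Python 2, where unicode() exists (raw is a byte string):

text = unicode(raw, errors='replace')

or

text = unicode(raw, errors='ignore')

In Python 3 there is no unicode(); the closest equivalent is raw.decode(encoding, errors='replace') or raw.decode(encoding, errors='ignore') on a bytes object.

Note: errors='ignore' strips out the undecodable bytes and returns the string without them, while errors='replace' substitutes U+FFFD (the replacement character) for each one.

For me this is the ideal case, since I'm using it as protection against non-ASCII input, which is not allowed by my application.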

Alternatively: Use the open method from the codecs module to read in the file:

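import codecs

# file_name is the path to the file being read
with codecs.open(file_name, 'r', encoding='utf-8',
                 errors='ignore') as fdata:
    text = fdata.read()  # undecodable bytes are silently dropped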
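
In Python 3 the built-in open() accepts the same encoding and errors arguments, so the codecs module is optional. A minimal sketch, assuming a small temporary file created only for this demonstration:

```python
import os
import tempfile

# Write a small hypothetical file containing one byte that is not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"name,city\nJos\xa0,Lima\n")
    path = tmp.name

try:
    # errors="replace" substitutes U+FFFD for each undecodable byte.
    with open(path, "r", encoding="utf-8", errors="replace") as fdata:
        content = fdata.read()
finally:
    os.remove(path)

print(repr(content))
```

Using errors="replace" rather than errors="ignore" leaves a visible marker where data was lost.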

answered Dec 11, 2020 by Gitika
• 65,910 points
0 votes

The bytes decode() function converts a bytes object to a string, and str.encode() does the reverse. Both functions let us specify the error handling scheme to use for encoding/decoding errors. The default is 'strict', meaning that a decoding error raises a UnicodeDecodeError and an encoding error raises a UnicodeEncodeError.

A UnicodeDecodeError normally happens when decoding a byte string with a particular encoding. Since an encoding maps only a limited set of byte sequences to Unicode characters, an illegal byte sequence causes the encoding-specific decode() to fail. Here, 0xa0 can only appear as a continuation byte in UTF-8, never as the first byte of a character, hence "invalid start byte".
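
As a rough illustration of the error handling schemes, the sketch below decodes the same made-up bytes with several handlers. The sample value is an assumption for demonstration only.

```python
raw = b"abc\xa0def"  # hypothetical input with one byte that is invalid in UTF-8

results = {}
for handler in ("ignore", "replace", "backslashreplace"):
    results[handler] = raw.decode("utf-8", errors=handler)

try:
    raw.decode("utf-8")  # errors="strict" is the default
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc.reason

print(results)
print(strict_error)
```

The backslashreplace handler is handy for debugging, because it keeps the offending byte visible as an escape sequence instead of discarding it.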

answered Dec 11, 2020 by Rajiv
• 8,910 points
