How to perform HTML decoding/encoding using Python/Django?

I have a string that is HTML encoded:

'''&lt;img class=&quot;size-medium wp-image-113&quot;\
 style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot;\
 alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'''

I want to change that to:

<img class="size-medium wp-image-113" style="margin-left: 15px;" 
  title="su1" src="ttp://" 
  alt="" width="300" height="194" /> 

I want this to register as HTML so that it is rendered as an image by the browser instead of being displayed as text.

The string is stored like that because I am using a web-scraping tool called BeautifulSoup. It "scans" a web page, extracts certain content from it, and returns the string in that format.

I've found how to do this in C# but not in Python. Can someone help me out?

May 6 in Python by kartik
• 11,890 points

1 answer to this question.

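For HTML encoding, there's cgi.escape in the Python 2 standard library (it was deprecated in Python 3.2 and removed in Python 3.8 in favour of html.escape):

>>> import cgi
>>> help(cgi.escape)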
cgi.escape = escape(s, quote=None)
    Replace special characters "&", "<" and ">" to HTML-safe sequences.
    If the optional flag quote is true, the quotation mark character (")
    is also translated.
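
On Python 3 the same job is done by html.escape. The following is a minimal sketch, assuming Python 3.2 or later; the sample string and variable names are only illustrative and are not taken from the question:

```python
import html

# Illustrative input, not the exact string from the question.
raw = '<img alt="" width="300" />'

# html.escape replaces &, < and >. With quote=True (the default) it also
# escapes double and single quotation marks.
encoded = html.escape(raw)
print(encoded)

# quote=False leaves quotation marks untouched, which is closer to the
# default behaviour of the old cgi.escape.
encoded_no_quotes = html.escape(raw, quote=False)
print(encoded_no_quotes)
```

Note that the default differs from cgi.escape: html.escape escapes quotes unless told otherwise.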

For html decoding, I use the following:

import re
# Python 2 module; it is named html.entities in Python 3
from htmlentitydefs import name2codepoint
# for some reason, python 2.5.2 doesn't have this one (apostrophe)
name2codepoint['#39'] = 39

def unescape(s):
    "unescape HTML code refs; c.f."
    # Replace each known entity (&name;) with its Unicode character.
    return re.sub('&(%s);' % '|'.join(name2codepoint),
                  lambda m: unichr(name2codepoint[m.group(1)]), s)
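
On Python 3.4 and later, the standard library ships html.unescape, which handles named and numeric character references without a hand-written regex. This is a hedged sketch of the decoding step; the sample strings are made up for illustration:

```python
import html

# Illustrative encoded input, similar in shape to the string in the question.
encoded = '&lt;img alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'

# html.unescape converts named references (&lt;, &quot;, ...) back to characters.
decoded = html.unescape(encoded)
print(decoded)

# Numeric references such as &#39; (apostrophe) are handled as well.
apostrophe = html.unescape('it&#39;s')
print(apostrophe)
```

The decoded string is ordinary markup again, so whether the browser renders it as an image depends on how it is written into the page.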

For anything more complicated, I use BeautifulSoup.

Hope this is helpful!!

Thank You!!

answered May 6 by Niroj
• 23,950 points
