How to perform HTML decoding/encoding using Python/Django?

I have a string that is HTML encoded:

'''&lt;img class=&quot;size-medium wp-image-113&quot;\
 style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot;\
 alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'''

I want to change that to:

<img class="size-medium wp-image-113" style="margin-left: 15px;" 
  title="su1" src="ttp://" 
  alt="" width="300" height="194" /> 

I want this to register as HTML so that it is rendered as an image by the browser instead of being displayed as text.

The string is stored like that because I am using a web-scraping tool called BeautifulSoup. It "scans" a web page, extracts certain content from it, and returns the string in that format.

I've found how to do this in C# but not in Python. Can someone help me out?

May 6 in Python by kartik
1 answer to this question.

For HTML encoding, Python 2 has cgi.escape in the standard library:

>>> import cgi
>>> help(cgi.escape)
cgi.escape = escape(s, quote=None)
    Replace special characters "&", "<" and ">" to HTML-safe sequences.
    If the optional flag quote is true, the quotation mark character (")
    is also translated.
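
On Python 3 the equivalent is html.escape, since cgi.escape was deprecated and then removed in Python 3.8. A minimal sketch, assuming Python 3.2 or later; the tag value is only a shortened, illustrative version of the string from the question:

```python
import html

# Shortened version of the tag from the question (illustrative value).
tag = '<img title="su1" alt="" />'

# quote=True is the default, so quotation marks are escaped as well.
encoded = html.escape(tag)
print(encoded)

# quote=False escapes only "&", "<" and ">".
encoded_no_quotes = html.escape(tag, quote=False)
print(encoded_no_quotes)
```

Note the difference in defaults: cgi.escape leaves quotes alone unless asked, while html.escape escapes them unless told not to.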

For HTML decoding on Python 2, I use the following:

import re
from htmlentitydefs import name2codepoint
# for some reason, python 2.5.2 doesn't have this one (apostrophe)
name2codepoint['#39'] = 39

def unescape(s):
    "unescape HTML code refs; c.f."
    return re.sub('&(%s);' % '|'.join(name2codepoint),
                  lambda m: unichr(name2codepoint[m.group(1)]), s)
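
The function above targets Python 2 (htmlentitydefs and unichr do not exist on Python 3). On Python 3.4 or later, the standard library's html.unescape should cover the same case without a hand-written regex. A minimal sketch; the encoded value is a shortened, illustrative version of the string from the question:

```python
import html

# Encoded markup similar to the string in the question (shortened for illustration).
encoded = ('&lt;img class=&quot;size-medium wp-image-113&quot; '
           'alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;')

# html.unescape converts named (&quot;), decimal (&#39;) and
# hexadecimal (&#x27;) character references back to characters.
decoded = html.unescape(encoded)
print(decoded)
```

If the decoded string is then output through a Django template, it will be auto-escaped again unless it is marked safe (for example with the safe filter), which should only be done for markup you trust.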

For anything more complicated, I use BeautifulSoup.

Hope this is helpful!!

Thank You!!

answered May 6 by Niroj
