How to use BeautifulSoup for Webscraping

0 votes

I am trying to scrape all the subject titles of all the forum posts on this website. I am not sure how to go about this as the HTML format of the forum website is not what I am familiar with.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'http://thailove.net/bbs/board.php?bo_table=ent'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

#I don't think this is correct, but not sure on how else to to do this...
containers = page_soup.findAll("td",{"class":"td_subject"})


for container in containers:
subject = container.a.font.font.contents
#similarly not sure this is correct     
print("subject: ", subject)

Please let me know what I should do. Also keep in mind that the website is in Korean but can be easily translated into English if need be.

Sep 6, 2018 in Python by bug_seeker
• 14,960 points
119 views

1 answer to this question.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

Your code is good until you get to the for loop, you should be acessing container.a.contents[0]to get the subjects, and the print function should be inside your for loop:

for container in containers:
    subject = container.a.contents[0]
    print("subject: ", subject)
answered Sep 6, 2018 by Priyaj
• 56,100 points

Related Questions In Python

+2 votes
2 answers

How to use BeatifulSoup for webscraping?

your programme is fine until you start ...READ MORE

answered Apr 4, 2018 in Python by charlie_brown
• 7,710 points
16 views
0 votes
1 answer

Raw_input method is not working in python3. How to use it?

raw_input is not supported anymore in python3. ...READ MORE

answered May 4, 2018 in Python by aayushi
• 750 points
81 views
0 votes
2 answers

how to use print statement in python3?

The print statement has been replaced with a print() ...READ MORE

answered Jul 16, 2018 in Python by Mrunal
• 680 points
22 views
0 votes
1 answer

How to use “raise” keyword in Python

You can use it to raise errors ...READ MORE

answered Jul 30, 2018 in Python by Priyaj
• 56,100 points
15 views
0 votes
1 answer

How to download intext images with beautiful soup

Try this: html_data = """ <td colspan="3"><b>"Assemble under ...READ MORE

answered Sep 10, 2018 in Python by Priyaj
• 56,100 points
233 views
0 votes
1 answer

How to download intext images with beautiful soup

Ohh... I got what you need. Try this: html_data ...READ MORE

answered Sep 20, 2018 in Python by Priyaj
• 56,100 points
764 views
0 votes
1 answer

Get all the read more links of amazon.jobs with Python

As you've noticed your request returns only ...READ MORE

answered Sep 28, 2018 in AWS by Priyaj
• 56,100 points
40 views
0 votes
1 answer

How to use for loop in Python?

There are multiple ways of using for ...READ MORE

answered Mar 4 in Python by Priyaj
• 56,100 points
9 views
0 votes
2 answers

How to use threading in Python?

 Thread is the smallest unit of processing that ...READ MORE

answered Apr 6 in Python by anonymous
43 views

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
"PMP®","PMI®", "PMI-ACP®" and "PMBOK®" are registered marks of the Project Management Institute, Inc. MongoDB®, Mongo and the leaf logo are the registered trademarks of MongoDB, Inc.