How to use BeautifulSoup for web scraping

+2 votes

I'm trying to collect all the titles of a forum from a certain site. I can't really figure out which HTML elements to target as I'm not very familiar with the site structure. 

This is what I came up with after reading the documentation:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'http://thailove.net/bbs/board.php?bo_table=ent'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

# I don't think this is correct, but I'm not sure how else to do this...
containers = page_soup.findAll("td", {"class": "td_subject"})

for container in containers:
    subject = container.a.font.font.contents
    # similarly not sure this is correct
print("subject: ", subject)


I'm not really sure what I should change to make this work.
Apr 4, 2018 in Python by aryya
• 7,450 points
749 views

2 answers to this question.

+2 votes
Best answer

Your program is fine until the for loop. To get each subject you have to access container.a.contents[0], and the print call should be inside the for loop so that every subject is printed rather than only the last one:

for container in containers:
    subject = container.a.contents[0]
    print("subject: ", subject)
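
For anyone who wants to try this without hitting the live site, here is a minimal, self-contained sketch of the same loop. The HTML in SAMPLE_HTML is made up for illustration and only assumed to resemble the board's markup (one td with class "td_subject" per row); extract_subjects is a hypothetical helper name, not something from the question. It uses get_text() rather than contents[0], since contents[0] returns a tag instead of a string when the link text is wrapped in nested font elements.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: assumed to look roughly like the board's rows.
# The real page may be structured differently.
SAMPLE_HTML = """
<table>
  <tr><td class="td_subject"><a href="/bbs/1"><font><font>First topic</font></font></a></td></tr>
  <tr><td class="td_subject"><a href="/bbs/2">Second topic</a></td></tr>
  <tr><td class="td_subject">No link in this cell</td></tr>
</table>
"""


def extract_subjects(html):
    """Return the link text of every td.td_subject cell that contains an <a>."""
    page_soup = BeautifulSoup(html, "html.parser")
    subjects = []
    for cell in page_soup.find_all("td", class_="td_subject"):
        link = cell.find("a")
        if link is None:
            # Skip cells without a link instead of raising AttributeError.
            continue
        # get_text() flattens any nested tags inside the link into one string.
        subjects.append(link.get_text(strip=True))
    return subjects


if __name__ == "__main__":
    for subject in extract_subjects(SAMPLE_HTML):
        print("subject:", subject)
```

To run it against the real page, pass the page_html from the question to extract_subjects instead of SAMPLE_HTML. Whether the class name and nesting match the live site is something you would need to check in the page source.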
answered Apr 4, 2018 by charlie_brown
• 7,720 points

selected Oct 12, 2018 by Omkar
0 votes
You can go through the link below, which explains web scraping in brief:
https://www.dataquest.io/blog/web-scraping-tutorial-python/
answered Oct 12, 2018 by findingbugs
• 4,780 points
