Web Scraping With bs4 in Python 3: Can't Find Elements
I am currently experimenting with BeautifulSoup (bs4) in Python 3. When I print the soup or sauce, the elements that I am looking for are not there. I cannot find the code for t
Solution 1:
Very, very common problem: the page uses JavaScript to add items, but BS and requests can't run JavaScript.
You may use Selenium to control a real web browser, which can run JavaScript. Then either use Selenium functions to search for the data, or get the HTML from Selenium (driver.page_source) and parse it with BS.
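A minimal sketch of the Selenium route, not tested against the live site. It assumes Firefox with geckodriver is installed, and it assumes that the rendered standings page uses the same span class (team_name_span) that shows up in the feed response; the class name on the real page may differ. The helper names fetch_rendered_html and extract_team_names are made up for this example.

```python
from bs4 import BeautifulSoup


def extract_team_names(html):
    """Return the text of every <span class="team_name_span"> found in html."""
    soup = BeautifulSoup(html, 'html.parser')
    return [item.get_text(strip=True)
            for item in soup.find_all('span', class_='team_name_span')]


def fetch_rendered_html(url, wait_seconds=10):
    """Load url in a real browser and return the HTML after JavaScript has run."""
    # Imported here so the parsing helper above also works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and geckodriver are available
    try:
        driver.get(url)
        # Wait until JavaScript has added at least one matching element
        # (assumed selector, see the note above).
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, 'span.team_name_span'))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

Calling extract_team_names(fetch_rendered_html('https://www.flashscore.com/field-hockey/netherlands/hoofdklasse/standings/')) would then open the browser, wait for the table and return the names as a list. The explicit wait matters because driver.page_source read immediately after driver.get() may still be the page without the items that JavaScript adds later.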
Or you may use DevTools in Firefox/Chrome (tab: Network, filter: XHR) to find the URL used by JavaScript to get data from the server, and then use this URL with requests.
Using DevTools I found the URL and got HTML with the table.
It needed the header 'X-Fsign' to get the data instead of HTML with the message 401 Unauthorized.
I don't know if this header always has the same value. If not, it would need more research to find this value in the HTML or in cookies.
import requests
import bs4 as bs

# Feed URL found in DevTools (tab: Network, filter: XHR)
url = 'https://d.flashscore.com/x/feed/ss_1_INmPqO86_GOMWObX1_table_overall'

headers = {
    # 'User-Agent': 'Mozilla/5.0',
    # 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0',
    # 'X-Referer': 'https://www.flashscore.com/field-hockey/netherlands/hoofdklasse/standings/',
    'X-Fsign': 'SW9D1eZo',  # without this header the server answers 401 Unauthorized
    # 'X-Requested-With': 'XMLHttpRequest',
    # 'Referer': 'https://d.flashscore.com/x/feed/proxy-local',
}

r = requests.get(url, headers=headers)
soup = bs.BeautifulSoup(r.text, 'lxml')

# Every team name in the table sits in a <span class="team_name_span">
for item in soup.find_all('span', class_='team_name_span'):
    print(item.text)
Result:
Bloemendaal
Den Bosch
HGC
Rotterdam
Kampong
Oranje Rood
Amsterdam
Pinoke
Tilburg
Klein Zwitserland
Hurley
Almere
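Because the 'X-Fsign' value may change, it can help to fail loudly when the server answers 401 Unauthorized instead of silently parsing an error page. The following is only a sketch: fetch_feed is a name made up for this example, and the session parameter exists so the function can be tried with a stub object instead of a real network call.

```python
def fetch_feed(url, fsign, session=None):
    """Fetch a feed URL with the X-Fsign header and return the response text."""
    if session is None:
        import requests  # imported lazily so a stub session can be passed instead
        session = requests.Session()

    response = session.get(url, headers={'X-Fsign': fsign}, timeout=10)

    # The answer above reports 401 Unauthorized when the header is missing;
    # treating an outdated value the same way is an assumption.
    if response.status_code == 401:
        raise PermissionError(
            '401 Unauthorized: the X-Fsign value may be missing or outdated'
        )

    response.raise_for_status()  # any other HTTP error
    return response.text
```

The returned text can be passed to bs.BeautifulSoup(text, 'lxml') exactly as in the code above. If PermissionError is raised, the header value has to be looked up again in DevTools.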