Web Scraping With bs4 in Python 3: Can't Find Elements

January 30, 2023

I am currently experimenting with BeautifulSoup (bs4) in Python 3. When I print the soup or sauce, the elements that I am looking for are not there. I cannot find the code for t

Solution 1:

This is a very common problem: the page uses JavaScript to add items, but BeautifulSoup and requests can't run JavaScript.

You can use Selenium to control a real web browser, which can run JavaScript. Then either use Selenium's own functions to search for the data, or get the HTML from Selenium (driver.page_source) and parse it with BeautifulSoup.
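A minimal sketch of the Selenium route, not tested against the live site. The function names are hypothetical, and it assumes Firefox with its driver is installed and that the rendered page uses the same team_name_span class as the feed shown further down.

```python
from bs4 import BeautifulSoup


def extract_team_names(html):
    """Return the text of every <span class="team_name_span"> in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [span.get_text(strip=True)
            for span in soup.find_all("span", class_="team_name_span")]


def fetch_rendered_html(url, timeout=10):
    """Load url in a real browser and return the HTML after JavaScript has run."""
    # Selenium is imported here so the parsing helper above works without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and its driver are available
    try:
        driver.get(url)
        # Assumption: the rendered page uses the same class name as the feed.
        # Wait until JavaScript has added at least one matching element.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "span.team_name_span"))
        )
        return driver.page_source
    finally:
        driver.quit()
```

Calling extract_team_names(fetch_rendered_html(url)) with the standings page URL should then give the same kind of list as requests would, at the cost of starting a browser.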

Alternatively, you can use DevTools in Firefox/Chrome (tab: Network, filter: XHR) to find the URL that JavaScript uses to get data from the server, and then use this URL with requests.

Using DevTools, I found the URL and got HTML with the table.

It needed the header 'X-Fsign' to return the data instead of HTML with the message 401 Unauthorized.

I don't know if this header always has the same value. If not, it would need more research to find this value in the HTML or in cookies.

import requests
import bs4 as bs

# Feed URL found in DevTools (Network tab, XHR filter)
url = 'https://d.flashscore.com/x/feed/ss_1_INmPqO86_GOMWObX1_table_overall'

# Only 'X-Fsign' was needed; other headers seen in DevTools are left commented out
headers = {
#    'User-Agent': 'Mozilla/5.0',
#    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0',
#    'X-Referer': 'https://www.flashscore.com/field-hockey/netherlands/hoofdklasse/standings/',
    'X-Fsign': 'SW9D1eZo',
#    'X-Requested-With': 'XMLHttpRequest',
#    'Referer': 'https://d.flashscore.com/x/feed/proxy-local',
}

r = requests.get(url, headers=headers)

# The 'lxml' parser requires the lxml package to be installed
soup = bs.BeautifulSoup(r.text, 'lxml')

# Each team name sits in <span class="team_name_span">
for item in soup.find_all('span', class_='team_name_span'):
    print(item.text)

Result:

Bloemendaal
Den Bosch
HGC
Rotterdam
Kampong
Oranje Rood
Amsterdam
Pinoke
Tilburg
Klein Zwitserland
Hurley
Almere
