Beautiful Soup 4 find_all Doesn't Find Links That Beautiful Soup 3 Finds
I noticed a really annoying bug: BeautifulSoup4 (package: bs4) often finds fewer tags than the previous version (package: BeautifulSoup). Here's a reproducible instance of that issue
Solution 1:
You have lxml installed, which means that BeautifulSoup 4 will use that parser over the standard-library html.parser option.
You can upgrade lxml to 3.2.1 (which for me returns 1701 results for your test page). lxml itself uses libxml2 and libxslt, which may also be to blame here, so you may have to upgrade those instead or as well. See the lxml requirements page; currently libxml2 2.7.8 or newer is recommended.
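Before upgrading, it can help to see which versions are actually in use. The following is a minimal sketch, not part of the original answer; the helper name lxml_versions is made up for illustration, and it relies on the version tuples that lxml.etree exposes.

```python
def lxml_versions():
    """Return the lxml, libxml2 and libxslt versions in use, or None if lxml is missing.

    Hypothetical helper for illustration only.
    """
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,        # version of the lxml package itself
        "libxml2": etree.LIBXML_VERSION,   # libxml2 library loaded at runtime
        "libxslt": etree.LIBXSLT_VERSION,  # libxslt library loaded at runtime
    }


versions = lxml_versions()
if versions is None:
    print("lxml is not installed")
else:
    for name, version in versions.items():
        # Each version is a tuple of integers; join it into dotted form.
        print(name, ".".join(str(part) for part in version))
```

If the libxml2 line shows something older than the recommended 2.7.8, upgrading the underlying library is probably worth trying before blaming Beautiful Soup itself.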
Or explicitly specify the other parser when parsing the soup:
s4 = bs4.BeautifulSoup(r.text, 'html.parser')
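To check whether the parser is really what makes the difference, one option is to run the same markup through every installed parser and compare the number of links found. This is only a sketch under assumptions: count_links and the sample markup are invented for illustration, and for the real page you would pass r.text instead.

```python
import bs4


def count_links(html, parsers=("lxml", "html5lib", "html.parser")):
    """Count <a> tags per parser, skipping parsers that are not installed.

    Hypothetical helper for illustration only.
    """
    counts = {}
    for parser in parsers:
        try:
            soup = bs4.BeautifulSoup(html, parser)
        except bs4.FeatureNotFound:
            # Raised by Beautiful Soup when the requested parser is unavailable.
            continue
        counts[parser] = len(soup.find_all("a"))
    return counts


# Made-up sample markup; substitute r.text for the real test page.
sample = '<html><body><a href="/one">one</a><p><a href="/two">two</a></p></body></html>'
print(count_links(sample))
```

On well-formed markup like the sample, all parsers should agree. A large gap between the lxml count and the html.parser count on the real page would point at the lxml/libxml2 installation rather than at Beautiful Soup 4.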