How Do I Parse Every Html File In A Directory For Images?

I have a directory full of HTML files, each of which contains a clinical image of a psoriasis patient. I want to open each file, find the image, and save it in the same directory.

Solution 1:

You need to change this line:

for root, dirs, files in path:

to

for root, dirs, files in os.walk(path):

Also note that files holds file names (plain strings), not file objects, so each name has to be joined with root before it can be opened. This would be your fixed code:

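import os
from PIL import Image
from bs4 import BeautifulSoup

# Raw string so the backslashes in the Windows path are taken literally
path = r'C:\Users\gokalraina\Desktop\derm images'

for root, dirs, files in os.walk(path):
    for f in files:
        if not f.lower().endswith(('.html', '.htm')):
            continue  # skip the image files that live in the same tree
        # f is only a file name; join it with root to get the full path
        with open(os.path.join(root, f)) as html_file:
            soup = BeautifulSoup(html_file.read(), 'html.parser')
        for image in soup.find_all('img'):
            print('Image: %s' % image['src'])
            # Assumes src is a path relative to the HTML file, not a URL
            im = Image.open(os.path.join(root, image['src']))
            # JPEG has no alpha channel, so convert to RGB before saving
            target = os.path.join(path, os.path.basename(image['src']))
            im.convert('RGB').save(target, 'JPEG')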
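
For comparison, here is a minimal Python 3 sketch of the same walk that needs only the standard library. It swaps in html.parser for BeautifulSoup and copies the image file with shutil instead of re-encoding it through PIL. It assumes every src is a path relative to its HTML file rather than a URL; ImgSrcCollector and copy_images are hypothetical names used for illustration only.

```python
import os
import shutil
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def copy_images(top, dest):
    """Copy locally referenced images from HTML files under top into dest."""
    copied = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if not name.lower().endswith((".html", ".htm")):
                continue  # only parse HTML files
            collector = ImgSrcCollector()
            html_path = os.path.join(root, name)
            with open(html_path, encoding="utf-8", errors="replace") as fh:
                collector.feed(fh.read())
            for src in collector.sources:
                # Assumption: src is relative to the HTML file's directory.
                source_path = os.path.join(root, src)
                if not os.path.isfile(source_path):
                    continue  # remote URL or missing file: nothing to copy
                target = os.path.join(dest, os.path.basename(src))
                # Copying a file onto itself would raise, so skip that case.
                if os.path.abspath(source_path) != os.path.abspath(target):
                    shutil.copy(source_path, target)
                copied.append(target)
    return copied
```

Calling copy_images(path, path) would gather the images into the top-level directory. Images that share a base name would overwrite each other, so a real script may need to rename them.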

Solution 2:

for root, dirs, files in path:

path here is a string, so iterating over it yields one character at a time, and a single character cannot be unpacked into three variables. Hence the error message: need more than 1 value to unpack.
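
The failure can be reproduced in isolation. The following is a minimal sketch in Python 3, where the wording of the error message differs from the Python 2 message in the question; the path value and the try_unpack helper are made up for illustration.

```python
path = "derm images"  # hypothetical directory name, used only as a plain string

# Iterating over a string yields one character at a time.
first = next(iter(path))


def try_unpack(value):
    """Attempt the same three-way unpacking that the original loop performs."""
    try:
        root, dirs, files = value
    except ValueError as exc:
        return "ValueError: %s" % exc
    return "unpacked"


result = try_unpack(first)
print(repr(first))
print(result)
```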

You probably want:

for root, dirs, files in os.walk(path):
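
As a rough illustration of what os.walk yields, the sketch below (Python 3) builds a throwaway directory tree and lists every file in it. The list_files helper and the file names are hypothetical; the point is that each iteration gives a (root, dirs, files) tuple and that the entries in files are bare names that must be joined with root.

```python
import os
import tempfile


def list_files(top):
    """Return the full path of every file under top, using os.walk."""
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # name is a bare file name; join it with root to get a usable path
            found.append(os.path.join(root, name))
    return sorted(found)


# Build a temporary tree so the example does not touch real data.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for rel in ("a.html", os.path.join("sub", "b.html")):
        with open(os.path.join(top, rel), "w") as fh:
            fh.write("<html></html>")
    relative = [os.path.relpath(p, top) for p in list_files(top)]
    print(relative)
```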
