
Reading Settings In Spider Scrapy

I wrote a small Scrapy spider. Following is my code:

    class ElectronicsSpider(scrapy.Spider):
        name = 'electronics'
        allowed_domains = ['www.olx.com']
        start_urls = ['http:/

Solution 1:

from scrapy.utils.project import get_project_settings

settings = get_project_settings()
print(settings.get('NAME'))

Using this code we can read values from the project's settings file.

Solution 2:

self.settings is not yet initialized in __init__(). You can check self.settings in start_requests().

def start_requests(self):
    print(self.settings)

Solution 3:

I think if you want to access scrapy's settings.py then the answer from @Sellamani is good. But I guess name, allowed_domains and start_urls are not variables defined in settings.py. If you want the same kind of arrangement, make your own config file, yourown.cfg, like this:

[NAME]
crawler_name=electronics

[DOMAINS]
allowed_domains=http://example.com

and then in your program use the configparser module (ConfigParser in Python 2) like this to access yourown.cfg:

import configparser

config = configparser.ConfigParser()
config.read('yourown.cfg')  # assuming it is at the same location
# Both values are strings, so use get() rather than getint().
name = config.get('NAME', 'crawler_name')
allowed_domains = config.get('DOMAINS', 'allowed_domains')
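
As a quick sanity check, the snippet below writes yourown.cfg and reads it back with Python 3's configparser. Note that get() (not getint()) is needed here, because both values are strings:

```python
import configparser

# Write the config file the answer describes; the filename and
# values mirror the yourown.cfg example above.
with open('yourown.cfg', 'w') as f:
    f.write('[NAME]\ncrawler_name=electronics\n\n'
            '[DOMAINS]\nallowed_domains=http://example.com\n')

config = configparser.ConfigParser()
config.read('yourown.cfg')

# Values come back as strings, so use get() rather than getint().
name = config.get('NAME', 'crawler_name')
allowed_domains = config.get('DOMAINS', 'allowed_domains')
print(name)             # electronics
print(allowed_domains)  # http://example.com
```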
