Django Caching - Can It Be Done Pre-emptively?
Solution 1:
So you want to schedule something to run at a regular interval? At the cost of some CPU time, you can use this simple app.
Alternatively, if you can use it, the cron job for every 5 minutes is:
*/5 * * * * /path/to/project/refresh_cache.py
Web hosts provide different ways of setting these up. For cPanel, use the Cron Manager. For Google App Engine, use cron.yaml
. For all of these, you'll need to set up the environment in refresh_cache.py
first.
By the way, responding to a user's request is considered lazy caching. This is pre-emptive caching. And don't forget to cache long enough for the page to be recreated!
Solution 2:
"I'm still unsure as to how I accomplish this with the python script I will be calling. "
The issue is that your "significant delay of a few seconds while I go to the external site to parse the new data" has nothing to do with Django cache at all.
You can cache it everywhere, and when you go to reparse the external site, there's a delay. The trick is to NOT parse the external site while a user is waiting for their page.
The trick is to parse the external site before a user asks for a page. Since you can't go back in time, you have to periodically parse the external site and leave the parsed results in a local file or a database or something.
When a user makes a request you already have the results fetched and parsed, and all you're doing is presenting.
Solution 3:
I have no proof, but I've read BeautifulSoup is slow and consumes a lot of memory. You may want to look at using the lxml module instead. lxml is supposed to be much faster and efficient, and can do much more than BeautifulSoup.
Of course, the parsing probably isn't your bottleneck here; the external I/O is.
First off, use memcached!
Then, one strategy that can be used is as follows:
- Your cached object, called
A
, is stored in the cache with a dynamic key (A_<timestamp>
, for example). - Another cached object holds the current key for
A
, calledA_key
. - Your app would then get the key for
A
by first getting the value atA_key
- A periodic process would populate the cache with the
A_<timestamp>
keys and upon completion, change the value atA_key
to the new key
Using this method, all users every 5 minutes won't have to wait for the cache to be updated, they'll just get older versions until the update happens.
Solution 4:
You can also use a python script to call your view and write it to a file, then deliver it staticaly with lightpd for example :
request = HttpRequest()
request.path = url # the url of your view
(detail_func, foo, params) = resolve(url)
params['gmap_key'] = settings.GMAP_KEY_STATIC
detail = detail_func(request, **params)
out = open(dir + "index.html", 'w')
out.write(detail.content)
out.close()
then call your script with a cron
Post a Comment for "Django Caching - Can It Be Done Pre-emptively?"