XML Sitemaps Generator

Author Topic: Using the generator for cache renewal  (Read 2383 times)

ville

  • Registered Customer
  • Approved member
  • *
  • Posts: 2
Using the generator for cache renewal
« on: April 19, 2016, 11:44:17 AM »
Hello everyone!

I've been using the generator for several years now. Our website has around 50k pages and is quite database-intensive (an advanced webshop).

I have developed a pretty nice cache for our website which currently operates in lazy mode: whenever a page is requested, a cached version is stored for future requests.

Currently we flush the cache every Friday and run the generator straight after to prime the cache once, but this can cause some pretty heavy load, and users will also experience a decrease in site speed due to hitting non-cached pages.

Now I came up with the idea of allowing the generator to set a session variable ($_SESSION['cache'] = true, for example) when it runs. This would allow the site to always give the generator a fresh version and re-prime the cache. Naturally, we wouldn't have to flush the cache "ever again", so users would have a 100% cache hit rate to an always fresh page.

In my opinion a session variable would be best, as it wouldn't mess with the URL structures at all (as opposed to a GET parameter, for example).
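The idea above could look something like this; a minimal sketch only, assuming the generator and the site share a PHP session. The array cache and the function name `serve()` are stand-ins for the real page cache, not anything from the generator:

```php
<?php
// Sketch of the session-flag idea: a crawler session with
// $_SESSION['cache'] = true forces a fresh render, which also
// re-primes the cache entry. Plain array stands in for the real cache.

$cache = [];

function serve(array &$cache, string $uri, bool $crawlerPriming): string
{
    static $renderCount = 0;
    if ($crawlerPriming || !isset($cache[$uri])) {
        // Render fresh and (re)prime the cache entry.
        $cache[$uri] = "copy #" . ++$renderCount . " of $uri";
    }
    return $cache[$uri];
}

// Normal visitor: lazy fill on a miss, then cache hits.
$first  = serve($cache, '/product/42', false);
$second = serve($cache, '/product/42', false);

// Generator run: the session flag would make $crawlerPriming true here,
// forcing a re-render even though a cached copy exists.
$primed = serve($cache, '/product/42', true);
```

The catch, of course, is that the crawler actually has to carry the same session cookie as the pages it hits, which is why a session-free signal might be simpler in practice.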

Opinions or suggestions on how to approach this idea?

PS. I know I can hard-code something into the current source, but I'd rather have a solution that works out of the box after I update to a new version of the generator.
« Last Edit: April 19, 2016, 11:46:09 AM by ville »

ville

  • Registered Customer
  • Approved member
  • *
  • Posts: 2
Re: Using the generator for cache renewal
« Reply #1 on: April 25, 2016, 08:43:48 AM »
I solved this with the xs_crawl_ident name in the configuration, checking for it in PHP to determine whether a fresh page needs to be served. This is an easy solution, but for future updates, an actual field in the settings might be a useful shortcut.

Also, setting something once per crawler session vs. checking and setting on every page serve is unnecessary overhead. I admit the performance effect is close to zero, but a direct setting would still be more efficient, as long as the crawler and the crawled pages share a session =)

 

SMF 2.0.12 | SMF © 2014, Simple Machines