Extremely slow crawling speed
« on: April 03, 2012, 07:44:00 PM »
Hi, my site took 52 hours to crawl through 260k pages. This is slower than I would like.

  • How can I increase the speed of crawling?
  • I see the feature 'Make a delay between requests, X seconds after each N requests'. What is the default value if left blank, and by what increments should I decrease the X seconds delay to boost the crawl speed?

Thank you.
« Last Edit: April 03, 2012, 07:46:41 PM by decemberflow »
Re: Extremely slow crawling speed
« Reply #1 on: April 03, 2012, 10:01:36 PM »

Default values for delays are blank, which means "no delays at all".

With a website of this size, the best option is to create a limited sitemap, with the "Maximum depth" or "Maximum URLs" option limited so that it gathers about 100-200,000 URLs. These would be the main pages, representing a "roadmap" sitemap for search engines.

The crawling time itself depends mainly on the website's page generation time, since the crawler goes through the site much like search engine bots do.
For instance, if it takes 1 second to retrieve every page, then 1000 pages will be crawled in about 16 minutes.
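
As a rough sketch of that arithmetic, assuming pages are fetched strictly one at a time with no delay between requests (the function names below are only for illustration and are not part of the generator):

```python
def estimate_crawl_seconds(pages, seconds_per_page):
    """Sequential crawl estimate: total time = pages * time per page."""
    return pages * seconds_per_page


def format_duration(total_seconds):
    """Render a number of seconds as 'Hh Mm', rounding down to whole minutes."""
    minutes = int(total_seconds // 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes}m"


# 1000 pages at 1 second each
print(format_duration(estimate_crawl_seconds(1000, 1.0)))     # 0h 16m
# 260,000 pages at the same assumed 1 second each
print(format_duration(estimate_crawl_seconds(260_000, 1.0)))  # 72h 13m
```

The estimate scales linearly, so halving the page generation time or the number of crawled URLs roughly halves the total.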

Some of the real-world examples of big db-driven websites:
about 35,000 URLs indexed - 1h 40min total generation time
about 200,000 URLs indexed - 38 hours total generation time
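
Dividing total time by URL count gives the average time per page behind these figures. This is only a back-of-the-envelope sketch: the helper name is made up for illustration, and the third entry uses the 52 hours for 260k pages reported in the first post.

```python
def seconds_per_page(total_seconds, pages):
    """Average time spent per crawled page."""
    return total_seconds / pages


# (total seconds, number of URLs) for each reported crawl
examples = {
    "35,000 URLs in 1h 40min": (1 * 3600 + 40 * 60, 35_000),
    "200,000 URLs in 38 hours": (38 * 3600, 200_000),
    "260,000 pages in 52 hours": (52 * 3600, 260_000),
}

for label, (secs, pages) in examples.items():
    print(f"{label}: {seconds_per_page(secs, pages):.2f} s/page")
```

All three work out to well under one second per page on average.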

With the "Max urls" option defined, it would be much faster than that.
Re: Extremely slow crawling speed
« Reply #2 on: May 19, 2017, 03:00:35 AM »
I am now getting 130000 pages in 23 minutes. The same 130000 pages used to take days.
The hint was in Narrow Indexed Pages Set. Try the Exclusion preset. I am now using this software without a hassle.