Very Slow on Datafeed site
« on: June 09, 2010, 07:37:44 PM »
Any suggestions for making the crawl a lot faster?  I excluded certain strings that I don't want in the sitemap, hoping that would speed it up, but it didn't get much better.  The site has around 50k pages.

I used the program on another site without a datafeed that was around 700 pages, and it worked like a dream.

Any suggestions are helpful.

Thanks,
Jeremy

Re: Very Slow on Datafeed site
« Reply #1 on: June 10, 2010, 12:08:30 PM »
Hello,

You can use the "Exclude URLs" setting to avoid crawling pages that you don't need in the sitemap.
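
To illustrate why excluding URLs can shorten a crawl, here is a minimal sketch in Python. It assumes the exclusion works as a plain substring match on the URL, which is an assumption and not confirmed behavior of the generator; the function name, the example URLs, and the exclude strings are all hypothetical.

```python
def should_crawl(url, exclude_substrings):
    """Return True if the URL contains none of the excluded substrings.

    Assumption: exclusion is a plain substring match. The real
    "Exclude URLs" setting may match differently.
    """
    return not any(fragment in url for fragment in exclude_substrings)


# Hypothetical exclude strings and URLs, for illustration only.
exclude_substrings = ["sort=", "sessionid=", "/print/"]
candidate_urls = [
    "http://example.com/products/widget",
    "http://example.com/products/widget?sort=price",
    "http://example.com/print/widget",
    "http://example.com/about",
]

# Only URLs that pass the filter would be fetched, so every excluded
# URL saves one page request.
urls_to_fetch = [u for u in candidate_urls if should_crawl(u, exclude_substrings)]
print(urls_to_fetch)
```

An excluded page is never requested, so links that appear only on that page are never discovered either. On a datafeed site with many sorted or filtered variants of the same listing, that can remove a large share of the requests.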
Re: Very Slow on Datafeed site
« Reply #2 on: June 10, 2010, 07:15:09 PM »
Thank you Oleg.

Right now I have it running in the "background" and am unable to access the Crawl tab to see the status.  I am assuming it is still running (it has been about 24 hours), so I would love to see how far it has gotten.

I did use an exclude string.  Does that make the overall crawl faster?

thanks,
Jeremy
Re: Very Slow on Datafeed site
« Reply #3 on: June 10, 2010, 08:38:04 PM »
After making changes to the configuration you should restart the generator. If you cannot open the crawling page, please try closing the browser and reopening it.
Re: Very Slow on Datafeed site
« Reply #4 on: June 16, 2010, 10:11:01 PM »
Hi Oleg,

The crawl has been running for roughly 7 days now and is up to 145k pages.  I'm guessing some pages are getting spidered that I don't want.  How can I take what I have so far as the sitemap.xml while I make adjustments to hopefully make it go a lot faster?  Is this the normal pace for this many pages?

Thanks,
Jeremy
Re: Very Slow on Datafeed site
« Reply #5 on: June 16, 2010, 10:39:34 PM »
Hello,

You can set the "maximum URLs" setting in the "crawler limitations" section to 145000 and start the generator.

The crawling time depends mainly on how long the website takes to generate each page, since the generator crawls the site much like search engine bots do.
For instance, if it takes 1 second to retrieve every page, then 1000 pages will be crawled in about 17 minutes (1000 seconds).
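
The same arithmetic can be sketched in Python as a rough estimate. It assumes pages are fetched one at a time with no other overhead; the function names are illustrative and not part of the generator.

```python
def estimate_crawl_seconds(page_count, seconds_per_page):
    """Total time for a sequential crawl, assuming one page at a time."""
    return page_count * seconds_per_page


def average_seconds_per_page(elapsed_seconds, pages_crawled):
    """Average retrieval time implied by an observed crawl."""
    return elapsed_seconds / pages_crawled


def format_duration(total_seconds):
    """Render a number of seconds as days, hours, and minutes."""
    days, remainder = divmod(int(total_seconds), 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes = remainder // 60
    return f"{days}d {hours}h {minutes}m"


# The example from the post: 1000 pages at 1 second each.
print(format_duration(estimate_crawl_seconds(1000, 1)))

# The figures reported in this thread: roughly 7 days for 145k pages.
seven_days = 7 * 86400
print(round(average_seconds_per_page(seven_days, 145000), 2), "seconds per page")
```

With the figures reported above (roughly 7 days for 145k pages), the average works out to a little over 4 seconds per page, so the pace is consistent with slow page generation rather than a stalled crawl.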
Re: Very Slow on Datafeed site
« Reply #6 on: June 16, 2010, 10:45:23 PM »
Do I interrupt the crawl, set the cap to 145k, then continue?

Or, can I just update the configuration while it is running?