speed of crawler issue
« on: March 01, 2006, 09:59:46 AM »
Hi,

After much fiddling around with my server, installing PHP and so forth, I managed to get the software crawling. My stupid hosting company does not allow me to edit file permissions on my site (they say it infringes their server security).
It seems to be taking ages to crawl my 150,000 pages; it's been running for about 18 hours now and has only crawled 80,000 pages.
I do intend to have over 100,000,0 pages soon, so as it is set up now, will it take weeks to complete? I have read a few posts on this forum where you explain that it comes down to the speed of your web host (how fast it can read a page). It completes the first 40k within a few hours, then it gradually gets slower and slower, so I would say it's down to the speed of the computer this software is running on. That computer is a dual P4 3.2 GHz with 1 GB RAM, and the crawl is using up a lot of processor time.
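
As a rough sanity check on the figures above, here is a minimal Python sketch that turns "80,000 pages in 18 hours" into an average rate and extrapolates it to 150,000 pages. The function names are made up for illustration, and the estimate assumes the rate stays constant, which the slowdown described above suggests it does not, so treat the result as a lower bound.

```python
def average_rate(pages_done, hours_elapsed):
    """Average crawl speed so far, in pages per hour."""
    return pages_done / hours_elapsed


def hours_for(total_pages, pages_per_hour):
    """Hours needed for total_pages if the rate never changed (optimistic)."""
    return total_pages / pages_per_hour


# Figures taken from the post: 80,000 pages crawled in about 18 hours.
rate = average_rate(80_000, 18)
total_hours = hours_for(150_000, rate)

print(f"average rate: {rate:.0f} pages/hour")
print(f"150,000 pages at that rate: {total_hours:.2f} hours")
```

At a constant rate the full 150,000 pages would take about 33.75 hours in total; a crawl that keeps slowing down will take longer than that.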

I have tried to speed it up by using the parse URLs box, but I think I am using the wrong syntax. Could you please tell me what I would type in that box if I only wanted it to index all the links on [ External links are visible to forum administrators only ] but not follow the links to all the 150,000 pages?

Many thanks
Edward Long
Re: speed of crawler issue
« Reply #1 on: March 01, 2006, 01:08:24 PM »
Hi Edward,

To speed things up, on the configuration page there is a box that says "Do Not Parse URLs:"

If you put this in

price-breaker-

then it will include all of the pages that you have on the price-breaker.co.uk/data page without actually loading them.
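
To illustrate the idea, here is a minimal Python sketch of how a substring filter like this could behave. It is not the crawler's actual code: it assumes the box does plain substring matching, and the function names and the two page file names are hypothetical.

```python
def should_parse(url, do_not_parse):
    """Return False when the URL contains any configured substring."""
    return not any(fragment in url for fragment in do_not_parse if fragment)


def plan_crawl(urls, do_not_parse):
    """Split URLs into pages to fetch and pages to list without loading."""
    fetch, list_only = [], []
    for url in urls:
        if should_parse(url, do_not_parse):
            fetch.append(url)
        else:
            list_only.append(url)
    return fetch, list_only


# Value typed into the "Do Not Parse URLs:" box.
do_not_parse = ["price-breaker-"]

# The two page file names below are hypothetical examples.
urls = [
    "http://www.price-breaker.co.uk/data/",
    "http://www.price-breaker.co.uk/data/price-breaker-item-1.html",
    "http://www.price-breaker.co.uk/data/price-breaker-item-2.html",
]

fetch, list_only = plan_crawl(urls, do_not_parse)
print("fetch and parse:", fetch)
print("include without loading:", list_only)
```

Note that the trailing hyphen matters in this sketch: "price-breaker-" does not occur in the domain name itself, so the /data page is still fetched and parsed while the individual pages are only listed.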

Philip Nicosia