
How long will it take to crawl a site with over 100K pages?

Started by am1, February 17, 2012, 01:20:26 PM


am1

Purchased the standalone version on the 15th and installed it right away.

Started the crawl and it is far from complete...

Here's the info

Already in progress. Current process state is displayed:
Links depth: 4
Current page: Venues/Southeast-Texas-Entertainment-Complex-Tickets.html
Pages added to sitemap: 38220
Pages scanned: 39180 (552,008.5 KB)
Pages left: 5388 (+ 6846 queued for the next depth level)
Time passed: 8:14:08
Time left: 1:07:57
Memory usage: 85,702.9 Kb
Auto-restart monitoring: Fri Feb 17 08:18:42 EST 2012 (38 second(s) since last update)
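
As a rough back-of-the-envelope check on those figures (assuming the crawl rate stays constant, which it rarely does on large sites), the sketch below estimates the total crawl time; the numbers are taken straight from the status output above:

<?php
// Rough crawl-time estimate from the status figures above.
// Assumes a constant crawl rate; in practice it slows as the queue grows.
$pagesScanned   = 39180;
$elapsedSeconds = 8 * 3600 + 14 * 60 + 8;          // "Time passed: 8:14:08" = 29648 s
$rate           = $pagesScanned / $elapsedSeconds;  // ~1.32 pages per second

$targetPages    = 100000;                           // site size from the topic title
$estimatedHours = $targetPages / $rate / 3600;      // ~21 hours of uninterrupted crawling
printf("%.2f pages/s -> about %.0f hours for %d pages\n", $rate, $estimatedHours, $targetPages);
?>

So even without any stalls, a 100K+ page site at this rate needs on the order of a day of continuous crawling.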

Below the progress display it states:

Fri Feb 17 08:07:32 EST 2012: resuming generator (121 seconds with no response)

Please advise

TY

am1

What would be the memory limit recommended for a site with over 100K pages?

Currently set at 60
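
Assuming the "60" refers to PHP's memory_limit in megabytes (an assumption; it may instead be the generator's own memory setting), the usual first step for a 100K+ page crawl is to raise the limits in php.ini, roughly along these lines:

; php.ini - minimal sketch, assuming the host allows these overrides
memory_limit = 256M        ; headroom for a crawl queue of 100K+ URLs
max_execution_time = 0     ; 0 = no hard time limit for the long-running crawl script

Whether the hosting plan actually honours these values is another matter; on shared hosting they are often capped.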

[ External links are visible to forum administrators only ]

Any other settings that should be changed so the script can work at optimal performance?

Please advise



am1

Private message sent with URL/login.

[ External links are visible to forum administrators only ]

am1

php.ini updated to 256M

Still have the same issue

Mon Feb 20 14:55:48 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:53:48 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:51:48 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:49:48 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:47:48 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:45:48 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:43:47 EST 2012: resuming generator (120 seconds with no response)
Mon Feb 20 14:41:47 EST 2012: resuming generator (120 seconds with no response)

Five days of crawling and the site is still not even close to being completely crawled

Please advise
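
That log pattern suggests each crawl run is stalling or being killed within about two minutes, and the built-in monitor keeps relaunching it. Conceptually, such an auto-restart monitor works like the sketch below; this is purely illustrative, and the file path and resume URL are hypothetical, not the generator's actual internals:

<?php
// Illustrative watchdog sketch (not the generator's real code):
// if the crawler's progress file has not been updated recently, trigger a resume.
$progressFile = '/path/to/generator/data/crawl_progress.log';  // hypothetical path
$staleAfter   = 120;                                            // seconds, matching the log above

$lastUpdate = file_exists($progressFile) ? filemtime($progressFile) : 0;
if (time() - $lastUpdate > $staleAfter) {
    echo date('D M j H:i:s T Y') . ': resuming generator (' . (time() - $lastUpdate) . " seconds with no response)\n";
    // Relaunch the crawl in resume mode; the query string is hypothetical.
    file_get_contents('http://example.com/generator/index.php?op=crawl&resume=1');
}
?>

If the restarts repeat every ~120 seconds like this, the underlying crawl process is likely being cut off by a server-side limit (execution time, memory, or a process killer) before it can make much progress, so raising memory_limit alone may not be enough.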

am1

Memory limit set at 256M and still not able to crawl the site completely

Please advise


am1

The site has over 200K pages... only 43511 indexed


Links depth: 3
Current page: 1791169/cleveland-cavaliers-vs-milwaukee-bucks-tickets.html
Pages added to sitemap: 28600
Pages scanned: 29020 (768,064.7 KB)
Pages left: 61963 (+ 11937 queued for the next depth level)
Time passed: 3:47:15
Time left: 8:05:13
Memory usage: 123,780.1 Kb
Auto-restart monitoring: Thu Feb 23 20:33:17 EST 2012 (23 second(s) since last update)
Resuming the last session (last updated: 2012-02-23 02:00:08)

Just here at links depth 3 it shows 61963 pages left to go with 29020 already scanned,
but it never completes
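
One way to read those numbers, assuming "Pages left" only counts the current depth level (which is how the status line reads), is that the crawler keeps discovering new work faster than the estimate can settle:

<?php
// URL totals implied by the depth-3 status above.
$scanned       = 29020;
$leftThisDepth = 61963;
$queuedNext    = 11937;
$knownSoFar    = $scanned + $leftThisDepth + $queuedNext;  // 102920 URLs discovered so far
echo "URLs known at depth 3: $knownSoFar\n";
// Deeper levels keep adding to the queue, so on a 200K+ page site the
// "Time left" figure grows for a long while before it starts to shrink.
?>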

I will be glad to pay you to install it if you think it will help

Otherwise I will have to ask for a refund

Please advise

XML-Sitemaps Support

Hello,

Could you please PM me your generator URL, an example URL that is not included in the sitemap, and how it can be reached starting from the homepage?

am1

I already sent a PM with the details to investigate the issue over a week ago.

[ External links are visible to forum administrators only ]

Access info already sent as a PM

[ External links are visible to forum administrators only ]

and all the pages in this folder are missing from the sitemap, as well as many other pages

These pages are linked from the home page


am1

Now it is.

Since yesterday, due to an issue with Google...