am1

Re: How long will it take to crawl a site with over 100K pages?
« Reply #15 on: February 28, 2012, 11:35:44 PM »
Please read below

Links depth: 3
Current page: 1726189/disney-on-ice-toy-story-3-tickets.html
Pages added to sitemap: 33149
Pages scanned: 34080 (933,354.9 KB)
Pages left: 56903 (+ 16662 queued for the next depth level)
Time passed: 5:04:06
Time left: 8:27:46
Memory usage: 132,785.0 Kb
Auto-restart monitoring: Tue Feb 28 18:31:05 EST 2012 (34 second(s) since last update)
Resuming the last session (last updated: 2012-02-26 16:08:20)

Links depth: 3
34080 pages scanned
Pages left: 56903
+ 16662 queued for the next depth level

By my math that comes to 107645 pages, and that's just at Links depth: 3

So.... if after over a week the sitemap only has 43K pages listed, something is not right

Or please correct me if I am wrong...

Please advise
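
For reference, the total quoted above is the sum of the three counters in the crawler's status output. A quick check in Python (the variable names are illustrative and are not part of the sitemap generator):

```python
# Counters copied from the status output quoted in the post above.
pages_scanned = 34080
pages_left = 56903
queued_next_depth = 16662

# Total number of URLs the crawler has detected so far at links depth 3.
total_detected = pages_scanned + pages_left + queued_next_depth
print(total_detected)  # 107645
```
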
Re: How long will it take to crawl a site with over 100K pages?
« Reply #16 on: March 01, 2012, 08:29:34 AM »
The sitemap generator also reads robots.txt, so it won't index pages that are disallowed there.
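
One way to check whether a given page is excluded by robots.txt is Python's standard `urllib.robotparser` module. This is only a sketch: the rules and the example.com URLs below are made-up placeholders, and the generator's own robots.txt handling may differ in details.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_lines, url, user_agent="*"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as a list of lines,
    # so no network request is needed for this check.
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only.
example_rules = [
    "User-agent: *",
    "Disallow: /private/",
]

blocked = is_crawlable(example_rules, "https://example.com/private/page.html")
allowed = is_crawlable(example_rules, "https://example.com/tickets/page.html")
print(blocked, allowed)
```

To test a real site, the lines of its actual robots.txt would be passed in place of `example_rules`.
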

am1

Re: How long will it take to crawl a site with over 100K pages?
« Reply #17 on: March 01, 2012, 01:07:36 PM »
Your answer is common knowledge but you still didn't answer my question.


am1

Re: How long will it take to crawl a site with over 100K pages?
« Reply #19 on: March 01, 2012, 02:51:06 PM »
I need a solution or a refund!

You still didn't answer my question.

Over a week trying to create a sitemap that includes all the pages on the site with no success.

The site has over 100,000 pages, but your script has only indexed 43,000 of them.

Please offer a solution or issue a refund!

« Last Edit: March 01, 2012, 02:54:04 PM by am1 »

am1

Re: How long will it take to crawl a site with over 100K pages?
« Reply #21 on: March 01, 2012, 08:29:55 PM »
The site has over 100k pages, how do you expect me to figure that out????

Can you please explain how your script shows

34080 pages scanned
56903 Pages left
16662 queued for the next depth level

and after a week only 43K pages have been added.

Can you please explain this?

This issue was posted on Feb 17th; it is now March 1st and you have yet to provide a reasonable answer.

Please provide a solution to get all the pages indexed in the sitemap or refund my money!
Re: How long will it take to crawl a site with over 100K pages?
« Reply #22 on: March 01, 2012, 08:34:49 PM »
"Pages left" doesn't mean that those pages will be added to the sitemap. It is the total number of pages that were detected. For instance, a page that is ruled out by robots.txt won't be included.

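The status output quoted earlier in the thread already shows a gap between scanned and added pages. A small calculation with those figures (variable names are illustrative only; the reason each page was skipped is not visible in the output):

```python
# Figures copied from the crawler status output quoted earlier in the thread.
pages_scanned = 34080
pages_added = 33149

# Pages that were scanned but did not end up in the sitemap.
not_added = pages_scanned - pages_added
share_added = pages_added / pages_scanned

print(not_added)  # 931
print(f"{share_added:.1%} of scanned pages were added")
```
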
am1

Re: How long will it take to crawl a site with over 100K pages?
« Reply #23 on: March 01, 2012, 11:45:43 PM »
The pages I am referring to are not ruled out by robots.txt

The pages inside the folder I mentioned before were added to the original 100K+ pages

You are just giving me the runaround and turning me into a very upset customer

Bad business!!!!

Please provide a solution or issue a refund.

am1

Re: How long will it take to crawl a site with over 100K pages?
« Reply #24 on: March 02, 2012, 03:52:50 PM »
More errors

An error occured
There was an error while retrieving the URL specified: [ External links are visible to forum administrators only ]
HTTP headers follow:

HTTP output:

Re: How long will it take to crawl a site with over 100K pages?
« Reply #25 on: March 03, 2012, 10:20:48 AM »
Quote
The pages inside the folder I mentioned before were added to the original 100K+ pages
As I have mentioned a few times in this topic already, I need an example URL that is not included in the sitemap to be able to check this further. Otherwise I cannot help.
Please send me details via private message.
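
A rough way to find such example URLs is to compare a list of pages known to exist against the `<loc>` entries of the generated sitemap. The sketch below assumes a standard sitemaps.org XML file; the function names and the example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_in_sitemap(sitemap_xml):
    """Return the set of URLs listed in <loc> elements of a sitemap string."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def missing_from_sitemap(expected_urls, sitemap_xml):
    """Return the expected URLs that the sitemap does not list, sorted."""
    return sorted(set(expected_urls) - urls_in_sitemap(sitemap_xml))


# Made-up data, for illustration only.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a.html</loc></url>
  <url><loc>https://example.com/b.html</loc></url>
</urlset>"""

known_pages = [
    "https://example.com/a.html",
    "https://example.com/b.html",
    "https://example.com/c.html",
]

missing = missing_from_sitemap(known_pages, sample_sitemap)
print(missing)
```

Any URL this reports could then be sent as the example requested above. For a large site the sitemap may be split into several files, in which case each file would need to be read and the URL sets merged.
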