XML Sitemaps Generator

Author Topic: Problems with first Crawl

mike59

  • Registered Customer
  • Approved member
  • *
  • Posts: 9
Problems with first Crawl
« on: July 28, 2009, 07:58:51 PM »
Hi,

I'm new and just bought the sitemap program.  Right now I have 16,000 pages, and I expect that to grow by about 500-1000 per day.  Anyway, I just tried to crawl and it cannot make it through.  It just stops with the following (and it's different every time):

Links depth: 2
Current page: tag/credit/
Pages added to sitemap: 495
Pages scanned: 1620 (60,370.3 KB)
Pages left: 4571 (+ 5817 queued for the next depth level)
Time passed: 1:11:17
Time left: 3:21:08
Memory usage: 4,755.7 Kb

What do I need to do so it can run all the way through?  More memory, more speed, etc?  The problem is that I'm using shared hosting, so I do not have access to any configs.

XML-Sitemaps Support

  • Administrator
  • Hero Member
  • *****
  • Posts: 10622
Re: Problems with first Crawl
« Reply #1 on: July 29, 2009, 03:57:42 AM »
Hello,

I'd recommend adding this to the "Do not parse" option first:
Code:
tag/
feed/
Oleg Ignatiuk
www.xml-sitemaps.com
Send me a Private Message

For maximum exposure and traffic for your web site check out our additional SEO Services.
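To illustrate why entries such as tag/ and feed/ can shorten a crawl, here is a rough Python sketch. It is not the generator's code. It assumes that a "Do not parse" entry matches any URL containing that text, and that matching pages are still listed but never fetched for links. The SITE link graph and the crawl function are made up for the example.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for a real site.
SITE = {
    "": ["post-1/", "tag/credit/", "feed/"],
    "post-1/": ["tag/credit/", "post-2/"],
    "post-2/": ["tag/loans/"],
    "tag/credit/": ["post-1/", "tag/credit/page/2/"],
    "tag/credit/page/2/": ["post-2/"],
    "tag/loans/": ["post-2/"],
    "feed/": ["post-1/", "post-2/"],
}


def crawl(site, do_not_parse=()):
    """Breadth-first crawl; returns (sorted URLs found, number of pages fetched)."""
    seen = {""}
    queue = deque([""])
    fetched = 0
    while queue:
        url = queue.popleft()
        # Assumed rule: a URL containing any pattern is listed but not fetched.
        if any(pattern in url for pattern in do_not_parse):
            continue
        fetched += 1
        for link in site.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen), fetched
```

In this toy model, crawl(SITE) fetches every page, while crawl(SITE, ("tag/", "feed/")) fetches only the home page and the two posts. The trade-off is that a page linked only from an unparsed page (here tag/credit/page/2/) is never discovered.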

mike59

Re: Problems with first Crawl
« Reply #2 on: July 30, 2009, 12:55:56 AM »
That has helped, thanks so much.  I had to run the program 4 times manually before it finally completed.  The problem, though, is that the crawl didn't obtain all my URLs.  It only indexed 3,000 out of 18,000.  What can I do to fix this?

XML-Sitemaps Support

Re: Problems with first Crawl
« Reply #3 on: July 30, 2009, 06:46:50 AM »
Hello,

Could you please PM me your generator URL and an example URL that is not included in the sitemap, and explain how it can be reached from the homepage?
Oleg Ignatiuk

mike59

Re: Problems with first Crawl
« Reply #4 on: July 31, 2009, 03:37:22 AM »
Sure, I PM'd all the info.  Just curious, is there any way to enable logging to see where the problem lies?  Also, I was curious whether your software supports remote crawls.  I was thinking that if I created a separate hosting account just for your software and then crawled my site remotely, resources (or whatever it is) might be saved for the site itself.  Does your software support this?

XML-Sitemaps Support

Re: Problems with first Crawl
« Reply #5 on: July 31, 2009, 12:23:45 PM »
Hello,

In most cases this issue is resolved by analyzing the site structure and optimizing the crawler settings with the "Exclude URLs" and "Do not parse" options.
Yes, it is possible to crawl the site from a remote account, but the resulting sitemap files will then have to be moved manually to the main server where the site is hosted.
Oleg Ignatiuk

mike59

Re: Problems with first Crawl
« Reply #6 on: August 02, 2009, 03:49:32 AM »
Would someone be able to recommend a hosting company that doesn't limit PHP scripts and is also affordable?  My host says they do not, but I cannot figure out why the script just stops.  I need to be able to build my sitemap.  Is there anyone who does not have this issue?  If so, what host do you use?

mike59

Re: Problems with first Crawl
« Reply #7 on: August 28, 2009, 03:04:58 PM »
OK, I moved my site to a full dedicated server (outgrew shared hosting).  Now the software is still crawling my site, but it stops every so often.  Is there any way we can set the software to continue without human intervention?  For example, after two days the software stopped for some reason, and I had to manually kick it off again.  Is there a way I can set it to continue once it detects that the script has stopped?  So far my first crawl has been running for 90+ hours (and I had to manually restart it twice).

Links depth: 4
Current page: friendship-month-%e2%80%93-the-joys-of-friendship/
Pages added to sitemap: 20847
Pages scanned: 26060 (963,508.8 KB)
Pages left: 5131 (+ 8691 queued for the next depth level)
Time passed: 90:21:10
Time left: 17:47:22
Memory usage: 24,787.5 Kb

XML-Sitemaps Support

Re: Problems with first Crawl
« Reply #8 on: August 29, 2009, 02:06:53 AM »
Hello,

Yes, you can set up a daily scheduled task (cron job) for the sitemap generator in the hosting control panel, and it will automatically resume generation if it has stopped.
Oleg Ignatiuk
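
As a rough picture of what "resume" can mean for a scheduled crawl, here is a minimal Python sketch that saves the pending queue to disk after each run and picks it up again on the next one. How the generator actually stores its progress is not stated in this thread. The state file layout, the links dictionary and the budget parameter are all assumptions made for the example.

```python
import json
import os


def crawl_step(state, links, budget):
    """Process up to `budget` queued pages, updating `state` in place."""
    while state["queue"] and budget > 0:
        url = state["queue"].pop(0)
        state["done"].append(url)
        for link in links.get(url, []):
            if link not in state["done"] and link not in state["queue"]:
                state["queue"].append(link)
        budget -= 1
    return not state["queue"]  # True once nothing is left to crawl


def run(state_path, links, budget):
    """Load saved progress if present, crawl a little, then save progress again."""
    if os.path.exists(state_path):
        with open(state_path) as fh:
            state = json.load(fh)
    else:
        # Hypothetical starting point: the home page, written as "".
        state = {"queue": [""], "done": []}
    finished = crawl_step(state, links, budget)
    with open(state_path, "w") as fh:
        json.dump(state, fh)
    return finished, len(state["done"])
```

Each call to run stands in for one scheduled session: a session that stops early loses nothing already written to the state file, and the next session continues from the saved queue instead of starting over.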

mike59

Re: Problems with first Crawl
« Reply #9 on: August 31, 2009, 09:00:18 PM »
Great news.  So if the job is running fine and the cron job kicks off, would this cause the job to fail since it's already running?  Just checking, because I was thinking about having it run every 6 hours or so.

XML-Sitemaps Support

Re: Problems with first Crawl
« Reply #10 on: September 01, 2009, 10:54:48 AM »
Yes, it will check whether another job is running and will skip the session in that case.
Oleg Ignatiuk
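
One common way to get this "skip if already running" behaviour is a lock file. The Python sketch below shows that general technique only. It is not taken from the generator, whose actual check is not described here. It relies on fcntl, so it runs on Unix-like systems only, and the names try_lock and run_session are made up for the example.

```python
import fcntl


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_session(lock_path, crawl):
    """Run crawl() unless another session holds the lock; report what happened."""
    lock = try_lock(lock_path)
    if lock is None:
        return "skipped"
    try:
        crawl()
        return "ran"
    finally:
        lock.close()  # closing the file releases the lock
```

With a guard like this, a cron entry that fires every 6 hours is harmless while a crawl is in progress: the new session returns "skipped" and exits. Because the operating system drops the lock when the holding process dies, a crawl that stops unexpectedly does not block the next scheduled run.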

mike59

Re: Problems with first Crawl
« Reply #11 on: September 02, 2009, 09:11:07 PM »
OK, well, it now completes the crawl (very quickly), but it only reports 7,956 URLs rather than my 55,000-plus URLs.  I ran this several times with the same results.  Have you seen this before?  How can I fix this?

XML-Sitemaps Support

Re: Problems with first Crawl
« Reply #12 on: September 02, 2009, 09:33:30 PM »
Do you have an example URL that is not included in the sitemap?
Oleg Ignatiuk

mike59

Re: Problems with first Crawl
« Reply #13 on: September 03, 2009, 03:06:36 PM »
One of the URLs is

[external links are visible to admins only]

XML-Sitemaps Support

Re: Problems with first Crawl
« Reply #14 on: September 03, 2009, 09:16:02 PM »
Please let me know your generator login/password.
Oleg Ignatiuk

 
