XML Sitemaps Generator

Sitemap Generator Forum
Pages: [1] 2 3
Topic: Crawl Issues _ HELP!
scottsaxton
Registered Customer
Jr. Member
Posts: 18
« on: March 06, 2008, 01:53:02 PM »

Hello,

I am new here.  Yesterday I purchased sitemap generator and have had some luck with it, but I am also still running into issues.

It seems to get caught up in my photo gallery.  If I do a crawl of only a few layers, the crawl completes successfully, but if I try to do 7 or 8 layers, it never finishes and I don't see any error explaining why.

I have tried messing around with a maximum run time and maximum page count, etc...but no luck as of yet.

Any ideas would be useful.

Thanks
admin
Administrator
Hero Member
Posts: 2837
« Reply #1 on: March 06, 2008, 10:39:16 PM »

Hello,

Does the sitemap generator stop, or does it keep working/crawling your site?

scottsaxton
« Reply #2 on: March 07, 2008, 03:09:04 PM »

Sometimes the crawling screen comes back (like when you first start it); other times it just gets stuck at, say, level 6 or 7 and doesn't indicate anything.  I have also seen an XML error in the sitemap.xml file, at the very bottom.
admin
« Reply #3 on: March 07, 2008, 11:04:33 PM »

So, the sitemap is created successfully? I.e., do you see new entries on the change log page?

scottsaxton
« Reply #4 on: March 10, 2008, 01:00:06 PM »

No, when I am experiencing problems, I don't see an update to the change log.
admin
« Reply #5 on: March 10, 2008, 11:05:48 PM »

Hello,

It looks like your server configuration doesn't allow the script to run long enough to create the full sitemap. Please try increasing the memory_limit and max_execution_time settings in the PHP configuration at your host (php.ini file), or contact hosting support regarding this.
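
For reference, a php.ini fragment raising both settings might look like the one below. The numbers are illustrative assumptions, not values recommended anywhere in this thread; suitable limits depend on the size of the site and on what the host permits.

```ini
; Illustrative values only (assumed, not taken from this thread).

; Maximum time a single script may run, in seconds.
max_execution_time = 3600

; Maximum amount of memory a single script may allocate.
memory_limit = 128M
```

Depending on how PHP is run on the server, the web server may need a restart before the new values take effect.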

scottsaxton
« Reply #6 on: March 14, 2008, 01:24:55 PM »

So I am waiting to hear from my web host, but it appears that I don't have access to my php.ini file.

Here's the deal.  I went through my config and set as shown in attachment.

I started the crawl with the run in background option selected.

It updated the status up to level 7, but then sat and sat and sat until it finally gave me the message to restart the interrupted session.

The change log did not update and the request date did not change.

Any suggestions other than the php.ini file?
da_lyman
Registered Customer
Jr. Member
Posts: 18
« Reply #7 on: March 14, 2008, 04:30:04 PM »

You can create your own php.ini file, put it in the root of your web site, and make the needed changes there.

My problem is that the program hangs on level 3 and says it's crawling the same page, then times out and says to start over. So I try to start over and it hangs again.

Links depth: 3
Current page: products.cfm/action/mfgdisplay/start/81/display/10/CategoryName/Control-Circuit-and-Protection/mfgname/Federal-Pacific-Electric-(FPE)/PageNum/9
Pages added to sitemap: 698
Pages scanned: 720 (20,601.2 Kb)
Pages left: 1312 (+ 2724 queued for the next depth level)
Time passed: 41:40
Time left: 75:57
Memory usage: 3,890.7 Kb

When I go into the analyze portion, it gives me the error:

Warning:  ksort() expects parameter 1 to be array, null given in D:\inetpub\overstockelectrical\generator\pages\page-analyze.inc.php(2) : eval()'d code on line 55

Warning:  Invalid argument supplied for foreach() in D:\inetpub\overstockelectrical\generator\pages\page-analyze.inc.php(2) : eval()'d code on line 57

SOMEONE HELP!
admin
« Reply #8 on: March 14, 2008, 08:25:57 PM »

Please PM me your generator URL and reference to this thread.

Quote: "When I go into the analyze portion, it gives me the error:"
You should not open the analyze page until the sitemap is created (since there is nothing to analyze at that point).

scottsaxton
« Reply #9 on: March 15, 2008, 12:09:05 AM »

I have lost the button to start a crawl....any ideas?  This is on the url/generator -> crawl tab!

Should my server be set for PHP 4 or PHP 5?

admin
« Reply #10 on: March 15, 2008, 11:07:25 PM »

Hello,

You should increase the memory_limit setting to resolve that.
Both PHP4 and 5 are supported.

scottsaxton
« Reply #11 on: March 17, 2008, 03:16:29 PM »

So here is where I am and I am still having problems:

I have access to my php.ini file.  The default values were as follows:

max_execution_time = 300
max_input_time = 60
memory_limit = 18MB.

Any recommendations on what I should change these settings to?  I have tried increasing them, and the crawl never completes.


Here is what my configuration looks like:

Main Parameters:

Starting URL:
 [external links are visible to admins only]

Save sitemap to:
 /hermes/bosweb/web145/b1450/ipw.weflyhot/public_html/sitemap.xml
Current path to Sitemap generator is: /hermes/bosweb/web145/b1450/ipw.weflyhot/public_html/generator/

Your Sitemap URL:
[external links are visible to admins only]
 
Create Text Sitemap:
 X Create sitemap in Text format

Create ROR Sitemap:
X Create sitemap in ROR format
It will be stored in the same folder as XML sitemap, but with different filename: ror.xml
 
Create Google Base Feed (RSS):
X Create feed for Google Base
It will be stored in the data/ folder with filename: gbase.xml

Create HTML Sitemap:
X Create html site map for your normal visitors
Please note that this option requires additional resources to perform

HTML Sitemap filename:
/hermes/bosweb/web145/b1450/ipw.weflyhot/public_html/sitemap.html

Sitemap entry attributes (optional)

Change frequency:
Weekly

Last modification:
 Use server's response
 
Priority
0.5

Automatic Priority:
X Automatically assign priority attribute
Enable this option to automatically reduce priority depending on the page's depth level

Individual attributes:
Blank
define specific frequency and priority attributes here in the following format:
"url substring,lastupdate YYYY-mm-dd,frequency,priority".
example:
page.php?product=,2005-11-14,monthly,0.9


scottsaxton
« Reply #12 on: March 17, 2008, 03:20:32 PM »

Continued

Miscellaneous Definitions (optional)

Number of links per page in HTML sitemap:
40000
(that will split your sitemap on several pages)

Compress sitemap using GZip:
 Use sitemap files compression
(".gz" will be added to all filenames automatically)

Inform (ping) Search Engines upon completion (Google, Yahoo, Ask, Moreover):
 Ping Google when generation is done

Calculate changelog:
 Calculate Change Log after completion
please note that this option requires more resources to complete

  • Crawler Limitations, Finetune (optional)
Maximum pages:
 0 "0" for unlimited

Maximum depth level:
0 "0" for unlimited

Maximum execution time, seconds:
0  "0" for unlimited

Save the script state, every X seconds:
120 this option allows to resume crawling operation if it was interrupted. "0" for no saves

Make a delay between requests, X seconds after each N requests:
60 s after each  500 requests
This option allows to reduce the load on your webserver. "0" for no delay

  • Advanced Settings (optional)

Extract meta description tag
X enable META descriptions
Note: this option may significantly increase memory usage and is not recommended for larger sitemaps

Use IP address for crawling:
Blank

Remove session ID from URLs:
 PHPSESSID sid osCsid
common session parameters (separate with spaces): PHPSESSID, sid, osCsid

Progress state storage type:
X serialize  var_export
try to change this option in case of memory usage issues

scottsaxton
« Reply #13 on: March 17, 2008, 03:22:51 PM »

Basically what happens is I start the crawl, then after a period of time the "Sitemap generation in progress..." screen shows:

This page cannot be displayed

The page you are looking for is currently....................................................
......
........
admin
« Reply #14 on: March 17, 2008, 10:35:54 PM »

Hello,

A memory_limit of 18M is too low in many cases; please increase it. Then check how many pages it crawls before getting interrupted.
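
One detail worth checking when editing the value: PHP's documented shorthand suffixes for sizes are K, M and G, so the limit is written as 18M rather than 18MB. A sketch of the change follows, with 64M as an assumed, purely illustrative target rather than a figure given in this thread.

```ini
; Before: the default reported above, written with a documented suffix.
; memory_limit = 18M

; After: an illustrative higher value (an assumption; adjust to your site and host).
memory_limit = 64M
```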
