Welcome to the Sitemap Generator Forum.
 

Crawl seems blocked after 100 pages scanned

Started by christophe.weber, January 20, 2007, 04:29:48 AM


christophe.weber

Links depth: 1
Current page: Bijoux et Cadeaux,Montres,z1,334.html
Pages added to sitemap: 96
Pages scanned: 100 (2,686.1 Kb)
Pages left: 198 (+ 1330 queued for the next depth level)
Time passed: 8:55
Time left: 17:40
Memory usage: 923.1 Kb

The website contains more than 50,000 pages, and the xmlsitemap script always blocks at the same point. memory_limit is set to 128MB and max_execution_time to 1200. There is no firewall (not active), and all files have 666 permissions. Please help.

XML-Sitemaps Support

Hello,

Please try increasing the max_execution_time setting (1200 seconds is only 20 minutes, and you will need more than that for this site). We also suggest executing the sitemap generator via the command line, if you have SSH access, for better performance.
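If editing php.ini does not take effect, the limits can also be overridden for a single run with PHP's `-d` flag. A sketch, assuming the generator is started from a script such as `generator/runcrawl.php` (that path is a placeholder, not the actual file name from this thread):

```shell
# Override the limits for this run only; -d flags take precedence over php.ini.
# max_execution_time=0 removes the time limit entirely.
# The script path is a placeholder -- use your installation's actual path.
php -d max_execution_time=0 -d memory_limit=512M generator/runcrawl.php
```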

christophe.weber

I tried raising the max_execution_time setting to 12000, but the same thing happens. The crawl is always launched via the command line on the server (the server is a dedicated, home-hosted machine).
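One possible reason a changed setting has no effect: the command-line PHP binary often loads a different php.ini than the web server does, so an edit made for the web server may never reach a CLI run. A quick way to check which file and which values the CLI actually uses:

```shell
# Show which php.ini the command-line PHP loads
php --ini

# Print the limits as the CLI actually sees them
php -r 'echo ini_get("max_execution_time"), " ", ini_get("memory_limit"), "\n";'
```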

XML-Sitemaps Support

What do you see when you execute it via the command line? (The command-line generator should not display progress the way you showed in your first post; it prints the progress details line by line.)

christophe.weber

In fact, I launch the script via the command line and check the progress with Firefox.
This is what appears in my shell when I run it from the command line:

<html>
<head>
<title>XML Sitemaps - Generation</title>
<meta http-equiv="Content-type" content="text/html;charset=iso-8859-15" />
<link rel=stylesheet type="text/css" href="pages/style.css">
</head>
<body>
Resuming the last session (last updated: 1969-12-31 19:00:00)
1 | 297 | 57.8 | 0:34 | 171:25 | 1 | 719.6 Kb | 1 | 0 | 719
20 | 278 | 551.0 | 2:54 | 40:31 | 1 | 667.1 Kb | 18 | 309 | -52
40 | 258 | 1,081.7 | 4:46 | 30:48 | 1 | 704.1 Kb | 36 | 577 | 37
60 | 238 | 1,641.2 | 6:39 | 26:22 | 1 | 768.1 Kb | 56 | 863 | 64
80 | 218 | 2,204.2 | 8:54 | 24:17 | 1 | 808.5 Kb | 76 | 1139 | 40
100 | 198 | 2,683.4 | 10:44 | 21:16 | 1 | 925.9 Kb | 96 | 1347 | 117

And here is the progress shown in Firefox:

Already in progress. Current process state is displayed:
Links depth: 1
Current page: Bijoux et Cadeaux,Montres,z1,334.html
Pages added to sitemap: 96
Pages scanned: 100 (2,683.4 Kb)
Pages left: 198 (+ 1347 queued for the next depth level)
Time passed: 10:44
Time left: 21:16
Memory usage: 925.9 Kb

The crawl always blocks at the same point, and no sitemap is created.

XML-Sitemaps Support

And what do you see in the shell when the script stops? No output at all, or some sort of error message?

christophe.weber

It stops refreshing at the line 100 | 198 | 2,683.4 | 10:44 | 21:16 | 1 | 925.9 Kb | 96 | 1347 | 117

There is no message and no further line, as if the script were still working, but it is not. After a few minutes, if I check the crawl page with Firefox, the page looks as if the script had never been launched, and only the "run" button is shown.
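When a CLI script dies silently like this, two quick checks can help narrow down the cause (the script path below is a placeholder, not the actual file name from this thread): the process exit status, and the kernel log, since a process killed for exhausting memory usually leaves a trace in one or the other.

```shell
# Run the generator (placeholder path) and capture its exit status.
php generator/runcrawl.php
echo "exit status: $?"   # 137 typically means the process was killed (SIGKILL)

# Check whether the kernel's OOM killer terminated the process.
dmesg | grep -i "out of memory" | tail -n 5
```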

XML-Sitemaps Support

Can you provide us with temporary ssh access (via private message) to check this further?

christophe.weber

I just tried launching the script (on the same machine, with the same configuration) to crawl one of my other websites, and it seems to work perfectly :o
So what could block it when it crawls this one? Not the firewall; I disable it when I use the script.

XML-Sitemaps Support

Hmm, this is strange (since it has partially crawled the site). Does our online generator (https://www.xml-sitemaps.com/) crawl more than 100 pages for that site?