 

CGI Error

Started by davidstetler1, November 08, 2006, 04:09:46 PM


davidstetler1

I am receiving the following error after crawling for a couple of minutes...

The specified CGI application misbehaved by not returning a complete set of HTTP headers.

Any ideas?

davidstetler1

I've worked through this issue, but now I'm stuck again. I am on a shared server, so the php.ini cannot be edited. I am saving the progress and rerunning the crawler after it times out, over and over, but this could take a very long time. Is there any way around this problem without access to php.ini?

XML-Sitemaps Support

Some hosting environments (though not all) allow PHP settings to be overridden per directory, so you can try creating a php.ini file in the generator/ folder:
max_execution_time = 5000
memory_limit = 128M
(you can change the values to suit your installation)
OR create an .htaccess file:
php_value memory_limit 128M
php_value max_execution_time 3000
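
Either way, not all shared hosts honor per-directory overrides, so it is worth verifying that the new values actually took effect. A minimal sketch of such a check, assuming you can upload a small (hypothetical) check.php next to the generator:

<?php
// Prints the effective PHP settings; open this file in a browser
// after uploading it to the generator/ folder.
echo 'max_execution_time: ', ini_get('max_execution_time'), '<br>';
echo 'memory_limit: ', ini_get('memory_limit');

If the printed values still show the server defaults, the override is being ignored by your host.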


valle

I'm getting the same problem as the OP:
The specified CGI application misbehaved by not returning a complete set of HTTP headers.

Since he didn't post what he did to stop this, does anyone else know?
I'm on a Windows server and do have access to the php.ini file.

I also have this at the bottom of each page; not sure if it's related:
PHP Warning: PHP Startup: pdf: Unable to initialize module
Module compiled with module API=20060613, debug=0, thread-safety=1
PHP compiled with module API=20060613, debug=0, thread-safety=0
These options need to match in Unknown on line 0

XML-Sitemaps Support

Hello,

In the case of a Windows server you may need to increase the script timeout in the IIS configuration, as described on:
[ External links are visible to logged in users only ]
[ External links are visible to logged in users only ]
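
For reference, on IIS 6 the CGI timeout is a metabase property; one way to raise it (a sketch that assumes the default AdminScripts location, so adjust the path for your server) is:

cscript %SystemDrive%\Inetpub\AdminScripts\adsutil.vbs SET W3SVC/CGITimeout 3600

followed by an "iisreset" so the new value is picked up.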

The warning message you mentioned is not related to the generator (it is only related to the PHP configuration).

valle

Thanks, I will check the IIS timeout setting for the timeout error, but what about:
The specified CGI application misbehaved by not returning a complete set of HTTP headers.
Does this have the same cause?

valle

Further to this, does anyone have a good timeout setting for large sites? I have at least 100k URLs to crawl, and hitting timeouts is really slowing down the process.

XML-Sitemaps Support

The crawling time mainly depends on the website's page generation time, since the generator crawls the site much like a search engine bot does.
For instance, if it takes 1 second to retrieve each page, then 1,000 pages will be crawled in about 16 minutes; at that rate, 100,000 URLs would take roughly 28 hours.
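
As a rough sketch of that arithmetic (the one-second-per-page figure is only an assumed average):

<?php
// Back-of-the-envelope crawl-time estimate: pages * seconds per page.
$pages       = 100000; // e.g. valle's site size
$secsPerPage = 1.0;    // assumed average page generation time
printf("Estimated crawl time: %.1f hours\n", $pages * $secsPerPage / 3600); // ~27.8 hours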

valle

I see the CGI headers error is getting ignored, but I really need to solve it.
"The specified CGI application misbehaved by not returning a complete set of HTTP headers."
It is blocking every run, and I have no idea what's causing it except that it appears to be your application.


valle

Are you 100% sure about that? I get one error for timeouts and this other one for something I don't know. From reading up on it, it has something to do with the script itself sending data that confuses the browser.
[ External links are visible to forum administrators only ]
is what I read today, but I'm not sure how relevant it is, as I don't know the language. I would be looking to purchase another copy of this script if I can sort these issues out, so it's important that they are resolved.


valle

OK, so from what I have read, you can set the CGI timeout to a silly maximum value of 2 billion seconds, or run it from a cron job so it can't time out at all.
I will try running it again from cron to see if it completes. I did attempt this yesterday but don't think it worked out; I'll try again.
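
A sketch of scheduling the crawler outside the web server, assuming the generator's command-line entry point is runcrawl.php (check your package for the actual file name) and PHP is installed at C:\php on a Windows box:

schtasks /create /tn "SitemapCrawl" /tr "C:\php\php.exe C:\Inetpub\wwwroot\generator\runcrawl.php" /sc daily /st 02:00

On a Unix host the equivalent crontab entry would be something like:

0 2 * * * /usr/bin/php /path/to/generator/runcrawl.php

Running from the command line avoids the IIS CGI timeout entirely (CLI PHP has no execution time limit by default), which is why the cron approach sidesteps the headers error.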