jap

Allowed memory exhausted
« on: December 16, 2005, 12:30:51 PM »
Hi to all,

I just purchased the script yesterday, and it is exactly what I was looking for. So thank you for this script.

Now I have a major problem. The script runs on shared hosting and my host has disabled some PHP ini settings. For example, set_time_limit cannot be changed and is fixed at 90 seconds; that one is not a big deal, since I can resume.

But the major problem is the memory limit. I can't modify it in my PHP settings, and I get this message when my map reaches 2.6 MB:

Code:
Resuming the last session (last updated: 2005-12-16 13:17:37)
Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 44 bytes) in /home/www/76942b3e68655b6a6390884a31226063/web/sitemap/pages/class.grab.inc.php(2) : eval()'d code(1) : eval()'d code(1) : eval()'d code on line 284
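
To double-check which limits the host actually enforces, I can upload and run a tiny standalone test script like this one (generic PHP, nothing from the Generator itself); it just prints the relevant ini values:

Code:
<?php
// Tiny standalone check of the limits the host actually enforces.
// Generic PHP -- nothing from the sitemap Generator itself.
echo 'memory_limit: ' . ini_get('memory_limit') . "\n";
echo 'max_execution_time: ' . ini_get('max_execution_time') . "\n";
echo 'memory used by this check: ' . memory_get_usage() . " bytes\n";
?>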

My question is: how can I continue crawling the site so that all of my 74,000 pages get indexed?

Thank you for any help.

JAP
Re: Allowed memory exhausted
« Reply #1 on: December 16, 2005, 01:28:32 PM »
Hello JAP,

if you are not allowed to modify PHP settings to increase the memory_limit option, the only way to get the whole site included in the sitemap is to split the site into several parts.

For instance, if your site has folders:
/part1/
/part2/
/part3/

you can generate a separate sitemap for each subfolder (using the corresponding "Starting URL" setting).

It depends on your site structure and may be hard (or impossible) to do in some cases though.
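
If you go this way and end up with several sitemap files, they can be tied together with a sitemap index file so search engines still see them as one set. Below is only a rough illustration in plain PHP, not part of the Generator: the file names and URLs are examples, and the namespace should match whatever your generated sitemaps already declare.

Code:
<?php
// Combine several per-part sitemaps into one sitemap index file.
// File names/URLs are examples only.
$parts = array(
    'http://www.example.com/sitemap-part1.xml',
    'http://www.example.com/sitemap-part2.xml',
    'http://www.example.com/sitemap-part3.xml',
);

$xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
$xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
foreach ($parts as $url) {
    $xml .= "  <sitemap>\n";
    $xml .= '    <loc>' . htmlspecialchars($url) . "</loc>\n";
    $xml .= '    <lastmod>' . date('Y-m-d') . "</lastmod>\n";
    $xml .= "  </sitemap>\n";
}
$xml .= "</sitemapindex>\n";

file_put_contents('sitemap_index.xml', $xml);
?>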

jap

Re: Allowed memory exhausted
« Reply #2 on: December 16, 2005, 06:07:31 PM »
Hi!

Thank you for your fast reply. Unfortunately I can't split the indexing into parts, because it is a dynamic PHP site; most of the URLs look like /index.php?id=xxx

Would it be possible to add an option where the script writes multiple result logs (whenever it nears the memory limit) so that it can keep crawling to the end, and then appends all the logs together to build the sitemap XML file?

Otherwise I don't see a way to completely index the full website!

JAP
Re: Allowed memory exhausted
« Reply #3 on: December 17, 2005, 07:12:56 PM »
Hello JAP,

The standalone Generator script supports a "Resume crawling" feature, but it still has to restore the full URL list to know which URLs are already indexed and where to get (parse) the remaining links. So it still requires the same amount of memory and will not help you avoid this error.

You can also set the "Maximum number of URLs to index" limit to get at least part of your site indexed.

Basically, the script needs enough memory to be able to create a large sitemap.
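
Just to illustrate why the memory grows with the size of the site -- this is only a simplified sketch of how a crawler of this kind works, not the Generator's actual code:

Code:
<?php
// Simplified sketch: every URL ever seen must stay in memory for the
// whole crawl, otherwise pages would be fetched twice or loops would occur.
$visited = array();
$queue   = array('http://www.example.com/');

while (!empty($queue)) {
    $url = array_shift($queue);
    if (isset($visited[$url])) {
        continue;                 // already indexed -- skip duplicates/loops
    }
    $visited[$url] = true;        // this table only grows, never shrinks

    // fetch $url, extract its links and push them onto $queue
    // (details omitted in this sketch)
}
// With tens of thousands of URLs this table is what hits memory_limit.
?>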
Re: Allowed memory exhausted
« Reply #4 on: December 27, 2005, 09:31:39 PM »
Is there a way I can cope with this message please?

Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 35 bytes) in /home/yada/xml-generator/pages/class.grab.inc.php(2) : eval()'d code(1) : eval()'d code(1) : eval()'d code on line 262

I don't know where I should be trying to increase memory allocation. Is it in your script somewhere or do I have to ask my hosting company please? I'm trying to index a static site of around 60,000 urls and it's all kinds of trouble. Fun though.

BB

jap

Re: Allowed memory exhausted
« Reply #5 on: December 28, 2005, 02:53:15 PM »
Hi admin,

Maybe in the next version it could be possible to have the script split its temporary files into 1.5 MB chunks and, once it has finished crawling the site, concatenate the results into a single file?

At the moment I can't use the software, because I have no way to split the website contents.

Happy holidays and best regards.

JAP

Re: Allowed memory exhausted
« Reply #6 on: December 28, 2005, 11:33:41 PM »
Hello BB,

the memory_limit setting is part of your server configuration (not the script's configuration); it is defined in the php.ini file.
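
The usual places where it can be raised are shown below (a rough sketch only -- on shared hosting any of them may be locked, so you may have to ask your hosting company):

Code:
<?php
// 1) php.ini (server-wide; usually only the hosting company can edit it):
//        memory_limit = 64M
//
// 2) .htaccess, if the server runs Apache with mod_php and allows overrides:
//        php_value memory_limit 64M
//
// 3) At runtime, at the very top of the script:
if (@ini_set('memory_limit', '64M') === false) {
    echo "The host does not allow changing memory_limit from a script.\n";
}
echo 'Effective memory_limit: ' . ini_get('memory_limit') . "\n";
?>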
Re: Allowed memory exhausted
« Reply #7 on: December 28, 2005, 11:37:40 PM »
Hi JAP,

the script doesn't work that way, because it needs the complete list of already-indexed URLs to avoid duplicate indexing and recursive loops. That's why the crawl cannot be split into parts, unfortunately. Hopefully we will find a way to optimize memory management further, though. :)

Merry Christmas! :)

Re: Allowed memory exhausted
« Reply #8 on: June 09, 2006, 11:22:13 PM »
I don't know if you still check this site or not, but how long did it take to get those 74,000 pages into a sitemap?