UNLTD Generator works strangely - too slow and never finishes

Started by motogonki2, August 01, 2018, 07:09:46 PM


motogonki2

Hi there! I have been a Sitemap Generator customer since 2012 or so. Until now, everything worked well (not perfectly, but well). This time I cannot finish the crawling operation.

The old version (updated in 2017) crawls for around 2.5 hours for 100,000 links and consumes no more than 255 MB of RAM. This one consumes more and more, and after a few hours of work it crashes with various exceptions.

Also, I immediately get a warning in the shell:

PHP Warning:  json_encode(): Invalid UTF-8 sequence in argument in /var/www/vhosts/motogonki.ru/cp.motogonki.ru/generator2018/pages/class.utils.inc.php on line 254
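
From what I have read, this warning means json_encode() received a string containing bytes that are not valid UTF-8 - most likely from a crawled page in a different encoding. A minimal sketch (not the generator's actual code, just an illustration) of how such input can be sanitized before encoding:

<?php
// Illustration only - $raw stands for any crawled string that may
// contain invalid UTF-8 bytes (here, the lone \xB1 byte).
$raw = "valid text \xB1 invalid byte";

// Option 1 (PHP 7.2+): replace invalid sequences with U+FFFD while encoding.
$json = json_encode($raw, JSON_INVALID_UTF8_SUBSTITUTE);

// Option 2: strip invalid sequences first, then encode as usual.
$clean = mb_convert_encoding($raw, 'UTF-8', 'UTF-8');
$json  = json_encode($clean);

var_dump(json_last_error() === JSON_ERROR_NONE); // bool(true)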

So I have not been able to generate a sitemap page for the last 24 hours. I changed the settings, set them to '0' (unlimited) - no luck.

"No sitemaps found
Sitemap was not generated yet, please go to Crawling page to start crawler manually or to setup a cron job."

P.S. It runs on our own server with 64 GB RAM / 4 CPUs under CentOS. Powerful enough, for sure. Everything works well except the new Generator.

The old version works as usual, as a fallback.

This is the existing sitemap (0666), generated with the old version of the Generator - [ External links are visible to forum administrators only ]

It also works.

What am I doing wrong? Thanks.


motogonki2

Hello, Oleg!

I also found that the updated Standalone PHP Sitemap Generator (up to v8.0) has not finished its processes since July 31, 2018. It stops every time after 60-75 minutes, at about the 3rd depth level with only 1,500-1,600 links found, and then cannot continue the interrupted session: it starts a new crawl every time (whether "Start creating Sitemap" is submitted with the "Continue" checkbox on or off, or the crawl is run in the shell).

As a result, Google cannot see most of the new articles, so we are losing a lot of incoming traffic because of this issue.

motogonki2

And one more finding:

I just found that the "Unlimited Site Generator" I purchased two days ago is the same (at the script level, file sizes, etc.) as the Standalone Site Generator, so I'd like to have the payment refunded and return to using the "old" Site Generator.

It's probably someone's mistake, but... for now both fail for the same reason and in the same way. So why should I pay for the new one? There's no reason, for sure.


motogonki2

One more thing: after two days of work, one of the processes has finished successfully! Hallelujah.

But - once again -

The XSL stylesheets have stopped working. Earlier it was possible to change the header and footer manually; now it isn't, and for some reason it uses the remote stylesheet by default, not the file that was copied to the folder with the sitemap XML. Each time after regeneration, the references to the styles are lost. How can I fix this?

motogonki2

[ External links are visible to forum administrators only ]

XML-Sitemaps Support

Hello,

you can modify the generator/pages/mods/sitemap_xml_tpl.xml file for that:
<?xml-stylesheet type="text/xsl" href="http://yourlinkto/sitemap.xsl"?>
and regenerate the sitemap after that.
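
As a quick sanity check after regenerating - a sketch with an assumed path (adjust it to your installation) - you can confirm that the generated sitemap still carries the stylesheet reference:

<?php
// Hypothetical path - point it at your generated sitemap.
$file = '/var/www/vhosts/example.com/httpdocs/sitemap.xml';
$head = is_file($file) ? file_get_contents($file, false, null, 0, 512) : '';
echo (strpos($head, 'xml-stylesheet') !== false)
    ? "stylesheet reference present\n"
    : "stylesheet reference missing\n";

Since the template is read on every run, the edited reference should survive regeneration.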

motogonki2

Hi there, I am back. There's a strange problem with the UNLTD Generator: it corrupts the headers of some of the XML sitemap pages, and Google informed me about it 3 weeks later. What can be done about it?

Google stops parsing there (at the 30,000th page out of 105,000). We have no traffic and are losing positions in the search results every day.

[ External links are visible to forum administrators only ]

Please check it out and fix it somehow.

XML-Sitemaps Support

Hello,

it's possible that the script was terminated by the server in the middle of the sitemap file creation.
You can try to reduce the "URLs per sitemap file" setting to make smaller sitemap files to avoid this.
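
To catch a truncated file before Google does, you could check each generated file for XML well-formedness - a minimal sketch, assuming the files match sitemap*.xml in the web root (adjust the path and pattern to your setup):

<?php
// Flag sitemap files that are not well-formed XML (e.g. cut off mid-write).
libxml_use_internal_errors(true);
foreach (glob('/var/www/vhosts/example.com/httpdocs/sitemap*.xml') as $file) {
    $ok = simplexml_load_file($file) !== false;
    echo ($ok ? 'OK        ' : 'MALFORMED ') . $file . "\n";
    libxml_clear_errors();
}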

motogonki2

Hi there!

How can I rebuild/regenerate all the sitemap files (HTML/XML) on the site without crawling, using previously collected data? Is that possible?

XML-Sitemaps Support

Hello,

if there is a file named crawl_dump_resume.log in the generator/data/ folder, you can copy it to crawl_dump.log (save a copy of it just in case) and run the generator, resuming the crawling process - that would be the last saved point of the crawl.
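
A minimal sketch of that file shuffle (the install path is assumed - adjust it to your server):

<?php
// Assumed install location.
$dir = '/var/www/vhosts/example.com/generator/data';

if (is_file("$dir/crawl_dump_resume.log")) {
    // Keep a safety copy of the resume dump, as suggested above.
    copy("$dir/crawl_dump_resume.log", "$dir/crawl_dump_resume.log.bak");
    // Make it the dump the generator resumes from.
    copy("$dir/crawl_dump_resume.log", "$dir/crawl_dump.log");
}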

motogonki2

Quote from: XML-Sitemaps Support on September 12, 2018, 06:46:02 AM
Hello,

it's possible that the script was terminated by the server in the middle of the sitemap file creation.
You can try to reduce the "URLs per sitemap file" setting to make smaller sitemap files to avoid this.

Well, I did as you said. It does not help. More chunks mean more errors, and most of them are because of a wrong EOF or something like that. It does not matter whether there are 2,000, 5,000 or 20,000 links in each file - there are still plenty of errors and malformed XML.

This never happened before I installed the latest version of the Unlimited Generator and upgraded my Standalone Sitemap Generator (PHP) to v8.0 (2018-05-17).

XML-Sitemaps Support

Hello,

it means that the generator process was interrupted by the server in the middle of writing the file. The only way to resolve this is to configure the server not to interrupt (kill) the generator process, I'm afraid.
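
If the crawler is started from the shell, the PHP-side limits can at least be lifted - a hedged sketch (the server may still enforce its own process limits, which PHP cannot override):

<?php
// Settings that commonly stop PHP itself from terminating a long run.
set_time_limit(0);                // no script execution time limit
ini_set('memory_limit', '2048M'); // raise the memory ceiling
ignore_user_abort(true);          // keep going if the browser disconnects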