UNLTD Generator works strange - too slow and never finished
« on: August 01, 2018, 07:09:46 PM »
Hi there! I have been a Sitemap Generator customer since 2012 or so. Until now everything worked well (not perfectly, but well). This time I cannot finish the crawling operation.

The old version (updated in 2017) crawls 100,000 links in around 2.5 hours and consumes no more than 255 MB of RAM. The new one consumes more and more memory and, after a few hours of work, crashes with various exceptions.

Also, I get a warning immediately in the shell:

PHP Warning:  json_encode(): Invalid UTF-8 sequence in argument in /var/www/vhosts/motogonki.ru/cp.motogonki.ru/generator2018/pages/class.utils.inc.php on line 254
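
A warning like this usually means some crawled text is not valid UTF-8. As a rough diagnostic sketch (Python, separate from the generator itself), the helper below reports where the undecodable bytes sit in a saved copy of a page. The function name and the idea that the source pages might be in a legacy encoding such as Windows-1251 are assumptions, not something confirmed in this thread.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, bad_bytes) pairs for each undecodable sequence in data."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid UTF-8
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # skip the bad bytes and keep scanning
    return problems


if __name__ == "__main__":
    # Hypothetical sample: ASCII text with one stray Latin-1 byte in it.
    sample = b"abc\xe9def"
    for offset, bad in find_invalid_utf8(sample):
        print(f"invalid UTF-8 at byte {offset}: {bad!r}")
```

If the offsets cluster in page titles or descriptions, the page encoding is a likely suspect.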

So, I have not been able to generate a single sitemap page for the last 24 hours. I changed the settings and set the limits to '0' (unlimited), with no luck.

"No sitemaps found
Sitemap was not generated yet, please go to Crawling page to start crawler manually or to setup a cron job."

P.S. It runs on my own server with 64 GB RAM and 4 CPUs under CentOS. Powerful enough, for sure. Everything works well except the new Generator.

The old version works as usual, so I keep it as a spare.

This is the existing sitemap (0666), generated with the old version of the Generator - [ External links are visible to forum administrators only ]

It also works.

What am I doing wrong? Thanks.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #2 on: August 03, 2018, 10:41:01 AM »
Hello, Oleg!

I also found that the updated Standalone PHP Sitemap Generator (up to v8.0) has not finished its processes since July 31, 2018. It stops every time after 60-75 minutes at around the 3rd depth level, with only 1500-1600 links found, and then cannot continue the interrupted session: it starts a new crawl every time (whether I submit "Start creating Sitemap" with the "Continue" checkbox on or off, or run the crawl in the shell).

As a result, Google cannot see most of the new articles, so we are losing a lot of incoming traffic because of this issue.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #3 on: August 03, 2018, 10:58:54 AM »
And one more finding:

I just found that the "Unlimited Site Generator" I purchased two days ago is the same (at the script level, file sizes, etc.) as the Standalone Site Generator, so I'd like a refund and to go back to using the "old" Site Generator.

It's probably someone's mistake, but for now both fail for the same reason and in the same way. So why should I pay for the new one? I see no reason to.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #4 on: August 03, 2018, 06:58:36 PM »
Hello, 

in this case I would recommend running the generator from the command line if you have SSH access to your server.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #5 on: August 05, 2018, 04:32:38 PM »
One more thing: after two days of work, one of the processes finished successfully! Hallelujah.

But - once again -

The XSL stylesheets have stopped working. Earlier it was possible to change the header and footer manually; now it is forbidden, and for some reason by default it uses the remote stylesheet, not the file that was copied to the folder with the sitemap XSL. Each time after regeneration, the references to the styles are lost. How can I fix it?
Re: UNLTD Generator works strange - too slow and never finished
« Reply #6 on: August 05, 2018, 04:33:03 PM »
[ External links are visible to forum administrators only ]
Re: UNLTD Generator works strange - too slow and never finished
« Reply #7 on: August 06, 2018, 02:44:02 PM »
Hello,

you can modify the generator/pages/mods/sitemap_xml_tpl.xml file for that:
Code:
<?xml-stylesheet type="text/xsl" href="http://yourlinkto/sitemap.xsl"?>
and regenerate the sitemap after that.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #8 on: September 11, 2018, 11:57:32 PM »
Hi there, I am back. There's a strange problem with the UNLTD Generator: it corrupts some headers of the XML sitemap pages, and Google informed me about it three weeks later. What can I do about it?

Google stops parsing there (at the 30,000th page out of 105,000). We have no traffic and are losing positions in the search results every day.

[ External links are visible to forum administrators only ]

Please check it out and fix it somehow.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #9 on: September 12, 2018, 06:46:02 AM »
Hello,

it's possible that the script was terminated by the server in the middle of the sitemap file creation.
You can try to reduce the "URLs per sitemap file" setting to make smaller sitemap files to avoid this.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #10 on: September 19, 2018, 05:12:51 PM »
Hi there!

How can I rebuild/regenerate all sitemap files (HTML/XML) on the site without crawling, using previously collected data? Is it possible?
Re: UNLTD Generator works strange - too slow and never finished
« Reply #11 on: September 21, 2018, 04:29:20 AM »
Hello,

if there is a file named crawl_dump_resume.log in the generator/data/ folder, you can copy it to crawl_dump.log (save a copy of it just in case) and run the generator, resuming the crawling process. That would be the last saved point of the crawling process.
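
The copy step above could be scripted roughly as follows. This is only a sketch in Python: the file names come from this post, while the function name, the `.bak` suffix, and the reading that the existing crawl_dump.log is the file worth backing up are assumptions.

```python
import shutil
from pathlib import Path


def restore_resume_dump(data_dir):
    """Copy crawl_dump_resume.log over crawl_dump.log, keeping a backup.

    Returns False if there is no resume file to restore from.
    """
    data_dir = Path(data_dir)
    resume = data_dir / "crawl_dump_resume.log"
    dump = data_dir / "crawl_dump.log"

    if not resume.is_file():
        return False

    # Keep the current dump "just in case" (the .bak suffix is an assumption).
    if dump.exists():
        shutil.copy2(dump, dump.with_name(dump.name + ".bak"))

    shutil.copy2(resume, dump)
    return True


if __name__ == "__main__":
    # Path taken from the post above; adjust to the real install location.
    if restore_resume_dump("generator/data"):
        print("resume dump restored; run the generator with the resume option")
    else:
        print("no crawl_dump_resume.log found")
```

The script only prepares the files; the generator still has to be started afterwards to resume the crawl.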
Re: UNLTD Generator works strange - too slow and never finished
« Reply #12 on: September 27, 2018, 01:53:55 PM »
Quote:
Hello,

it's possible that the script was terminated by the server in the middle of the sitemap file creation.
You can try to reduce the "URLs per sitemap file" setting to make smaller sitemap files to avoid this.

Well, I did as you said. It does not help. More chunks mean more errors, most of them caused by a wrong EOF or something like that. It does not matter whether there are 2000, 5000 or 20000 links in each file; there are still plenty of errors and unparsed XML.

This never happened before I installed the latest version of the Unlimited Generator and upgraded my Standalone Sitemap Generator (PHP) to v8.0, 2018-05-17.
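
To see which chunks are actually broken before Google reports them, a quick well-formedness check can help. The sketch below (Python, independent of the generator) tries to parse every sitemap file in a folder and collects the parser error for each file that fails, which is what a file cut off in the middle of writing would produce. The function name and the `sitemap*.xml` file pattern are assumptions; adjust them to the real file names.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_sitemaps(folder, pattern="sitemap*.xml"):
    """Map each matching file name to None (well-formed) or an error message."""
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        try:
            ET.parse(str(path))  # raises ParseError on truncated or broken XML
            results[path.name] = None
        except ET.ParseError as err:
            results[path.name] = str(err)
    return results


if __name__ == "__main__":
    # "." is a placeholder for the folder that holds the sitemap chunks.
    for name, error in check_sitemaps(".").items():
        status = "OK" if error is None else f"BROKEN ({error})"
        print(f"{name}: {status}")
```

This only checks that each file is well-formed XML; it does not validate against the sitemap schema.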
Re: UNLTD Generator works strange - too slow and never finished
« Reply #13 on: September 27, 2018, 02:00:31 PM »
Google reports about 12 errors in 24 chunks.
Re: UNLTD Generator works strange - too slow and never finished
« Reply #14 on: September 27, 2018, 04:08:49 PM »
Hello,

it means that the generator process was interrupted by the server in the middle of writing the file. I'm afraid the only way to resolve this is to configure the server not to interrupt (kill) the generator process.