Hello, I have a few problems. For the past 10 days my sitemap hasn't been working. I only noticed now because I bought the video sitemap option. When I start crawling, the following errors occur:

Warning: preg_match() [function.preg-match]: Compilation failed: regular expression is too large at offset 35939 in /home/realityg/public_html/generator/pages/class.grab.inc.php on line 107

Warning: preg_match() [function.preg-match]: Compilation failed: regular expression is too large at offset 35939 in /home/realityg/public_html/generator/pages/class.grab.inc.php on line 192

The second error is repeated endlessly.

So I need your help, please.

- My second question: is there a way to skip links that haven't changed, so they aren't crawled each time? Right now my crawl takes 7 hours.

Thanks for the help :)
Hello,

Did you modify the "Exclude URLs" or "Do not parse" settings in the generator configuration? If so, what values did you set?
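
For context on why those settings matter: one possible cause of "regular expression is too large" is a very long list of URL fragments being joined into a single pattern. That is only an assumption about the cause, not a description of the generator's actual code. A minimal Python sketch of the general workaround, splitting the list into several smaller patterns, where `compile_in_chunks`, `matches_any` and `max_chars` are hypothetical names:

```python
import re

def compile_in_chunks(fragments, max_chars=8000):
    """Compile URL fragments into several small alternation patterns
    instead of one huge one. max_chars is an arbitrary budget."""
    patterns, current, size = [], [], 0
    for fragment in fragments:
        escaped = re.escape(fragment)
        # Start a new chunk when adding this fragment would exceed the budget.
        if current and size + len(escaped) + 1 > max_chars:
            patterns.append(re.compile("|".join(current)))
            current, size = [], 0
        current.append(escaped)
        size += len(escaped) + 1
    if current:
        patterns.append(re.compile("|".join(current)))
    return patterns

def matches_any(url, patterns):
    """True if any of the chunked patterns matches somewhere in the URL."""
    return any(p.search(url) for p in patterns)
```

In practice a much shorter list of settings values (a few URL fragments rather than thousands) avoids the problem altogether.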


> - My second question: is there a way to skip links that haven't changed, so they aren't crawled each time? Right now my crawl takes 7 hours.

Unfortunately, the generator has no way to tell whether pages have changed unless it crawls them, so this only works if you know the URLs (or parts of URLs) of those pages and add them to the "Do not parse" setting in the generator configuration.
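
To illustrate the idea, here is a rough Python sketch of a "Do not parse" style filter. It assumes the setting is a list of URL fragments matched as plain substrings, which may not be exactly how the generator does it; `should_fetch` and the example fragments are made up for illustration.

```python
def should_fetch(url, do_not_parse):
    """Return False when the URL contains any 'do not parse' fragment."""
    return not any(fragment in url for fragment in do_not_parse)

# Hypothetical fragments for sections that are known not to change.
do_not_parse = ["/archive/", "/threads/old-"]

urls = [
    "http://mysite.com/threads/new-topic",
    "http://mysite.com/threads/old-topic",
    "http://mysite.com/archive/2010",
]

# Only URLs that match no fragment are downloaded and parsed for links.
to_fetch = [u for u in urls if should_fetch(u, do_not_parse)]
```

Pages that are filtered out this way are never downloaded, which is where the crawl time is saved.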

Thanks for your answer. In generator.conf I changed <option name="xs_robotstxt"> to 0 (it was set to 1).

And is there a way to stop generating my regular URLs and generate only the video sitemap?
Hello,

The generator still needs to crawl your website to *find* pages with videos. If they are in a separate section of your site, for instance in a /video/ subfolder, you can use it as the Starting URL and other pages will not get crawled.
I see, thanks for your answer.
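
As a rough illustration of that scoping, here is a small Python sketch of a crawler that only follows links under the Starting URL's folder. This is an assumption about the behaviour, not the generator's actual code; `in_scope` is a hypothetical helper.

```python
from urllib.parse import urlparse

def in_scope(url, start_url):
    """True if url is on the same host and under the start URL's folder."""
    start = urlparse(start_url)
    target = urlparse(url)
    base = start.path.rstrip("/")   # e.g. "/video"
    prefix = base + "/"             # e.g. "/video/"
    same_host = target.netloc == start.netloc
    return same_host and (target.path == base or target.path.startswith(prefix))

start_url = "http://mysite.com/video/"
print(in_scope("http://mysite.com/video/clip-1", start_url))   # under /video/
print(in_scope("http://mysite.com/threads/topic", start_url))  # outside /video/
```

With a check like this, links pointing outside the /video/ folder are simply never queued.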

One last question: the crawling process went fine and about 280 videos were found on my site.

BUT (it's a forum) these 280 videos were in the media section of my site, where only videos are posted, and none of the videos embedded in my threads were found.

Why didn't the crawling process find the videos I have in my topics?

Tx.
mysite.com/media = embedded
mysite.com/threads/name of the thread = not embedded (videos, I mean).