wordpress categories not getting crawled
« on: February 25, 2009, 06:07:34 PM »
Hi there.
I downloaded the script a few hours ago and installed it.
I’m using WordPress for my site. However, my categories are not getting crawled.
I have two WordPress installations for the same site: one in the root and another in a different folder, each with a different template. For both of them, the categories are not getting indexed.
Moreover, I have to restart the crawl over and over again because it stops regularly, displaying a blank page. I have around 6000 URLs and it took me over 2 hours to crawl them.
Would you be so kind as to advise?

Re: wordpress categories not getting crawled
« Reply #1 on: February 25, 2009, 10:42:48 PM »
Hello,

Do you have links to your categories somewhere on your site, so that the generator can find them when starting from the homepage?
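
One rough way to check this is to pull the links out of the homepage markup and see whether any category URLs are among them. The Python sketch below is only an illustration, not part of the generator: `LinkCollector`, `category_links`, the sample markup and the `example.com` URLs are all hypothetical, and the `/category/` marker assumes the default WordPress category base with pretty permalinks (a site may use a different base, or `?cat=` style URLs).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def category_links(html, base_url, marker="/category/"):
    """Return the sorted, de-duplicated links whose path contains `marker`.

    `marker` is an assumption: "/category/" is the default WordPress
    category base, but a site may be configured differently.
    """
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted({u for u in collector.links if marker in urlparse(u).path})


# Hypothetical homepage markup, used only for illustration.
SAMPLE_HOMEPAGE = """
<ul>
  <li><a href="/category/news/">News</a></li>
  <li><a href="/about/">About</a></li>
  <li><a href="http://example.com/blog/category/tips/">Tips</a></li>
</ul>
"""

if __name__ == "__main__":
    for url in category_links(SAMPLE_HOMEPAGE, "http://example.com/"):
        print(url)
```

If the list comes back empty for the real homepage source, the crawler has no path to the categories from its starting page.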

To address the timeout issue, you can increase the max_execution_time setting.

Re: wordpress categories not getting crawled
« Reply #2 on: February 26, 2009, 02:12:24 PM »
Hi,
The category links start on the home page and appear on every page of the site.
I know it sounds very strange, but the only URLs not indexed are the categories.
The max_execution_time setting is set to unlimited.
I also get this message from cron:
Warning: fclose(): supplied argument is not a valid stream resource in /home/…./public_html/generator/pages/class.utils.inc.php(2) : eval()'d code on line 34
<h4>Completed</h4>Total pages indexed: 6058<br>Calculating changelog...
Warning: fopen(/home/…./public_html/generator/data/urllist.txt): failed to open stream: Permission denied in /home/…./public_html/generator/pages/class.xml-creator.inc.php(2) : eval()'d code on line 95 and so on…
I deleted the contents of the data folder and, after the next cron run, set the files in data to chmod 666. The data folder itself is chmod 777.
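
To double-check those permissions, something like the following Python sketch could be run as the same user that cron uses. It is only an illustration: `describe_permissions` and `report` are hypothetical helpers, and the demo works on a throwaway folder rather than the real generator/data folder. It prints each path's octal mode, whether the current user can write to it, and whether the current user owns it, since "Permission denied" can also come from the files being owned by a different user than the one running the job.

```python
import os
import stat
import tempfile


def describe_permissions(path):
    """Return (octal mode, writable by current user, owned by current user)."""
    info = os.stat(path)
    mode = format(stat.S_IMODE(info.st_mode), "03o")
    writable = os.access(path, os.W_OK)
    owned = info.st_uid == os.geteuid()  # Unix-only call
    return mode, writable, owned


def report(data_dir):
    """Return one (path, mode, writable, owned) tuple for the folder and each entry."""
    rows = [(data_dir, *describe_permissions(data_dir))]
    for name in sorted(os.listdir(data_dir)):
        full = os.path.join(data_dir, name)
        rows.append((full, *describe_permissions(full)))
    return rows


if __name__ == "__main__":
    # Demonstration on a temporary folder; replace it with the path of the
    # real data folder when checking an actual installation.
    with tempfile.TemporaryDirectory() as data_dir:
        os.chmod(data_dir, 0o777)
        sample = os.path.join(data_dir, "urllist.txt")
        open(sample, "w").close()
        os.chmod(sample, 0o666)
        for path, mode, writable, owned in report(data_dir):
            print(f"{mode} writable={writable} owned={owned} {path}")
```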
However, the categories are still not indexed.
When I refresh the crawling page, I can see that the categories are spidered, but they are not added to sitemap.xml.
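
A quick way to confirm which URLs actually made it into sitemap.xml is to parse the file and compare it against the URLs you expect. The Python sketch below is a hedged illustration only: `sitemap_urls`, `missing_from_sitemap`, the sample document and the `example.com` URLs are hypothetical, and it assumes the file uses the standard sitemaps.org 0.9 namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed to be the one in use).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


def missing_from_sitemap(expected_urls, xml_text):
    """Return the expected URLs that do not appear in the sitemap."""
    present = set(sitemap_urls(xml_text))
    return [u for u in expected_urls if u not in present]


# Hypothetical sitemap content, used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/about/</loc></url>
</urlset>
"""

if __name__ == "__main__":
    expected = [
        "http://example.com/category/news/",
        "http://example.com/about/",
    ]
    for url in missing_from_sitemap(expected, SAMPLE_SITEMAP):
        print("missing:", url)
```

For a real check, the sample string would be replaced by the contents of the generated sitemap.xml and the expected list by the category URLs of the site.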
The crawling session is always interrupted after a few hundred URLs, so I always have to resume the last session.
If you don’t mind, I’ll PM you the URL so maybe you can take a look. I don’t have a password on the generator yet.
If extra work is needed that has nothing to do with the script, I won’t hesitate to pay for your time.