15 million post
« on: October 21, 2014, 03:04:23 PM »
Hello,

I'm just about to start using this powerful script, although I bought it over 2 years ago. My questions:

My new website is going to have about 15-18 million posts. Which hints and tips can you offer to get the best out of this software, considering this number of posts? Are there any particular settings that would help, including server settings, e.g. max_execution_time, etc.? The site will have 40 categories, each with about 300k posts.

Regards
Tony
Re: 15 million post
« Reply #1 on: October 22, 2014, 06:07:31 PM »
Hello,

With a website of this size, the best option is to create a limited sitemap, with the "Maximum depth" or "Maximum URLs" option set so that the crawler gathers about 200,000-300,000 URLs. These would be the main pages, forming a "roadmap" sitemap for search engines.
Re: 15 million post
« Reply #2 on: October 22, 2014, 10:01:22 PM »
Thanks for the advice. Which section of the script below should I complete, and what should I put in it?

  • Other Sitemap Types (click to expand)
  • Sitemap Entry Attributes (click to expand)
  • Miscellaneous Settings (click to expand)
  • Narrow Indexed Pages Set (click to expand)
  • Crawler Limitations, Finetune (click to expand)
  • Advanced Settings (click to expand)



Thanks
Re: 15 million post
« Reply #3 on: October 23, 2014, 09:20:35 AM »
Hello,

I'd recommend keeping all settings at their defaults, except for the "Maximum pages" setting, which should be limited in this case.
Re: 15 million post
« Reply #4 on: October 23, 2014, 07:21:19 PM »
Thanks
Re: 15 million post
« Reply #5 on: October 23, 2014, 07:26:36 PM »
Is it possible to have a URL that can be used with cron services instead of /usr/bin/php /home/site/public_html/generator/runcrawl.php? I use setcron for all my cron jobs, and I normally have a URL such as [ External links are visible to forum administrators only ] for running the cron. Is there a format to use with this service, assuming the script is installed in [ External links are visible to forum administrators only ]?
Re: 15 million post
« Reply #6 on: October 24, 2014, 05:28:08 AM »
Hello,

you can use http://www.example.com/generator/index.php?op=crawlproc&resume=1

However, a command-line cron task is recommended, since it usually runs in a less restricted environment.
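
For comparison, here is a rough sketch of how the two variants might look as crontab entries. The script only prints the lines so they can be reviewed before being pasted into a crontab; nothing is installed. The "0 3 * * *" schedule (daily at 03:00), the use of curl, and the function name print_cron_lines are assumptions for illustration; the PHP path and the URL are the ones quoted in this thread and need adjusting to the real installation.

```shell
#!/bin/sh
# Print two candidate crontab lines for review; this does not modify any crontab.
# Assumptions: the daily 03:00 schedule and the use of curl are illustrative only.
print_cron_lines() {
cat <<'EOF'
# Command-line variant (the recommended one)
0 3 * * * /usr/bin/php /home/site/public_html/generator/runcrawl.php
# URL variant, for setups where only an HTTP request can be scheduled
0 3 * * * curl -s "http://www.example.com/generator/index.php?op=crawlproc&resume=1" >/dev/null
EOF
}

print_cron_lines
```

With a URL-based cron service, only the quoted URL itself would be entered in the service's form; the curl line applies when a regular crontab is available but the URL form is still preferred.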