We are using an enormous amount of server resources each time XML Sitemap generates a sitemap for us. I have configured many URL exclusions, but they do not seem to stop those pages from being crawled; they only keep the pages out of the sitemap.
Is there a way to limit the URLs that are actually crawled when the sitemap is being generated?
One of our main problems is that our sitemaps contain many redundant entries for products because of the Joomla/Virtuemart URL syntax. It appears that a product is indexed first as a single product page, then again as part of a category, and then again as a "manufacturer" search result. The resulting unnecessary URL looks like this:
[external links are visible to admins only]
I would like to keep the redundant results from being crawled, but since they are dynamically generated, I can't figure out how to limit them using a URL exclusion.
I know these seem like two different issues, but an answer to either one would help us reach our goal of reducing server resource usage.
Thank you for an excellent product!