Directing Sitemap Generator To Selected Files
« on: February 18, 2006, 01:02:24 AM »
My website is produced dynamically but can convert its dynamic pages to regular html pages. All of the html pages reside in the public folder (htdocs) along with "[ External links are visible to forum administrators only ]." Is there any way to direct the sitemap generator to the folders containing the html pages and/or to the html pages themselves? This is driving me nuts because every sitemap generator program I've tried jumps into "index.html" and never looks below that. The result is a sitemap of several hundred meaningless dynamic pages. Help!
« Last Edit: February 18, 2006, 01:08:37 AM by danl »
Re: Directing Sitemap Generator To Selected Files
« Reply #1 on: February 18, 2006, 01:19:31 AM »
Hello,

I'm not sure what you mean, but you can enter the direct URL of your html page, instead of the top-level domain URL, as the starting address for crawling.
Re: Directing Sitemap Generator To Selected Files
« Reply #2 on: February 19, 2006, 10:58:15 AM »
Hi. If I start the sitemap generator at an html page, it follows all the links on that page, which brings it into the dynamic pages (even though I've included the meta tag <META NAME="robots" content="index,no follow">).

If I start the sitemap generator at a directory containing the html pages, it lists the directory but not the html pages residing inside it.

(Example:  <url>
  <loc>[ External links are visible to forum administrators only ]</loc>
  <priority>0.5</priority>
  <changefreq>weekly</changefreq>
  </url>
  </urlset>).

Basically, I need a way to crawl several hundred independent html pages, pages that are not linked to each other.  Is there a way to do this?
Re: Directing Sitemap Generator To Selected Files
« Reply #3 on: February 19, 2006, 08:20:49 PM »
Hi.

The crawler finds pages by following the links on the pages it visits. To get all of your pages, the site needs pages that link to each other.
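
If the html pages cannot be linked together, one alternative that skips crawling entirely is to build the sitemap from the file listing of the public folder. The following is only a minimal sketch in Python, not a feature of the sitemap generator discussed in this thread. The function name build_sitemap and the address http://www.example.com are made up for illustration, and it assumes that every .html file under the folder is served at the same relative path under the site's base URL.

```python
from pathlib import Path
from urllib.parse import quote
from xml.sax.saxutils import escape


def build_sitemap(root, base_url, changefreq="weekly", priority="0.5"):
    """Return sitemap XML listing every .html file found under root.

    Hypothetical helper: root is the local public folder (e.g. htdocs),
    base_url is the address that folder is served from.
    """
    root = Path(root)
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    # rglob walks subfolders too, so no links between pages are needed.
    for path in sorted(root.rglob("*.html")):
        # Convert the file path into a URL path relative to the web root,
        # percent-encoding characters such as spaces.
        rel = quote(path.relative_to(root).as_posix())
        loc = base_url.rstrip("/") + "/" + rel
        lines += [
            "  <url>",
            f"    <loc>{escape(loc)}</loc>",
            f"    <priority>{priority}</priority>",
            f"    <changefreq>{changefreq}</changefreq>",
            "  </url>",
        ]
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"
```

For example, Path("sitemap.xml").write_text(build_sitemap("htdocs", "http://www.example.com")) would write the result to a file. Because only files ending in .html are listed, the dynamic pages never enter the sitemap.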