not all files being indexed
« on: November 21, 2009, 09:40:45 PM »
I'm new to XML-Sitemap, and I'm having problems.  My site is hosted at pairNetworks on a FreeBSD UNIX server.

I had no difficulty installing the software, but it isn't indexing all my pages.  While it runs, I can see paths that end not with a file name but with what look like Apache queries, such as: 


In the site map, these turn into

Index of /path/
Index of /path/

as you can see on the site map:
[ External links are visible to forum administrators only ]

Why is this happening, and how can I fix it?

Re: not all files being indexed
« Reply #1 on: November 22, 2009, 04:47:55 AM »
OK, I see what's happening.  I leave the index.htm file out of many of my folders so that visitors can browse the directory listings directly.  Sitemap is adding

Index of /path/

lines for each of these open folders.  That must be why the links in these lines look like Apache folder queries.  I assume I could stop this behavior by adding an index.htm file to each of these folders.  I can do that for the /Census/ folders without a significant impact on visitors (there is a CensusHub file that lets them locate files), but I don't want to do it for the /FGS/ (family group sheet) folders, which make up the largest section of my site (11,000 of the 15,000 pages), even though there is an index file for these folders.  There are distinct advantages to being able to browse the files there. 

Is there some way we can get the sitemap generator to stop adding links to the directory listings of open folders?
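
One possible workaround, sketched here under assumptions, is to filter the generated URL list after the crawl. Apache's mod_autoindex adds column-sort links to its listings of the form ?C=N;O=D (C is the column, O is the order), and those are likely what the "Apache queries" are. The function names below are hypothetical and are not part of the sitemap generator, and example.com is a placeholder host.

```python
import re
from urllib.parse import urlsplit

# mod_autoindex sort links carry a query such as "C=N;O=D":
# C = column (N name, M modified, S size, D description), O = order (A or D).
_SORT_QUERY = re.compile(r"^C=[NMSD][;&]O=[AD]$")


def is_autoindex_sort_link(url: str) -> bool:
    """Return True if the URL looks like a directory-listing sort link."""
    parts = urlsplit(url)
    return parts.path.endswith("/") and bool(_SORT_QUERY.match(parts.query))


def drop_sort_links(urls):
    """Keep every URL except the directory-listing sort links."""
    return [u for u in urls if not is_autoindex_sort_link(u)]


if __name__ == "__main__":
    crawled = [
        "http://example.com/FGS/?C=N;O=D",   # sort link, dropped
        "http://example.com/FGS/smith.htm",  # ordinary page, kept
    ]
    for url in drop_sort_links(crawled):
        print(url)
```

This only removes the sort-link variants. The folders stay browsable for visitors, because nothing changes on the server.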

Re: not all files being indexed
« Reply #2 on: November 22, 2009, 08:25:46 AM »
OK, that was what was happening.  Because the /Census/ and /FGS/ subfolders had no index files in them, Sitemap was outputting links to the folder directories, in addition to listing the files they contained.  I added index files to the /Census/ folders, and that problem disappeared.  However, I need another solution for the /FGS/ folders (the bulk of my web site) because I want them to remain accessible for browsing by my visitors.
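
For folders that must stay open, one way to spot the listing pages themselves is by their title, since Apache's generated listings are titled "Index of /path", which matches the entries seen in the site map. The check below is only a sketch: the function name is hypothetical, and it assumes the page HTML is already available as a string.

```python
import re

# Apache's generated listings use a title that starts with "Index of /".
_LISTING_TITLE = re.compile(r"<title>\s*Index of /", re.IGNORECASE)


def looks_like_directory_listing(html: str) -> bool:
    """Return True if the page title matches an Apache directory listing."""
    return bool(_LISTING_TITLE.search(html))


if __name__ == "__main__":
    listing = "<html><head><title>Index of /FGS</title></head><body></body></html>"
    page = "<html><head><title>Smith Family Group Sheet</title></head></html>"
    print(looks_like_directory_listing(listing))
    print(looks_like_directory_listing(page))
```

Pages flagged this way could be left out of the site map while the files they list are still included.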

Other problems have cropped up, or at least I'm noticing them now.  I will start a new thread for each, as the cause and solution, if there is one, will likely be different.