New User Questions
« on: July 19, 2008, 01:51:43 AM »
Hi, I just bought the script and installed it on my server.

I've got 8,000+ pages.  The script found about a dozen of them. 

The processing didn't report any errors.   Everything seemed to work,
but hardly any pages were found.

Does this thing examine the directory structure, or actually visit the
home page and follow the links?

First impression: promising, but way too complicated.  Would
have paid more for less.

Thanks.
Re: New User Questions
« Reply #1 on: July 19, 2008, 02:38:14 PM »
Hmm.  It looks like this script crawls the site like any bot would.    Why?

I can understand why your free online service has to crawl the site like a bot, because that's the only way it can find and record the pages of a remote site.

But the paid script we install on our servers could just read the directory structure in a few seconds, right?   It seems this would be very reliable and fast.

What am I not understanding?   Thanks.
Re: New User Questions
« Reply #2 on: July 21, 2008, 01:30:28 AM »
Hello,

The script crawls the site (which is normally faster when the script is installed on the same server where the site is located), since most websites nowadays are no longer static, and you cannot discover the URLs by reading the files in a directory. For example, a single product.php file may serve thousands of URLs such as product.php?id=123, none of which exist as files on disk. The pages must be opened the way a normal visitor would open them for everything to be indexed correctly.
Re: New User Questions
« Reply #3 on: July 21, 2008, 04:20:13 PM »
Thanks for your reply.

The problem with crawling like a normal visitor is that you miss any page
which cannot be reached by clickable links.   On our site, your script finds
9 out of 8,000+ pages.

Our site navigation is by JavaScript drop-down menus
and a search function.   It's easy for visitors to navigate.   Creating clickable
links to all 8,000+ pages would itself require a sitemap.   In other words,
I can't create a sitemap with your script until I already have a sitemap
for it to crawl.

Can you point me to the part of your sales pages that explains this limitation?
If yes, OK: my fault for not reading carefully enough, and I'll eat the $20.
If not, how about a refund?   That would be an agreeable resolution too.

For your info, I wrote a Perl script in about an hour that reads our directory
structure and creates a list of URLs.   It's not a real XML map yet; that will
take a bit longer.   This script finds the 8,000 files and creates the list
in about 1-2 seconds.  You might consider coding this option for those
reporting all the crawling problems.
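
For reference, a stripped-down sketch of that approach (the docroot path, base URL, and .html-only filter below are placeholders, not our real setup) would look something like this:

Code:
#!/usr/bin/perl
# Walk the document root and print a minimal sitemap listing
# every static HTML file found. Placeholder paths; adjust before use.
use strict;
use warnings;
use File::Find;

my $docroot = '/var/www/html';          # placeholder document root
my $base    = 'http://www.example.com'; # placeholder base URL

my @urls;
find(sub {
    return unless -f $_ && /\.html?$/i;   # plain .htm/.html files only
    my $path = $File::Find::name;
    $path =~ s/^\Q$docroot\E//;           # strip docroot, keep the URL path
    push @urls, "$base$path";
}, $docroot);

# Emit the list in the standard sitemap.org 0.9 XML format.
# (Real URLs containing &, <, etc. would need XML escaping.)
print qq{<?xml version="1.0" encoding="UTF-8"?>\n};
print qq{<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n};
print "  <url><loc>$_</loc></url>\n" for @urls;
print "</urlset>\n";

Of course this only works when the pages are real files on disk, which is exactly the static case we're discussing.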

Anyway, if you would agree to issue a refund, I'll consider it a learning
experience, and wish you well with your project. 

Your reply, or a refund, is appreciated.  Thanks.

Re: New User Questions
« Reply #4 on: July 22, 2008, 04:43:56 PM »
Hello,

In most cases sites are dynamic (backed by databases, etc.), so you cannot just scan the directory to make a sitemap (which would otherwise be easy to do). Sorry to hear that it doesn't quite suit your needs.

The system specs are described in documentation: https://www.xml-sitemaps.com/documentation-xml-sitemap-generator.html#sysreq
Quote
The sitemap generator connects to your website via HTTP port 80, so your host should allow local network connections for PHP scripts.
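
If you want to check the network side of that requirement, a quick test like the one below tells you whether localhost accepts connections on port 80. Note this is only a connectivity check; it does not test PHP-specific settings such as allow_url_fopen.

Code:
#!/usr/bin/perl
# Connectivity check: can this server open a TCP connection
# to its own web server on port 80?
use strict;
use warnings;
use IO::Socket::INET;

my $sock = IO::Socket::INET->new(
    PeerAddr => 'localhost',
    PeerPort => 80,
    Timeout  => 5,
);
print $sock ? "port 80 is reachable locally\n"
            : "cannot connect to localhost:80: $@\n";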
Re: New User Questions
« Reply #5 on: July 22, 2008, 08:29:43 PM »
Thanks for your reply.   

OK, I understand what you mean about databases.   Still, most sites are built of pages,
not databases, so a directory-read option would likely be helpful for some of these folks reporting crawl problems.

Anyway, what your docs don't make clear is that all documents must be reachable via clickable links, or your script can't find them.   Thus, it would seem your script is not appropriate for alternate navigation systems such as Flash, JavaScript menus, database search, etc.

I'm just suggesting it might be wise to make this limitation clear up front, before the purchase.  You could do so in half the time you are spending on this thread.

As it stands, you are building a list of folks like me.   You have my $20; I have nothing.  Not a great public relations strategy, really.

I realize you're not trying to rip anybody off.   But you need to either explain your software a bit better or lighten up on the refund issue.

Hope that's helpful.