XML Sitemaps Generator

Sitemap Generator Forum
Topic: Script doesn't find all links  (Read 7945 times)
web6
Registered Customer
Newbie
*
Posts: 4
« on: February 20, 2007, 01:24:58 PM »

Hi admin,
Some time ago we bought your Unlimited Sitemap Generator. We run it from the command line and it successfully generates the XML file, but the file doesn't contain all links. Our site ([external links are visible to admins only]) has a lot of links that start with "player.php?uiqurl=....". These links are NOT included in the html file. Could you help me with this problem?
David
xml-sitemaps
Administrator
Full Member
*****
Posts: 76
« Reply #1 on: February 20, 2007, 06:03:06 PM »

Are these links crawlable? To check, you can enter your site's URL in our search engine robot simulator at http://www.xml-sitemaps.com/se-bot-simulator.html and see what is picked up.
web6
Registered Customer
Newbie
*
Posts: 4
« Reply #2 on: February 21, 2007, 09:33:41 AM »

Thanks for your reply.
When I ran some tests with your search engine robot simulator, it showed that our links are crawlable. So why are these links not followed by your crawler?

For example, see the result for the following page: http://www.xml-sitemaps.com/se-bot-simulator.html?op=se-bot-simulator&go=1&pageurl=http%3A%2F%2Fn-joy.cz%2Fkategorie%2Fn-videa%2F&se=googlebot&submit=Start


xml-sitemaps
Administrator
Full Member
*****
Posts: 76
« Reply #3 on: February 21, 2007, 05:22:54 PM »

Hi,

I checked a couple of the URLs on that page, such as http://n-joy.cz/user/Austin/ and http://n-joy.cz/?uiqurl=EZBdiZbcxgAAeoxw&action=add, and found that both were redirected to another page.

Our script won't follow redirects, which is why it is not picking up these pages.
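
For anyone who wants to check their own URLs for this, here is a minimal Python sketch that requests a URL without following redirects and reports where it points. It is not part of the sitemap generator; the function names `fetch_without_following` and `redirect_target` are made up for this illustration.

```python
import http.client
from urllib.parse import urlsplit


def fetch_without_following(url, timeout=10):
    """Request `url` once and return (status, headers) without following redirects.

    http.client never follows redirects by itself, so a 301/302 is returned as-is.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, dict(resp.getheaders())
    finally:
        conn.close()


def redirect_target(status, headers):
    """Return the Location value if the response is a 3xx redirect, else None."""
    if 300 <= status < 400:
        # Header names are case-insensitive, so normalise them before the lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        return lowered.get("location")
    return None
```

Calling `redirect_target(*fetch_without_following(url))` on one of the URLs above returns the redirect destination when the server answers with a 3xx status, and `None` when the page is served directly.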
web6
Registered Customer
Newbie
*
Posts: 4
« Reply #4 on: February 27, 2007, 08:16:33 AM »

Hello admin,
thanks for your reply.

There may be some misunderstanding. I understand that your robot doesn't follow redirection to another page. But we don't want to index the page after redirection; we would like to index the links themselves (especially links with player.php?.....)!
I cannot see why it's not possible when Google does it.
See this example: [external links are visible to admins only]

Thank you for understanding,
David Heger
admin
Administrator
Hero Member
*****
Posts: 2956
« Reply #5 on: February 27, 2007, 11:24:02 PM »

Hello,

you can add this to the "Do not parse URLs" option:
player.php

In this case the player links will still be included in the sitemap.
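
To illustrate the behaviour described in this thread, here is a rough Python model of a crawler with a "Do not parse" list. It is only a sketch, not the generator's actual code. It assumes that patterns are matched as plain substrings and that a redirected page is simply unreadable to the crawler; the site contents and `uiqurl` values are invented.

```python
from collections import deque


def crawl(start, pages, do_not_parse=()):
    """Breadth-first crawl over an in-memory site.

    `pages` maps a URL to the links found on it. A URL missing from `pages`
    stands for a page the crawler cannot read (for example, one that redirects).
    URLs matching a `do_not_parse` pattern are listed without being fetched.
    """
    sitemap, seen, queue = [], {start}, deque([start])
    while queue:
        url = queue.popleft()
        skip_parse = any(pattern in url for pattern in do_not_parse)
        if not skip_parse and url not in pages:
            continue  # unreadable page: neither listed nor parsed
        sitemap.append(url)
        if skip_parse:
            continue  # listed in the sitemap, but its links are not extracted
        for link in pages[url]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return sitemap


# Hypothetical site: the player.php URLs redirect, so they have no entry here.
site = {
    "/": ["/kategorie/", "/player.php?uiqurl=abc"],
    "/kategorie/": ["/player.php?uiqurl=def"],
}

without_option = crawl("/", site)
with_option = crawl("/", site, do_not_parse=["player.php"])
```

In this model `without_option` contains only `/` and `/kategorie/`, while `with_option` also contains both `player.php` links, which matches what web6 reports after changing the setting.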

web6
Registered Customer
Newbie
*
Posts: 4
« Reply #6 on: March 01, 2007, 09:36:13 AM »

OK, it helped,
thank you.
easterro
Registered Customer
Newbie
*
Posts: 1
« Reply #7 on: March 08, 2007, 05:40:57 PM »

I believe I am having the same problem, with redirection causing certain pages not to be included in the sitemap. Unfortunately, I don't have a single file (like player.php in web6's case) that I can place in the "Do not parse URLs" option; each page has a different file name. What's more, new pages will continue to be added.

Any thoughts on how I can get these pages into the site map?
admin
Administrator
Hero Member
*****
Posts: 2956
« Reply #8 on: March 08, 2007, 11:46:24 PM »

Hello,

you should find a common "pattern" in your redirecting URLs so that you can define it in the "Do not parse" exclusion list, or add new entries whenever new URLs are used for redirection.
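
One possible way to look for such a pattern is to take the longest directory prefix shared by the redirecting URLs. The Python sketch below does this; the helper name `common_path_prefix` and the example.com URLs are made up, and it assumes the "Do not parse" list accepts a path fragment as a substring pattern.

```python
import os
from urllib.parse import urlsplit


def common_path_prefix(urls):
    """Return the longest directory prefix shared by the paths of `urls`."""
    paths = [urlsplit(u).path for u in urls]
    # commonprefix compares character by character, so the result may end mid-name.
    prefix = os.path.commonprefix(paths)
    # Trim back to the last "/" so the pattern ends on a directory boundary.
    return prefix[: prefix.rfind("/") + 1]


# Hypothetical redirecting URLs that all live under one directory.
redirecting = [
    "http://example.com/go/alpha.php",
    "http://example.com/go/beta.php?id=2",
    "http://example.com/go/archive/gamma.php",
]
pattern = common_path_prefix(redirecting)
```

Here `pattern` is `/go/`. If the result is just `/`, the URLs share no useful prefix, and entering it would match every page on the site, so individual entries would be needed instead.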
