
Script doesn't find all links

Started by web6, February 20, 2007, 01:24:58 PM


web6

Hi admin,
Some time ago we bought your Unlimited Sitemap Generator. We run it from the command line and it successfully generates the XML file, but the file doesn't contain all links. Our site ([ External links are visible to forum administrators only ]) has a lot of links that start with "player.php?uiqurl=....". These links are NOT included in the HTML file. Could you help me with this problem?
David

xml-sitemaps

Are these links crawlable? To check, you can enter your site's URL in our search engine robot simulator at https://www.xml-sitemaps.com/se-bot-simulator.html to see what is picked up.
Philip Nicosia

web6

Thanks for your reply.
When I ran some tests with your search engine robot simulator, it showed that our links are crawlable. So why are these links not followed by your crawler?

For example, see the result for the following page: https://www.xml-sitemaps.com/se-bot-simulator.html?op=se-bot-simulator&go=1&pageurl=http%3A%2F%2Fn-joy.cz%2Fkategorie%2Fn-videa%2F&se=googlebot&submit=Start



xml-sitemaps

Hi,

I checked a couple of the URLs on that page, like [ External links are visible to forum administrators only ] and [ External links are visible to forum administrators only ], and found that both were redirected to another page.

Our script won't follow redirects, which is why it is not picking up these pages.
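
If you want to check this for your own URLs, a minimal Python sketch along these lines can help. It is not part of our script; the names NoRedirect and redirect_target are made up for this illustration, and it uses only the Python standard library. It requests a URL without following redirects and reports where a 3xx response points.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect (illustrative name)."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow the redirect;
        # the 3xx response then surfaces as an HTTPError.
        return None


def redirect_target(url, timeout=10):
    """Return the Location header if `url` answers with a 3xx status, else None."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout):
            return None  # a normal 2xx answer, so no redirect
    except urllib.error.HTTPError as err:
        if 300 <= err.code < 400:
            return err.headers.get("Location")
        raise  # 4xx/5xx errors are not redirects, so pass them on
```

Calling redirect_target() on one of the player.php links should return the redirect destination if the link is redirected, and None if the page is served directly.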
Philip Nicosia

web6

Hello admin,
thanks for your reply.

There may be some misunderstanding. I understand that your robot doesn't follow a redirect to another page. But we don't want to index the page after the redirect; we would like to index the links on the page itself (especially the links with player.php?.....)!
I cannot see why that's not possible when Google does it.
See this example: [ External links are visible to forum administrators only ]

Thank you for understanding,
David Heger

XML-Sitemaps Support

Hello,

You can add this to the "Do not parse URLs" option:
player.php

In this case the player links will still be included in the sitemap.

web6


easterro

I believe I am having the same problem, with redirection causing certain pages not to be included in the sitemap. Unfortunately, I don't have a single file (like player.php in web6's case) that I can place in the "Do not parse URLs" option; each page has a different file name. What's more, new pages will continue to be added.

Any thoughts on how I can get these pages into the site map?

XML-Sitemaps Support

Hello,

You should pick a common "pattern" in your redirect URLs so that you can define it in the "Do not parse" exclusion list, or add new entries whenever new URLs are used for redirection.
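
To illustrate the idea of a common pattern, here is a rough Python sketch. It assumes the list is matched as plain substrings of the URL, which is an assumption made for this example and not a description of the generator's internals. The function name handling and the example.com URLs are hypothetical.

```python
def handling(url, do_not_parse):
    """Decide how a crawler might treat `url`, assuming substring matching.

    A URL that contains any "do not parse" entry is listed in the sitemap
    without being fetched; every other URL is fetched and scanned for links.
    """
    if any(pattern in url for pattern in do_not_parse):
        return "include without fetching"
    return "fetch and parse"


# Hypothetical entries: one script name, one shared path prefix.
patterns = ["player.php", "/go/"]

print(handling("http://example.com/player.php?uiqurl=abc", patterns))
print(handling("http://example.com/go/new-page-17", patterns))
print(handling("http://example.com/kategorie/n-videa/", patterns))
```

With a shared prefix such as the hypothetical "/go/" above, new redirecting pages are covered automatically as long as they are published under the same path, so the list does not need a new entry for every page.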