I'm stuck on this one. I have used two tools:
1. First GSiteCrawler, a local Windows tool (good, but it needs Windows and uses VB, so forget it). I found it on this Google page: http://code.google.com/sm_thirdparty.html
2. Then an applet, which is really cool: http://www.auditmypc.com/free-sitemap-generator.asp (the tool itself is at http://www.auditmypc.com/xml-sitemap.asp). The tool is not ASP, only the site is. I found this one by myself.
I have generated a sitemap.xml file, checked it in Eclipse using Oxygen and http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd. (I used this URLset <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
It is compatible with Google, Yahoo and MSN (see http://www.sitemaps.org)
So all went smoothly until I tried to submit it in the Google webmaster tools (https://www.google.com/webmasters/tools/). I had to submit it with control/main/ added after the real site domain name (e.g. http://www.mysite.com/control/main/sitemap.xml) because, even though the file is in the OFBiz root, this is needed when using Apache and the Tomcat connector (AJP1.3), at least in my case. Then I got stuck: Google keeps saying that my file is not in the right format (I tried many ways).
Does anybody have an idea why Google does not want my file (which is correct for sure)? Could it be an issue related to VirtualHost and JkMount? I have a really simple one:
JkMount / ofbizServer
JkMount /control ofbizServer
JkMount /control/* ofbizServer
- list of the banned JLR 5/5/7 (did not work not sure why => use known IP addresses)
#SetEnvIfNoCase User-Agent "^TMCrawler" banned
- deny them access
#deny from env=banned
deny from 128.241.20. #tmcrawler bot
I suspect a problem there because I was not able to validate my sitemap.xml with third-party sitemap validators (but the errors were incomprehensible).
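One way to narrow this down is to look at what is actually served at the submitted URL, rather than at the file on disk. Below is a minimal sketch in Python, not a full schema validation: it only checks that the bytes are well-formed XML, that the root element is urlset in the sitemaps.org 0.9 namespace, and that every url entry has a non-empty loc. The function names check_sitemap and fetch_and_check are made up for this illustration, and the idea that the server might return something other than the file (an HTML page, for instance) is only an assumption to test.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Namespace taken from the urlset declaration used in the sitemap.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_bytes):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    for i, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> #{i} has no <loc>")
    return problems


def fetch_and_check(url):
    """Download the sitemap as a crawler would and run the basic checks on it."""
    with urlopen(url) as resp:
        content_type = resp.headers.get("Content-Type")
        return content_type, check_sitemap(resp.read())
```

Calling fetch_and_check("http://www.mysite.com/control/main/sitemap.xml") returns the Content-Type header together with the list of problems. If the root element comes back as html, the request is being answered by something other than the static file.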
Following the sitemaps.org advice, I have also added a line to the robots.txt file (in the OFBiz root), hoping it will work (fingers crossed).
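For reference, the sitemaps.org protocol describes that line as "Sitemap:" followed by the full URL of the sitemap. The sketch below (Python 3.8 or later, standard library only) shows how such a line is read back by a robots.txt parser. The robots.txt content is an assumed example built from the placeholder URL above, not a copy of the real file, and sitemap_urls is a name made up for this illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Assumed example content, using the placeholder domain from this post.
example = """\
User-agent: *
Disallow:

Sitemap: http://www.mysite.com/control/main/sitemap.xml
"""

print(sitemap_urls(example))
```

An empty list here would mean the Sitemap line is missing or malformed, for example if it were written without the colon.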