NUTCH-2467: Sitemap type field can be null

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.13
    • Fix Version/s: 1.15
    • Component/s: None
    • Labels: None
    • Patch Available

    Description

      sitemap.isIndex() can return null for real sitemap indices, so their contents won't be added to the CrawlDB. For example, the indices that https://www.reisenco.nl/sitemap_index.xml points to are not processed.
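
The failure mode described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: the `Kind` enum, the `Sitemap` class, and both check methods are stand-ins invented for illustration, not the Nutch or crawler-commons classes, and the defensive variant is not necessarily what the attached patch does. It only shows how a type field left unset can cause a genuine index to be skipped, and one possible way to fall back on the parsed structure.

```java
import java.util.List;

public class SitemapTypeSketch {
    // Hypothetical stand-in for a sitemap type; null models "type was never set".
    enum Kind { INDEX, URLSET }

    // Hypothetical minimal sitemap model, not the real Nutch/crawler-commons class.
    static final class Sitemap {
        final Kind kind;              // may be null
        final List<String> children;  // child sitemap URLs; empty for a plain URL set

        Sitemap(Kind kind, List<String> children) {
            this.kind = kind;
            this.children = children;
        }
    }

    // Naive check: a null kind is silently treated as "not an index",
    // so the child sitemaps would never be followed.
    static boolean isIndexNaive(Sitemap s) {
        return s.kind == Kind.INDEX;
    }

    // Defensive check (an assumption, not the actual fix): when the kind is
    // unknown, decide from the structure that was actually parsed.
    static boolean isIndexDefensive(Sitemap s) {
        if (s.kind != null) {
            return s.kind == Kind.INDEX;
        }
        return !s.children.isEmpty();
    }

    public static void main(String[] args) {
        // A real index whose type field was never populated.
        Sitemap untyped = new Sitemap(null, List.of("https://example.org/sitemap-1.xml"));
        System.out.println(isIndexNaive(untyped));      // prints false
        System.out.println(isIndexDefensive(untyped));  // prints true
    }
}
```

With the naive check, the untyped index is reported as a non-index and its children are dropped; the defensive variant still recognises it because child sitemaps were parsed.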

      Attachments

        1. NUTCH-2467.patch
          0.6 kB
          Markus Jelsma

            People

              Assignee: Markus Jelsma (markus17)
              Reporter: Markus Jelsma (markus17)
              Votes: 0
              Watchers: 2
