  1. Nutch
  2. NUTCH-1917

index.parse.md, index.content.md and index.db.md should support wildcard

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.9
    • Fix Version/s: 1.21
    • Component/s: indexer
    • Labels: None

    Description

      Right now metatags.names supports the '*' character as a catch-all.
      I believe the index properties named in the title (index.parse.md, index.content.md and index.db.md) should also support a catch-all, as a mechanism for quickly building augmented data models from crawl data. Identifying tags individually and including them by hand, one by one, is error-prone and time-consuming.
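
      To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is not the attached patch and does not use any Nutch classes: WildcardFieldSelector and select are hypothetical names, and it assumes each property holds a comma-separated list of metadata keys in which a lone '*' means "take everything".

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not part of Nutch): resolves a configured key list,
 * such as the value of index.parse.md, against a document's metadata.
 */
final class WildcardFieldSelector {

  private WildcardFieldSelector() {}

  /**
   * @param configured comma-separated keys, e.g. "description,keywords", or "*"
   * @param metadata   metadata available for the document (key to value)
   * @return the entries that would be indexed, in insertion order
   */
  static Map<String, String> select(String configured, Map<String, String> metadata) {
    Map<String, String> selected = new LinkedHashMap<>();
    if (configured == null || configured.trim().isEmpty()) {
      return selected; // nothing configured, nothing indexed
    }

    // Split the property value and drop empty entries and stray whitespace.
    Set<String> wanted = new LinkedHashSet<>();
    for (String key : configured.split(",")) {
      String trimmed = key.trim();
      if (!trimmed.isEmpty()) {
        wanted.add(trimmed);
      }
    }

    // Assumption: '*' anywhere in the list acts as a catch-all.
    if (wanted.contains("*")) {
      selected.putAll(metadata);
      return selected;
    }

    // Otherwise keep only the explicitly named keys that are present.
    for (String key : wanted) {
      String value = metadata.get(key);
      if (value != null) {
        selected.put(key, value);
      }
    }
    return selected;
  }
}
```

      With this, a value of "*" would index every available key, while "description, keywords" would keep today's explicit behaviour.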

      Attachments

        1. MetadataIndexer.java.patch
          0.9 kB
          David Johnson

        Activity

          People

            Assignee: Unassigned
            Reporter: Lewis John McGibbney (lewismc)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: