NUTCH-1561: improve usability of parse-metatags and index-metadata

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.6
    • Fix Version/s: 1.9
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

      Description

      Usually, the plugins parse-metatags and index-metadata are used in combination: the former "extracts" meta tags, the latter adds the extracted tags as fields to the index.

      The two plugins are configured differently, which causes pitfalls and reduces usability (see the example config):

      • the property "metatags.names" of parse-metatags uses ';' as the separator, whereas index-metadata uses ','
      • meta tag names have to be lowercased in index-metadata (and, as in the example, carry the "metatag." prefix)
      <property>
        <name>metatags.names</name>
        <value>DC.creator;DCTERMS.bibliographicCitation</value>
      </property>
      
      <property>
        <name>index.parse.md</name>
        <value>metatag.dc.creator,metatag.dcterms.bibliographiccitation</value>
      </property>
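
      Until the two formats are aligned, the second value can be derived mechanically from the first. The following is only a minimal sketch of that mapping, not part of any attached patch: the helper name index_fields_from_metatags is hypothetical, and the "metatag." prefix and the lowercasing are assumptions read off the example config above.

```python
def index_fields_from_metatags(metatags_names: str, prefix: str = "metatag.") -> str:
    """Derive an index.parse.md value from a metatags.names value.

    Assumptions (taken from the example config, not from the plugin source):
    - metatags.names is ';'-separated
    - index.parse.md is ','-separated, lowercased, and prefixed with "metatag."
    """
    # Split on ';' and drop empty entries and surrounding whitespace.
    names = [n.strip() for n in metatags_names.split(";") if n.strip()]
    # Lowercase each name, add the prefix, and join with ','.
    return ",".join(prefix + n.lower() for n in names)


if __name__ == "__main__":
    print(index_fields_from_metatags("DC.creator;DCTERMS.bibliographicCitation"))
```

      Run against the value of metatags.names shown above, this reproduces the index.parse.md value from the example, which illustrates how much of the second property is redundant.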
      

        Attachments

        1. NUTCH-1561-v1.patch
          1 kB
          kiran
        2. NUTCH-1561-trunk-v2.patch
          7 kB
          Sebastian Nagel
        3. NUTCH-1561-trunk-v3.patch
          14 kB
          Sebastian Nagel
        4. NUTCH-1561-trunk-v4.patch
          14 kB
          Sebastian Nagel

            People

            • Assignee: Sebastian Nagel (wastl-nagel)
            • Reporter: Sebastian Nagel (wastl-nagel)