  1. Nutch
  2. NUTCH-140

Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8
    • Component/s: fetcher
    • Labels: None
    • Environment: Power Mac OS X 10.4, dual-processor G5 2.0 GHz, 1.5 GB RAM, although the bug is independent of environment

    Description

      Jerome and I have been discussing how to address the issue raised by Stefan G.: the parse-plugins.xml file maps each mimeType to a list of pluginIds rather than to a list of extensionIds. We've come up with the following proposed update, which should fix this problem.

      We propose to have the concept of "aliases" in the parse-plugins.xml file, defined at the end of the file, something like:

      <parse-plugins>
        ....

        <mimeType name="text/html">
          <plugin id="parse-html"/>
        </mimeType>

        .....

        <aliases>
          <alias name="parse-html" extension-point="org.apache.nutch.parse.html.HtmlParser"/>

          ....
          <alias name="parse-html2" extension-point="my.other.html.Parser"/>

          ....
        </aliases>
      </parse-plugins>

      What do you guys think? This approach would be flexible enough to allow the mapping of extensionIds to mimeTypes, but without impacting the current "pluginId" concept.

      Comments welcome.
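
      To make the proposal concrete, here is a minimal sketch of how a reader of parse-plugins.xml might resolve a mimeType to extension ids through the proposed <aliases> section. It is not the attached patch: the class name AliasResolutionSketch, the resolve method, and the SAMPLE string (the snippet above with the elided parts left out) are hypothetical, and falling back to the plugin id when no alias is declared is an assumption about how the existing "pluginId" behavior could be preserved.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not part of Nutch or of the attached patch.
final class AliasResolutionSketch {

    // The proposed layout from the description, without the elided sections.
    static final String SAMPLE =
        "<parse-plugins>"
      + "<mimeType name=\"text/html\"><plugin id=\"parse-html\"/></mimeType>"
      + "<aliases>"
      + "<alias name=\"parse-html\""
      + " extension-point=\"org.apache.nutch.parse.html.HtmlParser\"/>"
      + "<alias name=\"parse-html2\" extension-point=\"my.other.html.Parser\"/>"
      + "</aliases>"
      + "</parse-plugins>";

    // Returns the extension ids configured for mimeType, in file order.
    static List<String> resolve(String xml, String mimeType) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Build the alias table: plugin id -> extension id.
        Map<String, String> aliases = new HashMap<>();
        NodeList aliasNodes = doc.getElementsByTagName("alias");
        for (int i = 0; i < aliasNodes.getLength(); i++) {
            Element alias = (Element) aliasNodes.item(i);
            aliases.put(alias.getAttribute("name"),
                        alias.getAttribute("extension-point"));
        }

        // Walk the <plugin> children of the matching <mimeType> element.
        List<String> extensionIds = new ArrayList<>();
        NodeList mimeNodes = doc.getElementsByTagName("mimeType");
        for (int i = 0; i < mimeNodes.getLength(); i++) {
            Element mime = (Element) mimeNodes.item(i);
            if (!mimeType.equals(mime.getAttribute("name"))) {
                continue;
            }
            NodeList plugins = mime.getElementsByTagName("plugin");
            for (int j = 0; j < plugins.getLength(); j++) {
                String pluginId = ((Element) plugins.item(j)).getAttribute("id");
                // Assumption: with no alias declared, keep the plugin id itself.
                extensionIds.add(aliases.getOrDefault(pluginId, pluginId));
            }
        }
        return extensionIds;
    }

    public static void main(String[] args) throws Exception {
        // prints [org.apache.nutch.parse.html.HtmlParser]
        System.out.println(resolve(SAMPLE, "text/html"));
    }
}
```

      In this sketch the <mimeType> entries keep referring to plugin ids, and only the alias table knows about extension ids, which is the separation the proposal is after.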

      Attachments

        1. NUTCH-140.20051502.patch.txt
          25 kB
          Chris A. Mattmann

          People

            Assignee: Chris A. Mattmann (chrismattmann)
            Reporter: Chris A. Mattmann (chrismattmann)
            Votes: 0
            Watchers: 0