NUTCH-2792: nutch index -params is only used in Solr indexer

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.17
    • Fix Version/s: 1.21
    • Component/s: indexer
    • Labels: None

    Description

      `nutch index` help displays:

       General options:
      ...
       -params k1=v1&k2=v2... parameters passed to indexer plugins
       (via property indexer.additional.params)

      The option has no effect when used with the CSV or dummy indexers. Looking at the code, the property is defined in:

      https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/indexer/IndexerMapReduce.java#L78

      which is only used in:

      https://github.com/apache/nutch/blob/master/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java#L141

      Several possibilities:

      • Drop the parameter from the help. This does not break backward compatibility.
      • Move the -params handling into IndexWriters.java and add the parameters to the IndexWriterParams of every indexer. Not too impactful, but not super clean either: the parameters are not "namespaced" per indexer, so parameter collisions are possible if someone uses multiple indexers.
      • Refactor the way these parameters are passed so that they are prefixed with the target indexer. This would break backward compatibility. In that case, it would be good to change the format completely: turn -params into -param, allow multiple values to be passed, and drop the '=/&' syntax (which does not handle escaping anyway).
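
To make the last option more concrete, here is a minimal, self-contained sketch. It is not Nutch code: the class `ParamSketch`, both method names, the `<indexer>.<key>=<value>` shape, and the `csv.path` / `dummy.path` keys are all hypothetical and only illustrate the idea. `parseLegacy` assumes the current string is split naively on '&' and '=', which is enough to show why a value containing '&' cannot be passed. `parseNamespaced` assumes one `-param` value per key and still uses the first '=' as the key/value separator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParamSketch {

  // Assumption: naive parsing of the legacy "k1=v1&k2=v2" string.
  // Splitting on '&' first means a value can never contain '&'.
  static Map<String, String> parseLegacy(String params) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String pair : params.split("&")) {
      int eq = pair.indexOf('=');
      if (eq <= 0) {
        continue; // fragment without a key, silently dropped
      }
      result.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
    return result;
  }

  // Hypothetical format: each -param value is "<indexer>.<key>=<value>".
  // The first '.' ends the indexer prefix, the first '=' ends the key,
  // and everything after that '=' is kept verbatim as the value.
  static Map<String, Map<String, String>> parseNamespaced(List<String> values) {
    Map<String, Map<String, String>> byIndexer = new LinkedHashMap<>();
    for (String value : values) {
      int dot = value.indexOf('.');
      int eq = value.indexOf('=');
      if (dot <= 0 || eq < 0 || dot > eq || dot + 1 == eq) {
        throw new IllegalArgumentException(
            "Expected <indexer>.<key>=<value>, got: " + value);
      }
      String indexer = value.substring(0, dot);
      String key = value.substring(dot + 1, eq);
      byIndexer
          .computeIfAbsent(indexer, k -> new LinkedHashMap<>())
          .put(key, value.substring(eq + 1));
    }
    return byIndexer;
  }

  public static void main(String[] args) {
    // The intended value "a&b" is truncated to "a".
    System.out.println(parseLegacy("fq=a&b&fl=id"));
    // prints {fq=a, fl=id}

    // The same key can be given to two indexers without colliding.
    System.out.println(parseNamespaced(
        List.of("csv.path=/tmp/out.csv", "dummy.path=/tmp/out.txt")));
    // prints {csv={path=/tmp/out.csv}, dummy={path=/tmp/out.txt}}
  }
}
```

With repeated values there is no need for '&' as a separator at all, so a value such as `a&b=c` would survive intact; only the indexer prefix and key remain restricted.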

      I am not sure how widely this parameter is used. I would have used it to configure the output path for indexer-csv or indexer-dummy.

          People

            Assignee: Unassigned
            Reporter: Patrick Mézard (pmezard)