Nutch / NUTCH-2670

org.apache.nutch.indexer.IndexerMapReduce does not read the value of "indexer.delete" from nutch-site.xml

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.14, 1.15
    • Fix Version/s: None
    • Component/s: indexer
    • Labels: None
    • Environment: macOS Mojave and High Sierra
      MacBook Pro (Retina, 13-inch, Mid 2014)
      Oracle Java 1.8.0_144-b01 and previous versions

    Description

      In org.apache.nutch.indexer.IndexerMapReduce.IndexerReducer, the setup() method should read the value of "indexer.delete" from nutch-site.xml and assign it to the variable "delete". See the following line of code:
      (line 201) delete = conf.getBoolean(INDEXER_DELETE, false);

      However, the value of "indexer.delete" set in nutch-site.xml or nutch-default.xml is not assigned to the variable "delete". I added the following property to nutch-site.xml alone, to nutch-default.xml alone, and to both files. In every case, "delete" remains false.

      <property>
      <name>indexer.delete</name>
      <value>true</value>
      <description>Whether the indexer will delete documents GONE or REDIRECTS by indexing filters
      </description>
      </property>

      I also changed the default value in that line of code to true:
      delete = conf.getBoolean(INDEXER_DELETE, true);

      Regardless of the value of "indexer.delete" set in nutch-site.xml or nutch-default.xml, the value of "delete" remains false.
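The last observation can be narrowed down with a small experiment. Hadoop's Configuration.getBoolean falls back to its default only when the key is missing or its value is neither "true" nor "false". So a false result with a default of true would suggest that "indexer.delete" was explicitly set to false by the time setup() ran, or that the modified class was not the one being executed. The following is a minimal sketch of that fallback rule only. It uses java.util.Properties as a stand-in for the Hadoop configuration, and the class IndexerDeleteLookup and its getBoolean helper are hypothetical names for illustration, not Nutch or Hadoop code.

```java
import java.util.Properties;

// Hypothetical stand-in that mirrors the fallback rule of a boolean
// configuration lookup; it is not the Nutch or Hadoop implementation.
class IndexerDeleteLookup {
    static final String INDEXER_DELETE = "indexer.delete";

    // Returns the parsed value when the key holds "true" or "false"
    // (case-insensitive, trimmed); otherwise returns defaultValue.
    static boolean getBoolean(Properties conf, String key, boolean defaultValue) {
        String raw = conf.getProperty(key);
        if (raw == null) {
            return defaultValue; // key missing: the default applies
        }
        String value = raw.trim();
        if (value.equalsIgnoreCase("true")) {
            return true;
        }
        if (value.equalsIgnoreCase("false")) {
            return false; // explicit value wins over the default
        }
        return defaultValue; // unparsable value: the default applies
    }

    public static void main(String[] args) {
        Properties unset = new Properties();

        Properties explicitFalse = new Properties();
        explicitFalse.setProperty(INDEXER_DELETE, "false");

        // Key missing: the default (true) is returned.
        System.out.println(getBoolean(unset, INDEXER_DELETE, true));
        // Key explicitly false: the default (true) is ignored.
        System.out.println(getBoolean(explicitFalse, INDEXER_DELETE, true));
    }
}
```

Under this rule, changing the default argument cannot alter the result once the key carries an explicit value, which matches the behaviour described above. Logging conf.get("indexer.delete") inside setup() would show which value the job actually received.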

    People

      Assignee: Unassigned
      Reporter: Junqiang Zhang (aquaticwater)
