Nutch / NUTCH-1293

IndexingFiltersChecker to store detected content type in crawldatum metadata

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      NUTCH-1259 is not implemented in the checker.

          Activity

          Markus Jelsma added a comment -

          Patch for 1.5.

          Julien Nioche added a comment -

          wrong patch?

          Markus Jelsma added a comment -

          Wrong patch indeed

          Julien Nioche added a comment -

          +1

          Markus Jelsma added a comment -

          Committed for 1.5 in rev. 1295614.

          Hudson added a comment -

          Integrated in nutch-trunk-maven #178 (See https://builds.apache.org/job/nutch-trunk-maven/178/)
          NUTCH-1293 IndexingFiltersChecker to store detected content type in crawldatum metadata (Revision 1295614)

          Result = SUCCESS
          markus :
          Files :

          • /nutch/trunk/CHANGES.txt
          • /nutch/trunk/src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
          Hudson added a comment -

          Integrated in Nutch-trunk #1774 (See https://builds.apache.org/job/Nutch-trunk/1774/)
          NUTCH-1293 IndexingFiltersChecker to store detected content type in crawldatum metadata (Revision 1295614)

          Result = SUCCESS
          markus : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295614
          Files :

          • /nutch/trunk/CHANGES.txt
          • /nutch/trunk/src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
          Sebastian Nagel added a comment -

          The content type should be added to metadata after the check for content == null.

          % nutch indexchecker file:/xxxx
          fetching: file:/xxxx
          org.apache.nutch.protocol.file.FileError: File Error: 404
             ...
          Exception in thread "main" java.lang.NullPointerException
             at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:71)
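
The ordering suggested in this comment can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual IndexingFiltersChecker code or the Nutch API: the class `FetchedContent`, the method `storeContentType`, and the key `CONTENT_TYPE_KEY` are hypothetical stand-ins introduced only for illustration. The point is that the null check on the fetched content comes first, and the content type is read and stored only afterwards, so a failed fetch (such as the 404 above) no longer triggers a NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

public class NullCheckOrderSketch {

    /** Hypothetical stand-in for fetched content; NOT the Nutch Content class. */
    static final class FetchedContent {
        private final String contentType;

        FetchedContent(String contentType) {
            this.contentType = contentType;
        }

        String getContentType() {
            return contentType;
        }
    }

    /** Hypothetical metadata key, chosen for illustration only. */
    static final String CONTENT_TYPE_KEY = "Content-Type";

    /**
     * Stores the content type in the metadata map, but only after
     * confirming that something was actually fetched.
     *
     * @return false if content is null (nothing stored), true otherwise
     */
    static boolean storeContentType(FetchedContent content, Map<String, String> metadata) {
        // Check for a failed fetch BEFORE touching the content object.
        if (content == null) {
            return false;
        }
        // Safe to dereference now: record the detected content type.
        metadata.put(CONTENT_TYPE_KEY, content.getContentType());
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();

        // Failed fetch: no exception, nothing stored.
        System.out.println(storeContentType(null, metadata));

        // Successful fetch: content type ends up in the metadata.
        System.out.println(storeContentType(new FetchedContent("text/html"), metadata));
        System.out.println(metadata);
    }
}
```

With the two steps in the opposite order, the first call in `main` would dereference `null`, which matches the kind of failure shown in the stack trace above.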
          
          Markus Jelsma added a comment -

          You're right. Please open a new issue as this is already part of 1.5.


            People

            • Assignee:
              Markus Jelsma
              Reporter:
              Markus Jelsma
            • Votes:
              0
              Watchers:
              0
