Nutch / NUTCH-1487

Nutch parse fails first time for PDF files and works on reparse

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1
    • Fix Version/s: 2.5
    • Component/s: parser, storage
    • Labels:

      Description

      The parser fails to parse PDF files in a single pass. Parsing succeeds only when the parse command is re-run, and it has to be repeated as many times as there are PDF files in total, as discussed on the mailing list (http://www.mail-archive.com/user%40nutch.apache.org/msg07952.html).

            People

            • Assignee:
              Unassigned
              Reporter:
              kiranch kiran
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated: