Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1
    • Component/s: fetcher
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      Demoting this issue and moving it to 1.1; the current patch is not suitable because it contains LGPL-licensed parts.

      1. NUTCH-705.patch
        174 kB
        Dmitry Lihachev


          Activity

          Dmitry Lihachev added a comment -

          This parser correctly handles non-ASCII input.

          Sami Siren added a comment -

          I think the patch contains some LGPL code that we cannot commit to the Apache repository.

          Dmitry Lihachev added a comment -

          Yes, it looks a bit like a problem... How can we handle this?

          Sami Siren added a comment -

          I think we should start looking at Apache Tika for most (or all) of our parsers.

          Julien Nioche added a comment -

          RTF parsing is now handled by the TikaPlugin (NUTCH-766). Please open an issue on Tika if the original problem with non-ASCII characters still occurs.


            People

            • Assignee:
              Unassigned
              Reporter:
              Dmitry Lihachev
            • Votes:
              0
              Watchers:
              0

