Tika / TIKA-1318

Use of Deprecated Word6Extractor.getParagraphText() Method

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.5
    • Fix Version/s: 2.0, 1.17
    • Component/s: parser
    • Labels:

      Description

      org.apache.tika.parser.microsoft.WordExtractor.parseWord6() uses the deprecated Word6Extractor.getParagraphText() method. getParagraphText() is supposed to return a String[] with one element for each paragraph in the text. The replacement is getText(), which leaves the separation of paragraphs, cells, etc. implementation-specific. At this point, I'm not sure how the POI WordExtractor separates them.
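To illustrate what a migration might involve, here is a minimal sketch of turning the single String returned by a getText()-style call back into per-paragraph elements. It assumes paragraphs are separated by line terminators, which is exactly the detail this issue leaves unconfirmed. ParagraphSplitter and splitParagraphs are hypothetical names, not part of Tika or POI, and the sample string stands in for real extractor output.

```java
import java.util.ArrayList;
import java.util.List;

class ParagraphSplitter {

    // Hypothetical helper: splits extractor text into paragraphs.
    // ASSUMPTION: paragraphs are delimited by line terminators; the
    // actual separator used by POI's getText() is not confirmed here.
    static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        if (text == null || text.isEmpty()) {
            return paragraphs;
        }
        // \R (Java 8+) matches any line terminator: \r\n, \r, \n, etc.
        for (String candidate : text.split("\\R")) {
            // Skip blank segments produced by consecutive terminators.
            if (!candidate.trim().isEmpty()) {
                paragraphs.add(candidate);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        // Stand-in for the String a getText() call would return.
        String text = "First paragraph\r\nSecond paragraph\rThird paragraph\n";
        for (String paragraph : splitParagraphs(text)) {
            System.out.println(paragraph);
        }
    }
}
```

If the real separator turns out to be something else (for example a cell or page mark), only the split pattern would need to change.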

            People

            • Assignee: Unassigned
            • Reporter: Tyler Bui-Palsulich (tpalsulich)
            • Votes: 0
            • Watchers: 4
