TIKA-1100

Cannot extract text in text boxes from Excel 2007 files (.xlsx, .xlsm)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.5
    • Component/s: parser
    • Labels: None
    • Environment: Windows 7 64-bit

      Description

      When I launch the Tika GUI from the command line and drag and drop an .xlsx file that contains a text box, none of the text in the text box is extracted.

      When I drag and drop an .xls file instead, the text in the text box is extracted.
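
      To confirm independently of Tika that a workbook really contains text-box text, the drawing parts inside the .xlsx package can be read directly. The following is a minimal sketch, not Tika's or POI's implementation. It assumes the usual OOXML layout, where shapes live in xl/drawings/drawing*.xml and their text runs are DrawingML a:t elements. The class name TextBoxProbe and the method textBoxRuns are hypothetical, and only the Java standard library is used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper; not part of Tika or POI.
final class TextBoxProbe {
    // Namespace of DrawingML, where text runs are <a:t> elements.
    static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Returns every DrawingML text run found in the package's drawing parts.
    // Assumption: drawing parts are stored as xl/drawings/*.xml.
    // Note: this picks up text from any shape, not only text boxes.
    static List<String> textBoxRuns(InputStream xlsx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> runs = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (!name.startsWith("xl/drawings/") || !name.endsWith(".xml")) {
                    continue; // skips worksheets, .rels and .vml parts
                }
                // Copy the entry so the XML parser cannot close the ZIP stream.
                ByteArrayOutputStream part = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    part.write(buffer, 0, n);
                }
                NodeList nodes = builder
                        .parse(new ByteArrayInputStream(part.toByteArray()))
                        .getElementsByTagNameNS(DRAWINGML_NS, "t");
                for (int i = 0; i < nodes.getLength(); i++) {
                    runs.add(nodes.item(i).getTextContent());
                }
            }
        }
        return runs;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java TextBoxProbe path/to/workbook.xlsx
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String run : textBoxRuns(in)) {
                System.out.println(run);
            }
        }
    }
}
```

      If this prints text for a given file while Tika's output omits it, the text is present in the package and the gap is on the extraction side, which is the behavior reported here.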

        Activity

        Kazuaki Matsuba created issue -
        Tim Allison added a comment -

        Waiting for improvements in POI-55292. Will make Tika-side upgrades when the next version of POI is released.

        Reference: http://issues.apache.org/bugzilla/show_bug.cgi?id=55292

        Tim Allison added a comment -

        Simple example file attached for now. Will fill out with test cases when POI is ready.

        Tim Allison made changes -
        Field         Original Value    New Value
        Attachment                      testEXCEL_textbox.xlsx [ 12593705 ]
        Kazuaki Matsuba added a comment -

        Thanks, Tim

        I'll wait until the next version of POI is bundled in Tika.

        Tim Allison added a comment -

        Updated XSSFExcelExtractorDecorator and added test as of r1526489.

        Tim Allison added a comment -

        r1526498

        Tim Allison made changes -
        Field            Original Value    New Value
        Status           Open [ 1 ]        Resolved [ 5 ]
        Fix Version/s                      1.5 [ 12324552 ]
        Resolution                         Fixed [ 1 ]
        Jukka Zitting made changes -
        Field     Original Value    New Value
        Status    Resolved [ 5 ]    Closed [ 6 ]

        Transition            Time In Source Status    Execution Times    Last Executer    Last Execution Date
        Open -> Resolved      176d 12h 26m             1                  Tim Allison      26/Sep/13 14:03
        Resolved -> Closed    180d 2h 17m              1                  Jukka Zitting    25/Mar/14 16:21

          People

          • Assignee: Unassigned
          • Reporter: Kazuaki Matsuba
          • Votes: 0
          • Watchers: 2
