Tika / TIKA-2089

Macros not extracted from ppt files

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16
    • Component/s: None
    • Labels: None

      Description

      While working on TIKA-2069, I found that macros weren't extracted from the one PPT test file that I generated. The test file itself may be at fault. If it isn't, let's use this issue to track POI bug 60162.

          Activity

          tallison@mitre.org Tim Allison added a comment (edited)

          PPT stores macros in a different stream than XLS and DOC do. Until this is all streamlined in POI (and it isn't clear that there's a non-trivial option), we need to plagiarize from POI's TestBugs.getMacrosFromHSLF() to extract macros from PPT.

          hudson Hudson added a comment

          FAILURE: Integrated in Jenkins build Tika-trunk #1304 (See https://builds.apache.org/job/Tika-trunk/1304/)
          TIKA-2089 – extract macros from ppt (tallison: https://github.com/apache/tika/commit/95c515db2f303db049c79a7bd0c260ec559186b1)

          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java
          • (edit) CHANGES.txt
          • (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java
          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/AbstractPOIFSExtractor.java
          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OfficeParser.java
          hudson Hudson added a comment

          FAILURE: Integrated in Jenkins build Tika-trunk #1309 (See https://builds.apache.org/job/Tika-trunk/1309/)
          TIKA-2089 - bug fix, check for nulls (tallison: https://github.com/apache/tika/commit/cb7b84a48aa7f500c08733b06fc78e6c1f0bba14)

          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java
          hudson Hudson added a comment

          FAILURE: Integrated in Jenkins build Tika-trunk #1310 (See https://builds.apache.org/job/Tika-trunk/1310/)
          TIKA-2089 – clean up try/catch with autocloseable (tallison: https://github.com/apache/tika/commit/621ded89fd909380ff5c3d2c5d13d73dd6e5b982)

          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java

            People

            • Assignee: tallison@mitre.org Tim Allison
            • Reporter: tallison@mitre.org Tim Allison
            • Votes: 0
            • Watchers: 2
