TIKA-2295

Image not extracted via -z or -J in ODT

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0, 1.15
    • Component/s: None
    • Labels:
      None

      Description

      Sam Bayer identified this issue and shared the attached triggering document. We should extract images from ODT files.

      1. Test.odt
        10 kB
        Tim Allison

        Activity

        tallison@mitre.org Tim Allison added a comment -

        A bit of a hack, but we're pulling everything under "/Pictures" and "/Thumbnails" for now.

        I couldn't figure out how to embed 1) a PDF without it being stored as just the raw bytes, or 2) a DOC file without it being converted to an embedded ODT object. Maybe we want to add handling for embedded objects (an ODT within an ODT).

        If anyone knows more about embedded objects in ODT, please open another issue.

        Thank you, Sam. Cheers, Tim.
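
        The approach described in the comment above (pulling every entry under "/Pictures" and "/Thumbnails") can be illustrated with a minimal sketch. This is not Tika's actual parser code; it only assumes that an ODT file is a ZIP package and that entry names are stored without a leading slash (for example "Pictures/image1.png"). The class name OdtImageSketch, the helper names extractImages and sampleOdt, and the sample entry names are all hypothetical. It uses InputStream.readAllBytes, so it needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class OdtImageSketch {

    // Hypothetical helper: maps entry name -> raw bytes for every file
    // stored under Pictures/ or Thumbnails/ in an ODT (ZIP) stream.
    public static Map<String, byte[]> extractImages(InputStream odt) {
        Map<String, byte[]> images = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(odt)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                boolean wanted = name.startsWith("Pictures/")
                        || name.startsWith("Thumbnails/");
                if (wanted && !entry.isDirectory()) {
                    // Reads only up to the end of the current ZIP entry.
                    images.put(name, zip.readAllBytes());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return images;
    }

    // Builds a tiny in-memory ZIP that mimics an ODT layout.
    // The entry names and contents are made up for demonstration.
    public static byte[] sampleOdt() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            String[] names = {
                "content.xml", "Pictures/image1.png", "Thumbnails/thumbnail.png"
            };
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, byte[]> images =
                extractImages(new ByteArrayInputStream(sampleOdt()));
        // Print the names of the entries that were treated as images.
        images.keySet().forEach(System.out::println);
    }
}
```

        Filtering by path prefix keeps the sketch as simple as the "hack" the comment describes: it does not inspect the manifest or the media type, so anything placed in those two folders is returned as-is.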

        tallison@mitre.org Tim Allison added a comment -

        Triggering file with a very small image.


          People

          • Assignee:
            tallison@mitre.org Tim Allison
            Reporter:
            tallison@mitre.org Tim Allison
          • Votes:
            0
            Watchers:
            1
