  1. Tika
  2. TIKA-1467

pdf:encrypted:false with encrypted pdf

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6
    • Fix Version/s: None
    • Component/s: parser
    • Labels: None
    • Environment:
      $ java -version
      java version "1.6.0_25"
      Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
      Java HotSpot(TM) Client VM (build 20.0-b11, mixed mode, sharing)

    Description

      When extracting metadata from the encryption_noprinting.pdf file found in the pdfCabinetOfHorrors (https://github.com/openplanets/format-corpus/tree/master/pdfCabinetOfHorrors) with

      $ java -jar tika-app-1.7-20141105.092424-471.jar -j encryption_noprinting.pdf

      we get the log message

      INFO - Document is encrypted

      but the resulting JSON contains "pdf:encrypted":"false".

      Looking at the PDFParser, it seems that the log message is emitted while the PDF is being read, but by the time the metadata is retrieved the PDF is no longer encrypted. The fact that the document was encrypted should be retained so that it can be added to the metadata.
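
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use the actual Tika or PDFBox classes: `StubPdf`, `extractBuggy`, and `extractFixed` are hypothetical names invented for illustration, and the assumption is simply that decrypting a document clears its "encrypted" state. Under that assumption, reading the flag after decryption yields "false", while remembering it beforehand yields "true".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: StubPdf is a hypothetical stand-in for a PDF document object.
// It is NOT a PDFBox or Tika class; it just models "decrypt() clears the flag".
class EncryptionFlagSketch {

    /** Hypothetical document whose encrypted state is cleared by decrypt(). */
    static class StubPdf {
        private boolean encrypted;

        StubPdf(boolean encrypted) {
            this.encrypted = encrypted;
        }

        boolean isEncrypted() {
            return encrypted;
        }

        void decrypt() {
            encrypted = false;
        }
    }

    /** Mirrors the reported behaviour: the flag is read after decryption. */
    static Map<String, String> extractBuggy(StubPdf pdf) {
        if (pdf.isEncrypted()) {
            pdf.decrypt();
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        // Too late: the document no longer reports itself as encrypted.
        metadata.put("pdf:encrypted", Boolean.toString(pdf.isEncrypted()));
        return metadata;
    }

    /** Suggested ordering: remember the flag before decrypting. */
    static Map<String, String> extractFixed(StubPdf pdf) {
        boolean wasEncrypted = pdf.isEncrypted();
        if (wasEncrypted) {
            pdf.decrypt();
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        // Use the value captured before decryption.
        metadata.put("pdf:encrypted", Boolean.toString(wasEncrypted));
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(extractBuggy(new StubPdf(true))); // prints {pdf:encrypted=false}
        System.out.println(extractFixed(new StubPdf(true))); // prints {pdf:encrypted=true}
    }
}
```

Whether the real PDFParser can be fixed the same way depends on where it decrypts the document and where it populates the metadata, which this sketch does not attempt to reproduce.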

            People

              Assignee: Unassigned
              Reporter: Thomas Ledoux (tledouxfr@yahoo.fr)