Tika / TIKA-981

Text isn't extracted from PDF pop-up annotations

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels: None
    1. TIKA-981.patch
      4 kB
      Michael McCandless

      Activity

      Michael McCandless created issue -
      Michael McCandless added a comment -

      Patch with test case and fix. I removed the check that restricted
      extraction to PDAnnotationMarkup.SUB_TYPE_FREETEXT (pop-ups, at least
      in this test PDF, are PDAnnotationText.SUB_TYPE). I'm not sure what
      other annotation types have useful text as their
      title/subject/contents. Separately, I noticed we were failing to
      extract the subject properly (it came out the same as the title).
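
The change described in this comment can be illustrated with a minimal, self-contained sketch. It is not the attached patch and does not use the PDFBox or Tika APIs: `Note`, `extractText`, and `addIfPresent` are hypothetical stand-ins invented for illustration, and the subtype strings "FreeText" and "Text" are assumed to correspond to the two constants named above. The sketch only shows the idea of reading title, subject, and contents from every annotation rather than from free-text ones alone, and of reading the subject from its own field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: none of these types come from PDFBox or Tika.
class AnnotationTextSketch {

    // Hypothetical stand-in for the text-bearing fields of a markup annotation.
    static final class Note {
        final String subtype;   // assumed values: "FreeText", "Text" (pop-up note)
        final String title;
        final String subject;
        final String contents;

        Note(String subtype, String title, String subject, String contents) {
            this.subtype = subtype;
            this.title = title;
            this.subject = subject;
            this.contents = contents;
        }
    }

    // Collects title, subject, and contents from every annotation.
    // There is deliberately no filter on subtype, so pop-up notes are included.
    static List<String> extractText(List<Note> notes) {
        List<String> out = new ArrayList<>();
        for (Note n : notes) {
            addIfPresent(out, n.title);
            addIfPresent(out, n.subject);   // the subject field itself, not the title again
            addIfPresent(out, n.contents);
        }
        return out;
    }

    // Skips null or empty fields so absent values produce no output.
    private static void addIfPresent(List<String> out, String s) {
        if (s != null && !s.isEmpty()) {
            out.add(s);
        }
    }

    public static void main(String[] args) {
        List<Note> notes = Arrays.asList(
            new Note("FreeText", "Author A", "Free-text subject", "Free-text body"),
            new Note("Text", "Author B", "Pop-up subject", "Pop-up body"));
        for (String s : extractText(notes)) {
            System.out.println(s);
        }
    }
}
```

Dropping the subtype filter entirely is the simplest option, and it matches the uncertainty voiced in the comment about which other annotation types carry useful text. A stricter variant could keep an allow-list of subtypes instead.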

      Michael McCandless made changes -
      Field Original Value New Value
      Attachment TIKA-981.patch [ 12542597 ]
      Michael McCandless made changes -
      Status Open [ 1 ] Resolved [ 5 ]
      Resolution Fixed [ 1 ]

        People

        • Assignee:
          Michael McCandless
          Reporter:
          Michael McCandless
        • Votes:
          0
          Watchers:
          1

          Dates

          • Created:
            Updated:
            Resolved:
