Tika / TIKA-2390

Extract images embedded in HTML

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 1.15
    • Fix Version/s: 1.18, 2.0.0
    • Component/s: parser
    • Labels:
      None

      Description

      We should handle images embedded in HTML the same way we do for other formats: as attachments. Are there encodings other than base64 used in the wild to embed images in HTML?
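
      As a rough illustration of what pulling embedded images out of HTML could involve, here is a minimal sketch that scans for `data:` URIs in `src` attributes and decodes their payloads. It is not Tika's implementation; the names `DataUriImages`, `EmbeddedImage`, and `extract` are hypothetical, and the regex is a simplification that assumes quoted attribute values. On the encoding question: per RFC 2397, a `data:` URI payload is either base64 (marked with `;base64`) or percent-encoded, so the sketch handles both.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class DataUriImages {

    // Simplified pattern: src="data:image/<subtype>[;param]*,<payload>"
    private static final Pattern DATA_URI = Pattern.compile(
        "src\\s*=\\s*[\"']data:(image/[a-z0-9.+-]+)((?:;[^;,\"']+)*),([^\"']*)[\"']",
        Pattern.CASE_INSENSITIVE);

    static final class EmbeddedImage {
        final String mimeType;
        final byte[] bytes;

        EmbeddedImage(String mimeType, byte[] bytes) {
            this.mimeType = mimeType;
            this.bytes = bytes;
        }
    }

    static List<EmbeddedImage> extract(String html) {
        List<EmbeddedImage> images = new ArrayList<>();
        Matcher m = DATA_URI.matcher(html);
        while (m.find()) {
            String mime = m.group(1).toLowerCase(Locale.ROOT);
            String params = m.group(2).toLowerCase(Locale.ROOT);
            String payload = m.group(3);
            // ";base64" as the last parameter marks a base64 payload;
            // otherwise the payload is percent-encoded.
            byte[] bytes = params.endsWith(";base64")
                ? Base64.getMimeDecoder().decode(payload) // tolerates line breaks
                : percentDecode(payload);
            images.add(new EmbeddedImage(mime, bytes));
        }
        return images;
    }

    // Decodes %XX escapes to raw bytes; other characters are assumed to be ASCII.
    private static byte[] percentDecode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(c); // malformed escape or plain character: keep as-is
        }
        return out.toByteArray();
    }
}
```

      Each decoded image could then be handed to whatever attachment-handling path the parser already uses for other formats; that wiring is deliberately left out here.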

              People

              • Assignee:
                Unassigned
                Reporter:
                lfcnassif Luís Filipe Nassif
              • Votes:
                0
                Watchers:
                1
