Tika / TIKA-2092

Integrate Math equation image extraction

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0
    • Fix Version/s: None
    • Component/s: detector, ocr

      Description

      "A general-purpose, deep learning-based system to decompile an image into presentational markup."

      As described in:

      What You Get Is What You See: A Visual Markup Decompiler
      Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
      http://arxiv.org/pdf/1609.04938v1.pdf

      code here:
      https://github.com/harvardnlp/im2markup

      demo here:
      http://lstm.seas.harvard.edu/latex/

      This would be useful as part of the OCR efforts in Tika.

            People

            • Assignee: Unassigned
            • Reporter: Craig Pfeifer (craig.pfeifer@gmail.com)
