  1. Tika
  2. TIKA-2092

Integrate Math equation image extraction

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 2.0.0
    • None
    • detector, ocr

    Description

      "A general-purpose, deep learning-based system to decompile an image into presentational markup."

      As described in:

      What You Get Is What You See: A Visual Markup Decompiler
      Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
      http://arxiv.org/pdf/1609.04938v1.pdf

      code here:
      https://github.com/harvardnlp/im2markup

      demo here:
      http://lstm.seas.harvard.edu/latex/

      This would be useful as part of the OCR efforts in Tika.

          People

            Assignee: Unassigned
            Reporter: Craig Pfeifer (craig.pfeifer@gmail.com)
            Votes: 0
            Watchers: 1
