[TIKA-2092] Integrate Math equation image extraction - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.0.0
Fix Version/s: None
Component/s: detector, ocr
Labels:
- deeplearning
- image
- parse

Description

"A general-purpose, deep learning-based system to decompile an image into presentational markup."

As described in:

What You Get Is What You See: A Visual Markup Decompiler
Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
http://arxiv.org/pdf/1609.04938v1.pdf

Code here:
https://github.com/harvardnlp/im2markup

Demo here:
http://lstm.seas.harvard.edu/latex/

This would be useful as part of the OCR efforts in Tika.

People

Assignee: Unassigned

Reporter: Craig Pfeifer

Votes: 0

Watchers: 1

Dates

Created: 22/Sep/16 17:12

Updated: 12/Apr/21 13:00