"A general-purpose, deep learning-based system to decompile an image into presentational markup."
As described in:
What You Get Is What You See: A Visual Markup Decompiler
Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
This would be useful as part of the OCR efforts in Tika.