Tika / TIKA-2264

Better handling of footnotes/endnotes for ODF files

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.14
    • Fix Version/s: None
    • Component/s: parser
    • N/A

    Description

      This springs from my Stack Overflow question (http://stackoverflow.com/questions/42031237/modify-apache-tika-parsing-of-old-1997-2003-ms-word-docs). I have improved the class OpenDocumentContentParser so that it places each footnote/endnote at the end of the line to which it belongs, instead of breaking up that line. As with .docx parsing, the notes can then be linked to their references easily. The respondent on Stack Overflow suggested I open an issue here.
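To illustrate the behaviour described above, here is a minimal sketch using only the JDK's DOM parser. It is not the attached ImprovedODFContentParser.java and it does not use Tika's API. The class name NoteFlattener, the method flattenParagraph, and the "[n]" / "[n: text]" output format are all assumptions made for illustration. The element names (text:note, text:note-citation, text:note-body) and the namespace URI come from the OpenDocument format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only: not part of Tika.
public class NoteFlattener {
    // OpenDocument "text" namespace.
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Returns the paragraph as one unbroken line, with a "[n]" marker where
    // each note is referenced and the note bodies appended at the end.
    public static String flattenParagraph(String paragraphXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element paragraph = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(paragraphXml)))
                .getDocumentElement();

        StringBuilder line = new StringBuilder();
        List<String> notes = new ArrayList<>();
        collect(paragraph, line, notes);
        for (String note : notes) {
            line.append(' ').append(note);
        }
        return line.toString();
    }

    // Walks the children in document order; note bodies are deferred
    // instead of being written in the middle of the running text.
    private static void collect(Node node, StringBuilder line, List<String> notes) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                line.append(child.getNodeValue());
            } else if (isTextElement(child, "note")) {
                String citation = "";
                String body = "";
                for (Node part = child.getFirstChild(); part != null; part = part.getNextSibling()) {
                    if (isTextElement(part, "note-citation")) {
                        citation = part.getTextContent().trim();
                    } else if (isTextElement(part, "note-body")) {
                        body = part.getTextContent().trim();
                    }
                }
                line.append('[').append(citation).append(']');      // assumed marker format
                notes.add("[" + citation + ": " + body + "]");      // assumed note format
            } else {
                collect(child, line, notes);                        // e.g. text:span
            }
        }
    }

    private static boolean isTextElement(Node node, String localName) {
        return node.getNodeType() == Node.ELEMENT_NODE
                && TEXT_NS.equals(node.getNamespaceURI())
                && localName.equals(node.getLocalName());
    }

    public static void main(String[] args) throws Exception {
        String xml = "<text:p xmlns:text=\"" + TEXT_NS + "\">The quick fox"
                + "<text:note text:note-class=\"footnote\">"
                + "<text:note-citation>1</text:note-citation>"
                + "<text:note-body><text:p>A kind of canine.</text:p></text:note-body>"
                + "</text:note> jumps over the dog.</text:p>";
        // prints: The quick fox[1] jumps over the dog. [1: A kind of canine.]
        System.out.println(flattenParagraph(xml));
    }
}
```

Because the same citation appears both at the point of reference and in the appended note, a consumer can pair each note with its reference, which is the linking property the description mentions. A real parser change would presumably do this inside Tika's SAX-based content handling rather than on a DOM.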

      Attachments

        1. test.odt
          17 kB
          Mike Rodent
        2. _ImprovedODFContentParserUTest.java
          10 kB
          Mike Rodent
        3. ImprovedODFContentParser.java
          23 kB
          Mike Rodent

            People

              Assignee: Unassigned
              Reporter: Mike Rodent (mrodent)