  1. Tika
  2. TIKA-2310

Try to order chapters in epub correctly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.21
    • Component/s: None
    • Labels:
      None

      Description

      Johan van der Knijff recently pointed out on Twitter that our EPUB parser doesn't output chapters in the right order. We should try to fix the parser so that its output follows the book's reading order.

      EPUB is new to me, but it looks like we can scrape the reading order out of content.opf.

      This would require dumping the stream to a ZipFile for direct access to zip entries, but we already require that for OOXML.
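
      As a rough illustration of the idea above (not the code that went into Tika), here is a minimal sketch of reading the chapter order from an OPF package file. In the OPF format, the manifest lists every resource as an item with an id and an href, and the spine lists itemref elements whose idref values give the reading order. The class name EpubSpineOrder, both method names, and the opfEntryName parameter are hypothetical, and the sketch assumes the caller already knows the path of the .opf entry inside the zip.

```java
import java.io.File;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
public class EpubSpineOrder {

    /** Returns manifest hrefs in the order given by the OPF spine. */
    public static List<String> readingOrder(InputStream opf) {
        try {
            // Note: a production parser should also be hardened against XXE.
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(opf);

            // Map each manifest item id to its href.
            Map<String, String> hrefById = new HashMap<>();
            NodeList items = doc.getElementsByTagNameNS("*", "item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                hrefById.put(item.getAttribute("id"), item.getAttribute("href"));
            }

            // Walk the spine in document order and resolve each idref.
            List<String> ordered = new ArrayList<>();
            NodeList refs = doc.getElementsByTagNameNS("*", "itemref");
            for (int i = 0; i < refs.getLength(); i++) {
                Element ref = (Element) refs.item(i);
                String href = hrefById.get(ref.getAttribute("idref"));
                if (href != null) {
                    ordered.add(href);
                }
            }
            return ordered;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read OPF spine", e);
        }
    }

    /** Opens the EPUB as a zip and reads the spine from the given OPF entry. */
    public static List<String> readingOrderFromZip(File epub, String opfEntryName) {
        try (ZipFile zip = new ZipFile(epub)) {
            ZipEntry entry = zip.getEntry(opfEntryName);
            if (entry == null) {
                throw new IllegalStateException("No such entry: " + opfEntryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return readingOrder(in);
            }
        } catch (java.io.IOException e) {
            throw new IllegalStateException("Could not open EPUB", e);
        }
    }
}
```

      The returned hrefs are relative to the directory containing the OPF file, so a caller would still need to resolve them against that directory before looking up the chapter entries in the ZipFile.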

        Attachments

        1. Dzhordzh_Oruell_1984_en_.epub
          280 kB
          Alexey Zhukov

            People

            • Assignee:
              tallison Tim Allison
              Reporter:
              tallison Tim Allison
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved: