Tika / TIKA-1880

Attribute number-columns-repeated not correctly used in ODS documents

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.12
    • Fix Version/s: None
    • Component/s: parser

    Description

      When the ODS writer was first written, it assumed that the `number-columns-repeated` attribute for cells would only be used for blank cells. This is not the case with documents created by (at least) LibreOffice 4.4.7.2. The current approach to repeated cells is to use the HTML concept of spanning, which is not suitable for repeated content: a value repeated across several columns is emitted as a single spanned cell rather than once per column.

      The note in the Tika source (OpenDocumentContentParser.java#L459):

      TODO: The following is not correct, the cell should be repeated not spanned!
      Code generates a HTML cell, spanning all repeated columns, to make the cell look correct. Problems may occur when both spanning and repeating is given, which is not allowed by spec. Cell spanning instead of repeating is not a problem, because OpenOffice uses it only for empty cells.
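
      To make the difference between spanning and repeating concrete, here is a minimal Java sketch. It is not Tika's code: the `Cell` class and the `renderSpanned` and `renderRepeated` methods are hypothetical names, and plain string building stands in for whatever events the parser actually emits. It also assumes, as one possible design choice, that empty cells can keep the span, since spreadsheets may pad rows with long runs of repeated empty cells.

```java
import java.util.List;

class RepeatedCellSketch {

    /** Hypothetical stand-in for one table:table-cell element. */
    static final class Cell {
        final String text;
        final int columnsRepeated; // value of number-columns-repeated, 1 if absent

        Cell(String text, int columnsRepeated) {
            this.text = text;
            this.columnsRepeated = columnsRepeated;
        }
    }

    /** Behaviour described in the TODO: one HTML cell spanning all repeated columns. */
    static String renderSpanned(List<Cell> row) {
        StringBuilder out = new StringBuilder("<tr>");
        for (Cell c : row) {
            appendSpanned(out, c);
        }
        return out.append("</tr>").toString();
    }

    /** Alternative: emit a non-empty cell once per repeat; empty cells keep the span. */
    static String renderRepeated(List<Cell> row) {
        StringBuilder out = new StringBuilder("<tr>");
        for (Cell c : row) {
            if (c.text.isEmpty()) {
                appendSpanned(out, c); // assumption: spanning is harmless for blanks
            } else {
                for (int i = 0; i < c.columnsRepeated; i++) {
                    // No HTML escaping here; the sketch only uses plain text values.
                    out.append("<td>").append(c.text).append("</td>");
                }
            }
        }
        return out.append("</tr>").toString();
    }

    private static void appendSpanned(StringBuilder out, Cell c) {
        if (c.columnsRepeated > 1) {
            out.append("<td colspan=\"").append(c.columnsRepeated).append("\">");
        } else {
            out.append("<td>");
        }
        out.append(c.text).append("</td>");
    }

    public static void main(String[] args) {
        List<Cell> row = List.of(new Cell("a", 1), new Cell("x", 3), new Cell("", 5));
        System.out.println(renderSpanned(row));
        System.out.println(renderRepeated(row));
    }
}
```

      With spanning, the value `x` appears once in the extracted text even though the sheet holds it in three columns; with repeating, it appears three times, which matches the spreadsheet content.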

      Attachments

        1. example.ods
          7 kB
          Ryan Desmond

          People

            Assignee: Unassigned
            Reporter: Ryan Desmond (rddesmond@gmail.com)
            Votes: 0
            Watchers: 2
