Tika / TIKA-360

Outstanding Improvements to Number/Date Formatting in ExcelParser

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1
    • Component/s: parser
    • Labels:
      None
    • Environment:

      All Operating Systems (Seen on my Ubuntu 9.10, Solaris 10 and Windows 7 instances)

      Description

      As highlighted in TIKA-103, there are issues with Tika's parsing of Excel files, caused both by the way Excel stores dates as numbers and by the formatting POI applies to those numbers. To provide base support for number/date formatting, an initial patch was applied as part of TIKA-103 to use POI's out-of-the-box formatting.
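
      To illustrate why a date cell can come out as a bare number, here is a minimal sketch of how a serial number in Excel's 1900 date system maps to a calendar date. The class name ExcelSerialDate and its method are hypothetical and are not part of Tika or POI; the sketch assumes the 1900 date system and only handles serials from 61 (1 March 1900) onwards, so it sidesteps Excel's fictitious 29 February 1900.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

/** Hypothetical helper for illustration only; not Tika or POI code. */
final class ExcelSerialDate {
    // Excel's 1900 date system counts a non-existent 29 Feb 1900 (serial 60),
    // so for serials >= 61 the effective day zero is 1899-12-30.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);
    private static final int SECONDS_PER_DAY = 86_400;

    /** Converts a serial (whole days plus fraction of a day) to a date-time. */
    static LocalDateTime toDateTime(double serial) {
        if (serial < 61) {
            throw new IllegalArgumentException("sketch only handles serials >= 61");
        }
        long days = (long) Math.floor(serial);
        // The fractional part is the time of day, rounded to whole seconds.
        long secondOfDay = Math.round((serial - days) * SECONDS_PER_DAY);
        return EPOCH.plusDays(days).atStartOfDay().plusSeconds(secondOfDay);
    }
}
```

      Under these assumptions a cell holding 40179.5 corresponds to noon on 1 January 2010. Without the cell's number format there is no way to tell that such a value is meant to be a date, which is why the formatting step matters.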

      This issue tracks progress on the outstanding formatting issues, such as fractions, within the POI library.
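
      As a rough illustration of what fraction formatting involves, the sketch below renders a decimal as a mixed fraction by trying every denominator up to a limit and keeping the closest match. This is not the POI implementation; the class name FractionSketch, the method signature and the brute-force search are all assumptions made for the example.

```java
/** Hypothetical illustration of fraction formatting; not the POI code. */
final class FractionSketch {
    /** Formats value as "whole num/den" using denominators 1..maxDenominator. */
    static String format(double value, int maxDenominator) {
        double abs = Math.abs(value);
        long whole = (long) Math.floor(abs);
        double frac = abs - whole;

        // Start from 0/1 and keep the first denominator with the smallest error.
        long bestNum = 0;
        long bestDen = 1;
        double bestErr = frac;
        for (long den = 1; den <= maxDenominator; den++) {
            long num = Math.round(frac * den);
            double err = Math.abs(frac - (double) num / den);
            if (err < bestErr) {
                bestNum = num;
                bestDen = den;
                bestErr = err;
            }
        }
        // A fraction that rounds up to 1 carries into the whole part.
        if (bestNum == bestDen) {
            whole++;
            bestNum = 0;
        }
        String sign = (value < 0 && (whole != 0 || bestNum != 0)) ? "-" : "";
        if (bestNum == 0) {
            return sign + whole;
        }
        if (whole == 0) {
            return sign + bestNum + "/" + bestDen;
        }
        return sign + whole + " " + bestNum + "/" + bestDen;
    }
}
```

      With a single-digit denominator limit, FractionSketch.format(1.5, 9) gives "1 1/2" and FractionSketch.format(0.75, 9) gives "3/4". A real formatter would also need to honour the cell's format pattern, which this sketch ignores.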

          Activity

          Nick Burch made changes -
          Field            Original Value   New Value
          Status           Open [ 1 ]       Resolved [ 5 ]
          Fix Version/s                     1.1 [ 12318849 ]
          Resolution                        Fixed [ 1 ]
          Nick Burch added a comment -

          I believe this was solved in Tika 1.1. If we identify any other areas where the Excel formatting isn't quite right, it's probably best to open a new specific issue for that.

          Nick Burch added a comment -

          Fractions will be supported when we upgrade to POI 3.8 beta 6 / 3.8 Final.

          Are there any other areas we still need to enhance?

          Dave Meikle added a comment -

          I am currently working on fractions support within POI under Bugzilla ID 45678 [1]. I will look at other formatting issues once this is complete.

          Cheers,
          Dave

          [1] https://issues.apache.org/bugzilla/show_bug.cgi?id=45678

          Dave Meikle added a comment -

          This issue is the progression of TIKA-103.

          Dave Meikle made changes -
          Field            Original Value   New Value
          Link                              This issue is related to TIKA-103 [ TIKA-103 ]
          Dave Meikle created issue -

            People

            • Assignee:
              Dave Meikle
              Reporter:
              Dave Meikle
            • Votes:
              0
              Watchers:
              1
