TIKA-3042

Date format extraction problem in XLS/XLSX

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.24
    • Component/s: None
    • Labels:
      None

      Description

      Currently, Tika/ManifoldCF 2.10 extracts dates from the attached file this way:

      2018.05.10 -> 10/05/18
      2002.02.02 -> 2/2/2

      We need this format:

      2018.05.10 -> 2018-05-10

      2002.02.02 -> 2002-02-02

      This occurs only when the field type is date. When the field type is text, the output is fine.
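As a stopgap while the extraction itself is unchanged, the extracted strings could be normalized downstream. The following is only a minimal sketch, not a Tika or ManifoldCF feature. It assumes the extracted values are in day/month/year order (as the 10/05/18 example suggests) and that one- or two-digit years belong to the 2000s. The function name `normalize_extracted_date` and the `century` parameter are hypothetical.

```python
from datetime import date


def normalize_extracted_date(text: str, century: int = 2000) -> str:
    """Convert a d/M/y string such as '10/05/18' to ISO 'YYYY-MM-DD'.

    Assumptions (not confirmed by the issue): day comes first, and
    years below 100 are offsets from `century`.
    """
    day_s, month_s, year_s = text.strip().split("/")
    year = int(year_s)
    if year < 100:
        # '18' -> 2018 and '2' -> 2002 under the assumed century
        year += century
    # date() validates the components and raises ValueError if invalid
    return date(year, int(month_s), int(day_s)).isoformat()


print(normalize_extracted_date("10/05/18"))  # 2018-05-10
print(normalize_extracted_date("2/2/2"))     # 2002-02-02
```

Note that the century has to be guessed because the short format discards it, which is why a fix at extraction time is preferable to post-processing.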

       

      Please recommend any settings in the pipeline (Tika configuration, Excel settings, OS locale settings, etc.) that would produce this format, or provide a fix.

        Attachments

        1. exceldatum.xlsx
          9 kB
          Zoltan Farago

            People

            • Assignee:
              tallison Tim Allison
              Reporter:
              zfarago Zoltan Farago
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved: