Apache Arrow / ARROW-9065 (sub-task of ARROW-9107 [C++][Dataset] Time-based types support)

[C++] Support parsing date32 in dataset partition folders


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: C++, Python

    Description

      I have some data that is partitioned by year/month/day. It would be useful if the day field could be parsed automatically as a date32:

      
      In [16]: import pyarrow as pa
          ...: from pyarrow.dataset import DirectoryPartitioning
      
      In [17]: schema = pa.schema([("year", pa.int16()), ("month", pa.int8()), ("day", pa.date32())])
      
      In [18]: partition = DirectoryPartitioning(schema)
      
      In [19]: partition.parse("/2020/06/2020-06-08")
      ---------------------------------------------------------------------------
      ArrowNotImplementedError Traceback (most recent call last)
      <ipython-input-19-c227c808b401> in <module>
      ----> 1 partition.parse("/2020/06/2020-06-08")
      
      ~\envs\dev\lib\site-packages\pyarrow\_dataset.pyx in pyarrow._dataset.Partitioning.parse()
      
      ~\envs\dev\lib\site-packages\pyarrow\error.pxi in pyarrow.lib.pyarrow_internal_check_status()
      
      ~\envs\dev\lib\site-packages\pyarrow\error.pxi in pyarrow.lib.check_status()
      
      ArrowNotImplementedError: parsing scalars of type date32[day]
      

      This is not a big issue, since you can declare the field as a string and convert it afterwards, but it would nevertheless be nice if it Just Worked:

      
      In [22]: schema = pa.schema([("year", pa.int16()), ("month", pa.int8()), ("day", pa.string())])
      
      In [23]: partition = DirectoryPartitioning(schema)
      
      In [24]: partition.parse("/2020/06/2020-06-08")
      Out[24]: <pyarrow.dataset.AndExpression (((year == 2020:int16) and (month == 6:int8)) and (day == 2020-06-08:string))>
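
      The "use string and convert" workaround can be sketched in plain Python. This is only an illustration, not how pyarrow implements partition parsing: `parse_partition_path` is a hypothetical helper, and the only Arrow-specific fact it relies on is that date32 stores the number of days since the UNIX epoch (1970-01-01).

```python
from datetime import date

# date32 values are stored as days since the UNIX epoch.
EPOCH = date(1970, 1, 1)


def parse_partition_path(path):
    """Hypothetical helper: split a '/year/month/day' path and convert each segment."""
    year_str, month_str, day_str = path.strip("/").split("/")
    # The day segment arrives as a string and is converted manually.
    day = date.fromisoformat(day_str)
    return {
        "year": int(year_str),
        "month": int(month_str),
        "day": day,
        # Integer that a date32 value would hold for this date.
        "day_as_date32": (day - EPOCH).days,
    }


parsed = parse_partition_path("/2020/06/2020-06-08")
print(parsed)
```

      With date32 support in the partitioning itself, this manual conversion step would no longer be needed.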
      

          People

            Assignee: Ben Kietzman (bkietz)
            Reporter: Dave Hirschfeld (dhirschfeld)