Apache Arrow / ARROW-13572 (parent issue: ARROW-13233 [C++] Support ORC in Arrow Dataset)

[C++][Python] Add basic ORC support to the pyarrow.datasets API

Details

    Description

      Dask users have shown significant interest in support for directory-partitioned ORC datasets. Since Dask already uses the pyarrow.dataset API for Parquet-formatted data, exposing ORC through the same pyarrow API would be extremely useful.

            People

              jorisvandenbossche Joris Van den Bossche
              rjzamora Rick Zamora
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 8.5h