Apache Arrow / ARROW-7914

[Python] Allow pandas datetime as index for feather

Details

    Description

      Sorry in advance if I mess anything up. This is my first issue.

      I have hourly data for 3 years, using a Pandas datetime as the index. Pandas lets me load and save .csv with the following code (only one month with 2 variables shown):
      import pandas as pd

      # Write data to .csv (jan90 is a DataFrame with a datetime index)
      jan90.to_csv('PEC fine course 1 grid 199001.csv', index=True)

      # Load data from .csv
      jan90 = pd.read_csv('PEC fine course 1 grid 199001.csv', index_col=0, parse_dates=True)
      Using .csv works, but it is slow for the full dataset of 26k+ rows and 21.6k+ columns (and more columns may be coming if I have to add lags to my data), so a more efficient load/save routine is very desirable. I was excited when I found Feather, but losing the datetime index is a no-go for my use.
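A possible interim workaround, sketched here under assumptions and not taken from the issue itself, is to move the datetime index into an ordinary column before writing and to restore it after reading. The helper names (`index_to_column`, `column_to_index`, `save_feather`, `load_feather`) and the column name `timestamp` are hypothetical; `DataFrame.to_feather` and `pd.read_feather` are existing pandas calls that need `pyarrow` installed, and how they treat a non-default index may differ between versions.

```python
import pandas as pd


def index_to_column(df: pd.DataFrame, name: str = "timestamp") -> pd.DataFrame:
    """Return a copy of df whose index has become a regular column called `name`."""
    # rename_axis labels the index so reset_index produces a column with that name.
    return df.rename_axis(name).reset_index()


def column_to_index(df: pd.DataFrame, name: str = "timestamp") -> pd.DataFrame:
    """Return a copy of df with column `name` turned back into the index."""
    return df.set_index(name)


def save_feather(df: pd.DataFrame, path: str, name: str = "timestamp") -> None:
    """Write df to a Feather file, storing the index as a column (requires pyarrow)."""
    index_to_column(df, name).to_feather(path)


def load_feather(path: str, name: str = "timestamp") -> pd.DataFrame:
    """Read a Feather file written by save_feather and restore the index."""
    return column_to_index(pd.read_feather(path), name)
```

With these helpers, `save_feather(jan90, 'jan90.feather')` followed by `load_feather('jan90.feather')` should give back a frame indexed by the original timestamps. The restored index is named `timestamp` and carries no frequency information, so it may not compare as strictly identical to the original index.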

      Thanks for your consideration.

      Attachments

        1. PEC fine course 1 grid 199001.csv
          35 kB
          Samuel Jones
        2. PEC fine course 1 grid 199001.feather
          12 kB
          Samuel Jones

            People

              salonijain27 saloni jain
              zhongsiming Samuel Jones

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 10m