Apache Arrow / ARROW-7914

[Python] Allow pandas datetime as index for feather

Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.15.1
    • Fix Version/s: None
    • Component/s: Python
    • Environment: Windows, Python 3.6.7

    Description

      Sorry in advance if I mess anything up. This is my first issue.

      I have hourly data for 3 years using a pandas datetime as the index. Pandas allows me to load/save .csv with the following code (only one month with 2 variables shown):
      `
      import pandas as pd

      # jan90 is a DataFrame with a datetime index

      # Write data to .csv
      jan90.to_csv('PEC fine course 1 grid 199001.csv', index=True)

      # Load data from .csv
      jan90 = pd.read_csv('PEC fine course 1 grid 199001.csv', index_col=0, parse_dates=True)
      `
      Using .csv works, but it is slow on the full dataset of 26k+ rows and 21.6k+ columns (and more columns may be coming if I have to add lags to my data), so a more efficient load/save routine is very desirable. I was excited when I found feather, but it does not keep the datetime index, which is a no-go for my use.
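Until the index is handled natively, one possible workaround is to move the datetime index into an ordinary column before writing and restore it after reading. The sketch below is illustrative only: the helper names and the `timestamp` column name are hypothetical, `to_feather`/`read_feather` need pyarrow installed, and how Feather treats a non-default index may differ between pandas/pyarrow versions.

```python
import pandas as pd


def index_to_column(df: pd.DataFrame, name: str = "timestamp") -> pd.DataFrame:
    """Return a copy of df with its index moved into a regular column."""
    out = df.copy()
    # Name the index so reset_index() creates a column with a known label.
    out.index = out.index.rename(name)
    return out.reset_index()


def column_to_index(df: pd.DataFrame, name: str = "timestamp") -> pd.DataFrame:
    """Inverse of index_to_column: make the named column the index again."""
    return df.set_index(name)


def save_feather(df: pd.DataFrame, path: str) -> None:
    # Hypothetical helper: store the index as data so it survives the round trip.
    index_to_column(df).to_feather(path)


def load_feather(path: str) -> pd.DataFrame:
    # Hypothetical helper: read the file and rebuild the datetime index.
    return column_to_index(pd.read_feather(path))
```

With these helpers, `save_feather(jan90, 'PEC fine course 1 grid 199001.feather')` followed by `jan90 = load_feather('PEC fine course 1 grid 199001.feather')` should give back a frame indexed by datetime. Note that Feather expects string column names, and that the restored index will not carry a frequency attribute.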

      Thanks for your consideration.

      Attachments

        1. PEC fine course 1 grid 199001.feather
          12 kB
          Samuel Jones
        2. PEC fine course 1 grid 199001.csv
          35 kB
          Samuel Jones

          People

            Assignee: Unassigned
            Reporter: Samuel Jones (zhongsiming)