Apache Arrow / ARROW-7914

[Python] Allow pandas datetime as index for feather

Details

    Description

      Sorry in advance if I mess anything up. This is my first issue.

      I have three years of hourly data with a pandas DatetimeIndex as the index. pandas lets me load and save .csv files with the following code (only one month with 2 variables shown):
      `
      import pandas as pd

      # Write data to .csv
      jan90.to_csv('PEC fine course 1 grid 199001.csv', index=True)

      # Load data from .csv
      jan90 = pd.read_csv('PEC fine course 1 grid 199001.csv', index_col=0, parse_dates=True)
      `
      Using .csv works, but it is slow once I get to the full dataset of 26k+ rows and 21.6k+ columns (and more columns may be coming if I have to add lags to my data). So a more efficient load/save routine is very desirable. I was excited when I found Feather, but losing the index is a no-go for my use.
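Until the index can be stored directly, one possible workaround is to move the datetime index into an ordinary column before writing and to restore it after reading. The sketch below assumes pandas with pyarrow installed; the helper names `save_with_index` and `load_with_index` and the column name `time` are hypothetical, chosen only for illustration.

```python
import pandas as pd


def save_with_index(df: pd.DataFrame, path: str, index_col: str = "time") -> None:
    """Write df to a Feather file, keeping the index as a regular column.

    index_col is a hypothetical column name; it must not clash with an
    existing column in df.
    """
    # rename_axis names the index, reset_index turns it into a column.
    df.rename_axis(index_col).reset_index().to_feather(path)


def load_with_index(path: str, index_col: str = "time") -> pd.DataFrame:
    """Read a Feather file and turn index_col back into the index."""
    return pd.read_feather(path).set_index(index_col)
```

With these helpers, `save_with_index(jan90, 'jan90.feather')` followed by `jan90 = load_with_index('jan90.feather')` should give back a frame indexed by the original timestamps, with the index now named `time`. Feather also expects string column names, so non-string column labels would need converting first.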

      Thanks for your consideration.

      Attachments

        Issue Links

        Activity

          People

            salonijain27 saloni jain
            zhongsiming Samuel Jones
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 10m