Apache Arrow / ARROW-10623

[R] Version 1.0.1 breaks data.frame attributes when reading file written by 2.0.0

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.1, 2.0.0
    • Fix Version/s: 3.0.0
    • Component/s: R

    Description

      How to reproduce

      • Create a data frame:
        df <- data.frame(col1 = 1:100)
      • Write it to a Parquet file using arrow 2.0.0. The demo uses R 3.6, but the same happens with R 4.0.
      • Read the Parquet file using arrow 1.0.1. I only tried that in R 3.6.
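
      A minimal sketch of the steps above, assuming the arrow R package is installed in both environments. The function names and the file name df.parquet are hypothetical and chosen only for illustration. The two functions are meant to be called in separate R sessions, since a single session can load only one arrow installation.

```r
# Steps 1 and 2: call this in an R session where arrow 2.0.0 is installed.
write_step <- function(path = "df.parquet") {
  message("writing with arrow ", as.character(utils::packageVersion("arrow")))
  df <- data.frame(col1 = 1:100)
  arrow::write_parquet(df, path)
  invisible(path)
}

# Step 3: call this in a separate R session where arrow 1.0.1 is installed.
read_step <- function(path = "df.parquet") {
  message("reading with arrow ", as.character(utils::packageVersion("arrow")))
  result <- arrow::read_parquet(path)
  # Print the full structure so it can be compared with the expected one.
  dput(result)
  invisible(result)
}
```

      Printing with dput() rather than print() makes missing attributes visible, because it shows the names, row.names, and class of the object explicitly.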

      Expected

      The data frame is the same as it was before:

      structure(list(col1 = 1:100), row.names = c(NA, 100L), class = "data.frame")

      Actual

      The data frame has lost its column names and its row.names attribute:

      structure(list(1:100), class = "data.frame")
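
      The difference between the two structures can be checked in base R without arrow. This is only an illustrative sketch; the variable names expected, actual, and lost are hypothetical.

```r
# The two structures quoted in this report.
expected <- structure(list(col1 = 1:100), row.names = c(NA, 100L), class = "data.frame")
actual <- structure(list(1:100), class = "data.frame")

# Attributes present on the expected object but missing from the actual one.
lost <- setdiff(names(attributes(expected)), names(attributes(actual)))
print(lost)

# Only the class attribute survives on the actual object.
print(names(attributes(actual)))
```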

      Demo

      I'm not sure what the easiest way is to set up a demo project for this, since you need to switch between arrow installations, but I've created this Docker-based demo:

      https://github.com/fdlk/arrow2/

          People

            Assignee: Jonathan Keane (jonkeane)
            Reporter: Fleur Kelpin (fdlk)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 2.5h