  1. Apache Arrow
  2. ARROW-10623

[R] Version 1.0.1 breaks data.frame attributes when reading file written by 2.0.0

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.1, 2.0.0
    • Fix Version/s: 2.0.1, 3.0.0
    • Component/s: R

      Description

      How to reproduce

      • Create a data frame:
        df <- data.frame(col1 = 1:100)
      • Write it to a Parquet file using arrow 2.0.0. The demo uses R 3.6, but the same happens with R 4.0.
      • Read the Parquet file using arrow 1.0.1. I only tried this in R 3.6.
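
The steps above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual demo code: `write_step()` and `read_step()` are hypothetical helper names and `df.parquet` is an assumed file name. It relies only on `arrow::write_parquet()` and `arrow::read_parquet()`, and each helper is meant to be called from a separate R session with the corresponding arrow version installed.

```r
# Sketch of the reproduction, assuming the arrow R package is installed.
# write_step() and read_step() are hypothetical helper names.

# Writer side: intended to run in a session with arrow 2.0.0.
write_step <- function(path) {
  df <- data.frame(col1 = 1:100)
  arrow::write_parquet(df, path)
  invisible(df)
}

# Reader side: intended to run in a session with arrow 1.0.1.
read_step <- function(path) {
  df <- arrow::read_parquet(path)
  dput(df)  # print the structure so the attributes can be inspected
  invisible(df)
}

# In an R session with arrow 2.0.0:  write_step("df.parquet")
# In an R session with arrow 1.0.1:  read_step("df.parquet")
```

Printing with `dput()` makes the result directly comparable with the `structure(...)` calls shown under Expected and Actual.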

      Expected

      The data frame is the same as it was before:

      structure(list(col1 = 1:100), row.names = c(NA, 100L), class = "data.frame")

      Actual

      The data frame has lost some information:

      structure(list(1:100), class = "data.frame")
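
To make the difference explicit, the two structures above can be compared in base R without arrow. This is only an illustrative sketch; the variable names `expected`, `actual`, and `lost` are assumptions, not part of the original report.

```r
# The two objects exactly as quoted in the report.
expected <- structure(list(col1 = 1:100), row.names = c(NA, 100L),
                      class = "data.frame")
actual <- structure(list(1:100), class = "data.frame")

# Attributes present on the expected object but missing from the actual one.
lost <- setdiff(names(attributes(expected)), names(attributes(actual)))
print(lost)

# The column name is gone as well.
print(names(expected))
print(is.null(names(actual)))
```

With these inputs, `lost` should contain `names` and `row.names`, which matches the "lost some information" description: only the `class` attribute survives the round trip.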

      Demo

      I'm not sure what the easiest way to set up a demo project for this is, since you need to switch between arrow installations, but I've created this Docker-based demo:

      https://github.com/fdlk/arrow2/

            People

            • Assignee: Jonathan Keane (jonkeane)
            • Reporter: Fleur Kelpin (fdlk)

            Time Tracking

            • Estimated: Not Specified
            • Remaining: 0h
            • Logged: 2.5h