  1. Apache Arrow
  2. ARROW-18161

[Ruby] Tables can have buffers get GC'ed


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 9.0.0
    • Fix Version/s: None
    • Component/s: Ruby
    • Environment: Ruby 3.1.2

    Description

      Given an Arrow::Table "x" with several columns:

       

      # Rails console outputs
      3.1.2 :107 > x.schema
       => 
      #<Arrow::Schema:0x7ff2fbc426d8 ptr=0x55851587bc20 actual_values: int64
      dates: date32[day]                             
      expected_values: double>                       
      3.1.2 :108 > x.schema
       => 
      #<Arrow::Schema:0x7ff2fbbcda68 ptr=0x55851a541020 actual_values: int64
      dates: date32[day]                             
      expected_values: double>                       
      3.1.2 :109 >  

      Note that both the object address and the pointer differ between the two calls to x.schema.

      The far bigger issue is that repeated reads of the same cell return different results:

      3.1.2 :097 > x[1][0]
       => Sun, 22 Aug 2021 
      3.1.2 :098 > x[1][1]
       => nil 
      3.1.2 :099 > x[1][0]
       => nil 

      I see many problems like this: after read operations of this kind, the original table ends up with the data in its columns shuffled around or deleted.
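The unstable reads described above can be detected mechanically. The following is a minimal sketch, not part of the original report: `unstable_cells` is a hypothetical helper name, and it assumes only that each column responds to `[]` with a row index, as the columns in the console session above do. It reads every cell several times and collects those whose value changes between reads.

```ruby
# Hypothetical helper (not part of Red Arrow): reads every cell `passes`
# times and returns [column_index, row_index, values] for each cell whose
# repeated reads disagree.
def unstable_cells(columns, n_rows, passes: 3)
  unstable = []
  columns.each_with_index do |column, column_index|
    n_rows.times do |row_index|
      # Read the same cell repeatedly; a healthy table yields identical values.
      values = Array.new(passes) { column[row_index] }
      unstable << [column_index, row_index, values] unless values.uniq.size == 1
    end
  end
  unstable
end

# Assumed usage with a table like the one in this report:
#   unstable_cells(x.columns, x.n_rows)
# An empty result means no instability was observed during these reads;
# it does not prove the underlying buffers are still alive.
```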

      I do ingest the data in a slightly unusual way: it arrives over gRPC, I read it into an Arrow::Buffer, and I pass that buffer to Arrow::Table.load. Still, I would not expect that, once the data is in an Arrow::Table, I could do anything to permute it unintentionally.

            People

              Assignee: Kouhei Sutou (kou)
              Reporter: Noah Horton (nhorton)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m