ARROW-18274

[Go] Sparse union of structs is buggy

Details

    Description

      There is a bug with sparse unions of structs in v10.

      The first unit test below crashes with a panic (invalid memory address or nil pointer dereference). The second test works as expected.

       

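      // Import paths are assumed from the v10 mention above. arrow2 is the
      // reporter's own helper package; its import path is not shown here.
      import (
         "testing"

         "github.com/apache/arrow/go/v10/arrow"
         "github.com/apache/arrow/go/v10/arrow/array"
         "github.com/apache/arrow/go/v10/arrow/memory"
         "github.com/stretchr/testify/assert"
      )

      // Sparse union -> struct -> sparse union -> dictionary: panics.
      func TestDoesNotWork(t *testing.T) {
         dt1 := arrow.SparseUnionOf([]arrow.Field{
            {Name: "c", Type: arrow2.DictU16String},
         }, []arrow.UnionTypeCode{0})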
         dt2 := arrow.StructOf(
            arrow.Field{Name: "b", Type: dt1},
         )
         dt3 := arrow.SparseUnionOf([]arrow.Field{
            {Name: "a", Type: dt2},
         }, []arrow.UnionTypeCode{0})
         pool := memory.NewGoAllocator()
      
         builder := array.NewSparseUnionBuilder(pool, dt3)
         defer builder.Release()
         arr := builder.NewArray()
         defer arr.Release()
         assert.Equal(t, 0, arr.Len())
      }
      
      func TestWorksAsExpected(t *testing.T) {
         dt1 := arrow.SparseUnionOf([]arrow.Field{
            {Name: "c", Type: &arrow.DictionaryType{
               IndexType: arrow.PrimitiveTypes.Uint16,
               ValueType: arrow.BinaryTypes.String,
               Ordered:   false,
            }},
         }, []arrow.UnionTypeCode{0})
         dt2 := arrow.SparseUnionOf([]arrow.Field{
            {Name: "a", Type: dt1},
         }, []arrow.UnionTypeCode{0})
         pool := memory.NewGoAllocator()
      
         builder := array.NewSparseUnionBuilder(pool, dt2)
         defer builder.Release()
         arr := builder.NewArray()
         defer arr.Release()
         assert.Equal(t, 0, arr.Len())
      } 

       

      Analysis:

      • `NewSparseUnionBuilder` creates the builder for each variant and also calls `defer builder.Release()` on it.
      • The struct builder's Release method calls the Release method of every field even if its own refCount is not 0, so the Release method of the second union is called, followed by the Release method of the dictionary.
      • Although the union builder is returned without error, it is not usable.
      • This bug doesn't happen with two nested unions, because the union builder properly tests its internal counter before releasing its children.
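
The chain of releases described in the bullets above can be sketched with a toy reference-counting model. This is not Arrow's code: `node`, `wrap`, `nestedUnions`, and `unionOfStruct` are hypothetical names, and the `guarded` flag is an assumption that stands in for the difference between a Release that checks the refCount before touching its children and one that does not.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a builder with a reference count.
type node struct {
	refCount int
	children []*node
	guarded  bool // true: release children only once refCount reaches 0
	freed    bool
}

func newNode(guarded bool, children ...*node) *node {
	return &node{refCount: 1, guarded: guarded, children: children}
}

func (n *node) Retain() { n.refCount++ }

func (n *node) Release() {
	n.refCount--
	if n.guarded && n.refCount > 0 {
		return // still referenced elsewhere: leave the children alone
	}
	// An unguarded node reaches this loop on every call, which models the
	// struct behavior described in the analysis.
	for _, c := range n.children {
		c.Release()
	}
	if n.refCount == 0 {
		n.freed = true
	}
}

// wrap models the union constructor pattern from the analysis: the new
// parent retains the child, then the constructor's deferred Release drops
// the reference that was created together with the child.
func wrap(child *node) *node {
	child.Retain()
	defer child.Release()
	return newNode(true, child)
}

// nestedUnions models union -> union -> dictionary and reports whether the
// dictionary node was freed while the outer union is still alive.
func nestedUnions() bool {
	dict := newNode(true)
	inner := wrap(dict)
	_ = wrap(inner)
	return dict.freed
}

// unionOfStruct models union -> struct -> union -> dictionary.
func unionOfStruct(guardedStruct bool) bool {
	dict := newNode(true)
	inner := wrap(dict)
	st := newNode(guardedStruct, inner) // the struct owns inner directly
	_ = wrap(st)
	return dict.freed
}

func main() {
	fmt.Println("two nested unions, dictionary freed early:", nestedUnions())
	fmt.Println("struct with unconditional release, dictionary freed early:", unionOfStruct(false))
	fmt.Println("struct with guarded release, dictionary freed early:", unionOfStruct(true))
}
```

In this model only the unguarded struct frees the dictionary node while the outer union still holds the struct, which matches the observation that the returned builder exists but is not usable.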

       

      First, I don't understand why the Release method of each variant is called right after the union builder is constructed. I also don't understand why the Release method of the struct calls the Release method of each field regardless of the value of the internal refCount. This looks like a bug to me, but I'm not sure yet what the right way to fix it would be.

       

      Any ideas?

          People

            zeroshade Matthew Topol
            lquerel Laurent Querel