Apache Arrow / ARROW-17169

[Go] goPanicIndex in firstTimeBitmapWriter.Finish()

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 8.0.1, 9.0.0
    • Fix Version/s: 10.0.0
    • Component/s: Go, Parquet
    • Environment: go (1.18.3), Linux, AMD64

    Description

      I'm working with complex Parquet files that have 500+ "root" columns, where some fields are lists of structs, internally referred to as 'topics'. Some of these structs have hundreds of columns. When reading a particular topic, I get an index panic (goPanicIndex) at the line indicated below. The panic occurs when the value for the topic is null, that is, when this particular root record has no data for the topic. For example, the root is household data and the topic is auto, so the panic occurs when the household has no autos. The auto field is a nullable list of struct.

       

      /* Finish() was called from defLevelsToBitmapInternal.

      Data values when the panic occurs:
      bw.length == 17531
      bw.bitMask == 1
      bw.pos == 3424
      len(bw.buf) == 428
      cap(bw.buf) == 448
      bw.byteOffset == 428
      bw.curByte == 0
      */

      // bitmap_writer.go
      func (bw *firstTimeBitmapWriter) Finish() {
      	// store curByte into the bitmap
      	if bw.length > 0 && bw.bitMask != 0x01 || bw.pos < bw.length {
      		bw.buf[int(bw.byteOffset)] = bw.curByte // <---- index panic here
      	}
      }
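
      The reported values can be checked in isolation. The following is a minimal sketch, not the Arrow code: writerState, shouldStore, and reportedState are hypothetical names that mirror only the fields and the condition quoted above. It shows that the guard evaluates to true through the bw.pos < bw.length branch, while the index 428 is one past the last valid index of a 428-byte slice.

```go
package main

import "fmt"

// writerState is a hypothetical stand-in that mirrors only the fields
// quoted in the report. It is not the Arrow firstTimeBitmapWriter.
type writerState struct {
	buf        []byte
	pos        int
	length     int
	byteOffset int
	curByte    uint8
	bitMask    uint8
}

// shouldStore reproduces the condition that guards the store in Finish().
func (w *writerState) shouldStore() bool {
	return w.length > 0 && w.bitMask != 0x01 || w.pos < w.length
}

// reportedState builds a state from the values listed in the report.
func reportedState() *writerState {
	return &writerState{
		buf:        make([]byte, 428, 448),
		pos:        3424,
		length:     17531,
		byteOffset: 428,
		curByte:    0,
		bitMask:    1,
	}
}

func main() {
	w := reportedState()
	// 3424 bits are exactly 428 whole bytes, so the next byte index is 428.
	fmt.Println("whole bytes:", w.pos/8, "leftover bits:", w.pos%8)
	// bitMask == 1 makes the first clause false; pos < length makes the guard true.
	fmt.Println("store attempted:", w.shouldStore())
	// Valid indices are 0..len(buf)-1, so byteOffset == len(buf) is out of range.
	fmt.Println("index in range:", w.byteOffset < len(w.buf))
}
```

      Under these assumptions the store is attempted at an index equal to len(buf), which is what Go reports as an index-out-of-range panic.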
      

      In every case where the panic occurs, bw.byteOffset == len(bw.buf). I tested the modification below and it does prevent the panic. However, it probably only masks the underlying bug.

      // Test version: no panic
      func (bw *firstTimeBitmapWriter) Finish() {
      	// store curByte into the bitmap
      	if bw.length > 0 && bw.bitMask != 0x01 || bw.pos < bw.length {
      		if int(bw.byteOffset) == len(bw.buf) {
      			// byteOffset is one past the end: grow the slice instead of indexing
      			bw.buf = append(bw.buf, bw.curByte)
      		} else {
      			bw.buf[int(bw.byteOffset)] = bw.curByte
      		}
      	}
      }
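
      One reason the append-based version may only mask the problem can be illustrated with plain Go slice semantics. The sketch below uses a hypothetical sketchWriter type and finishWithAppend method (neither is the Arrow type or API) and assumes the writer was handed a slice owned by the caller. The append changes the length of the writer's own slice header, but the caller's slice keeps its original length.

```go
package main

import "fmt"

// sketchWriter is a hypothetical stand-in, not the Arrow firstTimeBitmapWriter.
type sketchWriter struct {
	buf        []byte
	pos        int
	length     int
	byteOffset int
	curByte    uint8
	bitMask    uint8
}

// finishWithAppend mirrors the test version above: append when the offset
// is exactly one past the end, otherwise store in place.
func (w *sketchWriter) finishWithAppend() {
	if w.length > 0 && w.bitMask != 0x01 || w.pos < w.length {
		if w.byteOffset == len(w.buf) {
			w.buf = append(w.buf, w.curByte)
		} else {
			w.buf[w.byteOffset] = w.curByte
		}
	}
}

func main() {
	// Assumption: the caller owns this slice and passes it to the writer.
	caller := make([]byte, 428, 448)
	w := &sketchWriter{buf: caller, pos: 3424, length: 17531, byteOffset: 428, bitMask: 1}

	w.finishWithAppend() // no panic

	// The writer's view grew by one byte; the caller's view did not.
	fmt.Println(len(w.buf), len(caller)) // prints "429 428"
}
```

      With the reported capacity of 448 the extra byte lands in the shared backing array, but any code that still holds the original 428-byte slice does not see it as part of the bitmap.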

            People

              Assignee: Matthew Topol (zeroshade)
              Reporter: Robert Purdom (Purdom)