Apache Drill / DRILL-6307

Handle empty batches in record batch sizer correctly

Details

    Description

      When the record batch sizer receives an empty batch, it calculates the row width as zero. In that case, memory accounting and allocation for outgoing batches are done incorrectly.

      For example, in a merge join performing a left outer join, if the right-side batch is empty, we still have to include the right-side columns as nulls in the outgoing batch.

      Suppose the first batch is empty. For the outgoing batch, we then allocate empty vectors with zero capacity, so when we read the next batch with data, we end up going through the realloc loop. In addition, if we use a right-side row width of 0 in the outgoing row width calculation, the number of rows we calculate will be too high, and when a non-empty batch arrives later, we might exceed the memory limits.
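      The overestimate can be illustrated with a small sketch. The class, method, and all numbers below (a 16 MiB budget, a 64-byte left row, a 192-byte right row) are assumptions chosen for illustration, not Drill's actual code or defaults.

```java
// Hypothetical sketch: names and numbers are illustrative, not Drill's API.
final class RowCountEstimate {

    /**
     * Number of rows that fit in the outgoing batch budget, given the
     * per-row widths (in bytes) contributed by the left and right inputs.
     */
    static int rowCount(long budgetBytes, int leftRowWidth, int rightRowWidth) {
        int outgoingRowWidth = leftRowWidth + rightRowWidth;
        if (outgoingRowWidth <= 0) {
            throw new IllegalArgumentException("outgoing row width must be positive");
        }
        return (int) Math.min(Integer.MAX_VALUE, budgetBytes / outgoingRowWidth);
    }

    public static void main(String[] args) {
        long budget = 16L * 1024 * 1024; // assumed 16 MiB outgoing batch budget

        // Right side reported as width 0 because its first batch was empty.
        int optimistic = rowCount(budget, 64, 0);   // 262144 rows
        // Right side measured once a non-empty batch arrives.
        int realistic = rowCount(budget, 64, 192);  // 65536 rows

        // Memory needed if the optimistic row count is kept for real data.
        long needed = (long) optimistic * (64 + 192); // 64 MiB, 4x the budget
        System.out.println(optimistic + " " + realistic + " " + needed);
    }
}
```

      With these assumed widths, a row count sized for the empty right side would need four times the budget once the right side starts carrying data.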

      One possible workaround/solution: allocate memory based on the standard (std) size for an empty input batch, and use that allocation width as the width of the batch when calculating the number of rows.
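      A minimal sketch of this idea follows, assuming each column carries both a measured width and a type-based standard width. The `Column` class, its fields, and `allocationWidth` are hypothetical names introduced for illustration; they are not the record batch sizer's real interface, and the byte widths are made up.

```java
import java.util.List;

// Hypothetical sketch: names and widths are illustrative, not Drill's API.
final class EmptyBatchSizing {

    /** Per-column sizing info: width measured from data, and an assumed standard width. */
    static final class Column {
        final int measuredWidth; // bytes per row observed in the batch (0 if empty)
        final int stdWidth;      // assumed type-based standard size in bytes

        Column(int measuredWidth, int stdWidth) {
            this.measuredWidth = measuredWidth;
            this.stdWidth = stdWidth;
        }
    }

    /**
     * Row width to use for allocation and for the row-count calculation.
     * For an empty batch, fall back to the standard size of each column
     * instead of reporting zero.
     */
    static int allocationWidth(List<Column> columns, int recordCount) {
        int width = 0;
        for (Column c : columns) {
            width += (recordCount == 0) ? c.stdWidth : c.measuredWidth;
        }
        return width;
    }
}
```

      Using the same width for both allocation and the row-count calculation keeps the two consistent, so the outgoing vectors are not allocated with zero capacity and the row count is not inflated by a zero-width side.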

            People

              ppenumarthy Padma Penumarthy
              ppenumarthy Padma Penumarthy
              Paul Rogers Paul Rogers
