  1. Apache Arrow
  2. ARROW-17824

[C++][Gandiva] Implement preallocation for variable length output buffer

Details

    Description

      When an expression's output type is variable-length (e.g. string), Gandiva reallocates the output buffer for every row to make room for that row's output. With a large number of rows, this per-row reallocation performs poorly with some memory allocators.

      We can amortize the allocation cost with a std::vector-like approach: first allocate an initial buffer sized according to the input, double the buffer each time it runs out of space, and shrink it to the actual size at the end. Arrow's string builder also uses this approach.
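
The doubling strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Gandiva's or Arrow's actual code: `GrowableBuffer`, `AppendRows`, and `growth_count` are hypothetical names introduced here, and plain `malloc`/`realloc` stands in for whatever allocator the real implementation would use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>
#include <vector>

// Hypothetical illustration only; not an actual Gandiva or Arrow class.
class GrowableBuffer {
 public:
  // initial_capacity is the caller's guess, e.g. derived from the input size.
  explicit GrowableBuffer(std::size_t initial_capacity)
      : capacity_(std::max<std::size_t>(initial_capacity, 1)) {
    data_ = static_cast<char*>(std::malloc(capacity_));
    if (data_ == nullptr) throw std::bad_alloc();
  }
  ~GrowableBuffer() { std::free(data_); }
  GrowableBuffer(const GrowableBuffer&) = delete;
  GrowableBuffer& operator=(const GrowableBuffer&) = delete;

  // Appends len bytes, doubling the capacity until the new bytes fit.
  void Append(const char* src, std::size_t len) {
    if (size_ + len > capacity_) {
      std::size_t new_capacity = capacity_;
      while (new_capacity < size_ + len) new_capacity *= 2;
      Reallocate(new_capacity);
      ++growth_count_;
    }
    if (len > 0) std::memcpy(data_ + size_, src, len);
    size_ += len;
    assert(size_ <= capacity_);
  }

  // Releases the unused tail once all rows have been written.
  void ShrinkToFit() {
    if (size_ > 0 && size_ < capacity_) Reallocate(size_);
  }

  const char* data() const { return data_; }
  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }
  std::size_t growth_count() const { return growth_count_; }

 private:
  void Reallocate(std::size_t new_capacity) {
    void* p = std::realloc(data_, new_capacity);
    if (p == nullptr) throw std::bad_alloc();
    data_ = static_cast<char*>(p);
    capacity_ = new_capacity;
  }

  char* data_ = nullptr;
  std::size_t size_ = 0;
  std::size_t capacity_;
  std::size_t growth_count_ = 0;  // number of times the buffer had to grow
};

// Writes every row into buf and returns Arrow-style offsets
// (one entry per row plus a leading start offset), then shrinks the buffer.
std::vector<std::int32_t> AppendRows(const std::vector<std::string>& rows,
                                     GrowableBuffer* buf) {
  std::vector<std::int32_t> offsets;
  offsets.reserve(rows.size() + 1);
  offsets.push_back(static_cast<std::int32_t>(buf->size()));
  for (const std::string& row : rows) {
    buf->Append(row.data(), row.size());
    offsets.push_back(static_cast<std::int32_t>(buf->size()));
  }
  buf->ShrinkToFit();
  return offsets;
}
```

Because the capacity doubles on each growth, the number of reallocations grows logarithmically with the total output size rather than linearly with the row count, which is the amortization the description refers to. A better initial estimate reduces the number of growth steps further.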

            People

              jinshang Jin Shang
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 10m