IMPALA-5192

Avoid hard coding pointer to the tuple pool into generated IR of Tuple::CodegenMaterializeExprs()

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend

      Description

      We currently require a tuple pool attached to the exec node, because codegen of Tuple::CodegenMaterializeExprs() needs a pointer to the tuple pool at codegen time. It would be better to use the tuple pool of the row batch instead.

      This affects Union and TopN nodes (and maybe others).


          Activity

          kwho Michael Ho added a comment -

          https://github.com/apache/incubator-impala/commit/e78d71e63328397ffdb59066982c0c6e83feb3d9

          IMPALA-5192: Don't bake MemPool* into IR
          Tuple::CodegenMaterializeExprs() currently bakes the MemPool*
          provided by its caller into the generated IR. The MemPool*
          usually belongs to the exec node that owns the codegen'd
          function and is used for allocating string buffers. With
          multi-threading, IR needs to be shared across multiple fragment
          instances, so IR can no longer contain pointers that are not
          shared across fragment instances.

          This change fixes the problem above by using the MemPool*
          argument passed to the IR function. It also cleans up
          UnionNode by removing the tuple_pool_ field and the logic
          for transferring buffers from tuple_pool_ to the MemPool
          of the row batch.

          Change-Id: I09d620e48032351ab9805825a4afb6536bed2302
          Reviewed-on: http://gerrit.cloudera.org:8080/6657
          Reviewed-by: Michael Ho <kwho@cloudera.com>
          Tested-by: Impala Public Jenkins
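
The difference between baking the pool pointer into the generated function and passing it as an argument can be sketched in plain C++, without LLVM. This is only an illustration under stated assumptions: `MemPool`, `StringSlot`, `CopyString`, `BuildBaked` and `MaterializeInto` below are hypothetical stand-ins, not Impala's actual classes or the IR that Tuple::CodegenMaterializeExprs() emits.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a memory pool: it owns every buffer it hands out.
class MemPool {
 public:
  char* Allocate(std::size_t len) {
    buffers_.push_back(std::make_unique<char[]>(len));
    total_bytes_ += len;
    return buffers_.back().get();
  }
  std::size_t total_bytes() const { return total_bytes_; }

 private:
  std::vector<std::unique_ptr<char[]>> buffers_;
  std::size_t total_bytes_ = 0;
};

// Hypothetical stand-in for a materialized string slot in a tuple.
struct StringSlot {
  char* ptr;
  std::size_t len;
};

// Copies 'src' into a buffer allocated from 'pool'.
StringSlot CopyString(const std::string& src, MemPool* pool) {
  char* buf = pool->Allocate(src.size());
  std::memcpy(buf, src.data(), src.size());
  return StringSlot{buf, src.size()};
}

// "Before": the pool is fixed when the function is built, which is analogous
// to embedding the MemPool* as a constant in the generated IR. The resulting
// function is only valid for the one owner of that pool.
using BakedFn = std::function<StringSlot(const std::string&)>;
BakedFn BuildBaked(MemPool* pool) {
  return [pool](const std::string& src) { return CopyString(src, pool); };
}

// "After": the pool is an ordinary argument, so one function can be shared
// by many callers, each supplying its own pool (e.g. that of a row batch).
using SharedFn = StringSlot (*)(const std::string&, MemPool*);
StringSlot MaterializeInto(const std::string& src, MemPool* pool) {
  return CopyString(src, pool);
}
```

With the second form, two callers can invoke the same `MaterializeInto` with different pools and each allocation lands in the pool of the caller, so there is no node-owned pool whose buffers must later be transferred to the row batch.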

          kwho Michael Ho added a comment -

          We shouldn't bake MemPool* into IR code.


            People

            • Assignee:
              kwho Michael Ho
              Reporter:
              tarasbob Taras Bobrovytsky
            • Votes:
              0
              Watchers:
              2
