
Support Decimal type in generated code of Avro scanner

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 2.7.0
    • Component/s: Backend

      Description

      HdfsAvroScanner::CodegenReadScalar() has no case for AVRO_DECIMAL in its switch statement, so the scanner is not codegened when the schema contains AVRO_DECIMAL. However, the Avro scanner appears to have been ready to call ReadAvroDecimal() all along, as the following code snippet suggests:

        if (slot_desc != NULL) {
          // Field corresponds to a materialized column, fill in relevant arguments
          write_slot_val = builder->getTrue();
          if (slot_desc->type().type == TYPE_DECIMAL) {
            // ReadAvroDecimal() takes slot byte size instead of slot type
            slot_type_val = builder->getInt32(slot_desc->type().GetByteSize());
          } else {
            slot_type_val = builder->getInt32(slot_desc->type().type);
          }
          Value* slot_val =
              builder->CreateStructGEP(tuple_val, slot_desc->field_idx(), "slot");
          opaque_slot_val =
              builder->CreateBitCast(slot_val, codegen->ptr_type(), "opaque_slot");
        }
      

      We should just add AVRO_DECIMAL to the switch statement and see if things work.
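
The effect of the missing case can be sketched in isolation. The following is a minimal, self-contained model, not Impala's actual code: the enum `SketchAvroType`, the functions `ReaderFor()` and `CanCodegenSchema()`, and the flag `decimal_case_present` are all hypothetical names introduced for illustration. Only `ReadAvroDecimal` is named in this issue; the other reader names are placeholders. An empty return value stands in for "codegen gives up on this schema".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for Avro schema element types (not Impala's enum).
enum class SketchAvroType { kInt32, kInt64, kString, kDecimal, kRecord };

// Returns the name of the reader routine for 'type', or an empty string
// when the switch has no usable case, which models "cannot codegen".
// 'decimal_case_present' toggles between the behavior before and after
// adding the decimal case.
std::string ReaderFor(SketchAvroType type, bool decimal_case_present) {
  switch (type) {
    case SketchAvroType::kInt32:
      return "ReadAvroInt32";   // placeholder name
    case SketchAvroType::kInt64:
      return "ReadAvroInt64";   // placeholder name
    case SketchAvroType::kString:
      return "ReadAvroString";  // placeholder name
    case SketchAvroType::kDecimal:
      // Without this case the type would fall through to 'default'.
      if (decimal_case_present) return "ReadAvroDecimal";
      return "";
    default:
      return "";
  }
}

// A schema can be handled by generated code only if every field has a reader.
bool CanCodegenSchema(const std::vector<SketchAvroType>& fields,
                      bool decimal_case_present) {
  for (SketchAvroType field : fields) {
    if (ReaderFor(field, decimal_case_present).empty()) return false;
  }
  return true;
}
```

In this model a single decimal field is enough to disable generated code for the whole schema, which is why adding one case can change the performance of an entire scan.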

        Activity

        kwho Michael Ho added a comment -

        IMPALA-3206: Enable codegen for AVRO_DECIMAL

        This change adds the missing AVRO_DECIMAL case to the switch
        statement in CodegenReadScalar() so that we also codegen
        when an Avro table contains AVRO_DECIMAL.
        With this change, the following query improves by 37.5%,
        going from 8s to 5s:

        select count(distinct l_linenumber), avg(l_extendedprice), max(l_discount), min(l_tax) from tpch15_avro.lineitem;

        This change also un-inlines BitUtil::ByteSwap() as the
        third argument 'len' is not a compile-time constant at
        all call sites.

        Change-Id: I51adf0c1ba76e055f31ccb0034a0d23ea2afb30e
        Reviewed-on: http://gerrit.cloudera.org:8080/3489
        Reviewed-by: Michael Ho <kwho@cloudera.com>
        Tested-by: Internal Jenkins
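
The remark about 'len' in the comment above can be illustrated with a small sketch. Avro encodes a decimal as the big-endian two's-complement bytes of its unscaled value, so decoding it on a little-endian host involves reversing a byte run whose length is known only at run time. The code below is an assumption-laden illustration, not Impala's BitUtil::ByteSwap() or ReadAvroDecimal(): `ByteSwapSketch()` and `DecodeBigEndianSigned()` are hypothetical names, the host is assumed to be little-endian, and values are limited to 8 bytes for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies 'len' bytes from 'src' to 'dst' in reverse
// order. 'len' is a run-time value here, so the compiler cannot specialize
// the loop the way it could for a constant length.
void ByteSwapSketch(void* dst, const void* src, int len) {
  assert(len >= 0);
  auto* d = static_cast<uint8_t*>(dst);
  const auto* s = static_cast<const uint8_t*>(src);
  for (int i = 0; i < len; ++i) d[i] = s[len - 1 - i];
}

// Decodes a big-endian two's-complement value of 'len' bytes (1..8) into
// an int64_t. Assumes a little-endian host.
int64_t DecodeBigEndianSigned(const uint8_t* bytes, int len) {
  assert(len >= 1 && len <= 8);
  // Pre-fill with the sign bit so encodings shorter than 8 bytes
  // are sign-extended.
  uint64_t raw = (bytes[0] & 0x80) ? ~uint64_t{0} : uint64_t{0};
  // Overwrite the low 'len' bytes with the reversed input.
  ByteSwapSketch(&raw, bytes, len);
  int64_t result;
  std::memcpy(&result, &raw, sizeof(result));
  return result;
}
```

Because the encoded length varies from value to value, a call such as `DecodeBigEndianSigned(bytes, len)` passes a length that is not fixed at compile time, which matches the situation the comment describes for some call sites.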


          People

          • Assignee:
            kwho Michael Ho
            Reporter:
            kwho Michael Ho