Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend
    • Labels:

      Description

      [localhost:21000] > create view test_view as SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col, year, month, day FROM functional.alltypesagg;
      Query: create view test_view as SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col, year, month, day FROM functional.alltypesagg
      ++
      ||
      ++
      ++
      Fetched 0 row(s) in 0.06s
      [localhost:21000] > select count(distinct null) from test_view;
      Query: select count(distinct null) from test_view
      Query submitted at: 2016-08-29 17:07:44 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
      ERROR: IllegalStateException: Grouping expr NULL returns type NULL_TYPE but its output tuple slot has type BOOLEAN
      

        Activity

        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        I am working on it now. my first step is to reproduce it on my side.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - I am working on it now. my first step is to reproduce it on my side.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        It seems that the class "AggregateInfo" throws the exception in "checkConsistency" method, where precondition for type checking is performed. I am browsing around source code to have _rough_ understanding on how the sql hits this code path. (starting from sql parsing, cup & jflex) My challenges for now is to read "Java" code while learning it.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - It seems that the class "AggregateInfo" throws the exception in "checkConsistency" method, where precondition for type checking is performed. I am browsing around source code to have _ rough _ understanding on how the sql hits this code path. (starting from sql parsing, cup & jflex) My challenges for now is to read "Java" code while learning it.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        [1] com.cloudera.impala.analysis.AggregateInfo.checkConsistency (AggregateInfo.java:684)
        [2] com.cloudera.impala.planner.AggregationNode.init (AggregationNode.java:163)
        [3] com.cloudera.impala.planner.SingleNodePlanner.createAggregationPlan (SingleNodePlanner.java:864)
        [4] com.cloudera.impala.planner.SingleNodePlanner.createSelectPlan (SingleNodePlanner.java:588)
        [5] com.cloudera.impala.planner.SingleNodePlanner.createQueryPlan (SingleNodePlanner.java:236)
        [6] com.cloudera.impala.planner.SingleNodePlanner.createSingleNodePlan (SingleNodePlanner.java:144)
        [7] com.cloudera.impala.planner.Planner.createPlan (Planner.java:79)
        [8] com.cloudera.impala.service.Frontend.createExecRequest (Frontend.java:975)
        [9] com.cloudera.impala.service.JniFrontend.createExecRequest (JniFrontend.java:150)

        I am starting from JniFrontend.java

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - [1] com.cloudera.impala.analysis.AggregateInfo.checkConsistency (AggregateInfo.java:684) [2] com.cloudera.impala.planner.AggregationNode.init (AggregationNode.java:163) [3] com.cloudera.impala.planner.SingleNodePlanner.createAggregationPlan (SingleNodePlanner.java:864) [4] com.cloudera.impala.planner.SingleNodePlanner.createSelectPlan (SingleNodePlanner.java:588) [5] com.cloudera.impala.planner.SingleNodePlanner.createQueryPlan (SingleNodePlanner.java:236) [6] com.cloudera.impala.planner.SingleNodePlanner.createSingleNodePlan (SingleNodePlanner.java:144) [7] com.cloudera.impala.planner.Planner.createPlan (Planner.java:79) [8] com.cloudera.impala.service.Frontend.createExecRequest (Frontend.java:975) [9] com.cloudera.impala.service.JniFrontend.createExecRequest (JniFrontend.java:150) I am starting from JniFrontend.java
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        I found interesting code that would be a cause of bug.

        FunctionCallExpr.java

        • L393: special code for count func
            if (fnName_.getFunction().equals("count") && params_.isDistinct()) {
              // Treat COUNT(DISTINCT ...) special because of how we do the rewrite.
              // There is no version of COUNT() that takes more than 1 argument but after
              // the rewrite, we only need count(*).
              // TODO: fix how we rewrite count distinct.
              argTypes = new Type[0];
              Function searchDesc = new Function(fnName_, argTypes, Type.INVALID, false);
              fn_ = db.getFunction(searchDesc, Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF);
              type_ = fn_.getReturnType();
              // Make sure BE doesn't see any TYPE_NULL exprs
              for (int i = 0; i < children_.size(); ++i) {
                if (getChild(i).getType().isNull()) {
                  uncheckedCastChild(ScalarType.BOOLEAN, i);   
                }
              }
              return;
            }
        

        As shown above, we change NULL_TYPE into BOOLEAN while this would cause failure in a type checking.

        I am digging further to stitch out all the clues to resolve this.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - I found interesting code that would be a cause of bug. FunctionCallExpr.java L393: special code for count func if (fnName_.getFunction().equals("count") && params_.isDistinct()) { // Treat COUNT(DISTINCT ...) special because of how we do the rewrite. // There is no version of COUNT() that takes more than 1 argument but after // the rewrite, we only need count(*). // TODO: fix how we rewrite count distinct. argTypes = new Type[0]; Function searchDesc = new Function(fnName_, argTypes, Type.INVALID, false); fn_ = db.getFunction(searchDesc, Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF); type_ = fn_.getReturnType(); // Make sure BE doesn't see any TYPE_NULL exprs for (int i = 0; i < children_.size(); ++i) { if (getChild(i).getType().isNull()) { uncheckedCastChild(ScalarType.BOOLEAN, i); } } return; } As shown above, we change NULL_TYPE into BOOLEAN while this would cause failure in a type checking. I am digging further to stitch out all the clues to resolve this.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        I am seeing that expression substitution changes BOOLEAN back to NULL_TYPE, which makes the precondition check fail. I am reading code to understand what's substitution for and come up w/ a fix.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - I am seeing that expression substitution changes BOOLEAN back to NULL_TYPE, which makes the precondition check fail. I am reading code to understand what's substitution for and come up w/ a fix.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        // in Expr.java

          protected Expr substituteImpl(ExprSubstitutionMap smap, Analyzer analyzer)                                                  
              throws AnalysisException {                                                                                              
            if (isImplicitCast()) return getChild(0).substituteImpl(smap, analyzer);                                                  
            if (smap != null) {                                                                                                       
              Expr substExpr = smap.get(this);                                                                                        
              if (substExpr != null) return substExpr.clone();                                                                        
            }                                                                                                                         
            for (int i = 0; i < children_.size(); ++i) {                                                                              
              children_.set(i, children_.get(i).substituteImpl(smap, analyzer));                                                      
            }                                                                                                                         
            // SlotRefs must remain analyzed to support substitution across query blocks. All                                         
            // other exprs must be analyzed again after the substitution to add implicit casts                                        
            // and for resolving their correct function signature.                                                                    
            if (!(this instanceof SlotRef)) resetAnalysisState();                                                                     
            return this;                                                                                                              
          }                                                                                                                           
        

        specifically, resetAnalysisState() changes BOOLEAN back to NULL_TYPE.

        next step is to have more understanding on the logic for substitution and try to come up w/ solution.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - // in Expr.java protected Expr substituteImpl(ExprSubstitutionMap smap, Analyzer analyzer) throws AnalysisException { if (isImplicitCast()) return getChild(0).substituteImpl(smap, analyzer); if (smap != null) { Expr substExpr = smap.get(this); if (substExpr != null) return substExpr.clone(); } for (int i = 0; i < children_.size(); ++i) { children_.set(i, children_.get(i).substituteImpl(smap, analyzer)); } // SlotRefs must remain analyzed to support substitution across query blocks. All // other exprs must be analyzed again after the substitution to add implicit casts // and for resolving their correct function signature. if (!(this instanceof SlotRef)) resetAnalysisState(); return this; } specifically, resetAnalysisState() changes BOOLEAN back to NULL_TYPE. next step is to have more understanding on the logic for substitution and try to come up w/ solution.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        I've confirmed that "select count(distinct null) from functional.alltypesagg;" works fine while "select count(distinct null) from test_view;" shows the exception. Both of two exercises the same code path except the substitution part. W/ the view, substitution for NullLiteral gets triggered, which sets its type back to NULL while it was set to BOOLEAN before. This is the cause of the bug. My solution is not to substitute NullLiteral. Now, I am implementing a fix.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - I've confirmed that "select count(distinct null) from functional.alltypesagg;" works fine while "select count(distinct null) from test_view;" shows the exception. Both of two exercises the same code path except the substitution part. W/ the view, substitution for NullLiteral gets triggered, which sets its type back to NULL while it was set to BOOLEAN before. This is the cause of the bug. My solution is not to substitute NullLiteral. Now, I am implementing a fix.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        I've implemented the fix and run unit tests, planner and analyzestmt. Both of them are passing. With the fix, no exception is thrown. Here is the output.

        [localhost:21000] > select count(distinct null) from test_view;
        Query: select count(distinct null) from test_view
        Query submitted at: 2016-09-30 13:39:05 (Coordinator: http://impala-OptiPlex-7040:25000)
        Query progress can be monitored at: http://impala-OptiPlex-7040:25000/query_plan?query_id=5745df9bb49d729f:da769e8d00000000
        +----------------------+
        | count(distinct null) |
        +----------------------+
        | 0                    |
        +----------------------+
        Fetched 1 row(s) in 9.57s
        
        [localhost:21000] > explain select count(distinct null) from test_view;
        Query: explain select count(distinct null) from test_view
        +-----------------------------------------------------------+
        | Explain String                                            |
        +-----------------------------------------------------------+
        | Estimated Per-Host Requirements: Memory=100.00MB VCores=2 |
        |                                                           |
        | 06:AGGREGATE [FINALIZE]                                   |
        | |  output: count:merge(NULL)                              |
        | |                                                         |
        | 05:EXCHANGE [UNPARTITIONED]                               |
        | |                                                         |
        | 02:AGGREGATE                                              |
        | |  output: count(NULL)                                    |
        | |                                                         |
        | 04:AGGREGATE                                              |
        | |  group by: NULL                                         |
        | |                                                         |
        | 03:EXCHANGE [HASH(NULL)]                                  |
        | |                                                         |
        | 01:AGGREGATE [STREAMING]                                  |
        | |  group by: NULL                                         |
        | |                                                         |
        | 00:SCAN HDFS [functional.alltypesagg]                     |
        |    partitions=11/11 files=11 size=814.73KB                |
        +-----------------------------------------------------------+
        Fetched 20 row(s) in 0.06s
        

        I will run exhaustive test and go for code review once it gets complete successfully.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - I've implemented the fix and run unit tests, planner and analyzestmt. Both of them are passing. With the fix, no exception is thrown. Here is the output. [localhost:21000] > select count(distinct null) from test_view; Query: select count(distinct null) from test_view Query submitted at: 2016-09-30 13:39:05 (Coordinator: http://impala-OptiPlex-7040:25000) Query progress can be monitored at: http://impala-OptiPlex-7040:25000/query_plan?query_id=5745df9bb49d729f:da769e8d00000000 +----------------------+ | count(distinct null) | +----------------------+ | 0 | +----------------------+ Fetched 1 row(s) in 9.57s [localhost:21000] > explain select count(distinct null) from test_view; Query: explain select count(distinct null) from test_view +-----------------------------------------------------------+ | Explain String | +-----------------------------------------------------------+ | Estimated Per-Host Requirements: Memory=100.00MB VCores=2 | | | | 06:AGGREGATE [FINALIZE] | | | output: count:merge(NULL) | | | | | 05:EXCHANGE [UNPARTITIONED] | | | | | 02:AGGREGATE | | | output: count(NULL) | | | | | 04:AGGREGATE | | | group by: NULL | | | | | 03:EXCHANGE [HASH(NULL)] | | | | | 01:AGGREGATE [STREAMING] | | | group by: NULL | | | | | 00:SCAN HDFS [functional.alltypesagg] | | partitions=11/11 files=11 size=814.73KB | +-----------------------------------------------------------+ Fetched 20 row(s) in 0.06s I will run exhaustive test and go for code review once it gets complete successfully.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        Currently, the fix is out for code review.

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - Currently, the fix is out for code review.
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        The fix has been merged

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - The fix has been merged
        Hide
        yonghyun_impala_8905 Yonghyun Hwang added a comment -

        IMPALA-4042: Preserve root types when substituting grouping exprs

        In case of count(distinct), FunctionCallExpr.analyze() changes type
        for "NULL" into "BOOLEAN" to make sure that BE doesn't see any
        "NULL_TYPE" exprs. In the meantime, Expr substitution, happening in
        Expr.substituteImpl() reverts this change back to original type,
        "NULL_TYPE".

        This causes an issue when AggregateInfo.checkConsistency() performs
        precondition check where slot types from
        AggregateInfo.outputTupleDesc_ should be matched with the types from
        AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type
        from groupingExpr_ is "NULL_TYPE", which makes the precondition fail
        and throws an exception.

        To resolve the issue, preserveRootType is set to true when
        Expr.substituteList() gets called in AggregateInfo.substitute()

        Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1
        (cherry picked from commit b17785b4890bedd1c825140ce3c48cd7d9734295)
        Reviewed-on: http://gerrit.cloudera.org:8080/4600
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins

        Show
        yonghyun_impala_8905 Yonghyun Hwang added a comment - IMPALA-4042 : Preserve root types when substituting grouping exprs In case of count(distinct), FunctionCallExpr.analyze() changes type for "NULL" into "BOOLEAN" to make sure that BE doesn't see any "NULL_TYPE" exprs. In the meantime, Expr substitution, happening in Expr.substituteImpl() reverts this change back to original type, "NULL_TYPE". This causes an issue when AggregateInfo.checkConsistency() performs precondition check where slot types from AggregateInfo.outputTupleDesc_ should be matched with the types from AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type from groupingExpr_ is "NULL_TYPE", which makes the precondition fail and throws an exception. To resolve the issue, preserveRootType is set to true when Expr.substituteList() gets called in AggregateInfo.substitute() Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 (cherry picked from commit b17785b4890bedd1c825140ce3c48cd7d9734295) Reviewed-on: http://gerrit.cloudera.org:8080/4600 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins

          People

          • Assignee:
            yonghyun_impala_8905 Yonghyun Hwang
            Reporter:
            mjacobs Matthew Jacobs
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development