Hive / HIVE-4160 Vectorized Query Execution in Hive / HIVE-4592

fix failure to set output isNull to true and other NULL propagation issues; update arithmetic tests

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: vectorization-branch
    • Fix Version/s: vectorization-branch, 0.13.0
    • Component/s: None
    • Labels: None

      Description

      ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch.
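The problem described above can be illustrated with a minimal sketch. The `LongVec` class and `add` method here are hypothetical, simplified stand-ins and not the code generated from ColumnArithmeticColumn.txt; the point is only that the output's noNulls flag is assigned on every call instead of being left over from an earlier batch.

```java
public class NoNullsPropagation {
    /** Simplified stand-in for a long column vector (assumption, not Hive's class). */
    static final class LongVec {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;

        LongVec(int n) {
            vector = new long[n];
            isNull = new boolean[n];
        }
    }

    /** Computes out = a + b over the first n rows and always resets out.noNulls. */
    static void add(LongVec a, LongVec b, LongVec out, int n) {
        // The output vector may be reused from a previous batch in which
        // noNulls was false, so the flag is assigned unconditionally.
        out.noNulls = a.noNulls && b.noNulls;
        for (int i = 0; i < n; i++) {
            // An isNull entry only counts when the owning vector's noNulls is false.
            out.isNull[i] = (!a.noNulls && a.isNull[i]) || (!b.noNulls && b.isNull[i]);
            out.vector[i] = a.vector[i] + b.vector[i];
        }
    }
}
```

With two null-free inputs, an output vector that still carries noNulls=false and stale isNull entries from an earlier batch comes back with noNulls=true and its isNull entries cleared.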

      1. HIVE-4592.1.patch
        121 kB
        Eric Hanson
      2. HIVE-4592.3.patch
        222 kB
        Eric Hanson
      3. HIVE-4592.4.patch
        242 kB
        Eric Hanson

        Activity

        Eric Hanson added a comment -

        Found some other issues in null propagation as well.

        Jitendra Nath Pandey added a comment -

        The same issue exists in many other templates. I think we should fix them too in the same JIRA.
        Also, most of these templates assume that noNulls=false and isRepeating=true means all values are null.

        Eric Hanson added a comment -

        Let me look at the other templates. I expect I will broaden this to cover other templates. I fixed the issue in this template ColumnArithmeticColumn.txt regarding the assumption that noNulls=false and isRepeating=true means all values are null, which is not a correct assumption.

        So, the patch is not ready yet. Wait for another revision.
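To make the incorrect assumption concrete, here is a small sketch using a hypothetical `LongVec` stand-in (not the actual template code). It assumes that when isRepeating is true only entry 0 is meaningful, and that noNulls=false means the vector may contain nulls, so isNull[0] still has to be consulted.

```java
public class RepeatingNullCheck {
    /** Simplified stand-in for a long column vector (assumption, not Hive's class). */
    static final class LongVec {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;
        boolean isRepeating = false;

        LongVec(int n) {
            vector = new long[n];
            isNull = new boolean[n];
        }
    }

    /** Returns true if the logical row is NULL, honoring isRepeating. */
    static boolean isNullAt(LongVec v, int row) {
        // For a repeating vector, entry 0 stands for every row.
        int idx = v.isRepeating ? 0 : row;
        // noNulls == false only means "nulls are possible"; isNull[idx] decides.
        return !v.noNulls && v.isNull[idx];
    }

    /** Returns the stored value for the logical row, honoring isRepeating. */
    static long valueAt(LongVec v, int row) {
        return v.vector[v.isRepeating ? 0 : row];
    }
}
```

A repeating vector with noNulls=false and isNull[0]=false therefore holds the same non-null value in every row; it is all-null only when isNull[0] is true.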

        Eric Hanson added a comment -

        Another issue is that NULL entries in output Long column vectors should have 1 in the data vector, and NULL entries in output Double column vectors should have NaN in the data vector. Adding tests and fixes for this.

        Eric Hanson added a comment -

        Completed review and update of Col-Col, Col-Scalar, and Scalar-Col arithmetic templates. Verified isRepeating and noNulls propagation from input(s) to output column. Made sure that NULL values have their data entry set properly in the output data vector as well, based on the spec (1 for long, NaN for double).
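The data-entry convention mentioned in this comment (1 for long, NaN for double) could be applied along these lines. This is only a sketch over plain arrays; the class and method names are made up for illustration and do not come from the patch.

```java
public class NullDataFill {
    /** Writes 1 into the data slot of every NULL row of a long output vector. */
    static void setNullDataEntries(long[] vector, boolean[] isNull, boolean noNulls, int n) {
        if (noNulls) {
            return; // no NULL rows, nothing to overwrite
        }
        for (int i = 0; i < n; i++) {
            if (isNull[i]) {
                vector[i] = 1L;
            }
        }
    }

    /** Writes NaN into the data slot of every NULL row of a double output vector. */
    static void setNullDataEntries(double[] vector, boolean[] isNull, boolean noNulls, int n) {
        if (noNulls) {
            return;
        }
        for (int i = 0; i < n; i++) {
            if (isNull[i]) {
                vector[i] = Double.NaN;
            }
        }
    }
}
```

Non-NULL rows are left untouched, so the helper can run after the arithmetic loop has filled the output vector.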

        Jitendra Nath Pandey added a comment -

        Long-long division is handled specially, as it is cast to double division. These expressions are no longer generated using templates. Please add the fix to those too.
        They are located in: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/
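As a rough illustration of long-long division carried out as double division, the sketch below casts both operands to double before dividing. The class and method names are hypothetical and this is not the code under the directory mentioned above; it only shows the arithmetic behavior of such a cast in Java.

```java
public class LongDivisionAsDouble {
    /** Divides two long arrays element-wise using double arithmetic. */
    static void divide(long[] a, long[] b, double[] out, int n) {
        for (int i = 0; i < n; i++) {
            // Casting first means the quotient keeps its fractional part, and a
            // zero divisor yields Infinity or NaN instead of an ArithmeticException.
            out[i] = (double) a[i] / (double) b[i];
        }
    }
}
```

For example, 7 / 2 evaluates to 3.5 here, 1 / 0 to positive infinity, and 0 / 0 to NaN.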

        Eric Hanson added a comment -

        Okay, I will do that.

        Eric Hanson added a comment -

        Patch is ready to go. See review board entry https://reviews.apache.org/r/11385/.

        Ashutosh Chauhan added a comment -

        Committed to branch. Thanks, Eric!


          People

          • Assignee: Eric Hanson
          • Reporter: Eric Hanson