  1. Hive
  2. HIVE-4160 Vectorized Query Execution in Hive
  3. HIVE-4553

Column Column, and Column Scalar vectorized execution tests


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: vectorization-branch
    • Fix Version/s: vectorization-branch, 0.13.0
    • Component/s: None
    • Labels: None

    Description

      Review Board review: https://reviews.apache.org/r/11133/

      This patch adds Column Column and Column Scalar vectorized execution tests. These tests are generated in parallel with the vectorized expressions. The tests focus on validating the column vector and the vectorized row batch metadata regarding nulls, repeating, and selection.
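
      As an illustration of the selection metadata mentioned above, here is a minimal sketch of how a filter test might check the selection vector against the input data. RowBatchSketch and FilterSelectionSketch are hypothetical stand-ins, not classes from the patch, and the sketch ignores nulls and repeating values and always sets selectedInUse for simplicity.

```java
// Hypothetical stand-in for a vectorized row batch with one long column.
class RowBatchSketch {
  final long[] column;
  final int[] selected;          // indices of rows that survive a filter
  boolean selectedInUse = false; // true once 'selected' is meaningful
  int size;                      // number of live rows

  RowBatchSketch(long[] column) {
    this.column = column;
    this.selected = new int[column.length];
    this.size = column.length;
  }
}

class FilterSelectionSketch {
  // Hypothetical filter: keep rows where column > scalar.
  // Assumes no selection is in use before the call.
  static void filterGreaterThanScalar(RowBatchSketch batch, long scalar) {
    int kept = 0;
    for (int i = 0; i < batch.size; i++) {
      if (batch.column[i] > scalar) {
        batch.selected[kept++] = i;
      }
    }
    batch.selectedInUse = true;
    batch.size = kept;
  }

  // Recomputes the expected selection from the original data and
  // compares it, in order, with what the filter produced.
  static boolean selectionMatchesData(RowBatchSketch batch, long[] original, long scalar) {
    if (!batch.selectedInUse) {
      return false;
    }
    int expected = 0;
    for (int i = 0; i < original.length; i++) {
      if (original[i] > scalar) {
        if (expected >= batch.size || batch.selected[expected] != i) {
          return false;
        }
        expected++;
      }
    }
    return expected == batch.size;
  }
}
```

      The check is deliberately independent of the filter implementation: it recomputes the expected row indices from the generated data rather than trusting the batch.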

      Overview of Changes:

      CodeGen.java:
      + joinPath, getCamelCaseType, readFile and writeFile made static for use in TestCodeGen.java.
      + filter types now specify null as their output type rather than "doesn't matter" to make detection for test generation easier.
      + support for test generation added.

      TestCodeGen.java & Templates:
      TestClass.txt,
      TestColumnColumnFilterVectorExpressionEvaluation.txt,
      TestColumnColumnOperationVectorExpressionEvaluation.txt,
      TestColumnScalarFilterVectorExpressionEvaluation.txt,
      TestColumnScalarOperationVectorExpressionEvaluation.txt
      + This class is mutable and maintains a hash map from TestSuiteClassName to test cases. Test cases are added over the course of vectorized expression class generation, and the test classes are written out at the end. For each column vector (inputs and/or outputs), a pairwise-covering matrix of Booleans is used to generate test cases across the nulls and repeating dimensions. Based on the nulls and repeating states of the input column vector(s), the state of the output column vector (if there is one) is validated, along with the null vector. For filter operations, the selection vector is validated against the generated data. Each template corresponds to a class representing a test suite.
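
      The pairwise-covering idea can be sketched as follows. The matrix below is one valid five-row covering array for four Boolean factors; it is an assumption for illustration, not the matrix used in the patch, and the factor order and the PairwiseCoverageSketch names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PairwiseCoverageSketch {
  // Hypothetical factor order:
  // left nulls, left repeating, right nulls, right repeating.
  static final boolean[][] CASES = {
    {false, false, false, false},
    {false, true,  true,  true },
    {true,  false, true,  true },
    {true,  true,  false, true },
    {true,  true,  true,  false},
  };

  // True if, for every pair of factors, all four value combinations
  // (FF, FT, TF, TT) appear in at least one row.
  static boolean isPairwiseCovering(boolean[][] rows) {
    int factors = rows[0].length;
    for (int a = 0; a < factors; a++) {
      for (int b = a + 1; b < factors; b++) {
        boolean[] seen = new boolean[4];
        for (boolean[] row : rows) {
          seen[(row[a] ? 2 : 0) + (row[b] ? 1 : 0)] = true;
        }
        for (boolean s : seen) {
          if (!s) {
            return false;
          }
        }
      }
    }
    return true;
  }

  // Builds one descriptive test-case name per row of the matrix.
  static List<String> testCaseNames(String prefix, boolean[][] rows) {
    List<String> names = new ArrayList<>();
    for (boolean[] r : rows) {
      names.add(prefix
          + "Left" + (r[0] ? "Nulls" : "NoNulls") + (r[1] ? "Repeats" : "NoRepeat")
          + "Right" + (r[2] ? "Nulls" : "NoNulls") + (r[3] ? "Repeats" : "NoRepeat"));
    }
    return names;
  }
}
```

      Five rows cover every pair of settings, whereas exhaustive enumeration of four Boolean factors would need sixteen test cases per expression.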

      VectorizedRowGroupUtil.java:
      + added methods generateLongColumnVector and generateDoubleColumnVector for generating the respective column vectors with optional nulls and/or repeating values.
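
      A generator of this kind might look roughly like the sketch below. LongColumnVectorSketch is a stand-in with only the fields the example needs, and the parameter list, the one-in-four null rate, and the placeholder value for null entries are all assumptions rather than details taken from the patch.

```java
import java.util.Random;

// Stand-in for a long column vector; only the fields this sketch needs.
class LongColumnVectorSketch {
  final long[] vector;
  final boolean[] isNull;
  boolean noNulls = true;      // false means entries may be null
  boolean isRepeating = false; // true means entry 0 stands for every row

  LongColumnVectorSketch(int size) {
    vector = new long[size];
    isNull = new boolean[size];
  }
}

class ColumnVectorGenSketch {
  // Hypothetical signature: flags choose whether the vector may hold nulls
  // and whether it is marked as repeating.
  static LongColumnVectorSketch generateLongColumnVector(
      boolean nulls, boolean repeating, int size, Random rand) {
    LongColumnVectorSketch lcv = new LongColumnVectorSketch(size);
    lcv.noNulls = !nulls;
    lcv.isRepeating = repeating;
    for (int i = 0; i < size; i++) {
      if (repeating && i > 0) {
        // A repeating vector carries the same value and null flag in every row.
        lcv.vector[i] = lcv.vector[0];
        lcv.isNull[i] = lcv.isNull[0];
      } else {
        // Assumed null rate of roughly one in four when nulls are requested.
        lcv.isNull[i] = nulls && rand.nextInt(4) == 0;
        // Null entries get a fixed placeholder so their data is well defined.
        lcv.vector[i] = lcv.isNull[i] ? 1L : rand.nextLong();
      }
    }
    return lcv;
  }
}
```

      Passing a seeded Random keeps the generated data reproducible, which makes a failing generated test easier to rerun.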

      Attachments

        1. HIVE-4553.4.patch
          1.03 MB
          Tony Murphy
        2. HIVE-4553.5.patch
          1.05 MB
          Tony Murphy
        3. HIVE-4553.patch
          922 kB
          Tony Murphy
        4. HIVE-4553 (2).patch
          1000 kB
          Tony Murphy
        5. HIVE-4553 (3).patch
          42 kB
          Tony Murphy


          People

            Assignee: Tony Murphy (anthony.murphy)
            Reporter: Tony Murphy (anthony.murphy)
            Votes: 0
            Watchers: 3
