Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: vectorization-branch
    • Fix Version/s: vectorization-branch, 0.13.0
    • Component/s: None
    • Labels:
      None

      Description

      Suppose you have a custom UDF myUDF() that you've created to extend Hive. The goal of this JIRA is to create a facility so that a query that uses myUDF() in an expression still runs in vectorized mode.

      This would be a general-purpose bridge for custom UDFs that users add to Hive. It would work with existing UDFs.
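The bridge idea can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: LongBatch and RowFunctionBridge are hypothetical stand-ins invented for this example, a plain Java function plays the role of the row-mode UDF, and propagating nulls straight to the output is an assumption made to keep the example short.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical stand-in for a batch of long values; not Hive's actual column vector class.
final class LongBatch {
    final long[] values;
    final boolean[] isNull;
    final int size;

    LongBatch(int size) {
        this.values = new long[size];
        this.isNull = new boolean[size];
        this.size = size;
    }
}

// Hypothetical adaptor: wraps a row-at-a-time function so it can consume a whole batch.
final class RowFunctionBridge {
    private final LongUnaryOperator rowFunction;

    RowFunctionBridge(LongUnaryOperator rowFunction) {
        this.rowFunction = rowFunction;
    }

    // Applies the row function to every non-null entry.
    // Assumption: a null input yields a null output without calling the function.
    LongBatch evaluate(LongBatch input) {
        LongBatch output = new LongBatch(input.size);
        for (int i = 0; i < input.size; i++) {
            if (input.isNull[i]) {
                output.isNull[i] = true;
            } else {
                output.values[i] = rowFunction.applyAsLong(input.values[i]);
            }
        }
        return output;
    }
}
```

The surrounding operators only ever see batches, so the existing row-mode function does not need to change; the cost is one function call per row inside the loop.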

      I'm considering a separate JIRA for a new kind of custom UDF implementation that is vectorized from the beginning, to optimize performance. That is not covered by this JIRA.

      1. vectorUDF.4.patch
        24 kB
        Eric Hanson
      2. vectorUDF.5.patch
        24 kB
        Eric Hanson
      3. vectorUDF.8.patch
        36 kB
        Eric Hanson
      4. vectorUDF.9.patch
        48 kB
        Eric Hanson
      5. HIVE-4961.1-vectorization.patch
        51 kB
        Eric Hanson
      6. HIVE-4961.2-vectorization.patch
        50 kB
        Eric Hanson
      7. HIVE-4961.3-vectorization.patch
        50 kB
        Eric Hanson
      8. HIVE-4961.4-vectorization.patch
        52 kB
        Eric Hanson

        Activity

        Eric Hanson added a comment -

        See the design specification attached to HIVE-4160 for a design description of this patch.

        Ashutosh Chauhan added a comment -

        Committed to branch. Thanks, Eric!

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12603642/HIVE-4961.4-vectorization.patch

        ERROR: -1 due to 8 failed/errored test(s), 3954 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
        org.apache.hcatalog.cli.TestPermsGrp.testCustomPerms
        org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
        org.apache.hive.hcatalog.mapreduce.TestHCatExternalHCatNonPartitioned.testHCatNonPartitionedTable
        org.apache.hive.hcatalog.mapreduce.TestHCatExternalPartitioned.testHCatPartitionedTable
        

        Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/786/testReport
        Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/786/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests failed with: TestsFailedException: 8 tests failed
        

        This message is automatically generated.

        Eric Hanson added a comment -

        Refactor packages per request from Ashutosh.

        Ashutosh Chauhan added a comment -

        I have a few concerns regarding code organization.

        • We should create new packages ql/exec/vector/udf, ql/exec/vector/udf/legacy and ql/exec/vector/udf/generic.
        • We should put classes VectorUDFAdaptor and VectorUDFArgDesc in vector/udf.
        • We should put LongUDF in vector/udf/legacy
        • We should put GenericUDFIsNull in vector/udf/generic

        You can choose to do this in a follow-up patch or in this one. I am fine either way; let me know.

        Ashutosh Chauhan added a comment -

        The hcatalog tests are flaky and we can ignore them. However, none of the Hive tests fail on trunk. The failures are not likely related to your patch, though. I have seen input4.q and plan_json.q fail consistently only on the vectorization branch, so they need to be debugged on the branch. I am not sure about the orc tests, but if they fail regardless of the patch, I think this patch is good to go.

        Eric Hanson added a comment -

        I ran the failing tests on my machine on a clean version of the vectorization branch without my patch. These tests failed:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump

        The following tests would not run in a way that produced output in ant testreport, and my changes should not affect them:

        org.apache.hcatalog.listener.TestNotificationListener.testAMQListener
        org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
        org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
        org.apache.hive.hcatalog.pig.TestHCatStorer.testPartColsInData
        org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl
        org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreMultiTables
        org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreWithNoSchema

        Eric Hanson added a comment -

        As far as I can tell, the 11 test failures reported in the last test run are not related to this patch.

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12603380/HIVE-4961.3-vectorization.patch

        ERROR: -1 due to 11 failed/errored test(s), 3954 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
        org.apache.hcatalog.listener.TestNotificationListener.testAMQListener
        org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
        org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
        org.apache.hive.hcatalog.pig.TestHCatStorer.testPartColsInData
        org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl
        org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreMultiTables
        org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreWithNoSchema
        

        Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/763/testReport
        Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/763/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests failed with: TestsFailedException: 11 tests failed
        

        This message is automatically generated.

        Eric Hanson added a comment -

        Fixed issue with test name (TestUDF) that caused test failure.

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12603172/HIVE-4961.2-vectorization.patch

        ERROR: -1 due to 6 failed/errored test(s), 3955 tests executed
        Failed tests:

        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
        org.apache.hive.hcatalog.pig.TestOrcHCatStorer.testStoreTableMulti
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json
        org.apache.hadoop.hive.ql.exec.vector.util.TestUDF.initializationError
        

        Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/749/testReport
        Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/749/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests failed with: TestsFailedException: 6 tests failed
        

        This message is automatically generated.

        Eric Hanson added a comment -

        Removed line ends and made another minor change based on code review comments. Moved test UDFs to utility directory to avoid test failure. Removed field from ExprNodeGenericFuncDesc that was not needed, and made one field transient to avoid serialization. That should also prevent some test failures due to "diff" answers changing.

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12602912/HIVE-4961.1-vectorization.patch

        ERROR: -1 due to 49 failed/errored test(s), 3955 tests executed
        Failed tests:

        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input8
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8
        org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testSequenceTableWriteRead
        org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
        org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteRead
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_subq
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_when
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2
        org.apache.hcatalog.cli.TestPermsGrp.testCustomPerms
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf4
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6
        org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testSequenceTableWriteReadMR
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9
        org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteReadMR
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf6
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_join5
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_join6
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4
        org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_case
        org.apache.hadoop.hive.ql.exec.vector.expressions.TestUDF.initializationError
        

        Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/718/testReport
        Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/718/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests failed with: TestsFailedException: 49 tests failed
        

        This message is automatically generated.

        Eric Hanson added a comment -

        Code review available on ReviewBoard: https://reviews.apache.org/r/14113/

        Eric Hanson added a comment -

        Added support for generic UDFs as well. Includes a unit test.

        Eric Hanson added a comment -

        Added unit tests, plus support for isRepeating performance optimization for the case when all input vectors passed into a function are marked as isRepeating = true. Fixed a bug related to setting string output.
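The isRepeating shortcut can be sketched as follows. This is an illustration rather than the patch's code: LongColumn and RepeatingAwareAdaptor are hypothetical, simplified classes, null handling is left out, and the shortcut assumes the wrapped function is deterministic, so one call on the repeated values is valid for every row.

```java
import java.util.function.LongBinaryOperator;

// Hypothetical simplified column: when isRepeating is true, values[0] stands for every row.
final class LongColumn {
    final long[] values;
    boolean isRepeating;

    LongColumn(int capacity) {
        this.values = new long[capacity];
    }

    long get(int row) {
        return isRepeating ? values[0] : values[row];
    }
}

// Hypothetical adaptor that evaluates a two-argument row function over a batch.
final class RepeatingAwareAdaptor {
    int callCount; // counts row-function invocations, only to make the saving visible

    LongColumn evaluate(LongBinaryOperator fn, LongColumn a, LongColumn b, int batchSize) {
        LongColumn out = new LongColumn(batchSize);
        if (batchSize == 0) {
            return out;
        }
        if (a.isRepeating && b.isRepeating) {
            // Every input is constant across the batch: call the function once
            // and mark the output as repeating too.
            out.values[0] = fn.applyAsLong(a.values[0], b.values[0]);
            out.isRepeating = true;
            callCount++;
            return out;
        }
        // General case: one call per row; get() expands any repeating input.
        for (int i = 0; i < batchSize; i++) {
            out.values[i] = fn.applyAsLong(a.get(i), b.get(i));
            callCount++;
        }
        return out;
    }
}
```

In this sketch, a batch of four rows with two repeating inputs costs a single function call instead of four, and downstream consumers can take the same shortcut because the output is itself marked as repeating.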

        Eric Hanson added a comment -

        Completed working version of bridge to allow custom UDFs that are subclasses
        of UDF to work in vectorized mode. This supports UDFs with evaluate() methods
        that take and return boxed types (e.g. Long), Writable types (e.g. LongWritable)
        and standard types (e.g. long). Generic UDFs are not supported. That will be the
        subject of a future patch.
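One way to handle both primitive and boxed evaluate() signatures is to look the method up by reflection, as in the sketch below. It is only an illustration under stated assumptions: PlusOnePrimitive, PlusOneBoxed and ReflectiveEvaluator are hypothetical classes, the UDFs here are plain Java classes rather than subclasses of Hive's UDF, Writable types are omitted because they need Hadoop on the classpath, and null results are not handled.

```java
import java.lang.reflect.Method;

// Hypothetical row-mode function with a primitive signature.
final class PlusOnePrimitive {
    public long evaluate(long x) {
        return x + 1;
    }
}

// Hypothetical row-mode function with a boxed signature.
final class PlusOneBoxed {
    public Long evaluate(Long x) {
        return x == null ? null : x + 1;
    }
}

final class ReflectiveEvaluator {
    // Finds a public one-argument evaluate() method that takes long or Long.
    static Method findEvaluate(Class<?> udfClass) {
        for (Method m : udfClass.getMethods()) {
            if (m.getName().equals("evaluate") && m.getParameterCount() == 1) {
                Class<?> param = m.getParameterTypes()[0];
                if (param == long.class || param == Long.class) {
                    return m;
                }
            }
        }
        throw new IllegalArgumentException(
            "no evaluate(long) or evaluate(Long) on " + udfClass.getName());
    }

    // Applies the function to each input value. Method.invoke boxes the argument
    // and unwraps it again for a primitive parameter, so one loop serves both signatures.
    // Assumption: the function never returns null.
    static long[] applyToBatch(Object udf, long[] input) {
        Method evaluate = findEvaluate(udf.getClass());
        long[] output = new long[input.length];
        try {
            for (int i = 0; i < input.length; i++) {
                Object result = evaluate.invoke(udf, input[i]);
                output[i] = ((Number) result).longValue();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("evaluate() call failed", e);
        }
        return output;
    }
}
```

For example, ReflectiveEvaluator.applyToBatch(new PlusOnePrimitive(), new long[] {1, 2, 3}) and the same call with new PlusOneBoxed() go through the same loop; the method lookup happens once per batch, not once per row.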

        I did manual testing for a large set of UDFs taking and returning the types supported
        by vectorization: tinyint, smallint, int, bigint, float, double, boolean, string, timestamp.

        UDFs with one argument and with multiple arguments were tested. Both constant and variable arguments
        were tested.

        Including the tests with the patch, or doing another patch with end-to-end tests, is yet to be done.

        Eric Hanson added a comment -

        Attaching mostly working version of change for safekeeping.


          People

          • Assignee:
            Eric Hanson
            Reporter:
            Eric Hanson
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved:
