  1. Hive
  2. HIVE-9073

NPE when using custom windowing UDAFs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0, 1.0.0
    • Fix Version/s: 1.2.0
    • Component/s: UDF
    • Labels:
      None

      Description

      From the hive-user email group:

      While executing a simple select query using a custom windowing UDAF I created I am constantly running into this error.
       
      Error: java.lang.RuntimeException: Error in configuring object
              at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
              at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
              at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
              at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
              at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
              at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
              at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
      Caused by: java.lang.reflect.InvocationTargetException
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
              ... 9 more
      Caused by: java.lang.RuntimeException: Reduce operator initialization failed
              at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173)
              ... 14 more
      Caused by: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:647)
              at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getWindowFunctionInfo(FunctionRegistry.java:1875)
              at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.streamingPossible(WindowingTableFunction.java:150)
              at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.setCanAcceptInputAsStream(WindowingTableFunction.java:221)
              at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.initializeStreaming(WindowingTableFunction.java:266)
              at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.initializeStreaming(PTFOperator.java:292)
              at org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:86)
              at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
              at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460)
              at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416)
              at org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
              at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
              at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166)
              ... 14 more
       
      Just wanted to check if any of you have faced this earlier. Also, when I try to run the custom UDAF on another server it works fine. The only difference I can see is that the Hive version I am using on my local machine is 0.13.1, where it is working, and on the other machine it is 0.13.0, where I see the above-mentioned error. I am not sure if this was a bug that was fixed in the later release, but I just wanted to confirm the same.
      
      1. HIVE-9073.1.patch
        10 kB
        Jason Dere
      2. HIVE-9073.2.patch
        10 kB
        Jason Dere
      3. HIVE-9073.2.patch
        10 kB
        Jason Dere
      4. HIVE-9073.3.patch
        10 kB
        Jason Dere

          Activity

          jdere Jason Dere added a comment -

          Looks like the error may be occurring because Hive is trying to look up the UDF by name during UDF initialization in the reduce task. Ideally this lookup should happen only during the compilation phase and not during the map/reduce tasks. This works OK for built-in windowing UDFs (added to the FunctionRegistry), but custom UDFs are hitting some other logic that really should only happen during compilation. We would have to fix the way WindowingTableFunction does its initialization for this to work with temporary UDFs registered with CREATE TEMPORARY FUNCTION.
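
          The failure mode described in this comment can be illustrated with a minimal Java sketch. It does not use Hive's classes: ToyRegistry, FunctionProps, and streamingPossible below are hypothetical stand-ins, and the sketch assumes only that a temporary function registered in the compiling session is absent from the registry that a task JVM builds for itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical properties of a windowing function (not Hive's WindowFunctionInfo).
final class FunctionProps {
    final boolean streamingCapable;

    FunctionProps(boolean streamingCapable) {
        this.streamingCapable = streamingCapable;
    }
}

// Hypothetical stand-in for a per-JVM function registry.
final class ToyRegistry {
    private final Map<String, FunctionProps> functions = new HashMap<>();

    ToyRegistry() {
        // Built-ins are present in every JVM that creates a registry.
        functions.put("sum", new FunctionProps(true));
    }

    // Temporary functions exist only in the registry they were added to.
    void registerTemporary(String name, FunctionProps props) {
        functions.put(name, props);
    }

    // Returns null when the name is unknown.
    FunctionProps lookup(String name) {
        return functions.get(name);
    }
}

final class RuntimeLookupDemo {
    // Mimics operator initialization that resolves the function by name.
    static boolean streamingPossible(ToyRegistry registry, String functionName) {
        FunctionProps props = registry.lookup(functionName);
        return props.streamingCapable; // NullPointerException on a registry miss
    }

    public static void main(String[] args) {
        ToyRegistry compileSide = new ToyRegistry();
        compileSide.registerTemporary("my_udaf", new FunctionProps(false));
        ToyRegistry taskSide = new ToyRegistry(); // fresh JVM: built-ins only

        System.out.println("compile side, my_udaf: " + streamingPossible(compileSide, "my_udaf"));
        System.out.println("task side, sum: " + streamingPossible(taskSide, "sum"));
        try {
            streamingPossible(taskSide, "my_udaf");
        } catch (NullPointerException e) {
            System.out.println("task side, my_udaf: NullPointerException");
        }
    }
}
```

          In this toy model the built-in resolves on both sides, while the temporary function resolves only where it was registered, which matches the pattern reported above.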

          jdere Jason Dere added a comment -

          Attaching patch to cache values when UDAF is first initialized in query compilation, and to use cached values rather than FunctionRegistry lookup.
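
          The general idea of caching resolved values at compile time can be sketched as follows. This is not the code from the attached patch: ToyWindowFunctionDef, its streamingCapable field, and the helper methods are hypothetical, and Java serialization stands in for however the query plan is actually shipped to tasks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical plan-side descriptor that carries values resolved during compilation.
final class ToyWindowFunctionDef implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    final boolean streamingCapable; // cached once, when the function is first resolved

    ToyWindowFunctionDef(String name, boolean streamingCapable) {
        this.name = name;
        this.streamingCapable = streamingCapable;
    }
}

final class CachedPlanDemo {
    // Serializes the descriptor, as a plan would be before being sent to a task.
    static byte[] serialize(ToyWindowFunctionDef def) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(def);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Restores the descriptor on the task side.
    static ToyWindowFunctionDef deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ToyWindowFunctionDef) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Task-side check reads the cached field; no lookup by name is needed.
    static boolean streamingPossible(ToyWindowFunctionDef def) {
        return def.streamingCapable;
    }

    public static void main(String[] args) {
        // Compile side: resolve the custom function once and record the result.
        ToyWindowFunctionDef compiled = new ToyWindowFunctionDef("my_udaf", true);
        // Task side: only the serialized descriptor is available.
        ToyWindowFunctionDef onTask = deserialize(serialize(compiled));
        System.out.println(onTask.name + " streaming possible: " + streamingPossible(onTask));
    }
}
```

          Because the task reads a field that travelled with the descriptor, it behaves the same for built-in and temporary functions in this sketch.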

          jdere Jason Dere added a comment -

          RB at https://reviews.apache.org/r/28921/

          ashutoshc Ashutosh Chauhan added a comment -

          HIVE-4419 purportedly fixed this, but it seems this got broken again in the meantime. I suggest we also do HIVE-4415 to future-proof ourselves against this kind of breakage.

          jdere Jason Dere added a comment -

          Patch v2 - fixed qfile name in testconfiguration.properties.

          jdere Jason Dere added a comment -

          Looks like HIVE-7143 broke this, but we didn't see it because the qfile test needs to be run by MiniMR to hit the error. In any case, I've added a new qfile test that runs in MiniMR.

          jdere Jason Dere added a comment -

          re-upload patch for precommit tests

          hiveqa Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12686722/HIVE-9073.2.patch

          SUCCESS: +1 6702 tests passed

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2048/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2048/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2048/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12686722 - PreCommit-HIVE-TRUNK-Build

          jdere Jason Dere added a comment -

          rebasing patch with trunk

          hiveqa Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12708489/HIVE-9073.3.patch

          ERROR: -1 due to 13 failed/errored test(s), 8643 tests executed
          Failed tests:

          TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
          TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
          TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3227/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3227/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3227/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 13 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12708489 - PreCommit-HIVE-TRUNK-Build

          ashutoshc Ashutosh Chauhan added a comment -

          +1 unless above test failures are because of this patch.

          A natural extension of this is to move the Noop TableFunctionEvaluator to the contrib/ module and then run CREATE FUNCTION for it in the tests where it is used. Currently, Noop lives in the main src tree because of this bug, but it is used heavily in tests. Since the contrib/ jar is not on the test classpath by default, moving this test-only class to contrib/ helps in two ways: it keeps the src tree free of test classes, and it provides a true test case for this functionality, since the current test uses a function that is already available on the test classpath.
          Since the above may require some refactoring, I am OK with doing that in a follow-up JIRA.

          jdere Jason Dere added a comment -

          Failures don't appear to be related.
          I've committed this to trunk

          sushanth Sushanth Sowmyan added a comment -

          This issue has been fixed and released as part of the 1.2.0 release. If you find an issue which seems to be related to this one, please create a new jira and link this one with new jira.


            People

            • Assignee:
              jdere Jason Dere
              Reporter:
              jdere Jason Dere
            • Votes:
              0
              Watchers:
              2
