Hive / HIVE-2666

StackOverflowError when using custom UDF in map join

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

      Description

      When a custom UDF is used as part of a join which is converted to a map join, the XMLEncoder enters an infinite loop when serializing the map reduce task for the second time, as part of sending it to be executed. This results in a stack overflow error.
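
As a rough illustration of the serialization step involved, the sketch below round-trips an object through `java.beans.XMLEncoder` and `XMLDecoder` while passing the class loader explicitly. The class `PlanXmlRoundTrip` and its methods are hypothetical names for this example, not Hive code; it only shows where the choice of class loader enters the picture.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of Hive: encodes an object graph to XML
// and decodes it again with an explicitly chosen class loader.
public class PlanXmlRoundTrip {

    // Serialize the object to XML bytes; closing the encoder flushes the document.
    public static byte[] encode(Object plan) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(plan);
        }
        return out.toByteArray();
    }

    // Deserialize using the given loader, so classes named in the XML are
    // resolved through the same loader the caller used when encoding.
    public static Object decode(byte[] xml, ClassLoader loader) {
        try (XMLDecoder decoder =
                 new XMLDecoder(new ByteArrayInputStream(xml), null, null, loader)) {
            return decoder.readObject();
        }
    }
}
```

Passing the same loader to both sides is the property the fix aims for; how Hive wires this internally is not shown here.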

          Activity

          Ashutosh Chauhan added a comment -

          This issue is closed now. It was released with the fix in 0.9.0. If there is a problem, please open a new jira and link this one with that.

          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #1164 (See https://builds.apache.org/job/Hive-trunk-h0.21/1164/)
          HIVE-2666 [jira] StackOverflowError when using custom UDF in map join
          (Kevin Wilfong via Yongqiang He)

          Summary:
          Resource files are now added to the class path as soon as they are added via the
          CLI. This fixes the stack overflow error mentioned in the JIRA by ensuring a
          consistent class loader between serializers and deserializers for the same
          query.
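
The idea of adding a resource to the class path at the moment it is added can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `Utilities.java` or `AddResourceProcessor.java`: the class `SessionClassPath` and its methods are hypothetical, and the sketch simply wraps the thread's context class loader in a `URLClassLoader` that also sees the new resource.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

// Hypothetical helper, not part of Hive: makes a newly added resource
// visible through the thread context class loader right away.
public class SessionClassPath {

    // Build a child loader that delegates to the current loader first and
    // additionally searches the given resource (a jar or directory).
    public static URLClassLoader withResource(ClassLoader current, Path resource) {
        try {
            URL url = resource.toUri().toURL();
            return new URLClassLoader(new URL[] {url}, current);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable resource path: " + resource, e);
        }
    }

    // Install the extended loader on the current thread, so any code that
    // runs later in the same thread resolves classes through one loader.
    public static void addResource(Path resource) {
        Thread thread = Thread.currentThread();
        thread.setContextClassLoader(withResource(thread.getContextClassLoader(), resource));
    }
}
```

Because the child loader delegates to its parent, classes that were already loadable stay the same; only the new resource is added to what the session can see.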

          Note that serdes which contain a static block to register themselves are now
          registered twice: once when adding the file to the class loader, and once when
          an instance of the class is created. Previously, registering a serde twice
          resulted in an exception; to avoid this, I have downgraded it to a warning.
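
A duplicate registration that warns instead of throwing could look roughly like the sketch below. `SerdeRegistry` is a hypothetical class for illustration and does not reproduce `SerDeUtils.java`; in particular, keeping the first registration and ignoring the second is an assumption of this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

// Hypothetical registry, not part of Hive: a second registration under the
// same name logs a warning instead of raising an exception.
public class SerdeRegistry {
    private static final Logger LOG = Logger.getLogger(SerdeRegistry.class.getName());
    private static final Map<String, Class<?>> SERDES = new ConcurrentHashMap<>();

    // Returns true if the name was newly registered, false if it already existed.
    public static boolean register(String name, Class<?> serdeClass) {
        Class<?> previous = SERDES.putIfAbsent(name, serdeClass);
        if (previous != null) {
            // Assumption: the first registration wins and the duplicate is ignored.
            LOG.warning("Serde " + name + " is already registered; ignoring duplicate");
            return false;
        }
        return true;
    }

    // Returns the registered class, or null if the name is unknown.
    public static Class<?> lookup(String name) {
        return SERDES.get(name);
    }
}
```

With this shape, a static initializer that calls `register` can safely run more than once.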

          When a custom UDF is used as part of a join which is converted to a map join,
          the XMLEncoder enters an infinite loop when serializing the map reduce task for
          the second time, as part of sending it to be executed. This results in a stack
          overflow error.

          Test Plan:
          I ran the unit tests to verify nothing was broken.

          I ran several queries which used custom UDFs and involved a join which was
          converted to a map join. I verified that these consistently completed successfully.

          Reviewers: JIRA, heyongqiang

          Reviewed By: heyongqiang

          CC: heyongqiang, kevinwilfong

          Differential Revision: 957

          heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221830
          Files :

          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/DeleteResourceProcessor.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java
          Hudson added a comment -

          Integrated in Hive-trunk-h0.23.0 #42 (See https://builds.apache.org/job/Hive-trunk-h0.23.0/42/)
          HIVE-2666 [jira] StackOverflowError when using custom UDF in map join
          (Kevin Wilfong via Yongqiang He)

          Phabricator added a comment -

          heyongqiang has committed the revision "HIVE-2666 [jira] StackOverflowError when using custom UDF in map join".

          REVISION DETAIL
          https://reviews.facebook.net/D957

          COMMIT
          https://reviews.facebook.net/rHIVE1221830

          He Yongqiang added a comment -

          committed, thanks Kevin!

          Phabricator added a comment -

          heyongqiang has accepted the revision "HIVE-2666 [jira] StackOverflowError when using custom UDF in map join".

          running tests

          REVISION DETAIL
          https://reviews.facebook.net/D957

          Phabricator added a comment -

          kevinwilfong has commented on the revision "HIVE-2666 [jira] StackOverflowError when using custom UDF in map join".

          There is a unit test query in TestNegativeCliDriver called deletejar, which verifies an error is thrown if you add a jar, delete that jar, and then attempt to use a class in the jar.

          REVISION DETAIL
          https://reviews.facebook.net/D957

          Phabricator added a comment -

          heyongqiang has commented on the revision "HIVE-2666 [jira] StackOverflowError when using custom UDF in map join".

          INLINE COMMENTS
          ql/src/java/org/apache/hadoop/hive/ql/processors/DeleteResourceProcessor.java:63 Have you verified that delete also works fine?

          REVISION DETAIL
          https://reviews.facebook.net/D957

          Phabricator added a comment -

          kevinwilfong requested code review of "HIVE-2666 [jira] StackOverflowError when using custom UDF in map join".
          Reviewers: JIRA

          Resource files are now added to the class path as soon as they are added via the CLI. This fixes the stack overflow error mentioned in the JIRA by ensuring a consistent class loader between serializers and deserializers for the same query.

          Note that serdes which contain a static block to register themselves are now registered twice: once when adding the file to the class loader, and once when an instance of the class is created. Previously, registering a serde twice resulted in an exception; to avoid this, I have downgraded it to a warning.

          When a custom UDF is used as part of a join which is converted to a map join, the XMLEncoder enters an infinite loop when serializing the map reduce task for the second time, as part of sending it to be executed. This results in a stack overflow error.

          TEST PLAN
          EMPTY

          REVISION DETAIL
          https://reviews.facebook.net/D957

          AFFECTED FILES
          serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java
          ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
          ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java
          ql/src/java/org/apache/hadoop/hive/ql/processors/DeleteResourceProcessor.java


            People

            • Assignee: Kevin Wilfong
            • Reporter: Kevin Wilfong
            • Votes: 0
            • Watchers: 0