Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-3301

Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.10.0
    • None
    • None

    Description

      When running hive on hadoop0.23, mapreduce_stack_trace.q is failing due to quote printing bug:

      quote is printed as: '"', instead of "

      Seems not able to state the bug clearly in html:

      quote is printed as 'address sign' + 'quot' + semicolon
      not the expected 'quote sign'

      Attachments

        1. HIVE-3301.1.patch.txt
          4 kB
          Zhenxiao Luo
        2. HIVE-3301.2.patch.txt
          4 kB
          Zhenxiao Luo
        3. HIVE-3301.3.patch.txt
          4 kB
          Zhenxiao Luo

        Issue Links

          Activity

            zhenxiao Zhenxiao Luo added a comment -

            The problem is:

            In hadoop23, TaskLogServlet.java is using a new utility HtmlQuoting.java to print Task Log.

            In TaskLogServlet.java, printTaskLog() function:

            result = taskLogReader.read(b);
            if (result > 0) {
            if (plainText)

            { out.write(b, 0, result); } else { HtmlQuoting.quoteHtmlChars(out, b, 0, result); }
            } else { break; }


            While, in hadoop20, TaskLogServlet.java is using its own utility(there is no such HtmlQuoting.java at all) to print Task Log:

            In TaskLogServlet.java, printTaskLog fucntion:

            result = taskLogReader.read(b);
            if (result > 0) {
            if (plainText) { out.write(b, 0, result); }

            else

            { quotedWrite(out, b, 0, result); }

            } else

            { break; }

            And in Hive, TaskLogProcessor.java is generating stack trace by reading the raw taskAttemptLog.

            In ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java, getStackTraces() fuction:

            List<String> stackTrace = null;

            // Patterns that match the middle/end of stack traces
            Pattern stackTracePattern = Pattern.compile("^\tat .*", Pattern.CASE_INSENSITIVE);
            Pattern endStackTracePattern =
            Pattern.compile("^\t... [0-9]+ more.*", Pattern.CASE_INSENSITIVE);

            while ((inputLine = in.readLine()) != null) {

            if (stackTracePattern.matcher(inputLine).matches() ||
            endStackTracePattern.matcher(inputLine).matches()) {

            To have Hive working for both hadoop20 and hadoop23, we should use different mechanisms when hive TaskLogProcessor is parsing TaskAttemptLog.

            My plan is creating a shim, which have different implementations for hadoop20 and hadoop23.

            In hadoop23, HtmlQuoting.unquoteHtmlChars() is used to parse the TaskAttemptLog.

            zhenxiao Zhenxiao Luo added a comment - The problem is: In hadoop23, TaskLogServlet.java is using a new utility HtmlQuoting.java to print Task Log. In TaskLogServlet.java, printTaskLog() function: result = taskLogReader.read(b); if (result > 0) { if (plainText) { out.write(b, 0, result); } else { HtmlQuoting.quoteHtmlChars(out, b, 0, result); } } else { break; } While, in hadoop20, TaskLogServlet.java is using its own utility(there is no such HtmlQuoting.java at all) to print Task Log: In TaskLogServlet.java, printTaskLog fucntion: result = taskLogReader.read(b); if (result > 0) { if (plainText) { out.write(b, 0, result); } else { quotedWrite(out, b, 0, result); } } else { break; } And in Hive, TaskLogProcessor.java is generating stack trace by reading the raw taskAttemptLog. In ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java, getStackTraces() fuction: List<String> stackTrace = null; // Patterns that match the middle/end of stack traces Pattern stackTracePattern = Pattern.compile("^\tat .*", Pattern.CASE_INSENSITIVE); Pattern endStackTracePattern = Pattern.compile("^\t... [0-9] + more.*", Pattern.CASE_INSENSITIVE); while ((inputLine = in.readLine()) != null) { if (stackTracePattern.matcher(inputLine).matches() || endStackTracePattern.matcher(inputLine).matches()) { To have Hive working for both hadoop20 and hadoop23, we should use different mechanisms when hive TaskLogProcessor is parsing TaskAttemptLog. My plan is creating a shim, which have different implementations for hadoop20 and hadoop23. In hadoop23, HtmlQuoting.unquoteHtmlChars() is used to parse the TaskAttemptLog.

            You know these hadoop 23 jiras are like death of a thousand paper cuts, if I had known we were going to face so many issues i would have proposed making a larger shim layer. Can we come up with a definitive list of all the 23 problems?

            appodictic Edward Capriolo added a comment - You know these hadoop 23 jiras are like death of a thousand paper cuts, if I had known we were going to face so many issues i would have proposed making a larger shim layer. Can we come up with a definitive list of all the 23 problems?
            zhenxiao Zhenxiao Luo added a comment -

            review request submitted at:
            https://reviews.facebook.net/D4353

            zhenxiao Zhenxiao Luo added a comment - review request submitted at: https://reviews.facebook.net/D4353
            zhenxiao Zhenxiao Luo added a comment -

            @Edward: oh yes. As I know, HIVE-3301, HIVE-3275, HIVE-3273, HIVE-3242, HIVE-3240, HIVE-3257, HIVE-3249 and HIVE-2804 are all hadoop 23 bugs. I am fixing these one by one. Thanks for your advice. I will try to put them into a larger shim layer.

            I just found an Error Code retrieval inconsistency between hadoop20 and hadoop23. Will file another one soon.

            Thanks,
            Zhenxiao

            zhenxiao Zhenxiao Luo added a comment - @Edward: oh yes. As I know, HIVE-3301 , HIVE-3275 , HIVE-3273 , HIVE-3242 , HIVE-3240 , HIVE-3257 , HIVE-3249 and HIVE-2804 are all hadoop 23 bugs. I am fixing these one by one. Thanks for your advice. I will try to put them into a larger shim layer. I just found an Error Code retrieval inconsistency between hadoop20 and hadoop23. Will file another one soon. Thanks, Zhenxiao

            I am not saying we need to shim layer all the fixes, but having a reasonably exhaustive list of the problems linked together in jira would make me more confident that we are taking the right plan of action.

            appodictic Edward Capriolo added a comment - I am not saying we need to shim layer all the fixes, but having a reasonably exhaustive list of the problems linked together in jira would make me more confident that we are taking the right plan of action.

            Zhenxiao I left comments on Phabricator.

            ashutoshc Ashutosh Chauhan added a comment - Zhenxiao I left comments on Phabricator.
            zhenxiao Zhenxiao Luo added a comment -

            Link with hadoop23 integration related bugs

            zhenxiao Zhenxiao Luo added a comment - Link with hadoop23 integration related bugs
            zhenxiao Zhenxiao Luo added a comment -

            @Edward: related HIVE tickets are linked. I will add more whenever any new bugs filed. Do we need a separate upper level JIRA to trace all the hadoop23 integration bugs?

            zhenxiao Zhenxiao Luo added a comment - @Edward: related HIVE tickets are linked. I will add more whenever any new bugs filed. Do we need a separate upper level JIRA to trace all the hadoop23 integration bugs?
            zhenxiao Zhenxiao Luo added a comment -

            @ashutosh: Thanks a lot for the comments.

            I made updates and resubmitted review request at:
            https://reviews.facebook.net/D4353

            zhenxiao Zhenxiao Luo added a comment - @ashutosh: Thanks a lot for the comments. I made updates and resubmitted review request at: https://reviews.facebook.net/D4353
            zhenxiao Zhenxiao Luo added a comment -

            patch has prefix in it

            zhenxiao Zhenxiao Luo added a comment - patch has prefix in it
            zhenxiao Zhenxiao Luo added a comment -

            updated patch without prefix
            could apply cleanly

            zhenxiao Zhenxiao Luo added a comment - updated patch without prefix could apply cleanly

            +1 Running tests.

            ashutoshc Ashutosh Chauhan added a comment - +1 Running tests.

            Committed to trunk. Thanks, Zhenxiao!

            ashutoshc Ashutosh Chauhan added a comment - Committed to trunk. Thanks, Zhenxiao!
            hudson Hudson added a comment -

            Integrated in Hive-trunk-h0.21 #1570 (See https://builds.apache.org/job/Hive-trunk-h0.21/1570/)
            HIVE-3301 : Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23 (Zhenxiao Luo via Ashutosh Chauhan) (Revision 1366233)

            Result = SUCCESS
            hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1366233
            Files :

            • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java
            • /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java
            • /hive/trunk/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java
            • /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java
            hudson Hudson added a comment - Integrated in Hive-trunk-h0.21 #1570 (See https://builds.apache.org/job/Hive-trunk-h0.21/1570/ ) HIVE-3301 : Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23 (Zhenxiao Luo via Ashutosh Chauhan) (Revision 1366233) Result = SUCCESS hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1366233 Files : /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java /hive/trunk/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java
            hudson Hudson added a comment -

            Integrated in Hive-trunk-hadoop2 #54 (See https://builds.apache.org/job/Hive-trunk-hadoop2/54/)
            HIVE-3301 : Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23 (Zhenxiao Luo via Ashutosh Chauhan) (Revision 1366233)

            Result = ABORTED
            hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1366233
            Files :

            • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java
            • /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java
            • /hive/trunk/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java
            • /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java
            hudson Hudson added a comment - Integrated in Hive-trunk-hadoop2 #54 (See https://builds.apache.org/job/Hive-trunk-hadoop2/54/ ) HIVE-3301 : Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23 (Zhenxiao Luo via Ashutosh Chauhan) (Revision 1366233) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1366233 Files : /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java /hive/trunk/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java

            This issue is fixed and released as part of 0.10.0 release. If you find an issue which seems to be related to this one, please create a new jira and link this one with new jira.

            ashutoshc Ashutosh Chauhan added a comment - This issue is fixed and released as part of 0.10.0 release. If you find an issue which seems to be related to this one, please create a new jira and link this one with new jira.

            People

              zhenxiao Zhenxiao Luo
              zhenxiao Zhenxiao Luo
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: