Pig / PIG-2484

Fix several e2e test failures/aborts for 23

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.2, 0.10.0, 0.11
    • Fix Version/s: 0.10.0, 0.9.3, 0.11
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      There are still a couple of e2e test aborts/failures for hadoop23. Most of them are due to the test infrastructure, minor backward-incompatible changes in 23, or recent changes in Pig. Here is a list:

      Scripting_1/Scripting_2: MAPREDUCE-3700

      Native_3: the 23 tests need a hadoop23-streaming.jar

      MonitoredUDF_1: Seems related to guava upgrade (PIG-2460), Pig's guava is newer than hadoop23's

      UdfException_1, UdfException_2, UdfException_3, UdfException_4: Error message change

      Checkin_2, GroupAggFunc_7, GroupAggFunc_9, GroupAggFunc_12, GroupAggFunc_13, Types_6, Scalar_1: float precision

      Limit_2: The specific output records change; the test infrastructure should allow this
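
      For Limit_2, one way a test could accept any valid LIMIT result is to check the record count and that every output record comes from the input, instead of comparing against a fixed benchmark. The following is only a minimal sketch of that idea, not the e2e harness's actual code; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LimitCheck {
    // Accepts any output that has the right number of records and whose
    // records all occur in the input (respecting duplicates).
    static boolean isValidLimitOutput(List<String> input, List<String> output, int limit) {
        if (output.size() != Math.min(limit, input.size())) {
            return false;
        }
        // Count how many times each record is available in the input.
        Map<String, Integer> remaining = new HashMap<>();
        for (String record : input) {
            remaining.merge(record, 1, Integer::sum);
        }
        // Consume one input occurrence for every output record.
        for (String record : output) {
            Integer left = remaining.get(record);
            if (left == null || left == 0) {
                return false;
            }
            remaining.put(record, left - 1);
        }
        return true;
    }
}
```

      With this kind of check, two runs that return different records for the same LIMIT both pass, while an output with the wrong count or an unknown record fails.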

      1. PIG-2484-4-branch0.9.patch (22 kB, Rohini Palaniswamy)
      2. PIG-2484-3.patch (16 kB, Daniel Dai)
      3. PIG-2484-2.patch (16 kB, Daniel Dai)
      4. PIG-2484-1.patch (12 kB, Daniel Dai)

          Activity

          Daniel Dai added a comment -

          Added hadoop-0.23.0-streaming.jar to 0.9 branch.

          Rohini Palaniswamy added a comment -

          test/e2e/pig/lib/hadoop-0.23.0-streaming.jar also needs to be checked in. It was not part of the patch because it is a binary. I had mentioned it as an additional step during checkin.

          Daniel Dai added a comment -

          Committed PIG-2484-4-branch0.9.patch to the 0.9 branch as requested by Rohini. Note that this patch also includes the fix for PIG-2859.

          Rohini Palaniswamy added a comment -

          Did some debugging to see why the float precision differs between 20 and 23, to make sure it is not a cause for concern. Summing doubles in Java has known precision issues (http://www.velocityreviews.com/forums/t139008-java-double-precision.html): floating-point addition is not associative, so the result depends on the order in which the numbers are summed. That is the reason the precision differs between H20 and H23. The order in which keys and values reach output.collect from the map is the same, but when the reduce of the combiner is called, the order of values in Iterable<values> is different in 20 and 23. There must be some algorithm change somewhere in how elements are grouped for the combiner. I did not take the time to dig deeper and find the actual change in mapred.
          We should be able to safely ignore the float precision change in the e2e tests.
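
          The order dependence described above can be reproduced without Hadoop. The following is a minimal sketch, assuming the combiner simply adds values in iteration order; the class name, the sample values, and the relative tolerance are illustrative choices, not anything taken from Pig or the e2e harness.

```java
import java.util.Arrays;
import java.util.List;

public class SumOrderDemo {
    // Sums values strictly left to right, the way a loop over an Iterable would.
    static double sum(List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Relative-tolerance comparison; relTol is a hypothetical test setting.
    static boolean closeEnough(double expected, double actual, double relTol) {
        double scale = Math.max(Math.abs(expected), Math.abs(actual));
        return Math.abs(expected - actual) <= relTol * scale;
    }

    public static void main(String[] args) {
        // Same three values, two different arrival orders.
        double a = sum(Arrays.asList(0.1, 0.2, 0.3));
        double b = sum(Arrays.asList(0.3, 0.2, 0.1));
        System.out.println(a + " vs " + b);
        System.out.println("bitwise equal: " + (a == b));
        System.out.println("equal within 1e-9: " + closeEnough(a, b, 1e-9));
    }
}
```

          The two sums differ only in the last digits, so a comparison with a small tolerance treats them as equal while an exact comparison does not.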

          Rohini Palaniswamy added a comment -

          Still Failing Tests not addressed in the patch:
          2) MonitoredUDF_1_Local

          • It is skipped only for 0.23 mapred and not 0.23 local.
          Rohini Palaniswamy added a comment -

          Patch for Branch 0.9.

          • Removed ignore for Scripting_1 and Scripting_2 as PIG-2761 fixes the issue.
          • Included the benchmark patch from PIG-2711 to make testing faster.
          • Fixed additional tests beyond those mentioned in the description. Will create a patch for these for 0.10 and trunk on a separate jira.
            1) ClassResolution: Fully qualified UDF names were not used.
            2) Native_x_Local: This was failing because local.conf did not set mapredjars.
            3) Jython_Macro_1_Local: This was failing because the input file was the data directory itself instead of studenttab10K. The number of files under the data directory differs between mapred and local, so Jython_Macro_1 passed with the benchmark created during mapred, but Jython_Macro_1_local failed with that same benchmark.

          Still Failing Tests not addressed in the patch:
          1)Jython_CompileBindRun_3_local

          • The problem exists with 0.10 also. The test starts 3 threads to launch multiple jobs, but this does not work with LocalJobRunner because all of them write to test/e2e/pig/testdist/build/test/mapred/local/localRunner/job_local_0001.xml, and parsing that file fails with org.xml.sax.SAXParseException: Content is not allowed in trailing section.

          Additional steps for checkin:
          test/e2e/pig/lib/hadoop-0.23.0-streaming.jar is not part of the patch. Copy file from branch-0.10 and svn add before committing the patch.
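
          The SAXParseException reported for Jython_CompileBindRun_3_local is the kind of error an XML parser raises when non-whitespace text follows the root element. The following is a minimal sketch, assuming that concurrent writers leave leftover bytes after the closing tag; the class name and the sample XML are hypothetical, and the exact message text depends on the parser in use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

public class TrailingContentDemo {
    // Returns the parser's error message, or null if the text parses cleanly.
    static String parseError(String xml) throws Exception {
        try {
            DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String clean = "<configuration><property/></configuration>";
        // Hypothetical corruption: bytes from another writer remain after the root element.
        String clobbered = clean + "ration>";
        System.out.println("clean:     " + parseError(clean));
        System.out.println("clobbered: " + parseError(clobbered));
    }
}
```

          The clean document parses, while the clobbered one is rejected, which is consistent with several threads overwriting one shared job file.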

          Daniel Dai added a comment -

          Patch committed to 0.10/trunk. I am not committing to the 0.9 branch for now, but we may do so if the need arises.

          Daniel Dai added a comment -

          MonitoredUDF_1 exposes a general mapreduce issue: how to override hadoop libraries. We see a similar issue in AvroLoader as well. MAPREDUCE-1938 uses HADOOP_USER_CLASSPATH_FIRST (frontend) and mapreduce.user.classpath.first (backend), but that patch has not gone into hadoop23. We need to push the hadoop folks to apply this patch to 23, or to provide another mechanism to solve the issue.

          Thejas M Nair added a comment -

          +1
          The 23 specific test case parameters are very useful. We know what to revisit later (ignore23 etc).
          I think we should see if the " register './python/scriptingudf.py'.. " use case would work with hadoop 23 if pig removes the "./" prefix. Opened PIG-2486 to track that.

          Daniel Dai added a comment -

          Attached PIG-2484-1.patch in an attempt to fix all e2e tests (or disable them in 23). We also need to copy hadoop-0.23.0-streaming.jar into test/e2e/pig/lib.

          Daniel Dai added a comment -

          Exactly. In hadoop 20.x, I see an option HADOOP_USER_CLASSPATH_FIRST (though guava never caused an issue in 20.x, since 20.x does not use guava). However, I didn't see an equivalent option in hadoop 23. I will follow up with the hadoop folks.

          Dmitriy V. Ryaboy added a comment -

          For MonitoredUDF – we should be putting our jars ahead of hadoop's.


            People

            • Assignee: Daniel Dai
            • Reporter: Daniel Dai