Hadoop Map/Reduce / MAPREDUCE-4644

mapreduce-client-jobclient-tests do not run from dist tarball

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Not a Problem
    • Affects Version/s: 2.0.2-alpha
    • Fix Version/s: None
    • Component/s: build, test
    • Labels:
      None

      Description

      The mapreduce jobclient tests rely on junit, which is missing from the dist tarball. This prevents running often-used tests like sleep jobs.

        Issue Links

          Activity

          Jason Lowe created issue -
          Jason Lowe added a comment -

          Error traceback when trying to launch the tests jar:

          $ hadoop jar hadoop-*-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-SNAPSHOT-tests.jar
          java.lang.NoClassDefFoundError: junit/framework/TestCase
          	at java.lang.ClassLoader.defineClass1(Native Method)
          	at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
          	at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
          	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
          	at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
          	at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
          	at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
          	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
          	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
          	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
          	at org.apache.hadoop.test.MapredTestDriver.<init>(MapredTestDriver.java:60)
          	at org.apache.hadoop.test.MapredTestDriver.<init>(MapredTestDriver.java:54)
          	at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
          Caused by: java.lang.ClassNotFoundException: junit.framework.TestCase
          	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
          	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
          	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
          	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
          	... 20 more
          An example program must be given as the first argument.
          Valid program names are:
          
          Jason Lowe added a comment -

          HADOOP-8738 explicitly removed the junit jar from the distro, which led to the jobclient tests breakage.

          Thomas Graves made changes -
          Field Original Value New Value
          Link This issue is broken by HADOOP-8738 [ HADOOP-8738 ]
          Thomas Graves added a comment -

          Alejandro, was the junit.jar being included breaking something?

          Alejandro Abdelnur added a comment -

          The problem was that junit was ending up in the cluster classpath, and in the classpath of ALL jobs.

          Jason Lowe made changes -
          Link This issue is duplicated by MAPREDUCE-4656 [ MAPREDUCE-4656 ]
          Jason Lowe added a comment -

          Seems like we need to move the *-tests jars along with their specific dependencies out of the classpath, along the lines of what HADOOP-8723 was trying to do. The patch there only moves the *-tests jars out, so it would need to be enhanced to also move out dependencies that are only needed by those jars. Does that sound reasonable?

          Alejandro Abdelnur added a comment -

          Jason, out where? Are you suggesting creating a new libtest/ dir? Wouldn't it make more sense to have a tools module for this? This seems like a tool use case. Also, if you bring in all test-scope JARs, you have to make sure they all have compatible licenses (i.e., if JDIFF ends up there, we are in trouble).

          Colin Patrick McCabe added a comment -

          Is there a workaround? Not being able to run the mapreduce client tests is frustrating.

          Jason Lowe added a comment -

          Couple of workarounds:

          1. Copy the junit jar from the .m2 repository to share/hadoop/mapreduce/lib/
          2. Edit hadoop-mapreduce-project/pom.xml and remove <scope>test</scope> from the junit dependency
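
The first workaround above could be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `copy_junit_jar` is hypothetical, the local repository is assumed to follow the standard Maven layout (`junit/junit/<version>/junit-<version>.jar`), and which junit version is present depends on the local build, so the helper just takes the first matching jar in sort order.

```shell
# Hypothetical helper (not part of Hadoop): copy a junit jar from a local
# Maven repository into the MapReduce lib directory of an unpacked dist tarball.
#   $1 = Maven repository root (typically ~/.m2/repository; an assumption)
#   $2 = root of the unpacked Hadoop distribution
copy_junit_jar() {
  repo="$1"
  dist="$2"
  lib_dir="$dist/share/hadoop/mapreduce/lib"

  # Standard Maven layout: junit/junit/<version>/junit-<version>.jar.
  # Skip the -sources and -javadoc artifacts that may sit next to it.
  jar=$(find "$repo/junit/junit" -name 'junit-*.jar' \
          ! -name '*-sources.jar' ! -name '*-javadoc.jar' 2>/dev/null \
        | sort | head -n 1)

  if [ -z "$jar" ]; then
    echo "no junit jar found under $repo/junit/junit" >&2
    return 1
  fi

  mkdir -p "$lib_dir" && cp "$jar" "$lib_dir/"
}
```

A call might look like `copy_junit_jar "$HOME/.m2/repository" /path/to/unpacked-dist`. As noted elsewhere in this thread, placing junit in that lib directory puts it back on the cluster classpath, so this is a stopgap rather than a fix.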
          Jason Lowe added a comment -

          Agreed it's risky to blindly pull in all test scope JARs, so maybe it needs to be an explicit list of jars to include.

          "are you suggesting to create a new libtest/ dir? Wouldn't it make more sense to have a tools module for this? this seems like a tool use case."

          I'm not sure it's best to clutter the tools directory with the unit test jars. They're not really related, and it can lead to some of the same problems we're trying to avoid when someone runs tools. IMHO the test stuff shouldn't be in the classpath unless you're running tests, and tools aren't tests.

          So yes, I'm proposing a separate place to store the tests and their dependencies. This implies the user needs to modify the classpath to run the tests, and we could make this easier by providing a "hadoop tests-classpath" tool or something similar to aid them in doing so.
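
To make the proposal concrete, here is a rough sketch of what a "tests-classpath" style helper might do. No such `hadoop` subcommand appears in this thread as existing; the function name `tests_classpath` and the idea of a single directory holding the tests jars plus their dependencies are assumptions for illustration only.

```shell
# Hypothetical stand-in for the proposed "hadoop tests-classpath" helper:
# print every jar in a tests directory as one colon-separated classpath.
#   $1 = directory holding the *-tests jars and their dependencies
#        (hypothetical layout, not an existing Hadoop directory)
tests_classpath() {
  dir="$1"
  classpath=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue            # the glob matched nothing
    classpath="${classpath:+$classpath:}$jar"
  done
  printf '%s\n' "$classpath"
}
```

A user would then opt in explicitly, for example with `export HADOOP_CLASSPATH="$(tests_classpath /path/to/tests-dir)"`, so the test jars stay off the classpath unless tests are actually being run.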

          Alejandro Abdelnur added a comment -

          I'd argue that if you are running TestDFSIO in a cluster you are using a testcase as a tool to determine something. And this, IMO, applies to all tests used in this manner.

          Alejandro Abdelnur added a comment -

          (the testcase has been overloaded as a tool).

          Jason Lowe added a comment -

          But then don't we have the issues of the tests polluting the other tools, or am I missing something? When TOOL_PATH is added to the classpath, as is done when running tools like archive and distcp, the tool could potentially encounter the same kinds of issues with tests in the classpath that we're trying to avoid.

          Alejandro Abdelnur added a comment -

          Well, each tool should have its own lib/ dir; that would solve the problem. Until now, tools haven't depended on any JARs other than the ones provided in the Hadoop cluster, so there was never a need for that.

          Jason Lowe added a comment -

          Actually, I'm thinking of cases where the test jars themselves cause the problems; see HDFS-3831. There are a lot of things in these test jars besides the items that are invoked by ToolRunner, and not all test jars even use ToolRunner. If desired we could separate these into "tests that are really tools", which would go into tools/ and shouldn't rely on junit or other test framework stuff, and "tests that are really unit tests", which go into something like tests/. That would make running the "tests that are tools" a bit easier, since we hopefully wouldn't need a separate classpath beyond TOOL_PATH, while the junit test cases stay completely out of the way.

          Tom White added a comment -

          If desired we could separate these into "tests that are really tools" which would go into tools/ and shouldn't rely on junit or other test framework stuff and "tests that are really unit tests" that go into something like tests/.

          +1. Many of the "tests that are really tools" are benchmarks so we could call them that.

          Robert Joseph Evans added a comment -

          We need to be in the process of separating out true unit tests from system and integration tests. Unit tests should run fast and are something that we can do for all components as part of the pre-commit build. If a test is full-featured enough that it could be called a "tool", then there is no way that it is a true unit test. I am +1 for moving those out to be part of a tools or examples package somewhere.

          We also need to look at cleaning up our classpaths. Separating each tool out into a directory with a full list of its dependencies seems like a reasonable solution. It is what Oozie asks users to do for their workflows, and it seems to work fairly well. But that starts to sound like a larger effort than moving a few classes around and splitting a launcher into two. I think it is something that needs to be done, but perhaps it needs some design work, especially in relation to how we may want to do dependency isolation in the future with OSGi or something else. Alejandro, I know you have already been looking at and thinking a lot about the classpath issue with YARN/MR and how we should package things (MAPREDUCE-3745, HADOOP-7935, and MAPREDUCE-4421). Do we need another JIRA explicitly for tools? How do we handle the case of a tool having a map/reduce dependency now that the MR code is going to be separated out so that we can use a different version of MR? Does that mean that they have to provide their own tools with an MR dependency and a config to point to them? It just seems like a change like this needs a full design.

          Alejandro Abdelnur added a comment -

          Moving each tool's JARs into separate lib/ dirs is quite easy (modifying a single assembly). What we should think about is a common bootstrap script for that, so each tool does not have to duplicate (and get wrong) such a script. I'll open a JIRA for that.
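
As a sketch of what such a shared bootstrap might look like, the helper below resolves a tool name to the classpath entry for that tool's private lib/ directory. Everything about it is an assumption for illustration: the function name `tool_classpath`, and the `<root>/<tool>/lib/` layout, are hypothetical and not taken from any Hadoop release.

```shell
# Hypothetical common bootstrap helper: map a tool name to the classpath
# entry for that tool's own lib/ directory.
#   $1 = tools root (hypothetical layout: <root>/<tool>/lib/*.jar)
#   $2 = tool name
tool_classpath() {
  tools_root="$1"
  tool="$2"
  lib_dir="$tools_root/$tool/lib"

  if [ ! -d "$lib_dir" ]; then
    echo "unknown tool or missing lib dir: $lib_dir" >&2
    return 1
  fi

  # A trailing /* is the JVM's classpath wildcard (Java 6 and later): the JVM
  # expands it to every .jar file in the directory, so the jars do not have
  # to be listed one by one. It is printed literally, not expanded by the shell.
  printf '%s\n' "$lib_dir/*"
}
```

A launcher could add this entry to the classpath for that one tool invocation only, so one tool's dependencies would not leak into the classpath of another tool or of the cluster.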

          Alejandro Abdelnur made changes -
          Link This issue is related to MAPREDUCE-4658 [ MAPREDUCE-4658 ]
          Alejandro Abdelnur added a comment -

          Opened MAPREDUCE-4644
          Harsh J added a comment -

          There is a workaround for this, right? To place it manually on the classpath and pass it via -libjars for each app?

          I mean to ask: Should this really be a blocker?
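
The manual-classpath half of this workaround might look like the sketch below. `HADOOP_CLASSPATH` is the environment variable the `hadoop` launcher script reads for extra client-side classpath entries; the function name `add_to_hadoop_classpath` and the jar path in the usage note are hypothetical.

```shell
# Hypothetical helper: put one jar at the front of HADOOP_CLASSPATH without
# discarding whatever the variable already holds.
#   $1 = path to the jar to add (for example, a junit jar)
add_to_hadoop_classpath() {
  jar="$1"
  if [ -n "${HADOOP_CLASSPATH:-}" ]; then
    HADOOP_CLASSPATH="${jar}:${HADOOP_CLASSPATH}"
  else
    HADOOP_CLASSPATH="${jar}"
  fi
  export HADOOP_CLASSPATH
}
```

After something like `add_to_hadoop_classpath /path/to/junit.jar`, the `hadoop jar ...-tests.jar` command from the traceback could be retried. The traceback shows the failure in the client JVM (RunJar loading MapredTestDriver), so the client-side classpath is the part that matters for that error; `-libjars` would additionally be needed only where the tasks themselves require the jar.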

          Arun C Murthy added a comment -

          This is really a blocker - the inability to run tests (system tests) on a release is a blocker.

          For now I think we should revert HADOOP-8738 and revisit it once we have a fix.

          Arun C Murthy added a comment -

          Resolving since we've reverted HADOOP-8738.

          Arun C Murthy made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Not A Problem [ 8 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Jason Lowe
            • Votes:
              0
              Watchers:
              13
