Hadoop Map/Reduce: MAPREDUCE-4987

TestMRJobs#testDistributedCache fails on Windows due to classpath problems and unexpected behavior of symlinks

    Details

      Description

      On Windows, TestMRJobs#testDistributedCache fails on an assertion while checking the length of a symlink. It expects to see the length of the target of the symlink, but Java 6 on Windows always reports that a symlink has length 0.
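To illustrate the length check described above, here is a minimal sketch of reading the size of a symlink's target rather than the size of the link itself. It assumes Java 8 or later (the `java.nio.file` API is not available on Java 6), and the class and method names are hypothetical; this is not the code from the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SymlinkLengthSketch {
  /** Returns the size in bytes of the file that {@code path} ultimately points to. */
  static long targetLength(Path path) {
    try {
      // toRealPath() follows symbolic links by default, so the size read here
      // belongs to the link target, not to the link entry itself.
      return Files.size(path.toRealPath());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Creates a one-byte file plus a link to it and returns the resolved length. */
  static long demo() {
    try {
      Path dir = Files.createTempDirectory("symlink-length");
      Path target = Files.write(dir.resolve("target.txt"), new byte[] {42});
      Path link = dir.resolve("link.txt");
      try {
        Files.createSymbolicLink(link, target);
      } catch (IOException | UnsupportedOperationException | SecurityException e) {
        // Creating symlinks can require extra privileges (notably on Windows);
        // fall back to the plain file so the sketch still runs.
        link = target;
      }
      return targetLength(link);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

A test written this way asserts on the target's length explicitly, so it does not depend on what the platform reports for the link entry.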

      1. MAPREDUCE-4987.6.patch
        6 kB
        Chris Nauroth
      2. MAPREDUCE-4987.5.patch
        17 kB
        Chris Nauroth
      3. MAPREDUCE-4987.4.patch
        17 kB
        Chris Nauroth
      4. MAPREDUCE-4987.3.patch
        16 kB
        Chris Nauroth
      5. MAPREDUCE-4987.2.patch
        16 kB
        Chris Nauroth
      6. MAPREDUCE-4987.1.patch
        14 kB
        Chris Nauroth

        Issue Links

          Activity

          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Suresh Srinivas made changes -
          Fix Version/s 2.0.5-beta [ 12324032 ]
          Fix Version/s 3.0.0 [ 12320355 ]
          Target Version/s 3.0.0 [ 12320355 ]
          Suresh Srinivas added a comment -

          I merged the patch to branch-2.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1405 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1405/)
          MAPREDUCE-4987. TestMRJobs#testDistributedCache fails on Windows due to classpath problems and unexpected behavior of symlinks (Chris Nauroth via bikas) (Revision 1470003)

          Result = SUCCESS
          bikas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1470003
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1378 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1378/)
          MAPREDUCE-4987. TestMRJobs#testDistributedCache fails on Windows due to classpath problems and unexpected behavior of symlinks (Chris Nauroth via bikas) (Revision 1470003)

          Result = FAILURE
          bikas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1470003
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #189 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/189/)
          MAPREDUCE-4987. TestMRJobs#testDistributedCache fails on Windows due to classpath problems and unexpected behavior of symlinks (Chris Nauroth via bikas) (Revision 1470003)

          Result = SUCCESS
          bikas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1470003
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
          Chris Nauroth added a comment -

          Bikas, thank you for the help on code reviews and commits!

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3637 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3637/)
          MAPREDUCE-4987. TestMRJobs#testDistributedCache fails on Windows due to classpath problems and unexpected behavior of symlinks (Chris Nauroth via bikas) (Revision 1470003)

          Result = SUCCESS
          bikas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1470003
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
          Bikas Saha made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 3.0.0 [ 12320355 ]
          Resolution Fixed [ 1 ]
          Bikas Saha added a comment -

          +1. Committed to trunk.

          Chris Nauroth added a comment -

          Bikas, thanks for the feedback.

          > I thought that static blocks are executed once per class loader. So I am not sure why this one would be executed per inner class object creation.

          It's true that static initialization happens at class load time, so once per JVM/class loader. The problem here is that the test is submitting a MapReduce job with a mapper class defined as a nested class of the test class. Then, each map task runs inside its own JVM/class loader. Therefore, each separate JVM loads the class separately and executes the static block separately. All of these JVMs were trying to create a MiniDFSCluster with the same configuration, so they were all colliding on the same directories for namenode metadata and datanode blocks.

          > I would be wary of simply doubling the test timeouts.

          The biggest driver of timeout changes has been differences in developer environments. Test timeouts have been problematic for people like me who primarily develop on Mac or Linux but now want to contribute to Windows compatibility. We end up needing to run tests on under-powered VMs. Timeout values are generally an arbitrary choice by the original author of the test, based on his or her own machine's performance characteristics at the time. There has been some discussion of trying to parameterize JUnit to scale the timeouts up or down to suit your development hardware. For right now, I don't have any better solution than increasing the timeouts.

          > This can probably be done once instead of multiple times, right? I am assuming this is a slow filesystem operation.

          This logic iterates over multiple distinct localized resources, potentially a mix of files and directories, so we need to check isDirectory for each one.

          > Btw, there doesn't seem to be a test about explicitly adding local resources to the classpath in this patch, right?

          No, this case is already covered by the existing test TestMRJobs#testDistributedCache. The test fails before this patch and passes after this patch.

          > Finally, this will have to be split into common, mr and yarn jiras+patches, though we will need a combined patch to get a successful Jenkins run. We can attach the combined patch to the common jira because that will be committed first.

          This is done. The related jiras are: HADOOP-9488, MAPREDUCE-4987, and YARN-593.
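The point in this comment about static initialization can be shown with a small sketch. Within one JVM and class loader a static block runs once, however often the class is used; a separate task JVM would load the class again and start from zero. All names below are hypothetical, and `setUpOnce` only stands in for a JUnit `@BeforeClass` method, which runs when the test runner invokes it rather than whenever the class is loaded.

```java
final class StaticInitSketch {
  /** Counts how often the static initializer below has run in this JVM. */
  static int initCount = 0;

  /** Set only by the explicit setup hook, never as a side effect of class loading. */
  static boolean clusterStarted = false;

  /** Stand-in for a test class whose static block starts shared infrastructure. */
  static final class Fixture {
    static {
      // Runs once when this class is initialized by the current class loader.
      // Every separate JVM that loads the class would execute this again.
      initCount++;
    }

    static void touch() {
      // Intentionally empty: calling it forces class initialization.
    }
  }

  /** Explicit setup hook, analogous to a JUnit @BeforeClass method. */
  static void setUpOnce() {
    clusterStarted = true;
  }

  public static void main(String[] args) {
    Fixture.touch();
    Fixture.touch();
    System.out.println("static block executions in this JVM: " + initCount);
    setUpOnce();
    System.out.println("cluster started by explicit setup: " + clusterStarted);
  }
}
```

With the work moved into an explicit hook, a task JVM that merely loads the nested mapper class does not try to start its own cluster.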

          Chris Nauroth made changes -
          Summary "TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks" changed to "TestMRJobs#testDistributedCache fails on Windows due to classpath problems and unexpected behavior of symlinks"
          Chris Nauroth made changes -
          Link This issue relates to YARN-593 [ YARN-593 ]
          Chris Nauroth made changes -
          Link This issue is related to HADOOP-9488 [ HADOOP-9488 ]
          Chris Nauroth made changes -
          Attachment MAPREDUCE-4987.6.patch [ 12579572 ]
          Chris Nauroth added a comment -

          Attaching patch with just the MapReduce portion of the changes.

          Chris Nauroth made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Bikas Saha added a comment -

          I thought that static blocks are executed once per class loader. So I am not sure why this one would be executed per inner class object creation. In any case, moving the code to @BeforeClass is the right thing to do in general.

          Are the test times similar on Linux and Windows, or are they close to timing out only on Windows? I would be wary of simply doubling the test timeouts.
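One alternative to hard-coding larger timeouts is to scale a base timeout by a factor supplied at run time. The sketch below is only an illustration of that idea: the property name `test.timeout.scale` and the helper are hypothetical, not an existing Hadoop or JUnit setting.

```java
final class TimeoutScaleSketch {
  // Hypothetical property name, chosen for this example only.
  static final String SCALE_PROPERTY = "test.timeout.scale";

  /** Scales a base timeout by the factor in the system property (default 1.0). */
  static long scaledTimeoutMillis(long baseMillis) {
    double scale = Double.parseDouble(System.getProperty(SCALE_PROPERTY, "1.0"));
    return Math.round(baseMillis * scale);
  }

  public static void main(String[] args) {
    // Example: run with -Dtest.timeout.scale=2.0 on a slow VM.
    System.out.println(scaledTimeoutMillis(60_000L));
  }
}
```

A developer on an under-powered VM could then raise the factor locally without changing the value committed in the test.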

          This can probably be done once instead of multiple times, right? I am assuming this is a slow filesystem operation.

          +            if (new File(entry.getKey().toUri().getPath()).isDirectory()) { <---- THIS
          +              newClassPath.append(Path.SEPARATOR);
          +            }
          
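For context on why the quoted check appends a separator: `URLClassLoader` treats a URL ending in `/` as a directory and any other URL as a JAR file, so directory entries on a classpath built from URLs need the trailing slash. The following is a self-contained sketch of that per-entry check using plain `java.io.File`; the class and method names are hypothetical and this is not the patch code.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

final class ClasspathEntrySketch {
  /** Returns a classpath entry for the resource, with a trailing "/" for directories. */
  static String toEntry(File resource) {
    String entry = resource.getPath().replace(File.separatorChar, '/');
    // Each resource is checked individually because the list may mix
    // plain files (for example JARs) and directories.
    if (resource.isDirectory() && !entry.endsWith("/")) {
      entry += "/";
    }
    return entry;
  }

  /** Joins the entries with spaces, the delimiter used by a manifest Class-Path. */
  static String join(List<File> resources) {
    StringBuilder classPath = new StringBuilder();
    for (File resource : resources) {
      if (classPath.length() > 0) {
        classPath.append(' ');
      }
      classPath.append(toEntry(resource));
    }
    return classPath.toString();
  }

  public static void main(String[] args) {
    System.out.println(join(Arrays.asList(new File("."), new File("no-such-file.jar"))));
  }
}
```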

          Btw, there doesn't seem to be a test about explicitly adding local resources to the classpath in this patch, right?

          Finally, this will have to be split into common, mr and yarn jiras+patches, though we will need a combined patch to get a successful Jenkins run. We can attach the combined patch to the common jira because that will be committed first.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12575122/MAPREDUCE-4987.5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3462//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3462//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12575105/MAPREDUCE-4987.4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3460//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3460//console

          This message is automatically generated.

          Arpit Agarwal added a comment -

          +1

          Chris Nauroth made changes -
          Attachment MAPREDUCE-4987.5.patch [ 12575122 ]
          Chris Nauroth added a comment -

          Thanks for the catch, Arpit. Here is version 5 of the patch to fix the typo.

          Arpit Agarwal added a comment -

          Typo in comment (should be 'substitution').

          +        // context.  Do the same thing here for correct subtitution of
          

          +1 otherwise. I verified the patch on Windows and OS X.

          Chris Nauroth made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Chris Nauroth made changes -
          Attachment MAPREDUCE-4987.4.patch [ 12575105 ]
          Chris Nauroth added a comment -

          I'm attaching version 4 of the patch, which just makes one more change in ContainerLaunch#sanitizeEnv. I said earlier that we need to use the environment variables of the container when building the classpath jar. That was only partially correct. Instead, we need to start with the environment variables of the current process, and then add/overwrite the container environment established within sanitizeEnv. This agrees with the logic of launching the container process, which uses Shell#ShellCommandExecutor#execute. This uses a ProcessBuilder initialized with the environment of the parent process before adding the container environment variables. This matters when you launch a cluster from the distribution and want to override things like HADOOP_MAPRED_HOME in your dev environment.

          I also reordered the code so that sanitizeEnv completes all of its changes to the environment before we create the classpath jar.

          I reran the tests on Mac and Windows after this final change.
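The merge order described in this comment can be sketched as follows: copy the parent process environment first, then let the container's variables add to or overwrite it. This mirrors how `ProcessBuilder.environment()` starts out as a copy of the current process environment. The class and method names are hypothetical and the paths are made-up example values; this is not the code in ContainerLaunch.

```java
import java.util.HashMap;
import java.util.Map;

final class EnvMergeSketch {
  /** Returns the parent environment with container variables added or overwritten. */
  static Map<String, String> mergeEnv(Map<String, String> parentEnv,
                                      Map<String, String> containerEnv) {
    // Start from the parent process environment, as a child process would.
    Map<String, String> merged = new HashMap<>(parentEnv);
    // Container settings win whenever both maps define the same variable.
    merged.putAll(containerEnv);
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> containerEnv = new HashMap<>();
    containerEnv.put("HADOOP_MAPRED_HOME", "/home/dev/mapred"); // example value only
    Map<String, String> merged = mergeEnv(System.getenv(), containerEnv);
    System.out.println(merged.get("HADOOP_MAPRED_HOME"));
  }
}
```

Resolving variables against the merged map means a classpath built ahead of launch sees the same values the launched container process will see.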

          Chris Nauroth made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Chris Nauroth added a comment -

          Canceling the patch for a moment. I may have found something that I want to amend. I will update later today.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12574912/MAPREDUCE-4987.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3452//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3452//console

          This message is automatically generated.

          Chris Nauroth made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Chris Nauroth made changes -
          Attachment MAPREDUCE-4987.3.patch [ 12574912 ]
          Chris Nauroth added a comment -

          Attaching new patch, rebased after commit of YARN-488.

          Chris Nauroth made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Chris Nauroth added a comment -

          I need to rebase this patch now that YARN-488 has been committed.

          Chris Nauroth made changes -
          Link This issue is related to YARN-488 [ YARN-488 ]
          Chris Nauroth added a comment -

          This patch may cause a merge conflict with my patch on YARN-488, depending on which one gets committed first. After one of these gets committed, I'll check and rebase if necessary.

          Chris Nauroth added a comment -

          The new test failures appear to be unrelated, and I believe that they were introduced by the patch on MAPREDUCE-5028. I'm following up on that jira.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12574326/MAPREDUCE-4987.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

          org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
          org.apache.hadoop.mapreduce.lib.db.TestDataDrivenDBInputFormat
          org.apache.hadoop.mapred.TestFieldSelection
          org.apache.hadoop.mapreduce.TestLocalRunner
          org.apache.hadoop.mapred.TestUserDefinedCounters
          org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
          org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
          org.apache.hadoop.mapreduce.TestMROutputFormat
          org.apache.hadoop.mapred.TestLineRecordReader
          org.apache.hadoop.mapreduce.lib.fieldsel.TestMRFieldSelection
          org.apache.hadoop.mapred.TestMiniMRClasspath
          org.apache.hadoop.mapreduce.lib.map.TestMultithreadedMapper
          org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
          org.apache.hadoop.mapred.lib.TestChainMapReduce
          org.apache.hadoop.mapreduce.security.TestBinaryTokenFile
          org.apache.hadoop.mapreduce.TestMapReduce
          org.apache.hadoop.mapred.TestLazyOutput
          org.apache.hadoop.mapreduce.lib.join.TestJoinDatamerge
          org.apache.hadoop.mapred.lib.TestKeyFieldBasedComparator
          org.apache.hadoop.mapred.lib.TestMultithreadedMapRunner
          org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
          org.apache.hadoop.mapreduce.TestChild
          org.apache.hadoop.mapred.lib.TestMultipleOutputs
          org.apache.hadoop.mapred.TestJavaSerialization
          org.apache.hadoop.mapreduce.lib.input.TestLineRecordReader
          org.apache.hadoop.mapreduce.security.ssl.TestEncryptedShuffle
          org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs
          org.apache.hadoop.mapred.TestClusterMapReduceTestCase
          org.apache.hadoop.mapred.TestCollect
          org.apache.hadoop.fs.slive.TestSlive
          org.apache.hadoop.mapred.join.TestDatamerge
          org.apache.hadoop.mapreduce.TestMapCollection
          org.apache.hadoop.mapred.TestMiniMRClientCluster
          org.apache.hadoop.fs.TestDFSIO
          org.apache.hadoop.mapred.TestMapRed
          org.apache.hadoop.mapred.TestFileOutputFormat
          org.apache.hadoop.mapreduce.TestValueIterReset
          org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
          org.apache.hadoop.mapred.TestJobCounters
          org.apache.hadoop.fs.TestFileSystem
          org.apache.hadoop.mapreduce.v2.TestUberAM
          org.apache.hadoop.conf.TestNoDefaultsJobConf
          org.apache.hadoop.mapred.TestReporter
          org.apache.hadoop.mapred.TestJobName
          org.apache.hadoop.mapreduce.lib.partition.TestMRKeyFieldBasedComparator
          org.apache.hadoop.mapreduce.lib.chain.TestChainErrors
          org.apache.hadoop.mapreduce.lib.chain.TestSingleElementChain
          org.apache.hadoop.mapreduce.lib.input.TestMultipleInputs
          org.apache.hadoop.mapreduce.v2.TestMRJobs
          org.apache.hadoop.mapred.TestComparators
          org.apache.hadoop.mapreduce.lib.chain.TestMapReduceChain
          org.apache.hadoop.mapred.jobcontrol.TestLocalJobControl
          org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
          org.apache.hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3433//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3433//console

          This message is automatically generated.

          Ivan Mitic added a comment -

          "These tests were running pretty close to the timeouts in my environment, even on Mac. Here is a new patch that increases the timeouts."

          Thanks, I verified that the test now passes, +1 on the patch

          Chris Nauroth made changes -
          Attachment MAPREDUCE-4987.2.patch [ 12574326 ]
          Chris Nauroth added a comment -

          These tests were running pretty close to the timeouts in my environment, even on Mac. Here is a new patch that increases the timeouts.

          Ivan Mitic added a comment -

          I played around with the test a bit, and the following tests fail because of the timeout on my box: testRandomWriter, testFailingMapper, testSleepJobWithSecurityOn.

          Increasing the test timeouts by a factor of 2 made the tests pass on my box.

          Ivan Mitic added a comment -

          Hi Chris, I played around with the test a bit, and the following tests fail because of the timeout on my box: testRandomWriter, testFailingMapper, testSleepJobWithSecurityOn.

          Ivan Mitic added a comment -

          "Is there a particular test within the suite that is timing out consistently for you?"

          I tried to remove all timeouts from the test and it is passing now. Let me find the exact test case that is causing problems.

          Chris Nauroth added a comment -

          Thanks, Ivan. TestMRJobs has been passing consistently for me. I haven't been experiencing timeout failures. Is there a particular test within the suite that is timing out consistently for you?

          Ivan Mitic added a comment -

          Thanks Chris, patch looks good overall, +1

          I noticed that TestMRJobs fails with a timeout on my box. Does it consistently succeed for you? I see that the timeouts are set quite high (5 minutes). This is non-blocking; I'll take a look when I get a chance, just thought I'd ask.

          TestFileUtil passes fine.

          Arpit Agarwal added a comment -

          +1

          Chris explained to me offline about the change in FileUtil#createJarWithClassPath. Quoting here since I found it helpful to understand the change.

          In the method sanitizeEnv, you'll see that nodemanager does various things to set up a new environment for the container to be launched. The final state of this environment will be different from the environment of the currently running process (the nodemanager itself).

          The most glaring problem with this bug was the setting of PWD to the new container work directory. There are various classpath entries for the distributed cache files that are of the form $PWD/file on Mac or %PWD%/file on Windows, and FileUtil#createJarWithClassPath needs to expand this to <container_dir>/file. Without this change, the variable expansion would be incorrect: <nodemanager_working_dir>/file on Mac or just /file on Windows (since Windows doesn't intrinsically have %PWD% defined until nodemanager sets it in sanitizeEnv).
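
As an illustration of the expansion problem described above, here is a minimal sketch of expanding $PWD or %PWD% against an environment map supplied by the caller rather than the environment of the running process. The class ClasspathEnvExpander and its expand method are hypothetical names chosen for this example; this is not the actual FileUtil#createJarWithClassPath code.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the real FileUtil implementation. */
final class ClasspathEnvExpander {
  // %NAME% is the Windows form, $NAME the Unix form.
  private static final Pattern WINDOWS_VAR = Pattern.compile("%(\\w+)%");
  private static final Pattern UNIX_VAR = Pattern.compile("\\$(\\w+)");

  /**
   * Expands variables in a classpath entry using the environment the caller
   * passes in (for example, the environment built for the container), not
   * System.getenv() of the current process.
   */
  static String expand(String entry, Map<String, String> env, boolean windows) {
    Matcher m = (windows ? WINDOWS_VAR : UNIX_VAR).matcher(entry);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = env.get(m.group(1));
      // Assumption of this sketch: an undefined variable expands to "".
      // That reproduces the "/file" result described for an unset %PWD%.
      m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

With an environment that maps PWD to the container work directory, "$PWD/file" expands to a path under that directory. With an environment that lacks PWD, "%PWD%/file" collapses to "/file", which is the incorrect result described for Windows.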

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12573897/MAPREDUCE-4987.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3420//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3420//console

          This message is automatically generated.

          Chris Nauroth made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Chris Nauroth made changes -
          Attachment MAPREDUCE-4987.1.patch [ 12573897 ]
          Chris Nauroth added a comment -

          I'm attaching a patch. This fixes the issue of symlink handling on Windows by copying the files instead of truly symlinking, similar to the approach taken in prior patches like HADOOP-9061. This also fixes the logic for bundling the classpath into a jar manifest by guaranteeing that localized resources get added to the classpath, even if those localized resources don't exist in the container path yet. (The classpath jar must get created before the container launch script runs to symlink or copy files from filecache, so this was a chicken-and-egg problem.) With these changes in place, TestMRJobs#testDistributedCache passes on Mac and Windows.

          Here is a summary of the changes in each file:

          FileUtil#createJarWithClassPath - Accept environment provided by caller, because YARN will construct an environment different from the current system environment. Provide a way to maintain a classpath entry with a trailing '/' even though the directory doesn't exist, because the container launch script hasn't run yet.

          TestFileUtil#testCreateJarWithClassPath - Change test to cover new logic.

          TestMRJobs - Initialize MiniDFSCluster in a @BeforeClass method instead of a static initialization block. This test uses an inner class, DistributedCacheChecker, as the job's mapper. Since this is an inner class, it has a back-reference to the TestMRJobs class. This means that the TestMRJobs static initialization runs for each mapper task in addition to running in the JUnit runner. Therefore, this would start multiple instances of MiniDFSCluster pointing at the same directories, which would sometimes cause deadlocks. Moving the initialization to a @BeforeClass method prevents it from running in the mappers. I also needed to add a special check that a path is a symlinked directory, because FileUtils#isSymlink does not work as expected on Windows.

          ContainerLaunch - Copy files instead of symlinking on Windows. Guarantee that localized resources get added to the classpath correctly, even if the paths do not exist yet.
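
The trailing '/' point in the summary above can be sketched as follows. In a jar manifest, a Class-Path entry ending in '/' is treated as a directory, and File#toURI only appends the '/' when the directory already exists on disk. The class ClasspathJarSketch and its methods are hypothetical names for this example and assume Java 8 or later; this is not the patch's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

/** Hypothetical sketch; not the real FileUtil#createJarWithClassPath. */
final class ClasspathJarSketch {

  /** Converts a classpath entry to a URI, keeping a trailing '/' the caller asked for. */
  static String toManifestEntry(String classpathEntry) {
    boolean declaredDirectory = classpathEntry.endsWith("/");
    String uri = new File(classpathEntry).toURI().toString();
    // File#toURI adds '/' only for directories that exist right now, so
    // restore it for directories that will be created later.
    if (declaredDirectory && !uri.endsWith("/")) {
      uri += "/";
    }
    return uri;
  }

  /** Builds an otherwise empty jar whose manifest carries the classpath. */
  static byte[] buildClasspathJar(List<String> manifestEntries) {
    Manifest manifest = new Manifest();
    Attributes attrs = manifest.getMainAttributes();
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    attrs.put(Attributes.Name.CLASS_PATH, String.join(" ", manifestEntries));
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
      // No entries are needed; the manifest is the payload.
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Reads the Class-Path attribute back out of a jar built above. */
  static String readClassPath(byte[] jarBytes) {
    try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
      return in.getManifest().getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The sketch builds the jar in memory only to keep it self-contained; a real launcher would write the jar to disk and put that single jar on the process classpath.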

          Chris Nauroth made changes -
          Assignee Chris Nauroth [ cnauroth ]
          Affects Version/s 3.0.0 [ 12320355 ]
          Affects Version/s trunk-win [ 12323449 ]
          Target Version/s trunk-win [ 12323449 ] 3.0.0 [ 12320355 ]
          Chris Nauroth added a comment -

          In addition to the symlink problem mentioned earlier, I've discovered that there are also some problems with the logic for packaging the classpath into a jar manifest on Windows. This can cause us to miss distributed cache entries on the classpath. I'm going to try to address all of this in the same patch.

          Ivan Mitic added a comment -

          Thanks for reporting this Chris.

          "But symlinks CAN be used for functional purposes i.e linking to libraries etc. ?"

          Hi Vinod. I believe we'll have to port the branch-1-win semantics to trunk to properly support symlinks on both Java 6 and Java 7 on Windows. Yes, symlinks can be created to point to folders and files; however, Java 6 does not interpret them correctly. We've seen so many issues with symlinks on Java 6, and the only option that worked fine (and was signed off on) is to do a file copy in the case of Java 6. HADOOP-9061 talks about some of these problems.

          "If so we can just do the platform check in the test-case."

          We also initially thought this would be fine (you can check through the branch-1-win history). However, the real problem comes when someone tries to access the symlink through Java APIs. For example, File#length on a symlink returns zero, which means that RLFS does not work on top of symlinks. Additionally, File#renameTo on a symlink renames the target file instead of the symlink (really strange, I know).

          Hope this helps
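
For readers who want to check symlink lengths on their own platform, here is a minimal sketch. It assumes Java 7 or later and uses java.nio.file, so it does not reproduce the Java 6 File#length behavior described above; it only shows the difference between the length seen through a link and the length of the link object itself. The class name SymlinkLengthProbe is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical probe for illustration; requires Java 7+ (NIO.2). */
final class SymlinkLengthProbe {

  /** Length as seen when the link is followed to its target. */
  static long lengthFollowingLinks(Path path) {
    try {
      return Files.size(path);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Size reported for the link object itself; the value is platform-specific. */
  static long lengthOfLinkItself(Path path) {
    try {
      return Files.readAttributes(
          path, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS).size();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The assertion in TestMRJobs expects the first of these two values, the length of the target.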

          Vinod Kumar Vavilapalli added a comment -

          But symlinks CAN be used for functional purposes, i.e. linking to libraries etc.? If so, we can just do the platform check in the test case.

          Chris Nauroth made changes -
          Link This issue is part of YARN-191 [ YARN-191 ]
          Chris Nauroth made changes -
          Link This issue is part of MAPREDUCE-4401 [ MAPREDUCE-4401 ]
          Chris Nauroth made changes -
          Link This issue is part of YARN-191 [ YARN-191 ]
          Chris Nauroth made changes -
          Link This issue relates to HADOOP-9061 [ HADOOP-9061 ]
          Chris Nauroth added a comment -

          In TestMRJobs, the last assertion in this code fragment fails:

                // Check lengths of the files
                Map<String, Path> filesMap = pathsToMap(files);
                Assert.assertTrue(filesMap.containsKey("distributed.first.symlink"));
                Assert.assertEquals(1, localFs.getFileStatus(
                  filesMap.get("distributed.first.symlink")).getLen());
          

          This is a known issue with Java 6 on Windows. It always reports a symlink as having length zero instead of the length of the target file. This problem was fixed on branch-1-win in HADOOP-9061 by detecting if the runtime environment is Windows + Java 6, and if so, copying files into the symlink location instead of actually creating a symlink. Applying the same logic to branch-trunk-win will require different code changes. In YARN, the symlinks for the distributed cache get generated by the container launch scripts. See ContainerLaunch#WindowsShellScriptBuilder#link.
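
The "Windows + Java 6" check described above could look roughly like the following sketch. The class and method names are hypothetical, and the matching on the os.name and java.version system properties is an assumption of this example, not the code from HADOOP-9061.

```java
/** Hypothetical decision helper; not the actual HADOOP-9061 logic. */
final class LinkStrategySketch {

  /**
   * Returns true when files should be copied into the link location
   * instead of creating a real symlink.
   */
  static boolean shouldCopyInsteadOfLink(String osName, String javaVersion) {
    boolean windows = osName != null && osName.startsWith("Windows");
    // Java 6 reports versions such as "1.6.0_45".
    boolean java6 = javaVersion != null && javaVersion.startsWith("1.6");
    return windows && java6;
  }

  /** Convenience overload that inspects the running JVM. */
  static boolean shouldCopyInsteadOfLink() {
    return shouldCopyInsteadOfLink(
        System.getProperty("os.name"), System.getProperty("java.version"));
  }

  public static void main(String[] args) {
    System.out.println("copy instead of link: " + shouldCopyInsteadOfLink());
  }
}
```

Passing the property values in as parameters keeps the decision testable on any platform.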

          Chris Nauroth made changes -
          Field: Description
          Original Value: On Windows, TestMRJobs#testDistributedCache fails on an assertion while checking the length of a symlink. It expects to see the length of the target of the symlink, but Java 6 on Windows always reports that a symlink has length 0.
          New Value: On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while checking the length of a symlink. It expects to see the length of the target of the symlink, but Java 6 on Windows always reports that a symlink has length 0.
          Chris Nauroth created issue -

            People

            • Assignee:
              Chris Nauroth
              Reporter:
              Chris Nauroth
            • Votes:
              0
              Watchers:
              8
