Hadoop Common
  1. Hadoop Common
  2. HADOOP-673

the task execution environment should have a current working directory that is task specific

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.2
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels:
      None

      Description

      The tasks should be run in a work directory that is specific to a single task. In particular, I'd suggest using the <local>/jobcache/<jobid>/<taskid> as the current working directory.

      1. cwd_1.patch
        7 kB
        Mahadev konar
      2. cwd.patch
        7 kB
        Mahadev konar

        Activity

        Hide
        Mahadev konar added a comment -

        I do think the suggestion is right-- only that it would involve changes to hadoop-streaming and might break other applications that assume the current working directory to be the directory where the unjarred files lie. That was the only reason I chose the current working directory to be the same for all tasks, but it does seem ugly that all the tasks would share the same workspace. I can make the required changes if people agree on it.

        Show
        Mahadev konar added a comment - I do think the suggestion is right-- only that it would involve changes to hadoop-streaming and might break other applications that assume the current working directory to be the directory where the unjarred files lie. That was the only reason I chose the current working directory to be the same for all tasks, but it does seem ugly that all the tasks would share the same workspace. I can make the required changes if people agree on it.
        Hide
        Dick King added a comment -

        If you create links from the task-specific directory to the per-job-directory files and make the task-specific directory you'll avoid breaking existing code. This assumes people don't write in the per-job-directory but it's tough to do that in a principled manner so I wouldn't expect that to happen.

        Show
        Dick King added a comment - If you create links from the task-specific directory to the per-job-directory files and make the task-specific directory you'll avoid breaking existing code. This assumes people don't write in the per-job-directory but it's tough to do that in a principled manner so I wouldn't expect that to happen.
        Hide
        Mahadev konar added a comment -

        I have the following proposal:

        The tasks run in the task specific directory that is
        <local>/jobcache/<jobid>/<taskid> to create a virtual isolated environment for the tasks

        and streaming and all other applications access the job.jar files
        using
        ../work/

        does that seem ok?
        I will be implementing this soon so please comment if you have any suggestions or problems with the above proposal.

        Show
        Mahadev konar added a comment - I have the following proposal: The tasks run in the task specific directory that is <local>/jobcache/<jobid>/<taskid> to create a virtual isolated environment for the tasks and streaming and all other applications access the job.jar files using ../work/ does that seem ok? I will be implementing this soon so please comment if you have any suggestions or problems with the above proposal.
        Hide
        Richard Kasperski added a comment -

        I think that it is very important that there be a way for an application to run in a sandbox <local>/jobcache/<jobid>/<taskid> that contains the contents of the jar file and is the current working directory. How one accomplishes this is an implementation detail. If the only way to do it is to unjar the archive more than once then I guess that would have to be the solution. This saves the potential for a lot of grief. No shared files are ever modified because they aren't actually shared. This causes more unpacking of jars but I don't really see that as a problem. How the jar's are copied to a node and the subsequent reuse of the jar is important.

        That is the sandbox. Then there is the shared sandbox which is also two different instance to memory map a file and pay a single cost. This is best handled by either softlinks or hardlinks under unix/linux.

        Why do I think these are important models? Most of the programs that I write and that I use run out of the current directory and expect all of the their configuration files/resources can be read from there. For programs that have more sophisticated models of deployment the models above are still ok. For the simpler programs the proposed external repository doesn't work.

        OTOH I can always run a script that will hard link the files from the directory where the jar to my current working directory. I just don't believe that this should be forced on the users. Having the users do system'ish things is potentially dangerous.

        Even more restrictive then the above sandbox would be one in which the application is run chroot'd. That way there is no way it could muck with anything system like on the nodes. This is an important consideration when one lets arbitrary programs to be run.

        Show
        Richard Kasperski added a comment - I think that it is very important that there be a way for an application to run in a sandbox <local>/jobcache/<jobid>/<taskid> that contains the contents of the jar file and is the current working directory. How one accomplishes this is an implementation detail. If the only way to do it is to unjar the archive more than once then I guess that would have to be the solution. This saves the potential for a lot of grief. No shared files are ever modified because they aren't actually shared. This causes more unpacking of jars but I don't really see that as a problem. How the jar's are copied to a node and the subsequent reuse of the jar is important. That is the sandbox. Then there is the shared sandbox which is also two different instance to memory map a file and pay a single cost. This is best handled by either softlinks or hardlinks under unix/linux. Why do I think these are important models? Most of the programs that I write and that I use run out of the current directory and expect all of the their configuration files/resources can be read from there. For programs that have more sophisticated models of deployment the models above are still ok. For the simpler programs the proposed external repository doesn't work. OTOH I can always run a script that will hard link the files from the directory where the jar to my current working directory. I just don't believe that this should be forced on the users. Having the users do system'ish things is potentially dangerous. Even more restrictive then the above sandbox would be one in which the application is run chroot'd. That way there is no way it could muck with anything system like on the nodes. This is an important consideration when one lets arbitrary programs to be run.
        Hide
        Owen O'Malley added a comment -

        Richard, I don't understand why having the jar expanded in ".." is unacceptable to you. It seems to me the requirements are:
        1. there be an easy way to get to the expanded jar
        2. each task run in sandbox
        In previous systems we've taken this approach and it has worked very well.

        The problem with links is that they are not supported in Java, which is our implementation language, or in Windows, which is a supported platform. Furthermore, since the files are a SHARED resource it is not appropriate for them to be hardlinked into the sandbox. If the application uses "../foo.txt" they are aware that they are leaving the sandbox and that they need to play nicely.

        Show
        Owen O'Malley added a comment - Richard, I don't understand why having the jar expanded in ".." is unacceptable to you. It seems to me the requirements are: 1. there be an easy way to get to the expanded jar 2. each task run in sandbox In previous systems we've taken this approach and it has worked very well. The problem with links is that they are not supported in Java, which is our implementation language, or in Windows, which is a supported platform. Furthermore, since the files are a SHARED resource it is not appropriate for them to be hardlinked into the sandbox. If the application uses "../foo.txt" they are aware that they are leaving the sandbox and that they need to play nicely.
        Hide
        Richard Kasperski added a comment -

        Owen,
        There are two primary reasons.
        1. Maybe I don't have access to the source of all programs that I might want to run, or I have the source and don't desire to make any changes. Having the data for an application be rooted in the current directory allows many more programs to be run with no changes.
        2. I think of hadoop as an infrastructure to run other progams. not as something to write programs too. This means that I do not want to write into my progams hadoop'isms.

        Running streaming apps on hadoop should be as transparent as possible.

        Files being a shared resource is an efficiency consideration not one required for proper operation. Is there an operational requirement for the files being shared such as I need to mmap the same file? Is this likely the way that most programs will be.

        It's not that I'm absolutely against the .. structure, just that it seems to me not to be the most useful structure. I would contend that if you desire the .. placement you need also, some how, provide for placement in the current working directory.

        The above comments only apply to streaming.

        A more general comment, perhaps philosophical, perhaps religious.

        The purpose of hadoop is to allow work to be divided and then executed with a belief of correctness that is the same as would have if I ran it as a single job on a single machine. Sharing data in a way that allows inadvertant/unintended manipulation of 'SHARED' data violates this. It seems to me that this should be used when efficiency is needed and the user can take an appropriate explicit action to enable it.

        Show
        Richard Kasperski added a comment - Owen, There are two primary reasons. 1. Maybe I don't have access to the source of all programs that I might want to run, or I have the source and don't desire to make any changes. Having the data for an application be rooted in the current directory allows many more programs to be run with no changes. 2. I think of hadoop as an infrastructure to run other progams. not as something to write programs too. This means that I do not want to write into my progams hadoop'isms. Running streaming apps on hadoop should be as transparent as possible. Files being a shared resource is an efficiency consideration not one required for proper operation. Is there an operational requirement for the files being shared such as I need to mmap the same file? Is this likely the way that most programs will be. It's not that I'm absolutely against the .. structure, just that it seems to me not to be the most useful structure. I would contend that if you desire the .. placement you need also, some how, provide for placement in the current working directory. The above comments only apply to streaming. A more general comment, perhaps philosophical, perhaps religious. The purpose of hadoop is to allow work to be divided and then executed with a belief of correctness that is the same as would have if I ran it as a single job on a single machine. Sharing data in a way that allows inadvertant/unintended manipulation of 'SHARED' data violates this. It seems to me that this should be used when efficiency is needed and the user can take an appropriate explicit action to enable it.
        Hide
        Owen O'Malley added a comment -

        Ok, I propose that we handle this as a streaming option. So:

        1. For all map/reduce jobs the task has current working directory set to the task specific directory.
        2. For streaming if the property "mapred.create.symlink" is true, the contents of the job directory (other than the task directories) are symlinked into the task directory. Note that this property is already used for the file cache for a similar purpose and in streaming defaults to true.

        If there was a convenient way to make the job cache files read only in Java, that would make me more comfortable about violating the sandbox isolation. But as it is, I guess it is upto the application writer to make sure they don't change any of the shared files.

        Show
        Owen O'Malley added a comment - Ok, I propose that we handle this as a streaming option. So: 1. For all map/reduce jobs the task has current working directory set to the task specific directory. 2. For streaming if the property "mapred.create.symlink" is true, the contents of the job directory (other than the task directories) are symlinked into the task directory. Note that this property is already used for the file cache for a similar purpose and in streaming defaults to true. If there was a convenient way to make the job cache files read only in Java, that would make me more comfortable about violating the sandbox isolation. But as it is, I guess it is upto the application writer to make sure they don't change any of the shared files.
        Hide
        Doug Cutting added a comment -

        > if there was a convenient way to make the job cache files read only [ ... ]

        runtime.exec(new String[]

        { "chmod" "-w", file }

        );

        Works under cygwin!

        Show
        Doug Cutting added a comment - > if there was a convenient way to make the job cache files read only [ ... ] runtime.exec(new String[] { "chmod" "-w", file } ); Works under cygwin!
        Hide
        Richard Kasperski added a comment -

        Owen,
        What you have proposed sounds great! It allows external programs written to run on the cluster and legacy apps to run. thanks

        Show
        Richard Kasperski added a comment - Owen, What you have proposed sounds great! It allows external programs written to run on the cluster and legacy apps to run. thanks
        Hide
        Mahadev konar added a comment -

        this patch fixes the problem. It implements owens second proposal ie.
        1) The current working dir for each task is the task directory
        2) for streaming all the contents of job.jar are symlinked into this current working directory for the tasks.

        This patch depends upon HADOOP-748. I have included the patch for HADOOP-748 in this patch.

        Show
        Mahadev konar added a comment - this patch fixes the problem. It implements owens second proposal ie. 1) The current working dir for each task is the task directory 2) for streaming all the contents of job.jar are symlinked into this current working directory for the tasks. This patch depends upon HADOOP-748 . I have included the patch for HADOOP-748 in this patch.
        Hide
        Hadoop QA added a comment -

        +1, http://issues.apache.org/jira/secure/attachment/12346398/cwd.patch applied and successfully tested against trunk revision r481432

        Show
        Hadoop QA added a comment - +1, http://issues.apache.org/jira/secure/attachment/12346398/cwd.patch applied and successfully tested against trunk revision r481432
        Hide
        Doug Cutting added a comment -

        This patch changes fullyDelete() to run 'rm -rf'. We should only rely on unix command-line tools when we absolutely must, and I don't think we have to here. Instead, before listing a directory, we should compare getAbsoluteFile() to getCanonicalFile() to determine whether it's a symbolic link. If it is a symbolic link, then we should ignore it.

        Also, perhaps most of the code added to TaskRunner could be instead added to DistributedCache, since it's cache-specific stuff?

        Show
        Doug Cutting added a comment - This patch changes fullyDelete() to run 'rm -rf'. We should only rely on unix command-line tools when we absolutely must, and I don't think we have to here. Instead, before listing a directory, we should compare getAbsoluteFile() to getCanonicalFile() to determine whether it's a symbolic link. If it is a symbolic link, then we should ignore it. Also, perhaps most of the code added to TaskRunner could be instead added to DistributedCache, since it's cache-specific stuff?
        Hide
        Sameer Paranjpye added a comment -

        Can the difference between getAbsoluteFile() and getCanonicalFile() be used reliably? What happens under cygwin?

        Show
        Sameer Paranjpye added a comment - Can the difference between getAbsoluteFile() and getCanonicalFile() be used reliably? What happens under cygwin?
        Hide
        Owen O'Malley added a comment -

        I'm worried about the performance of using getCanonicalFile on every directory during the recursion, since it must traverse the entire path up to '/' looking for symlinks.

        Another solution would be to just delete the directory first using File.delete() and if it is a symlink, it will work. (With the slight assumption, which could be checked, that the parent directory was writable.)

        Show
        Owen O'Malley added a comment - I'm worried about the performance of using getCanonicalFile on every directory during the recursion, since it must traverse the entire path up to '/' looking for symlinks. Another solution would be to just delete the directory first using File.delete() and if it is a symlink, it will work. (With the slight assumption, which could be checked, that the parent directory was writable.)
        Hide
        Doug Cutting added a comment -

        > Can the difference between getAbsoluteFile() and getCanonicalFile() be used reliably?

        I think so. The docs for getCanonicalFile() explicitly mention symbolic links, while getAbsoluteFile() is merely supposed to resolve relative paths against CWD.

        > What happens under cygwin?

        Cygwin shouldn't impact this, since these are implemented by the JVM.

        Show
        Doug Cutting added a comment - > Can the difference between getAbsoluteFile() and getCanonicalFile() be used reliably? I think so. The docs for getCanonicalFile() explicitly mention symbolic links, while getAbsoluteFile() is merely supposed to resolve relative paths against CWD. > What happens under cygwin? Cygwin shouldn't impact this, since these are implemented by the JVM.
        Hide
        Andrzej Bialecki added a comment -

        One note re: Cygwin - I discovered that you can't access special filesystems provided by Cygwin, such as /proc, /tmp, mountpoints (/cygdrive/c) etc. exactly because the filesystem layer is implemented in JVM and is unaware of Cygwin, its path translations and its synthetic filesystems ...

        So, when I tried to resolve "/proc/self", which is a symbolic link to the current PID under Cygwin, I got nothing because this path is inside Cygwin's "procfs", which invisible to JVM.

        Show
        Andrzej Bialecki added a comment - One note re: Cygwin - I discovered that you can't access special filesystems provided by Cygwin, such as /proc, /tmp, mountpoints (/cygdrive/c) etc. exactly because the filesystem layer is implemented in JVM and is unaware of Cygwin, its path translations and its synthetic filesystems ... So, when I tried to resolve "/proc/self", which is a symbolic link to the current PID under Cygwin, I got nothing because this path is inside Cygwin's "procfs", which invisible to JVM.
        Hide
        Raghu Angadi added a comment -

        If we depend on Cygwin functionality, then shouldn't JVM also be Cygwin's JVM package (if one exists)?

        Show
        Raghu Angadi added a comment - If we depend on Cygwin functionality, then shouldn't JVM also be Cygwin's JVM package (if one exists)?
        Hide
        Doug Cutting added a comment -

        We depend on cygwin only to access OS features that are not otherwise available to Java. This way we can access such features uniformly on unix and windows systems, with identical code in most places, letting cygwin figure out how to implement things on windows, rather than maintaining two implementations of each.

        > when I tried to resolve "/proc/self" [...] under Cygwin, I got nothing

        Right, that path is not a Windows path, and hence only makes sense to cygwin binaries, so it couldn't be accessed directly from Java, which only sees the Windows filesystem. It could be accessed, e.g., by running 'cat /proc/self' in a sub-process, but that would probably get the sub-process, not the parent.

        Show
        Doug Cutting added a comment - We depend on cygwin only to access OS features that are not otherwise available to Java. This way we can access such features uniformly on unix and windows systems, with identical code in most places, letting cygwin figure out how to implement things on windows, rather than maintaining two implementations of each. > when I tried to resolve "/proc/self" [...] under Cygwin, I got nothing Right, that path is not a Windows path, and hence only makes sense to cygwin binaries, so it couldn't be accessed directly from Java, which only sees the Windows filesystem. It could be accessed, e.g., by running 'cat /proc/self' in a sub-process, but that would probably get the sub-process, not the parent.
        Hide
        Mahadev konar added a comment -

        This patch uses java for deleteting the symlinks and have moved the creating of symliks to DistributedCache. The java code tries deleting a directory first which succeeds if its empty or a symlink and if it does not succeed it tries deleting the directory recursively. This works both on cygwin and linux.

        Show
        Mahadev konar added a comment - This patch uses java for deleteting the symlinks and have moved the creating of symliks to DistributedCache. The java code tries deleting a directory first which succeeds if its empty or a symlink and if it does not succeed it tries deleting the directory recursively. This works both on cygwin and linux.
        Hide
        Hadoop QA added a comment -

        +1, http://issues.apache.org/jira/secure/attachment/12346718/cwd_1.patch applied and successfully tested against trunk revision r483772

        Show
        Hadoop QA added a comment - +1, http://issues.apache.org/jira/secure/attachment/12346718/cwd_1.patch applied and successfully tested against trunk revision r483772
        Hide
        Doug Cutting added a comment -

        I just committed this. Thanks, Mahadev!

        Show
        Doug Cutting added a comment - I just committed this. Thanks, Mahadev!

          People

          • Assignee:
            Mahadev konar
            Reporter:
            Owen O'Malley
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development