Hadoop Common / HADOOP-6365

distributed cache doesn't work with HDFS and another file system

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: None
    • Component/s: filecache
    • Labels: None
    • Environment: CentOS

    Description

      This is a continuation of http://issues.apache.org/jira/browse/HADOOP-5635 (JIRA wouldn't let me edit that one). I found another issue with DistributedCache when it is used with a file system other than HDFS. In my case I have two active file systems, with HDFS as the default file system.

      My fix adds two changes on top of the original HADOOP-5635 patch to get DistributedCache working with another file system scheme. I've tested this and it works with my code on HDFS alongside another file system. I have similar changes for mapreduce.filecache.TaskDistributedCacheManager and TrackerDistributedCacheManager (0.22.0).

      Basically, the code calls URI.getPath() where it should call URI.toString(). toString() returns the scheme plus the path, which is needed to find the file to copy (that is, to get the right file system). With getPath() alone the scheme is dropped, so the default file system (in this case HDFS) is searched for the file.
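
      The difference between the two calls can be seen with plain java.net.URI, without any Hadoop classes. This is only a minimal sketch: the class name CacheUriDemo, the helper resolvedScheme, the scheme "otherfs", and the host and path are all made up for illustration, and the helper merely imitates "use the scheme if present, otherwise fall back to the default file system". It is not the actual DistributedCache code.

```java
import java.net.URI;

public class CacheUriDemo {

    // Hypothetical stand-in for a file system lookup: a location that
    // carries a scheme resolves to that scheme, anything else falls
    // back to the default file system's scheme.
    static String resolvedScheme(String location, String defaultScheme) {
        String scheme = URI.create(location).getScheme();
        return scheme != null ? scheme : defaultScheme;
    }

    public static void main(String[] args) {
        // Assumed cache file on a second, non-default file system.
        URI cacheFile = URI.create("otherfs://host:9000/cache/lookup.dat");

        // getPath() drops the scheme and authority.
        String pathOnly = cacheFile.getPath();
        // toString() keeps the full URI, including the scheme.
        String fullUri = cacheFile.toString();

        System.out.println(pathOnly + " -> " + resolvedScheme(pathOnly, "hdfs"));
        System.out.println(fullUri + " -> " + resolvedScheme(fullUri, "hdfs"));
    }
}
```

      In this sketch the path-only string resolves to the default scheme, while the full URI string keeps pointing at the second file system, which is the behavior the fix relies on.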

          People

            Assignee: Unassigned
            Reporter: Marc Colosimo (enzo69mc)

              Time Tracking

                Original Estimate: 0.5h
                Remaining Estimate: 0.5h
                Time Spent: Not Specified