Hadoop Common / HADOOP-6365

distributed cache doesn't work with HDFS and another file system

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: None
    • Component/s: filecache
    • Labels: None
    • Environment: CentOS

      Description

      This is a continuation of http://issues.apache.org/jira/browse/HADOOP-5635 (JIRA wouldn't let me edit that one). I found another issue with DistributedCache when using a file system other than HDFS. In my case I have TWO active file systems, with HDFS being the default file system.

      My fix makes two additional changes on top of the original HADOOP-5635 patch so that DistributedCache works with another file system scheme. I've tested this and it works with my code on HDFS plus another file system. I have similar changes for mapreduce.filecache.TaskDistributedCacheManager and TrackerDistributedCacheManager (0.22.0).

      Basically, the code calls URI.getPath() where it should call URI.toString(). toString() returns the full URI, scheme included, and the scheme is what selects the file system that holds the file to copy. With getPath() the scheme is dropped, so the file is looked up on the default file system (in this case HDFS) instead.
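
      To illustrate the difference described above, here is a minimal sketch using only java.net.URI. It is not the code from the patch: the class name, the helper methods, and the "otherfs://host/data/lookup.txt" URI are all hypothetical, chosen just to show what each call keeps and what it drops.

```java
import java.net.URI;

public class CacheUriDemo {
    // Hypothetical helper: keeps only the path, as URI.getPath() does.
    static String pathOnly(URI cacheFile) {
        return cacheFile.getPath();
    }

    // Hypothetical helper: keeps the whole URI string, scheme included.
    static String fullUri(URI cacheFile) {
        return cacheFile.toString();
    }

    // Scheme that survives in a location string; null means nothing is left
    // to tell a non-default file system apart from the default one.
    static String schemeOf(String location) {
        return URI.create(location).getScheme();
    }

    public static void main(String[] args) {
        // Assumed example URI on a second, non-default file system.
        URI cacheFile = URI.create("otherfs://host/data/lookup.txt");

        System.out.println(pathOnly(cacheFile));            // /data/lookup.txt
        System.out.println(fullUri(cacheFile));             // otherfs://host/data/lookup.txt
        System.out.println(schemeOf(pathOnly(cacheFile)));  // null
        System.out.println(schemeOf(fullUri(cacheFile)));   // otherfs
    }
}
```

      Once the scheme is gone, a lookup based on that string has no way to choose the second file system, which matches the fallback to the default file system described above.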

            People

            • Assignee: Unassigned
            • Reporter: Marc Colosimo (enzo69mc)
            • Votes: 0
            • Watchers: 5

                Time Tracking

                • Original Estimate: 0.5h
                • Remaining Estimate: 0.5h
                • Time Spent: Not Specified