Hadoop Common / HADOOP-7611

SequenceFile.Sorter creates local temp files on HDFS

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.20.2
    • Fix Version/s: None
    • Component/s: io
    • Labels: None
    • Environment: CentOS 5.6 64-bit, Oracle JDK 1.6.0_26 64-bit

    Description

      When SequenceFile.Sorter sorts or merges sequence files stored in HDFS, it tries to create its temp files under the directory structure specified by mapred.local.dir, but on HDFS rather than on the local file system. The problem code is in MergeQueue.merge(), starting at line 2953:

          Path outputFile = lDirAlloc.getLocalPathForWrite(
              tmpFilename.toString(), approxOutputSize, conf);
          LOG.debug("writing intermediate results to " + outputFile);
          Writer writer = cloneFileAttributes(
              fs.makeQualified(segmentsToMerge.get(0).segmentPathName),
              fs.makeQualified(outputFile), null);
      

      The outputFile here is a local path without a scheme, e.g. under "/mnt/mnt1/mapred/local", as specified by the mapred.local.dir property. If we are sorting files on HDFS, the fs object is a DistributedFileSystem. The call to fs.makeQualified(outputFile) then qualifies the local temp path with the fs object's scheme and authority, e.g. hdfs://<namenode>/mnt/mnt1/mapred/local. This directory is then created on HDFS, provided the caller has the necessary HDFS permissions. If it does not, the sort/merge fails even though the directories exist locally.

      The code should instead always use the local file system when it retrieves a path from the mapred.local.dir property. The unit tests do not cover this condition; they only exercise sort and merge on the local file system.
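
      To make the difference concrete, here is a minimal stdlib-only model of the two behaviours. It does not use Hadoop classes: TempPathModel, qualify, tempPathAsReported and tempPathAsProposed are hypothetical names, and the NameNode address and temp file name are made up for illustration. In Hadoop itself the analogous change would presumably be to qualify outputFile against FileSystem.getLocal(conf) instead of fs, but that is an assumption, not a reviewed patch.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Stdlib-only model of the problem; it does not use or reproduce Hadoop's classes.
final class TempPathModel {

    // Hypothetical stand-in for path qualification: a path without a scheme
    // takes the scheme and authority of the given file system URI.
    static URI qualify(URI fsUri, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already qualified, leave untouched
        }
        try {
            return new URI(fsUri.getScheme(), fsUri.getAuthority(), p.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reported behaviour: the local temp path is qualified against the
    // file system that holds the input files.
    static URI tempPathAsReported(URI inputFs, String localTmp) {
        return qualify(inputFs, localTmp);
    }

    // Proposed behaviour: the input file system is deliberately ignored and
    // the temp path is always qualified against the local file system.
    static URI tempPathAsProposed(URI inputFs, String localTmp) {
        return qualify(URI.create("file:///"), localTmp);
    }

    public static void main(String[] args) {
        URI hdfs = URI.create("hdfs://namenode:8020/");        // assumed NameNode address
        String tmp = "/mnt/mnt1/mapred/local/intermediate.1";  // assumed temp file name
        System.out.println("reported: " + tempPathAsReported(hdfs, tmp));
        System.out.println("proposed: " + tempPathAsProposed(hdfs, tmp));
    }
}
```

      A unit test for the real fix could follow the same shape: run a sort with a non-local input file system and check that the intermediate files carry the file scheme.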

          People

            Assignee: Unassigned
            Reporter: Bryan Keller (bryanck)
            Votes: 0
            Watchers: 7
