Hadoop Common / HADOOP-7659

fs -getmerge isn't guaranteed to work well over non-HDFS filesystems

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0
    • Fix Version/s: 3.0.0-alpha1
    • Component/s: fs
    • Labels:
      None
    • Release Note:
      Documented that the "fs -getmerge" shell command may not work properly over non-HDFS filesystem implementations, because file list ordering varies by platform.

      Description

      When you use fs -getmerge with HDFS, the file listing is guaranteed to be sorted (part-00000, part-00001, and so on). When you use it with the other filesystems we bundle, the listing order is not guaranteed at all. This is because of http://download.oracle.com/javase/6/docs/api/java/io/File.html#list(), which we use internally for native file listing.

      Either this should be documented as a known issue in the -getmerge help and man pages, or a consistent ordering (similar to HDFS) must be applied on top of the listing. I suspect the latter only helps for the filesystems we include, while other FSes out there would still have to deal with this issue. Perhaps we need a recommendation note added to our API docs?
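For illustration, the second option (applying a consistent ordering on top of the listing) could look roughly like the sketch below. It uses only plain java.io and java.nio rather than the Hadoop FileSystem API, and the class and method names (SortedMerge, listSorted, merge) are hypothetical. It is not the change made for this issue, which per the release note was a documentation update.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Hadoop: merges the files of a local
// directory in name order instead of relying on the platform's listing order.
public class SortedMerge {

    // File.listFiles() promises no particular order, so sort by file name
    // to get the part-00000, part-00001, ... sequence that HDFS provides.
    static File[] listSorted(File dir) throws IOException {
        File[] files = dir.listFiles(File::isFile);
        if (files == null) {
            throw new IOException("Not a readable directory: " + dir);
        }
        Arrays.sort(files, Comparator.comparing(File::getName));
        return files;
    }

    // Concatenates every regular file in dir into target, in sorted order.
    // Assumption: target lives outside dir, otherwise it would be listed too.
    static void merge(File dir, File target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target.toPath())) {
            for (File part : listSorted(dir)) {
                Files.copy(part.toPath(), out);
            }
        }
    }
}
```

Sorting by name matches the expected order only when part files use zero-padded numbering, as in part-00000 and part-00001; other naming schemes would need a different comparator.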

            People

            • Assignee:
              qwertymaniac Harsh J
              Reporter:
              qwertymaniac Harsh J
            • Votes:
              0
              Watchers:
              6

              Dates

              • Created:
                Updated:
                Resolved: