Hadoop Common / HADOOP-3941

Extend FileSystem API to return file-checksums/file-digests

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added new FileSystem APIs: FileChecksum and FileSystem.getFileChecksum(Path).

      Description

      Suppose we have two files of the same size in two locations (possibly on two different clusters). How can we tell whether their contents are the same?

      Currently, the only way is to read both files and compare their contents byte by byte. This is very expensive if the files are huge, especially when the data must cross a network between clusters.

      So, we would like to extend the FileSystem API to support returning file-checksums/file-digests.
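
      The idea can be illustrated with a minimal local sketch. This is not the Hadoop implementation and does not use the FileSystem API: it assumes plain local files, uses MD5 from java.security.MessageDigest as a stand-in digest, and the class and method names (DigestCompare, digest, probablySame) are hypothetical. The point is that each side only needs to produce a small digest, and the digests are compared instead of the full contents.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration only; not part of Hadoop.
public class DigestCompare {

    // Stream the file once and return its MD5 digest (16 bytes).
    static byte[] digest(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return md.digest();
    }

    // Different sizes or different digests prove the contents differ.
    // Equal digests mean the contents are the same with very high
    // probability, not with certainty.
    static boolean probablySame(Path a, Path b)
            throws IOException, NoSuchAlgorithmException {
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        return Arrays.equals(digest(a), digest(b));
    }

    public static void main(String[] args) throws Exception {
        Path a = Files.createTempFile("a", ".bin");
        Path b = Files.createTempFile("b", ".bin");
        Files.write(a, "same bytes".getBytes("UTF-8"));
        Files.write(b, "same bytes".getBytes("UTF-8"));
        System.out.println(probablySame(a, b));
        Files.delete(a);
        Files.delete(b);
    }
}
```

      In a two-cluster setting, each digest would be computed where the file lives, so only a few bytes per file need to be exchanged for the comparison.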

        Attachments

        1. 3941_20080904.patch
          10 kB
          Tsz Wo Nicholas Sze
        2. 3941_20080827.patch
          17 kB
          Tsz Wo Nicholas Sze
        3. 3941_20080826.patch
          14 kB
          Tsz Wo Nicholas Sze
        4. 3941_20080820.patch
          18 kB
          Tsz Wo Nicholas Sze
        5. 3941_20080819b.patch
          18 kB
          Tsz Wo Nicholas Sze
        6. 3941_20080819.patch
          15 kB
          Tsz Wo Nicholas Sze
        7. 3941_20080818.patch
          10 kB
          Tsz Wo Nicholas Sze

              People

              • Assignee:
                szetszwo Tsz Wo Nicholas Sze
                Reporter:
                szetszwo Tsz Wo Nicholas Sze
              • Votes:
                0
                Watchers:
                4