Hadoop Common / HADOOP-3941

Extend FileSystem API to return file-checksums/file-digests


    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: fs
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Added new FileSystem APIs: FileChecksum and FileSystem.getFileChecksum(Path).

      Description

      Suppose we have two files in two locations (possibly on two different clusters), and the two files have the same size. How can we tell whether their contents are the same?

      Currently, the only way is to read both files and compare their contents. This is a very expensive operation if the files are huge.

      So, we would like to extend the FileSystem API to support returning file-checksums/file-digests.
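
      The idea in the description, comparing short digests instead of full file contents, can be sketched with the JDK alone. This is only an illustration under stated assumptions: it works on local files, uses MD5 as an arbitrary digest choice, and the class and method names (DigestCompare, md5Of, sameDigest) are hypothetical. It is not the implementation attached to this issue, and it does not use the FileChecksum or FileSystem.getFileChecksum(Path) APIs named in the release note.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class DigestCompare {

    /** Streams the file once and returns its MD5 digest (16 bytes). */
    static byte[] md5Of(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return md.digest();
    }

    /**
     * Returns true when both files have the same size and the same digest.
     * Equal digests make equal contents very likely, not certain;
     * different digests prove the contents differ.
     */
    static boolean sameDigest(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        if (Files.size(a) != Files.size(b)) {
            return false; // different sizes: no need to compute digests
        }
        return Arrays.equals(md5Of(a), md5Of(b));
    }

    public static void main(String[] args) throws Exception {
        Path a = Files.createTempFile("digest-a", ".txt");
        Path b = Files.createTempFile("digest-b", ".txt");
        Path c = Files.createTempFile("digest-c", ".txt");
        try {
            Files.write(a, "hello".getBytes(StandardCharsets.UTF_8));
            Files.write(b, "hello".getBytes(StandardCharsets.UTF_8));
            Files.write(c, "world".getBytes(StandardCharsets.UTF_8)); // same size, other content
            System.out.println("a vs b: " + sameDigest(a, b));
            System.out.println("a vs c: " + sameDigest(a, c));
        } finally {
            Files.deleteIfExists(a);
            Files.deleteIfExists(b);
            Files.deleteIfExists(c);
        }
    }
}
```

      In this sketch each digest can be computed independently, so two parties holding the files only need to exchange the 16-byte results rather than the file contents.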

        Attachments

        1. 3941_20080904.patch (10 kB, Tsz-wo Sze)
        2. 3941_20080827.patch (17 kB, Tsz-wo Sze)
        3. 3941_20080826.patch (14 kB, Tsz-wo Sze)
        4. 3941_20080820.patch (18 kB, Tsz-wo Sze)
        5. 3941_20080819b.patch (18 kB, Tsz-wo Sze)
        6. 3941_20080819.patch (15 kB, Tsz-wo Sze)
        7. 3941_20080818.patch (10 kB, Tsz-wo Sze)

              People

              • Assignee: Tsz-wo Sze (szetszwo)
              • Reporter: Tsz-wo Sze (szetszwo)
              • Votes: 0
              • Watchers: 4
