Hadoop Common / HADOOP-3941

Extend FileSystem API to return file-checksums/file-digests

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: fs
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Added new FileSystem APIs: FileChecksum and FileSystem.getFileChecksum(Path).

    Description

      Suppose we have two files in two locations (possibly on two different clusters), and the two files have the same size. How can we tell whether their contents are the same?

      Currently, the only way is to read both files and compare their contents. This is a very expensive operation if the files are huge.

      So, we would like to extend the FileSystem API to support returning file-checksums/file-digests.
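
      The idea can be sketched with the Java standard library alone: each side streams its own file once, produces a small fixed-size digest, and only the digests are compared. This is an illustration under stated assumptions, not the code from the attached patches. The class DigestCompare and its methods digest and sameDigest are hypothetical names, and MD5 is used only because every Java platform provides it; the algorithm behind FileSystem.getFileChecksum(Path) may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Hadoop: compares content by digest.
class DigestCompare {

    // Reads the stream once and returns its MD5 digest (16 bytes).
    // MD5 is an assumption made for this sketch.
    static byte[] digest(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Two digests that differ prove the contents differ; equal digests
    // indicate equal content, up to the collision risk of the algorithm.
    static boolean sameDigest(byte[] a, byte[] b) {
        return MessageDigest.isEqual(a, b);
    }

    public static void main(String[] args) {
        // In-memory stand-ins for two same-sized files in two locations.
        byte[] first = "block-0001".getBytes(StandardCharsets.UTF_8);
        byte[] second = "block-0002".getBytes(StandardCharsets.UTF_8);

        byte[] d1 = digest(new ByteArrayInputStream(first));
        byte[] d2 = digest(new ByteArrayInputStream(second));
        byte[] d3 = digest(new ByteArrayInputStream(first));

        System.out.println("first vs second: " + sameDigest(d1, d2));
        System.out.println("first vs first:  " + sameDigest(d1, d3));
    }
}
```

      Because each digest is computed where the data lives, only a few bytes per file need to cross the network, which is what makes the comparison cheap for huge files. Equal digests are strong evidence of equal content rather than a proof, so the strength of the chosen algorithm matters.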

      Attachments

        1. 3941_20080904.patch
          10 kB
          Tsz-wo Sze
        2. 3941_20080827.patch
          17 kB
          Tsz-wo Sze
        3. 3941_20080826.patch
          14 kB
          Tsz-wo Sze
        4. 3941_20080820.patch
          18 kB
          Tsz-wo Sze
        5. 3941_20080819b.patch
          18 kB
          Tsz-wo Sze
        6. 3941_20080819.patch
          15 kB
          Tsz-wo Sze
        7. 3941_20080818.patch
          10 kB
          Tsz-wo Sze

            People

              Assignee: Tsz-wo Sze (szetszwo)
              Reporter: Tsz-wo Sze (szetszwo)
              Votes: 0
              Watchers: 3
