  Hadoop Common / HADOOP-15620 (Über-jira: S3A phase VI: Hadoop 3.3 features) / HADOOP-15224

build up MD5 checksum as blocks are built in S3ABlockOutputStream; validate upload


    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None
    • Target Version/s:

      Description

      Ryan Blue reports that he sometimes sees corrupt data on S3. Given the MD5 checks performed on upload to S3, the corruption is more likely to have happened in VM RAM, on local disk, or somewhere nearby.

      If the MD5 checksum for each block were built up as data was written to it, and then checked against the ETag returned by S3, the RAM/HDD storage of the saved blocks could be ruled out as a source of corruption.

      The obvious place to do this would be org.apache.hadoop.fs.s3a.S3ADataBlocks.DataBlock.
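
      A minimal sketch of the idea, using hypothetical names (ChecksummedBlock, update, matchesEtag) rather than the real S3ADataBlocks.DataBlock API: update a MessageDigest with the same bytes the block buffers, then compare the hex digest against the ETag S3 returns for that upload or part.

      {code:java}
      import java.security.MessageDigest;
      import java.security.NoSuchAlgorithmException;

      /**
       * Illustrative only: accumulates an MD5 digest as block data is written,
       * so the buffered block never has to be re-read to validate the upload.
       * Names here are hypothetical, not the actual S3A block API.
       */
      class ChecksummedBlock {
        private final MessageDigest md5;
        private String hex;

        ChecksummedBlock() throws NoSuchAlgorithmException {
          this.md5 = MessageDigest.getInstance("MD5");
        }

        /** Call from the block's write path with the same bytes being buffered. */
        void update(byte[] buf, int off, int len) {
          md5.update(buf, off, len);
        }

        /** Hex digest of everything written; computed once when the block is closed. */
        String hexDigest() {
          if (hex == null) {
            StringBuilder sb = new StringBuilder();
            for (byte b : md5.digest()) {   // digest() finalizes the MD5
              sb.append(String.format("%02x", b));
            }
            hex = sb.toString();
          }
          return hex;
        }

        /** Compare against the (possibly quoted) ETag from the PUT or UploadPart response. */
        boolean matchesEtag(String etag) {
          return hexDigest().equalsIgnoreCase(etag.replace("\"", ""));
        }
      }
      {code}

      One caveat: the final ETag of a multipart upload is not the MD5 of the whole object, so the comparison would have to be made per uploaded part (each part's response carries its own ETag), or the digest could be sent as the Content-MD5 header so S3 itself rejects a corrupted part. Wrapping the block's output stream in java.security.DigestOutputStream would be another way to get the same incremental digest.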

              People

              • Assignee: Unassigned
              • Reporter: Steve Loughran (stevel@apache.org)
              • Votes: 0
              • Watchers: 4
