Hadoop Common / HADOOP-1450

checksums should be closer to data generation and consumption


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: fs
    • Labels: None

    Description

      ChecksumFileSystem checksums data by inserting a filter between two buffers. The outermost buffer should be as small as possible, so that, when writing, checksums are computed before the data has spent much time in memory, and, when reading, checksums are validated as close to their time of use as possible. Currently the outer buffer is the larger, using the bufferSize specified by the user, and the inner is small, so that most reads and writes will bypass it, as an optimization. Instead, the outer buffer should be made to be bytesPerChecksum, and the inner buffer should be the user-specified buffer size.
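The proposed layering can be illustrated with a minimal write-path sketch. This is not Hadoop's actual ChecksumFileSystem code; the class name `ChecksummingOutputStream` and the use of CRC32 with an in-memory list standing in for the `.crc` file are illustrative assumptions. The point is the buffer order: the outer buffer is only `bytesPerChecksum` wide, so each chunk is checksummed immediately, while the large user-sized buffer sits inside, between the checksummer and the sink.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch (not Hadoop's actual ChecksumFileSystem code):
// the *outer* buffer is only bytesPerChecksum wide, so a checksum is
// computed as soon as each small chunk fills; the user-sized buffer
// sits inside, between the checksummer and the underlying sink.
class ChecksummingOutputStream extends OutputStream {
    private final OutputStream inner;   // large, user-sized inner buffer
    private final byte[] chunk;         // small outer buffer (bytesPerChecksum)
    private int count = 0;
    private final CRC32 crc = new CRC32();
    final List<Long> sums = new ArrayList<>();  // stand-in for the .crc file

    ChecksummingOutputStream(OutputStream sink, int bytesPerChecksum, int bufferSize) {
        this.inner = new BufferedOutputStream(sink, bufferSize);
        this.chunk = new byte[bytesPerChecksum];
    }

    @Override public void write(int b) throws IOException {
        chunk[count++] = (byte) b;
        if (count == chunk.length) flushChunk();
    }

    private void flushChunk() throws IOException {
        crc.reset();
        crc.update(chunk, 0, count);
        sums.add(crc.getValue());       // checksum taken before data lingers in memory
        inner.write(chunk, 0, count);
        count = 0;
    }

    @Override public void close() throws IOException {
        if (count > 0) flushChunk();    // checksum the final partial chunk
        inner.close();
    }
}
```

With `bytesPerChecksum` small (e.g. 512 bytes), data is checksummed almost as soon as it is generated; the symmetric read path would validate each chunk's checksum just before handing the bytes to the caller.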

      Attachments

        1. HADOOP-1450.patch
          2 kB
          Doug Cutting

        Activity


          People

            Assignee: cutting (Doug Cutting)
            Reporter: cutting (Doug Cutting)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:
