Hadoop HDFS / HDFS-1846

Don't fill preallocated portion of edits log with 0x00


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      HADOOP-2330 added a feature to preallocate space in the local file system for the NN transaction log. That change seeks past the current end of the file and writes out some data, which on most systems leaves the skipped-over region of the file reading back as zeros. Most underlying file systems treat such a file as sparse, and don't actually allocate blocks on disk for the regions that were skipped over rather than written, which read back as 0x00.
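
      To make the seek-and-write behavior concrete, here is a minimal Java sketch. It is not the HADOOP-2330 code; the class and method names (SparsePreallocSketch, preallocateBySeeking, isAllZero) are hypothetical, and it assumes a platform where the skipped-over gap reads back as zeros, which is typical but not something the Java API itself promises.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Illustrative only: hypothetical names, not the actual NameNode code.
final class SparsePreallocSketch {

    /**
     * Grows the file by extraBytes by seeking past the current end and
     * writing a single byte at the new last offset. Returns the new length.
     */
    static long preallocateBySeeking(File file, long extraBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            long oldLength = raf.length();
            // Seeking past EOF alone does not change the length; the write does.
            raf.seek(oldLength + extraBytes - 1);
            raf.write(0x00);
            return raf.length();
        }
    }

    /** Returns true if every byte in [from, to) reads back as 0x00. */
    static boolean isAllZero(File file, long from, long to) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(from);
            for (long i = from; i < to; i++) {
                if (raf.read() != 0) {
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("edits", ".bin");
        f.deleteOnExit();
        long newLength = preallocateBySeeking(f, 1024 * 1024);
        System.out.println("length=" + newLength
            + " allZero=" + isAllZero(f, 0, newLength));
    }
}
```

      On a file system with sparse-file support, the reported length grows by the full amount while the on-disk usage (as shown by a tool such as du) may stay close to the original size, because only the final written byte needs a real block.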

      I've seen cases in the wild where the volume an edits dir is on fills up, resulting in a partial final transaction being written out to disk. If you examine the bytes of this (now corrupt) edits file, you'll see the partial final transaction followed by a lot of zeros, suggesting that the preallocation previously succeeded before the volume ran out of space. If we fill the preallocated space with something other than zeros, we'd likely see the failure at preallocation time, rather than transaction-writing time, and so cause the NN to crash earlier, without a partial transaction being written out.
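
      A rough sketch of the proposed alternative follows, assuming the preallocated region is written out in full with a non-zero byte so that the file system has to allocate real blocks. The names (NonZeroPreallocSketch, preallocateWithFill, FILL_BYTE) are hypothetical, and 0xFF is an arbitrary choice for illustration; this is not taken from the attached patches.

```java
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Illustrative only: hypothetical names, not the attached patch.
final class NonZeroPreallocSketch {

    // Assumption: any non-zero value would do; 0xFF is picked for illustration.
    static final byte FILL_BYTE = (byte) 0xFF;

    /**
     * Appends extraBytes of FILL_BYTE to the file. Because every byte of the
     * region is actually written, an out-of-space condition should surface
     * here as an IOException rather than later, in the middle of a transaction.
     * Returns the new file length.
     */
    static long preallocateWithFill(File file, int extraBytes) throws IOException {
        byte[] fill = new byte[extraBytes];
        Arrays.fill(fill, FILL_BYTE);
        try (FileChannel fc = FileChannel.open(file.toPath(), StandardOpenOption.WRITE)) {
            long position = fc.size();
            ByteBuffer buf = ByteBuffer.wrap(fill);
            // A single write may be partial, so loop until the buffer is drained.
            while (buf.hasRemaining()) {
                position += fc.write(buf, position);
            }
            // Push the data to the device so allocation failures are not deferred.
            fc.force(false);
            return fc.size();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("edits", ".bin");
        f.deleteOnExit();
        System.out.println("length=" + preallocateWithFill(f, 1024 * 1024));
    }
}
```

      The trade-off in this sketch is that preallocation now costs a full write of the region instead of a single byte, in exchange for failing before any transaction bytes are appended.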

      I also hypothesize that filling the preallocated space in the edits log with something other than 0x00 will result in a performance improvement in NN throughput. I haven't tested this yet, but I intend to as part of this JIRA.

      Attachments

        1. hdfs-1846-perf-analysis.0.patch
          7 kB
          Aaron Myers
        2. hdfs-1846.3.patch
          5 kB
          Aaron Myers
        3. hdfs-1846.3.patch
          5 kB
          Aaron Myers
        4. hdfs-1846.2.patch
          4 kB
          Aaron Myers
        5. hdfs-1846.1.patch
          4 kB
          Aaron Myers
        6. hdfs-1846.0.txt
          2 kB
          Aaron Myers
        7. editsStored
          1.00 MB
          Aaron Myers

          People

            Assignee: Aaron Myers (atm)
            Reporter: Aaron Myers (atm)
            Votes: 0
            Watchers: 8