Hadoop HDFS / HDFS-9092

NFS silently drops overlapping write requests and causes data copying to fail

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nfs
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When the 'sync' option is NOT used, NFS writes may cause the gateway to log the following warning:
      org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now

      and the size of the data copied via NFS stays at 1248752400 bytes.

      What happens is:

      1. The client sends write requests asynchronously.
      2. The NFS gateway has a handler that processes each incoming request by creating an internal write request structure and putting it into a cache.
      3. In parallel, a separate thread in the NFS gateway takes requests out of the cache and writes the data to HDFS.

      The current offset is how much data the write thread in step 3 has written so far. Overlap detection happens in step 2, but it only checks each incoming write request against the current offset and trims the request if necessary. Because write requests are sent asynchronously, two requests that both lie beyond the current offset and overlap each other go undetected, and both are put into the cache. This causes the symptom reported here when step 3 processes them.
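
The gap described above can be illustrated with a small, self-contained sketch. This is a toy model and not the actual OpenFileCtx code or the attached patch: the class name `OverlapSketch`, its methods, and the small offsets used for requests A and B are all hypothetical. It contrasts a check against the current offset only with an additional check against the requests already sitting in the cache.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the gateway's pending-write cache (hypothetical, not OpenFileCtx). */
class OverlapSketch {
    // Cached write requests: start offset -> exclusive end offset.
    private final TreeMap<Long, Long> pending = new TreeMap<>();
    // How much data the writer thread (step 3) has written so far.
    private final long nextOffset;

    OverlapSketch(long nextOffset) {
        this.nextOffset = nextOffset;
    }

    /** Step 2 as described in the report: compare only against nextOffset. */
    boolean overlapsWrittenData(long start, long end) {
        return start < nextOffset && end > nextOffset;
    }

    /** Extra check (an assumption of this sketch): compare against cached requests. */
    boolean overlapsPending(long start, long end) {
        // A linear scan stays correct even if the cache already holds overlapping ranges.
        for (Map.Entry<Long, Long> e : pending.entrySet()) {
            if (e.getKey() < end && start < e.getValue()) {
                return true;
            }
        }
        return false;
    }

    /** Put a request into the cache without any further validation. */
    void cache(long start, long end) {
        pending.put(start, end);
    }

    public static void main(String[] args) {
        OverlapSketch ctx = new OverlapSketch(1000L);

        // Request A lies entirely beyond the current offset and is cached.
        ctx.cache(2000L, 3000L);

        // Request B also lies beyond the current offset but overlaps A.
        long start = 2500L, end = 3500L;
        System.out.println("offset-only check flags B: " + ctx.overlapsWrittenData(start, end));
        System.out.println("pending-range check flags B: " + ctx.overlapsPending(start, end));

        // The values from the logged warning: the request straddles nextOffset.
        OverlapSketch logged = new OverlapSketch(1248752400L);
        System.out.println("logged request straddles nextOffset: "
                + logged.overlapsWrittenData(1248751616L, 1249677312L));
    }
}
```

In this model, request B passes the offset-only check and would be cached alongside A; only the comparison against the cached ranges catches it. By the time the writer thread reaches B, the current offset has moved past B's start, which matches the situation in the logged warning.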

      Attachments

        1. HDFS-9092.001.patch
          17 kB
          Yongjun Zhang
        2. HDFS-9092.002.patch
          18 kB
          Yongjun Zhang

          People

            Assignee: Yongjun Zhang (yzhangal)
            Reporter: Yongjun Zhang (yzhangal)
            Votes: 0
            Watchers: 8
