Hadoop HDFS / HDFS-1623 High Availability Framework for HDFS NN / HDFS-2393

Mark appropriate methods of ClientProtocol with the idempotent annotation

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: ha, hdfs-client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      HDFS-1973 will make the DFSClient take advantage of the annotation. This JIRA is to identify which methods can be marked as idempotent.
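To make the mechanism concrete, here is a minimal, standalone sketch of how a retry layer can use a marker annotation to decide whether a method is safe to re-invoke after a failure of unknown outcome (for instance, during an NN failover). The class and method names below are illustrative only, not Hadoop's actual retry classes:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class IdempotentSketch {

    // Marker annotation: a retry proxy may safely re-invoke such a method
    // even when it cannot tell whether the first attempt took effect.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Idempotent {}

    // A toy protocol: only the read-style op is marked idempotent.
    interface ToyProtocol {
        @Idempotent
        long getPreferredBlockSize(String src);

        void create(String src); // not idempotent: a retry after success would fail
    }

    // The per-method decision a retry policy would make.
    static boolean mayRetry(Method m) {
        return m.isAnnotationPresent(Idempotent.class);
    }

    public static void main(String[] args) throws Exception {
        Method read = ToyProtocol.class.getMethod("getPreferredBlockSize", String.class);
        Method write = ToyProtocol.class.getMethod("create", String.class);
        System.out.println(mayRetry(read));   // true
        System.out.println(mayRetry(write));  // false
    }
}
```

The annotation carries no behavior itself; all the policy lives in the client-side proxy that inspects it, which is why this JIRA is purely about classifying the methods correctly.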

      Attachments

      1. HDFS-2393-HDFS-1623.patch
        10 kB
        Aaron T. Myers
      2. HDFS-2393-HDFS-1623.patch
        11 kB
        Aaron T. Myers

          Activity

          atm Aaron T. Myers added a comment -

          I've been working on categorizing all of the 45 operations currently defined in ClientProtocol. This is the breakdown of what I've come up with:

          Read ops, therefore idempotent:

          • getBlockLocations
          • getPreferredBlockSize
          • getListing
          • getFileInfo
          • getFileLinkInfo
          • getStats
          • getDatanodeReport
          • listCorruptFileBlocks
          • getContentSummary
          • getLinkTarget
          • getServerDefaults

          Write ops that I believe should be marked idempotent:

          • setReplication
          • setPermission
          • setOwner
          • reportBadBlocks
          • mkdirs (in fact, it looks like this op can never return false. It either returns true or throws an exception.)
          • renewLease
          • setQuota
          • fsync
          • setTimes
          • getDelegationToken
          • renewDelegationToken
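
The mkdirs observation above is the essence of why these writes can be retried: repeating the call after a success leaves the namespace unchanged and reports the same outcome. A toy in-memory sketch (illustrative names, not HDFS code):

```java
import java.util.HashSet;
import java.util.Set;

// Toy namespace illustrating why mkdirs can be marked idempotent:
// a retried call after a lost reply observes the same success.
public class MkdirsSketch {
    private final Set<String> dirs = new HashSet<>();

    // Mirrors the observation above: mkdirs either succeeds
    // (returning true) or throws; it never returns false.
    boolean mkdirs(String path) {
        dirs.add(path); // adding an existing element is a no-op
        return true;
    }

    public static void main(String[] args) {
        MkdirsSketch ns = new MkdirsSketch();
        boolean first = ns.mkdirs("/user/atm");
        boolean retry = ns.mkdirs("/user/atm"); // simulated retry after failover
        System.out.println(first && retry);     // true: same observable outcome
    }
}
```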

          Write ops I'm unsure about:

          • getAdditionalDatanode (I'm not sure this one should even be marked OperationCategory.WRITE - it only gets the readLock)
          • recoverLease
          • updateBlockForPipeline

          Write ops that I believe should not be idempotent:

          • create
          • append
          • abandonBlock
          • addBlock
          • complete
          • rename
          • concat
          • rename2
          • delete
          • createSymLink
          • updatePipeline
          • cancelDelegationToken
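
For contrast, a minimal sketch of why an op like create must not be retried blindly: if the first attempt succeeded but the reply was lost in a failover, the retry observes a different, failing outcome. Again, the names are illustrative, not HDFS code:

```java
import java.util.HashSet;
import java.util.Set;

// Toy namespace showing the non-idempotent case.
public class CreateSketch {
    private final Set<String> files = new HashSet<>();

    void create(String src) {
        if (!files.add(src)) {
            throw new IllegalStateException("already exists: " + src);
        }
    }

    public static void main(String[] args) {
        CreateSketch ns = new CreateSketch();
        ns.create("/tmp/f");           // first attempt succeeds, but suppose
                                       // the client never saw the reply
        try {
            ns.create("/tmp/f");       // naive retry
            System.out.println("retry succeeded");
        } catch (IllegalStateException e) {
            System.out.println("retry failed"); // outcome differs from the original
        }
    }
}
```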

          Ops which are more administrative, and arguably shouldn't be in ClientProtocol:

          • setSafeMode
          • saveNamespace
          • restoreFailedStorage
          • refreshNodes
          • finalizeUpgrade
          • distributedUpgradeProgress
          • metaSave
          • setBalancerBandwidth

          Comments are most welcome. In particular, I'm interested in:

          1. Does anyone have any insight about the three operations I'm not sure about? We could play it safe at first and not mark them idempotent. Doing so will result in a few more client failures during an NN failover, but will not result in correctness issues.
          2. What to do about the 8 operations which seem administrative in nature?
          3. Did I misidentify any of the write operations which I believe to be idempotent?

          Once we hash these things out a little bit, I'll go ahead and make a patch for it.

          tlipcon Todd Lipcon added a comment -

          I agree with your assessment of the write ops. I think the three you weren't sure about are all idempotent:

          • getAdditionalDatanode just picks a datanode and doesn't modify any state. It's probably marked WRITE since it's used by the file write path.
          • If recoverLease is retried after a success, it will return an error code that the lease is already recovered. This is already possible if the lease expires just before the caller calls recoverLease. So callers already need to know how to deal with that case regardless of failover/HA.
          • updateBlockForPipeline will allocate a new generation stamp if it's called a second time, but that's OK - we just need to make sure genstamps increase, not that they increase exactly once.
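
The genstamp argument can be sketched in a few lines: a retried allocation burns an extra stamp, but every stamp handed out is still strictly greater than the previous one, which is the only invariant that matters here. A hypothetical sketch, not the NN's actual implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the generation-stamp argument for updateBlockForPipeline:
// stamps must increase monotonically, not be allocated exactly once.
public class GenStampSketch {
    private final AtomicLong genStamp = new AtomicLong(1000);

    long nextGenerationStamp() {
        return genStamp.incrementAndGet();
    }

    public static void main(String[] args) {
        GenStampSketch nn = new GenStampSketch();
        long first = nn.nextGenerationStamp();  // original call
        long retry = nn.nextGenerationStamp();  // retried after failover
        System.out.println(retry > first);      // true: monotonicity holds
    }
}
```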

          As for the administrative ones:

          • setSafeMode, saveNamespace, restoreFailedStorage, refreshNodes, and metaSave are all under the scope of one NN (even in an HA setup). That is to say, they would be targeted at a specific IP, not at a logical cluster, I believe.
          • I haven't thought much about upgrade yet; I don't know those code paths well. Let's mark those TODO.
          • setBalancerBandwidth seems like a hack that shouldn't really be part of RPC anyway. Let's ignore this one for now too.
          atm Aaron T. Myers added a comment -

          Thanks a lot for the review, Todd. Here's a patch which implements what we've discussed here.

          I added three TODOs for the upgrade-related operations and the setBalancerBandwidth op. I agree with your analysis regarding the administrative ops, so I deliberately left those as non-idempotent.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12498080/HDFS-2393-HDFS-1623.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1350//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1350//console

          This message is automatically generated.

          tlipcon Todd Lipcon added a comment -

          +1

          Show
          atm Aaron T. Myers added a comment -

          Thanks a lot for the review, Todd. Attaching patch rebased on HDFS-1623 branch.

          I'm going to commit this momentarily.

          atm Aaron T. Myers added a comment -

          I've just committed this.


            People

            • Assignee:
              atm Aaron T. Myers
              Reporter:
              atm Aaron T. Myers
            • Votes: 0
            • Watchers: 4
