Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: fs/azure
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      The Azure Blob Storage file system (WASB) now includes optional support for use of the append API by a single writer on a path. Please note that the implementation differs from the semantics of HDFS append. HDFS append internally guarantees that only a single writer may append to a path at a given time. WASB does not enforce this guarantee internally. Instead, the application must enforce access by a single writer, such as by running single-threaded or relying on some external locking mechanism to coordinate concurrent processes. Refer to the Azure Blob Storage documentation page for more details on enabling append in configuration.
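      Enabling append is a configuration opt-in. A minimal sketch of the setting, assuming the property name `fs.azure.enable.append.support` introduced by this change, would be:

      ```xml
      <!-- core-site.xml: opt in to single-writer append support in WASB -->
      <property>
        <name>fs.azure.enable.append.support</name>
        <value>true</value>
      </property>
      ```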

      Description

      Currently the WASB implementation of the HDFS interface does not support the Append API. This JIRA tracks designing and implementing Append API support in WASB. The intended support would allow only a single writer.

      1. Append API.docx
        1.14 MB
        Dushyanth
      2. HADOOP-12635.001.patch
        49 kB
        Dushyanth
      3. HADOOP-12635.002.patch
        50 kB
        Dushyanth
      4. HADOOP-12635.003.patch
        51 kB
        Dushyanth
      5. HADOOP-12635.005.patch
        79 kB
        Dushyanth
      6. HADOOP-12635.006.patch
        79 kB
        Dushyanth
      7. HADOOP-12635.007.patch
        86 kB
        Dushyanth
      8. HADOOP-12635-004.patch
        51 kB
        Dushyanth

        Issue Links

          Activity

          dchickabasapa Dushyanth added a comment -

          Attached is the first draft of the specification for Append.
          dchickabasapa Dushyanth added a comment -

          Attaching the first iteration of Append support in WASB.

          Testing: The patch contains new tests to verify the append support. The changes have also been tested against the FileSystemContractLive tests for both Block Blobs and Page Blobs.
          cnauroth Chris Nauroth added a comment -

          Dushyanth, thank you for the patch. Here are a few comments.

          1. Please let me know if I'm missing something, but it appears there is a fundamental problem in that AzureNativeFileSystemStore#retrieveAppendStream is not atomic. First, it checks the metadata for a prior append lease, and if found, throws an exception. This logic is not coordinated on a lease, so 2 concurrent processes could get past these checks for the same blob, and then call the BlockBlobAppendStream constructor. Then, both processes would call BlockBlobAppendStream#updateBlobAppendMetadata. One process would win the race for the blob lease and set the append lease in metadata. The other process would block (not error) retrying blob lease acquisition. (See SelfRenewingLease constructor.) Eventually, it would acquire the blob lease, and set the append lease for itself in metadata. At this point, there are 2 processes both owning a BlockBlobAppendStream pointing to the same blob. I'm not sure what happens next. Would both processes independently append and commit their own blocks? Whatever happens, it's a violation of single-writer semantics. In HDFS, this sequence is atomic, so it's guaranteed that one of those processes would have acquired the lease and the other would have experienced an exception. Does the whole AzureNativeFileSystemStore#retrieveAppendStream method need to be guarded by the lease?
          2. Please do not start background threads or thread pools from within constructors. This is a pitfall that can lead to tricky edge cases where the background thread sees the object in a partially constructed state. The JVM can even reorder ops in funny ways, making the background thread see seemingly impossible state. Instead, start threads from a separate initialize method that you can call right after the constructor. This will guarantee that the object is in a consistent state before threads observe it. I know SelfRenewingLease and other parts of the Hadoop code start threads from a constructor. It's not a good thing.
          3. APPEND_LEASE_TIMEOUT is 30 seconds and LEASE_RENEWAL_PERIOD is also 30 seconds. That's going to cut it pretty close. If another process tries to append right at the 30-second mark, then the lease renewal might not get ahead of it. FWIW, the HDFS client typically uses a 1-second renewal period and the soft expiration is 60 seconds, beyond which another client may claim the lease.
          4. Various metadata operations like setOwner don't coordinate on the append lease. I think that means any such metadata operations running concurrently with append activity would risk overwriting append_lease_last_modified with an old value, so you could experience lost updates. Maybe this is OK in practice if the renewal period is made more frequent as per above?
          5. The HDFS client previously started a separate renewal thread per lease, much like what BlockBlobAppendStream does here. This eventually became a scalability bottleneck with too many threads in applications that open multiple files concurrently. We evolved to a design of a single lease renewer thread capable of servicing all renewal activity. Let's keep this in mind as a future enhancement if excessive thread counts start to become a problem.
          6. In general, it's not necessary to call toString() explicitly for exception and log messages. Particularly in cases like this:
                LOG.debug("Opening file: {} for append", f.toString());
            

            If you just pass f, then the interface of SLF4J accepts the f instance and only calls toString() internally if debug level is enabled. It probably doesn't matter much for Path#toString, but it can improve performance if the class has a more expensive toString() implementation.

          7. I see style issues like lines that are too long. A pre-commit run would find all such problems.
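          The constructor/initialize split suggested in point 2 can be sketched as follows. Class and method names are illustrative, not the actual BlockBlobAppendStream API:

          ```java
          // Sketch: no thread is started in the constructor, so no thread can
          // ever observe the object in a partially constructed state.
          class LeaseRenewingStream {
            private Thread renewer;
            private volatile boolean initialized;

            LeaseRenewingStream() {
              // Constructor only assigns fields; threads start later.
            }

            /** Must be called once, right after construction, before any I/O. */
            synchronized void initialize() {
              if (initialized) {
                return;
              }
              renewer = new Thread(this::renewLoop, "append-lease-renewer");
              renewer.setDaemon(true);
              renewer.start();
              initialized = true;
            }

            void write(byte[] data) {
              if (!initialized) {
                throw new IllegalStateException("initialize() not called");
              }
              // ... append data ...
            }

            private void renewLoop() {
              // ... periodically renew the append lease ...
            }
          }
          ```

          Every public method guards on the initialized flag, so a caller that forgets the second step fails fast instead of racing the renewer thread.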
          dchickabasapa Dushyanth added a comment -

          Chris Nauroth Thank you very much for the code review.

          I have attached a new patch based on the review comments.

          1) You are right, there is a race condition. I have modified updateBlobAppendMetadata to be a test-and-set operation. This makes the creation of the append stream atomic: if two clients try to open the append stream, one succeeds and the other fails, because the metadata is updated while holding a lease on the blob.
          2) Created an initialize() method for creating the thread pool and the append lease renewer thread; every other public API now checks whether the append stream is initialized.
          3) This was a mistake; I meant to set the lease renewal period to 10 seconds and the timeout to 30 seconds. Keeping the renewal at 1 second would, I think, create bottleneck issues for the blob, so I have kept it at 10 seconds. Let me know if you feel otherwise.
          4) The fix for 3) should address this.
          6) Fixed it.

          cnauroth Chris Nauroth added a comment -

          Dushyanth, thank you for incorporating the feedback.

          I'm finding it challenging to follow the logic of whether or not the test-and-set succeeds based on the updatePossible flag. Instead of tracking it in a flag, I'm wondering if the whole logic could be simplified to the following, which wouldn't require a flag tracked in another variable. Let me know your thoughts.

                  // If there is prior lease metadata, and that lease timeout has not yet
                  // expired, then testCondition must match the lease condition.
                  if (metadata.containsKey(APPEND_LEASE) &&
                      currentTime - Long.parseLong(metadata.get(APPEND_LEASE_LAST_MODIFIED))
                          <= BlockBlobAppendStream.APPEND_LEASE_TIMEOUT &&
                      !metadata.get(APPEND_LEASE).equals(Boolean.toString(testCondition))) {
                    return false;
                  }
                  // Do metadata update here.
                  return true;
          

          I am assuming that uploadMetadata is atomic, so that we will never have a case where metadata exists containing append_lease, but not append_lease_last_modified. If that operation is not in fact atomic, please let me know.

          After addressing that, please feel free to click the Submit Patch button for a pre-commit run.

          cnauroth Chris Nauroth added a comment -

          Dushyanth, I thought of another potential problem with the current patch.

          The code I've seen so far is focused on the append API. I haven't seen any code changes in the create code path, so WASB create calls would not coordinate on the lease with the append calls. This means that it would be possible for process A to open a file for create, and concurrently process B opens the same file for append. Both processes would write data to their respective streams, and I guess ultimately both could flush and commit blocks to Azure Storage. This differs from HDFS, which issues a lease on a path for both create and append calls, thus ensuring single-writer semantics across both create and append. In this same example executing against HDFS, one of the processes would have succeeded, and the other would have received an error.

          If your goal is to match HDFS semantics fully, then the patch would need to do something about the create calls too.

          dchickabasapa Dushyanth added a comment -

          Chris Nauroth The update of blob metadata in append depends on the following conditions:

          1) If append lease metadata is not present on the blob, this is the first client to append, so we update the metadata.
          2) If append lease metadata is present and the timeout has expired, we update the metadata irrespective of the append lease value.
          3) If append lease metadata is present, equals the testCondition value (passed as a parameter), and the timeout has not expired, we update the metadata.
          4) If append lease metadata is present, does not equal the testCondition value, and the timeout has not expired, we do not update the metadata and return false.

          The updatePossible flag enforces these conditions, but you are correct that it is difficult to understand and thus maintain. I have made the change to follow the check you suggested and added comments specifying the conditions explicitly. Adding patch v003.

          Testing same as Patch v002.
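          The four conditions above can be sketched as a single predicate. Constant and key names here are illustrative, not the exact patch code:

          ```java
          import java.util.Map;

          class AppendLeaseCheck {
            static final long APPEND_LEASE_TIMEOUT_MS = 30_000;
            static final String APPEND_LEASE = "append_lease";
            static final String APPEND_LEASE_LAST_MODIFIED = "append_lease_last_modified";

            /** Returns true when the caller may update the append-lease metadata. */
            static boolean canUpdate(Map<String, String> metadata,
                boolean testCondition, long currentTimeMs) {
              // Condition 1: no prior lease metadata -- the first appender wins.
              if (!metadata.containsKey(APPEND_LEASE)) {
                return true;
              }
              long lastModified =
                  Long.parseLong(metadata.get(APPEND_LEASE_LAST_MODIFIED));
              // Condition 2: the prior lease has timed out -- it may be reclaimed.
              if (currentTimeMs - lastModified > APPEND_LEASE_TIMEOUT_MS) {
                return true;
              }
              // Conditions 3 and 4: the lease is live; update only if its value
              // matches testCondition.
              return metadata.get(APPEND_LEASE)
                  .equals(Boolean.toString(testCondition));
            }
          }
          ```

          The single-writer guarantee then rests on evaluating this predicate and writing the metadata while holding the blob lease, so the check and the update are one atomic step.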

          dchickabasapa Dushyanth added a comment -

          Chris Nauroth Sorry I missed your second CR comment. Taking a look at it now.

          dchickabasapa Dushyanth added a comment -

          Chris Nauroth Thanks for that catch. I think in WASB we can solve this problem entirely in append, without touching the create code path, by using the fact that WASB writes Block Blobs to a temporary blob during create. During this write, the metadata of the destination blob contains a link back to the temporary blob where the actual writes are happening. So in append() we can check the blob's metadata for a temporary link and fail if one exists. Do you see any use case I am missing that would fail with this approach?

          cnauroth Chris Nauroth added a comment -

          Dushyanth, that's an interesting thought. I'll need to spend more time reasoning through the possible concurrent execution paths to convince myself fully. I'll keep that in mind while I do a deeper review on patch v003 now.

          I'm going to try a pre-commit run now too.

          dchickabasapa Dushyanth added a comment -

          Chris Nauroth When I said that the create() code flow doesn't have to change, I meant that the logic doesn't have to change; however, some methods inside the AzureNativeFileSystemStore class, like storeMetadataAttribute() and rename(), do need to change to hold a lease on the blob before updating the metadata.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 @author 0m 0s The patch appears to contain 2 @author tags which the community has agreed to not allow in code contributions.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 7m 59s trunk passed
          +1 compile 0m 17s trunk passed with JDK v1.8.0_66
          +1 compile 0m 15s trunk passed with JDK v1.7.0_91
          +1 checkstyle 0m 9s trunk passed
          +1 mvnsite 0m 21s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 0m 29s trunk passed
          +1 javadoc 0m 13s trunk passed with JDK v1.8.0_66
          +1 javadoc 0m 13s trunk passed with JDK v1.7.0_91
          +1 mvninstall 0m 15s the patch passed
          +1 compile 0m 14s the patch passed with JDK v1.8.0_66
          +1 javac 0m 14s the patch passed
          +1 compile 0m 14s the patch passed with JDK v1.7.0_91
          +1 javac 0m 14s the patch passed
          +1 checkstyle 0m 9s the patch passed
          +1 mvnsite 0m 18s the patch passed
          +1 mvneclipse 0m 8s the patch passed
          -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          -1 findbugs 0m 38s hadoop-tools/hadoop-azure introduced 1 new FindBugs issues.
          +1 javadoc 0m 12s the patch passed with JDK v1.8.0_66
          +1 javadoc 0m 11s the patch passed with JDK v1.7.0_91
          -1 unit 1m 12s hadoop-azure in the patch failed with JDK v1.8.0_66.
          -1 unit 1m 19s hadoop-azure in the patch failed with JDK v1.7.0_91.
          +1 asflicense 0m 18s Patch does not generate ASF License warnings.
          16m 19s



          Reason Tests
          FindBugs module:hadoop-tools/hadoop-azure
            org.apache.hadoop.fs.azure.BlockBlobAppendStream.updateBlobAppendMetadata(boolean, boolean) calls Thread.sleep() with a lock held at BlockBlobAppendStream.java:[line 576]
          JDK v1.8.0_66 Failed junit tests hadoop.fs.azure.TestNativeAzureFileSystemAppend
          JDK v1.7.0_91 Failed junit tests hadoop.fs.azure.TestNativeAzureFileSystemAppend



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12781683/HADOOP-12635.003.patch
          JIRA Issue HADOOP-12635
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux e8985f25a30c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / c2e2e13
          Default Java 1.7.0_91
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
          findbugs v3.0.0
          whitespace https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/artifact/patchprocess/whitespace-eol.txt
          findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/artifact/patchprocess/new-findbugs-hadoop-tools_hadoop-azure.html
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.8.0_66.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.7.0_91.txt
          unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.7.0_91.txt
          JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/testReport/
          modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
          Max memory used 76MB
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8387/console

          This message was automatically generated.

          dchickabasapa Dushyanth added a comment -

          Fixing the whitespace, @author, and FindBugs errors from v003.

          Testing same as v003.

          Chris Nauroth the newly added tests for append are getting executed in the QA run. These are live-only tests; is there a way to mark them as live tests so they are not run as part of the QA run?

          cnauroth Chris Nauroth added a comment -

          Dushyanth, TestNativeAzureFileSystemAppend extends NativeAzureFileSystemTestBase. If you look at the setUp method in the base class, it already takes care of skipping the tests if not running against the live service. I think the problem is that TestNativeAzureFileSystemAppend overrides setUp but never calls back up to super.setUp() to trigger this logic.

          In your next patch revision, I recommend removing the setUp and tearDown methods from TestNativeAzureFileSystemAppend. I think the base class implementation does everything you need. The FileSystem instance is exposed in the base class as a protected member named fs, so you can refer to that in your test cases. I don't think you need to track the FileSystem or the AzureBlobStorageTestAccount in members of the subclass.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 7m 32s trunk passed
          +1 compile 0m 15s trunk passed with JDK v1.8.0_66
          +1 compile 0m 15s trunk passed with JDK v1.7.0_91
          +1 checkstyle 0m 8s trunk passed
          +1 mvnsite 0m 22s trunk passed
          +1 mvneclipse 0m 11s trunk passed
          +1 findbugs 0m 29s trunk passed
          +1 javadoc 0m 13s trunk passed with JDK v1.8.0_66
          +1 javadoc 0m 14s trunk passed with JDK v1.7.0_91
          +1 mvninstall 0m 16s the patch passed
          +1 compile 0m 11s the patch passed with JDK v1.8.0_66
          +1 javac 0m 11s the patch passed
          +1 compile 0m 13s the patch passed with JDK v1.7.0_91
          +1 javac 0m 13s the patch passed
          +1 checkstyle 0m 9s the patch passed
          +1 mvnsite 0m 19s the patch passed
          +1 mvneclipse 0m 9s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          -1 findbugs 0m 39s hadoop-tools/hadoop-azure introduced 1 new FindBugs issues.
          +1 javadoc 0m 10s the patch passed with JDK v1.8.0_66
          +1 javadoc 0m 12s the patch passed with JDK v1.7.0_91
          -1 unit 1m 5s hadoop-azure in the patch failed with JDK v1.8.0_66.
          -1 unit 1m 19s hadoop-azure in the patch failed with JDK v1.7.0_91.
          +1 asflicense 0m 20s Patch does not generate ASF License warnings.
          15m 47s



          Reason Tests
          FindBugs module:hadoop-tools/hadoop-azure
            org.apache.hadoop.fs.azure.BlockBlobAppendStream.updateBlobAppendMetadata(boolean, boolean) calls Thread.sleep() with a lock held At BlockBlobAppendStream.java:a lock held At BlockBlobAppendStream.java:[line 579]
          JDK v1.8.0_66 Failed junit tests hadoop.fs.azure.TestNativeAzureFileSystemAppend
          JDK v1.7.0_91 Failed junit tests hadoop.fs.azure.TestNativeAzureFileSystemAppend



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12781962/HADOOP-12635-004.patch
          JIRA Issue HADOOP-12635
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 2799c3f618d0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fbb5868
          Default Java 1.7.0_91
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/artifact/patchprocess/new-findbugs-hadoop-tools_hadoop-azure.html
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.8.0_66.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.7.0_91.txt
          unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/artifact/patchprocess/patch-unit-hadoop-tools_hadoop-azure-jdk1.7.0_91.txt
          JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/testReport/
          modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
          Max memory used 76MB
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8395/console

          This message was automatically generated.

          cnauroth Chris Nauroth added a comment -

          Dushyanth, I thought more about your proposal around the create problem. I'd like to point out a few more things about HDFS lease semantics:

          • If process A holds a lease and process B concurrently calls delete, then the delete is allowed and process A's lease is removed.
          • Similarly, if process A holds a lease and process B concurrently calls create with the OVERWRITE option, then process B wins and process A's lease is removed. (This is like delete + create.)
          • Also similarly, if process A holds a lease and process B concurrently calls rename with the OVERWRITE option, then the rename is allowed and process A's lease is removed. (This is like delete + rename.)

          Suppose an append is in progress to a WASB file in process A, and another process B performs one of the above actions concurrently. Is Azure Storage going to allow process A to continue writing and committing blocks, even though process A is no longer really operating on the same file that it started with?

          According to this document on Managing Concurrency in Microsoft Azure Storage, the default concurrency model is last-one-wins. The document then describes options for pessimistic locking (leases, which we use for a limited subset of operations) or optimistic locking (ETag and If-Then, which we don't use anywhere). If both process A and process B are allowed to proceed writing blocks, then the end result is likely to be a meaningless file. In some cases, that could be severe enough that end users would classify it as data loss.

          I cannot think of any complete solution to this problem other than adding comprehensive locking across these operations. That would be a much bigger change than what has been proposed in the current patch revisions. That also might harm performance of WASB if applications need to make extra remote calls for lease acquisition and run additional lease renewer threads. (The optimistic locking strategy might not suffer this problem.)

          I think the strategy of trying to use metadata checks as a guard is incomplete, because it doesn't cover the possibility that another process, unaware of what happened, might continue trying to write blocks at the same blob path. If there is in fact some other protection mechanism in Azure Storage that I'm unaware of and WASB is taking advantage of it, please let me know. Thanks!
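
          To make the optimistic-locking option concrete, here is a minimal in-memory Java analogy of ETag + If-Match conditional writes, expressed as compare-and-swap. This is a sketch only, not Azure Storage SDK code; the class name, the version strings, and the two-process scenario are invented for illustration. A conditional write succeeds only if the blob is unchanged since the writer last read it, which is exactly what detects the lost-update conflict described above.

          ```java
          import java.util.concurrent.atomic.AtomicReference;

          public class EtagCasSketch {
              public static void main(String[] args) {
                  // The AtomicReference stands in for the blob; its current value
                  // plays the role of the ETag observed at read time.
                  AtomicReference<String> blob = new AtomicReference<>("v1");

                  String readByA = blob.get(); // process A reads and remembers the "ETag"
                  blob.set("v2");              // process B overwrites concurrently

                  // Process A's conditional write: only apply if the blob still
                  // matches what A read (ETag + If-Match semantics).
                  boolean aWrite = blob.compareAndSet(readByA, "v1+append");

                  System.out.println("aWrite=" + aWrite);    // false: A detects the conflict
                  System.out.println("value=" + blob.get()); // B's write survives
              }
          }
          ```

          Under last-one-wins, A's append would have silently clobbered B's write; with the conditional check, A fails fast and can re-read before retrying.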

          cnauroth Chris Nauroth added a comment - - edited

          Thank you for patch v004. Here is some additional code review feedback.

          The BlockBlobAppendStream#ioQueue member seems unnecessary. It is only used for initialization of ioThreadPool, and after that, all remaining logic interacts with ioThreadPool instead of ioQueue. The ioQueue is something you can create locally inside initialize.

          In BlockBlobAppendStream#close:

              if (closed) {
                leaseFreed = true;
                return;
              }
          

          Is it necessary to set leaseFreed here, or can the line be removed? I think it can be removed, because initial execution of close would have called cleanup, which sets leaseFreed to true.

          BlockBlobAppendStream#generateBlockId has a small typo in an exception message: "invaid" should be "invalid".

          I looked at the code for com.microsoft.azure.storage.core.Base64 and com.microsoft.azure.storage.core.Utility. Both have the warning "RESERVED FOR INTERNAL USE" at the top. I'd prefer to avoid calling Azure Storage SDK internals that are not considered public API, and therefore might change incompatibly between releases. For base-64 encoding, the rest of the Hadoop code uses org.apache.commons.codec.binary.Base64. If you'd like to see sample usage, one example is in SaslRpcServer. For Utility#getBytesFromLong, this looks like something trivial to clone a copy into the BlockBlobAppendStream class, with a comment added to indicate the code was copied from there.

              /**
               * Returns a byte array that represents the data of a <code>long</code> value.
               * 
               * @param value
               *            The value from which the byte array will be returned.
               * 
               * @return A byte array that represents the data of the specified <code>long</code> value.
               */
              public static byte[] getBytesFromLong(final long value) {
                  final byte[] tempArray = new byte[8];
          
                  for (int m = 0; m < 8; m++) {
                      tempArray[7 - m] = (byte) ((value >> (8 * m)) & 0xFF);
                  }
          
                  return tempArray;
              }
          

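          Putting the two suggestions together, a block-id generator along these lines would avoid the SDK internals. This is a hypothetical reconstruction, not the patch's actual code: it uses JDK java.util.Base64 for portability here, whereas the review recommends org.apache.commons.codec.binary.Base64 (whose Base64.encodeBase64String gives the same result) for consistency with the rest of Hadoop. Block blob block IDs must be base-64 strings of equal length, which encoding the fixed 8-byte sequence number guarantees.

          ```java
          import java.util.Base64;

          public class BlockIdSketch {
              // Logic cloned from com.microsoft.azure.storage.core.Utility#getBytesFromLong:
              // big-endian byte encoding of a long.
              static byte[] getBytesFromLong(final long value) {
                  final byte[] tempArray = new byte[8];
                  for (int m = 0; m < 8; m++) {
                      tempArray[7 - m] = (byte) ((value >> (8 * m)) & 0xFF);
                  }
                  return tempArray;
              }

              // Hypothetical block-id generation: every ID is the base-64 encoding
              // of an 8-byte sequence number, so all IDs have equal length.
              static String generateBlockId(long sequenceNumber) {
                  return Base64.getEncoder().encodeToString(getBytesFromLong(sequenceNumber));
              }

              public static void main(String[] args) {
                  String id = generateBlockId(1L);
                  System.out.println(id);          // 12-character base-64 string
                  System.out.println(id.length());
              }
          }
          ```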
          PageBlobOutputStream#close has some code that dumps stack traces of stuck threads if there is an I/O timeout. This has been helpful for troubleshooting. I recommend doing the same for BlockBlobAppendStream#close, possibly refactoring to share the code.

          When you create ioThreadPool, I recommend supplying a custom ThreadFactory that sets the name of returned threads to something like "BlockBlobAppendStream-<key>-<sequential ID>". That will help identify these threads while troubleshooting with jstack and similar tools. I see PageBlobOutputStream isn't doing this right now either. (No need to fix it in scope of this patch. It can be fixed later.)
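
          A custom ThreadFactory of the kind suggested might look like the following. This is a sketch with invented names ("mykey", NamedThreadFactorySketch), not the patch's code; the point is that every pool thread carries the blob key and a sequential ID in its name, so it is easy to spot in a jstack dump.

          ```java
          import java.util.concurrent.Callable;
          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;
          import java.util.concurrent.Future;
          import java.util.concurrent.ThreadFactory;
          import java.util.concurrent.atomic.AtomicInteger;

          public class NamedThreadFactorySketch {
              // Factory naming threads "BlockBlobAppendStream-<key>-<sequential ID>".
              static ThreadFactory namedFactory(final String key) {
                  final AtomicInteger seq = new AtomicInteger(0);
                  return new ThreadFactory() {
                      public Thread newThread(Runnable r) {
                          Thread t = new Thread(r);
                          t.setName("BlockBlobAppendStream-" + key + "-" + seq.incrementAndGet());
                          t.setDaemon(true);
                          return t;
                      }
                  };
              }

              public static void main(String[] args) throws Exception {
                  ExecutorService pool = Executors.newFixedThreadPool(2, namedFactory("mykey"));
                  Future<String> name = pool.submit(new Callable<String>() {
                      public String call() {
                          return Thread.currentThread().getName();
                      }
                  });
                  System.out.println(name.get());
                  pool.shutdown();
              }
          }
          ```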

          In BlockBlobAppendStream.WriteRequest#run:

                    try {
                      Thread.sleep(BLOCK_UPLOAD_RETRY_INTERVAL);
                    } catch(InterruptedException ie) {
                      // Swalling the exception here
                    }
          

          Swallowing InterruptedException is an anti-pattern that is unfortunately prevalent throughout the existing Hadoop codebase, but let's not continue doing it for new code. Let's restore interrupted status by calling Thread.currentThread().interrupt() and break out of the loop.
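
          The recommended pattern can be sketched like this; the method name and the simulated-upload logic are invented to mirror WriteRequest#run, not copied from the patch. The key lines are restoring the interrupted status and breaking out of the retry loop instead of swallowing the exception.

          ```java
          public class RetrySketch {
              static final long BLOCK_UPLOAD_RETRY_INTERVAL = 100; // ms, assumed

              // Returns the number of attempts made before success or interruption.
              static int uploadWithRetry(int failuresBeforeSuccess) {
                  int attempts = 0;
                  while (true) {
                      attempts++;
                      if (attempts > failuresBeforeSuccess) {
                          break; // simulated successful upload
                      }
                      try {
                          Thread.sleep(BLOCK_UPLOAD_RETRY_INTERVAL);
                      } catch (InterruptedException ie) {
                          // Restore interrupted status and stop retrying,
                          // rather than swallowing the exception.
                          Thread.currentThread().interrupt();
                          break;
                      }
                  }
                  return attempts;
              }

              public static void main(String[] args) {
                  // Interrupt ourselves so the first sleep throws immediately.
                  Thread.currentThread().interrupt();
                  int attempts = uploadWithRetry(5);
                  System.out.println("attempts=" + attempts);
                  System.out.println("interrupted=" + Thread.interrupted());
              }
          }
          ```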

          In NativeAzureFileSystem#append, the exception handling has the same kind of problem that we noticed during the HADOOP-12551 code review. If the caught exception is a StorageException, but not a FileNotFoundException, then the original exception is swallowed instead of propagated to the caller.

          Please address the Findbugs warning in BlockBlobAppendStream#updateBlobAppendMetadata. It looks like you'll need to move the sleep outside the synchronized block.

          cnauroth Chris Nauroth added a comment -

          I have one more code review note. There are a few more member variables in BlockBlobAppendStream that would be good to mark as final:

          • key
          • bufferSize
          • blob
          • opContext
          • uncommittedBlockEntries
          • sequenceGenerator
          • ioThreadPool
          cnauroth Chris Nauroth added a comment -

          It turns out I misinterpreted something in the design document. There is a statement about supporting "a single writer". I interpreted this to mean that the goal of the feature was to match HDFS single-writer enforcement semantics exactly. Instead, the goal is to support append for applications that can guarantee a single writer themselves. WASB will not provide this enforcement internally. It becomes a responsibility of the application either to ensure single-threaded handling for a particular file path, or rely on some external locking mechanism of its own.

          Exactly matching HDFS single-writer semantics might be a nice future goal, but this would be a huge undertaking. It would nearly be a full rewrite with reconsideration of locking strategies inside the entire API footprint. It also will be challenging to achieve this without harming current performance.

          Based on this, I think we can proceed with a patch similar to what has been provided already, pending resolution of my code review feedback. Usage of the API will carry some risk, so I propose that we guard it behind a configuration flag, which would reject append calls with an error unless explicitly enabled. Let's also call out the risks in documentation and make it clear to end users that this differs from HDFS in that single-writer enforcement is an application responsibility with WASB.

          dchickabasapa Dushyanth added a comment -

          Chris Nauroth Thanks for the review and feedback.

          I have addressed all the feedback in v005. The append support is now guarded by "fs.azure.enable.append.support" configuration flag.
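
          A hedged example of enabling the flag in core-site.xml — the property name is as stated above, and it is assumed to default to false so that append stays rejected unless explicitly enabled:

          ```xml
          <property>
            <name>fs.azure.enable.append.support</name>
            <value>true</value>
          </property>
          ```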

          Testing: Tested with the tests added for the Append API and the contract live tests for block blobs and page blobs.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 7m 38s trunk passed
          +1 compile 0m 15s trunk passed with JDK v1.8.0_66
          +1 compile 0m 17s trunk passed with JDK v1.7.0_91
          +1 checkstyle 0m 9s trunk passed
          +1 mvnsite 0m 23s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 30s trunk passed
          +1 javadoc 0m 14s trunk passed with JDK v1.8.0_66
          +1 javadoc 0m 16s trunk passed with JDK v1.7.0_91
          +1 mvninstall 0m 16s the patch passed
          +1 compile 0m 11s the patch passed with JDK v1.8.0_66
          +1 javac 0m 11s the patch passed
          +1 compile 0m 14s the patch passed with JDK v1.7.0_91
          +1 javac 0m 14s the patch passed
          -1 checkstyle 0m 8s Patch generated 2 new checkstyle issues in hadoop-tools/hadoop-azure (total was 66, now 59).
          +1 mvnsite 0m 19s the patch passed
          +1 mvneclipse 0m 9s the patch passed
          -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          -1 findbugs 0m 38s hadoop-tools/hadoop-azure introduced 1 new FindBugs issues.
          +1 javadoc 0m 11s the patch passed with JDK v1.8.0_66
          +1 javadoc 0m 11s the patch passed with JDK v1.7.0_91
          +1 unit 1m 4s hadoop-azure in the patch passed with JDK v1.8.0_66.
          +1 unit 1m 16s hadoop-azure in the patch passed with JDK v1.7.0_91.
          +1 asflicense 0m 17s Patch does not generate ASF License warnings.
          15m 57s



          Reason Tests
          FindBugs module:hadoop-tools/hadoop-azure
            Found reliance on default encoding in org.apache.hadoop.fs.azure.BlockBlobAppendStream.generateBlockId():in org.apache.hadoop.fs.azure.BlockBlobAppendStream.generateBlockId(): new String(byte[]) At BlockBlobAppendStream.java:[line 468]



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12782600/HADOOP-12635.005.patch
          JIRA Issue HADOOP-12635
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 9eccfdce903d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5d5a22a
          Default Java 1.7.0_91
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-azure.txt
          whitespace https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/artifact/patchprocess/whitespace-eol.txt
          findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/artifact/patchprocess/new-findbugs-hadoop-tools_hadoop-azure.html
          JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/testReport/
          modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
          Max memory used 76MB
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 0s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. +1 mvninstall 7m 38s trunk passed +1 compile 0m 15s trunk passed with JDK v1.8.0_66 +1 compile 0m 17s trunk passed with JDK v1.7.0_91 +1 checkstyle 0m 9s trunk passed +1 mvnsite 0m 23s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 30s trunk passed +1 javadoc 0m 14s trunk passed with JDK v1.8.0_66 +1 javadoc 0m 16s trunk passed with JDK v1.7.0_91 +1 mvninstall 0m 16s the patch passed +1 compile 0m 11s the patch passed with JDK v1.8.0_66 +1 javac 0m 11s the patch passed +1 compile 0m 14s the patch passed with JDK v1.7.0_91 +1 javac 0m 14s the patch passed -1 checkstyle 0m 8s Patch generated 2 new checkstyle issues in hadoop-tools/hadoop-azure (total was 66, now 59). +1 mvnsite 0m 19s the patch passed +1 mvneclipse 0m 9s the patch passed -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. -1 findbugs 0m 38s hadoop-tools/hadoop-azure introduced 1 new FindBugs issues. +1 javadoc 0m 11s the patch passed with JDK v1.8.0_66 +1 javadoc 0m 11s the patch passed with JDK v1.7.0_91 +1 unit 1m 4s hadoop-azure in the patch passed with JDK v1.8.0_66. +1 unit 1m 16s hadoop-azure in the patch passed with JDK v1.7.0_91. +1 asflicense 0m 17s Patch does not generate ASF License warnings. 
15m 57s Reason Tests FindBugs module:hadoop-tools/hadoop-azure   Found reliance on default encoding in org.apache.hadoop.fs.azure.BlockBlobAppendStream.generateBlockId():in org.apache.hadoop.fs.azure.BlockBlobAppendStream.generateBlockId(): new String(byte[]) At BlockBlobAppendStream.java: [line 468] Subsystem Report/Notes Docker Image:yetus/hadoop:0ca8df7 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12782600/HADOOP-12635.005.patch JIRA Issue HADOOP-12635 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 9eccfdce903d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 5d5a22a Default Java 1.7.0_91 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-azure.txt whitespace https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/artifact/patchprocess/whitespace-eol.txt findbugs https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/artifact/patchprocess/new-findbugs-hadoop-tools_hadoop-azure.html JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/testReport/ modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure Max memory used 76MB Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8421/console This message was automatically generated.
          dchickabasapa Dushyanth added a comment -

          Fixing the FindBugs, Checkstyle and whitespace errors.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 7m 49s trunk passed
          +1 compile 0m 14s trunk passed with JDK v1.8.0_66
          +1 compile 0m 18s trunk passed with JDK v1.7.0_91
          +1 checkstyle 0m 9s trunk passed
          +1 mvnsite 0m 23s trunk passed
          +1 mvneclipse 0m 40s trunk passed
          +1 findbugs 0m 29s trunk passed
          +1 javadoc 0m 12s trunk passed with JDK v1.8.0_66
          +1 javadoc 0m 15s trunk passed with JDK v1.7.0_91
          +1 mvninstall 0m 16s the patch passed
          +1 compile 0m 11s the patch passed with JDK v1.8.0_66
          +1 javac 0m 11s the patch passed
          +1 compile 0m 16s the patch passed with JDK v1.7.0_91
          +1 javac 0m 16s the patch passed
          +1 checkstyle 0m 9s the patch passed
          +1 mvnsite 0m 19s the patch passed
          +1 mvneclipse 0m 9s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 35s the patch passed
          +1 javadoc 0m 9s the patch passed with JDK v1.8.0_66
          +1 javadoc 0m 12s the patch passed with JDK v1.7.0_91
          +1 unit 1m 5s hadoop-azure in the patch passed with JDK v1.8.0_66.
          +1 unit 1m 21s hadoop-azure in the patch passed with JDK v1.7.0_91.
          +1 asflicense 0m 18s Patch does not generate ASF License warnings.
          16m 47s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12782622/HADOOP-12635.006.patch
          JIRA Issue HADOOP-12635
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 4cb7ce485b91 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2a30386
          Default Java 1.7.0_91
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
          findbugs v3.0.0
          JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8423/testReport/
          modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
          Max memory used 76MB
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8423/console

          This message was automatically generated.

          cnauroth Chris Nauroth added a comment -

          Thank you for the new revision, Dushyanth.

          I had previously asked for ioThreadPool to be marked final, but I see now that it isn't possible.

          I have a few more items remaining. Much of this is nit-picky at this point. We're getting close.

          Mark BlockBlobAppendStream#LOG and NativeAzureFileSystemHelper#LOG as private.

          The comment's indentation here has an extra space:

              try {
                 //Set the append lease if the value of the append lease is false
                if (!updateBlobAppendMetadata(true, false)) {
          

          There is a typo on "SelfRenewingLease" in this comment:

                    // Swallowing exception here as the lease is cleaned up by the SeleRenewingLease object.
          

          Please add the @Override annotation to UploaderThreadFactory#newThread.
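The effect of that annotation can be sketched with a stand-in factory (the class name, thread-name prefix, and daemon setting below are illustrative, not the patch's actual UploaderThreadFactory): with @Override in place, the compiler rejects any accidental mismatch with ThreadFactory#newThread's signature.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the patch's UploaderThreadFactory.
class UploaderThreadFactorySketch implements ThreadFactory {
  private final AtomicInteger threadNumber = new AtomicInteger(0);

  @Override  // compiler now verifies this really overrides ThreadFactory#newThread
  public Thread newThread(Runnable r) {
    Thread t = new Thread(r, "Azure-Uploader-" + threadNumber.incrementAndGet());
    t.setDaemon(true); // background upload threads should not block JVM exit
    return t;
  }
}
```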

          The JavaDocs for the new overload of StorageInterface#uploadMetadata don't mention the additional parameters. Since everything else in StorageInterface has thorough JavaDocs, would you please correct that and also add JavaDocs for all of the new methods? That's downloadBlockList, uploadBlock and commitBlockList.

          Could you please move TestNativeAzureFileSystemAppend#setUp to the top of the class? It's customary to put the @Before method at the top, so that's where people will expect to find it.

          TestNativeAzureFileSystemAppend#createBaseFileWithData is at risk of leaking an open file if the write call throws an exception before the chance to call close. Please use either try-finally or try-with-resources to guarantee the close call happens. I also see the same problem in a few of the @Test methods, so please fix it throughout the whole class.
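The requested pattern can be sketched with plain java.io streams (the test's actual code uses Hadoop's FSDataOutputStream; the helper name here is hypothetical): try-with-resources guarantees close() runs on both the normal and the exceptional path.

```java
import java.io.IOException;
import java.io.OutputStream;

class CloseGuaranteeSketch {
  // Writes data and guarantees close() even if write() throws,
  // mirroring the fix requested for createBaseFileWithData.
  static void writeAndClose(OutputStream out, byte[] data) throws IOException {
    try (OutputStream stream = out) {
      stream.write(data);
    } // stream.close() is invoked here whether or not write() threw
  }
}
```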

          I recommend using Configuration#setBoolean here, so that you won't need to take care of the string conversion:

              conf.set(NativeAzureFileSystem.APPEND_SUPPORT_ENABLE_PROPERTY_NAME, Boolean.toString(true));
          

          In TestNativeAzureFileSystemAppend#testSingleAppenderScenario, please add a call to GenericTestUtils.assertExceptionContains("Unable to set Append lease on the Blob.", ioe);. This will guarantee that the IOException was thrown for the expected reason, and it's not some other unrelated I/O error.
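The check that GenericTestUtils.assertExceptionContains performs can be sketched with the standard library alone (a minimal equivalent, not Hadoop's actual implementation; the message string is the one quoted above):

```java
import java.io.IOException;

class ExceptionMessageCheckSketch {
  // Minimal equivalent of GenericTestUtils.assertExceptionContains:
  // fail unless the throwable's message contains the expected fragment.
  static void assertExceptionContains(String expected, Throwable t) {
    String msg = t.getMessage();
    if (msg == null || !msg.contains(expected)) {
      throw new AssertionError("Expected message containing \"" + expected
          + "\" but got: " + msg);
    }
  }
}
```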

          Please add a test that sets the configuration property to false and then asserts that calling append results in an exception.

          Please update src/site/markdown/index.md. We'll want to mention that WASB has optional support for append by setting the configuration property. We'll also want to call out in scary bold letters that it differs from HDFS append semantics. The application has responsibility to enforce a single writer externally, and failure to do so will result in unexpected behavior.
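For the documentation, the opt-in would be shown as a configuration entry. Assuming NativeAzureFileSystem.APPEND_SUPPORT_ENABLE_PROPERTY_NAME resolves to fs.azure.enable.append.support (an assumption here, not stated in this thread), a core-site.xml fragment would look like:

```xml
<!-- Hypothetical sketch: the property name is assumed to be
     fs.azure.enable.append.support. Enabling it does NOT provide
     HDFS-style single-writer enforcement; the application must
     coordinate writers externally. -->
<property>
  <name>fs.azure.enable.append.support</name>
  <value>true</value>
</property>
```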

          cnauroth Chris Nauroth added a comment -

          One more thing: please make NativeAzureFileSystemHelper package-private (remove public) and add the @InterfaceAudience.Private annotation.

          dchickabasapa Dushyanth added a comment -

          Chris Nauroth Thanks for the feedback.

          I have addressed the comments in v007.

          Testing: I have added a new test to verify the behavior of append when the configuration flag is set to false, plus the testing done for v006.

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 7m 38s trunk passed
          +1 compile 0m 13s trunk passed with JDK v1.8.0_66
          +1 compile 0m 15s trunk passed with JDK v1.7.0_91
          +1 checkstyle 0m 8s trunk passed
          +1 mvnsite 0m 21s trunk passed
          +1 mvneclipse 0m 11s trunk passed
          +1 findbugs 0m 29s trunk passed
          +1 javadoc 0m 11s trunk passed with JDK v1.8.0_66
          +1 javadoc 0m 14s trunk passed with JDK v1.7.0_91
          +1 mvninstall 0m 15s the patch passed
          +1 compile 0m 11s the patch passed with JDK v1.8.0_66
          +1 javac 0m 11s the patch passed
          +1 compile 0m 13s the patch passed with JDK v1.7.0_91
          +1 javac 0m 13s the patch passed
          +1 checkstyle 0m 9s the patch passed
          +1 mvnsite 0m 18s the patch passed
          +1 mvneclipse 0m 9s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 34s the patch passed
          +1 javadoc 0m 9s the patch passed with JDK v1.8.0_66
          +1 javadoc 0m 11s the patch passed with JDK v1.7.0_91
          +1 unit 1m 1s hadoop-azure in the patch passed with JDK v1.8.0_66.
          +1 unit 1m 16s hadoop-azure in the patch passed with JDK v1.7.0_91.
          +1 asflicense 0m 17s Patch does not generate ASF License warnings.
          15m 25s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12782688/HADOOP-12635.007.patch
          JIRA Issue HADOOP-12635
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 6a839f942263 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2a30386
          Default Java 1.7.0_91
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
          findbugs v3.0.0
          JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/8426/testReport/
          modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
          Max memory used 76MB
          Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/8426/console

          This message was automatically generated.

          cnauroth Chris Nauroth added a comment -

          +1 for patch v007. I'm going to make a couple of minor cosmetic modifications myself before the commit instead of asking for another patch revision.

          TestNativeAzureFileSystemAppend#verifyAppend has a typo in a comment, which I will correct.

                  } catch(IOException ioe) {
                    //Swalling
                  }
          

          index.md missed a period to break up two sentences.

          file path, or rely on some external locking mechanism of its own failure to do so will result in
          unexpected behavior.
          

          I will change this to:

          file path, or rely on some external locking mechanism of its own.  Failure to do so will result in
          unexpected behavior.
          
          cnauroth Chris Nauroth added a comment -

          I have committed this to trunk, branch-2 and branch-2.8. Dushyanth, thank you for the contribution.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9132 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9132/)
          HADOOP-12635. Adding Append API support for WASB. Contributed by (cnauroth: rev 8bc93db2e7c64830b6a662f28c8917a9eef4e7c9)

          • hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/TestNativeAzureFileSystemAppend.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/StorageInterface.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/BlockBlobAppendStream.java
          • hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/MockStorageInterface.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/PageBlobOutputStream.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemHelper.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/AzureNativeFileSystemStore.java
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeFileSystemStore.java
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-tools/hadoop-azure/src/site/markdown/index.md
          • hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/StorageInterfaceImpl.java
          bograd Bogdan Raducanu added a comment -

          Any plans to make append work for page blobs?

          cnauroth Chris Nauroth added a comment -

          Bogdan Raducanu, I'm not aware of any work in progress on append support for page blobs. If you're interested in that, then I'd recommend filing a new JIRA to track it.


            People

            • Assignee: dchickabasapa Dushyanth
            • Reporter: dchickabasapa Dushyanth
            • Votes: 0
            • Watchers: 10
