  1. Apache Ozone
  2. HDDS-5626 Track and Address Flaky tests
  3. HDDS-8024

When readChunk from a datanode fails, retry other datanodes.


Details

    Description

      ITestOzoneContractCreate.testSyncable  Time elapsed: 0.309 s  <<< ERROR!
      java.io.IOException: Inconsistent read for blockID=conID: 1 locID: 111677748019200007 bcsId: 0 length=2 position=1 numBytesToRead=1 numBytesRead=-1
      	at org.apache.hadoop.ozone.client.io.KeyInputStream.checkPartBytesRead(KeyInputStream.java:175)
      	at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:97)
      	at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:54)
      	at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:41)
      	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:55)
      	at java.io.FilterInputStream.read(FilterInputStream.java:83)
      	at org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:548)
      	at org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459)
      
      java.io.IOException: Inconsistent read for blockID=conID: 1 locID: 111677748019200001 bcsId: 0 length=1120 position=97 numBytesToRead=1023 numBytesRead=-1
      	at org.apache.hadoop.ozone.client.io.KeyInputStream.checkPartBytesRead(KeyInputStream.java:175)
      	at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:97)
      	at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:54)
      	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:64)
      	at java.io.DataInputStream.read(DataInputStream.java:149)
      	at org.apache.hadoop.fs.ozone.TestHSync.runTestHSync(TestHSync.java:188)
      	at org.apache.hadoop.fs.ozone.TestHSync.runTestHSync(TestHSync.java:143)
      	at org.apache.hadoop.fs.ozone.TestHSync.testO3fsHSync(TestHSync.java:116)
      

      I guess these two different tests both exercise HSync.

      CC weichiu, szetszwo


      A datanode may fail to serve a readChunk request, but the client currently does not retry the read on other datanodes. This JIRA changes the client to retry on other datanodes when readChunk fails.
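
      The intended behavior can be illustrated with a minimal, self-contained sketch. It is not the Ozone client code: `ReadChunkRetrySketch`, `ChunkReader`, `readWithFailover`, and the datanode names are hypothetical stand-ins, and the sketch only assumes that each datanode in the list can serve the same chunk.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch of datanode failover for a chunk read; not the Ozone API. */
public class ReadChunkRetrySketch {

  /** Hypothetical stand-in for a readChunk call against one datanode. */
  public interface ChunkReader {
    byte[] readChunk(String datanode) throws IOException;
  }

  /**
   * Tries each datanode in order and returns the first successful read.
   * If every datanode fails, throws one IOException that carries the
   * per-datanode failures as suppressed exceptions.
   */
  public static byte[] readWithFailover(List<String> datanodes, ChunkReader reader)
      throws IOException {
    IOException failure = null;
    for (String dn : datanodes) {
      try {
        return reader.readChunk(dn);
      } catch (IOException e) {
        // Remember the failure and move on to the next datanode.
        if (failure == null) {
          failure = new IOException("readChunk failed on all datanodes: " + datanodes);
        }
        failure.addSuppressed(e);
      }
    }
    if (failure == null) {
      throw new IOException("No datanodes available for readChunk");
    }
    throw failure;
  }

  public static void main(String[] args) throws IOException {
    List<String> datanodes = List.of("dn1", "dn2", "dn3");
    // Simulate dn1 failing to serve the chunk; dn2 then serves it.
    byte[] data = readWithFailover(datanodes, dn -> {
      if (dn.equals("dn1")) {
        throw new IOException("simulated readChunk failure on " + dn);
      }
      return new byte[] {42};
    });
    System.out.println("read " + data.length + " byte(s)");
  }
}
```

      With failover of this kind, a single datanode failing to serve readChunk would not by itself surface to the caller; an error is raised only after every candidate datanode has been tried.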

            People

              szetszwo Tsz-wo Sze
              adoroszlai Attila Doroszlai