  1. Apache Ozone
  2. HDDS-5626 Track and Address Flaky tests
  3. HDDS-8024

When readChunk from a datanode fails, retry other datanodes.


Details

    Description

      ITestOzoneContractCreate.testSyncable  Time elapsed: 0.309 s  <<< ERROR!
      java.io.IOException: Inconsistent read for blockID=conID: 1 locID: 111677748019200007 bcsId: 0 length=2 position=1 numBytesToRead=1 numBytesRead=-1
      	at org.apache.hadoop.ozone.client.io.KeyInputStream.checkPartBytesRead(KeyInputStream.java:175)
      	at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:97)
      	at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:54)
      	at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:41)
      	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:55)
      	at java.io.FilterInputStream.read(FilterInputStream.java:83)
      	at org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:548)
      	at org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459)
      
      java.io.IOException: Inconsistent read for blockID=conID: 1 locID: 111677748019200001 bcsId: 0 length=1120 position=97 numBytesToRead=1023 numBytesRead=-1
      	at org.apache.hadoop.ozone.client.io.KeyInputStream.checkPartBytesRead(KeyInputStream.java:175)
      	at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:97)
      	at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:54)
      	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:64)
      	at java.io.DataInputStream.read(DataInputStream.java:149)
      	at org.apache.hadoop.fs.ozone.TestHSync.runTestHSync(TestHSync.java:188)
      	at org.apache.hadoop.fs.ozone.TestHSync.runTestHSync(TestHSync.java:143)
      	at org.apache.hadoop.fs.ozone.TestHSync.testO3fsHSync(TestHSync.java:116)
      

      I guess these two different tests both exercise HSync.

      CC Wei-Chiu Chuang, Tsz-wo Sze


      A datanode may fail to serve readChunk, but the client currently does not retry the read on other datanodes. This JIRA changes the client to retry readChunk on other datanodes when one fails.
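
The retry behavior described above could look roughly like the following. This is only a minimal sketch, not the actual Ozone client change: `ReadChunkRetrySketch`, `ChunkReader`, and `readWithFailover` are hypothetical names made up for illustration, and datanodes are represented as plain strings. It tries each datanode in order and only fails once every one of them has failed.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class ReadChunkRetrySketch {

    /** Hypothetical stand-in for a per-datanode chunk read; not an Ozone API. */
    interface ChunkReader {
        byte[] readChunk(String datanode) throws IOException;
    }

    /**
     * Tries each datanode in order and returns the first successful read.
     * If every datanode fails, the last failure is rethrown with the earlier
     * failures attached as suppressed exceptions.
     */
    static byte[] readWithFailover(List<String> datanodes, ChunkReader reader)
            throws IOException {
        IOException last = null;
        for (String dn : datanodes) {
            try {
                return reader.readChunk(dn);
            } catch (IOException e) {
                // Remember the failure and move on to the next datanode.
                if (last != null) {
                    e.addSuppressed(last);
                }
                last = e;
            }
        }
        if (last == null) {
            throw new IOException("No datanodes available for read");
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        List<String> datanodes = Arrays.asList("dn1", "dn2", "dn3");
        // Simulate dn1 failing to serve readChunk; dn2 serves it.
        byte[] data = readWithFailover(datanodes, dn -> {
            if (dn.equals("dn1")) {
                throw new IOException("readChunk failed on " + dn);
            }
            return new byte[] {42};
        });
        System.out.println(data[0]);
    }
}
```

Keeping the earlier failures as suppressed exceptions means that, if all datanodes fail, the error seen by the caller still shows what went wrong on each one.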

          People

            szetszwo Tsz-wo Sze
            adoroszlai Attila Doroszlai