  1. Apache Ozone
  2. HDDS-6462 Phase II : Erasure Coding Offline Recovery & Read/Write Improvements
  3. HDDS-7303

EC: ECBlockReconstructedStripeInputStream should set initialized false on re-init


Details

    Description

      In ECBlockReconstructedStripeInputStream, when an exception occurs while reading a block, the code calls the `init()` method to set up the missing indexes and buffers again.

      If an InsufficientLocationsException is thrown partway through that method, the class is left partly re-initialized. If a caller then catches or ignores that exception and calls read again, the results can be unpredictable. In one case we got an ArrayIndexOutOfBoundsException, which I think is related to the above:

      Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
      	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.assignBuffers(ECBlockReconstructedStripeInputStream.java:289)
      	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.read(ECBlockReconstructedStripeInputStream.java:360)
      	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readStripe(ECBlockReconstructedStripeInputStream.java:345)
      	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.readStripe(ECBlockReconstructedInputStream.java:214)
      	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.readAndSeekStripe(ECBlockReconstructedInputStream.java:198)
      	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.seek(ECBlockReconstructedInputStream.java:192)
      	at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.seek(ECBlockInputStreamProxy.java:224)
      	at org.apache.hadoop.ozone.client.io.KeyInputStream.seek(KeyInputStream.java:340)
      	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.seek(OzoneFSInputStream.java:78)
      	at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:85)
      	at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:124)
      	at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:116)
      	at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readStripeFooter(RecordReaderUtils.java:273)
      	at org.apache.orc.impl.RecordReaderImpl.readStripeFooter(RecordReaderImpl.java:308)
      	at org.apache.orc.impl.RecordReaderImpl.beginReadStripe(RecordReaderImpl.java:1131)
      	at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1093)
      	at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1261)
      	at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1296)
      	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1332)
      	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:157)
      	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:175)
      	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:171)
      	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:871)
      	... 26 more
      

      We should simply set initialized to false at the beginning of init() and set it to true at the end, once the full init method has completed.
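
      The proposed fix can be illustrated with a minimal sketch. This is not the actual Ozone code: `ReinitSketch`, `StripeReader`, `failLocationAndReinit()` and `restoreLocation()` are hypothetical names, and the exception class here is a local stand-in. It only shows the idea of clearing the flag on entry to init() and setting it as the last step, so that a failed re-init forces a full init() on the next read instead of continuing with partial state.

```java
import java.io.IOException;

class ReinitSketch {

  /** Hypothetical stand-in for the exception named in the description. */
  static class InsufficientLocationsException extends IOException {
    InsufficientLocationsException(String message) {
      super(message);
    }
  }

  /** Hypothetical, heavily simplified reader; not the real Ozone class. */
  static class StripeReader {
    private final int required;   // locations needed to read a stripe
    private int available;        // locations currently usable
    private boolean initialized = false;

    StripeReader(int required, int available) {
      this.required = required;
      this.available = available;
    }

    boolean isInitialized() {
      return initialized;
    }

    /** Simulates a read failure on one location, followed by a re-init. */
    void failLocationAndReinit() throws InsufficientLocationsException {
      available--;
      init();
    }

    /** Simulates a location becoming usable again. */
    void restoreLocation() {
      available++;
    }

    private void init() throws InsufficientLocationsException {
      // Clear the flag first: if anything below throws, the object is
      // left marked as uninitialized rather than half set up.
      initialized = false;
      if (available < required) {
        throw new InsufficientLocationsException(
            "need " + required + " locations, have " + available);
      }
      // ... index and buffer setup would happen here ...
      initialized = true;  // only reached when the whole method succeeded
    }

    /** Returns the number of locations read; re-runs init() if needed. */
    int read() throws IOException {
      if (!initialized) {
        init();
      }
      return required;
    }
  }

  public static void main(String[] args) throws IOException {
    StripeReader reader = new StripeReader(3, 3);
    reader.read();
    try {
      reader.failLocationAndReinit();
    } catch (InsufficientLocationsException e) {
      System.out.println(
          "re-init failed, initialized=" + reader.isInitialized());
    }
  }
}
```

      With this ordering, a caller that swallows the exception and calls read again hits init() a second time and gets the same InsufficientLocationsException, rather than reaching buffer assignment with stale indexes.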

            People

              sodonnell Stephen O'Donnell