[HDDS-7303] EC: ECBlockReconstructedStripeInputStream should set initialized false on re-init - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.3.0
Component/s: EC Client
Labels:
- pull-request-available

Description

In ECBlockReconstructedStripeInputStream, when an exception occurs while reading a block, the code calls the `init()` method to set up the missing indexes and buffers again.

If an InsufficientLocations exception is thrown partway through that method, the class ends up partly re-initialized while still being marked as initialized. If a caller then ignores or handles the InsufficientLocations exception and calls read again, the stream operates on inconsistent state and can produce strange results. In one case, we get an ArrayIndexOutOfBoundsException, which I think is related to the above:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.assignBuffers(ECBlockReconstructedStripeInputStream.java:289)
	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.read(ECBlockReconstructedStripeInputStream.java:360)
	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readStripe(ECBlockReconstructedStripeInputStream.java:345)
	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.readStripe(ECBlockReconstructedInputStream.java:214)
	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.readAndSeekStripe(ECBlockReconstructedInputStream.java:198)
	at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.seek(ECBlockReconstructedInputStream.java:192)
	at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.seek(ECBlockInputStreamProxy.java:224)
	at org.apache.hadoop.ozone.client.io.KeyInputStream.seek(KeyInputStream.java:340)
	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.seek(OzoneFSInputStream.java:78)
	at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:85)
	at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:124)
	at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:116)
	at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readStripeFooter(RecordReaderUtils.java:273)
	at org.apache.orc.impl.RecordReaderImpl.readStripeFooter(RecordReaderImpl.java:308)
	at org.apache.orc.impl.RecordReaderImpl.beginReadStripe(RecordReaderImpl.java:1131)
	at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1093)
	at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1261)
	at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1296)
	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1332)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:157)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:175)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:171)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:871)
	... 26 more

We should simply set initialized to false at the beginning of `init()` and set it to true at the end, once the full init method has completed.
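The proposed fix can be illustrated with a minimal Java sketch. This is not the actual Ozone code: `ReinitSketch`, `StripeReader`, `setAvailableLocations`, and the nested `InsufficientLocationsException` are hypothetical stand-ins invented for illustration. The only point it shows is the flag handling: `initialized` is cleared before any state is touched and set again only after `init()` has fully completed, so a failed re-init can never be mistaken for a usable state.

```java
import java.io.IOException;

public class ReinitSketch {

  /** Hypothetical stand-in for the InsufficientLocations exception in the description. */
  static class InsufficientLocationsException extends IOException {
    InsufficientLocationsException(String message) {
      super(message);
    }
  }

  /** Simplified, hypothetical reader; it only models the initialized flag. */
  static class StripeReader {
    private final int requiredLocations;
    private int availableLocations;
    private byte[][] buffers;
    private boolean initialized = false;

    StripeReader(int requiredLocations, int availableLocations) {
      this.requiredLocations = requiredLocations;
      this.availableLocations = availableLocations;
    }

    void setAvailableLocations(int availableLocations) {
      this.availableLocations = availableLocations;
    }

    boolean isInitialized() {
      return initialized;
    }

    void init() throws InsufficientLocationsException {
      // Clear the flag first: from here on the state may be inconsistent.
      initialized = false;
      // State is modified before the check below can fail.
      buffers = new byte[availableLocations][];
      if (availableLocations < requiredLocations) {
        throw new InsufficientLocationsException(
            "need " + requiredLocations + " locations, have " + availableLocations);
      }
      for (int i = 0; i < buffers.length; i++) {
        buffers[i] = new byte[4];
      }
      // Only a fully completed init marks the reader as usable.
      initialized = true;
    }

    int read() throws IOException {
      if (!initialized) {
        // A previously failed init is retried instead of reading stale state.
        init();
      }
      return buffers.length;
    }
  }

  public static void main(String[] args) throws IOException {
    StripeReader reader = new StripeReader(3, 3);
    System.out.println("buffers after first read: " + reader.read());

    reader.setAvailableLocations(2);
    try {
      reader.init();
    } catch (InsufficientLocationsException e) {
      System.out.println("re-init failed, initialized = " + reader.isInitialized());
    }

    reader.setAvailableLocations(3);
    System.out.println("buffers after recovery: " + reader.read());
  }
}
```

With this ordering, a caller that swallows the exception and calls `read()` again triggers a fresh `init()` rather than indexing into partly rebuilt buffers.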

Issue Links

is duplicated by

HDDS-7278 [ozone-LR] ArrayIndexOutOfBoundsException in EC Buckets with fault injections

Resolved

links to

GitHub Pull Request #3816

People

Assignee: Stephen O'Donnell

Reporter: Stephen O'Donnell

Dates

Created: 10/Oct/22 12:02

Updated: 15/Oct/22 07:06

Resolved: 15/Oct/22 07:06