Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
When reading a key smaller than one chunk with 3 of the 5 nodes stopped (whether or not the nodes have been marked stale yet), the read hangs for some time and then fails with:
$ ozone sh key get /vol1/bucket/ec1 /tmp/3_down
java.lang.IllegalStateException
    at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:33)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.selectParityIndexes(ECBlockReconstructedStripeInputStream.java:432)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.init(ECBlockReconstructedStripeInputStream.java:179)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readStripe(ECBlockReconstructedStripeInputStream.java:285)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.readStripe(ECBlockReconstructedInputStream.java:192)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.selectNextBuffer(ECBlockReconstructedInputStream.java:109)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.read(ECBlockReconstructedInputStream.java:83)
    at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:156)
    at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:171)
    at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:141)
    at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:57)
    at org.apache.hadoop.ozone.client.io.KeyInputStream.readWithStrategy(KeyInputStream.java:268)
    at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:235)
    at org.apache.hadoop.ozone.client.io.OzoneInputStream.read(OzoneInputStream.java:56)
    at java.base/java.io.InputStream.read(InputStream.java:205)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:94)
    at org.apache.hadoop.ozone.shell.keys.GetKeyHandler.execute(GetKeyHandler.java:88)
    at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:98)
    at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:44)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
    at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
    at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
    at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:96)
    at org.apache.hadoop.ozone.shell.OzoneShell.lambda$execute$0(OzoneShell.java:55)
    at org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:159)
    at org.apache.hadoop.ozone.shell.OzoneShell.execute(OzoneShell.java:53)
    at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:87)
    at org.apache.hadoop.ozone.shell.OzoneShell.main(OzoneShell.java:47)
After the nodes are marked dead and the replicas are no longer present in SCM, we get the expected error immediately:
$ ozone sh key get /vol1/bucket/ec1 /tmp/3_down_dead
There are insufficient datanodes to read the EC block
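In the first case, where the nodes are stopped but not yet dead, we should also fail with a clear error rather than a bare IllegalStateException.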
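One way to get a clearer failure in the stopped-node case is to check replica availability before reconstruction starts, instead of relying on a bare precondition. The sketch below is illustrative only: `EcReadPrecheck`, `checkReadable`, and the nested exception type are hypothetical names, not Ozone's actual classes or the fix that was applied, and the 3 data + 2 parity layout in the usage note is an assumption inferred from the 5-node setup described above.

```java
import java.io.IOException;
import java.util.Set;

/**
 * Hypothetical helper (not part of Ozone): validates that enough EC replica
 * indexes are reachable before a reconstruction read is attempted.
 */
final class EcReadPrecheck {

  private EcReadPrecheck() { }

  /** Hypothetical exception type carrying a descriptive message. */
  static final class InsufficientLocationsException extends IOException {
    InsufficientLocationsException(String message) {
      super(message);
    }
  }

  /**
   * Fails with a descriptive IOException when fewer than dataCount
   * replica indexes (1-based, data first, then parity) are available.
   */
  static void checkReadable(int dataCount, int parityCount,
      Set<Integer> availableIndexes) throws InsufficientLocationsException {
    int usable = 0;
    for (int idx : availableIndexes) {
      // Ignore indexes outside the valid range 1..(data + parity).
      if (idx >= 1 && idx <= dataCount + parityCount) {
        usable++;
      }
    }
    // Reconstruction needs at least dataCount replicas of any kind.
    if (usable < dataCount) {
      throw new InsufficientLocationsException(
          "There are insufficient datanodes to read the EC block: need "
              + dataCount + " replicas, only " + usable + " available");
    }
  }
}
```

With an assumed 3+2 layout and 3 of the 5 nodes stopped, a call such as `EcReadPrecheck.checkReadable(3, 2, Set.of(2, 5))` would throw immediately with the descriptive message, while `Set.of(1, 2, 4)` would pass.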