Apache Ozone / HDDS-3816 Erasure Coding / HDDS-6319

EC: Fix big file read failure with EC policy 10+4.


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: EC-Branch

    Description

      Steps to reproduce:
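
      The report does not say how the dd.2G source file was created; any local file of roughly 2 GiB works, and dd is one way to produce it (the command below is an assumption, not part of the original report):

      dd if=/dev/urandom of=dd.2G bs=1M count=2048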

      ./bin/ozone sh volume create vol1
      ./bin/ozone sh bucket create vol1/bucket1 --layout=FILE_SYSTEM_OPTIMIZED --replication=rs-10-4-1024k --type EC

      ./bin/ozone sh key put /vol1/bucket1/dd.2G dd.2G

      ./bin/ozone sh key get /vol1/bucket1/dd.2G down.2G

      output:

      java.lang.IndexOutOfBoundsException
              at java.nio.ByteBuffer.wrap(ByteBuffer.java:375)
              at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:143)
              at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:57)
              at org.apache.hadoop.ozone.client.io.KeyInputStream.readWithStrategy(KeyInputStream.java:268)
              at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:235)
              at org.apache.hadoop.ozone.client.io.OzoneInputStream.read(OzoneInputStream.java:56)
              at java.io.InputStream.read(InputStream.java:101)
              at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:94)
              at org.apache.hadoop.ozone.shell.keys.GetKeyHandler.execute(GetKeyHandler.java:88)
              at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:98)
              at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:44)
              at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
              at picocli.CommandLine.access$1300(CommandLine.java:145)
              at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
              at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
              at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
              at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
              at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
              at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
              at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:96)
              at org.apache.hadoop.ozone.shell.OzoneShell.lambda$execute$17(OzoneShell.java:55)
              at org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:159)
              at org.apache.hadoop.ozone.shell.OzoneShell.execute(OzoneShell.java:53)
              at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:87)
              at org.apache.hadoop.ozone.shell.OzoneShell.main(OzoneShell.java:47) 

      This is due to an int overflow in KeyInputStream:

      protected synchronized int readWithStrategy(ByteReaderStrategy strategy)
          throws IOException {
        Preconditions.checkArgument(strategy != null);
        checkOpen();
      
        int buffLen = strategy.getTargetLength();
        int totalReadLen = 0;
        while (buffLen > 0) {
          // if we are at the last block and have read the entire block, return
          if (blockStreams.size() == 0 ||
              (blockStreams.size() - 1 <= blockIndex &&
                  blockStreams.get(blockIndex)
                      .getRemaining() == 0)) {
            return totalReadLen == 0 ? EOF : totalReadLen;
          }
      
          // Get the current blockStream and read data from it
          BlockExtendedInputStream current = blockStreams.get(blockIndex);
    int numBytesToRead = Math.min(buffLen, (int)current.getRemaining()); // <-- int overflow
          int numBytesRead = strategy.readFromBlock(current, numBytesToRead);
          if (numBytesRead != numBytesToRead) {
            // This implies that there is either data loss or corruption in the
            // chunk entries. Even EOF in the current stream would be covered in
            // this case.
            throw new IOException(String.format("Inconsistent read for blockID=%s "
                    + "length=%d numBytesToRead=%d numBytesRead=%d",
                current.getBlockID(), current.getLength(), numBytesToRead,
                numBytesRead));
          }
          totalReadLen += numBytesRead;
          buffLen -= numBytesRead;
          if (current.getRemaining() <= 0 &&
              ((blockIndex + 1) < blockStreams.size())) {
            blockIndex += 1;
          }
        }
        return totalReadLen;
      } 

      KeyInputStream is a common code path for both replicated reads and EC reads, but ECBlockInputStream getLength() returns the length of the whole block group, which easily exceeds INT_MAX under an EC policy of 10+4 with the default 256 MB block size:

      256 * 1024 * 1024 * 10 = 2684354560 > INT_MAX (2147483647)
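
      To make the failure concrete, the snippet below is a standalone sketch (not Ozone code; the class name and buffer length are illustrative). It shows how the narrowing cast wraps to a negative length, which ByteBuffer.wrap() then rejects with IndexOutOfBoundsException, and one possible way to avoid it by taking the min in long arithmetic before narrowing. The actual patch that resolved this issue may differ.

      public class NarrowingOverflowSketch {
        public static void main(String[] args) {
          // Remaining bytes of a full rs-10-4 block group with the default 256 MB block size.
          long remaining = 256L * 1024 * 1024 * 10;   // 2684354560 > Integer.MAX_VALUE

          // A typical read buffer length passed down by the caller.
          int buffLen = 4096;

          // The pattern from readWithStrategy(): the cast wraps to -1610612736, Math.min
          // then returns the negative value, and ByteBuffer.wrap() later throws
          // IndexOutOfBoundsException for the negative length.
          int broken = Math.min(buffLen, (int) remaining);

          // Doing the min in long arithmetic first keeps the result within int range,
          // since it can never exceed buffLen.
          int safe = (int) Math.min((long) buffLen, remaining);

          System.out.println("broken = " + broken + ", safe = " + safe);
        }
      }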

            People

              Assignee: Mark Gui
              Reporter: Mark Gui
