HBase
HBASE-11660

Make WAL reader follow contract for java.io.InputStream.available()

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.99.0, 2.0.0, 0.98.6
    • Fix Version/s: None
    • Component/s: wal
    • Labels: None

      Description

      In the process of building support for running HBase on Microsoft Azure HDInsight, I hit an issue in the HBase WAL reading process that took a lot of time to debug. The WAL reading code depends on available() on the log InputStream never returning 0 until end of file. That is not the contract of available() in java.io.InputStream.

      To prevent future grief for others who may want to port HBase onto storage systems other than HDFS, I propose changing the HBase WAL reader so it does not assume that EOF has been reached when available() == 0. It would instead treat available() only as described in InputStream: the number of bytes that can be read from the stream without blocking, which may be 0 even before EOF.
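
      As a minimal sketch only (the class and method names below are hypothetical and are not the attached patch), a reader written against the java.io.InputStream contract would detect EOF from the read itself rather than from available():

        // Hypothetical illustration, not HBase code: end of file is signaled by the
        // read operations themselves (EOFException), never by available() == 0.
        import java.io.DataInputStream;
        import java.io.EOFException;
        import java.io.IOException;
        import java.io.InputStream;

        public class ContractCompliantRecordReader {
          private final DataInputStream in;

          public ContractCompliantRecordReader(InputStream in) {
            this.in = new DataInputStream(in);
          }

          /** Returns the next length-prefixed record, or null at end of file. */
          public byte[] next() throws IOException {
            final int len;
            try {
              len = in.readInt();   // blocks if needed; does not consult available()
            } catch (EOFException eof) {
              return null;          // clean end of file
            }
            byte[] record = new byte[len];
            in.readFully(record);   // a truncated trailing record surfaces as EOFException
            return record;
          }
        }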

        Activity

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12659674/hbase-11660.01.patch
        against trunk revision .
        ATTACHMENT ID: 12659674

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        +1 site. The mvn site goal succeeds with this patch.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.codec.TestCellCodec
        org.apache.hadoop.hbase.codec.TestKeyValueCodecWithTags
        org.apache.hadoop.hbase.codec.TestKeyValueCodec
        org.apache.hadoop.hbase.codec.TestCellCodecWithTags

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10396//console

        This message is automatically generated.

        stack added a comment -

        Submitting the patch to see how it does. Would be cool if we could do without it. I seem to remember it being needed but can't remember why.

        Enis Soztutar added a comment -

        Agreed with Eric and Jeff that we should do a sweep to remove the dependency on available(), since it does not match the documented Java semantics. Great find, Eric!
        I'll try to take a shot at it if I find time.

        Eric Hanson added a comment -

        Thanks for the feedback. I agree with you that a wider evaluation is needed, and a test for a file input stream class that can return 0 before EOF. I'm not able to help with this, at least in the next couple of weeks. Maybe we can keep this open for discussion or if somebody else wants to help.

        Jeffrey Zhong added a comment -

        The issue is that the HBase code assumes InputStream.available() returns an accurate count of the bytes left to read. We have quite a few other places in the IO code that rely on that. Even in the ProtobufLogReader code we have a place like the following:

                  available = this.inputStream.available();
                  LOG.info("available size from inputStream.available()=" + available);
                  if (available > 0 && available < size) {
                    throw new EOFException("Available stream not enough for edit, " +
                        "inputStream.available()= " + this.inputStream.available() + ", " +
                        "entry size= " + size);
                  }
        

        The change will also incur one more IO seek: since we can't rely on available() to exit early, after the change we force a last seek to trigger an EOF exception from the underlying input stream. I'd suggest we do a wider evaluation to see whether it's possible to remove the dependency on available().
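
        Purely as a sketch (assuming the reader records the file length when it opens the WAL; this is not what the attached patch does), that sanity check could be expressed without available():

          // Hypothetical helper, not existing HBase code: checks that an entry of
          // 'size' bytes can still fit in the file, using the length captured at open
          // time and the current stream position instead of available().
          static void checkEntryFits(long fileLength, long position, int size)
              throws java.io.EOFException {
            long remaining = fileLength - position;
            if (remaining > 0 && remaining < size) {
              throw new java.io.EOFException("Remaining bytes in WAL (" + remaining
                  + ") smaller than entry size (" + size + ")");
            }
          }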

        Matteo Bertozzi added a comment -

        If I read it correctly, advance() will now always return true (and the hasNext boolean can be removed).
        But this seems to work only because we know the number of items in the WAL ahead of time, or am I missing something?

        HLogSplitter.java
              for (int i = 0; i < count; i++) {
                // Throw index out of bounds if our cell count is off
                if (!cells.advance()) {

        WALEdit.java
              while (kvs.size() < expectedCount && cellDecoder.advance()) {

        Eric Hanson added a comment -

        I saw another check for available == 0 in the same file targeted by this patch. Perhaps there are more elsewhere. It'd be good for somebody familiar with the WAL splitting/reading process to have a look and see whether more changes are needed to make the code rely only on the available() contract from InputStream.

        Sean Busbey added a comment -

        I don't think those code paths test for the case where the stream for the WAL returns 0 bytes available when it is not at EOF, though. That kind of test is the best way to ensure that as the code base changes we keep accounting for the correct InputStream behavior.
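
        A minimal sketch of such a test stream (the class name is hypothetical; this is not existing test code) could wrap any InputStream and report zero available bytes even while data remains:

          // Hypothetical test helper: mimics a storage system whose streams follow the
          // java.io.InputStream contract literally and may report 0 available bytes
          // long before end of file.
          import java.io.FilterInputStream;
          import java.io.IOException;
          import java.io.InputStream;

          public class ZeroAvailableInputStream extends FilterInputStream {
            public ZeroAvailableInputStream(InputStream in) {
              super(in);
            }

            @Override
            public int available() throws IOException {
              return 0; // legal per the contract, even when more bytes can be read
            }
          }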

        Eric Hanson added a comment -

        The existing WAL pretty printer code exercises this code path. The tests for that, and also for WAL splitting, should cover it.

        Sean Busbey added a comment -

        Thanks for the patch, Eric. Could you include a test as well?

        Eric Hanson added a comment -

        The existing WAL reading code appears to take a hard dependency on the behavior of the available() method in DFSInputStream.java, which looks like this in Hadoop trunk:

          /**
           * Return the size of the remaining available bytes
           * if the size is less than or equal to {@link Integer#MAX_VALUE},
           * otherwise, return {@link Integer#MAX_VALUE}.
           */
          @Override
          public synchronized int available() throws IOException {
            if (closed) {
              throw new IOException("Stream closed");
            }

            final long remaining = getFileLength() - pos;
            return remaining <= Integer.MAX_VALUE ? (int) remaining : Integer.MAX_VALUE;
          }

        This implied contract is different from the one in java.io.InputStream.

        Eric Hanson added a comment -

        Here's a patch that was enough to fix the problem for log files on Azure page blobs and block blobs.

        After quite a lot of effort to track this down, I've worked around it in our Azure file system driver for Hadoop (the WASB driver). Nevertheless, I think that going forward it would be preferable to follow the InputStream contract for available(), which most input streams do.

        That would be good for HBase by letting more people port it to different storage systems more easily.


          People

          • Assignee: Unassigned
          • Reporter: Eric Hanson
          • Votes: 0
          • Watchers: 12
