HBASE-16766

Do not rely on InputStream.available()

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Abandoned
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: wal
    • Labels: None

    Description

      ProtobufLogReader relies on InputStream.available() to determine whether it has exhausted the file. However, the InputStream.available() javadoc states:

           * <p> Note that while some implementations of {@code InputStream} will return
           * the total number of bytes in the stream, many will not.  It is
           * never correct to use the return value of this method to allocate
           * a buffer intended to hold all data in this stream.
      

      HDFS, many other Hadoop file systems, and classes such as ByteBufferInputStream all return the number of remaining bytes, so the code works on top of HDFS. On other file systems, however, InputStream.available() may or may not return the remaining bytes. In one specific case, the ADLS wrapper FS used to implement the available() call with the correct semantics (as the javadoc defines them, not the HDFS behavior), which ended up causing data loss in WAL recovery. We have since fixed ADLS to implement the HDFS semantics, but we should fix HBase itself so that it does not rely on the available() call.
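
      The failure mode described above can be illustrated with a minimal sketch. This is not the code from ProtobufLogReader or from the attached patch; the class EofDetectionSketch, the ZeroAvailableStream wrapper, and the 4-byte length-prefixed record format are all hypothetical, chosen only to show why a loop guarded by available() can silently drop data while a loop that waits for read() to return -1 does not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class EofDetectionSketch {

    /**
     * Hypothetical stream whose available() always reports 0.
     * The InputStream contract permits this: available() is only an
     * estimate of bytes readable without blocking, not "bytes left".
     */
    public static final class ZeroAvailableStream extends FilterInputStream {
        public ZeroAvailableStream(InputStream in) { super(in); }
        @Override public int available() { return 0; }
    }

    /** Writes each payload as a 4-byte big-endian length followed by its bytes. */
    public static byte[] encode(String... payloads) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (String p : payloads) {
            byte[] b = p.getBytes(StandardCharsets.UTF_8);
            out.writeInt(b.length);
            out.write(b);
        }
        return buf.toByteArray();
    }

    /** Fragile: treats available() == 0 as "file exhausted". */
    public static List<byte[]> readWithAvailable(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<byte[]> records = new ArrayList<>();
        while (in.available() > 0) {
            byte[] rec = new byte[in.readInt()];
            in.readFully(rec);
            records.add(rec);
        }
        return records;
    }

    /** More robust: the only end-of-file signal is read() returning -1. */
    public static List<byte[]> readUntilEof(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            int first = in.read();
            if (first < 0) {
                break; // clean end of file at a record boundary
            }
            // Assemble the remaining three bytes of the big-endian length.
            int len = (first << 24)
                    | (in.readUnsignedByte() << 16)
                    | (in.readUnsignedByte() << 8)
                    | in.readUnsignedByte();
            byte[] rec = new byte[len];
            in.readFully(rec); // throws EOFException if the record is truncated
            records.add(rec);
        }
        return records;
    }
}
```

      With three records wrapped in ZeroAvailableStream, readWithAvailable returns an empty list even though the data is present, whereas readUntilEof returns all three. A real WAL reader would also need to decide how to treat a truncated trailing record; this sketch simply lets the EOFException propagate.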

      Attachments

        1. hbase-16766_v1.patch
          9 kB
          Enis Soztutar

          People

            Assignee: Unassigned
            Reporter: Enis Soztutar
            Votes: 0
            Watchers: 7
