Derby
  1. Derby
  2. DERBY-3941

Unsafe use of DataInput.skipBytes() in StoredPage and StoredFieldHeader

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.6.1.0
    • Component/s: Store
    • Labels:
      None
    • Issue & fix info:
      Newcomer

      Description

      Some methods in StoredFileHeader and StoredPage call java.io.DataInput.skipBytes(int) with the assumption that it always skips the requested number of bytes. According to the javadoc for skipBytes, it may skip fewer bytes than requested, possibly 0, even if the end of the stream hasn't been reached.

      The problem exists in these methods:

      StoredFieldHeader.readFieldDataLength()
      StoredPage.readRecordFromStream()
      StoredPage.skipField()
      StoredPage.readOneColumnFromPage()
      StoredPage.readRecordFromArray()

      We should change the code so that it works correctly even if skipBytes() were to skip fewer bytes than requested.

      1. derby-3941-5.stat
        0.1 kB
        Yun Lee
      2. derby-3941-5.diff
        4 kB
        Yun Lee
      3. derby-3941-4.stat
        0.5 kB
        Yun Lee
      4. derby-3941-4.diff
        12 kB
        Yun Lee
      5. derby-3941-3.stat
        0.5 kB
        Yun Lee
      6. derby-3941-3.diff
        14 kB
        Yun Lee
      7. derby-3941-2.stat
        0.5 kB
        Yun Lee
      8. derby-3941-2.diff
        14 kB
        Yun Lee
      9. derby-3941-1.stat
        0.4 kB
        Yun Lee
      10. derby-3941-1.diff
        13 kB
        Yun Lee

        Issue Links

          Activity

          Knut Anders Hatlen created issue -
          Knut Anders Hatlen made changes -
          Field Original Value New Value
          Link This issue is related to DERBY-3770 [ DERBY-3770 ]
          Hide
          Yun Lee added a comment -

          Hi, Knut. I'm interested in this issue, and wonder what's your favorite way to correct this problem.

          1.) to throw a IOException.
          2.) retry to skip for some define times, if failed throw a IOExcetion.

          Besides, there are still many other invocation on skipBytes(), i.e.*ImageReader, BlockDataInputStream, an so on. Should they be correct altogether for this issue? If so, is there a wrapped function to resolve this problem uniformly.

          Thanks!

          Show
          Yun Lee added a comment - Hi, Knut. I'm interested in this issue, and wonder what's your favorite way to correct this problem. 1.) to throw a IOException. 2.) retry to skip for some define times, if failed throw a IOExcetion. Besides, there are still many other invocation on skipBytes(), i.e.*ImageReader, BlockDataInputStream, an so on. Should they be correct altogether for this issue? If so, is there a wrapped function to resolve this problem uniformly. Thanks!
          Hide
          Knut Anders Hatlen added a comment -

          Hi Yun,

          My preferred solution would be to have a variant of org.apache.derby.iapi.services.io.InputStreamUtil.skipFully() that could take a DataInput argument. That method uses skip until 0 is returned, then it uses read() which is guaranteed to block until there is something to read. If read returns -1, an EOFException is thrown. Currently skipFully() is only implemented for InputStream, I think.

          I'm not sure I understand your question about *ImageReader and BlockDataInputStream. Those classes are part of the JDK, aren't they? I didn't find any references to them in the Derby code.

          Show
          Knut Anders Hatlen added a comment - Hi Yun, My preferred solution would be to have a variant of org.apache.derby.iapi.services.io.InputStreamUtil.skipFully() that could take a DataInput argument. That method uses skip until 0 is returned, then it uses read() which is guaranteed to block until there is something to read. If read returns -1, an EOFException is thrown. Currently skipFully() is only implemented for InputStream, I think. I'm not sure I understand your question about *ImageReader and BlockDataInputStream. Those classes are part of the JDK, aren't they? I didn't find any references to them in the Derby code.
          Yun Lee made changes -
          Assignee Yun Lee [ yunlee ]
          Hide
          Yun Lee added a comment - - edited

          Knut, thanks for your advice!

          >My preferred solution would be to have a variant of org.apache.derby.iapi.services.io.InputStreamUtil.skipFully() that could take a DataInput argument. That method uses skip until 0 is returned, then it uses read() which is guaranteed to block until there is something to read. If read returns -1, an EOFException is thrown. Currently skipFully() is only implemented for InputStream, I think.

          With the util, the problem can be resolved easily. I just doubt, skipPersistent() used in InputStreamUtil.skipFully() will cause a new problem on time efficiency, as it's possible to wait a long time for the block finishing.

          >I'm not sure I understand your question about *ImageReader and BlockDataInputStream. Those classes are part of the JDK, aren't they? I didn't find any references to them in the Derby code.

          I'm sorry to have seen the two classes in a careless look at 'call hierachy' window in Eclipse, .

          I will post a patch on this weekend.

          Yun

          Show
          Yun Lee added a comment - - edited Knut, thanks for your advice! >My preferred solution would be to have a variant of org.apache.derby.iapi.services.io.InputStreamUtil.skipFully() that could take a DataInput argument. That method uses skip until 0 is returned, then it uses read() which is guaranteed to block until there is something to read. If read returns -1, an EOFException is thrown. Currently skipFully() is only implemented for InputStream, I think. With the util, the problem can be resolved easily. I just doubt, skipPersistent() used in InputStreamUtil.skipFully() will cause a new problem on time efficiency, as it's possible to wait a long time for the block finishing. >I'm not sure I understand your question about *ImageReader and BlockDataInputStream. Those classes are part of the JDK, aren't they? I didn't find any references to them in the Derby code. I'm sorry to have seen the two classes in a careless look at 'call hierachy' window in Eclipse, . I will post a patch on this weekend. Yun
          Yun Lee made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Hide
          Yun Lee added a comment -

          Knut, I have added a function skipFullyDataInput(DataInput di, int skippedBytes) in InputStreamUtil following by one test case, and applied it to replace the calling to DataInput.skipBytes() in StoredFieldHeader, StoredPage and ClassInvestigator. Also, I have removed unused import clauses and organized the import part too.

          Please check it! Thanks!

          Yun

          Show
          Yun Lee added a comment - Knut, I have added a function skipFullyDataInput(DataInput di, int skippedBytes) in InputStreamUtil following by one test case, and applied it to replace the calling to DataInput.skipBytes() in StoredFieldHeader, StoredPage and ClassInvestigator. Also, I have removed unused import clauses and organized the import part too. Please check it! Thanks! Yun
          Yun Lee made changes -
          Attachment derby-3941-1.diff [ 12405222 ]
          Attachment derby-3941-1.stat [ 12405223 ]
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for the patch, Yun!

          I have some small comments:

          1) skipPersistentDataInput() checks if DataInput.readByte() returns -1 to detect that EOF has been reached, but I think that that method will throw EOFException and not return -1 on EOF

          2) I think I would have renamed skipPersistentDataInput() and skipFullyDataInput() to skipPersistent() and skipFully().

          Show
          Knut Anders Hatlen added a comment - Thanks for the patch, Yun! I have some small comments: 1) skipPersistentDataInput() checks if DataInput.readByte() returns -1 to detect that EOF has been reached, but I think that that method will throw EOFException and not return -1 on EOF 2) I think I would have renamed skipPersistentDataInput() and skipFullyDataInput() to skipPersistent() and skipFully().
          Hide
          Yun Lee added a comment -

          Knut,

          >1) skipPersistentDataInput() checks if DataInput.readByte() returns -1 to detect that EOF has been reached, but I think that that method will throw EOFException and not return -1 on EOF

          I have changed both skipPersistentDataInput() and skipPersistent() to perform like what you need, and changed the document of UTF8Util.internalSkip(final InputStream in, final long charsToSkip) which used the skipPersistent(). I think the new revision can act smartlier. Please check the new patches, thanks!

          >2) I think I would have renamed skipPersistentDataInput() and skipFullyDataInput() to skipPersistent() and skipFully().

          I have tried this before providing the first patches, however, I found it's not able to use overload here, as it will lead to some compiling-time error on ambiguous use, i.e. InputStreamUtil.skipFully(null, int), and InputStreamUtil.skipFully(dis, int) where dis is an instance of DataInputStream.

          Yun

          Show
          Yun Lee added a comment - Knut, >1) skipPersistentDataInput() checks if DataInput.readByte() returns -1 to detect that EOF has been reached, but I think that that method will throw EOFException and not return -1 on EOF I have changed both skipPersistentDataInput() and skipPersistent() to perform like what you need, and changed the document of UTF8Util.internalSkip(final InputStream in, final long charsToSkip) which used the skipPersistent(). I think the new revision can act smartlier. Please check the new patches, thanks! >2) I think I would have renamed skipPersistentDataInput() and skipFullyDataInput() to skipPersistent() and skipFully(). I have tried this before providing the first patches, however, I found it's not able to use overload here, as it will lead to some compiling-time error on ambiguous use, i.e. InputStreamUtil.skipFully(null, int), and InputStreamUtil.skipFully(dis, int) where dis is an instance of DataInputStream. Yun
          Yun Lee made changes -
          Attachment derby-3941-2.diff [ 12405270 ]
          Attachment derby-3941-2.stat [ 12405271 ]
          Hide
          Knut Anders Hatlen added a comment -

          Hi Yun,

          My point about DataInput.readByte() was that -1 is not an indication that the end of the stream has been reached, so it's not correct to throw an EOFException when readByte() returns -1. I think skipFully() will have to be implemented along the lines of this untested code:

          public static void skipFully(DataInput in, int bytesToSkip)
          throws IOException
          {
          if (in == null)

          { throw new NullPointerException(); }

          while (bytesToSkip > 0) {
          int skipped = in.skipBytes(bytesToSkip);
          if (skipped == 0)

          { // No bytes skipped, read one byte to see if EOF has been // reached. DataInput.readByte() will throw an EOFException // if there's nothing more to read. in.readByte(); // Still more data to read. Account for the byte we just read. skipped++; }


          bytesToSkip -= skipped;
          }
          }

          The changes to the javadoc for skipPersistent(InputStream,int) also look wrong, as that method is not supposed to throw an EOFException if end of stream is reached before all the bytes have been skipped. If it reaches end of stream prematurely, it returns the number of bytes actually skipped, just as the old javadoc said.

          As to the compile errors when using the same name for the different variants of skipFully(), it should be possible to disambiguate the method calls by using a cast, like this:
          InputStreamUtil.skipFully((InputStream) null, x);
          or
          InputStreamUtil.skipFully((DataInput) null, x);

          But perhaps the problem here is that we're trying to overload InputStreamUtil too much. Since a DataInput is not (necessarily) an InputStream it might be cleaner to create a new class, DataInputUtil, where we can collect utility methods for DataInputs without conflicting with the utilities for InputStreams.

          Show
          Knut Anders Hatlen added a comment - Hi Yun, My point about DataInput.readByte() was that -1 is not an indication that the end of the stream has been reached, so it's not correct to throw an EOFException when readByte() returns -1. I think skipFully() will have to be implemented along the lines of this untested code: public static void skipFully(DataInput in, int bytesToSkip) throws IOException { if (in == null) { throw new NullPointerException(); } while (bytesToSkip > 0) { int skipped = in.skipBytes(bytesToSkip); if (skipped == 0) { // No bytes skipped, read one byte to see if EOF has been // reached. DataInput.readByte() will throw an EOFException // if there's nothing more to read. in.readByte(); // Still more data to read. Account for the byte we just read. skipped++; } bytesToSkip -= skipped; } } The changes to the javadoc for skipPersistent(InputStream,int) also look wrong, as that method is not supposed to throw an EOFException if end of stream is reached before all the bytes have been skipped. If it reaches end of stream prematurely, it returns the number of bytes actually skipped, just as the old javadoc said. As to the compile errors when using the same name for the different variants of skipFully(), it should be possible to disambiguate the method calls by using a cast, like this: InputStreamUtil.skipFully((InputStream) null, x); or InputStreamUtil.skipFully((DataInput) null, x); But perhaps the problem here is that we're trying to overload InputStreamUtil too much. Since a DataInput is not (necessarily) an InputStream it might be cleaner to create a new class, DataInputUtil, where we can collect utility methods for DataInputs without conflicting with the utilities for InputStreams.
          Hide
          Yun Lee added a comment -

          Knut, I think you are right! DataInput.readByte() does act differently from InputStream.read(), and throws a EOF Exception when EOF is met.

          I have moved the code into a new util class DataInputUtil, and wrote the corresponding test class. Please check it, thanks!

          Yun

          Show
          Yun Lee added a comment - Knut, I think you are right! DataInput.readByte() does act differently from InputStream.read(), and throws a EOF Exception when EOF is met. I have moved the code into a new util class DataInputUtil, and wrote the corresponding test class. Please check it, thanks! Yun
          Yun Lee made changes -
          Attachment derby-3941-3.diff [ 12405317 ]
          Attachment derby-3941-3.stat [ 12405318 ]
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for the new patch. I think it looks good. Two tiny comments:

          1) The license header is missing in the new test class.

          2) Much of the diff in StoredPage is just reorganizing import statements. It's probably better to do that in a separate patch, if we think those statements need to be tidied up.

          Show
          Knut Anders Hatlen added a comment - Thanks for the new patch. I think it looks good. Two tiny comments: 1) The license header is missing in the new test class. 2) Much of the diff in StoredPage is just reorganizing import statements. It's probably better to do that in a separate patch, if we think those statements need to be tidied up.
          Hide
          Yun Lee added a comment -

          Knut. I agree with you, derby-3941-4.diff and derby-3941-4.stat gives change on skipFully(), while derby-3941-5.diff and derby-3941-5.stat gives a pure change on import organize on StoredPage. I think importer reorganization is necessary to keep code clean. Wish for your check.

          Yun

          Show
          Yun Lee added a comment - Knut. I agree with you, derby-3941-4.diff and derby-3941-4.stat gives change on skipFully(), while derby-3941-5.diff and derby-3941-5.stat gives a pure change on import organize on StoredPage. I think importer reorganization is necessary to keep code clean. Wish for your check. Yun
          Yun Lee made changes -
          Attachment derby-3941-4.diff [ 12405649 ]
          Attachment derby-3941-4.stat [ 12405650 ]
          Yun Lee made changes -
          Attachment derby-3941-5.diff [ 12405651 ]
          Attachment derby-3941-5.stat [ 12405652 ]
          Yun Lee made changes -
          Derby Info [Patch Available]
          Hide
          Knut Anders Hatlen added a comment -

          Thanks Yun! I committed patch #5 to trunk (revision 765943). Will take a look at #4 shortly.

          Show
          Knut Anders Hatlen added a comment - Thanks Yun! I committed patch #5 to trunk (revision 765943). Will take a look at #4 shortly.
          Hide
          Knut Anders Hatlen added a comment -

          Committed revision 766163.
          Thanks for your work on this issue, Yun!

          Show
          Knut Anders Hatlen added a comment - Committed revision 766163. Thanks for your work on this issue, Yun!
          Knut Anders Hatlen made changes -
          Status In Progress [ 3 ] Closed [ 6 ]
          Derby Info [Patch Available]
          Fix Version/s 10.6.0.0 [ 12313727 ]
          Resolution Fixed [ 1 ]
          Dag H. Wanvik made changes -
          Component/s Newcomer [ 12310640 ]
          Dag H. Wanvik made changes -
          Issue & fix info [Newcomer]
          Gavin made changes -
          Workflow jira [ 12445811 ] Default workflow, editable Closed status [ 12801642 ]

            People

            • Assignee:
              Yun Lee
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development