Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
There is a bug in libhdfs's hdfsRead:
jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
                    "read", "([B)I", jbRarray);
if (jthr) {
    destroyLocalReference(env, jbRarray);
    errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
                                  "hdfsRead: FSDataInputStream#read");
    return -1;
}
if (jVal.i < 0) {
    // EOF
    destroyLocalReference(env, jbRarray);
    return 0;
} else if (jVal.i == 0) {
    destroyLocalReference(env, jbRarray);
    errno = EINTR;
    return -1;
}
(*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
The method calls FSDataInputStream#read(byte[]) to fill in the Java byte array. However, read(byte[]) is not guaranteed to fill the entire array; it returns the number of bytes it actually wrote to the array, which can be less than the array's length. Yet GetByteArrayRegion copies the entire contents of jbRarray into the buffer, because noReadBytes is initialized to the length of the buffer and is never updated. So whenever read(byte[]) returns fewer bytes than the size of the byte array, the call to GetByteArrayRegion copies more bytes than were actually read, overwriting the caller's buffer past the valid data.
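A minimal sketch of one possible fix, reusing the variables from the snippet above (jVal, jbRarray, buffer); this is an illustration of the idea, not necessarily the patch that was committed: pass the actual return value jVal.i to GetByteArrayRegion instead of noReadBytes, and report that count to the caller.

/* Sketch: copy only the jVal.i bytes that read() actually returned,
   rather than the full buffer length noReadBytes. */
(*env)->GetByteArrayRegion(env, jbRarray, 0, jVal.i, buffer);
destroyLocalReference(env, jbRarray);
/* Return the true byte count, matching POSIX read() semantics. */
return jVal.i;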