Details
- Type: Sub-task
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
Description
Reproducer:
char *buf2 = new char[file_info->mSize];
memset(buf2, 0, (size_t)file_info->mSize);
int ret = hdfsRead(fs, file, buf2, file_info->mSize);
if (ret != file_info->mSize) {
    // fails for multi-block files: the read stops at the first block boundary
}
delete [] buf2;
When run against a file ~1.4GB large, it fails with an error like "tried to read 1468888890 bytes. but read 134217728 bytes". The HDFS cluster it runs against has a block size of 134217728 bytes (128MB), so hdfsRead appears to stop at a block boundary. This looks like a regression. We should retry in a loop so that reads continue across block boundaries for files with multiple blocks.