Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- Reviewed
Description
Deep in the HFile read path, there is this code:
synchronized (in) {
  in.seek(pos);
  ret = in.read(b, off, n);
}
This allows only one read per file to be active at a time, no matter how many threads are reading it. That prevents the OS and hardware from doing IO scheduling across lots of concurrent reads.
We need to either use a reentrant API (pread may be partially reentrant according to Todd) or use multiple stream objects, 1 per scanner/thread.
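A minimal sketch of the positional-read option, for illustration only. It is not HBase or HDFS code: it uses the JDK's `FileChannel.read(ByteBuffer, long)` as a stand-in for a pread-style call, and the class name `PositionalReadSketch` and helper `pread` are hypothetical. Because the offset is passed as an argument, the shared channel's position is never modified, so several threads can read the same file without the `synchronized (in)` block. In Hadoop the analogous call would presumably be the positional `read(long position, byte[] buffer, int offset, int length)` on `FSDataInputStream`; how reentrant that is in practice is exactly the open question raised above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadSketch {

    // Positional read: the offset is an argument, so the channel's own
    // position is left untouched and callers need no external lock.
    public static int pread(FileChannel ch, long pos, byte[] b, int off, int n)
            throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(b, off, n);
        int total = 0;
        while (buf.hasRemaining()) {
            int r = ch.read(buf, pos + total);
            if (r < 0) {
                break; // end of file reached
            }
            total += r;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Build a small test file where byte i holds the value (byte) i.
        Path file = Files.createTempFile("pread-sketch", ".bin");
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(file, data);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            Thread[] readers = new Thread[4];
            boolean[] ok = new boolean[readers.length];
            for (int t = 0; t < readers.length; t++) {
                final int id = t;
                readers[t] = new Thread(() -> {
                    try {
                        byte[] b = new byte[1024];
                        long pos = id * 4096L + id; // a different offset per thread
                        int got = pread(ch, pos, b, 0, b.length);
                        ok[id] = got == b.length
                                && b[0] == (byte) pos
                                && b[b.length - 1] == (byte) (pos + b.length - 1);
                    } catch (IOException e) {
                        ok[id] = false;
                    }
                });
                readers[t].start();
            }
            for (Thread r : readers) {
                r.join(); // join makes each thread's write to ok[] visible here
            }
            for (int t = 0; t < ok.length; t++) {
                System.out.println("reader " + t + " ok=" + ok[t]);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The other option mentioned above, one stream object per scanner/thread, would keep the seek+read pair but give each thread its own position to move, at the cost of more open streams per file.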
Attachments
Issue Links
- blocks: HBASE-1505 [performance] hfile should change how it reads from hdfs -- pread/seek+read -- dependent on recent history (Closed)