Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
Consider supporting:
- 2GB store files
- 1TB per node (500 store files)
- Cell values up to ~100MB
- Typical use case of a region server (RS) running with only 1GB of heap
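To get a rough feel for how tight these targets are, the arithmetic can be sketched as follows. This assumes (unrealistically) that the entire 1GB heap is available for store file indexes or for cell values, so the numbers are upper bounds only. The class and method names are made up for illustration.

```java
// Back-of-the-envelope heap budget for the targets listed above.
// Assumption: the whole heap is available for the item being counted,
// which will not hold in practice, so these are upper bounds.
final class HeapBudget {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Heap bytes available per store file index if the heap held nothing else. */
    static long indexBudgetPerFile(long heapBytes, int storeFiles) {
        return heapBytes / storeFiles;
    }

    /** How many maximum-size cell values fit in the heap at the same time. */
    static long maxCellsInHeap(long heapBytes, long cellBytes) {
        return heapBytes / cellBytes;
    }

    public static void main(String[] args) {
        long perFile = indexBudgetPerFile(1 * GB, 500);
        long cells = maxCellsInHeap(1 * GB, 100 * MB);
        System.out.println("Index budget per store file (bytes): " + perFile);
        System.out.println("Max 100MB cells held in heap at once: " + cells);
    }
}
```

With 500 store files and 1GB of heap this comes to about 2MB of index per store file at most, and only about ten maximum-size cell values in flight at once.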
Some ideas:
- Drop MapFile and make a custom store file format with (competing) design goals:
  - heap efficiency
  - fast lookups
  - minimize I/O operations
  - optimize for typical DFS blocksizes (8MB, 64MB)
- MRU cache for file handles and store file indexes
- Memory-mapped store file indexes: don't hold the indexes in heap; rely on the OS blockcache for performance
- "Zero copy" I/O from IPC to store file and vice versa, using something like NIO buffers
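As a minimal sketch of the memory-mapped index idea, the following maps an index file with `FileChannel.map` and binary-searches it in place, so the index bytes live in the OS cache instead of the Java heap. The index layout (sorted 16-byte entries of a `long` key and a `long` block offset) and all names here are assumptions for illustration, not a proposed file format. Real row keys would be byte arrays, not longs.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedIndexSketch {
    // Hypothetical layout: sorted entries of (long firstKey, long blockOffset).
    static final int ENTRY_SIZE = 16;

    /** Writes a small index file in the hypothetical layout. */
    static void writeIndex(Path path, long[] keys, long[] offsets) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(keys.length * ENTRY_SIZE);
        for (int i = 0; i < keys.length; i++) {
            buf.putLong(keys[i]).putLong(offsets[i]);
        }
        buf.flip();
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    /** Maps the index read-only; the mapping stays valid after the channel is closed. */
    static MappedByteBuffer mapIndex(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    /** Returns the offset of the block with the greatest first key <= target, or -1 if none. */
    static long findBlockOffset(ByteBuffer index, long target) {
        int lo = 0;
        int hi = index.capacity() / ENTRY_SIZE - 1;
        long result = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long key = index.getLong(mid * ENTRY_SIZE); // absolute read, no heap copy of the index
            if (key <= target) {
                result = index.getLong(mid * ENTRY_SIZE + 8);
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("storefile", ".idx");
        writeIndex(path, new long[] {10, 20, 30}, new long[] {0, 4096, 8192});
        MappedByteBuffer index = mapIndex(path);
        System.out.println("Block offset for key 25: " + findBlockOffset(index, 25));
    }
}
```

The trade-off is that lookup latency then depends on whether the OS still has the index pages cached, which is the "rely on the OS" part of the idea above.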