Description
This is a follow-up to the work done by Hieu Huynh in 2019.
Add a new class, HybridKVStore, to make the history server faster when loading event files. When rebuilding the application state from event logs, HybridKVStore first writes data to an in-memory store, while a background thread keeps pushing the changes to LevelDB.
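The idea can be illustrated with a minimal Java sketch. It is not the proposed implementation: `BackingStore`, `MapBackingStore`, and `HybridStoreSketch` are hypothetical names, the disk store is simulated with a map instead of LevelDB, and the sketch assumes that the copy starts only after parsing has finished and that no writes arrive while the copy is running.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a disk-backed store such as LevelDB. */
interface BackingStore {
    void write(String key, String value);
    String read(String key);
}

/** Map-based BackingStore, used only to make the sketch runnable. */
class MapBackingStore implements BackingStore {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    public void write(String key, String value) { data.put(key, value); }
    public String read(String key) { return data.get(key); }
}

/** Serves reads and writes from memory until the data has been copied to disk. */
class HybridStoreSketch {
    private final Map<String, String> inMemory = new ConcurrentHashMap<>();
    private final BackingStore disk;
    private final AtomicBoolean useDisk = new AtomicBoolean(false);

    HybridStoreSketch(BackingStore disk) { this.disk = disk; }

    /** Writes go to memory first, and to disk once the switch has happened. */
    void write(String key, String value) {
        if (useDisk.get()) {
            disk.write(key, value);
        } else {
            inMemory.put(key, value);
        }
    }

    String read(String key) {
        return useDisk.get() ? disk.read(key) : inMemory.get(key);
    }

    boolean isUsingDisk() { return useDisk.get(); }

    /**
     * Copies everything to the backing store on a background thread, then
     * flips the flag and releases the in-memory copy.
     * Assumption: callers do not write while the copy is in progress.
     */
    CompletableFuture<Void> switchToDisk() {
        return CompletableFuture.runAsync(() -> {
            inMemory.forEach(disk::write); // copy every entry to disk
            useDisk.set(true);             // from now on, serve from disk
            inMemory.clear();              // the in-memory store can be dropped
        });
    }
}
```

In this sketch the UI could be served as soon as the in-memory writes finish, while `switchToDisk()` completes later in the background.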
I ran some tests on 3.0.1 on macOS:
KVStore type / log size | 100m | 200m | 500m | 1g | 2g |
---|---|---|---|---|---|
HybridKVStore | 5s to parse, 7s (including parsing time) to switch to LevelDB | 6s to parse, 10s to switch to LevelDB | 15s to parse, 23s to switch to LevelDB | 23s to parse, 40s to switch to LevelDB | 37s to parse, 73s to switch to LevelDB |
LevelDB | 12s to parse | 19s to parse | 43s to parse | 69s to parse | 124s to parse |
For example, when loading a 1g file, HybridKVStore takes 23s to parse, which means users only need to wait 23s to see the UI. The background thread then runs for another 17s to copy the data to LevelDB. After that, the in-memory store can be closed and the entire store lives in LevelDB. Overall, the UI loads roughly 2.4x to 3.4x faster (about 3x) in these tests.
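The ratios behind this estimate can be recomputed from the table. The following is a small sketch of that arithmetic: the class and method names are hypothetical, and the numbers are copied from the table above.

```java
import java.util.Locale;

class SpeedupCheck {
    // Log sizes and timings in seconds, copied from the table above.
    static final String[] SIZES = {"100m", "200m", "500m", "1g", "2g"};
    static final int[] HYBRID_PARSE = {5, 6, 15, 23, 37};
    static final int[] HYBRID_SWITCH = {7, 10, 23, 40, 73};
    static final int[] LEVELDB_PARSE = {12, 19, 43, 69, 124};

    /** Ratio of LevelDB parse time to HybridKVStore parse time. */
    static double speedup(int i) {
        return (double) LEVELDB_PARSE[i] / HYBRID_PARSE[i];
    }

    /** Seconds the background copy keeps running after the UI is available. */
    static int backgroundSeconds(int i) {
        return HYBRID_SWITCH[i] - HYBRID_PARSE[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.println(String.format(Locale.ROOT,
                "%s: %.1fx faster to UI, %ds of background copy",
                SIZES[i], speedup(i), backgroundSeconds(i)));
        }
    }
}
```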
Issue Links
- relates to
  - SPARK-32350 Add batch write support on LevelDB to improve performance of HybridStore (Resolved)