Description
The FullUpdateInitializer follows the MetastoreCacheInitializer. It reads a large amount of information from HMS and passes it around as Thrift structures, but in the end it only constructs a Map<String, Set<String>>. It fetches from HMS concurrently, but synchronizes heavily on shared data structures to update them.
I think that we can refactor all this code to make it faster and consume less memory. The idea is the following:
Use background threads to collect the Thrift results of the HMS calls (database, table, and partition data). A single thread can then construct the resulting update and return it, without passing intermediate Thrift structures around.
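A minimal sketch of this idea, not the actual FullUpdateInitializer code. The class FullUpdateSketch, the Fetched holder, and the buildUpdate method are hypothetical names, and plain Callables stand in for the real HMS client calls. Worker threads only fetch; the calling thread alone builds the Map<String, Set<String>>, so the map needs no locking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FullUpdateSketch {
    /** Hypothetical stand-in for the raw result of one HMS call. */
    static final class Fetched {
        final String authzObj;          // e.g. "db1" or "db1.tbl1"
        final Collection<String> paths; // locations reported for that object

        Fetched(String authzObj, Collection<String> paths) {
            this.authzObj = authzObj;
            this.paths = paths;
        }
    }

    /**
     * Runs all fetch tasks on a thread pool, then merges the results on the
     * calling thread only. The map is never shared between threads.
     */
    static Map<String, Set<String>> buildUpdate(List<Callable<Fetched>> fetches,
                                                int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every fetch has completed or failed.
            List<Future<Fetched>> futures = pool.invokeAll(fetches);
            Map<String, Set<String>> update = new HashMap<>();
            for (Future<Fetched> future : futures) {
                Fetched result = future.get(); // rethrows a fetch failure
                update.computeIfAbsent(result.authzObj, k -> new HashSet<>())
                      .addAll(result.paths);
            }
            return update;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up objects and paths, used only to exercise the sketch.
        List<Callable<Fetched>> fetches = new ArrayList<>();
        fetches.add(() -> new Fetched("db1", Arrays.asList("/warehouse/db1.db")));
        fetches.add(() -> new Fetched("db1.tbl1",
                Arrays.asList("/warehouse/db1.db/tbl1")));
        fetches.add(() -> new Fetched("db1.tbl1",
                Arrays.asList("/warehouse/db1.db/tbl1/part=1")));
        System.out.println(buildUpdate(fetches, 2));
    }
}
```

This version waits for all fetches before merging, which keeps the sketch short. A real implementation could merge results as they arrive (for example with an ExecutorCompletionService) and would need to decide how to handle a failed HMS call instead of aborting the whole update.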
Attachments
Issue Links
- is related to
  - SENTRY-1606 Sentry HDFS Sync should survive in presence of bad paths objects (Resolved)
  - SENTRY-1698 PathsUpdate.parsePath() calls FileSystem.getDefaultUri() way too often (Resolved)
  - SENTRY-1700 FullUpdateInitializer should not use preconditions to verify HMS data (Resolved)