Details
- Type: Bug
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.3.8
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
Our Hive servers are regularly shut down by OOM:
Terminating due to java.lang.OutOfMemoryError: Compressed class space
A heap dump showed that most of the loaded classes (about 98%) were generated by the Janino compiler, and that these generated classes are cached in Calcite's JaninoRelMetadataProvider.
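The class growth described above can also be watched from inside the JVM without taking a heap dump. The following is a minimal sketch that uses only the standard `java.lang.management` API. The class name `ClassSpaceProbe` is hypothetical, and the pool names `Metaspace` and `Compressed Class Space` are the ones HotSpot reports, so other JVMs may name them differently.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class ClassSpaceProbe {
    /** Builds a one-line report of class counts and class-related memory pools. */
    static String report() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("loaded=").append(cl.getLoadedClassCount())
          .append(" totalLoaded=").append(cl.getTotalLoadedClassCount())
          .append(" unloaded=").append(cl.getUnloadedClassCount());

        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Pool names below are HotSpot-specific (assumption about the running JVM).
            if (name.equals("Metaspace") || name.equals("Compressed Class Space")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    // getMax() returns -1 when the pool has no defined maximum.
                    sb.append(' ').append(name).append(" used=").append(usage.getUsed())
                      .append(" max=").append(usage.getMax());
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

If the loaded count keeps climbing across queries while the unloaded count stays flat, that is consistent with generated classes being retained by a cache.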
This cache has no expiration. Each time a query is compiled, the Hive server creates new metadata providers, and the provider is one of the cache keys. As a result, every query generates new metadata classes at runtime and never hits the cache, while the cache keeps growing until the server is terminated by OOM for lack of metaspace.
Before the OOM occurs, the Hive servers also slow down because class loading takes too much time, as the flame graph shows (48% of samples are in class loading).
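The cache-miss behaviour described above can be reproduced with a small model. This is a simplified sketch, not Calcite's actual code: `Provider` is a hypothetical stand-in for a metadata provider that relies on identity equality, and the map stands in for an unbounded cache whose key includes the provider.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;

public class ProviderCacheSketch {
    /** Hypothetical provider; it inherits identity-based equals/hashCode from Object. */
    static final class Provider {}

    /**
     * Runs the given number of queries against an unbounded cache and returns
     * how many entries (that is, generated handler classes) the cache ends up holding.
     */
    static int simulate(int queries, boolean reuseProvider) {
        Map<Map.Entry<Provider, Class<?>>, Object> cache = new HashMap<>();
        Provider shared = new Provider();
        Class<?> relClass = String.class; // stands in for a relational node class

        for (int i = 0; i < queries; i++) {
            // A fresh provider per query yields a key that never equals an earlier key.
            Provider p = reuseProvider ? shared : new Provider();
            Map.Entry<Provider, Class<?>> key =
                new AbstractMap.SimpleImmutableEntry<>(p, relClass);
            // The mapping function stands in for compiling and loading a new class.
            cache.computeIfAbsent(key, k -> new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh provider per query: " + simulate(1000, false)); // 1000
        System.out.println("shared provider:          " + simulate(1000, true));  // 1
    }
}
```

With a fresh provider per query the cache gains one entry per query and never produces a hit. With a shared provider it settles at a single entry.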
I think we can fix this issue by either
a) maintaining a static metadata provider (HIVE-18920)
or
b) making the caches a constant size (https://issues.apache.org/jira/browse/CALCITE-1808)
To apply b), we need to upgrade Calcite to 1.15, which brings in a lot of changes and may be inappropriate for a patch release. b) is also the less efficient solution.
In our production clusters, a) has been proven to prevent both the OOM and the performance degradation.
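The two options can be sketched side by side. This is a minimal illustration under stated assumptions, not the HIVE-18920 or CALCITE-1808 patches: `Provider`, `SHARED_PROVIDER`, and `BoundedCache` are hypothetical names, and the bounded cache is a plain LRU built on `LinkedHashMap`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheFixSketch {
    /** Hypothetical stand-in for a metadata provider. */
    static final class Provider {}

    // Option a): one provider per process, created once and reused by every query,
    // so cache keys stay stable and later queries hit the cache.
    static final Provider SHARED_PROVIDER = new Provider();

    static Provider providerForQuery() {
        return SHARED_PROVIDER;
    }

    // Option b): an LRU cache capped at maxEntries, so it cannot grow without bound.
    static final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        BoundedCache(int maxEntries) {
            super(16, 0.75f, true); // accessOrder=true gives least-recently-used ordering
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    /** Inserts one entry per query with a fresh provider and returns the final cache size. */
    static int fillWithFreshProviders(int queries, int maxEntries) {
        Map<Provider, Object> cache = new BoundedCache<>(maxEntries);
        for (int i = 0; i < queries; i++) {
            cache.put(new Provider(), new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(providerForQuery() == providerForQuery()); // true
        System.out.println(fillWithFreshProviders(1000, 100));        // 100
    }
}
```

In this model, b) caps the cache size but still generates a new entry for every query when providers are not reused, whereas a) removes the repeated generation itself.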
Attachments
Issue Links
- is related to
  - CALCITE-1808 JaninoRelMetadataProvider loading cache might cause OOM error (Closed)
  - HIVE-18920 CBO: Initialize the Janino providers ahead of 1st query (Closed)