Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
1.18.0
-
None
Description
In our production environment, we ran into task deployment failures. The root cause was that a single sub-task had accumulated too many SST files, which made its deployment information (OperatorSubtaskState) too large and caused Akka request timeouts during the task deploy phase. I therefore propose adding sub-task-level metrics for the RocksDB file count, so that performance problems caused by too many SST files can be detected and avoided in time.
RocksDB already exposes this information through its JNI API (https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.html#getColumnFamilyMetaData()), so we can easily retrieve the file count and report it via the metrics reporter.
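A minimal sketch of how such a sub-task-level gauge might look, not the implementation that was merged. The class name `SstFileCountGauge` and its constructor are hypothetical. The RocksDB lookup is abstracted behind a function so the sketch compiles and runs without the native library; the type parameter `H` stands in for `org.rocksdb.ColumnFamilyHandle`.

```java
import java.util.List;
import java.util.function.ToLongFunction;

/**
 * Hypothetical gauge reporting the total SST file count of one sub-task.
 * H stands in for org.rocksdb.ColumnFamilyHandle (an assumption made so
 * that this sketch does not depend on rocksdbjni).
 */
final class SstFileCountGauge<H> {
    private final List<H> columnFamilies;
    private final ToLongFunction<H> fileCounter;

    SstFileCountGauge(List<H> columnFamilies, ToLongFunction<H> fileCounter) {
        // Defensive copy so later changes to the caller's list do not affect the gauge.
        this.columnFamilies = List.copyOf(columnFamilies);
        this.fileCounter = fileCounter;
    }

    /** Sums the file counts of all column families each time the metric is polled. */
    long getValue() {
        long total = 0L;
        for (H columnFamily : columnFamilies) {
            total += fileCounter.applyAsLong(columnFamily);
        }
        return total;
    }
}
```

With rocksdbjni on the classpath, the counter function could plausibly be `handle -> db.getColumnFamilyMetaData(handle).fileCount()`, where `db` is the sub-task's `RocksDB` instance. The gauge would then be registered with the sub-task's metric group so that the reporter picks it up.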
Attachments
Issue Links
- links to