Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- None
- None
- Ozone integration test (randomly observed)
Stack: [0x000000030cea5000,0x000000030cfa5000], sp=0x000000030cfa3620, free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [librocksdbjni2897401891378344213.jnilib+0x20cf9] _Z18rocksdb_put_helperP7JNIEnv_PN7rocksdb2DBERKNS1_12WriteOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayiiSA_ii+0x109
j org.rocksdb.RocksDB.put(JJ[BII[BIIJ)V+0
j org.rocksdb.RocksDB.put(Lorg/rocksdb/ColumnFamilyHandle;Lorg/rocksdb/WriteOptions;[B[B)V+23
j org.apache.hadoop.hdds.utils.db.RocksDatabase.put(Lorg/apache/hadoop/hdds/utils/db/RocksDatabase$ColumnFamily;[B[B)V+25
j org.apache.hadoop.hdds.utils.db.RDBTable.put([B[B)V+14
This can be reproduced in isolation:
1. One thread keeps calling read / write.
2. The main thread closes the DB store.
Description
During Ozone integration tests, the JVM is randomly observed to crash with a RocksDB stack trace.
The crash occurs when a thread in Recon is processing an FCR/ICR report.
Proposed solution:
1. On every DB access in RocksDatabase:
- check isClosed(); if the DB is closed, throw IOException
- increment a counter on method entry and decrement it on exit
2. On RocksDB close:
- set isClosed to true
- wait until the counter reaches 0, rechecking every millisecond
- as a fallback strategy, force the close after 5 seconds
This approach takes no lock, so it should not hurt performance.
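
A minimal sketch of the counter-based proposal above, using only the JDK. The class name `CounterGuardedDb`, the `DbOp` interface, and the `nativeClose` callback are hypothetical stand-ins for illustration and are not the actual RocksDatabase code. One deliberate difference from the steps as listed: the sketch increments the counter before checking the closed flag, on the assumption that this prevents close() from seeing a zero counter in the gap between a caller's check and its increment.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical guard around a native DB handle (illustration only). */
class CounterGuardedDb {
  /** Stand-in for a DB operation such as put or get. */
  interface DbOp<T> {
    T run() throws IOException;
  }

  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final AtomicLong inFlight = new AtomicLong(0);
  private final Runnable nativeClose;   // assumed to release the native handle
  private final long forceCloseMillis;  // e.g. 5000 for the 5 second fallback

  CounterGuardedDb(Runnable nativeClose, long forceCloseMillis) {
    this.nativeClose = nativeClose;
    this.forceCloseMillis = forceCloseMillis;
  }

  /** Runs op only while the DB is open; no lock is taken. */
  <T> T access(DbOp<T> op) throws IOException {
    // Register first, then check the flag, so a concurrent close()
    // either sees this caller in the counter or this caller sees closed.
    inFlight.incrementAndGet();
    try {
      if (closed.get()) {
        throw new IOException("DB is closed");
      }
      return op.run();
    } finally {
      inFlight.decrementAndGet();
    }
  }

  /** Marks the DB closed, waits for in-flight callers, then closes. */
  void close() {
    if (!closed.compareAndSet(false, true)) {
      return; // another caller already closed (or is closing) the DB
    }
    long deadline =
        System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(forceCloseMillis);
    // Recheck the counter every millisecond until it drains or we time out.
    while (inFlight.get() > 0 && System.nanoTime() - deadline < 0) {
      try {
        Thread.sleep(1);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break; // stop waiting and fall through to the forced close
      }
    }
    nativeClose.run(); // forced if callers are still in flight at the deadline
  }
}
```

Note that the forced close after the timeout can still race with an operation that has not finished, so in this sketch the timeout only bounds how long close() waits; it does not make the forced path safe.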
Alternate solution:
1. On every DB access:
- take a read lock
- check isClosed(); if the DB is closed, throw IOException
2. On RocksDB close:
- take a write lock
- set isClosed and close the DB
This solution can become a performance bottleneck because every DB access takes and releases the read lock.
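
For comparison, the lock-based alternative could look roughly like the sketch below. Again, `LockGuardedDb`, `DbOp`, and `nativeClose` are hypothetical names used only for illustration; the sketch assumes the JDK's `ReentrantReadWriteLock` and is not taken from the Ozone code base.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical lock-based guard around a native DB handle (illustration only). */
class LockGuardedDb {
  /** Stand-in for a DB operation such as put or get. */
  interface DbOp<T> {
    T run() throws IOException;
  }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Runnable nativeClose; // assumed to release the native handle
  private boolean closed;             // guarded by lock

  LockGuardedDb(Runnable nativeClose) {
    this.nativeClose = nativeClose;
  }

  /** Many callers may hold the read lock at once; close() excludes them all. */
  <T> T access(DbOp<T> op) throws IOException {
    lock.readLock().lock();
    try {
      if (closed) {
        throw new IOException("DB is closed");
      }
      return op.run();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Blocks until every in-flight access has released its read lock. */
  void close() {
    lock.writeLock().lock();
    try {
      if (closed) {
        return; // already closed
      }
      closed = true;
      nativeClose.run();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Here close() cannot run while any access holds the read lock, so no timeout or forced close is needed. The cost is a read lock acquire and release on every DB call.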