Description
We were able to reproduce the high GC / RegionServer shutdown / high Phoenix KeyRange object count issue on a cluster today.
The main observation is that the issue is reproducible when firing a large number of queries of the form select * from xyz where abc in (?, ?, ...) from a 4.10 Phoenix client against 4.13 Phoenix on the HBase server side
(4.10 client/4.10 server works fine, 4.13 client with 4.13 server works fine)
We wrote a loader client (attached) with the table and query below, upserted ~100 million rows, and fired the query in parallel from 4-5 loader clients with 16 threads each.
TABLE: "CREATE TABLE " + TABLE_NAME_TEMPLATE + " (\n" + " TestKey varchar(255) PRIMARY KEY, TestVal1 varchar(200), TestVal2 varchar(200), " + "TestValue varchar(10000))";
QUERY: "SELECT * FROM " + TABLE_NAME_TEMPLATE + " WHERE TestKey IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
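The attached loader is not reproduced here. As a rough illustration of the access pattern described above, the following is a minimal sketch that assumes plain JDBC and random UUID lookup keys. The class name, method names and key generation are hypothetical and are not taken from the attached client.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a loader; not the attached client.
class InQueryLoaderSketch {

    // Builds "SELECT * FROM <table> WHERE TestKey IN (?, ?, ...)" with the given number of placeholders.
    static String buildInQuery(String tableName, int placeholders) {
        String marks = String.join(", ", Collections.nCopies(placeholders, "?"));
        return "SELECT * FROM " + tableName + " WHERE TestKey IN (" + marks + ")";
    }

    // Runs the 10-key IN query repeatedly on one connection.
    // Assumption: keys are random UUIDs, so most lookups will not match stored rows.
    static void runQueries(String jdbcUrl, String tableName, int iterations) throws SQLException {
        String sql = buildInQuery(tableName, 10);
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (int i = 0; i < iterations; i++) {
                for (int p = 1; p <= 10; p++) {
                    stmt.setString(p, UUID.randomUUID().toString());
                }
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        // Drain the result set; the values themselves are not needed.
                    }
                }
            }
        }
    }

    // Fires the query in parallel from a fixed pool of worker threads (16 per client in the report).
    static void runLoader(String jdbcUrl, String tableName, int threads, int iterations)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    runQueries(jdbcUrl, tableName, iterations);
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

A call such as InQueryLoaderSketch.runLoader("jdbc:phoenix:zk-host:2181", "TEST_TABLE", 16, 100000) would correspond to one loader client with 16 threads. The ZooKeeper host, table name and iteration count are placeholders, and the Phoenix client jar for the version under test needs to be on the classpath.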
Within a minute or two of starting this client, the org.apache.phoenix.query.KeyRange object count climbs to several hundred thousand and keeps increasing. The count does not come down even after the clients are shut down.
-bash-4.1$ ~/current/bigdata-util/tools/Linux/jdk/jdk1.8.0_102_x64/bin/jmap -histo:live 90725 | grep KeyRange
  47:  274852  6596448  org.apache.phoenix.query.KeyRange
1851:       2       48  org.apache.phoenix.query.KeyRange$Bound
2434:       1       24  [Lorg.apache.phoenix.query.KeyRange$Bound;
3411:       1       16  org.apache.phoenix.query.KeyRange$1
3412:       1       16  org.apache.phoenix.query.KeyRange$2
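To track this count over time rather than reading it off by eye, the histogram can be parsed programmatically. The sketch below assumes the usual jmap -histo column layout (rank, instance count, byte count, class name). The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts the instance count of one class from jmap -histo output.
class KeyRangeHistoSketch {

    // Returns the instance count for an exact class name, or -1 if the class is not listed.
    static long instanceCount(List<String> histoLines, String className) {
        for (String line : histoLines) {
            String[] cols = line.trim().split("\\s+");
            // Data rows look like: "47: 274852 6596448 org.apache.phoenix.query.KeyRange"
            if (cols.length >= 4 && cols[0].endsWith(":") && cols[3].equals(className)) {
                return Long.parseLong(cols[1]);
            }
        }
        return -1;
    }

    // Reads histogram text from standard input and prints the KeyRange instance count.
    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        System.out.println(instanceCount(lines, "org.apache.phoenix.query.KeyRange"));
    }
}
```

Piping the output of jmap -histo:live for the process into this program at regular intervals would show whether the count keeps growing after the clients stop.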
After some time we also started seeing high GC activity and RegionServers crashing.
Experiment Summary:
- 4.13 client/4.13 server: issue not reproducible (the KeyRange count rises only to a few hundred)
- 4.10 client/4.10 server: issue not reproducible (the KeyRange count rises only to a few hundred)
- 4.10 client/4.13 server: issue reproducible as described above
Attachments
Issue Links
- Blocked: HBASE-19534 Document risks of RegionObserver.preStoreScannerOpen (Open)
- duplicates: PHOENIX-4451 KeyRange has a very high allocation rate (Resolved)