We found that the RocksDB class performs an additional lookup on the key for every write operation (put/delete) in order to track the number of rows. This is required to populate the state store metric, but benchmarking shows that the performance hit is significant.
We couldn't find a good alternative that retains the number of rows without the additional lookup, so we propose a new config that controls whether the number of rows is tracked.
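To illustrate where the extra cost comes from, here is a minimal sketch in Python. It is not the actual RocksDB class: `SketchStore`, its dict-backed storage, and the `lookups` counter are hypothetical names used only to show why keeping an exact row count forces one read per write.

```python
class SketchStore:
    """Toy stand-in for the RocksDB wrapper; a dict plays the role of the DB."""

    def __init__(self, track_total_rows=True):
        self.track_total_rows = track_total_rows
        self._db = {}
        self.num_keys = 0
        self.lookups = 0  # extra reads issued on behalf of writes

    def _exists(self, key):
        # This is the additional lookup that every write pays for.
        self.lookups += 1
        return key in self._db

    def put(self, key, value):
        # The count only grows if the key is new, so we must check first.
        if self.track_total_rows and not self._exists(key):
            self.num_keys += 1
        self._db[key] = value

    def remove(self, key):
        # The count only shrinks if the key was actually present.
        if self.track_total_rows and self._exists(key):
            self.num_keys -= 1
        self._db.pop(key, None)
```

With tracking enabled, every put and delete triggers one lookup; with tracking disabled, the writes go straight through and no count is maintained.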
- config name: spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows
- default value: true (since we already serve the number and we want to avoid a breaking change)
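As a rough usage sketch, the flag could be passed like any other Spark SQL config, for example via `--conf` on `spark-submit`. The helper names below (`rocksdb_row_tracking_conf`, `to_submit_args`) are hypothetical and exist only for illustration; only the config key comes from this proposal.

```python
# Config key proposed in this PR.
TRACK_ROWS_KEY = "spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows"


def rocksdb_row_tracking_conf(track_rows: bool) -> dict:
    """Return the conf entry for the flag, using Spark-style string values."""
    return {TRACK_ROWS_KEY: "true" if track_rows else "false"}


def to_submit_args(conf: dict) -> list:
    """Render conf entries as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Turn tracking off to avoid the extra lookup on writes.
    print(" ".join(to_submit_args(rocksdb_row_tracking_conf(False))))
```

Since the flag only affects the RocksDB state store provider, it presumably has no effect on queries that use a different provider.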
When the config is turned off, we will report "0" for the number of keys in the state store metric. A negative value would be the ideal signal, but SQL metrics currently don't allow negative values, and there seem to be technical/historical reasons for that restriction.
We will also handle the case where the config is flipped across a restart, so that end users can enjoy the performance benefit without losing the ability to learn the number of state rows. End users can turn the flag off to maximize performance, and turn it back on (restart required) when they want to see the actual number of keys (for observability, debugging, etc.).
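One possible way to handle the flip, sketched in Python under stated assumptions: the committed metadata stores a sentinel when the count was not tracked, and a one-time full scan recovers the count when tracking is re-enabled. The sentinel `UNTRACKED` and the three function names are hypothetical, and the actual change may handle this differently.

```python
UNTRACKED = -1  # hypothetical sentinel persisted when tracking was off


def commit_num_keys(num_keys, track_total_rows):
    """Value to persist alongside a committed version."""
    return num_keys if track_total_rows else UNTRACKED


def load_num_keys(persisted_num_keys, db, track_total_rows):
    """Row count to use after loading a version on (re)start."""
    if not track_total_rows:
        return UNTRACKED
    if persisted_num_keys == UNTRACKED:
        # Flag flipped from off to on: recover the count with a full scan.
        return sum(1 for _ in db)
    return persisted_num_keys


def metric_num_keys(num_keys):
    """SQL metrics cannot be negative, so an untracked count is reported as 0."""
    return max(num_keys, 0)
```

In this sketch the scan cost is paid only once, on the first load after the flag is turned back on; later versions carry the persisted count forward again.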