[HDDS-10762] Review/discuss SchemaV3's prefix_seek optimization - ASF JIRA

Details

Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.3.0, 1.4.0
Fix Version/s: None
Component/s: None
Labels: None

Description

While working on ~~HDDS-10744~~, I ran into the prefix_seek optimization in RocksDB and the changes made to how we access the DB in order to enable it.

The cost is that keys in the column families of the DataNode's RocksDB are now Strings: the container ID comes first, then a separator, then (as I understand it) the same key we had in schema_v2.
Handling this requires many conversions and copies along the way. We have the FixedLengthStringCodec class, the StringUtils class, and possibly a few other conversion methods in use. When accessing RocksDB, we convert from long to byte[] to String, from String to byte[], or from byte[] to String, and some code paths chain several of these conversions together.
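To make the conversion chain concrete, here is a minimal sketch of the key layout described above. It is not the actual Ozone code: the class name, method names, and the "|" separator are hypothetical, and ISO-8859-1 is chosen here only because it maps every byte to exactly one char, so the container ID prefix keeps a fixed length when held in a String.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a container-prefixed String key; not Ozone's implementation.
final class ContainerPrefixedKeySketch {
  // Assumed separator, for illustration only.
  static final String SEPARATOR = "|";

  // Conversion 1: long -> byte[] (8 bytes, big-endian).
  static byte[] longToBytes(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  // Conversion 2: byte[] -> String, one char per byte.
  static String bytesToFixedString(byte[] bytes) {
    return new String(bytes, StandardCharsets.ISO_8859_1);
  }

  // Conversion 3: String -> byte[], the form actually handed to the DB.
  static byte[] fixedStringToBytes(String s) {
    return s.getBytes(StandardCharsets.ISO_8859_1);
  }

  // Builds <container id><separator><schema_v2 key> as a String.
  static String buildKey(long containerId, String v2Key) {
    return bytesToFixedString(longToBytes(containerId)) + SEPARATOR + v2Key;
  }

  // Recovers the container ID from the first 8 chars of the key.
  static long containerIdOf(String key) {
    byte[] prefix = fixedStringToBytes(key.substring(0, Long.BYTES));
    return ByteBuffer.wrap(prefix).getLong();
  }

  // Recovers the schema_v2 part that follows the prefix and the separator.
  static String v2KeyOf(String key) {
    return key.substring(Long.BYTES + SEPARATOR.length());
  }

  public static void main(String[] args) {
    String key = buildKey(42L, "blockKey");
    byte[] dbKey = fixedStringToBytes(key); // yet another copy before the DB call
    System.out.println(containerIdOf(key) + " " + v2KeyOf(key) + " " + dbKey.length);
  }
}
```

Even in this small sketch, a single lookup goes through long to byte[] to String to byte[], with a fresh allocation at each step, which is the kind of overhead this ticket is concerned about.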

prefix_seek promises to save some I/O and CPU cycles for workloads that fit the feature well (ours seems to be one of them). However, looking at our code and everything that is in place to support it, I am not sure the optimization gives us an overall benefit once maintainability, understandability, and the cost of that many runtime conversions are taken into account.
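For readers unfamiliar with the access pattern in question, the following sketch shows what a prefix scan over container-prefixed keys looks like in principle. It uses a plain java.util.TreeMap as a stand-in for a sorted key-value store, not the RocksDB API, and the readable "0001|" style prefixes and the class name are hypothetical. It models only the seek-then-iterate-while-prefix-matches behaviour, not whatever I/O savings RocksDB derives from knowing the prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a sorted table; illustrates the scan pattern only.
final class PrefixScanSketch {
  // Returns all keys that start with the given prefix, in sorted order.
  static List<String> scanPrefix(TreeMap<String, String> table, String prefix) {
    List<String> hits = new ArrayList<>();
    // "Seek" to the first key >= prefix, then walk forward.
    for (Map.Entry<String, String> e : table.tailMap(prefix, true).entrySet()) {
      if (!e.getKey().startsWith(prefix)) {
        break; // left the prefix range, so no later key can match
      }
      hits.add(e.getKey());
    }
    return hits;
  }

  public static void main(String[] args) {
    TreeMap<String, String> table = new TreeMap<>();
    table.put("0001|a", "v1");
    table.put("0001|b", "v2");
    table.put("0002|a", "v3");
    System.out.println(scanPrefix(table, "0001|"));
  }
}
```

Because all keys of one container share a prefix and sort next to each other, a scan for a container touches one contiguous key range. That is the property the String key layout exists to provide, and the benefit that has to be weighed against the conversion cost.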

I wanted to record this concern so that it can be reviewed and discussed.
CC: ritesh, szetszwo, markgui

Issue Links

is related to: HDDS-6486 Add new container schema v3 definitions. (Resolved)

People

Assignee: Unassigned

Reporter: István Fajth

Votes: 0

Watchers: 1

Dates

Created: 26/Apr/24 12:11

Updated: 26/Apr/24 12:34