[CASSANDRA-9708] Serialize ClusteringPrefixes in batches - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 3.0.0 rc1
Component/s: None
Labels: None

Description

Typically we will have very few clustering prefixes to serialize; in theory, however, their number is not constrained (or is it, just to a very large number?). Currently we encode a fat header for all values up front (two bits per value), but those bits will typically be zero, and typically we will have only a handful of values (perhaps 1 or 2).

This patch modifies the encoding to batch the prefixes in groups of up to 32, each batch preceded by a vint-encoded header. Typically this results in a single byte per batch, but a header can consume up to 9 bytes if some of the values have their flags set. If we have more than 32 columns, we just read another header. This means we incur no garbage, and we compress the data on disk in many cases where there are more than 4 clustering components, since at two bits per value a fixed header no longer fits in a single byte beyond that point.
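The batching described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: the class name BatchedHeaderSketch, the meaning given to the two flag bits (one for "empty", one for "null"), and the exact vint byte layout are all assumptions made for the example. It writes only the headers; in the real format the values of each batch would follow their header.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

final class BatchedHeaderSketch {
    // 32 values x 2 flag bits each = one 64-bit header per batch.
    static final int BATCH_SIZE = 32;

    // Header for values[offset, limit). Assumed flag layout (hypothetical):
    // bit 2k is set for an empty value, bit 2k+1 for a null value.
    static long makeHeader(List<ByteBuffer> values, int offset, int limit) {
        long header = 0;
        for (int i = offset; i < limit; i++) {
            int slot = (i - offset) * 2;
            ByteBuffer v = values.get(i);
            if (v == null)
                header |= 1L << (slot + 1);
            else if (!v.hasRemaining())
                header |= 1L << slot;
        }
        return header;
    }

    // Bytes needed by the assumed vint format: 7 payload bits per byte for
    // sizes 1..8, and 9 bytes for anything wider than 56 bits.
    static int vintSize(long value) {
        int bits = 64 - Long.numberOfLeadingZeros(value | 1);
        return bits > 56 ? 9 : (bits + 6) / 7;
    }

    // Big-endian payload; the first byte starts with (size - 1) one-bits
    // that tell a reader how many extra bytes follow.
    static void writeUnsignedVInt(long value, ByteArrayOutputStream out) {
        int size = vintSize(value);
        byte[] buf = new byte[size];
        for (int i = size - 1; i >= 0; i--) {
            buf[i] = (byte) value;
            value >>>= 8;
        }
        buf[0] |= (byte) ~(0xff >> (size - 1));
        out.write(buf, 0, size);
    }

    // Writes one header per batch of up to 32 values. The values themselves,
    // which would follow each header, are omitted from this sketch.
    static byte[] serializeHeaders(List<ByteBuffer> values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int offset = 0; offset < values.size(); offset += BATCH_SIZE) {
            int limit = Math.min(values.size(), offset + BATCH_SIZE);
            writeUnsignedVInt(makeHeader(values, offset, limit), out);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<ByteBuffer> typical = Arrays.asList(
            ByteBuffer.wrap(new byte[] {1}), ByteBuffer.wrap(new byte[] {2}));
        System.out.println("header bytes: " + serializeHeaders(typical).length);
    }
}
```

Under these assumptions, a batch with no flags set yields a header of zero, which encodes to a single byte however many of the 32 slots are used. Only a flag on one of the last few values in a full batch pushes the header to the 9-byte worst case.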

I do wonder if we shouldn't impose a limit on clustering columns, though: if you have more than a handful, merge performance is going to disintegrate. 32 is probably well in excess of what we should be seeing in the wild anyway.

People

Assignee: Benedict Elliott Smith

Reporter: Benedict Elliott Smith

Authors: Benedict Elliott Smith

Reviewers: Sylvain Lebresne

Votes: 0

Watchers: 2

Dates

Created: 02/Jul/15 10:00

Updated: 16/Apr/19 09:31

Resolved: 24/Jul/15 10:20
