[CASSANDRA-4175] Reduce memory, disk space, and CPU usage with a column name/id map - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Duplicate
Fix Version/s: None
Component/s: None
Labels:
- performance

Description

We spend a lot of memory on column names, both transiently (during reads) and more permanently (in the row cache). Compression mitigates this on disk but not on the heap.

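The overhead is significant for typical small column values, e.g., ints: a 4-byte int stored under a 10-character column name spends more bytes on the name than on the value.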

Writes are affected too: although we intern column names once they reach the memtable, the names allocated before that point cause very high allocation rates in the young generation, and hence more GC activity.

Now that CQL3 gives us some guarantee that column names must be defined before they are inserted, we could create a map from (say) a 32-bit int column id to each column name, and use the ids internally right up until we return a result set to the client.
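
As a rough illustration of the proposal, the sketch below keeps a two-way mapping between column names and small int ids. Ids are assigned when a column is defined, so internal code can carry a 4-byte id and look the name up only when building the client response. The class and method names (`ColumnNameRegistry`, `define`, `idOf`, `nameOf`) are hypothetical and are not part of Cassandra; persistence, per-table scoping, and schema propagation across nodes are all left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; not Cassandra's actual API.
final class ColumnNameRegistry {
    // name -> id, consulted when a write or query names a column
    private final Map<String, Integer> idsByName = new HashMap<>();
    // id -> name, consulted only when results are returned to the client
    private final List<String> namesById = new ArrayList<>();

    // Assumed to be called when a column is defined in the schema.
    // Defining the same name twice returns the id already assigned.
    synchronized int define(String name) {
        Integer existing = idsByName.get(name);
        if (existing != null) {
            return existing;
        }
        int id = namesById.size(); // ids are dense: 0, 1, 2, ...
        namesById.add(name);
        idsByName.put(name, id);
        return id;
    }

    // Resolve a name to its id; undefined columns are rejected, mirroring
    // the "defined before inserted" guarantee the description relies on.
    synchronized int idOf(String name) {
        Integer id = idsByName.get(name);
        if (id == null) {
            throw new IllegalArgumentException("undefined column: " + name);
        }
        return id;
    }

    // Translate an id back to its name at the client boundary.
    // Throws IndexOutOfBoundsException for an id that was never assigned.
    synchronized String nameOf(int id) {
        return namesById.get(id);
    }
}
```

With such a registry, each distinct column name is stored once as a string, and rows held on the heap or in the row cache would reference it through an int id rather than carrying their own copy of the name.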

Issue Links

is duplicated by: CASSANDRA-7070 Virtual column name aliasing (Resolved)

is related to: CASSANDRA-8099 Refactor and modernize the storage engine (Resolved)

People

Assignee: Unassigned
Reporter: Jonathan Ellis
Votes: 10
Watchers: 34

Dates

Created: 19/Apr/12 21:39
Updated: 16/Apr/19 09:32
Resolved: 11/Aug/15 16:37