[CASSANDRA-2915] Lucene based Secondary Indexes - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Normal
Resolution: Won't Fix
Fix Version/s: None
Component/s: Feature/2i Index
Labels:
- secondary_index

Description

Secondary indexes (of type KEYS) suffer from a number of limitations in their current form:

- Multiple IndexClauses only work when there is a subset of rows under the highest clause.
- One new column family is created per index, which means 10 new CFs for 10 secondary indexes.

This ticket will use the Lucene library to implement secondary indexes as one index per CF, and will use the Lucene query engine to handle multiple index clauses. Using Lucene also gives us a highly optimized file format.
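To illustrate the "one index per CF" idea, here is a minimal sketch of a single inverted index that covers every indexed column and answers a multi-clause query by intersecting posting sets. It is a toy model in plain Python, not Lucene's API; the `ToyCfIndex` class and its methods are hypothetical names introduced only for this example.

```python
from collections import defaultdict


class ToyCfIndex:
    """Hypothetical stand-in for one index per column family (not Lucene)."""

    def __init__(self):
        # Maps a (column, value) term to the set of row keys that contain it.
        self.postings = defaultdict(set)

    def add(self, row_key, columns):
        """Index every column of a row in the same shared index."""
        for column, value in columns.items():
            self.postings[(column, value)].add(row_key)

    def query_all(self, clauses):
        """Return the row keys that match every (column, value) clause."""
        posting_sets = [self.postings.get(clause, set()) for clause in clauses]
        if not posting_sets:
            return set()
        # Start from the smallest posting set to keep the intersection cheap.
        posting_sets.sort(key=len)
        result = set(posting_sets[0])
        for postings in posting_sets[1:]:
            result &= postings
        return result
```

Because all columns live in one structure, ten indexed columns still mean one index, and no clause has to be treated as the "highest" one.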

There are a few parallels we can draw between Cassandra and Lucene.

Lucene indexes segments in memory and then flushes them to disk, so we can sync our memtable flushes to Lucene flushes. Lucene also has optimize(), which correlates to our compaction process, so these can be synced as well.
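The flush coupling described above could look roughly like the following sketch. It is a plain-Python toy under stated assumptions: `ToyIndexBuffer` and `ToyMemtable` are hypothetical classes, not Cassandra or Lucene types, and `merge_segments` only stands in for the optimize/compaction step.

```python
class ToyIndexBuffer:
    """Hypothetical in-memory index buffer that flushes into segments."""

    def __init__(self):
        self.pending = []   # documents indexed since the last flush
        self.segments = []  # flushed, immutable segments

    def add(self, doc):
        self.pending.append(doc)

    def flush(self):
        # Write the buffered documents out as one new segment.
        if self.pending:
            self.segments.append(list(self.pending))
            self.pending.clear()

    def merge_segments(self):
        # Stand-in for optimize(): collapse all segments into one.
        merged = [doc for segment in self.segments for doc in segment]
        self.segments = [merged] if merged else []


class ToyMemtable:
    """Hypothetical memtable whose flush also flushes the index buffer."""

    def __init__(self, index_buffer):
        self.rows = {}
        self.sstables = []
        self.index_buffer = index_buffer

    def put(self, key, columns):
        self.rows.setdefault(key, {}).update(columns)
        self.index_buffer.add((key, dict(columns)))

    def flush(self):
        # Flush data and index together so they cover the same writes.
        self.sstables.append(dict(self.rows))
        self.rows.clear()
        self.index_buffer.flush()
```

Tying the two flushes together means each flushed data file has a matching index segment, which is the property the paragraph above is after.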

We will also need to correlate column validators to Lucene tokenizers so the data can be stored properly. The big win is that once this is done, we can perform complex queries within a column, such as wildcard searches.
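As a rough sketch of what correlating validators to tokenizers might mean, the example below maps a validator name to a tokenizing function and then runs a wildcard match over the resulting terms. The mapping itself is an assumption made for illustration, and the tokenizers are simple Python functions rather than Lucene analyzers; wildcard matching uses the standard library's `fnmatch`.

```python
import fnmatch


def tokenize_text(value):
    """Split text into lowercase word terms (assumed behavior for strings)."""
    return value.lower().split()


def tokenize_exact(value):
    """Keep the whole value as a single term (assumed for non-text types)."""
    return [str(value)]


# Hypothetical mapping from validator name to tokenizer function.
TOKENIZERS = {
    "UTF8Type": tokenize_text,
    "LongType": tokenize_exact,
}


def index_terms(validator, value):
    """Pick a tokenizer by validator name, defaulting to exact matching."""
    return TOKENIZERS.get(validator, tokenize_exact)(value)


def wildcard_match(terms, pattern):
    """Return the terms matching a shell-style wildcard pattern."""
    return sorted(term for term in terms if fnmatch.fnmatchcase(term, pattern))
```

For instance, `wildcard_match(index_terms("UTF8Type", "Hello Cassandra World"), "cas*")` selects only the `cassandra` term, which is the kind of within-column query a KEYS index cannot answer.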

The downside of this approach is that we will need to read before write, since documents in Lucene are written as complete documents. For random workloads with lots of indexed columns, this means we need to read the document from the index, update it, and write it back.
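The read-before-write cost can be made concrete with a small sketch. It assumes a toy document store where each row is one whole document, so changing a single column forces a read of the existing document followed by a full rewrite. `ToyDocumentIndex` is a hypothetical class, and the counters exist only to make the extra I/O visible.

```python
class ToyDocumentIndex:
    """Hypothetical whole-document index; one document per row key."""

    def __init__(self):
        self.docs = {}
        self.reads = 0
        self.writes = 0

    def update_column(self, row_key, column, value):
        # Step 1: read the current document (the read before the write).
        self.reads += 1
        doc = dict(self.docs.get(row_key, {}))
        # Step 2: apply the single-column change to the copy.
        doc[column] = value
        # Step 3: write the complete document back, replacing the old one.
        self.docs[row_key] = doc
        self.writes += 1
```

Every column update costs one read plus one full-document write, which is why random workloads over many indexed columns are the painful case.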

Issue Links

incorporates

CASSANDRA-1684 Entity groups

Resolved

is blocked by

CASSANDRA-2982 Refactor secondary index api

Resolved

is depended upon by

CASSANDRA-1598 Add Boolean Expression to secondary querying

Resolved

CASSANDRA-1599 Add sort/order support for secondary indexing

Resolved

is superseded by

CASSANDRA-10661 Integrate SASI to Cassandra

Resolved

relates to

LUCENE-2312 Search on IndexWriter's RAM Buffer

Open

CASSANDRA-3249 Index search in provided set of rows (support of sub query)

Resolved

Sub-Tasks

Refactor secondary index api

Resolved

T Jake Luciani

People

Assignee: Unassigned

Reporter: T Jake Luciani

Votes: 27

Watchers: 45

Dates

Created: 18/Jul/11 18:04

Updated: 16/Apr/19 09:32

Resolved: 05/Jan/16 19:32