[PHOENIX-4925] Use a Variant Segment tree to organize Guide Post Info - ASF JIRA

Details

Type: Improvement
Status: Patch Available
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
a. SELECT f1_c FROM MyCustomBigObjectb ORDER BY Pk1_c
b. SELECT f1_c FROM MyCustomBigObjectb WHERE nonpk1c = 'x' ORDER BY Pk1_c
c. SELECT f1_c FROM MyCustomBigObjectb WHERE pk2c = 'x' ORDER BY pk1c,pk2_c
d. SELECT f1_c FROM MyCustomBigObjectb WHERE pk1c = 'x' AND nonpk1c ORDER BY pk1c,pk2_c
e. SELECT f1_c FROM MyCustomBigObjectb WHERE pkc >= 'd' AND pkc <= 'm' OR pkc >= 'o' AND pkc <= 'x' ORDER BY pkc // pk_c is the only column in the primary key.

Because guide post info is prefix-encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScan(...) O(n), where n is the total count of guide posts.
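To illustrate why a prefix-encoded list forces a linear scan, here is a minimal sketch. It uses a simplified scheme (shared-prefix length plus suffix) that is an assumption for illustration only, not Phoenix's actual encoder or wire format; the function names and sample keys are hypothetical.

```python
def prefix_encode(sorted_keys):
    """Encode each key as (shared_prefix_len, suffix) relative to the previous key."""
    encoded, prev = [], b""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Yield keys in order; each key can only be rebuilt from the one before it."""
    prev = b""
    for shared, suffix in encoded:
        prev = prev[:shared] + suffix
        yield prev


def first_at_or_after(encoded, target):
    """Index of the first guide post >= target, found by a linear O(n) scan."""
    for i, key in enumerate(prefix_decode(encoded)):
        if key >= target:
            return i
    return len(encoded)


# Hypothetical guide post keys, already sorted.
guide_posts = [b"row0100", b"row0200", b"row0350", b"row1000"]
encoded = prefix_encode(guide_posts)
start = first_at_or_after(encoded, b"row0300")
```

Because every entry depends on its predecessor, there is no way to jump into the middle of the encoded list, so even a single range lookup has to decode from the beginning.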

According to PHOENIX-2417, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info, to reduce the footprint in the client cache and during transmission.

We can use a segment tree to address both the memory and the performance concerns. The guide posts are partitioned into chunks of k guide posts each (k = 1024?); each chunk is prefix-encoded, and the encoded data forms a leaf node of the tree. Each inner node contains summary info (the row count and the data size) for the subtree rooted at that node.
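The structure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's implementation: k is taken to be the number of guide posts per chunk, each guide post is modeled as a hypothetical (key, rows, bytes) tuple, and leaves keep their chunk as a plain list where the proposal would store prefix-encoded bytes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical guide post layout: (key, estimated row count, estimated byte size).
GuidePost = Tuple[bytes, int, int]


@dataclass
class Node:
    last_key: bytes                            # largest guide post key in this subtree
    rows: int                                  # summed row count of the subtree
    size: int                                  # summed data size of the subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    chunk: Optional[List[GuidePost]] = None    # set on leaves only


def build(guide_posts: List[GuidePost], k: int) -> Node:
    """Build the tree bottom-up from chunks of at most k guide posts."""
    if not guide_posts:
        raise ValueError("at least one guide post is required")
    level = []
    for i in range(0, len(guide_posts), k):
        chunk = guide_posts[i:i + k]
        level.append(Node(chunk[-1][0],
                          sum(g[1] for g in chunk),
                          sum(g[2] for g in chunk),
                          chunk=chunk))
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                l, r = level[i], level[i + 1]
                parents.append(Node(r.last_key, l.rows + r.rows, l.size + r.size, l, r))
            else:
                parents.append(level[i])       # odd node is carried up unchanged
        level = parents
    return level[0]


def count_inner(node: Node) -> int:
    """Number of inner (non-leaf) nodes in the subtree."""
    if node.chunk is not None:
        return 0
    return 1 + count_inner(node.left) + count_inner(node.right)


# Eight hypothetical guide posts, chunk size k = 2, so n/k = 4 leaves.
posts = [(bytes([ord("a") + i]), 10 * (i + 1), 1000) for i in range(8)]
root = build(posts, k=2)
```

With n = 8 and k = 2 the tree has 4 leaves and 3 inner nodes, and the root already carries the totals for the whole table.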

With this tree-like data structure, the increase in size compared to the current data structure (mainly from the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity for "EXPLAIN" on queries a, b, and c can also be reduced to O(m), and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
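A rough sketch of where these bounds come from, working at chunk granularity. The class and its methods are hypothetical, and two simplifications are assumed: the tree is stored as a flat array of row-count sums, and the start chunk is located by binary search over the chunks' last keys rather than by descending the tree (both take O(log(n/k)) steps).

```python
from bisect import bisect_left


class ChunkSummaryTree:
    """Array-based segment tree over per-chunk row counts (hypothetical helper)."""

    def __init__(self, chunk_last_keys, chunk_rows):
        self.m = len(chunk_rows)                   # number of chunks, i.e. n/k
        self.last_keys = list(chunk_last_keys)     # sorted last key of each chunk
        self.tree = [0] * self.m + list(chunk_rows)
        for i in range(self.m - 1, 0, -1):         # fill inner nodes bottom-up
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def total_rows(self):
        """O(1): the root already holds the sum over all chunks."""
        return self.tree[1]

    def start_chunk(self, key):
        """O(log(n/k)): first chunk whose last key is >= key."""
        return bisect_left(self.last_keys, key)

    def rows_in_chunks(self, lo, hi):
        """O(log(n/k)): summed rows of chunks in the half-open range [lo, hi)."""
        total = 0
        lo += self.m
        hi += self.m
        while lo < hi:
            if lo & 1:
                total += self.tree[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                total += self.tree[hi]
            lo //= 2
            hi //= 2
        return total


# Five hypothetical chunks, identified by their last key and row count.
tree = ChunkSummaryTree([b"d", b"h", b"m", b"r", b"z"], [100, 250, 50, 300, 75])
first = tree.start_chunk(b"e")                       # chunk covering keys up to b"h"
last = tree.start_chunk(b"m")                        # chunk covering keys up to b"m"
estimate = tree.rows_in_chunks(first, last + 1)      # whole-chunk estimate for [e, m]
```

The estimate here counts whole chunks, so it is coarse at the range boundaries; a finer estimate would decode only the first and last chunk of the range.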

The tree can also integrate AVL tree and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.

Attachments

- PHOENIX-4925.phoenix-stats.003.patch, 09/Jul/19 19:24, 591 kB, Bin Shi
- PHOENIX-4925.phoenix-stats.0502.patch, 03/May/19 23:49, 250 kB, Bin Shi
- PHOENIX-4925.phoenix-stats.0510.patch, 10/May/19 15:46, 270 kB, Bin Shi

Issue Links

is a child of

PHOENIX-5058 Improvements to client side cache guideposts cache

Open

links to

Design Document "Phoenix-4925 (https://issues.apache.org/jira/browse/PHOENIX-4925) Use a Variant Segment Tree to Organize Guide Post Info"

GitHub Pull Request #482

People

Assignee: Bin Shi

Reporter: Bin Shi

Votes: 0

Watchers: 6

Dates

Created: 25/Sep/18 17:20

Updated: 09/Jul/19 20:04

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

13h 40m