We're currently not using any stats, beyond a table-wide min key/max key cached per client connection, to guide parallelization. If a query targets just a few regions, we don't know how to divide the work evenly among threads, because we don't know the data distribution. Issue [#64](https://github.com/forcedotcom/phoenix/issues/64) targets gathering and maintaining the stats, while this issue is focused on using them.
The main changes are:
1. Create a PTableStats interface that encapsulates the stats information (and implements the Writable interface so that it can be serialized back from the server).
2. Add a stats member variable off of PTable to hold this.
3. From MetaDataEndPointImpl, look up the stats row for the table in the stats table. If the stats have changed, return a new PTable with the updated stats information. We may want to cache the stats row and have the stats gatherer invalidate the cached row when it's updated, so we don't always have to do a scan for it. Additionally, it would be ideal if we could use the same split policy on the stats table that we use on the system table, to guarantee co-location of data (for the sake of caching).
4. Modify the client-side parallelization (ParallelIterators.getSplits()) to use this information to guide how to chunk up the scans at query time.
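As a rough illustration of steps 1 and 4, the chunking could key off per-region guideposts exposed by the stats. This is only a sketch: the `PTableStats` method name, the `StatsGuidedSplitter` helper, and the guidepost representation are all hypothetical, and the real interface would also extend Hadoop's `Writable` so it can be serialized back from the server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical shape of the stats interface from step 1. In the real
// implementation this would extend org.apache.hadoop.io.Writable; it is
// omitted here to keep the sketch self-contained.
interface PTableStats {
    // Sorted row keys that evenly divide the data within a region,
    // as gathered by the stats collector (issue #64).
    byte[][] getGuidePosts(byte[] regionStartKey);
}

class StatsGuidedSplitter {
    // Chunk a scan over [startKey, endKey) at each guidepost that falls
    // strictly inside the range, so each chunk covers roughly the same
    // number of rows regardless of data skew.
    static List<byte[][]> getSplits(byte[] startKey, byte[] endKey, byte[][] guidePosts) {
        List<byte[][]> splits = new ArrayList<>();
        byte[] lower = startKey;
        for (byte[] gp : guidePosts) {
            if (Arrays.compare(gp, lower) > 0 && Arrays.compare(gp, endKey) < 0) {
                splits.add(new byte[][] { lower, gp });
                lower = gp;
            }
        }
        splits.add(new byte[][] { lower, endKey });
        return splits;
    }
}
```

Without guideposts this degenerates to today's behavior (one chunk per scanned range); with them, a skewed region gets split into proportionally more chunks.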
This should help boost query performance, especially in cases where the data is highly skewed. It's likely the cause of the slowness reported in this issue: https://github.com/forcedotcom/phoenix/issues/47.