Phoenix / PHOENIX-3271

Distribute UPSERT SELECT across cluster

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10.0
    • Labels: None

      Description

      Based on some informal testing we've done, it seems that creation of a local index is orders of magnitude faster than creation of a global index (17 seconds versus 10-20 minutes, though more data is written in the global index case). Under the covers, a global index is created by running an UPSERT SELECT. UPSERT SELECT also provides an easy way of copying a table. In both of these cases, the data being upserted must all flow back to the same client, which can become a bottleneck for a large table. Instead, what can be done is to push each separate, chunked UPSERT SELECT call out to a different region server for execution there. One way we could implement this would be to have an endpoint coprocessor push the chunked UPSERT SELECT out to each region server and return the number of rows upserted back to the client.
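
      To make the bottleneck concrete, here is a minimal JDBC sketch of the table-copy case (table names and the connection URL are hypothetical): with auto-commit enabled, every row selected below currently streams back through the single client running the statement before being written out again.

          import java.sql.Connection;
          import java.sql.DriverManager;
          import java.sql.Statement;

          public class UpsertSelectCopy {
              public static void main(String[] args) throws Exception {
                  try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                       Statement stmt = conn.createStatement()) {
                      conn.setAutoCommit(true); // required for any server-side execution path
                      // Copy SOURCE_TABLE into TARGET_TABLE; today every row flows back
                      // through this one client, which is the bottleneck described above.
                      int rows = stmt.executeUpdate(
                              "UPSERT INTO TARGET_TABLE SELECT * FROM SOURCE_TABLE");
                      System.out.println("Upserted " + rows + " rows");
                  }
              }
          }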

      Attachments

      1. PHOENIX-3271.patch
        15 kB
        Ankit Singhal
      2. PHOENIX-3271_v1.patch
        16 kB
        Ankit Singhal
      3. PHOENIX-3271_v2.patch
        21 kB
        Ankit Singhal
      4. PHOENIX-3271_v3.patch
        21 kB
        Ankit Singhal
      5. PHOENIX-3271_v4.patch
        22 kB
        Ankit Singhal
      6. PHOENIX-3271_v5.patch
        27 kB
        Ankit Singhal
      7. PHOENIX-3271_v5_rebased.patch
        26 kB
        Ankit Singhal

        Activity

        samarthjain Samarth Jain added a comment -

        Is this different from the current optimization we have where UPSERT SELECT can run on server side? Should we just execute UPSERT SELECT on server side in most cases? If source and target tables are on different region servers, are cross region server RPCs going to be OK?

        jamestaylor James Taylor added a comment -

        Yes, this is a different optimization. Running UPSERT SELECT on server side is only done when the source and target table are the same. This optimization is for the case when they're different, for example in the case of a global index build where the source table is the data table and the target table is the index table.

        Essentially, it'd take the parallel threads that are running per chunk on the client side and run them on different region servers to distribute the load.

        jamestaylor James Taylor added a comment -

        FYI, Lars Hofhansl, Loknath Priyatham Teja Singamsetty - something to think about.

        jamestaylor James Taylor added a comment -

        Thinking about it more, I think Samarth Jain's idea above is a good one. We can just always execute the UPSERT part from the server that handles the SELECT part to prevent all the data going back to the client. As long as we're batching appropriately, the RPC should be fine as the row isn't under lock.

        Ankit Singhal - would you be interested in this one?
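
        To illustrate the approach (a rough sketch only, not the committed patch; flushToTarget and batchSize are hypothetical names), the region server that runs the SELECT scan would batch the resulting mutations straight to the target table, so rows never leave the cluster:

            import java.util.ArrayList;
            import java.util.List;
            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.hbase.client.HTable;
            import org.apache.hadoop.hbase.client.Mutation;

            final class ServerSideUpsertSketch {
                // Batch pending UPSERT mutations to the target table from the region
                // server that ran the SELECT scan, instead of returning rows to the client.
                static void flushToTarget(Configuration conf, byte[] targetTable,
                        List<Mutation> pending, int batchSize) throws Exception {
                    try (HTable target = new HTable(conf, targetTable)) {
                        List<Mutation> batch = new ArrayList<Mutation>();
                        for (Mutation m : pending) {
                            batch.add(m);
                            // Flush in batches so no row is held under lock while writing.
                            if (batch.size() >= batchSize) {
                                target.batch(batch, new Object[batch.size()]);
                                batch.clear();
                            }
                        }
                        if (!batch.isEmpty()) {
                            target.batch(batch, new Object[batch.size()]);
                        }
                    }
                }
            }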

        ankit@apache.org Ankit Singhal added a comment -

        In this case, we are assuming that the SELECT statement should be of a flat type which can be executed independently on each chunk, with the results consumed directly by the UPSERT.
        Like the use cases of GLOBAL INDEX building and copying a table (with some filters), right?

        jamestaylor James Taylor added a comment -

        Yes, that's correct, Ankit Singhal. We have a check for this in UpsertCompiler on the client side, so we'd want to relax the runServerSide logic to kick in even if the table name is different (but leave the other checks for post processing/merging in place so that execution still happens client side for these cases).
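
        Schematically (a sketch based on the patch line quoted in the Hadoop QA output below, not the full UpsertCompiler change), the relaxed check no longer gates on the source and target tables matching:

            import org.apache.phoenix.schema.PTable;

            final class RunOnServerSketch {
                // The same-table requirement is dropped; the remaining guards keep
                // post-processing/merging cases on the client side.
                static boolean runOnServer(PTable table, boolean isAutoCommit) {
                    return isAutoCommit
                            && !table.isTransactional()
                            && !(table.isImmutableRows() && !table.getIndexes().isEmpty())
                            && table.getRowTimestampColPos() == -1;
                }
            }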

        ankit@apache.org Ankit Singhal added a comment -

        James Taylor, can you please review?

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12844043/PHOENIX-3271.patch
        against master branch at commit e45b5a706107e31bc6e5b8289725db097b2820eb.
        ATTACHMENT ID: 12844043

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 44 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + runOnServer = isAutoCommit && !table.isTransactional() && !(table.isImmutableRows() && !table.getIndexes().isEmpty()) && table.getRowTimestampColPos() == -1;
        + // If the row ends up living in a different region, we'll get an error otherwise.
        + .equals(new ColumnRef(tableRef, column.getPosition()).newColumnExpression())) {
        + // TODO: we could check the region boundaries to see if the pk will still be in it.
        + runOnServer = false; // bail on running server side, since PK may be changing
        + scan.setAttribute(BaseScannerRegionObserver.UPSERT_SELECT_TARGET_TABLE, tableRef.getTable().getPhysicalName().getBytes());
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {
        + //Need to add indexMaintainers for each mutation as table.batch can be distributed across servers

        -1 core tests. The patch failed these unit tests:
        org.apache.phoenix.compile.QueryOptimizerTest
        org.apache.phoenix.util.PhoenixRuntimeTest
        org.apache.phoenix.index.IndexMaintainerTest

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/705//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/705//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/705//console

        This message is automatically generated.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12844052/PHOENIX-3271_v1.patch
        against master branch at commit e45b5a706107e31bc6e5b8289725db097b2820eb.
        ATTACHMENT ID: 12844052

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 44 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + runOnServer = isAutoCommit && !table.isTransactional() && !(table.isImmutableRows() && !table.getIndexes().isEmpty()) && table.getRowTimestampColPos() == -1;
        + // If the row ends up living in a different region, we'll get an error otherwise.
        + .equals(new ColumnRef(tableRef, column.getPosition()).newColumnExpression())) {
        + // TODO: we could check the region boundaries to see if the pk will still be in it.
        + runOnServer = false; // bail on running server side, since PK may be changing
        + scan.setAttribute(BaseScannerRegionObserver.UPSERT_SELECT_TARGET_TABLE, tableRef.getTable().getPhysicalName().getBytes());
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {
        + //Need to add indexMaintainers for each mutation as table.batch can be distributed across servers

        -1 core tests. The patch failed these unit tests:
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SubqueryUsingSortMergeJoinIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.monitoring.PhoenixMetricsIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.DisableLocalIndexIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.DeleteIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ExecuteStatementsIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SaltedViewIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.TenantSpecificViewIndexSaltedIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ViewIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SortMergeJoinIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.PhoenixRuntimeIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.tx.TransactionIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.TenantSpecificTablesDMLIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SubqueryIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.UpgradeIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.HashJoinIT

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/706//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/706//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/706//console

        This message is automatically generated.

        ankit@apache.org Ankit Singhal added a comment -

        James Taylor, sorry for the spam. You can ignore the patches; it seems I need to take care of a few more things to get all the test cases passing.

        ankit@apache.org Ankit Singhal added a comment -

        Back on this today. Fixed the bugs with expression arrangement and salting; all the tests are now passing.
        James Taylor, can you please review the latest patch?

        And one more thing: we should restrict the parallelism (maybe to one thread per region, or via the hbase.htable.threads.max configuration) to avoid a storm of writes in the cluster (which I can do in a follow-up JIRA). WDYT?
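
        As a sketch of that follow-up idea (the value is arbitrary, and whether this knob is the right throttle is exactly the open question), hbase.htable.threads.max caps the pool HTable uses for distributed batch calls:

            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.hbase.HBaseConfiguration;

            final class WriteThrottleSketch {
                static Configuration throttledConf() {
                    Configuration conf = HBaseConfiguration.create();
                    // Cap the per-HTable write pool; 8 is an arbitrary illustrative value.
                    conf.setInt("hbase.htable.threads.max", 8);
                    return conf;
                }
            }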

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12846518/PHOENIX-3271_v3.patch
        against master branch at commit d8f4594989c0b73945aaffec5649a0b62ac59724.
        ATTACHMENT ID: 12846518

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 43 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + String ddl = "CREATE TABLE " + tableName1 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + ddl = "CREATE TABLE " + tableName2 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + projectedColumns.add(column.getPosition() == i + posOff ? column : new PColumnImpl(column, i));
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {
        + // Need to add indexMaintainers for each mutation as table.batch can be distributed across servers
        + targetHTable = new HTable(env.getConfiguration(), projectedTable.getPhysicalName().getBytes());
        + region.getTableDesc().getTableName().getName()) == 0 && projectedTable.getRowTimestampColPos() == -1
        + commit(region, mutations, indexUUID, blockingMemStoreSize, indexMaintainersPtr, txState,

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/728//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/728//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/728//console

        This message is automatically generated.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12846510/PHOENIX-3271_v2.patch
        against master branch at commit d8f4594989c0b73945aaffec5649a0b62ac59724.
        ATTACHMENT ID: 12846510

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 43 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + String ddl = "CREATE TABLE " + tableName1 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + ddl = "CREATE TABLE " + tableName2 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + projectedColumns.add(column.getPosition() == i + posOff ? column : new PColumnImpl(column, i));
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {
        + // Need to add indexMaintainers for each mutation as table.batch can be distributed across servers
        + targetHTable = new HTable(env.getConfiguration(), projectedTable.getPhysicalName().getBytes());
        + region.getTableDesc().getTableName().getName()) == 0 && projectedTable.getRowTimestampColPos() == -1
        + commit(region, mutations, indexUUID, blockingMemStoreSize, indexMaintainersPtr, txState,

        -1 core tests. The patch failed these unit tests:
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.IndexToolForPartialBuildWithNamespaceEnabledIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.UpsertSelectIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.IndexToolForPartialBuildIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.monitoring.PhoenixMetricsIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.ImmutableIndexIT

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/727//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/727//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/727//console

        This message is automatically generated.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12846534/PHOENIX-3271_v3.patch
        against master branch at commit d8f4594989c0b73945aaffec5649a0b62ac59724.
        ATTACHMENT ID: 12846534

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 43 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + String ddl = "CREATE TABLE " + tableName1 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + ddl = "CREATE TABLE " + tableName2 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + projectedColumns.add(column.getPosition() == i + posOff ? column : new PColumnImpl(column, i));
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {
        + // Need to add indexMaintainers for each mutation as table.batch can be distributed across servers
        + targetHTable = new HTable(env.getConfiguration(), projectedTable.getPhysicalName().getBytes());
        + region.getTableDesc().getTableName().getName()) == 0 && projectedTable.getRowTimestampColPos() == -1
        + commit(region, mutations, indexUUID, blockingMemStoreSize, indexMaintainersPtr, txState,

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/730//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/730//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/730//console

        This message is automatically generated.

        ankit@apache.org Ankit Singhal added a comment -

        Ping James Taylor, can you please take a look? The tests were passing locally.

        samarthjain Samarth Jain added a comment -

        Ankit Singhal - one thing to make sure of would be that the thread pool used by cross-region-server UPSERTs is different from the thread pool whose threads are doing the scans for SELECTs. Otherwise, it could lead to deadlock-like scenarios.

        jamestaylor James Taylor added a comment -

        A cross-region call will be higher priority when made from a region server, so this should prevent deadlocks. It would make sense to have a stress test for this to confirm.

        ankit@apache.org Ankit Singhal added a comment -

        With the default configuration (assuming reads and writes share the same handler queue), I think a deadlock can happen when all the region server handlers are saturated by in-flight scans (on different chunks of the region) and no handler is left to handle the write. In that case, neither will succeed even with high-priority writes, as the writes need to wait for the scan RPCs to complete.

        I'll try to do the stress test tomorrow to see if there are other cases.

        jamestaylor James Taylor added a comment -

        The writes are given higher priority based on the PhoenixRpcSchedulerFactory set up via this configuration: https://phoenix.apache.org/secondary_indexing.html#Setup

        We'll want to document this and require that this scheduler factory is in place, as this is what will prevent deadlocks. Perhaps this setting can be verified on the initial client connection.
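
        For reference, the linked setup page configures this in the server-side hbase-site.xml; the same properties are shown here programmatically for illustration:

            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.hbase.HBaseConfiguration;

            final class RpcSchedulerSetupSketch {
                static Configuration withPhoenixScheduler() {
                    Configuration conf = HBaseConfiguration.create();
                    // Route index (and, with this patch, cross-RS upsert) RPCs through
                    // Phoenix's scheduler so writes get their own higher-priority handlers.
                    conf.set("hbase.region.server.rpc.scheduler.factory.class",
                            "org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory");
                    conf.set("hbase.rpc.controllerfactory.class",
                            "org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory");
                    return conf;
                }
            }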

        ankit@apache.org Ankit Singhal added a comment -

        I did some stress testing; if the number of handlers is adequate, I have not seen any problem. Should we commit this now?

        jamestaylor James Taylor added a comment -

        Patch looks good, Ankit Singhal. Can you confirm that the cross RS calls are made with a higher priority? I think they are, but I think we should confirm (preferably with a new unit test - see PhoenixServerRpcIT and PhoenixClientRpcIT as potential places to add that).

        ankit@apache.org Ankit Singhal added a comment - edited

        Thank you James Taylor for the review. Yes, cross-RS RPC calls, other than those to system tables, are dispatched through the index handlers only. As you said, there will not be a deadlock if this is set properly at the server.
        I have updated the v4 patch to include a test case for this.

        Thanks Devaraj Das for detailing that we have a separate pool created as part of PhoenixRpcScheduler.
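
        The heart of that test (its dispatch assertion is quoted verbatim in the Hadoop QA output below) verifies that the index RPC executor handles the cross-RS upsert exactly once during the server-side index build:

            import org.apache.hadoop.hbase.ipc.CallRunner;
            import org.mockito.Mockito;

            final class IndexQueueVerifySketch {
                // TestPhoenixIndexRpcSchedulerFactory comes from the patch's test scaffolding.
                static void verifyIndexQueueUsedOnce() {
                    // Mockito.verify with no count defaults to times(1): the index queue
                    // must be used, and used only once, for the server-side index build.
                    Mockito.verify(TestPhoenixIndexRpcSchedulerFactory.getIndexRpcExecutor())
                            .dispatch(Mockito.any(CallRunner.class));
                }
            }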

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12849076/PHOENIX-3271_v4.patch
        against master branch at commit b9323e1d30ba6b449f059b86ae7b8157de16b13d.
        ATTACHMENT ID: 12849076

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 43 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + String ddl = "CREATE TABLE " + tableName1 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + ddl = "CREATE TABLE " + tableName2 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + "CREATE INDEX " + indexName + "_1 ON " + dataTableFullName + " (v1) INCLUDE (v2)");
        + // verify that that index queue is used and only once (during Upsert Select on server to build the index)
        + Mockito.verify(TestPhoenixIndexRpcSchedulerFactory.getIndexRpcExecutor()).dispatch(Mockito.any(CallRunner.class));
        + projectedColumns.add(column.getPosition() == i + posOff ? column : new PColumnImpl(column, i));
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {
        + // Need to add indexMaintainers for each mutation as table.batch can be distributed across servers

        -1 core tests. The patch failed these unit tests:
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.IndexToolForPartialBuildWithNamespaceEnabledIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.UpsertSelectIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.IndexToolForPartialBuildIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.monitoring.PhoenixMetricsIT
        ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.ImmutableIndexIT

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/741//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/741//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/741//console

        This message is automatically generated.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12849141/PHOENIX-3271_v5.patch
        against master branch at commit b9323e1d30ba6b449f059b86ae7b8157de16b13d.
        ATTACHMENT ID: 12849141

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 43 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + String ddl = "CREATE TABLE " + tableName1 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + ddl = "CREATE TABLE " + tableName2 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + "CREATE INDEX " + indexName + "_1 ON " + dataTableFullName + " (v1) INCLUDE (v2)");
        + // verify that that index queue is used and only once (during Upsert Select on server to build the index)
        + Mockito.verify(TestPhoenixIndexRpcSchedulerFactory.getIndexRpcExecutor()).dispatch(Mockito.any(CallRunner.class));
        + projectedColumns.add(column.getPosition() == i + posOff ? column : new PColumnImpl(column, i));
        + // Hack to add default column family to be used on server in case no value column is projected.
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/742//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/742//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/742//console

        This message is automatically generated.

        enis Enis Soztutar added a comment -

        This is a nice improvement.
        Rajeshbabu Chintaguntla has a patch which changes the RPC scheduler to be configured programmatically from the server side, related to PHOENIX-3360. Do we need that patch in before this?

        For this patch in general, depending on handler priorities has proved brittle; however, this will work if we confirm that the index RPC handlers will be used in all cross-RS communication. Agreed that we have to fix the documentation, and also rename the "index" handlers. For the long term, I would rather have another approach, where the Phoenix RPC scheduler has a separate thread pool (with low priority) to execute generic "tasks". In this case, the scan fragments would be executed from that task thread pool, while the upsert writes go to the normal thread pool. This scans-with-inserts kind of work should not be piggy-backed on the scan flow, I think.

        One other thing is that these scan RPCs will take longer and will time out and be retried from the client, making the worst-case behavior a pretty bad user experience. Do we have any plans for dealing with that? On newer HBase versions the scanner has heartbeats and can return early, close to the scanner lease timeout. Does that apply to these upsert selects?

        Maybe we should add a safeguard configuration in case larger clusters cannot execute the scan fragments within the RPC timeout. WDYT?

        jamestaylor James Taylor added a comment - edited

        I like your long-term ideas, Enis Soztutar (JIRA please?), but I think this patch is good in the near term. The timeouts should be prevented by our RenewLease client-side impl (by Samarth Jain). If HBase let us renew leases on the server side, that'd be an improvement, but what we have works.

        IMHO, having a safeguard config would lead to code duplication and make maintenance harder. I think we're ok without it (provided we do adequate testing). This patch should improve global index build times substantially.

        ankit@apache.org Ankit Singhal added a comment -

        Thanks Enis Soztutar for looking into this.

        Rajeshbabu Chintaguntla has a patch which changes the RPC scheduler to be configured programmatically from the server side, related to PHOENIX-3360. Do we need that patch in before this?

        I think yes, because it is now easy to get into a deadlock if our RPC scheduler is not present. So, loading it automatically on the region server (using Rajeshbabu Chintaguntla's short-term fix, as per Devaraj Das's comment) instead of depending on the user to explicitly set the configuration would be better. Rajeshbabu Chintaguntla, can you push the same as part of a new JIRA if PHOENIX-3360 needs to be kept open for the long-term fix?

        One other thing is that these scan RPCs will take longer, time out, and be retried from the client, making the worst-case behavior a pretty bad user experience. Do we have any plans for dealing with that?

        As James Taylor said, we should not expect the RPC timeout after Samarth Jain's implementation (PHOENIX-2357).

        Can I push this patch now if there are no further comments, James Taylor/Enis Soztutar?

        After this, I'll take up the long-term scheduler fixes suggested by Enis Soztutar as part of another JIRA.
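
        For anyone wiring this up by hand in the meantime, the scheduler was configured via hbase-site.xml on each region server, roughly as below (property names taken from the Phoenix secondary-indexing setup docs of this era; verify against your version):

        <!-- hbase-site.xml on every region server -->
        <property>
          <name>hbase.region.server.rpc.scheduler.factory.class</name>
          <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
        </property>
        <property>
          <name>hbase.rpc.controllerfactory.class</name>
          <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
        </property>
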
        jamestaylor James Taylor added a comment -

        +1 to push now, provided you've verified that, with the RPC scheduler config in place, RS -> RS calls are high priority, and that you've done a verification on a real cluster.
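
        For concreteness, the statement shape this patch targets is a cross-table UPSERT SELECT, where source and target differ, e.g. copying one table into another (table and column names below are made up):

        -- With this patch, the chunked UPSERT SELECT fragments run on the region
        -- servers hosting SOURCE_TABLE instead of streaming every row through one client.
        CREATE TABLE SOURCE_TABLE (K BIGINT NOT NULL PRIMARY KEY, V VARCHAR);
        CREATE TABLE TARGET_TABLE (K BIGINT NOT NULL PRIMARY KEY, V VARCHAR);
        UPSERT INTO TARGET_TABLE SELECT K, V FROM SOURCE_TABLE;
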
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12849916/PHOENIX-3271_v5_rebased.patch
        against master branch at commit 069c371aec23a6b87519615eda7eafb895cb1af0.
        ATTACHMENT ID: 12849916

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 42 warning messages.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces the following lines longer than 100:
        + String ddl = "CREATE TABLE " + tableName1 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + ddl = "CREATE TABLE " + tableName2 + " (K BIGINT NOT NULL PRIMARY KEY ROW_TIMESTAMP, V VARCHAR)"
        + "CREATE INDEX " + indexName + "_1 ON " + dataTableFullName + " (v1) INCLUDE (v2)");
        + // verify that that index queue is used and only once (during Upsert Select on server to build the index)
        + Mockito.verify(TestPhoenixIndexRpcSchedulerFactory.getIndexRpcExecutor()).dispatch(Mockito.any(CallRunner.class));
        + projectedColumns.add(column.getPosition() == i + posOff ? column : new PColumnImpl(column, i));
        + // Hack to add default column family to be used on server in case no value column is projected.
        + final QueryPlan aggPlan = new AggregatePlan(context, select, statementContext.getCurrentTable(), aggProjector, null,null, OrderBy.EMPTY_ORDER_BY, null, GroupBy.EMPTY_GROUP_BY, null);
        + private void commitBatchWithHTable(HTable table, Region region, List<Mutation> mutations, byte[] indexUUID,
        + long blockingMemstoreSize, byte[] indexMaintainersPtr, byte[] txState) throws IOException {

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/747//testReport/
        Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/747//artifact/patchprocess/patchJavadocWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/747//console

        This message is automatically generated.

        ankit@apache.org Ankit Singhal added a comment -

        Thanks, James Taylor.

        +1 to push now, provided you've verified that, with the RPC scheduler config in place, RS -> RS calls are high priority, and that you've done a verification on a real cluster.

        Yes, this was verified.

        Ping Rajeshbabu Chintaguntla for committing the short-term fix to load the Phoenix RPC scheduler on the server side automatically.

        Committed this to master and the 4.x branches.

        hudson Hudson added a comment -

        FAILURE: Integrated in Jenkins build Phoenix-master #1543 (See https://builds.apache.org/job/Phoenix-master/1543/)
        PHOENIX-3271 Distribute UPSERT SELECT across cluster (ankitsinghal59: rev 39460bc470d77f173d40b17f87a82c259bff5027)

        • (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/PTableImpl.java
        • (edit) phoenix-core/src/it/java/org/apache/phoenix/rpc/PhoenixServerRpcIT.java
        • (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java
        • (edit) phoenix-core/src/it/java/org/apache/phoenix/monitoring/PhoenixMetricsIT.java
        • (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java

        lhofhansl Lars Hofhansl added a comment -

        When we seed an index, it will start with a single region, right? Are we making sure that all the region servers with data for the table do not hammer that one poor region server hosting the initial region, rendering it unusable?
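
        One way to mitigate the single-region hot spot described above is to pre-split or salt the index at creation time so the build fans out across regions immediately; Phoenix's CREATE INDEX grammar allows this (names below are illustrative, and option support should be checked against your version):

        -- Pre-split the index on leading index-key values...
        CREATE INDEX MY_IDX ON MY_TABLE (V1) INCLUDE (V2)
            SPLIT ON ('d', 'm', 't');
        -- ...or salt it so writes spread across SALT_BUCKETS regions.
        CREATE INDEX MY_IDX2 ON MY_TABLE (V1) INCLUDE (V2) SALT_BUCKETS = 16;
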
        lhofhansl Lars Hofhansl added a comment -

        Any comment on this? Ankit Singhal, James Taylor, Rajeshbabu Chintaguntla

        jamestaylor James Taylor added a comment -

        I filed PHOENIX-3738 for any additional testing that needs to be done and PHOENIX-3737 for potentially throttling the initial writes (while we're still writing to only a few regions).

        When an index is built asynchronously, we're essentially already doing this. It'll end up being throttled by the number of mappers that are allowed.

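
        The asynchronous build path referred to above looks roughly like this in practice (schema, table, and path names are placeholders; see the Phoenix secondary-indexing docs for the exact IndexTool arguments in your version):

        -- Declare the index ASYNC so it is created but not populated inline:
        CREATE INDEX ASYNC_IDX ON MY_SCHEMA.MY_TABLE (V1) INCLUDE (V2) ASYNC;

        # Then populate it with the MapReduce IndexTool; mapper parallelism throttles the writes:
        hbase org.apache.phoenix.mapreduce.index.IndexTool \
            --schema MY_SCHEMA --data-table MY_TABLE --index-table ASYNC_IDX \
            --output-path ASYNC_IDX_HFILES
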

          People

          • Assignee: ankit@apache.org Ankit Singhal
          • Reporter: jamestaylor James Taylor
          • Votes: 0
          • Watchers: 11
