[PHOENIX-4906] Introduce a coprocessor to handle cases where we can block merge for regions of salted table when it is problematic - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 4.11.0, 4.14.0
Fix Version/s: None
Component/s: None
Labels: None

Description

For a salted table, a query over the entire data set is compiled into a different plan depending on the form of the query, and as a result some queries return incorrect data.

// Actually, the schema of the table I used is different, but please ignore it.
create table if not exists test.test_table (
  rk1 varchar not null,
  rk2 varchar not null,
  column1 varchar
  constraint pk primary key (rk1, rk2)
)
...
SALT_BUCKETS=16...
;

I created a table with 16 salt regions and then wrote a large amount of data.
HBase automatically split the regions, and I then merged regions to balance the data across the region servers.

After that, when you run a query, you can see that a different plan is created depending on the WHERE clause.

query1
select count(*) from test.test_table;

+-------------------------------------------------------------------------------------------------------+-----------------+----------------+
|                                                            PLAN                                       | EST_BYTES_READ  | EST_ROWS_READ  |
+-------------------------------------------------------------------------------------------------------+-----------------+----------------+
| CLIENT 1851-CHUNK 5005959292 ROWS 1944546675532 BYTES PARALLEL 11-WAY FULL SCAN OVER TEST:TEST_TABLE  | 1944546675532   | 5005959292     |
|     SERVER FILTER BY FIRST KEY ONLY                                                                   | 1944546675532   | 5005959292     |
|     SERVER AGGREGATE INTO SINGLE ROW                                                                  | 1944546675532   | 5005959292     |
+-------------------------------------------------------------------------------------------------------+-----------------+----------------+

query2
select count(*) from test.test_table where rk2 = 'aa';

+-------------------------------------------------------------------------------------------------------------------+-----------------+----------------+
|                                                                  PLAN                                             | EST_BYTES_READ  | EST_ROWS_READ  |
+-------------------------------------------------------------------------------------------------------------------+-----------------+----------------+
| CLIENT 1846-CHUNK 4992196444 ROWS 1939177965768 BYTES PARALLEL 11-WAY RANGE SCAN OVER TEST:TEST_TABLE [0] - [15]  | 1939177965768   | 4992196444     |
|     SERVER FILTER BY FIRST KEY ONLY AND RK2 = 'aa'                                                                | 1939177965768   | 4992196444     |
|     SERVER AGGREGATE INTO SINGLE ROW                                                                              | 1939177965768   | 4992196444     |
+-------------------------------------------------------------------------------------------------------------------+-----------------+----------------+

Since rk2, the column used in the WHERE clause of query2, is the second column of the PK, query2 should be a full scan like query1.
However, as you can see, query2 is planned as a range scan, and it generates five fewer chunks than query1 (1846 versus 1851).
I added logging and printed the start key and end key of each Scan object generated by the plan,
and found that 5 chunks were missing from query2.

All five missing chunks were in regions where the originally generated region boundary was not preserved by the merge operation.

After merging regions

The code that caused the problem is this part.

When a select query is executed, the org.apache.phoenix.iterate.BaseResultIterators#getParallelScans method creates a Scan object based on the GuidePost in the statistics table. In the case of a GuidePost that contains a region boundary, it is split into two Scan objects. The code used here is org.apache.phoenix.compile.ScanRanges#intersectScan.

For a salted table, this code compares the keys using the remainder after stripping the salt (prefix) byte.
I cannot be sure whether this behavior is a bug or intended.
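
To illustrate why a comparison on salt-stripped keys can be fragile, here is a toy model. It is not Phoenix's ScanRanges#intersectScan, and the ticket does not confirm that this is the exact failure mechanism. The class and method names are hypothetical, and a one-byte salt prefix is assumed. It shows that a key range whose start and stop keys lie in different salt buckets is ordered correctly on the full keys, but can look empty once the salt byte is removed from both ends.

```java
import java.util.Arrays;

// Toy model only: hypothetical names, NOT Phoenix code.
class SaltStrippedCompare {
    // Unsigned lexicographic comparison of two row keys.
    static int compareKeys(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Drop the leading salt byte (assumption: the salt prefix is one byte).
    static byte[] stripSalt(byte[] key) {
        return Arrays.copyOfRange(key, 1, key.length);
    }

    // True if [start, stop) is non-empty when compared on the full keys.
    static boolean nonEmptyWithSalt(byte[] start, byte[] stop) {
        return compareKeys(start, stop) < 0;
    }

    // True if [start, stop) still looks non-empty after the salt byte
    // has been removed from both ends.
    static boolean nonEmptyWithoutSalt(byte[] start, byte[] stop) {
        return compareKeys(stripSalt(start), stripSalt(stop)) < 0;
    }

    public static void main(String[] args) {
        // A range inside a merged region: starts in bucket 2, ends in bucket 3.
        byte[] start = {2, 'x'};
        byte[] stop = {3, 'b'};
        System.out.println(nonEmptyWithSalt(start, stop));    // prints true
        System.out.println(nonEmptyWithoutSalt(start, stop)); // prints false
    }
}
```

When each region stays inside a single salt bucket, both ends of a range share the same salt byte, so stripping it does not change the ordering.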

In this case I merged the regions manually, but the same situation is likely to occur through HBase's Normalizer function.

I recommend that other users not merge regions manually and not set the table property NORMALIZATION_ENABLED to true in their production clusters. If you have done either, check whether the initial salt region boundaries are still in place. If a boundary has disappeared, your queries may be returning wrong data.
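
The boundary check suggested above might be sketched as follows. This is a minimal sketch under stated assumptions, not a Phoenix or HBase utility. It assumes you have already collected the region start keys of the table (for example from the HBase UI), that the salt prefix is one byte, and that the boundary for bucket i is a start key whose first byte is i and whose remaining bytes, if any, are all zero. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reports salt buckets whose initial boundary is gone.
class SaltBoundaryCheck {
    // Assumption: a bucket boundary is a start key whose first byte is the
    // bucket number and whose remaining bytes (if any) are all zero.
    static boolean isBucketBoundary(byte[] startKey, int bucket) {
        if (startKey.length == 0 || (startKey[0] & 0xFF) != bucket) {
            return false;
        }
        for (int i = 1; i < startKey.length; i++) {
            if (startKey[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // Bucket 0 begins at the empty start key, so only buckets 1..N-1 are checked.
    static List<Integer> missingBoundaries(List<byte[]> regionStartKeys, int saltBuckets) {
        List<Integer> missing = new ArrayList<>();
        for (int bucket = 1; bucket < saltBuckets; bucket++) {
            boolean found = false;
            for (byte[] key : regionStartKeys) {
                if (isBucketBoundary(key, bucket)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(bucket);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up start keys for a 4-bucket table; the boundary for bucket 2 was merged away.
        List<byte[]> startKeys = Arrays.asList(
                new byte[] {}, new byte[] {1}, new byte[] {1, 'm'}, new byte[] {3, 0, 0});
        System.out.println(missingBoundaries(startKeys, 4)); // prints [2]
    }
}
```

An empty result means every initial salt boundary is still a region start key. Any bucket number in the result marks a region that now spans more than one salt bucket.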

Attachments


initial_salting_region.png (86 kB), 17/Sep/18 06:39, JeongMin Ju
merged-region.png (174 kB), 17/Sep/18 06:39, JeongMin Ju
SaltingWithRegionMergeIT.java (4 kB), 17/Nov/21 08:35, Istvan Toth
ScanRanges_intersectScan.png (31 kB), 17/Sep/18 06:02, JeongMin Ju
TestSaltingWithRegionMerge.java (3 kB), 27/Feb/20 22:01, Karthik Palanisamy

Issue Links

relates to

PHOENIX-6586 Set NORMALIZATION_ENABLED to false on salted tables (Resolved)
PHOENIX-6910 Scans created during query compilation and execution against salted tables need to be more resilient (Resolved)
PHOENIX-6587 Handle explicit pre-splits for new salted tables and validate splits when creating salted tables on existing HBase tables (Resolved)
HBASE-27497 Add a note for RegionMerge tool. (Open)

links to

GitHub Pull Request #1552
