[PHOENIX-4010] Hash Join cache may not be sent to all regionservers when we have stale HBase meta cache - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 4.12.0
Component/s: None
Labels: None

Description

If region locations change and the client's HBase meta cache is not refreshed, the hash join cache might not be sent to every region server that hosts the table's regions.
ConnectionQueryServicesImpl#getAllTableRegions

boolean reload = false;
while (true) {
    try {
        // We could surface the package projected HConnectionImplementation.getNumberOfCachedRegionLocations
        // to get the sizing info we need, but this would require a new class in the same package and a cast
        // to this implementation class, so it's probably not worth it.
        List<HRegionLocation> locations = Lists.newArrayList();
        byte[] currentKey = HConstants.EMPTY_START_ROW;
        do {
            HRegionLocation regionLocation = connection.getRegionLocation(
                    TableName.valueOf(tableName), currentKey, reload);
            locations.add(regionLocation);
            currentKey = regionLocation.getRegionInfo().getEndKey();
        } while (!Bytes.equals(currentKey, HConstants.EMPTY_END_ROW));
        return locations;
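
The effect of the `reload` flag in the loop above can be simulated without an HBase cluster. The following is a minimal sketch, not Phoenix or HBase code: `RegionWalkSketch`, `Location`, and the two maps standing in for the authoritative meta table and the client-side cache are hypothetical names introduced for illustration. It walks regions from the empty start row by following each region's end key, and shows that with `reload == false` the walk returns whatever the cache holds, even after a region has moved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the region walk; not Phoenix or HBase code.
class RegionWalkSketch {
    static final String EMPTY_ROW = ""; // models EMPTY_START_ROW / EMPTY_END_ROW

    static final class Location {
        final String startKey, endKey, server;
        Location(String startKey, String endKey, String server) {
            this.startKey = startKey; this.endKey = endKey; this.server = server;
        }
    }

    // Both maps are keyed by region start key. The walk only ever asks for
    // region start keys, so an exact-match lookup is enough for this model.
    final Map<String, Location> meta = new HashMap<>();        // authoritative
    final Map<String, Location> clientCache = new HashMap<>(); // may be stale

    // Models connection.getRegionLocation(table, key, reload).
    Location getRegionLocation(String key, boolean reload) {
        if (reload || !clientCache.containsKey(key)) {
            clientCache.put(key, meta.get(key)); // refresh from the authoritative map
        }
        return clientCache.get(key);
    }

    // Same shape as the loop in the excerpt: follow end keys until the empty end row.
    List<Location> getAllTableRegions(boolean reload) {
        List<Location> locations = new ArrayList<>();
        String currentKey = EMPTY_ROW;
        do {
            Location loc = getRegionLocation(currentKey, reload);
            locations.add(loc);
            currentKey = loc.endKey;
        } while (!currentKey.equals(EMPTY_ROW));
        return locations;
    }

    void assign(String startKey, String endKey, String server) {
        meta.put(startKey, new Location(startKey, endKey, server));
    }

    static List<String> servers(List<Location> locations) {
        List<String> out = new ArrayList<>();
        for (Location l : locations) out.add(l.server);
        return out;
    }

    // R1 = ["", "m") and R2 = ["m", "") start on RS1; then R2 moves to RS2.
    static RegionWalkSketch scenario() {
        RegionWalkSketch s = new RegionWalkSketch();
        s.assign("", "m", "RS1");
        s.assign("m", "", "RS1");
        s.getAllTableRegions(false); // warms the client cache
        s.assign("m", "", "RS2");    // R2 moves; the cache is not told
        return s;
    }

    public static void main(String[] args) {
        RegionWalkSketch s = scenario();
        System.out.println(servers(s.getAllTableRegions(false))); // [RS1, RS1]
        System.out.println(servers(s.getAllTableRegions(true)));  // [RS1, RS2]
    }
}
```

In this model the first call returns the cached view, in which both regions still appear on RS1; only the call with `reload` set to `true` sees R2 on RS2.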

Skipping duplicate servers in ServerCacheClient#addServerCache

List<HRegionLocation> locations = services.getAllTableRegions(cacheUsingTable.getPhysicalName().getBytes());
int nRegions = locations.size();

.....

if (!servers.contains(entry) &&
        keyRanges.intersectRegion(regionStartKey, regionEndKey,
                cacheUsingTable.getIndexType() == IndexType.LOCAL)) {
    // Call RPC once per server
    servers.add(entry);

For example: table 'T' has two regions, R1 and R2, both originally hosted on region server RS1.

While the Phoenix/HBase connection is still active, R2 moves to RS2, but the stale meta cache still returns the old region locations, i.e. R1 and R2 on RS1. When we start copying the hash table, we send it for R1 and skip R2 because both regions appear to be hosted on the same region server. The query on the table then fails because RS2 cannot find the hash table cache it needs to process region R2.
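
The scenario above can be reproduced in a few lines. This is a minimal sketch under stated assumptions, not the `ServerCacheClient` code: `StaleCacheSketch` and its methods are hypothetical names, servers are modeled as plain strings, and the key-range intersection check from the excerpt is left out so that only the "one RPC per server" deduplication remains.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "one RPC per server"; not the ServerCacheClient code.
class StaleCacheSketch {
    // Returns the distinct servers that would receive the hash join cache,
    // given the server the client believes hosts each region.
    static Set<String> serversReceivingCache(List<String> believedServerPerRegion) {
        Set<String> servers = new LinkedHashSet<>();
        for (String server : believedServerPerRegion) {
            if (!servers.contains(server)) { // call RPC once per server
                servers.add(server);
            }
        }
        return servers;
    }

    // Servers that actually host a region but never received the cache.
    static Set<String> serversMissingCache(List<String> believed, List<String> actual) {
        Set<String> missing = new LinkedHashSet<>(actual);
        missing.removeAll(serversReceivingCache(believed));
        return missing;
    }

    public static void main(String[] args) {
        List<String> stale  = Arrays.asList("RS1", "RS1"); // cached view: R1 and R2 on RS1
        List<String> actual = Arrays.asList("RS1", "RS2"); // R2 has moved to RS2
        System.out.println(serversReceivingCache(stale));       // [RS1]
        System.out.println(serversMissingCache(stale, actual)); // [RS2]
    }
}
```

With the stale view, only RS1 receives the cache, so any scan routed to R2 on RS2 finds no hash table there. With an up-to-date view the same deduplication would send the cache to both servers.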

Attachments

PHOENIX-4010.patch (11/Jul/17 15:44, 39 kB, Ankit Singhal)
PHOENIX-4010.addendum.patch (19/Mar/18 16:00, 1 kB, Csaba Skrabak)
PHOENIX-4010_v2.patch (17/Jul/17 09:46, 86 kB, Ankit Singhal)
PHOENIX-4010_v2_rebased.patch (18/Jul/17 06:50, 87 kB, Ankit Singhal)
PHOENIX-4010_v2_rebased_1.patch (18/Jul/17 09:51, 87 kB, Ankit Singhal)
PHOENIX-4010_v1.patch (13/Jul/17 17:19, 52 kB, Ankit Singhal)

Issue Links

causes: PHOENIX-4662 NullPointerException in TableResultIterator.java on cache resend (Resolved)

links to: GitHub Pull Request #268

People

Assignee: Ankit Singhal

Reporter: Ankit Singhal

Votes: 0

Watchers: 10

Dates

Created: 11/Jul/17 14:27

Updated: 01/Aug/23 14:26

Resolved: 20/Mar/18 04:41