Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.4.0, 1.3.1, 2.0.0
- None
- Reviewed
Description
HBaseInterClusterReplicationEndpoint#replicate tries to replicate in batches. It creates N lists, where N is the minimum of the configured number of replicator threads, the number of 100-WALEdit batches, and the number of current sinks. Every pending entry in the replication context is then placed, in order, into one of these N lists, chosen by the hash of its encoded region name. Each of the N lists is then sent all at once in a single replication RPC. We do not check whether the total size of the data in any one list exceeds the RPC size limit. The code presumes each individual edit is reasonably small. Failing to check the aggregate size while assembling the lists into RPCs is an oversight, and it can cause replication to fail when that assumption is violated.
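The partitioning described above can be sketched roughly as follows. This is an illustrative approximation, not the actual HBase code: the class `ReplicationPartitionSketch`, the `Entry` stand-in, and the reading of "number of 100-WALEdit batches" as `entryCount / 100 + 1` are all assumptions made for the example. Note that nothing in it looks at the byte size of a list, which is the gap this issue describes.

```java
import java.util.ArrayList;
import java.util.List;

class ReplicationPartitionSketch {
    // Hypothetical stand-in for a WAL entry; only the fields this sketch needs.
    static final class Entry {
        final String encodedRegionName;
        final long sizeBytes;

        Entry(String encodedRegionName, long sizeBytes) {
            this.encodedRegionName = encodedRegionName;
            this.sizeBytes = sizeBytes;
        }
    }

    // N = min(replicator threads, number of 100-entry batches, current sinks).
    // Assumes maxThreads >= 1 and numSinks >= 1.
    static int numLists(int maxThreads, int entryCount, int numSinks) {
        int batchesOf100 = entryCount / 100 + 1; // assumed interpretation
        return Math.min(Math.min(maxThreads, batchesOf100), numSinks);
    }

    // Place each entry, in arrival order, into the list selected by the
    // hash of its encoded region name. Aggregate size is never checked.
    static List<List<Entry>> partition(List<Entry> entries, int n) {
        List<List<Entry>> lists = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            lists.add(new ArrayList<>());
        }
        for (Entry e : entries) {
            int idx = Math.floorMod(e.encodedRegionName.hashCode(), n);
            lists.get(idx).add(e);
        }
        return lists;
    }
}
```

Because all entries for one region hash to the same list, a single busy region with large edits can make one list far bigger than the others.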
We can fix this by issuing as many replication RPCs as are needed to drain each list, keeping each RPC under the size limit, instead of assuming the whole list will fit in one.
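A minimal sketch of the proposed approach, assuming a hypothetical helper `SizeBoundedBatcher.split` that is not taken from the committed patch: it walks one list in order and starts a new RPC-sized batch whenever adding the next entry would push the running total past a caller-supplied limit. How an entry's size is measured and what the limit is are left to the caller here; an entry that alone exceeds the limit is still sent, in a batch of its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

class SizeBoundedBatcher {
    // Split entries into consecutive batches whose total size stays at or
    // under maxBytes. Entry order is preserved across and within batches.
    static <T> List<List<T>> split(List<T> entries,
                                   ToLongFunction<T> sizeOf,
                                   long maxBytes) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        long currentBytes = 0;
        for (T e : entries) {
            long size = sizeOf.applyAsLong(e);
            // Close the current batch if this entry would overflow it.
            // A non-empty check lets an oversized entry go out alone.
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(e);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each returned batch would then correspond to one replication RPC, so a list is drained with as many calls as its size requires.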
Attachments
Issue Links
- relates to HBASE-18116 Replication source in-memory accounting should not include bulk transfer hfiles (Resolved)