Details
Umbrella
Status: Closed
Major
Resolution: Fixed
1.3.0
None
None
Description
This jira is an umbrella for a set of issues around store file accounting on branch-1.3 and branch-1 (I believe).
At this point I believe that many, if not most, of those issues are related to the backport of HBASE-13082 done a long time ago. A number of related issues were identified and fixed previously, but some are still yet to be debugged and fixed. I think that this class of problems prevents us from releasing 1.3.2 and moving the stable pointer to branch-1.3 at this point, so marking as critical.
Below is an overview by Andrew Purtell from the dev list (Subject: Re: Branch 1.4 update):
Let me provide some context.
The root issue was fallout from a locking change introduced just prior to
release of 1.3. That change was HBASE-13082. Lars H proposed a change. It
was committed to trunk but quickly reverted. After the revert Lars decided
to drop the work rather than fix it for reapplication. However, the work
was picked up by others and eventually found its way into branch-1, then
branch-1.3, then 1.3.x releases. There were unintended side effects,
causing bugs. The umbrella issue HBASE-18397 tracks a bunch of fix work the
community has done since. The last known bug fix was HBASE-18771, found and
fixed by our Abhishek. The last known change I know of was work I did on
HBASE-18786 to remove some dodgy exception handling (prefer aborts to
silent data corruption). Is this enough to move the stable pointer?
According to our testing at Salesforce, yes, so far. We have yet to run in
full production. Give us a few months of that and my answer will be
unconditional one way or another. According to some offline conversation
with Mikhail and Gary, the answer is in fact no, they still have one hairy
use case causing occasional problems that look like more of this, but that
feedback predates HBASE-18771.