Details
Description
The SnapshotFileCache depends on last modified time to determine whether to update the Snapshot HFile cache. However, in S3, real 'folders' don't exist. S3 filesystems create a dummy file in place of a folder, but the dummy file last modified time is not updated when files are changed 'under' it. This means that the SnapshotFileCache doesn't pick up new snapshot HFiles and these files aren't removed from the HFileCleaner and can be eligible for deletion.
My patch removes the lastmodified assumption.
Attachments
Attachments
- HBASE-21070.master.001.patch
- 7 kB
- Zach York
- HBASE-21070.master.002.patch
- 7 kB
- Zach York
- HBASE-21070.master.003.patch
- 14 kB
- Zach York
Issue Links
- is related to
-
HBASE-20429 Support for mixed or write-heavy workloads on non-HDFS filesystems
- Resolved
- relates to
-
HBASE-22190 SnapshotFileCache may fail to load the correct snapshot file list when there is an on-going snapshot operation
- Resolved
- links to
Activity
patch 2 adds additionally logging suggested by stack. Responded to questions up in RB.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 10s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
master Compile Tests | |||
+1 | mvninstall | 3m 42s | master passed |
+1 | compile | 1m 42s | master passed |
+1 | checkstyle | 1m 9s | master passed |
+1 | shadedjars | 4m 0s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 1m 55s | master passed |
+1 | javadoc | 0m 28s | master passed |
Patch Compile Tests | |||
+1 | mvninstall | 3m 32s | the patch passed |
+1 | compile | 1m 43s | the patch passed |
+1 | javac | 1m 43s | the patch passed |
+1 | checkstyle | 1m 6s | hbase-server: The patch generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | shadedjars | 4m 8s | patch has no errors when building our shaded downstream artifacts. |
+1 | hadoopcheck | 7m 26s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
+1 | findbugs | 2m 3s | the patch passed |
+1 | javadoc | 0m 29s | the patch passed |
Other Tests | |||
-1 | unit | 106m 9s | hbase-server in the patch failed. |
+1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
140m 24s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hbase.master.procedure.TestDisableTableProcedure |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12937337/HBASE-21070.master.002.patch |
Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
uname | Linux 44fa2998392d 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
Build tool | maven |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
git revision | master / 5089256529 |
maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
Default Java | 1.8.0_181 |
findbugs | v3.1.0-RC3 |
unit | https://builds.apache.org/job/PreCommit-HBASE-Build/14229/artifact/patchprocess/patch-unit-hbase-server.txt |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14229/testReport/ |
Max. process+thread count | 4513 (vs. ulimit of 10000) |
modules | C: hbase-server U: hbase-server |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14229/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
The patch looks good to me overall to remove the dependency on last modified time of top snapshort dir, which is not necessarily updated per Hadoop FileSystem contract.
The readLock is not used and can be removed. If all access to the cache and snapshots are accessed via synchronized, we don't need the volatile keyword here. triggerCacheRefreshForTesting() can be synchronized I think.
Good catch on readLock, missed removing that.
The volatile keyword is still needed because even if these methods are synchronized, a different thread might have an old reference (which I believe is what happened in my testing).
Let me think again on a UT... I couldn't find a good way to test this functionality, but maybe I can extend FileSystem to make a mock filesystem that can behave how I want it.
Thanks for the review liuml07!
Thanks zyork. If all access methods are synchronized, the cache and snapshots will be thread safe. No stale reference will be seen. Currently only one method SnapshotFileCache::SnapshotFileCache is not synchronized which I think should be.
I understand the difficulty of testing this. I'll post any idea I get.
Sorry for being a little absent on this over the past couple days. I found a bug in the original test code https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/master/snapshot/TestSnapshotFileCache.java#L261 is calling contains for a set of FileStatus's and trying to find a String which will never return true... I've been trying to wrestle to get the tests to perform as expected, but it has taken more time. I should probably fix it in a follow-up JIRA.
I was basically working on fixing the test as https://github.com/apache/hbase/commit/1b7e4fdcfc69a99d114910cd77a617f449eb33b1 did. I can now make progress on a UT.
liuml07 Sorry for the long, long silence. I got busy with other things.
If you're still interested in reviewing, I have a new patch that contains a test (I've confirmed it fails without this change).
I myself has been working on something else in public cloud. I refreshed my understanding and the patch makes sense to me. I'll review this week. Thanks zyork!
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 13s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
master Compile Tests | |||
+1 | mvninstall | 5m 29s | master passed |
+1 | compile | 2m 15s | master passed |
+1 | checkstyle | 1m 24s | master passed |
+1 | shadedjars | 5m 19s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 2m 40s | master passed |
+1 | javadoc | 0m 36s | master passed |
Patch Compile Tests | |||
+1 | mvninstall | 5m 17s | the patch passed |
+1 | compile | 2m 25s | the patch passed |
+1 | javac | 2m 25s | the patch passed |
-1 | checkstyle | 1m 23s | hbase-server: The patch generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | shadedjars | 5m 12s | patch has no errors when building our shaded downstream artifacts. |
+1 | hadoopcheck | 11m 2s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
+1 | findbugs | 2m 24s | the patch passed |
+1 | javadoc | 0m 30s | the patch passed |
Other Tests | |||
+1 | unit | 128m 19s | hbase-server in the patch passed. |
+1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
175m 17s |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12955123/HBASE-21070.master.003.patch |
Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
uname | Linux 4dcefddaef48 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 GNU/Linux |
Build tool | maven |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
git revision | master / 594341d6fe |
maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
Default Java | 1.8.0_181 |
findbugs | v3.1.0-RC3 |
checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/15602/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/15602/testReport/ |
Max. process+thread count | 5153 (vs. ulimit of 10000) |
modules | C: hbase-server U: hbase-server |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15602/console |
Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
This message was automatically generated.
- The UT is simple but makes sure we don't rely on the last modified time. It makes sense to me.
- I think the synchronized keyword for refreshCache method will be enough to establish happens-before relationship to other threads calling the same method. Refer to this page. So the content and the reference of cache and snapshots will be cool across threads. No volatile is needed if we interpret the semantics the same. By the way, the tests pass consistently (>10 times) at my dev machine without "volatile" for cache or snapshots.
LOG.debug("No snapshots on-disk under: {}, cache empty", snapshotDir);
Nit: I appreciate the good intention of using LOG.isDebugEnabled() to further guard the unnecessary cost of generating logs, but I think it's not required if we use placeholder. But it harms nothing of course.
Thanks!
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 16s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
master Compile Tests | |||
+1 | mvninstall | 4m 38s | master passed |
+1 | compile | 2m 6s | master passed |
+1 | checkstyle | 1m 14s | master passed |
+1 | shadedjars | 4m 34s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 2m 42s | master passed |
+1 | javadoc | 0m 35s | master passed |
Patch Compile Tests | |||
+1 | mvninstall | 4m 11s | the patch passed |
+1 | compile | 2m 4s | the patch passed |
+1 | javac | 2m 4s | the patch passed |
-1 | checkstyle | 1m 17s | hbase-server: The patch generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | shadedjars | 4m 39s | patch has no errors when building our shaded downstream artifacts. |
+1 | hadoopcheck | 9m 24s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
+1 | findbugs | 2m 44s | the patch passed |
+1 | javadoc | 0m 33s | the patch passed |
Other Tests | |||
-1 | unit | 135m 32s | hbase-server in the patch failed. |
+1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
177m 25s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hbase.io.hfile.bucket.TestBucketCache |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12955123/HBASE-21070.master.003.patch |
Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
uname | Linux 634cc7d0861e 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57 UTC 2018 x86_64 GNU/Linux |
Build tool | maven |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
git revision | master / 63d0e6ed4a |
maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
Default Java | 1.8.0_181 |
findbugs | v3.1.11 |
checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/16277/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
unit | https://builds.apache.org/job/PreCommit-HBASE-Build/16277/artifact/patchprocess/patch-unit-hbase-server.txt |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/16277/testReport/ |
Max. process+thread count | 5270 (vs. ulimit of 10000) |
modules | C: hbase-server U: hbase-server |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/16277/console |
Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
This message was automatically generated.
Sorry apurtell. I lost this issue for a bit while I had switched priorities. I opened https://github.com/apache/hbase/pull/209 with the change and removed the volatile liuml07
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 45s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
master Compile Tests | |||
+1 | mvninstall | 4m 36s | master passed |
+1 | compile | 0m 50s | master passed |
+1 | checkstyle | 1m 10s | master passed |
+1 | shadedjars | 4m 24s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 3m 42s | master passed |
+1 | javadoc | 0m 31s | master passed |
Patch Compile Tests | |||
+1 | mvninstall | 3m 56s | the patch passed |
+1 | compile | 0m 49s | the patch passed |
+1 | javac | 0m 49s | the patch passed |
-1 | checkstyle | 1m 6s | hbase-server: The patch generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | shadedjars | 4m 23s | patch has no errors when building our shaded downstream artifacts. |
+1 | hadoopcheck | 8m 1s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
+1 | findbugs | 3m 3s | the patch passed |
+1 | javadoc | 0m 30s | the patch passed |
Other Tests | |||
-1 | unit | 257m 17s | hbase-server in the patch failed. |
+1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
296m 22s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hbase.master.TestAssignmentManagerMetrics |
hadoop.hbase.quotas.TestSpaceQuotas |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/PreCommit-HBASE-Build/222/artifact/patchprocess/Dockerfile |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12955123/HBASE-21070.master.003.patch |
Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
uname | Linux 4ff491369cb6 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
Build tool | maven |
Personality | dev-support/hbase-personality.sh |
git revision | master / 70296a2e78 |
maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
Default Java | 1.8.0_181 |
findbugs | v3.1.11 |
checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/222/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
unit | https://builds.apache.org/job/PreCommit-HBASE-Build/222/artifact/patchprocess/patch-unit-hbase-server.txt |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/222/testReport/ |
Max. process+thread count | 5290 (vs. ulimit of 10000) |
modules | C: hbase-server U: hbase-server |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/222/console |
Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
This message was automatically generated.
Pushed to master, branch-1, and branch-2. Shout if it needs to go anywhere else.
Results for branch branch-1
build #808 on builds.a.o: -1 overall
details (if available):
-1 general checks
– For more information see general report
-1 jdk7 checks
– For more information see jdk7 report
-1 jdk8 hadoop2 checks
– For more information see jdk8 (hadoop2) report
+1 source release artifact
– See build output for details.
Results for branch branch-2
build #1872 on builds.a.o: -1 overall
details (if available):
-1 general checks
– For more information see general report
+1 jdk8 hadoop2 checks
– For more information see jdk8 (hadoop2) report
-1 jdk8 hadoop3 checks
– For more information see jdk8 (hadoop3) report
+1 source release artifact
– See build output for details.
+1 client integration test
Results for branch master
build #987 on builds.a.o: -1 overall
details (if available):
-1 general checks
– For more information see general report
+1 jdk8 hadoop2 checks
– For more information see jdk8 (hadoop2) report
-1 jdk8 hadoop3 checks
– For more information see jdk8 (hadoop3) report
+1 source release artifact
– See build output for details.
+1 client integration test
busbey So this commit is just for adding the test, I would assume the release note should be on HBASE-22190.
HBASE-21070This message was automatically generated.