Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-11238

Update the NameNode's Group Cache in the background when possible

    Details

      Description

      This patch addresses an issue where the namenode pauses during group resolution by only allowing a single group resolution query on expiry. There are two scenarios:
      1. When there is not yet a value in the cache, all threads which make a request will block while a single thread fetches the value.
      2. When there is already a value in the cache and it is expired, the new value will be fetched in the background while the old value is used by other threads
      This is handled by guava's cache.

      Negative caching is a feature built into the groups cache, and since guava's caches don't support different expiration times, we have a separate negative cache which masks the guava cache: if an element exists in the negative cache and isn't expired, we return it.

      In total the logic for fetching a group is:
      1. If username exists in static cache, return the value (this was already present)
      2. If username exists in negative cache and negative cache is not expired, raise an exception as usual
      3. Otherwise Defer to guava cache (see two scenarios above)

      Original Issue Below:
      ----------------------------
      Our namenode pauses for 12-60 seconds several times every hour. During these pauses, no new requests can come in.

      Around the time of pauses, we have log messages such as:
      2014-10-22 13:24:22,688 WARN org.apache.hadoop.security.Groups: Potential performance problem: getGroups(user=xxxxx) took 34507 milliseconds.

      The current theory is:
      1. Groups has a cache that is refreshed periodically. Each entry has a cache expiry.
      2. When a cache entry expires, multiple threads can see this expiration and then we have a thundering herd effect where all these threads hit the wire and overwhelm our LDAP servers (we are using ShellBasedUnixGroupsMapping with sssd, how this happens has yet to be established)
      3. group resolution queries begin to take longer, I've observed it taking 1.2 seconds instead of the usual 0.01-0.03 seconds when measuring in the shell `time groups myself`
      4. If there is mutual exclusion somewhere along this path, a 1 second pause could lead to a 60 second pause as all the threads compete for the resource. The exact cause hasn't been established

      Potential solutions include:
      1. Increasing group cache time, which will make the issue less frequent
      2. Rolling evictions of the cache so we prevent the large spike in LDAP queries
      3. Gate the cache refresh so that only one thread is responsible for refreshing the cache

      1. HADOOP-11238.003.patch
        17 kB
        Chris Li
      2. HADOOP-11238.003.patch
        17 kB
        Chris Li
      3. HADOOP-11238.patch
        16 kB
        Chris Li
      4. HADOOP-11238.patch
        15 kB
        Chris Li

        Issue Links

          Activity

          Hide
          chrilisf Chris Li added a comment -

          Uploading patch

          Show
          chrilisf Chris Li added a comment - Uploading patch
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12680662/HADOOP-11238.patch
          against trunk revision eace218.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:

          org.apache.hadoop.security.ssl.TestReloadingX509TrustManager

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5060//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5060//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680662/HADOOP-11238.patch against trunk revision eace218. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.security.ssl.TestReloadingX509TrustManager +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5060//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5060//console This message is automatically generated.
          Hide
          benoyantony Benoy Antony added a comment -

          Chris , Could you please describe the solution ?

          Show
          benoyantony Benoy Antony added a comment - Chris , Could you please describe the solution ?
          Hide
          chrilisf Chris Li added a comment -

          Sure:

          This patch addresses the issue by only allowing a single group resolution query on expiry. There are two scenarios:
          1. When there is not yet a value in the cache, all threads which make a request will block while a single thread fetches the value.
          2. When there is already a value in the cache and it is expired, the new value will be fetched in the background while the old value is used by other threads

          This is handled by guava's cache.

          Negative caching is a feature built into the groups cache, and since guava's caches don't support different expiration times, we have a separate negative cache which masks the guava cache: if an element exists in the negative cache and isn't expired, we return it.

          In total the logic for fetching a group is:

          1. If username exists in static cache, return the value (this was already present)
          2. If username exists in negative cache and negative cache is not expired, raise an exception
          3. Defer to guava cache (see two scenarios above)

          Let me know if you'd like additional details. Also I will investigate that failing test.

          Show
          chrilisf Chris Li added a comment - Sure: This patch addresses the issue by only allowing a single group resolution query on expiry. There are two scenarios: 1. When there is not yet a value in the cache, all threads which make a request will block while a single thread fetches the value. 2. When there is already a value in the cache and it is expired, the new value will be fetched in the background while the old value is used by other threads This is handled by guava's cache. Negative caching is a feature built into the groups cache, and since guava's caches don't support different expiration times, we have a separate negative cache which masks the guava cache: if an element exists in the negative cache and isn't expired, we return it. In total the logic for fetching a group is: 1. If username exists in static cache, return the value (this was already present) 2. If username exists in negative cache and negative cache is not expired, raise an exception 3. Defer to guava cache (see two scenarios above) Let me know if you'd like additional details. Also I will investigate that failing test.
          Hide
          chrilisf Chris Li added a comment -

          Hmm, that test passes on my local machine so it may be another patch causing that failure. I can re-upload the patch later to trigger another run.

          Show
          chrilisf Chris Li added a comment - Hmm, that test passes on my local machine so it may be another patch causing that failure. I can re-upload the patch later to trigger another run.
          Hide
          benoyantony Benoy Antony added a comment -

          Nice way to refresh the cache asynchronously to avoid penalizing few requests.

          In the new implementation, it looks like a user's group will be refreshed periodically forever even if the user accessed Hadoop only once. Is there a way to prevent that ?

          Line #63 is more than 100 lines.
          In line#63 could you keep the type of the variable as Map instead of ConcurrentHashMap ?

          Show
          benoyantony Benoy Antony added a comment - Nice way to refresh the cache asynchronously to avoid penalizing few requests. In the new implementation, it looks like a user's group will be refreshed periodically forever even if the user accessed Hadoop only once. Is there a way to prevent that ? Line #63 is more than 100 lines. In line#63 could you keep the type of the variable as Map instead of ConcurrentHashMap ?
          Hide
          benoyantony Benoy Antony added a comment -

          The test failure seems to be unrelated.

          Show
          benoyantony Benoy Antony added a comment - The test failure seems to be unrelated.
          Hide
          cnauroth Chris Nauroth added a comment -

          This issue might overlap with HADOOP-10077, which also has a patch available.

          Show
          cnauroth Chris Nauroth added a comment - This issue might overlap with HADOOP-10077 , which also has a patch available.
          Hide
          jnp Jitendra Nath Pandey added a comment -

          [~chris li] Do you think HADOOP-10077 complete addresses this one? Ideally one of them should be marked as duplicate.

          Show
          jnp Jitendra Nath Pandey added a comment - [~chris li] Do you think HADOOP-10077 complete addresses this one? Ideally one of them should be marked as duplicate.
          Hide
          cmccabe Colin P. McCabe added a comment -

          Chris Li, thanks for working on this. Can you update the JIRA title and description to better describe the patch here? It seems like the approach here is the same as on HADOOP-10077... refreshing the groups cache in the background. Normally, the description and title of a JIRA should refer to the patch. It is nice to have detailed information about exactly what happened on a particular cluster, but we generally put this in the first comment. That way, it is easier to browse through JIRAs to see what each JIRA implemented.

          The patch on HADOOP-10077 would need to be updated to handle negative caching and warningDeltaMs, which didn't exist back when I wrote it. So I feel that the patch on this JIRA is more fresh. If this JIRA gets committed, we should definitely mark HADOOP-10077 as a duplicate of it.

          +    this.cache = CacheBuilder.newBuilder()
          +      .refreshAfterWrite(cacheTimeout, TimeUnit.MILLISECONDS)
          +      .ticker(new TimerToTickerAdapter(timer))
          +      .build(new GroupCacheLoader());
          

          Do we need to set the ticker here? It seems that the default ticker uses System.nanoTime, which is the same monotonic time source that Hadoop's Timer uses.

          Can we add a comment to GroupCacheLoader explaining that this code runs in the background if the cache entry is expired and there is an existing cache entry? I don't think that would be completely clear to someone reading this for the first time. And perhaps a similar comment to Groups#getGroups explaining that the access may block, but only if there is no existing entry.

          Benoy wrote: In the new implementation, it looks like a user's group will be refreshed periodically forever even if the user accessed Hadoop only once. Is there a way to prevent that ?

          This is a good point. The code currently never shrinks the cache, only grows it. Why don't we use Guava's expireAfterWrite option to remove entries from the cache after a certain timeout? I guess this could be a separate configuration option, or we could just use 10 * cacheTimeout.

          Show
          cmccabe Colin P. McCabe added a comment - Chris Li , thanks for working on this. Can you update the JIRA title and description to better describe the patch here? It seems like the approach here is the same as on HADOOP-10077 ... refreshing the groups cache in the background. Normally, the description and title of a JIRA should refer to the patch. It is nice to have detailed information about exactly what happened on a particular cluster, but we generally put this in the first comment. That way, it is easier to browse through JIRAs to see what each JIRA implemented. The patch on HADOOP-10077 would need to be updated to handle negative caching and warningDeltaMs, which didn't exist back when I wrote it. So I feel that the patch on this JIRA is more fresh. If this JIRA gets committed, we should definitely mark HADOOP-10077 as a duplicate of it. + this .cache = CacheBuilder.newBuilder() + .refreshAfterWrite(cacheTimeout, TimeUnit.MILLISECONDS) + .ticker( new TimerToTickerAdapter(timer)) + .build( new GroupCacheLoader()); Do we need to set the ticker here? It seems that the default ticker uses System.nanoTime , which is the same monotonic time source that Hadoop's Timer uses. Can we add a comment to GroupCacheLoader explaining that this code runs in the background if the cache entry is expired and there is an existing cache entry? I don't think that would be completely clear to someone reading this for the first time. And perhaps a similar comment to Groups#getGroups explaining that the access may block, but only if there is no existing entry. Benoy wrote: In the new implementation, it looks like a user's group will be refreshed periodically forever even if the user accessed Hadoop only once. Is there a way to prevent that ? This is a good point. The code currently never shrinks the cache, only grows it. Why don't we use Guava's expireAfterWrite option to remove entries from the cache after a certain timeout? I guess this could be a separate configuration option, or we could just use 10 * cacheTimeout.
          Hide
          benoyantony Benoy Antony added a comment -

          In line#63 could you keep the type of the variable as Map instead of ConcurrentHashMap ?

          Just noted that the above recommendation cannot be carried out as the code uses ConcurrentHashMap.remove(key, value). You can ignore this recommendation.

          Show
          benoyantony Benoy Antony added a comment - In line#63 could you keep the type of the variable as Map instead of ConcurrentHashMap ? Just noted that the above recommendation cannot be carried out as the code uses ConcurrentHashMap.remove(key, value) . You can ignore this recommendation.
          Hide
          chrilisf Chris Li added a comment -

          Benoy Antony I trimmed the length on line 63

          Hi Colin P. McCabe, made those changes to the jira ticket, thanks for the advice.

          Do we need to set the ticker here? It seems that the default ticker uses System.nanoTime, which is the same monotonic time source that Hadoop's Timer uses.

          It's currently like this because the test case uses dependency injection to test timing. We could use a fake guava timer but I wanted to avoid tight coupling between hadoop and the guava library. Let me know what you think.

          Why don't we use Guava's expireAfterWrite option to remove entries from the cache after a certain timeout? I guess this could be a separate configuration option, or we could just use 10 * cacheTimeout.

          Good idea. I added this using 10*cacheTimeout, and added some comments

          Show
          chrilisf Chris Li added a comment - Benoy Antony I trimmed the length on line 63 Hi Colin P. McCabe , made those changes to the jira ticket, thanks for the advice. Do we need to set the ticker here? It seems that the default ticker uses System.nanoTime, which is the same monotonic time source that Hadoop's Timer uses. It's currently like this because the test case uses dependency injection to test timing. We could use a fake guava timer but I wanted to avoid tight coupling between hadoop and the guava library. Let me know what you think. Why don't we use Guava's expireAfterWrite option to remove entries from the cache after a certain timeout? I guess this could be a separate configuration option, or we could just use 10 * cacheTimeout. Good idea. I added this using 10*cacheTimeout, and added some comments
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12685414/HADOOP-11238.patch
          against trunk revision 3c72f54.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5167//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5167//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12685414/HADOOP-11238.patch against trunk revision 3c72f54. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5167//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5167//console This message is automatically generated.
          Hide
          benoyantony Benoy Antony added a comment -

          Reviewed the latest patch. Looks good.

          I have one issue with refresh().
          It invalidates the cache, but it doesn't clear the negative-cache. I believe, we should clear the negative cache in refresh().
          Though the issue is not introduced by this patch, it is better to fix it.
          Could you please fix it and add a test case to test it ?

          Show
          benoyantony Benoy Antony added a comment - Reviewed the latest patch. Looks good. I have one issue with refresh(). It invalidates the cache, but it doesn't clear the negative-cache. I believe, we should clear the negative cache in refresh(). Though the issue is not introduced by this patch, it is better to fix it. Could you please fix it and add a test case to test it ?
          Hide
          cmccabe Colin P. McCabe added a comment -

          It's currently like this because the test case uses dependency injection to test timing. We could use a fake guava timer but I wanted to avoid tight coupling between hadoop and the guava library. Let me know what you think.

          Sure, seems reasonable.

          Good idea. I added [expireAfterWrite] using 10*cacheTimeout, and added some comments

          Great.

          201	     * This method will block if a cache entry doesn't exist, and
          202	     * any subsequent requests for the user will wait on the first
          203	     * request to return. If a user already exists in the cache,
          204	     * this will be run in the background.
          

          Maybe say "any subsequent requests for the user will wait on this request to return" to make it slightly clearer.

          Small nit: can you give different names to your different patches? There was a discussion about this on hadoop-common-dev recently, the consensus is that putting numbers on each one makes it clearer which is the latest (yes, we know JIRA attaches a date as well

          I renamed the jira to "Update the NameNode's Group Cache in the background when possible" to better reflect the current patch (i.e. talked about the solution, as well as the problem)

          Thanks again.

          Show
          cmccabe Colin P. McCabe added a comment - It's currently like this because the test case uses dependency injection to test timing. We could use a fake guava timer but I wanted to avoid tight coupling between hadoop and the guava library. Let me know what you think. Sure, seems reasonable. Good idea. I added [expireAfterWrite] using 10*cacheTimeout, and added some comments Great. 201 * This method will block if a cache entry doesn't exist, and 202 * any subsequent requests for the user will wait on the first 203 * request to return . If a user already exists in the cache, 204 * this will be run in the background. Maybe say "any subsequent requests for the user will wait on this request to return" to make it slightly clearer. Small nit: can you give different names to your different patches? There was a discussion about this on hadoop-common-dev recently, the consensus is that putting numbers on each one makes it clearer which is the latest (yes, we know JIRA attaches a date as well I renamed the jira to "Update the NameNode's Group Cache in the background when possible" to better reflect the current patch (i.e. talked about the solution, as well as the problem) Thanks again.
          Hide
          chrilisf Chris Li added a comment -

          Benoy Antony good catch, I fixed that in this next patch.

          Colin P. McCabe Updated comment in this next patch. Issue rename looks good too. Also thanks for letting me know about the patch numbering, I'll do that from now on!

          Show
          chrilisf Chris Li added a comment - Benoy Antony good catch, I fixed that in this next patch. Colin P. McCabe Updated comment in this next patch. Issue rename looks good too. Also thanks for letting me know about the patch numbering, I'll do that from now on!
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12686121/HADOOP-11238.003.patch
          against trunk revision 03867eb.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 111 new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5216//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/5216//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5216//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12686121/HADOOP-11238.003.patch against trunk revision 03867eb. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 111 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5216//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/5216//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5216//console This message is automatically generated.
          Hide
          benoyantony Benoy Antony added a comment -

          Chris Li, I don't think the "findbugs warnings" are introduced by this patch. Could you please re-attach the patch to trigger a new build to make sure that the findbugs issues are gone ?

          +1 on the patch itself.

          Show
          benoyantony Benoy Antony added a comment - Chris Li , I don't think the "findbugs warnings" are introduced by this patch. Could you please re-attach the patch to trigger a new build to make sure that the findbugs issues are gone ? +1 on the patch itself.
          Hide
          chrilisf Chris Li added a comment -

          Re-attaching patch to re-trigger build (no changes were made)

          Show
          chrilisf Chris Li added a comment - Re-attaching patch to re-trigger build (no changes were made)
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12686935/HADOOP-11238.003.patch
          against trunk revision 46612c7.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 3 new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5258//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/5258//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5258//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12686935/HADOOP-11238.003.patch against trunk revision 46612c7. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 3 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5258//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/5258//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5258//console This message is automatically generated.
          Hide
          cmccabe Colin P. McCabe added a comment -

          One thing that I notice here is that negative cache entries will never be removed unless the user is accessed again. So if I try to get the groups for "bozo" and then never access that user again, there will always be an entry in the hashmap for bozo ever after.... unless someone does a refresh.

          Anyway, that's a pre-existing problem, and the rest of the patch looks good.

          +1. Thanks, Chris.

          Show
          cmccabe Colin P. McCabe added a comment - One thing that I notice here is that negative cache entries will never be removed unless the user is accessed again. So if I try to get the groups for "bozo" and then never access that user again, there will always be an entry in the hashmap for bozo ever after.... unless someone does a refresh. Anyway, that's a pre-existing problem, and the rest of the patch looks good. +1. Thanks, Chris.
          Hide
          cmccabe Colin P. McCabe added a comment -

          committed to 2.7, thanks!

          Show
          cmccabe Colin P. McCabe added a comment - committed to 2.7, thanks!
          Hide
          cmccabe Colin P. McCabe added a comment -

          Filed follow-up for clearing negative cache entries for never-again-accessed users

          Show
          cmccabe Colin P. McCabe added a comment - Filed follow-up for clearing negative cache entries for never-again-accessed users
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6710 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6710/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #6710 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6710/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #774 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/774/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #774 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/774/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #40 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/40/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #40 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/40/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1971 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1971/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #1971 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1971/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #37 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/37/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #37 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/37/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #41 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/41/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #41 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/41/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1991 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1991/)
          HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1991 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1991/ ) HADOOP-11238 . Update the NameNode's Group Cache in the background when possible (Chris Li via Colin P. McCabe) (cmccabe: rev e5a692519956aefb3a540ed0137b63cf598ac10d) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestGroupsCaching.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Pulled this into 2.6.1 after Akira Ajisaka verified that the patch applies cleanly. Ran compilation and TestGroupsCaching before the push.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Pulled this into 2.6.1 after Akira Ajisaka verified that the patch applies cleanly. Ran compilation and TestGroupsCaching before the push.

            People

            • Assignee:
              chrilisf Chris Li
              Reporter:
              chrilisf Chris Li
            • Votes:
              0 Vote for this issue
              Watchers:
              20 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development