Ranger / RANGER-3987

Potential risk of OOM

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: admin
    • Labels: None

    Description

      Every time a plugin component loads policies, the attribute "LastActivationTimeInMillis" is set to System.currentTimeMillis(). See loadPolicy():

      // from PolicyRefresher.java loadPolicy()
      
      //load policy from PolicyAdmin
      ServicePolicies svcPolicies = loadPolicyfromPolicyAdmin();
      
      if (svcPolicies == null) {
         //if Policy fetch from Policy Admin Fails, load from cache
         if (!policiesSetInPlugin) {
            svcPolicies = loadFromCache();
         }
      }
      
      if (PERF_POLICYENGINE_INIT_LOG.isDebugEnabled()) {
         long freeMemory = Runtime.getRuntime().freeMemory();
         long totalMemory = Runtime.getRuntime().totalMemory();
         PERF_POLICYENGINE_INIT_LOG.debug("In-Use memory: " + (totalMemory - freeMemory) + ", Free memory:" + freeMemory);
      }
      
      if (svcPolicies != null) {
         plugIn.setPolicies(svcPolicies);
         policiesSetInPlugin = true;
         serviceDefSetInPlugin = false;
         setLastActivationTimeInMillis(System.currentTimeMillis()); // always updated during each policy loading
         lastKnownVersion = svcPolicies.getPolicyVersion() != null ? svcPolicies.getPolicyVersion() : -1L;
      } else {
         if (!policiesSetInPlugin && !serviceDefSetInPlugin) {
            plugIn.setPolicies(null);
            serviceDefSetInPlugin = true;
         }
      } 

      As a result, the "info" column of table "x_plugin_info" has to be updated after every such load, because it is a JSON string that contains the activation time. See doCreateOrUpdateXXPluginInfo():

      // from AssetMgr, doCreateOrUpdateXXPluginInfo().
      if (lastPolicyActivationTime != null && lastPolicyActivationTime > 0 && (dbObj.getPolicyActivationTime() == null || !dbObj.getPolicyActivationTime().equals(lastPolicyActivationTime))) {
         dbObj.setPolicyActivationTime(lastPolicyActivationTime);
         needsUpdating = true;
      } 
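
      To illustrate why this condition fires so often, here is a minimal, self-contained sketch. Only the boolean condition is taken from the snippet above; the class and method names (ActivationTimeCheck, needsUpdating, countWrites) are hypothetical and are not part of Ranger. It assumes the plugin reports a fresh timestamp on each request, as described above.

```java
// Hypothetical illustration, not Ranger code: only the condition in
// needsUpdating() mirrors the check quoted from doCreateOrUpdateXXPluginInfo().
final class ActivationTimeCheck {

    // Returns true when the stored activation time would be overwritten.
    static boolean needsUpdating(Long storedActivationTime, Long reportedActivationTime) {
        return reportedActivationTime != null
                && reportedActivationTime > 0
                && (storedActivationTime == null
                        || !storedActivationTime.equals(reportedActivationTime));
    }

    // Counts how many reports in a sequence would trigger a write
    // for a single plugin instance.
    static int countWrites(long[] reportedTimes) {
        Long stored = null;
        int writes = 0;
        for (long reported : reportedTimes) {
            if (needsUpdating(stored, reported)) {
                stored = reported; // stands in for dbObj.setPolicyActivationTime(...)
                writes++;
            }
        }
        return writes;
    }
}
```

      With timestamps that change on every report, every report counts as a write; with an unchanged timestamp, only the first one does.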

      doCreateOrUpdateXXPluginInfo() runs as a Runnable submitted to RangerTransactionService (in Ranger 2.3.0 this is RangerTransactionSynchronizationAdapter, though the risk may still be there). See where it is submitted:

      commitWork = new Runnable() {
         @Override
         public void run() {
            doCreateOrUpdateXXPluginInfo(pluginInfo, entityType, isTagVersionResetNeeded, clusterName);
         }
      }; 
      ...
      activityLogger.commitAfterTransactionComplete(commitWork);

      RangerTransactionService uses a ScheduledExecutorService, a thread pool with an unbounded work queue, so any Runnables that the worker threads cannot pick up immediately accumulate in that queue.
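
      The queueing behaviour can be reproduced outside Ranger with the JDK's ScheduledThreadPoolExecutor. The sketch below is an assumption-laden illustration, not the RangerTransactionService implementation: the class name UnboundedQueueDemo, the single worker thread, and the blocked first task (standing in for a slow DB write) are all made up for the demo.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: shows that a ScheduledThreadPoolExecutor never rejects
// work while its worker is busy; the tasks just pile up in its queue.
final class UnboundedQueueDemo {

    static int backlogAfterSubmitting(int taskCount) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        CountDownLatch workerBusy = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy the only worker thread, standing in for a slow DB update.
            executor.execute(() -> {
                workerBusy.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workerBusy.await();

            // Every further task is accepted and held in the work queue.
            for (int i = 0; i < taskCount; i++) {
                executor.schedule(() -> { }, 0, TimeUnit.MILLISECONDS);
            }
            return executor.getQueue().size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        } finally {
            release.countDown();
            executor.shutdownNow();
        }
    }
}
```

      Each queued task holds a reference to its Runnable and whatever that Runnable captures, so the heap cost grows with the backlog.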

      In our case there are 1000+ Hive and HBase instances, and Ranger Admin appears to be under heavy pressure because every instance periodically calls the policy-download API and triggers an update of the table "x_plugin_info". Since the core thread pool appears to be too small and the DB is likely under pressure as well, the work queue keeps growing, fills the JVM heap, and eventually causes an OOM.

      I think adding more core threads would help, but as the system grows this code path will add a lot of overhead. Is there a better solution?
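
      One possible direction, offered only as a sketch and not as an existing Ranger mechanism, is to coalesce pending updates per plugin instance so that memory is bounded by the number of instances rather than by the number of requests. All names below (CoalescingPluginInfoUpdater, record, flush, the String plugin key) are hypothetical, and the writer callback stands in for the real DB update.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical sketch: keeps at most one pending activation time per plugin
// instance and writes the latest value when flush() is called.
final class CoalescingPluginInfoUpdater {

    private final Map<String, Long> pending = new ConcurrentHashMap<>();
    private final BiConsumer<String, Long> writer; // stand-in for the DB update

    CoalescingPluginInfoUpdater(BiConsumer<String, Long> writer) {
        this.writer = writer;
    }

    // Called on each policy download; newer timestamps replace older ones.
    void record(String pluginKey, long activationTime) {
        pending.merge(pluginKey, activationTime, Long::max);
    }

    int pendingCount() {
        return pending.size();
    }

    // Writes one update per plugin instance and returns how many were written.
    // Intended to be called periodically, for example from a single scheduled task.
    int flush() {
        int written = 0;
        for (String key : new ArrayList<>(pending.keySet())) {
            Long value = pending.remove(key);
            if (value != null) {
                writer.accept(key, value);
                written++;
            }
        }
        return written;
    }
}
```

      The trade-off is that the stored activation time lags behind by up to one flush interval, which would need to be acceptable for whoever reads "x_plugin_info".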

          People

            KyrieG KyrieG