Sentry (Retired) / SENTRY-1907

Potential memory optimization when handling big full snapshots.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: Sentry
    • Labels: None

    Description

      PathImageRetriever.retrieveFullImage() has the following code:

            for (Map.Entry<String, Set<String>> pathEnt : pathImage.entrySet()) {
              TPathChanges pathChange = pathsUpdate.newPathChange(pathEnt.getKey());
      
              for (String path : pathEnt.getValue()) {
                pathChange.addToAddPaths(Lists.newArrayList(Splitter.on("/").split(path))); // here: a new String is allocated for every component
              }
            }
      

      We convert each path into a list of its components, so /a/b/c becomes {a, b, c}. Many of these components repeat across paths, so after splitting we should intern each component before adding it.
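
      The idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached patch: PathInterning and splitAndIntern are hypothetical names, String.split stands in for Guava's Splitter to keep the example dependency-free, and empty components are dropped so that /a/b/c yields {a, b, c} as in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration; not part of Sentry.
final class PathInterning {

    // Splits a path on "/" and interns every component, so that equal
    // components from different paths share a single String instance.
    static List<String> splitAndIntern(String path) {
        List<String> components = new ArrayList<>();
        for (String component : path.split("/")) {
            // Assumption: skip empty components (e.g. from a leading "/").
            if (!component.isEmpty()) {
                components.add(component.intern());
            }
        }
        return components;
    }

    public static void main(String[] args) {
        List<String> first = splitAndIntern("/a/b/c");
        List<String> second = splitAndIntern("/a/b/d");
        // Reference comparison: true only because both were interned.
        System.out.println(first.get(1) == second.get(1));
    }
}
```

      In the loop from retrieveFullImage(), the result of such a helper would be passed to pathChange.addToAddPaths(...) in place of the freshly split list. String.intern() keeps one canonical copy of each distinct value in the JVM string pool, so the memory cost of a repeated component drops to one reference per occurrence.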

      This was found by code inspection and confirmed by a jxray analysis (thanks misha@cloudera.com), which shows that 61% of memory is used by duplicate strings and reports the following reference chains and GC root stack trace:

      4. REFERENCE CHAINS WITH HIGH RETAINED MEMORY (MAY SIGNAL MEMORY LEAK)
      
       ---- Object tree for GC root(s) Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate) ----
      
        4,159,037K (33.4%) (1 of org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
           <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
        4,135,376K (33.3%) (4897951 of j.u.ArrayList)
           <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathChanges.addPaths <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathsUpdate.pathChanges <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
        3,652,177K (29.4%) (52086231 objects)
           <-- {j.u.ArrayList} <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathChanges.addPaths <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathsUpdate.pathChanges <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
        GC root stack trace:
          org.apache.sentry.hdfs.service.thrift.TPathsUpdate$TPathsUpdateStandardScheme.write(TPathsUpdate.java:754)
          org.apache.sentry.hdfs.service.thrift.TPathsUpdate$TPathsUpdateStandardScheme.write(TPathsUpdate.java:671)
          org.apache.sentry.hdfs.service.thrift.TPathsUpdate.write(TPathsUpdate.java:584)
          org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse$TAuthzUpdateResponseStandardScheme.write(TAuthzUpdateResponse.java:505)
          org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse$TAuthzUpdateResponseStandardScheme.write(TAuthzUpdateResponse.java:435)
          org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse.write(TAuthzUpdateResponse.java:377)
          org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result$get_authz_updates_resultStandardScheme.write(SentryHDFSService.java:3608)
          org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result$get_authz_updates_resultStandardScheme.write(SentryHDFSService.java:3572)
          org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result.write(SentryHDFSService.java:3523)
          org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
          org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
          org.apache.sentry.hdfs.SentryHDFSServiceProcessorFactory$ProcessorWrapper.process(SentryHDFSServiceProcessorFactory.java:47)
          org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
          org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
          java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          java.lang.Thread.run(Thread.java:745)
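
      The report above counts about 4.9 million path lists holding about 52 million objects. To illustrate why so many of those strings can be duplicates, the sketch below counts total versus distinct components for a handful of paths. The class name, method name, and sample paths are all hypothetical and chosen only for illustration; they are not taken from the analyzed heap.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class for illustration; not part of Sentry or jxray.
final class DuplicateComponentCount {

    // Returns {total components, distinct components} over all paths.
    static long[] countComponents(List<String> paths) {
        long total = 0;
        Set<String> distinct = new HashSet<>();
        for (String path : paths) {
            for (String component : path.split("/")) {
                if (component.isEmpty()) {
                    continue; // skip the empty piece before a leading "/"
                }
                total++;
                distinct.add(component);
            }
        }
        return new long[] {total, distinct.size()};
    }

    public static void main(String[] args) {
        // Made-up sample paths that share their leading directories.
        List<String> paths = Arrays.asList(
            "/warehouse/db1/t1", "/warehouse/db1/t2", "/warehouse/db2/t1");
        long[] counts = countComponents(paths);
        // prints "9 components, 5 distinct"
        System.out.println(counts[0] + " components, " + counts[1] + " distinct");
    }
}
```

      Paths under a common directory tree repeat their leading components, so the gap between the two counts grows with the number of paths. Without interning, every one of the total components is a separate String object.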
      

      Attachments

        1. SENTRY-1907.01.patch
          4 kB
          Alex Kolbasov

            People

              Assignee: akolb Alex Kolbasov
              Reporter: akolb Alex Kolbasov