Harmony / HARMONY-4569

[classlib][performance] Inefficient manifest parsing results in slowdown when debugging Java code

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • None
    • None
    • Component/s: Classlib
    • None

    Description

      I've reported this already some time ago [1], but since it wasn't noticed I decided to create a bug report.

      Running a hello-world class that contains only one method takes 319046 method calls on Harmony. Of these, 291637 calls happen while the Java class loader tries to find Hello.class, and the remaining 27409 are what classlib needs to start up.

      I found these numbers using two simple JVMTI agents that I wrote some time ago. The first one [2] counts all MethodEnter events and writes the method names to a log file. The second one [3] monitors a particular method and writes a stack trace to the log file. Note that, according to the specification, the VM sends the MethodEnter event only in the LIVE phase, that is, after classlib startup. To find the overall number of method calls, I therefore used a debug build of DRLVM with the additional argument -Xtrace:jvmti.event.method.entry:log, which logged every call to a Java method while the application was executing.

      Parsing the logs for statistics can be done with the following command:

      cat method_entry.log | sort | uniq -c | sort -nr > method_entry.log-sorted
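
      For readers without a Unix shell, the same tally can be sketched in Java. This is only an illustration of what the pipeline above computes; the class name MethodEntryTally and the method tally are hypothetical and are not part of the agents in [2] or [3]. It assumes the log holds one method name per line.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: mirrors "sort | uniq -c | sort -nr" for a method-entry log.
class MethodEntryTally {

    /** Counts identical lines and returns "count name" rows, most frequent first. */
    static List<String> tally(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            counts.merge(line, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties are broken by name so the order is deterministic.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            rows.add(e.getValue() + " " + e.getKey());
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // With a file argument, tally that log; otherwise use a tiny made-up sample.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("A.foo()V", "B.bar()V", "A.foo()V");
        for (String row : tally(lines)) {
            System.out.println(row);
        }
    }
}
```

      Unlike uniq -c, this sketch does not pad the counts with leading spaces, and it breaks ties by method name.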

      This creates a file that lists how many times each method was called. The absolute champion, as in my post in [1], is still the method Ljava/io/ByteArrayOutputStream;.write(I)V, which is called 131114 times. Using the second agent [3], I found that the most common stack trace leading to ByteArrayOutputStream;.write looks like this (please note that the numbers at the ends of the lines are not line numbers but bytecode offsets):

      Ljava/io/ByteArrayOutputStream;.write(I)V:0
      Ljava/util/jar/InitManifest;.nextChunk(Ljava/io/InputStream;Ljava/util/List;)[B:308
      Ljava/util/jar/InitManifest;.<init>(Ljava/io/InputStream;Ljava/util/jar/Attributes;Ljava/util/Map;Ljava/util/Map;Ljava/lang/String;)V:385
      Ljava/util/jar/Manifest;.read(Ljava/io/InputStream;)V:18
      Ljava/util/jar/Manifest;.<init>(Ljava/io/InputStream;Z)V:43
      Ljava/util/jar/JarFile;.getManifest()Ljava/util/jar/Manifest;:90
      Ljava/net/URLClassLoader;.createURLJarHandler(Ljava/net/URL;)Ljava/net/URLClassLoader$URLHandler;:141
      Ljava/net/URLClassLoader;.makeNewHandler()V:64
      Ljava/net/URLClassLoader;.getHandler(I)Ljava/net/URLClassLoader$URLHandler;:24
      Ljava/net/URLClassLoader;.findClassImpl(Ljava/lang/String;)Ljava/lang/Class;:82
      Ljava/net/URLClassLoader$4;.run()Ljava/lang/Class;:8
      Ljava/net/URLClassLoader$4;.run()Ljava/lang/Object;:1
      Ljava/security/AccessController;.doPrivilegedImpl(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;:79
      Ljava/security/AccessController;.doPrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;:16
      Ljava/net/URLClassLoader;.findClass(Ljava/lang/String;)Ljava/lang/Class;:13
      Ljava/lang/ClassLoader;.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;:80
      Ljava/lang/ClassLoader$SystemClassLoader;.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;:65
      Ljava/lang/ClassLoader;.loadClass(Ljava/lang/String;)Ljava/lang/Class;:3

      This stack trace occurs 123029 times. About 8000 more calls to ByteArrayOutputStream;.write come from the InitManifest class with slightly different stack traces, so all of the calls to write originate in this class. It seems that the InitManifest code copies memory byte by byte, and quite a lot of memory is copied. This may not be a big problem during normal execution of a Java application, because the JIT usually inlines such hot methods. Although it is still inefficient, it may not be a big performance hit (this should actually be checked).
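
      To illustrate why byte-by-byte copying produces so many method calls, here is a minimal sketch that counts calls to the two write overloads of ByteArrayOutputStream. It is not the InitManifest code and not the attached patch; the class CopyCallCount, its methods, and the 4096-byte block size are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical illustration: compares call counts for two ways of copying a stream.
class CopyCallCount {

    /** ByteArrayOutputStream that records how often each write overload is invoked. */
    static class CountingOutput extends ByteArrayOutputStream {
        int singleByteWrites;
        int bulkWrites;

        @Override
        public synchronized void write(int b) {
            singleByteWrites++;
            super.write(b);
        }

        @Override
        public synchronized void write(byte[] buf, int off, int len) {
            bulkWrites++;
            super.write(buf, off, len);
        }
    }

    /** Copies one byte per call, so write(int) runs once for every input byte. */
    static CountingOutput copyByteByByte(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        CountingOutput out = new CountingOutput();
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return out;
    }

    /** Copies in blocks, so write(byte[], int, int) runs once per block. */
    static CountingOutput copyInBlocks(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        CountingOutput out = new CountingOutput();
        byte[] block = new byte[4096]; // block size chosen arbitrarily for the sketch
        int n;
        while ((n = in.read(block, 0, block.length)) != -1) {
            out.write(block, 0, n);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10000];
        CountingOutput slow = copyByteByByte(data);
        CountingOutput fast = copyInBlocks(data);
        System.out.println("byte by byte: " + slow.singleByteWrites + " write(int) calls");
        System.out.println("in blocks:    " + fast.bulkWrites + " write(byte[],int,int) calls");
    }
}
```

      For a 10000-byte input, the first variant makes 10000 calls to write(int), while the second makes 3 bulk calls. Without inlining, each of those calls is a real method invocation.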

      But when debugging Java code, DRLVM uses JET, which doesn't inline any methods. It has to execute every call, and all of those calls to ByteArrayOutputStream;.write are executed at application startup. As a result, it takes several minutes to reach the point where the actual user program starts executing.

      [1] http://thread.gmane.org/gmane.comp.java.harmony.devel/21104/focus=21133 see also messages posted in this subthread by me
      [2] http://people.apache.org/~gshimansky/methodee.cpp
      [3] http://people.apache.org/~gshimansky/methodee-stack.cpp

      Attachments

        1. remove_writes_25032008.patch
          108 kB
          Alexei Fedotov
        2. remove_writes_todo_completed.patch
          108 kB
          Alexei Fedotov
        3. ExposedByteArrayInputStream.java
          2 kB
          Alexei Fedotov
        4. ThreadLocalCache.java
          3 kB
          Alexei Fedotov
        5. ByteBuffer.java
          4 kB
          Alexei Fedotov
        6. remove_writes_todo_1.patch
          74 kB
          Alexei Fedotov
        7. remove_writes_todo.patch
          77 kB
          Alexei Fedotov
        8. remove_writes.patch
          76 kB
          Alexei Fedotov
        9. methodee-stack.cpp
          5 kB
          Gregory Shimansky
        10. methodee.cpp
          3 kB
          Gregory Shimansky

            People

              tellison Tim Ellison
              gshimansky Gregory Shimansky