Jasper performance is abysmal when using jspc to precompiles jsps, or more especially to generate the java files. I have a webapp with 450 jsps, jasper generates the java files in 12 minutes. In comparison, javac compilation for those 450 generated java files is performed in 1 minute. Profiling and debugging the code, it appears clearly that most of the time is spent doing the same parsing for static include directives over and over. Typically a page includes an init page which itself includes 2 init pages. page.jsp -> init.jsp -> { init-a.jsp, init-b.jsp, ... } For each and every page, the parser always go down the tree without caching includes and parse init.jsp, and so on. So basically the init.jsp is parsed 450 times. The Mark object is also pretty inefficient as it generates bazillions of objects which are not so lightweight (via Mark.mark()).
Ok. Does this mean you have done some optimizations already (patches) ? Do you have test cases ? (JSPs which are particularly slow to compile) Note: This will looked at in 5.0.x (better to have a slow source generation than a *broken* source generation).
I'm currently hacking the jasper code to try to understand it as i'm not very familiar with the code and need to understand relations between objects and particulary how nodes are attached to each other, how dependant files are added and resolved, etc... it is a bit of obscure right now. I'm not sure yet what would be the best place to cache those include: ParserController, Parser, .. and I think I need to make those Node classes mutable to get control over the parent and reset/set it easily. I guess that for anyone familiar with the code, the view is pretty clear, mine is not at the moment. My current testcase is the liferay portal sources (www.liferay.com), it needs a bit of doc to setup so it's not convenient for an independant testcase. (NB: If I reported against 5.5.7 it is a mistake and is indeed 5.0.x)
I made the "mistake". As I said, this will not be addressed (as in -1 = veto) in 5.0.x.
Ah ok, did not really understood it that way since I consider it is 'better to have a slow source generation than a *broken* source generation' in any situation and not just in 5.0.x :). Thus the qualification as 'enhancement'. Nonetheless, I'm still trying to understand the code. I have made little progress as I realized I needed to delta the dependencies that are added through the Compiler instance if I want to cache the includeDirective fragment. I still need to find how are the taglibs evaluated to generate the java code. At this time it looks like I could dream sort of a 10x improvement if this sort of caching can be done.
My comment was indeed missing two words. BTW, regardless of what your profiler says, I don't think you can get 10x. However, now that the runtime performance is good, I'm interested in optimizing compile times. One part has been done already by replacing the compiler by one which decreses I/O (although the main benefit was to get compilation on the JRE instead of the JDK).
The compilation process is an external compilation process with ant javac task. I'm not using the internal one as for compilation within jspc it is often inappropriate and it is not flexible enough for file selection, classpath, and jvm settings. I'm using this because many people tend to name jsp include with .jsp extensions and it will f* up blind attempt to compile from Jasper as they will be generated to java and it is invalid code by essence. Workaround process is usually. 1. jspc (.jsp -> .java) 2. delete generated static includes based on text content selectors (looking for unevaluated taglibs syntax for example) 3. javac everything I'm myself always recommending people to name their includes as jspf (or .inc) extensions as it is: - visually more convenient - Jasper does not choke on it - JSP editors will still understand the syntax (except for .inc) I would really like to enhance this Ant jspc task but this is in my mile long todo list. As for the speed up, I cannot see any reason why it should be 10x slower than java compilation (2-3s for a single jsp->java is huge). but really, all I'm missing is the taglibs evaluation and the process apparently should be able to fly NB: For I/O it should not cost much to wrap BufferedOutputStream over the FileOutputStream. (I did not check if there was any change on this in cvs as I playing with 5.0.30 sources) but really it is peanuts compared to the zillions of I/O from constant loading and parsing of includes and tlds.
FYI, putting a simple hashmap to cache the TLDs TreeNode to avoid the DOM parsing of TLDs in TagLibraryInfoImpl for each and every JSP brings the timing to 8 minutes instead of 12 on my 'testcase'. A 30% improvement.
You still need to "play nice" with regular Jasper mode, when compilation occurs on access. I would complain if there are memory leaks, and/or abusive memory usage ;) If code generation times go down significantly, and since JDT is used to compile (fast and does very little I/O), I would assume it would encourage many people to not bother with precompilation (hence the need to avoid leaks and huge memory usage).
(In reply to comment #7) > FYI, putting a simple hashmap to cache the TLDs TreeNode to avoid the DOM > parsing of TLDs in TagLibraryInfoImpl for each and every JSP brings the timing > to 8 minutes instead of 12 on my 'testcase'. A 30% improvement. Nice improvement.
> You still need to "play nice" with regular Jasper mode, when compilation > occurs on access. I would complain if there are memory leaks, and/or abusive > memory usage ;) I think the cache needs to be enabled only in command line mode (batch compile). Otherwise it does not make much sense and it complicate things when you are development mode. > I would assume it would encourage many people to not bother with > precompilation It is totally different. Pre-compiling your jsp is mostly a way to check that there is no syntax or configuration problems with your taglibs or jsp. You don't care much with a petstore webapp with 10 jsps that you can check manually at deployment time, but with hundred of thems, it is not something you can afford. Assuming you have some QA dept or some automated robot to go through all your webpages, that's also useless to deliver something that does not even compile. So it's mostly a development thing and must be part of your process, and it should be part of the daily build as well of course. So if we can cut down this check from 12 to 1min, believe me I think some people would love it.
I've made this item a candidate for the Summer of Code: http://wiki.apache.org/general/SummerOfCode2005.
I was just looking at it. :) Good thing, Yoav, I unfortunately spend my time switching priorities due to the current nature of my work, so I hope someone will take over, there sure is a nice opportunity.
(In reply to comment #11) > I've made this item a candidate for the Summer of Code: > http://wiki.apache.org/general/SummerOfCode2005. I have applied to participate in this project during google summer of code. If I am accepted or not I am still very interested in this issue. I am a student intern at a company and during my time there I have been involved in development of a very large web application involving several hundred jsps. I have encountered this issue during development / deployment and would enjoy learning more about this issue and finding a clean optimized solution. Any suggestions on where to start? Feel free to email me.
I have changed the summer project to a more general "optimizing JSP compilation performance", as hopefully not all optimizations are going to be about caching stuff.
(In reply to comment #11) > I've made this item a candidate for the Summer of Code: > http://wiki.apache.org/general/SummerOfCode2005. I followed this bugzilla from the SummerOfCode2005 link. This looks like an interesting optimization. Is there any whitepaper/architecture document that I can look at? I would surely enjoy taking up this challenge, contest or no contest.
another aspect to this: Bug 31645
Created attachment 16121 [details] modified Options targeted at jasper performance improvement
Created attachment 16122 [details] modified JspC targeted at jasper performance improvement
Created attachment 16123 [details] modified EmbeddedServletOptions targeted at jasper performance improvement
Created attachment 16124 [details] modified TagLibraryInfoImpl targeted at jasper performance improvement
(In reply to comment #17) > Created an attachment (id=16121) [edit] > modified Options targeted at jasper performance improvement You should submit patches in diff format ("cvs diff"). It's too difficult to evaluate the patch when you just post the entire modified file.
(In reply to comment #21) > (In reply to comment #17) > > Created an attachment (id=16121) [edit] [edit] > > modified Options targeted at jasper performance improvement > > You should submit patches in diff format ("cvs diff"). It's too difficult to > evaluate the patch when you just post the entire modified file. Thank you. I will replace the format.
Created attachment 16126 [details] proposed patches for Options.java, JspC.java, EmbeddedServletOptions.java and TagLibraryInfoImpl.java in cvs diff format This patch is about caching the TreeNode tld in TagLibraryInfoImpl. Because the profiling results show the repeating parsing of the same TLD has dominated the precompilation process, this change would make the jasper works much better on a large Ant build :-) (I would also submit the different profiling snapshots before and after the code changes). Here is what I basically did: 1. Options.java: add two public functions in Options interface. One is isCacheTldXml(), and the other is Map getCachedTldXmlMap(). The parsed TLD XML data is cached in a Map(String uri, TreeNode tld). 2. JspC.java: add switch "-cacheTldXml" and implement the two functions defined in Options. The cacheTldXml is defaulted to true but the users can set it. 3. EmbeddedServletOptions.java: add boolean cacheTldXml and implement the two functions. As EmbeddedServletOptions is called by run-time compilation, cacheTldXml is set to false by default. 4. TagLibraryInfoImpl.java: add an if-else clause in parseTld to distinguish command-line build and run-time build. If the CompilerContext is JspC, try to get the TreeNode from cache first; otherwise call parseXMLDocument to parse the TLD directly.
Created attachment 16127 [details] cpu profiling snapshot by class (call tree) before the change The testbed is the Liferay Portal Source (www.liferay.com). It is a quite big and 464 Java source files would be created. The build machine is a typical Dell 4600 desktop with Fedora Core 4, 2.8GHz, 768M. Before the patch, the TagLibraryInfoImpl class takes more than 50% CPU time.
Created attachment 16128 [details] cpu profiling snapshot by class (call tree) after the change After the patch, the TagLibraryInfoImpl would take much less CPU time, less than 5%. As it's inconvenient to post the profiling snapshots here (one at a time), I would post the rest of them on my personal website to show the differences. I'll post the URL when it's done.
(In reply to comment #25) > Created an attachment (id=16128) [edit] > cpu profiling snapshot by class (call tree) after the change > After the patch, the TagLibraryInfoImpl would take much less CPU time, less > than 5%. > As it's inconvenient to post the profiling snapshots here (one at a time), I > would post the rest of them on my personal website to show the differences. > I'll post the URL when it's done. The URL is http://www.cs.ucf.edu/~xbgao/jasper/jasper-33650.htm. It contains 12 screenshots in total.
Ok, I've looked at the patch, and I think it looks good. Some of it could maybe be a bit more elegant (in particular, I think the instanceof should be replaced by the use the CacheTldXml flag). For additional optimizations, I think there's room seeing if includes could be cached as well (I'm fairly certain a static include happy webapp behaves as described by Stephane Bailliez in the original report), and of course optimizing hotspots (JspReader/Mark seems to be the biggest offender).
Created attachment 16139 [details] Modified patch Small tweaks to make names more generic
I do have a problem: with the patch, precompilation of the admin webapp fails, so I cannot apply it for the time being. There must be something wrong somewhere.
As the current code only caches the result of the xml parsing it should translates in roughly a 30% speedup for Liferay compilation. Liferay being static include happy, it is a good testcase as well. The xml parsing was the easy part :) As Remy mentions there is load of improvement in there as the JSP parsing is obviously not very efficient. As a side note, I'm nit-picking but I think it would be better to have a single method that returns the TreeNode. here I feel like they are 2 pieces of code related to 'is cache in use'. 1) getOptions().getCachedTldXmlMap() return the map or null 2) if (ctxt.getServletContext() instanceof org.apache.jasper.servlet.JspCServletContext) is used to figure out if it should look into the cache (why not checking if the map is null or not ?) I might be missing something, but this if is confusing me.
(In reply to comment #30) > As the current code only caches the result of the xml parsing it should > translates in roughly a 30% speedup for Liferay compilation. There's weird stuff going on, though. Maybe I did something wrong when applying the patch. Before trying to debug, I'll try to see if caching the TaglibInfoImpl works better. > Liferay being static include happy, it is a good testcase as well. The xml > parsing was the easy part :) As Remy mentions there is load of improvement in > there as the JSP parsing is obviously not very efficient. > > As a side note, I'm nit-picking but I think it would be better to have a > single method that returns the TreeNode. here I feel like they are 2 pieces of > code related to 'is cache in use'. > > 1) getOptions().getCachedTldXmlMap() return the map or null > 2) if (ctxt.getServletContext() instanceof > org.apache.jasper.servlet.JspCServletContext) is used to figure out if it > should look into the cache (why not checking if the map is null or not ?) > > I might be missing something, but this if is confusing me. Yes, it could be better. I assume the cache map will eventually be used for more than TaglibInfoImpl.
Created attachment 16140 [details] Modified patch 2 This patch will cache TagLibraryInfoImpl objects rather than the XML objects.
(In reply to comment #32) > Created an attachment (id=16140) [edit] > Modified patch 2 > > This patch will cache TagLibraryInfoImpl objects rather than the XML objects. I looked at this patch, but I am afraid that the TagLibraryInfoImpl cannot be cached in Parser.java and JspDocumentParser.java in this simple way. In TagLibraryInfoImpl the individual CompilerContext is used for each jsp precompilation specifically. I guess that's why it didn't work on the admin app precompilation.
(In reply to comment #30) > As the current code only caches the result of the xml parsing it should > translates in roughly a 30% speedup for Liferay compilation. > In my testcase with liferay source package (more specifically, all jsp files under portal-web/docroot), I could get more than 30% speedup. I did a series of experiments and the results were: 1. Dell Dimension 4600, 2.8GHz, 768M: time reduced from ~130s to ~65s. 2. one cluster node, AMD Athlon(tm) Processor, 908MHz, 768M: time reduced from (12m + 50s) to (2m + 55s). It depends on the machines but my cases show significant improvement. > Liferay being static include happy, it is a good testcase as well. The xml > parsing was the easy part :) As Remy mentions there is load of improvement in > there as the JSP parsing is obviously not very efficient. > > As a side note, I'm nit-picking but I think it would be better to have a > single method that returns the TreeNode. here I feel like they are 2 pieces of > code related to 'is cache in use'. > > 1) getOptions().getCachedTldXmlMap() return the map or null > 2) if (ctxt.getServletContext() instanceof > org.apache.jasper.servlet.JspCServletContext) is used to figure out if it > should look into the cache (why not checking if the map is null or not ?) > > I might be missing something, but this if is confusing me.
(In reply to comment #33) > I looked at this patch, but I am afraid that the TagLibraryInfoImpl cannot be > cached in Parser.java and JspDocumentParser.java in this simple way. In > TagLibraryInfoImpl the individual CompilerContext is used for each jsp > precompilation specifically. I guess that's why it didn't work on the admin app > precompilation. No, the latest patch I posted doesn't fail, for some reason (the first patch I attached before did, but it might have been a build clean issue or something, as the problem disappeared magically). I know there's a compiler context field which might possibly cause issues. It could be investigated if it really does.
The TL info caching seems to work reasonably well, after review, so I'll commit it. The Options interface changes are not final.
I do not see a reason to still keep this open. All the discussed patches are already applied to the sources 3+ years ago and are in TC 5.5.12 and later, and that includes all versions of TC 6.0. Thus I am resolving this as FIXED. If anyone has examples that are still unacceptable slow to compile, and likes to profile the Jasper/JspC code, I think everyone will be interested.