Uploaded image for project: 'Velocity Tools'
  1. Velocity Tools
  2. VELTOOLS-56

ImportSupport appears to have a veriety of problems, including a potential resource leak.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels:
      None
    • Environment:
      Linux 2.4.28 + Java 1.4.2_09 + Tomcat 4.1.31

      Description

      I have what I believe to be a relatively lean and mean search application that uses Velocity Tools, and after 2 months I started getting the dreaded OutOfMemoryException. I started looking for anything obvious that could be the problem, and I found that I was using ImportTool.read in a few places, and my log files were full of exceptions that occurred during those operations, because the URLs were bad or whatever.

      Just for the heck of it, I decided to look at how the ImportTool actually imported it's text, since it's a pretty heavy operation and if it was failing alot perhaps it was contributing to my OOM woes. So, I dove into the code to see how everything worked. The ImportTool class is trivial... the real meat is in ImportSupport, where I found some things that we might want to consider.

      1. [trivial] In ImportSupport.acquireReader, an InputStreamReader is created with the character set loaded from the remote resource (makes sense). In the case where Java doesn't support the encoding, we use the default encoding. Unfortunately, the code uses catch(Exception) instead of catch(UnsupportedEncodingException). If some other error occurred, it would be swallowed, because there is no error reporting in this case.

      2. [confusing] This comment appears just before the end of ImportSupport.acquireReader:

      // check response code for HTTP URLs before returning, per spec,
      // before returning

      It is unclear which spec the author is referencing. It could be the HTTP spec (unlikely) or the HttpURLConnection spec (where I can't find any hints). It might be nice to clear this up.

      Further, it appears that the author was trying to check the status in order to throw an error if there wasn't an "OK" response. This code appears /after/ an input stream is retrieved from the request.

      It turns out that the class that you ultimately get back from opening an HTTP URL is sun.net.www.protocol.http.HttpURLConnection, which will throw an exception if you call getInputStream if the status of the connection is anything >= 400. That means that many of the error conditions are already handled by the time this status code check executes.

      Pretty much anything under 400 should be considered an okay response, no?

      3. [potentially dangerous] There is never any resource cleanup of the HTTP connection. This is actually complicated due to the way that the methods were written: acquireReader creates the connection and returns a Reader object, which is tied to the HttpURLConnection. aquireString knows nothing about the HttpURLConnection and can therefore not cleanup. The reader itself gets closed, but that does not guarentee that the connection itself gets cleaned up.

      There's not even cleanup code in the acquireReader method itself to close the connection if we throw an exception ourselves (for example, because we don't like the status code).

      These problems leave potential, if not actual, resource leaks. Sure, the GC will hopefully come along eventually and cleanup the mess that was left, but it's not guarenteed to do so in a timely mannar. Things like InputStreams and HTTP connections are probably relatively heavy objects and should be cleaned up whenever possible.

      4. [bad style?] All of the exceptions explicitly created and thrown from ImportSupport are of type java.lang.Exception. I feel like we could come up with some better exceptions to throw.

      I am happy to do the work required to fix all these issues – I'm not just a whiner I wanted to have a discussion about how the project leaders felt about each item and whether or not they were worth fixing.

      Thanks,
      -chris

      1. VELTOOLS-56.1.diff
        7 kB
        Christopher Schultz

        Activity

        Hide
        nbubna Nathan Bubna added a comment -

        Issues should not be assigned until the volunteer has agreed to take on the case. Chances are good that I will look into this at some point, but in the meantime, the bug should be unassigned so that Marino or other contributors (even yourself, perhaps) don't have to be concerned with duplicating effort.

        Show
        nbubna Nathan Bubna added a comment - Issues should not be assigned until the volunteer has agreed to take on the case. Chances are good that I will look into this at some point, but in the meantime, the bug should be unassigned so that Marino or other contributors (even yourself, perhaps) don't have to be concerned with duplicating effort.
        Hide
        nbubna Nathan Bubna added a comment -

        Ok, now to briefly acknowledge the issues...

        first, a disclaimer... Marino did most of the heavy lifting on this, though he may have leaned on Struts' Tiles tag for a lot of it. all that is to say, my familiarity and understanding of ImportSupport is currently limited.

        1. Agreed. gotta patch?
        2a. i've no idea what spec the comment is referencing. (perhaps http://www.w3.org/pub/WWW/Protocols/? just guessing though.)
        2b. how would you propose to have this work instead? (a commented patch will do as well as an explanation)
        3. yeah, cleanup is good. would this just mean a call to HttpURLConnection's disconnect() and closing the input streams? feel free to patch away on this one.
        4. i'm open to suggestions. just note that i'm not sure we're ready to require Java 1.4, and that i don't want ImportTool.read() to let any exceptions escape. they should just be logged and null returned.

        and i'm sure it goes without saying, but please test your patches before submitting them.

        Show
        nbubna Nathan Bubna added a comment - Ok, now to briefly acknowledge the issues... first, a disclaimer... Marino did most of the heavy lifting on this, though he may have leaned on Struts' Tiles tag for a lot of it. all that is to say, my familiarity and understanding of ImportSupport is currently limited. 1. Agreed. gotta patch? 2a. i've no idea what spec the comment is referencing. (perhaps http://www.w3.org/pub/WWW/Protocols/? just guessing though.) 2b. how would you propose to have this work instead? (a commented patch will do as well as an explanation) 3. yeah, cleanup is good. would this just mean a call to HttpURLConnection's disconnect() and closing the input streams? feel free to patch away on this one. 4. i'm open to suggestions. just note that i'm not sure we're ready to require Java 1.4, and that i don't want ImportTool.read() to let any exceptions escape. they should just be logged and null returned. and i'm sure it goes without saying, but please test your patches before submitting them.
        Hide
        chris@christopherschultz.net Christopher Schultz added a comment -

        Nathan... just so you know, I sisn't assign this issue to you... you are automatically assigned new bugs, I guess.

        I do not have any patches for any of the issues that I raised. I wanted to see if anyone else wanted them fixed before I did anything. For #3, the class may need some significant refactoring because in cases where everything goes according to plan, the current architecture does not allow the HttpURLConnection to be explicitly closed.

        Were you asking about JDK 1.4 because of the chained exceptions? You can always use your own chained-exception without using those built-into the JDK. The current behavior is for ImportTool.read to blow up and thrown an exception, rather than to return null.

        On the topic of testing... are there unit tests for anything in the project? I don't see anything obvious. I would prefer to work with an existing set of unit tests if possible.

        Show
        chris@christopherschultz.net Christopher Schultz added a comment - Nathan... just so you know, I sisn't assign this issue to you... you are automatically assigned new bugs, I guess. I do not have any patches for any of the issues that I raised. I wanted to see if anyone else wanted them fixed before I did anything. For #3, the class may need some significant refactoring because in cases where everything goes according to plan, the current architecture does not allow the HttpURLConnection to be explicitly closed. Were you asking about JDK 1.4 because of the chained exceptions? You can always use your own chained-exception without using those built-into the JDK. The current behavior is for ImportTool.read to blow up and thrown an exception, rather than to return null. On the topic of testing... are there unit tests for anything in the project? I don't see anything obvious. I would prefer to work with an existing set of unit tests if possible.
        Hide
        nbubna Nathan Bubna added a comment -

        Sorry, i didn't realize they were being automatically assigned to me. I'm new to JIRA and didn't know that was possible. I've found the option to change that now.

        If we're going to significantly refactor it to close resources, then this is the time for such refactoring. we just did a release and have plenty of time to play around with a revamped ImportSupport.

        and yes, i was referring to chained exceptions. i'm not sure at present what the best move is here. though i'm hesitant to support using NestableException from commons-lang or our own corollary. if you're going to work on this, do what you think is worthwhile. we'll look at it more closely then. as long as "significant refactoring" seems likely, then this is secondary.

        as for ImportTool.read(), i'm quite sure that it catches exceptions currently:
        http://svn.apache.org/viewcvs.cgi/jakarta/velocity/tools/trunk/src/java/org/apache/velocity/tools/view/tools/ImportTool.java?rev=190560&view=markup

        and no, unfortunately we have few (if any) unit tests. Shinobu wrote some for his contributions, but he's disappeared and i don't know the state of those. and i've been a slacker on the testing thing all along. it's definitely something we could use help with.

        Show
        nbubna Nathan Bubna added a comment - Sorry, i didn't realize they were being automatically assigned to me. I'm new to JIRA and didn't know that was possible. I've found the option to change that now. If we're going to significantly refactor it to close resources, then this is the time for such refactoring. we just did a release and have plenty of time to play around with a revamped ImportSupport. and yes, i was referring to chained exceptions. i'm not sure at present what the best move is here. though i'm hesitant to support using NestableException from commons-lang or our own corollary. if you're going to work on this, do what you think is worthwhile. we'll look at it more closely then. as long as "significant refactoring" seems likely, then this is secondary. as for ImportTool.read(), i'm quite sure that it catches exceptions currently: http://svn.apache.org/viewcvs.cgi/jakarta/velocity/tools/trunk/src/java/org/apache/velocity/tools/view/tools/ImportTool.java?rev=190560&view=markup and no, unfortunately we have few (if any) unit tests. Shinobu wrote some for his contributions, but he's disappeared and i don't know the state of those. and i've been a slacker on the testing thing all along. it's definitely something we could use help with.
        Hide
        chris@christopherschultz.net Christopher Schultz added a comment -

        Submitted patch for ImportSupport to solve these issues:

        1. Catching UnsupportedEncodingException instead of generic Exception when creating InputStreamReader in acquireReader.

        2. Moved HTTP status code check and potential exception throw earlier in acquireReader method, so we fail as fast as possible.

        3. Removed potentially dangerous non-threadsafe "isAbsoluteUrl" member.

        4. Added resource cleanup in these cases:

        • In acquireString (for absolute URLs), close Reader in new finally block.
        • In acquireReader, close InputStream and call HttpURLConnection.disconnect in catch blocks
        • In acquireReader, wrap InputStreamReader in new SafeClosingHttpURLConnectionReader, which will call HttpURLConnection.disconnect when the Reader is closed.
        Show
        chris@christopherschultz.net Christopher Schultz added a comment - Submitted patch for ImportSupport to solve these issues: 1. Catching UnsupportedEncodingException instead of generic Exception when creating InputStreamReader in acquireReader. 2. Moved HTTP status code check and potential exception throw earlier in acquireReader method, so we fail as fast as possible. 3. Removed potentially dangerous non-threadsafe "isAbsoluteUrl" member. 4. Added resource cleanup in these cases: In acquireString (for absolute URLs), close Reader in new finally block. In acquireReader, close InputStream and call HttpURLConnection.disconnect in catch blocks In acquireReader, wrap InputStreamReader in new SafeClosingHttpURLConnectionReader, which will call HttpURLConnection.disconnect when the Reader is closed.
        Hide
        nbubna Nathan Bubna added a comment -

        Sorry for my atrociously late response to this. It's an important fix, but i've been highly focused on my paid work lately and am not using this there. That said, i'm taking a good look at this now and hope to get it in today. As an initial note, this patch is a serious PITA to read since the formatting of the changes is inconsistent and very unreadable. In the future, please try to have changes conform to the code style used by the project.

        Show
        nbubna Nathan Bubna added a comment - Sorry for my atrociously late response to this. It's an important fix, but i've been highly focused on my paid work lately and am not using this there. That said, i'm taking a good look at this now and hope to get it in today. As an initial note, this patch is a serious PITA to read since the formatting of the changes is inconsistent and very unreadable. In the future, please try to have changes conform to the code style used by the project.
        Hide
        nbubna Nathan Bubna added a comment -

        Patch applied in revision 474989.

        Thanks again for the fixes, Christopher. I think i sorted out the formatting properly, but i'd appreciate if you'd give it a careful look and run it through your tests again to be sure. It seems to work great for me, but then again, i wasn't having problems before either.

        Show
        nbubna Nathan Bubna added a comment - Patch applied in revision 474989. Thanks again for the fixes, Christopher. I think i sorted out the formatting properly, but i'd appreciate if you'd give it a careful look and run it through your tests again to be sure. It seems to work great for me, but then again, i wasn't having problems before either.
        Hide
        chris@christopherschultz.net Christopher Schultz added a comment -

        I'll run my tests once I get everything back together again. Merging the svn project back into mine resulted in disaster due to the same patch being applied twice but slightly differently, so I opted to start again. There's a lot of new stuff in there.

        Now that I look at the diff, I can see what you meant by formatting issues.

        Sorry... I had intended for much of that code to go into as small a space as possible, hence the 2-line catch/handler kind of stuff. I generally format my code so that the "real stuff" takes the most space, and the error handling is as compact as possible. Same with the support class that I wrote... I didn't think that a bunch of delegate methods needed to be expanded completely.

        Forthcoming patches will be consistent with what I've seen, here.

        Show
        chris@christopherschultz.net Christopher Schultz added a comment - I'll run my tests once I get everything back together again. Merging the svn project back into mine resulted in disaster due to the same patch being applied twice but slightly differently, so I opted to start again. There's a lot of new stuff in there. Now that I look at the diff, I can see what you meant by formatting issues. Sorry... I had intended for much of that code to go into as small a space as possible, hence the 2-line catch/handler kind of stuff. I generally format my code so that the "real stuff" takes the most space, and the error handling is as compact as possible. Same with the support class that I wrote... I didn't think that a bunch of delegate methods needed to be expanded completely. Forthcoming patches will be consistent with what I've seen, here.

          People

          • Assignee:
            Unassigned
            Reporter:
            chris@christopherschultz.net Christopher Schultz
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development