Commons Digester
  1. Commons Digester
  2. DIGESTER-110

Create APIs that accept URL values directly, instead of only Strings to be converted to URLs

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.6
    • Fix Version/s: 1.8
    • Labels:
      None

      Description

      Currently, Digester provides a number of convenience methods such as parse() that access the XML content to be parsed in a variety of ways. The most commonly used ones are probably those that take the String form of a URL.

      In a web application environment, it is common for applications or frameworks to call ServletContext.getResource() to acquire the URL of a web application resource to be parsed. With the current Digester implementation, this must be converted to a String (using toString() or toExternalForm()), and then – internal to Digester – converted back to a URL. For this to work successfully, there must be a lossless conversion of the URL returned by ServletContext.getResource(), to a String, and then back to a URL. While this process succeeds on most popular servlet containers today, it is not guaranteed to work ... it is entirely reasonable for a servlet container to embed customized information in the URL implementation that is returned by ServletContext.getResource(), and this information would be lost in the conversions described above.

      To be safe, Digester should provide alternative public APIs that accept URLs directly, in addition to the current APIs maintained for backwards compatibility. At a minimum, that would mean adding the following public method signatures to Digester itself:

      • public Object parse(java.net.URL url)
      • public void register(java.lang.String publicId, java.net.URL entityURL)

      plus any other scenarios where strings are used as URLs.

      1. diffs.txt
        10 kB
        Craig McClanahan
      2. URLTestCase.java
        4 kB
        Craig McClanahan

        Issue Links

          Activity

          Hide
          Niall Pemberton added a comment -

          > * Change existing createInputSourceFromURL(String) to a non static
          > method

          Initiallly this had me concerned that this would break backwards compatibility with previous Digester versions - but that method was private (and not static) in Digester 1.7.0 so its not an issue. Adding this note so that anyone else reading that comment doesn't jump to the same conclusion.

          Show
          Niall Pemberton added a comment - > * Change existing createInputSourceFromURL(String) to a non static > method Initiallly this had me concerned that this would break backwards compatibility with previous Digester versions - but that method was private (and not static) in Digester 1.7.0 so its not an issue. Adding this note so that anyone else reading that comment doesn't jump to the same conclusion.
          Hide
          Craig McClanahan added a comment -

          Committed the proposed patch. Also updated the JIRA version numbers so we can use it to build release notes content, if desired.

          Show
          Craig McClanahan added a comment - Committed the proposed patch. Also updated the JIRA version numbers so we can use it to build release notes content, if desired.
          Hide
          Rahul Akolkar added a comment -

          Looks good to me (didn't try applying since I'm on limited dev time this weekend). digester trunk is open for commits, please go ahead.

          Show
          Rahul Akolkar added a comment - Looks good to me (didn't try applying since I'm on limited dev time this weekend). digester trunk is open for commits, please go ahead.
          Hide
          Craig McClanahan added a comment -

          New test case that uses the URL-based method signatures. The parse method is called twice to ensure that registered DTDs remain accessible in spite of input sources being closed.

          Show
          Craig McClanahan added a comment - New test case that uses the URL-based method signatures. The parse method is called twice to ensure that registered DTDs remain accessible in spite of input sources being closed.
          Hide
          Craig McClanahan added a comment -

          Here is my proposed changes to Digester (a new unit test case will be submitted separately). The primary changes are:

          • New public method register(String publicId, URL entityURL)
            that accepts a URL instead of String for the second parameter.
          • New public method parse(URL url) that accepts a URL
            to be parsed.
          • New public method createInputSourceFromURL(URL url)
            that accepts a URL instead of a String.
          • Change existing createInputSourceFromURL(String) to a non static
            method
          • Implementations of createInputSourceFromURL() now cache a
            list of InputSource instances that are created on this instance.
            The input streams of these should be closed at the end of
            parsing.
          • New protected method cleanup(), whose default implementation
            closes the InputStream for any InputSource that was created by
            createInputSourceFromURL().
          • All parse() methods are modified to call cleanup() after parsing.
          Show
          Craig McClanahan added a comment - Here is my proposed changes to Digester (a new unit test case will be submitted separately). The primary changes are: New public method register(String publicId, URL entityURL) that accepts a URL instead of String for the second parameter. New public method parse(URL url) that accepts a URL to be parsed. New public method createInputSourceFromURL(URL url) that accepts a URL instead of a String. Change existing createInputSourceFromURL(String) to a non static method Implementations of createInputSourceFromURL() now cache a list of InputSource instances that are created on this instance. The input streams of these should be closed at the end of parsing. New protected method cleanup(), whose default implementation closes the InputStream for any InputSource that was created by createInputSourceFromURL(). All parse() methods are modified to call cleanup() after parsing.
          Hide
          Craig McClanahan added a comment -

          > If you feel strongly that this should be in 1.8, feel free to jump in now and
          > I will cut another RC when you're done with this. Thats no problem at all.

          Thanks ... I'm going to take you up on that, but probably not until Friday morning. Still have some Shale stuff to review and update tonight.

          Show
          Craig McClanahan added a comment - > If you feel strongly that this should be in 1.8, feel free to jump in now and > I will cut another RC when you're done with this. Thats no problem at all. Thanks ... I'm going to take you up on that, but probably not until Friday morning. Still have some Shale stuff to review and update tonight.
          Hide
          Rahul Akolkar added a comment -

          If you feel strongly that this should be in 1.8, feel free to jump in now and I will cut another RC when you're done with this. Thats no problem at all.

          Show
          Rahul Akolkar added a comment - If you feel strongly that this should be in 1.8, feel free to jump in now and I will cut another RC when you're done with this. Thats no problem at all.

            People

            • Assignee:
              Unassigned
              Reporter:
              Craig McClanahan
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development