Commons Discovery
  1. Commons Discovery
  2. DISCOVERY-6

[discovery] Doesn't work with ClassLoaders that do not support getResource()

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5
    • Labels:
      None
    • Environment:

      Operating System: Linux
      Platform: PC

      Description

      I am trying to run inside a plugin environment where the plugin ClassLoader
      extracts classes from nested jarfiles. This ClassLoader supports loadClass() and
      getResourceAsStream(), but does not support getResource().

      Discovery assumes that all classes that can be loaded by a ClassLoader can be
      accessed by getResource(), and thus fails in this environment. In my googling, I
      found several people experiencing this problem, but no solutions.

      Is this project still actively maintained?

        Activity

        Len Trigg created issue -
        Hide
        Craig McClanahan added a comment -

        There haven't been a lot of bugs reported against Discovery, so there hasn't
        been much need for activity. But personally, I have a really hard time feeling
        much sympathy for using a broken ClassLoader implementation that violates the
        JavaDoc requirements describing the needed functionality.

        Show
        Craig McClanahan added a comment - There haven't been a lot of bugs reported against Discovery, so there hasn't been much need for activity. But personally, I have a really hard time feeling much sympathy for using a broken ClassLoader implementation that violates the JavaDoc requirements describing the needed functionality.
        Hide
        Len Trigg added a comment -

        What specifically are the requirements that are violated? I could not see any
        mention that getResource() had to be able to return a URL for class data. The
        example implementation (NetworkClassLoader) given in the JavaDocs for
        ClassLoader just has a private method to extract the raw bytes for the class,
        that does not have to correspond to a URL, and no override for getResource at all.

        If there are some specific requirements being violated, I can email this
        information to the authors of the application.

        Show
        Len Trigg added a comment - What specifically are the requirements that are violated? I could not see any mention that getResource() had to be able to return a URL for class data. The example implementation (NetworkClassLoader) given in the JavaDocs for ClassLoader just has a private method to extract the raw bytes for the class, that does not have to correspond to a URL, and no override for getResource at all. If there are some specific requirements being violated, I can email this information to the authors of the application.
        Hide
        Len Trigg added a comment -

        I still haven't been able to find any information that says the custom class
        loader in this case is doing anything wrong. In fact, I haven't seen any mention
        that there has to be any overlap between the set of classes that can be
        loaded, and what data is available via either getResource or
        getResourceAsStream. It's quite possibly an "accident" that the approach
        discovery currently uses works at all!

        Show
        Len Trigg added a comment - I still haven't been able to find any information that says the custom class loader in this case is doing anything wrong. In fact, I haven't seen any mention that there has to be any overlap between the set of classes that can be loaded, and what data is available via either getResource or getResourceAsStream. It's quite possibly an "accident" that the approach discovery currently uses works at all!
        Hide
        Richard A. Sitze added a comment -

        As the implementor of the discovery dependencies on the class loader, I
        believe that I have a reasonable expectation that ClassLoaders adhere to the
        behavior described in the documentation. There is no clear indication that
        the getResource method is optional. The counter argument is, IMHO, an
        interesting point of debate, but avoids the question of reasonable
        expectation...

        That said, I acknowledge that you have a problem. I'm not sure how to resolve
        your problem... but am willing to discuss alternate designs if someone has a
        suggestion. If the discovery community can determine a reasonable design
        based on your scenario, we would be in a position to accept volunteer work to
        implement that design

        Just to level set - the discovery process is centered around the idea of
        obtaining a "name" that describes the discovered resources, and the idea of
        using an "URL" for that name is intuitively correct... I might argue that in
        fact if your ClassLoader cannot return a "name"/URL, then it's resources are
        not "discoverable" in this level. In that interpretation, it works as
        expected.

        Any suggestions?

        <ras>

        Show
        Richard A. Sitze added a comment - As the implementor of the discovery dependencies on the class loader, I believe that I have a reasonable expectation that ClassLoaders adhere to the behavior described in the documentation. There is no clear indication that the getResource method is optional. The counter argument is, IMHO, an interesting point of debate, but avoids the question of reasonable expectation... That said, I acknowledge that you have a problem. I'm not sure how to resolve your problem... but am willing to discuss alternate designs if someone has a suggestion. If the discovery community can determine a reasonable design based on your scenario, we would be in a position to accept volunteer work to implement that design Just to level set - the discovery process is centered around the idea of obtaining a "name" that describes the discovered resources, and the idea of using an "URL" for that name is intuitively correct... I might argue that in fact if your ClassLoader cannot return a "name"/URL, then it's resources are not "discoverable" in this level. In that interpretation, it works as expected. Any suggestions? <ras>
        Hide
        Len Trigg added a comment -

        Can you clarify what documented behaviour you are referring to? There is
        nothing in the ClassLoader JavaDoc documentation that says getResource must also
        allow access to class data. They do mention that a ClassLoader might construct
        class byte-code internally (thus there would not be a corresponding URL, without
        the ClassLoader also implementing a custom URL protocol handler).

        > suggestion. If the discovery community can determine a reasonable design
        > based on your scenario, we would be in a position to accept volunteer work to
        > implement that design

        What is wrong with just calling loadClass() (since it is finding classes that
        I'm worried about here, not resources)? I patched DiscoverClasses to use that
        instead, and it seems to work fine. It allows me to use Axis inside this plugin
        environment now, whereas previously I could only use Sun's JWSDP. I'll attach a
        patch against current CVS.

        As for discovering resources rather than classes, perhaps the system could fall
        back to seeing whether getResourceAsStream returns non-null (in fact that is
        what the first version of my DiscoverClasses patch did, since the ClassLoader
        I'm working with did allow this type of access to the class data). I'm not sure
        which of your classes that would fit into though.

        Show
        Len Trigg added a comment - Can you clarify what documented behaviour you are referring to? There is nothing in the ClassLoader JavaDoc documentation that says getResource must also allow access to class data. They do mention that a ClassLoader might construct class byte-code internally (thus there would not be a corresponding URL, without the ClassLoader also implementing a custom URL protocol handler). > suggestion. If the discovery community can determine a reasonable design > based on your scenario, we would be in a position to accept volunteer work to > implement that design What is wrong with just calling loadClass() (since it is finding classes that I'm worried about here, not resources)? I patched DiscoverClasses to use that instead, and it seems to work fine. It allows me to use Axis inside this plugin environment now, whereas previously I could only use Sun's JWSDP. I'll attach a patch against current CVS. As for discovering resources rather than classes, perhaps the system could fall back to seeing whether getResourceAsStream returns non-null (in fact that is what the first version of my DiscoverClasses patch did, since the ClassLoader I'm working with did allow this type of access to the class data). I'm not sure which of your classes that would fit into though.
        Hide
        Len Trigg added a comment -

        Created an attachment (id=13603)
        Patch to use loadClass to discover classes.

        Show
        Len Trigg added a comment - Created an attachment (id=13603) Patch to use loadClass to discover classes.
        Hide
        Richard A. Sitze added a comment -

        Just pulling a reference book from my shelf with the javadoc: "The default
        implementation of this method does nothing;" - as you indicated. "Subclasses
        of ClassLoader should override this method..."

        Thoughts:

        1. "should" versus "shall" or "must". Acknowledge the issue, but maintain I
        have a right to consider this expected behavior.

        2. Use of "loadClass". I put a lot of thought into making this performant...
        potentially there can be multiple hits on a resource, and I don't want to load
        all the classes. Consider the situation where discovery locates multiple XML
        parsers... do you really want it to load all of them? No. You want your
        application to determine which one it wants... or you want the discovery
        helper classes to load the first one only.

        3. Discovery does more than load classes, it locates any resource available
        from the classloader. Do to this (nested jars within jars?) it works on and
        is capable of returning lists of resource names (URLs) which can be
        individually loaded. Again, for performance reasons we don't want to force
        loading of these resources - we want to discover & report.. and let the
        applicaiton drive loading.

        AGAIN Bottom line, a ClassLoader that does not support discovery of
        resources is, by my definition/understanding of the goals/purposes of
        discovery, not usable by discovery. Some elements of discovery may be able
        to be made to work [as you have shown] with impacts on performance.. but are
        not generally useful.

        In this case, with your patch, discovery may locate your class, but would be
        unable to locate other resources within that nested jar file (property files,
        META-INF/services/*, whatever)... a further patch would be necessessary
        [getResourceAsStream] which only compounds the resource problem I raise.

        So, we have a conflict here I suppose we need to resolve. Anyone else care to
        comment on the performance issue of using loadClass [runtime AND memory] and
        getResourceAsStream versus a more "adaptable" discovery mechanism that works
        with err.. 'partial' ClassLoader implementations?

        Is there a URL/URI pattern that covers nested jar files that would be
        meaningful and useful for such a classloader?

        <ras>

        Show
        Richard A. Sitze added a comment - Just pulling a reference book from my shelf with the javadoc: "The default implementation of this method does nothing;" - as you indicated. "Subclasses of ClassLoader should override this method..." Thoughts: 1. "should" versus "shall" or "must". Acknowledge the issue, but maintain I have a right to consider this expected behavior. 2. Use of "loadClass". I put a lot of thought into making this performant... potentially there can be multiple hits on a resource, and I don't want to load all the classes. Consider the situation where discovery locates multiple XML parsers... do you really want it to load all of them? No. You want your application to determine which one it wants... or you want the discovery helper classes to load the first one only. 3. Discovery does more than load classes, it locates any resource available from the classloader. Do to this (nested jars within jars?) it works on and is capable of returning lists of resource names (URLs) which can be individually loaded. Again, for performance reasons we don't want to force loading of these resources - we want to discover & report.. and let the applicaiton drive loading. AGAIN Bottom line, a ClassLoader that does not support discovery of resources is, by my definition/understanding of the goals/purposes of discovery, not usable by discovery. Some elements of discovery may be able to be made to work [as you have shown] with impacts on performance.. but are not generally useful. In this case, with your patch, discovery may locate your class, but would be unable to locate other resources within that nested jar file (property files, META-INF/services/*, whatever)... a further patch would be necessessary [getResourceAsStream] which only compounds the resource problem I raise. So, we have a conflict here I suppose we need to resolve. Anyone else care to comment on the performance issue of using loadClass [runtime AND memory] and getResourceAsStream versus a more "adaptable" discovery mechanism that works with err.. 'partial' ClassLoader implementations? Is there a URL/URI pattern that covers nested jar files that would be meaningful and useful for such a classloader? <ras>
        Hide
        Joe Germuska added a comment -

        As a bystander, I find myself kind of ambivalent. My first response was to
        agree with Craig and Richard: the classloader must be broken. But I also find
        valid Len's point that the ClassLoader may be constructing a class' bytecode
        internally, therefore there may not be a corresponding URL.

        It's also true that while Richard's overall vision/architecture for Discovery is
        more wide-ranging, I think it's also true that most people want to use it to
        load classes, and probably in many cases, they just want to load the first one
        on the classpath.

        Since the patched method is specifically about returning ResourceClass objects
        whose sole purpose is to return an instance of java.lang.Class, why not simply
        use Len's patch as an "if (url == null)" block. While at it, why not hold on to
        the class returned by loadClass and pass it to the ResourceClass constructor
        which actually takes a Class object.

        There's perhaps a small hitch in passing in a null URL, should someone call the
        getResourceAsStream method which ResourceClass inherits, especially if as Len
        describes, these unusual ClassLoader implementations could return the resource
        as a stream if you asked it directly. You could probably fix that with a custom
        subclass of ResourceClass if it really seemed necessary.

        Show
        Joe Germuska added a comment - As a bystander, I find myself kind of ambivalent. My first response was to agree with Craig and Richard: the classloader must be broken. But I also find valid Len's point that the ClassLoader may be constructing a class' bytecode internally, therefore there may not be a corresponding URL. It's also true that while Richard's overall vision/architecture for Discovery is more wide-ranging, I think it's also true that most people want to use it to load classes, and probably in many cases, they just want to load the first one on the classpath. Since the patched method is specifically about returning ResourceClass objects whose sole purpose is to return an instance of java.lang.Class, why not simply use Len's patch as an "if (url == null)" block. While at it, why not hold on to the class returned by loadClass and pass it to the ResourceClass constructor which actually takes a Class object. There's perhaps a small hitch in passing in a null URL, should someone call the getResourceAsStream method which ResourceClass inherits, especially if as Len describes, these unusual ClassLoader implementations could return the resource as a stream if you asked it directly. You could probably fix that with a custom subclass of ResourceClass if it really seemed necessary.
        Hide
        Richard A. Sitze added a comment -

        Ah, another voice.. thank you.

        I want to note that the patch in question uses loadClass to determine if the
        resource is available, but in fact returns the resource name [along with the
        loader that found it]... the class isn't returned, isn't used, and it isn't
        [today] cached.

        On the other hand: for performance, the existing implementation defers
        resolution of classes and other artifacts it needs for as long as possible
        [won't claim perfection, just a goal and general scheme]. So... for the cases
        where you grab the first one available the current scheme should be sufficient
        even with the patch... and I think this is more to your point.

        The if(url = null) idea is more in line with where I would be willing to go...
        if a resource isn't available then the load should fail very quickly [assuming
        classloaders are implemented for performance].

        I haven't looked at the code in a while, so I'm not on top of the
        ramifications of this... suppose I need to refresh.

        Let me start looking at the code.

        <ras>

        Show
        Richard A. Sitze added a comment - Ah, another voice.. thank you. I want to note that the patch in question uses loadClass to determine if the resource is available, but in fact returns the resource name [along with the loader that found it]... the class isn't returned, isn't used, and it isn't [today] cached. On the other hand: for performance, the existing implementation defers resolution of classes and other artifacts it needs for as long as possible [won't claim perfection, just a goal and general scheme] . So... for the cases where you grab the first one available the current scheme should be sufficient even with the patch... and I think this is more to your point. The if(url = null) idea is more in line with where I would be willing to go... if a resource isn't available then the load should fail very quickly [assuming classloaders are implemented for performance]. I haven't looked at the code in a while, so I'm not on top of the ramifications of this... suppose I need to refresh. Let me start looking at the code. <ras>
        Hide
        Richard A. Sitze added a comment -

        Ok, I'd really like a test for this... Len, can you could send me a test that
        demonstrates the problem [will need to include it's own classloader]?

        <ras>

        Show
        Richard A. Sitze added a comment - Ok, I'd really like a test for this... Len, can you could send me a test that demonstrates the problem [will need to include it's own classloader] ? <ras>
        Hide
        Emmanuel Bourg added a comment -

        And what about catching an UnsupportedOperationException on calling getRessource
        and falling back on loadClass in this case ? It's slightly better than the null
        check.

        Show
        Emmanuel Bourg added a comment - And what about catching an UnsupportedOperationException on calling getRessource and falling back on loadClass in this case ? It's slightly better than the null check.
        Hide
        Len Trigg added a comment -

        I do not have the source to the problematic ClassLoader in question (otherwise I
        would probably have tried to patch it to add a getResource() implementation ).

        I will try to put together some simple test code.

        Show
        Len Trigg added a comment - I do not have the source to the problematic ClassLoader in question (otherwise I would probably have tried to patch it to add a getResource() implementation ). I will try to put together some simple test code.
        Hide
        Len Trigg added a comment -

        Created an attachment (id=13630)
        Jarfile containing testcase.

        Extract the jar into a new directory.
        Place commons-logging and commons-discovery jarfiles into lib directory.

        See discovery work without the custom classloader:
        ant no-brassloader

        See discovery fail when using the custom classloader:
        ant with-brassloader

        Drop commons-discovery.jar including my patch into lib and then ant
        with-brassloader will pass.

        Show
        Len Trigg added a comment - Created an attachment (id=13630) Jarfile containing testcase. Extract the jar into a new directory. Place commons-logging and commons-discovery jarfiles into lib directory. See discovery work without the custom classloader: ant no-brassloader See discovery fail when using the custom classloader: ant with-brassloader Drop commons-discovery.jar including my patch into lib and then ant with-brassloader will pass.
        Henri Yandell made changes -
        Field Original Value New Value
        issue.field.bugzillaimportkey 32369 12341892
        Henri Yandell made changes -
        Key COM-1740 DISCOVERY-6
        Component/s Discovery [ 12311112 ]
        Affects Version/s unspecified [ 12311647 ]
        Project Commons [ 12310458 ] Commons Discovery [ 12310472 ]
        Assignee Jakarta Commons Developers Mailing List [ commons-dev@jakarta.apache.org ]
        Hide
        Henri Yandell added a comment -

        Emmanuel's suggestion seems pretty good to me, but I don't think this is something to dive into for 0.4.

        Show
        Henri Yandell added a comment - Emmanuel's suggestion seems pretty good to me, but I don't think this is something to dive into for 0.4.
        Hide
        Simone Tripodi added a comment -

        I just applied the Emmanuel's suggestion, see DiscoverClasses r1090196

        Show
        Simone Tripodi added a comment - I just applied the Emmanuel's suggestion, see DiscoverClasses r1090196
        Simone Tripodi made changes -
        Assignee Simone Tripodi [ simone.tripodi ]
        Hide
        Simone Tripodi added a comment -

        fixed on /trunk, see DiscoverClasses r1090690, used Emmanuel's suggestion and Torsten Curdt's hint on http://vafer.org/blog/20081210001333/

        Show
        Simone Tripodi added a comment - fixed on /trunk, see DiscoverClasses r1090690, used Emmanuel's suggestion and Torsten Curdt's hint on http://vafer.org/blog/20081210001333/
        Simone Tripodi made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 0.5 [ 12312208 ]
        Resolution Fixed [ 1 ]
        Hide
        Sebb added a comment -

        ProtectionDomain.getCodeSource() can return null; this needs to be allowed for (line 107)

        Show
        Sebb added a comment - ProtectionDomain.getCodeSource() can return null; this needs to be allowed for (line 107)
        Sebb made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Hide
        Simone Tripodi added a comment -

        So, IIUC, when ProtectionDomain.getCodeSource() returns null, we have to consider the resource not found, right?

        Show
        Simone Tripodi added a comment - So, IIUC, when ProtectionDomain.getCodeSource() returns null, we have to consider the resource not found, right?
        Hide
        Sebb added a comment -

        Yes, that seems better than throwing NPE.

        Show
        Sebb added a comment - Yes, that seems better than throwing NPE.
        Hide
        Simone Tripodi added a comment -

        fixed on /trunk, see DiscoverClasses r1090698

        Show
        Simone Tripodi added a comment - fixed on /trunk, see DiscoverClasses r1090698
        Simone Tripodi made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Simone Tripodi added a comment -

        included in discovery-0.5 release

        Show
        Simone Tripodi added a comment - included in discovery-0.5 release
        Simone Tripodi made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Simone Tripodi
            Reporter:
            Len Trigg
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development