UIMA-3385

Make meta data discovery compatible with fat jars

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0uimaFIT
    • Fix Version/s: 2.1.0uimaFIT
    • Component/s: uimaFIT
    • Labels:
      None

      Description

      Most fat jar approaches, e.g. the Maven Assembly Plugin, have problems when multiple JARs contain a resource at the same classpath location. For example, if two JARs being bundled into a fat jar each contain a types.txt file, only one copy survives.

      One option to fix this would be to change the pattern that uimaFIT uses to search for these files, e.g. from classpath*:META-INF/org.apache.uima.fit/types.txt to classpath*:META-INF/org.apache.uima.fit/**/types.txt
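
      The collision described above can be reproduced with the JDK alone. The following is a minimal sketch, not uimaFIT code: the class name, the JAR names, and the types.txt contents are made up for illustration, and the "fat" JAR is simulated by writing a JAR that keeps only one of the two files, which is what happens when a bundler does not merge same-path resources.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

class TypesTxtDiscoveryDemo {
    // Location at which uimaFIT looks for types.txt (taken from the issue text).
    static final String RESOURCE = "META-INF/org.apache.uima.fit/types.txt";

    // Writes a JAR that contains a single types.txt entry with the given content.
    static Path writeJar(Path dir, String name, String content) throws IOException {
        Path jar = dir.resolve(name);
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry(RESOURCE));
            out.write(content.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // Counts how many copies of types.txt a class loader over the given JARs can see.
    static int countCopies(Path... jars) throws IOException {
        URL[] urls = new URL[jars.length];
        for (int i = 0; i < jars.length; i++) {
            urls[i] = jars[i].toUri().toURL();
        }
        try (URLClassLoader loader = new URLClassLoader(urls, null)) {
            return Collections.list(loader.getResources(RESOURCE)).size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fatjar-demo");
        // Hypothetical type system import patterns, one per component JAR.
        Path a = writeJar(dir, "a.jar", "classpath*:a/types/*.xml\n");
        Path b = writeJar(dir, "b.jar", "classpath*:b/types/*.xml\n");
        // A JAR can hold only one entry per path, so an unmerged fat JAR keeps one file.
        Path fat = writeJar(dir, "fat.jar", "classpath*:a/types/*.xml\n");

        System.out.println("separate JARs: " + countCopies(a, b));
        System.out.println("naive fat JAR: " + countCopies(fat));
    }
}
```

      With the two JARs kept separate, both copies are visible (the first line prints 2). Once they are collapsed into a single archive only one copy is left (the second line prints 1), so the patterns from the other component are silently lost.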

        Activity

        rec Richard Eckart de Castilho added a comment -

        Here is the documentation on how to build executable fat jars: http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#ugr.tools.uimafit.packaging

        m09 Hugo Mougard added a comment - - edited

        I use the slim jar packaging to run uimaFIT applications easily during development (copying dependencies to build a fat jar takes too long during development). In my experience, packaging the application is preferable to running it only via mvn exec when you access resources inside the packaged jar directly (which happens quite a lot in my applications), because different code is needed to access the resources outside the jar (as during a mvn exec).

        Indeed, it may be too specific a use case to include in the packaging doc.

        rec Richard Eckart de Castilho added a comment -

        Hugo, could you provide some context/motivation for this kind of packaging? I gather that people build fat-jars in order to easily transfer a (uimaFIT) application to some other machine and run it there. Under which circumstances would somebody prefer the slim approach?

        m09 Hugo Mougard added a comment - - edited

        I just read the section on packaging and that is great - exactly what was needed.

        Maybe it would also be possible to include how to produce an executable "slim jar":

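        pom.xml
                    <plugin>
                        <artifactId>maven-jar-plugin</artifactId>
                        <version>2.4</version>
                        <configuration>
                            <archive>
                                <manifest>
                                    <!-- Write a Class-Path entry listing the dependencies into the manifest -->
                                    <addClasspath>true</addClasspath>
                                    <!-- Lay out the entries like a Maven repository -->
                                    <classpathLayoutType>repository</classpathLayoutType>
                                    <!-- Resolve the entries against the local Maven repository -->
                                    <classpathPrefix>${settings.localRepository}</classpathPrefix>
                                    <!-- Placeholder: fully qualified name of the main class -->
                                    <mainClass>myClass</mainClass>
                                </manifest>
                            </archive>
                        </configuration>
                    </plugin>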
        

        I'm not sure if it's really important, but it would have been useful to me some time ago.

        Cheers

        rec Richard Eckart de Castilho added a comment -

        Regarding the discovery of the meta data, I am happy that this issue can be closed without even making a change to the code. As mentioned before, uimaFIT follows a well-established tradition of placing its configuration files in well-known locations without taking explicit care of the problems that can occur when building fat-jars. I argue that the problem lies not with uimaFIT but with the Maven Assembly Plugin, which fails to handle such cases. Fortunately, the Maven Shade Plugin does handle such cases and can be used to build (executable) fat-jars including uimaFIT-based components. A section on how to do this has been added to the uimaFIT guide.

        Regarding the misinterpretation of META-INF/org.apache.uima.fit as package notation, I have added the following note to the documentation:

        Mind that the file types.txt must be located in META-INF/org.apache.uima.fit, where org.apache.uima.fit is the name of a sub-directory inside META-INF. We are not using the Java package notation here!
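
        To illustrate what "handling such cases" can mean for a text resource like types.txt, here is a minimal sketch of an append-style merge. It is not the Maven Shade Plugin's implementation; the class and method names are hypothetical, the patterns are placeholders, and dropping blank and duplicate lines is an extra assumption made here for tidiness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AppendingMergeSketch {
    // Appends the lines of several types.txt files into one list,
    // skipping blank lines and duplicates while keeping first-seen order.
    static List<String> mergeTypesTxt(List<List<String>> files) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> lines : files) {
            for (String line : lines) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    merged.add(trimmed);
                }
            }
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Hypothetical contents of types.txt in two component JARs.
        List<String> fromA = Arrays.asList("classpath*:a/types/*.xml");
        List<String> fromB = Arrays.asList("classpath*:b/types/*.xml", "");
        for (String pattern : mergeTypesTxt(Arrays.asList(fromA, fromB))) {
            System.out.println(pattern);
        }
    }
}
```

        The merged file carries the patterns of every bundled component, so type discovery in the fat-jar sees the same set of patterns as it would with the separate JARs on the classpath.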

        rec Richard Eckart de Castilho added a comment -

        I believe there is no problem in principle with a pull request. I should be able to handle that (it may be more of a problem for large contributions). I just wanted to point out that it may not be applied according to the conventions of social coding.

        lfoppiano Luca Foppiano added a comment -

        OK, if you prefer I can send you the patch via email or on the Mailing List.

        rec Richard Eckart de Castilho added a comment -

        Sounds good. It also sounds like you are using git. Please mind that (afaik) I cannot directly apply a pull request, because the git repository is just a clone of our actual primary svn repository. As far as I know, I have to apply the patch manually to svn, losing any history, attribution, and whatever else may be associated with a regular pull request.

        lfoppiano Luca Foppiano added a comment -

        Richard, perhaps I didn't express it clearly. I just wanted to say that the dots as separators are misleading (at least that is what happened to me).

        I personally like the idea of META-INF/uimafit/**/types.txt, let's see what other people say.

        I've added a footnote in the documentation. I'll push the change later today.

        rec Richard Eckart de Castilho added a comment - - edited

        Other software tends to use just a simple directory under META-INF or not even any directory:

        • Maven: META-INF/maven/... (actually META-INF/maven/<groupId>/<artifactId>/...)
        • CXF: META-INF/cxf/...
        • Spring: META-INF/spring.schemas

        I used the long name because I fear that short names bear too much potential for conflicts. Before changing the name in any way, there should be convincing reasons and a really good best practice. Since uimaFIT has had this feature, you're actually the first to complain about this particular aspect (flat naming vs. hierarchic naming).

        Given the other requirements in this issue, I'd probably tend to switch to something like META-INF/uimafit/<groupId>/<artifactId>/types.txt … but don't nail me down on this just yet.

        Let's wait for some more feedback from other people.

        lfoppiano Luca Foppiano added a comment -

        Aha!!!

        So the directory name is actually org.apache.uima.fit, and the dots are not intended as package separators.

        To be honest, this should be changed to something more user-friendly.

        rec Richard Eckart de Castilho added a comment -

        I believe in your case this is not working because you use the wrong folder structure.

        You use

        META-INF/org/apache/uima/fit/types.txt
        

        but it should be

        META-INF/org.apache.uima.fit/types.txt
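
        The difference between the two layouts can be made explicit with a small check. This is only a sketch: the class and method are hypothetical helpers, not part of uimaFIT, and the expected location is the one stated in this comment.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class TypesTxtLocationCheck {
    // "org.apache.uima.fit" is a single directory name, not a package path.
    static final Path EXPECTED = Paths.get("META-INF", "org.apache.uima.fit", "types.txt");

    // Returns true only if the resource path matches the expected location exactly.
    static boolean isExpectedLocation(String resourcePath) {
        return Paths.get(resourcePath).equals(EXPECTED);
    }

    public static void main(String[] args) {
        // Dotted directory name: three path elements, matches.
        System.out.println(isExpectedLocation("META-INF/org.apache.uima.fit/types.txt"));
        // Package-style nesting: six path elements, does not match.
        System.out.println(isExpectedLocation("META-INF/org/apache/uima/fit/types.txt"));
    }
}
```

        The first call prints true and the second prints false, because the nested layout produces a different resource path that the default search pattern never visits.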
        
        lfoppiano Luca Foppiano added a comment - - edited

        I have run into this problem as well. Here are my assumptions:

        1. I'm using maven
        2. I've placed the file types.txt into the directory META-INF/org/apache/uima/fit/
        3. When I run the pipeline, the file types.txt is not found

        I've debugged into the Spring classes and beyond, and it is not clear to me why this is not working properly.

        My workaround is to use the JVM parameter -D....import_patterns to define the destination xml to parse.

        I've tried to patch uima-fit-core by adding /**/ but it didn't work.


          People

          • Assignee: rec Richard Eckart de Castilho
          • Reporter: rec Richard Eckart de Castilho
          • Votes: 1
          • Watchers: 3
