Solr / SOLR-3405

maven artifacts should be equivalent to binary packaging

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, 5.0
    • Component/s: Build
    • Labels: None

      Description

      Let's take the commons-csv scenario:

      • apache-solr-3.5.0 binary distribution contains no actual commons-csv.jar anywhere,
        in fact it contains no third-party jars (the stuff present in solr/lib) at all.
      • binary distribution contains only the jars necessary for solrj and contrib plugins, and a solr.war

      I think the maven artifacts should match what's in the binary release (no third-party jars
      inside the .war are "exposed", we just publish the .war itself). This exposes a lot less surface area.

          Activity

          Benson Margulies added a comment - - edited

          What you did is perfect in every way if you want to publish the JAR so that API-style users get the benefit, but it's a lot of work if all you want is to put a patch into a war or an assembly.

          How does the following alternative strike you for getting patched binaries into the war without them leaking anywhere or renaming packages?

          WAR FILES:

          Some script (probably in ant):

          1. Grabs and patches the source (not changing the package) and builds a jar for each patched item.
          2. The results are assembled into a 'sparse war file' (just containing WEB-INF/lib/all-them-jars).
          3. mvn install:install-file (or the maven ant tools) push the results to the local repository.
          4. the pom for the war file lists the results as an 'overlay'.
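
Step 2 of the list above could be sketched roughly as follows. This is only an illustration in Python, not the project's Ant build: the function names and the jar names passed to them are hypothetical, and steps 1, 3 and 4 (patching, install:install-file, and the overlay declaration in the war's POM) are not shown.

```python
import io
import zipfile


def build_sparse_war(jars):
    """Return the bytes of a 'sparse' war holding only WEB-INF/lib/<jar> entries.

    `jars` maps a jar file name to its raw bytes (a hypothetical input shape).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as war:
        for name in sorted(jars):
            war.writestr("WEB-INF/lib/" + name, jars[name])
    return buffer.getvalue()


def list_entries(war_bytes):
    """List the entry names stored in a war built by build_sparse_war."""
    with zipfile.ZipFile(io.BytesIO(war_bytes)) as war:
        return war.namelist()
```

Because the archive contains nothing but WEB-INF/lib, overlaying it onto the real war would only add or replace jars there.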

          It seems to me that the WAR file is the whole show here, since all the patched binaries go inside the war? If that's not so, let me know.

          Uwe Schindler added a comment - - edited

          I don't understand this issue, and I don't want to reconstruct the whole thing by scanning through the whole ML thread again. Can we conclude in the issue description what we should do here? I think the 3.6 artifacts in Lucene and Solr are exactly the way we want them? They work out of the box and start a working Solr? What should be changed?

          Robert Muir added a comment -

          I don't think we have to attack the patched binaries scenario right now on this issue?

          I just want the current maven artifacts to be consistent with what's in apache-solr-xx.tar.gz

          If maven is consistent with the binary release, I think there will be a lot less concern
          about maven, because then we know what we are 'publishing'.

          But currently we don't! Maven is different here, and that should be fixed so its release
          artifacts are consistent with the binary package.

          Benson Margulies added a comment -

          Oh, drat. I thought I was cleverly reducing noise on the list by parking this idea here. Sorry.

          Uwe Schindler added a comment -

          How does it differ? I don't understand it, sorry! Do you mean the JAR files are different? Please give an example.

          Robert Muir added a comment -
          $ unzip -l apache-solr-3.5.0.zip | grep commons-csv
          $ 
          

          But,

          http://search.maven.org/#artifactdetails|org.apache.solr|solr-commons-csv|3.5.0|jar
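
The same check can be scripted. A minimal sketch, assuming the distribution archive is available locally; `entries_matching` is a hypothetical helper name, and it mimics `unzip -l ... | grep` by substring-matching entry names.

```python
import zipfile


def entries_matching(archive, needle):
    """Return the archive entry names that contain `needle`.

    `archive` is a path or a file-like object accepted by zipfile.ZipFile.
    """
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if needle in name]
```

An empty list for "commons-csv" corresponds to the empty grep output shown above.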

          Uwe Schindler added a comment -

          OK, that explains it.

          In my opinion, Solr should not deploy any maven artifact except SOLRJ and the WAR file.

          Michael McCandless added a comment -

          +1

          Ryan McKinley added a comment -

          wasn't this solved in 3.6?

          Robert Muir added a comment -

          Not at all!

          Look at the issue title: apache-solr-3.6.0.zip does not contain
          the third party jars used in the war file (such as guava.jar), go look inside the zip.

          But the maven artifacts expose these inner details: http://search.maven.org/remotecontent?filepath=org/apache/solr/solr-core/3.6.0/solr-core-3.6.0.pom

          This is why the commons-csv issue emerged.

          Ryan McKinley added a comment -

          why would we distribute guava.jar? Seems like we are doing the right thing here.

          Robert Muir added a comment -

          OK, I'll go change it to use a patched guava jar. Now what?

          Now it's a serious problem for maven (and we have to make either a "solr-guava" fake release, or suck in all of their code).

          But it's no problem for any of our other packaging:

          • source build can download + patch
          • binary dist doesn't include guava jar anyway

          This needs to be fixed (maven should be equivalent to binary packaging), or we shouldn't publish any maven at all.

          So maven just needs the .war and solrj in its artifacts. It doesn't need all this other stuff.
          This makes 50 or so third-party dependencies so much simpler.

          The reduced exposure prevents things like the commons-csv problems (totally 100% a maven problem, as I've always said; look at my comment above).

          It also makes it so that PMC members who don't understand maven can simply look at the binary release
          and understand what we are ALSO releasing into maven.

          Steve Rowe added a comment -

          So under this proposal, which of these would NOT be published on maven central?:

          • solr-core-X.Y.Z.jar
          • solr-test-framework-X.Y.Z.jar
          • solr-<contrib-name>-X.Y.Z.jar

          If I understand properly, under this proposal, the Solr war would be published on maven central, but several maven proponents have said that that is not useful. By contrast, I believe there are people who currently depend on solr-core and solr-test-framework via Maven.

          For solrj, to make maven artifacts consistent with the binary distribution, I think the POM should mark as optional those dependencies that don't ship with the binary distribution (that may already be the case, I haven't checked).

          Robert Muir added a comment -

          I think the contribs are actually in our package (along with their third party dependencies!)

          So in my opinion, they should also be in maven: it should match.

          Steve Rowe added a comment -

          But the maven artifacts expose these inner details: http://search.maven.org/remotecontent?filepath=org/apache/solr/solr-core/3.6.0/solr-core-3.6.0.pom

          I think the contribs are actually in our package (along with their third party dependencies!) So in my opinion, they should also be in maven: it should match.

          Ok, so if I understand correctly, the problem as you see it is not the binary jars/war that are published on Maven Central (AFAICT, the set of jars/war in Maven Central are the same as in Solr's binary distribution), but rather the POMs associated with them that refer to third-party artifacts, like commons-csv. Right?

          Robert Muir added a comment -

          Yeah: I mean we can look at this two ways:
          1) that the solr binary package is broken by just shipping solr-core.jar without its dependencies
          2) that the maven package is over-reaching by needing to specify them.

          I think, more importantly than anything else (as mentioned in this issue's title), that they should match.

          If it's so important to use solr-core.jar (but not the war), we could add these dependencies
          to the binary release too.

          However, we should think seriously about this, because we are talking about a lot of third-party dependencies,
          a lot more to be responsible for, and trickier handling of patched dependencies. And I've never heard
          anyone complain about e.g. guava.jar not being in the binary package, ever. But maybe I'm missing something.

          I hope this makes sense: the fact that they are different I think is the worst.

          Robert Muir added a comment -

          Or just said another way, we are currently releasing solr two different ways as binary:

          1. as an "app" (war file) in the .zip
          2. with its "guts exposed" on maven

          We should be able to come to an agreement about what needs to be in the binary release,
          and how it will be packaged, whether solr is an application or not, etc. We have to.

          It's absurd to be releasing it two completely different ways.

          Steve Rowe added a comment -

          the fact that they are different I think is the worst.

          Stated another way: POMs for Solr jars/war published on Maven Central should never require (i.e., have a non-optional dependency on) a third party artifact if that third party dependency is not directly included in the binary package; the contents of the war don't count as "inclusion in the binary package".
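
One way to make this rule checkable is to compare a POM's required dependencies with the jars that ship in the binary package. A sketch under stated assumptions: the function names are hypothetical, `shipped` is a set of artifactIds the caller has gathered from the binary distribution, and scopes, parent POMs, and dependencyManagement are ignored.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def required_dependencies(pom_xml):
    """Return (groupId, artifactId) for each non-optional dependency in the POM."""
    root = ET.fromstring(pom_xml)
    required = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        optional = dep.findtext("m:optional", default="false", namespaces=POM_NS)
        if optional.strip().lower() == "true":
            continue
        group = dep.findtext("m:groupId", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", namespaces=POM_NS)
        required.append((group, artifact))
    return required


def violations(pom_xml, shipped):
    """Required dependencies whose artifactId is not in the `shipped` set."""
    return [dep for dep in required_dependencies(pom_xml) if dep[1] not in shipped]
```

A dependency marked `<optional>true</optional>` never shows up as a violation, which matches the "non-optional dependency" wording of the rule.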

          Steve Rowe added a comment -

          2. with its "guts exposed" on maven

          hmm, by "guts exposed" you mean: the Solr Maven artifacts' POMs document their dependencies. Right?

          Robert Muir added a comment -

          Well, I think so. I mean the way maven publishes solr, it publishes it as if it were an api, not an application.
          But the binary release treats solr as an application. This is a big difference!

          Because of this we previously published some war dependencies (commons-csv) as api in maven too.
          This is what got people all upset, but if you look at our binary package we don't ever package their stuff up this way.

          Releasing an application is easier. We don't care about dependencies (except that they are legal), just that our .war works.
          And if the .war also wants to be in maven, then it should declare no dependencies (it works by itself).

          Ryan McKinley added a comment -

          it publishes it as if it were an api, not an application.

          solr-core.jar is an API (how would anyone write RequestHandlers,Components,etc,etc w/o it!)

          solr.war is an application

          Benson Margulies added a comment - - edited

          It might be helpful to note the following: with 3.5.0, 3.6.0, and 4.0-SNAPSHOT, I can create a Maven project with a dependency on solr-core, and have all the necessaries show up to successfully use EmbeddedSolrServer. The result is:

          [INFO] +- org.apache.solr:solr-core:jar:3.5.0:provided
          [INFO] |  +- org.apache.solr:solr-solrj:jar:3.5.0:provided
          [INFO] |  |  \- org.codehaus.woodstox:wstx-asl:jar:3.2.7:provided
          [INFO] |  +- org.apache.solr:solr-noggit:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-core:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-analyzers:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-highlighter:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-memory:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-misc:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-queries:jar:3.5.0:provided
          [INFO] |  |  \- jakarta-regexp:jakarta-regexp:jar:1.4:provided
          [INFO] |  +- org.apache.lucene:lucene-spatial:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-spellchecker:jar:3.5.0:provided
          [INFO] |  +- org.apache.lucene:lucene-grouping:jar:3.5.0:provided
          [INFO] |  +- org.apache.solr:solr-commons-csv:jar:3.5.0:provided
          [INFO] |  +- commons-codec:commons-codec:jar:1.5:provided
          [INFO] |  +- commons-fileupload:commons-fileupload:jar:1.2.1:provided
          [INFO] |  +- commons-httpclient:commons-httpclient:jar:3.1:provided
          [INFO] |  +- org.slf4j:jcl-over-slf4j:jar:1.6.3:provided
          [INFO] |  +- commons-io:commons-io:jar:1.4:provided
          [INFO] |  +- commons-lang:commons-lang:jar:2.4:provided
          [INFO] |  +- com.google.guava:guava:jar:r05:provided
          [INFO] |  \- javax.servlet:servlet-api:jar:2.4:provided
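
To see which entries in a tree like this come from outside the project, the output can be filtered by groupId. A rough sketch; the function name and the choice of "own" groupIds are assumptions made for illustration.

```python
import re

# Captures the coordinates that follow a "+- " or "\- " branch marker,
# e.g. "[INFO] |  +- commons-io:commons-io:jar:1.4:provided".
COORDS = re.compile(r"[+\\]- (\S+)$")


def third_party(tree_lines, own_groups=("org.apache.solr", "org.apache.lucene")):
    """Return group:artifact:version for tree entries outside `own_groups`."""
    found = []
    for line in tree_lines:
        match = COORDS.search(line.rstrip())
        if not match:
            continue
        parts = match.group(1).split(":")
        if len(parts) < 5:
            # Expect at least group:artifact:type:version:scope.
            continue
        group, artifact, version = parts[0], parts[1], parts[-2]
        if group not in own_groups:
            found.append(f"{group}:{artifact}:{version}")
    return found
```

Run over the tree above, a filter like this would report wstx-asl, jakarta-regexp, the commons-* jars, jcl-over-slf4j, guava and servlet-api, but not solr-commons-csv or solr-noggit, since those are published under the org.apache.solr groupId.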
          
          
          Robert Muir added a comment -

          Benson, but you cannot do this with the binary release right? surely not? there is no guava.jar in the binary package, nor any lucene jars.

          Like I said, the release should be the same.

          It can't be: we release solr as an application on lucene.apache.org, but separately/differently as an API over on sonatype.com

          Steve Rowe added a comment -

          It can't be: we release solr as an application on lucene.apache.org, but separately/differently as an API over on sonatype.com

          I disagree. The official Solr binary distribution includes the API jars outside of the .war.

          Steve Rowe added a comment - - edited

          there is no guava.jar in the binary package, nor any lucene jars.

          In the 3.6.0 binary dist, there are lucene jars under contrib/analysis-extras/lucene-libs/. In the 4.0 release, assuming the uima contrib follows the same pattern, it too will have a lucene-libs/ directory containing the Lucene uima module's jar.

          Robert Muir added a comment -

          The binary release simply doesn't include any third-party libraries. It's different packaging.

          I think that if maven supporters had honestly said up front:

          putting your war application in maven means we must expose it as if it were an API and take responsibility that also all its 100+ jars are also themselves in maven (even patched, renamed, etc)

          then nobody in their right mind would have agreed to this.

          Let's either drop maven artifacts for solr completely, or package maven artifacts like everything else
          to prevent problems like commons-csv in the future!

          Steve Rowe added a comment -

          Let's either drop maven artifacts for solr completely, or package maven artifacts like everything else to prevent problems like commons-csv in the future!

          Robert, the only difference between solr binary distribution and maven artifacts is the POMs, not the jars or the war.

          When you say "packaging" you imply that the Solr Maven artifacts "include" 3rd party jars. They don't. Their POMs say that those 3rd party jars are required via non-optional dependency declarations.

          Benson Margulies added a comment - - edited

          Wait, who said "putting your war application in maven means we must expose it as if it were an API and take responsibility"? It's not true. It might be a default behavior of the maven-war-plugin in simple cases, but that's different.

          Anyway, to answer the previous question, no, of course I can't do that with the binary package.

          The issue here should not be the war file. If there's an issue, it's the dependency tree of solr-core as an ordinary dependency, and whether we want it to list (a) ordinary released versions of third party stuff, (b) patched versions of third party stuff, or (c) no versions of third party stuff. If you want (c), then <optional>true</optional> makes sense to me, as it allows Steve's maven build to work and leaves the dependency management for these things to the end user. Inconvenient but safe.

          Robert Muir added a comment -

          The issue goes much deeper. The issue also involves taking responsibility for third party jars also being in maven.

          Is it a requirement that, if I commit a patch, the dependency MUST be in maven? What if it's not?

          What if I commit https://issues.apache.org/jira/secure/attachment/12521033/SOLR-3296_noggit.patch right now?

          How will maven work?

          If everyone wants to pretend that maven is totally ok here, then I'll go commit that patch and let's see!

          Benson Margulies added a comment -

          Rob, my experience here is that you pose a very specific question (e.g. do war files force public dependencies) and when I answer it, you switch the subject to a different question. Not an illegitimate or uninteresting question, but a different question.

          The instantaneous effect of committing that patch will be to break the convenience maven build until someone else does something else. If noggit is out there on central, then the fix will be a trivial adjustment to the template pom. If it's not, then my suggestion for a relatively painless solution is

          1) to add a CSV file to the top of the tree, where each line consists of:

          URL,GROUP-ID-INVENTED,ARTIFACT-ID-INVENTED,VERSION

          2) To add each one as a dependency to the corresponding pom with <optional>true</optional>

          3) implement code in the 'ant get-maven-poms' target to download them and run maven install:install-file on them using the information in the CSV.
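
Step 3 could look roughly like the sketch below, written in Python rather than Ant purely for illustration. The function name, the `downloads` directory, and the assumption that every artifact is a jar are hypothetical; the `-Dfile`, `-DgroupId`, `-DartifactId`, `-Dversion` and `-Dpackaging` parameters are those of the maven-install-plugin's install-file goal. Downloading the files is left out.

```python
import csv
import io
import posixpath
from urllib.parse import urlparse


def install_commands(csv_text, download_dir="downloads"):
    """Build one `mvn install:install-file` argument list per CSV row.

    Each row is URL,GROUP-ID,ARTIFACT-ID,VERSION, as in the proposal above.
    """
    commands = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        url, group_id, artifact_id, version = (field.strip() for field in row)
        # Assume the downloaded file keeps the last path segment of the URL.
        file_name = posixpath.basename(urlparse(url).path)
        commands.append([
            "mvn", "install:install-file",
            "-Dfile=" + posixpath.join(download_dir, file_name),
            "-DgroupId=" + group_id,
            "-DartifactId=" + artifact_id,
            "-Dversion=" + version,
            "-Dpackaging=jar",
        ])
    return commands
```

Each returned list can be handed to a process runner once the corresponding file has been fetched into `download_dir`.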

          If you all want one of these to be a non-optional dependency, then it's a job for someone to coax it onto central, probably via ossrh. That's work, but it doesn't have to happen in a hurry.

          The CSV file could be created by scraping the ivy files, but that seems a lot of work.

          Steve, of course, gets first dibs on solving the problem, and he might not like my proposal.

          Robert Muir added a comment -

          Rob, my experience here is that you pose a very specific question (e.g. do war files force public dependencies) and when I answer it, you switch the subject to a different question. Not an illegitimate or uninteresting question, but a different question.

          I agree it's somewhat off-topic; I'm just trying to point out that these 'implementation-detail' jars have real costs and are not free. By exposing them the way it does, Maven more than doubles the surface area of responsibility for third-party jars compared to the binary packaging.

          And because Maven cannot download jar files from anywhere except Maven repositories, it really boxes you into a corner as far as managing dependencies. Would it really be so bad if someone adds a maven plugin that can just download a jar file from any http location? I could call it 'maven-antivirus-plugin'?

          Steve Rowe added a comment -

          If noggit is out there on central, then the fix will be a trivial adjustment to the template pom. If it's not, then my suggestion for a relatively painless solution is

          1) to add a CSV file to the top of the tree, where each line consists of:

          URL,GROUP-ID-INVENTED,ARTIFACT-ID-INVENTED,VERSION

          2) To add each one as a dependency to the corresponding pom with <optional>true</optional>

          3) implement code in the 'ant get-maven-poms' target to download them and run maven install:install-file on them using the information in the CSV.

          Benson, the Maven build used to be able to deal with "non-Mavenized" 3rd party jars, using a mechanism like you suggest (except that it pulled jars, & optionally POMs, from the local file system instead of from a URL). That capability was removed in preparation for the 3.6 release.

          You can see what it used to look like in r1298247 of the Lucene/Solr grandfather POM - it was a profile that listed all of the necessary jars to pull from lib/ directories and put into the local maven repository. Users were instructed to invoke it prior to using the Maven build: mvn -N -Pbootstrap install.

          Fixing this aspect would simply require putting that stuff back for non-Mavenized jars. This is how the Maven build worked before the era of Fake Maven Releases of Other People's Software (FMROOPS).

          Robert Muir added a comment -

          So then I can commit my patch, and we could release tomorrow and maven should work? Great!

          But I suspect this isn't the case; you are conflating the 'maven build' with 'maven artifacts', no?

          Steve Rowe added a comment -

          So then I can commit my patch, and we could release tomorrow and maven should work? Great!

          But I suspect this isn't the case; you are conflating the 'maven build' with 'maven artifacts', no?

          I am not conflating the two; as Benson mentioned, marking those non-mavenized dependencies as optional in the POMs of modules that need them would allow "maven artifacts" on Maven Central to be useable.

          The mode of use, however, would be:

          • check out the tagged release, including dev-tools, from subversion (or maybe from git instead?)
          • run ant get-maven-poms resolve ; mvn -N -Pbootstrap install in order to put the non-mavenized jars in one's local maven repository
          • add optional dependencies, using Benson's "groupId-invented/artifactId-invented" coordinates, to one's own project's POMs.
          Benson Margulies added a comment -

          There is some work to do before you can 'commit today and release tomorrow.' Steve didn't claim to the contrary and neither did I. It's not a ton of work, and it's not complex. If you want to avoid jars in svn, you'll need my download idea; if you don't mind jars in svn (and I've lost track of the Apache <del>politics</del> rules on those), then you just reactivate the old scheme.

          If any of those jars are patched, my suggestion to avoid controversy is <optional>true</optional>.

          Steve Rowe added a comment -

          Would it really be so bad if someone adds a maven plugin that can just download a jar file from any http location? I could call it 'maven-antivirus-plugin'?

          +1

          Steve Rowe added a comment -

          If you want to avoid jars in svn, you'll need my download idea

          Not really - just use ant resolve prior to running the maven build.

          Steve Rowe added a comment -

          RE: maven-antivirus-plugin, as an experiment, I added the following to the Lucene/Solr grandfather POM, and Maven (v2.2.1 & v3.0.4) didn't barf:

          <project ... xmlns:maven-antivirus-plugin="http://example.org/maven-antivirus-plugin" ...>
            ...
            <properties>
              <maven-antivirus-plugin:url coordinates="groupId-invented:commons-csv:1.0-dev-r609327:jar">
                http://example.com/somewhere/commons-csv-1.0-dev-r609327.jar
              </maven-antivirus-plugin:url>
              <maven-antivirus-plugin:url coordinates="groupId-invented:commons-csv:1.0-dev-r609327:pom">
                http://example.com/somewhere/commons-csv-1.0-dev-r609327.pom
              </maven-antivirus-plugin:url>
              ...
            </properties>
          

          So this is a syntactically valid way to shoehorn download links for such a plugin into a POM.

          Steve Rowe added a comment -

          Re: maven-antivirus-plugin - looks like it's already been built: http://evgeny-goldin.com/wiki/Ivy-maven-plugin

          Steve Rowe added a comment -

          http://evgeny-goldin.com/wiki/Ivy-maven-plugin

          Licensed under Apache License v2.0: https://github.com/evgeny-goldin/maven-plugins/blob/master/ivy-maven-plugin/src/main/resources/license.txt, so it could be forked if necessary.
          Benson Margulies added a comment -

          You might be surprised by this, but I agree with you.

          I could build that plugin for you, mostly. Here's what I can't do for you. I can't arrange for you to declare a dependency in terms of the URL to a JAR. I'm sorry, but I can't undo the narrow-minded thinking of the founder of maven. What I can do is make it possible to have a two-pass maven build: the first run of maven would use such a plugin to download things (and, if you like, patch and build them from source), so that the second run would just find them in the local repo.

          Actually, I'm not quite being truthful. Maven has an extension architecture for talking to repos called 'wagons'. I think that I could set up a wagon that defined a 'repository' in terms of that CSV file I described above. Not too awful, come to think of it. You add a declaration of that wagon to the pom, and a rather funny <repository> element to the pom ... but consumers might not thank you.

          The central tenant of maven thinking (who does not pay enough rent) is that it's never a big deal to grab a jar and stick it in some convenient repo and use it, so why do we need to allow for getting jars from anywhere else? And lots of people all over find this tolerable. You don't, and I'm not particularly motivated to tell you that you're wrong. Still and all, given a day's warning of the need, Steve or I or anyone else who cared to do the reading could get noggit or anything else onto Central via OSSRH. If we want to ask the author first, we need time for a response. If we just want to push it under our own coordinates (I'd use 'us.dchbk'), then it's just the time it takes the jar to wander out there.

          Dawid Weiss added a comment -

          Re: maven-antivirus-plugin - looks like it's already been built: http://evgeny-goldin.com/wiki/Ivy-maven-plugin

          Interesting find, Steve. It won't allow you to declare regular dependencies though, will it? I mean: I tried to write a plugin that would fetch a JAR and declare a system dependency on it locally, but even the validation phase is performed after dependency resolution, so this failed. Didn't try the above plugin but from the description I see it attaches jars directly to reactor's classpath, bypassing regular dependency resolution?

          Steve Rowe added a comment -

          Didn't try the above plugin but from the description I see it attaches jars directly to reactor's classpath, bypassing regular dependency resolution?

          I haven't tried it yet either, but yes, I too think it's bypassing regular dependency resolution. However, it's hooking into Ivy's capabilities, which makes me think this could be a long term solution for Lucene/Solr.

          Steve Rowe added a comment -

          From the description

          binary distribution contains only the jars necessary for solrj and contrib plugins, and a solr.war

          This is plainly false: all Solr jars, including solr-core and solr-test-framework, are included in the 3.6 binary distribution outside of the war.

          Robert Muir added a comment -

          It's really not; I am talking about third-party jars.

          Like I said: the binary distribution doesn't expose these third-party jars, nor even list what they are.
          The maven distribution requires these to be published.

          Just look at the zip! There is no guava.jar or any of those other solr/lib
          dependencies included in the zip; however, maven exposes these dependencies that are "impl details of the war".

          The only third party dependencies included are:

          • solrj_lib (the very few the client library needs to work)
          • solr contrib plugins (since they are plugins and need these to work)
          Steve Rowe added a comment -

          It's really not; I am talking about third-party jars.

          It really is. You are also talking about the difference between an app and an API. If the API jars are included, then the binary dist is not exclusively an app.

          Robert Muir added a comment -

          But these inner dependencies are not exposed as APIs.

          Now you can see why the commons-csv thing was surprising to us: we package it inside the war only,
          as an implementation detail.

          If someone wants to use solr-core.jar and needs commons-csv, it's up to them to get it: we weren't PUBLISHING IT!

          On the other hand: maven distribution was!

          Steve Rowe added a comment -

          Robert,

          Call me crazy, but I've read your comments on this issue as claiming that we should not publish solr-core (etc.) on Maven Central, because we don't do that in the binary dist. Well, we do do that in the binary dist.

          So, this time without avoiding the question: why should we not publish solr-core on Maven Central?

          Robert Muir added a comment -

          So, this time without avoiding the question: why should we not publish solr-core on Maven Central?

          Because maven requires that its dependencies are also in maven, whereas the binary distribution does not:
          it exposes its "innards".

          Let's talk about how we can make some concrete progress on this issue, throwing aside COMPLETELY the whole
          .war-third-party-exposure, and the fact that we are releasing as an "application" one way and as an "api" another way.
          Let's just table that for a second, since we will probably end up disagreeing on it anyway.

          I think the maven artifacts should not be built from the source tree, they should instead be built from
          the binary release (e.g. unzipping the .zip + augmenting with poms). If we build them this way, this has
          a number of advantages:

          1. The exact same jar files etc. that are in the binary release are put into the maven/ folder; they are just
            augmented with poms.
          2. We can now easily validate that the maven/ folders don't contain anything (besides pom.xmls etc.) that
            isn't found by unzip -l on the binary release. We can also test that these jar files are exactly the same.

          I think this would be a good, non-controversial step to improving the situation. Such a check would have
          detected the commons-csv situation, no? It also gives us some more faith in the maven artifacts, since
          they are the exact same jar files we are testing in the binary package.

          We could do this with lucene, too.

          Robert Muir added a comment -

          And in the idea above, obviously -sources.jar and -javadocs.jar are "exempt", as they
          are maven-specific and not in the binary packaging. That's fine: I'm talking about
          the actual binary jars. Our checking script would exclude those.

          I think currently these are "the same" in the sense that
          they are built from the same code, but currently have timestamp differences as they
          are pulled from build/.

          On the non-maven side there are improvements like this as well: for example, I think
          the lucene jars used by solr are "rebuilt" in the process. But I think it would be
          more ideal if solr 'prepare-release', when populating the jar, populated these lucene
          jars from lucene's binary release in dist/ the same way: so they are the exact same
          jars that were released in the lucene binary distribution.

          I don't think this stuff has to be done immediately, and I know it's complicated and being
          really pedantic, but I think it would be a good step.
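The checking script described above might look something like the sketch below. It is only an illustration under stated assumptions: the function name `jars_not_in_release` is made up, jars are matched by SHA-1 of their bytes rather than by file name (so differently named but identical jars still match), and `-javadoc.jar` is accepted alongside `-javadocs.jar` as an exempt suffix.

```python
import hashlib
import zipfile
from pathlib import Path

# Maven-only jars that the binary release does not ship (assumed suffixes).
EXEMPT_SUFFIXES = ("-sources.jar", "-javadoc.jar", "-javadocs.jar")


def jars_not_in_release(binary_zip, maven_dir):
    """Return names of jars under maven_dir that are not byte-identical
    to any jar listed directly in the binary release zip."""
    with zipfile.ZipFile(binary_zip) as zf:
        # Only top-level zip entries are inspected; jars nested inside
        # a .war are deliberately not unpacked.
        released = {
            hashlib.sha1(zf.read(name)).hexdigest()
            for name in zf.namelist()
            if name.endswith(".jar")
        }

    unexpected = []
    for jar in sorted(Path(maven_dir).rglob("*.jar")):
        if jar.name.endswith(EXEMPT_SUFFIXES):
            continue  # maven-specific, excluded from the comparison
        if hashlib.sha1(jar.read_bytes()).hexdigest() not in released:
            unexpected.append(jar.name)
    return unexpected
```

Because jars packed inside solr.war are never opened, a third-party jar that exists only inside the war would be reported if it also showed up under maven/. A jar rebuilt with different timestamps would be reported too, since the comparison is on exact bytes.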

          Steve Rowe added a comment -

          So, this time without avoiding the question: why should we not publish solr-core on Maven Central?

          Because maven requires that its dependencies are also in maven, whereas the binary distribution does not: it exposes its "innards".

          This is an argument against Maven generally, not exclusively the Solr artifacts; I view it as a thinly veiled re-assertion that Lucene/Solr should not support Maven at all. Again: -1.

          The fix here is not to stop publishing on Maven Central, but rather as you say on the issue: make the Maven Central artifacts like the binary artifacts. Using your logic, excluding solr-core from the Maven Central artifacts would make the two "not the same", and hence would be WRONG!!!

          Steve Rowe added a comment -

          I think the maven artifacts should not be built from the source tree, they should instead be built from the binary release (e.g. unzipping the .zip + augmenting with poms).

          +1, I've looked at doing this in the past but didn't see a quick way to do it.

          And in the idea above, obviously -sources.jar and -javadocs.jar are "exempt", as they
          are maven-specific and not in the binary packaging. That's fine: I'm talking about
          the actual binary jars. Our checking script would exclude those.

          I think currently these are "the same" in the sense that
          they are built from the same code, but currently have timestamp differences as they
          are pulled from build/.

          On the non-maven side there are improvements like this as well: for example, I think
          the lucene jars used by solr are "rebuilt" in the process. But I think it would be
          more ideal if solr 'prepare-release', when populating the jar, populated these lucene
          jars from lucene's binary release in dist/ the same way: so they are the exact same
          jars that were released in the lucene binary distribution.

          I don't think this stuff has to be done immediately, and I know it's complicated and being
          really pedantic, but I think it would be a good step.

          +1 to all of these ideas.

          Robert Muir added a comment -

          This is an argument against Maven generally, not exclusively the Solr artifacts; I view it as a thinly veiled re-assertion that Lucene/Solr should not support Maven at all. Again: -1.

          It's really not that. And though I've asserted this before (especially when maven had no tests, but now it does), when
          did I do this on the recent thread? I have stated that I think we shouldn't release maven if it's "different" than our
          other packaging, because I think that causes it to be more of a mystery. I opened this issue to improve the situation,
          not to have an issue to argue about maven. You can s/maven/rpm/ and I feel the same way about all of this: these are
          just different packaging formats, but I think the underlying products we release should be the same.

          I'm upset about the maven packaging on this issue because, in my opinion, it packages solr up like an API, which is
          different from our binary release, which packages it up to be used as an application. Frankly, you really can't
          do much else with the solr binary packaging except use it as an application: those solr-core.jar's etc. do you
          absolutely no good unless you hunt down all the jars yourself (or yank 'em out of solr.war/WEB-INF; maybe some IDEs do that).

          +1, I've looked at doing this in the past but didn't see a quick way to do it.

          I also don't think we should do it for 4.0, it's too risky. But we should look at it for the future. A few things to think about:

          • it's annoying when releasing lucene/solr that you can't do it all with one command line. So I think we would add a top-level "prepare-release" to trunk/build.xml that would simply invoke solr/ prepare-release. And solr's prepare-release would depend on lucene's. That would be nice as we have one single command for this.
          • since solr prepare-release now knows that lucene's is also built, I think it would be easier for it to use the jars from the lucene release. easier, not easy.

          that's just the non-maven parts, the maven stuff is more blurry to me.
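
Editorial sketch: one way to check that the Lucene jars shipped with Solr are the very same files as those in Lucene's own release is to compare digests. This is a minimal illustration in Python, not the project's actual build tooling. The function names are hypothetical, and it assumes for simplicity that the jars appear as plain entries in both archives (jars nested inside solr.war would need an extra unpacking step).

```python
import hashlib
import zipfile


def jar_digests(archive_path):
    """Map each .jar entry's base name to the SHA-1 of its bytes."""
    digests = {}
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith(".jar"):
                base = name.rsplit("/", 1)[-1]
                digests[base] = hashlib.sha1(archive.read(name)).hexdigest()
    return digests


def mismatched_jars(lucene_release, solr_release, prefix="lucene-"):
    """Return Lucene jar names in the Solr release that differ from,
    or are missing in, the Lucene release."""
    lucene = jar_digests(lucene_release)
    solr = jar_digests(solr_release)
    return sorted(
        name for name, digest in solr.items()
        if name.startswith(prefix) and lucene.get(name) != digest
    )
```

An empty result would mean every `lucene-*` jar in the Solr archive is byte-identical to its counterpart in the Lucene archive; a rebuilt jar with different timestamps would show up in the list.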

          Robert Muir added a comment -

          And yes, i did suggest as a compromise that perhaps we don't even put solr in maven at all, just lucene,
          (and this issue is supposed to be even more of a compromise, that we still put solr in maven, but package
          maven-solr up as an application just like the binary packaging). The latest suggestion is supposed to be
          even more of a compromise.

          The idea behind these compromises is so that people who like maven are happy, and so that PMC members
          who don't understand maven feel comfortable with us releasing maven artifacts and these threads about
          maven don't keep popping up anymore.

          Separately I do make vicious assaults on how maven works internally etc, because I think it deserves that.
          But that's unrelated to whether or not we release maven artifacts.

          Of course in an ideal situation we release lucene/solr and its instantly available everywhere in every single
          packaging format in perfect shape: rpm,yum,maven,bsd/macos ports,...: we just don't have the resources to do
          all of that.

          So when it comes to maven artifacts, you can expect me to be critical of it in the future, especially when
          its behavior differs from the other artifacts (like app versus API).

          None of this is an assault on the idea of us producing 'maven artifacts', none of it is saying
          "i don't see the value of maven artifacts", or "maven artifacts can't do cool things", or any of that.

          And sometimes when i say 'maven' it's confusing whether i refer to 'maven the build system' or 'maven the artifacts'.
          This is because maven itself makes this confusing by conflating multiple things. It's not my fault.

          It's just trying to get this packaging stuff under control.

          Michael McCandless added a comment -

          I have stated that I think we shouldn't release maven if its "different" than our
          other packaging because I think that causes it to be more of a mystery.

          you can s/maven/rpm/ and i feel the same way about all of this: these are
          just different packaging formats but I think the underlying products we release should be the same.

          I think the maven artifacts should not be built from the source tree, they should instead be built from the binary release (e.g. unzipping the .zip + augmenting with poms).

          +1

          This would make me more comfortable with our Maven artifacts...

          Do we know of any downstream repos that package up Solr? Do they
          also match the artifacts in our binary release?

          Could such a stronger decoupling of "our releases" and "pushing
          to Maven Central" also mean that issues like SOLR-2770 (where, I
          think, only the Maven POMs were messed up for the 3.4.0 release) might
          be correctable in the future w/o having to cut another "real"
          release...?

          David Smiley added a comment -

          There is a lot of discussion here and I don't want to complicate anything.

          What I do want to say, as a user of Maven and of Lucene/Solr's Maven artifacts specifically, is that it is awesome that I can have a maven based project that has a dependency on the Solr test framework and it just works thanks to all of the dependency resolution of Maven, and thanks to Maven and IDE integration, IntelliJ grabs all the source which helps tremendously – it's automatic. My code can either be strictly a SolrJ client or it can extend Lucene or Solr. I don't want this to go away. Once upon a time it didn't work or the declared dependency metadata was poor and I did my part in making it work well (and certainly Steve did too).

          Robert Muir added a comment -

          I don't care if maven can cook me dinner or get me a beer out of the fridge.

          Thousands of people can comment on this issue about how great it is, no one cares.

          The bottom line is that people are going to be uncomfortable with it being in our releases,
          and these threads will continue to pop up, as long as the maven artifacts are handled
          differently from the other packaging: it's just that simple.

          By making it consistent with the other packaging people are less likely to complain,
          because then it's not such a mystery and isn't a "separate/different product".

          Dawid Weiss added a comment -

          By making it consistent with the other packaging people are less likely to complain,
          because then it's not such a mystery and isn't a "separate/different product".

          I think that's the point David was making – if you go with manual POM + released JARs packaging then things will actually be of poorer quality (and very likely broken) for lots of maven users.

          Robert Muir added a comment -

          then things will actually be of poorer quality (and very likely broken) for lots of maven users.

          I'm not sure 'lots' is the correct word here. I think the vast majority of solr users use it
          as an application. The vocal ones here are the ones that are committers who PREFER to use maven,
          but that's a vocal minority.

          Uwe Schindler added a comment -

          then things will actually be of poorer quality (and very likely broken) for lots of maven users.

          I'm not sure 'lots' is the correct word here. I think the vast majority of solr users use it as an application. The vocal ones here are the ones that are committers who PREFER to use maven, but thats a vocal minority.

          I completely agree with Robert here! Solr's only artifacts should be solrj and the war file.

          Ryan McKinley added a comment -

          Again, solr is an API and an application – the plugin structure is well advertised, promoted, and well used.

          If anything, this discussion points me to think that the binary dist should include a solr-lib folder (though I don't really care)

          Steve Rowe added a comment -

          If anything, this discussion points me to think that the binary dist should include a solr-lib folder (though I don't really care)

          It already does - the folder is called dist/, and it includes all of the API .jars right there alongside the war.

          Robert Muir added a comment -

          it doesn't include their libs. unzip it and see!

          Robert Muir added a comment -

          like i said: this whole issue came out of third party dependency issues.

          I've said it before, and I'll say it again (I might have to start copy/pasting myself?!):

          You can unzip the binary release and see that third party
          dependencies such as guava are not in it. Third party dependencies of solr
          are only inside the solr.war (treated as application).

          However the maven release treats it as an API, exposing the innards of the solr.war
          application and making us responsible for these additional dependencies.
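
Editorial sketch: the "unzip it and see" check can be scripted. The Python below is a rough illustration only; `locate_jar` is a hypothetical helper, and the archive layout it expects (jars as zip entries, with a .war somewhere in the distribution) is an assumption rather than a description of any specific release.

```python
import io
import zipfile


def jars_in(archive):
    """Return the sorted base names of .jar entries in an open ZipFile."""
    return sorted(
        n.rsplit("/", 1)[-1] for n in archive.namelist() if n.endswith(".jar")
    )


def locate_jar(release, jar_prefix):
    """Report whether jars whose names start with jar_prefix sit directly in
    the release archive or only inside a .war bundled in it."""
    with zipfile.ZipFile(release) as outer:
        top_level = [j for j in jars_in(outer) if j.startswith(jar_prefix)]
        in_wars = []
        for name in outer.namelist():
            if name.endswith(".war"):
                # Open the nested .war from memory and look at its entries.
                with zipfile.ZipFile(io.BytesIO(outer.read(name))) as war:
                    in_wars += [j for j in jars_in(war) if j.startswith(jar_prefix)]
    return {"top_level": top_level, "inside_war": sorted(in_wars)}
```

For a dependency that is only bundled in the application, `top_level` would come back empty while `inside_war` lists the jar, which is the distinction being drawn between the binary release and the Maven artifacts.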

          Steve Rowe added a comment -

          it doesn't include their libs. unzip it and see!

          I agree. I've never disagreed with this; it is a fact.

          By contrast, you have, at least from my perspective, asserted that Solr's API jars are not included in the official binary dist, when they clearly are. unzip it and see!

          Robert Muir added a comment -

          Nope, never. this issue has always been about third party dependencies.

          Robert Muir added a comment -

          I'm just going to repeat the description of the issue, since people are having problems finding it:

          Lets take the commons-csv scenario:

          • apache-solr-3.5.0 binary distribution contains no actual commons-csv.jar anywhere,
            in fact it contains no third party jars (the stuff present in solr/lib) at all.
          • binary distribution contains only the jars necessary for solrj and contrib plugins, and a solr.war

          I think the maven artifacts should match whats in the binary release (no third party jars
          inside the .war are "exposed", we just publish the .war itself). This exposes a lot less surface area.

          Ryan McKinley added a comment -

          right, so given the problem:

          binary distribution contains only the jars necessary for solrj and contrib plugins, and a solr.war

          The solution is to add the dependencies for solr-core.jar to the binary distribution.

          David Smiley added a comment -

          Ugh; I can't stay away from this soap opera train wreck.

          I'm not on the PMC so perhaps I should butt out, but if a successful conclusion to this JIRA issue means that dependencies such as commons-csv don't wind up in maven central, thus preventing me from effectively utilizing Solr as an API with Maven, I'm -1. All sorts of open-source dependencies are in maven central published "unofficially" using coordinates of another project that needed it there, customized or not. What's it to you?

          I understand if Rob, Mike, etc. want nothing to do with Maven and I think that's just fine. But please don't stand in Steve and I's way.

          Robert Muir added a comment -

          All sorts of open-source dependencies are in maven central published "unofficially" using coordinates of another project that needed it there, customized or not. What's it to you?

          I don't care. we shouldn't release other people's code. That's what got us into trouble in the first place.

          I understand if Rob, Mike, etc. want nothing to do with Maven and I think that's just fine. But please don't stand in Steve and I's way.

          A cavalier attitude about what's in our releases doesn't help increase our confidence in this maven business.

          Michael McCandless added a comment -

          I understand if Rob, Mike, etc. want nothing to do with Maven and I think that's just fine. But please don't stand in Steve and I's way.

          It's not that I want to stand in your way.

          I agree that many users want to consume Lucene/Solr from Maven's
          central repository, and I agree that users want to build their own
          projects, depending on Lucene/Solr, using Maven. That's all great.

          I want Lucene/Solr to be widely accessible/adopted and so pushing to
          Maven central helps achieve that goal.

          I just don't think it should be this PMC that votes on / pushes our
          released artifacts to Maven.

          Pushing to Maven has clear risks ("we" got "in trouble" for it), not
          all PMC members understand the Maven policies/conventions, it's a
          distraction ("we" are supposed to be focused on building great search
          engines around here).

          We don't push to all the other great repositories (apt, yum, FreeBSD,
          etc.) out there. We don't understand their conventions either. The
          PMC doesn't vote when a downstream package maintainer pushes our
          artifacts into their repository. Why should Maven central be any
          different from other repositories?

          And I still assert that a stronger decoupling of the PMC voting on the
          "true" Lucene/Solr artifacts from pushing-to-Maven-central would
          net/net be a win for Maven users. Eg, Lucene 3.4.0's Maven artifacts
          were broken (SOLR-2770), and now apparently also 3.6.0's (SOLR-3411).
          But if the two events were fully decoupled then the Maven POMs could
          be re-pushed without this PMC being involved. And issues like this
          ("which jars/wars should be pushed into Maven central... solr.war
          expanded or not") wouldn't be this PMC's business. The Maven experts
          would be free to make such decisions.

          Maybe... a possible compromise here would be to continue pushing to
          Maven central, but as a downstream event (after a release) within this
          project. Meaning, the PMC votes on the "original" sources/convenience
          binaries, but then the Maven experts around here can separately (once
          the vote passes) take that binary release, expand it, attach POMs,
          etc., and push to Maven central. This would mean the PMC doesn't vote
          on what's-pushed-to-Maven, but we continue using this project's
          infrastructure (svn, continuous builds, Jira, etc.) to push to Maven
          central. Could something like that work?
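
Editorial sketch: the "take that binary release, expand it, attach POMs" step could look roughly like the Python below. Everything here is an assumption for illustration: `stage_for_maven` is a hypothetical helper, it expects jar file names that already follow `artifactId-version.jar` (which may not match how the released jars are actually named), and it only builds the standard repository layout in memory rather than signing or uploading anything.

```python
import zipfile


def stage_for_maven(release, poms, group_path, version):
    """Build a {repository path: bytes} mapping from released jars plus POMs.

    `poms` maps an artifactId to its POM text. Only jars that have a POM are
    staged, and their bytes are copied from the release unchanged.
    """
    staged = {}
    suffix = f"-{version}.jar"
    with zipfile.ZipFile(release) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]
            if not base.endswith(suffix):
                continue
            artifact_id = base[: -len(suffix)]
            if artifact_id not in poms:
                continue
            # Standard layout: group/artifactId/version/artifactId-version.ext
            prefix = f"{group_path}/{artifact_id}/{version}/{artifact_id}-{version}"
            staged[prefix + ".jar"] = archive.read(name)
            staged[prefix + ".pom"] = poms[artifact_id].encode("utf-8")
    return staged
```

Because the jar bytes come straight out of the voted-on archive, only the POMs are new material in this flow, so a broken POM could in principle be corrected and re-staged without touching the jars.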

          Ryan McKinley added a comment -

          If I understand the concerns of this issue – it is that reviewing the binary distribution (the .zip/.tgz) does not fully expose the dependencies we assume.

          The core of that problem is that Solr's dependency structure is a mess. In SOLR-3400, we need to be explicit in ant about what dependencies are for solrj vs solr-core vs solr.war

          The solr.war dependencies are hidden implementation details. But the solr-core.jar file should include its dependencies too.

          Steve Rowe added a comment -

          Lucene 3.4.0's Maven artifacts were broken (SOLR-2770), and now apparently also 3.6.0's (SOLR-3411).

          I just resolved SOLR-3411 as "Not a Problem". The brokenness (from that issue's reporter's perspective) was an example of exactly the non-virality that you have been lobbying for, Mike.

          Steve Rowe added a comment - - edited

          But if the two events were fully decoupled then the Maven POMs could be re-pushed without this PMC being involved.

          Benson asserted elsewhere that if an ASF-external project wanted to push Lucene/Solr Maven artifacts, they would NOT be able to use org.apache.lucene/solr as the groupId for those artifacts. I view that as a significant problem, if it is in fact true.

          Robert Muir added a comment -

          Who enforces that?

          Chris Male had no problem putting up langdetect under com.cybozu.labs, and he has nothing to do with them:

          http://search.maven.org/remotecontent?filepath=com/cybozu/labs/langdetect/1.1-20120112/langdetect-1.1-20120112.pom

          Steve Rowe added a comment -

          Maybe... a possible compromise here would be to continue pushing to
          Maven central, but as a downstream event (after a release) within this
          project. Meaning, the PMC votes on the "original" sources/convenience
          binaries, but then the Maven experts around here can separately (once
          the vote passes) take that binary release, expand it, attach POMs,
          etc., and push to Maven central. This would mean the PMC doesn't vote
          on what's-pushed-to-Maven, but we continue using this project's
          infrastructure (svn, continuous builds, Jira, etc.) to push to Maven
          central. Could something like that work?

          From the Apache board perspective, I suspect that this would be viewed as a distinction without a difference; that is, no matter whether the PMC votes on Maven artifacts, the fact that they would be hosted by the Lucene/Solr project, and for the foreseeable future anyway, published by a PMC member, the PMC will continue to carry responsibility for Mavenish things when they go wrong.

          That said, I'd be fine with this. The only (slight) snag: Maven artifacts have to be signed; for the .jars/.war that's not a problem - they can be taken from the binary distribution. The POMs, by contrast, will have to be separately signed by a Lucene/Solr PMC member.

          The PMC is supposed to only be voting on source releases anyway, right?

          Jan Høydahl added a comment -

          +1 to continue publishing to mvn-repositories

          It's a huge benefit for many users and downstream professionals. We have at least 2 committers willing to maintain this, and we're getting better at it each time. I think that's all it takes.

          It seems actually that the commons-csv issue - which was not a Maven issue - has actually helped us clean up a lot of mess in our sources, build system, dependency structure etc. It's been too easy to include questionable libs or non-released libs, and that's the real problem if you ask me. So publishing to mvn-repo actually keeps us accountable in legally being good Apache citizens as well as shipping higher quality, more stable stuff. It's a Good Thing™ that Noggit got its release. It will be a good thing if/when commons-csv ships a release that we can depend on without patching.

          Regarding "hiding" stuff in our binary .jars or .war - that won't solve anything. Some people actually run more than Solr in their app-server, add their own plugins etc. So the risk of package name clashes or slf4j binding incompatibilities actually increases the more things we throw into the .war. I just had a project in which a webapp using SolrJ needed slf4j 1.5.8, which clashed with SolrJ's jcl-over-slf4j (1.6.1) dependency. The solution was simply to exclude the 1.6.1 dep and things worked fine. If SolrJ were just one huge .jar with all deps melted together, that would not be an option.
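
          An exclusion of this kind might look roughly like the following in the webapp's POM. This is only a sketch: the SolrJ version (3.5.0) and the explicit re-declaration of jcl-over-slf4j at 1.5.8 are assumptions made for illustration; only the two slf4j version numbers come from the description above.

          ```xml
          <dependencies>
            <dependency>
              <groupId>org.apache.solr</groupId>
              <artifactId>solr-solrj</artifactId>
              <!-- assumption: version chosen for illustration only -->
              <version>3.5.0</version>
              <exclusions>
                <!-- drop the transitive 1.6.1 bridge so it cannot clash with the webapp's slf4j 1.5.8 -->
                <exclusion>
                  <groupId>org.slf4j</groupId>
                  <artifactId>jcl-over-slf4j</artifactId>
                </exclusion>
              </exclusions>
            </dependency>
            <!-- assumption: the webapp supplies the bridge itself, at the version matching its own slf4j -->
            <dependency>
              <groupId>org.slf4j</groupId>
              <artifactId>jcl-over-slf4j</artifactId>
              <version>1.5.8</version>
            </dependency>
          </dependencies>
          ```

          This works only because SolrJ's dependencies are declared separately in its POM; a single merged jar would leave nothing to exclude.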

          I'm also +1 for including all required deps in the binary release of Solr.

          Robert Muir added a comment -

          It seems actually that the commons-csv issue - which was not a Maven issue

          Really? Then explain this.

          Thanks.

          $ unzip -l apache-solr-3.5.0.zip | grep commons-csv
          $ 
          

          But,

          http://search.maven.org/#artifactdetails|org.apache.solr|solr-commons-csv|3.5.0|jar

          Dawid Weiss added a comment -

          I still think the board misunderstands what a "Maven release" is. I mean, the POM states clearly:

            <groupId>org.apache.solr</groupId>
            <artifactId>solr-commons-csv</artifactId>
            <name>Solr Specific Commons CSV</name>
            <version>3.5.0</version>
            <description>Solr Specific Commons CSV v1.0-SNAPSHOT-r966014</description>
          

          So it's not commons-csv. It's solr-SPECIFIC-commons-csv. Maven folks don't just download jars from maven central, they use pom dependencies. If you depend on the above, it's hard to call it an official commons-csv release...

          Robert Muir added a comment -

          It's been too easy to include questionable libs or non-released libs, and that's the real problem if you ask me.
          So publishing to mvn-repo actually keeps us accountable in legally being good Apache citizens as well as shipping higher quality, more stable stuff.

          That's bullshit. Being in Maven repositories doesn't make anything more legal.

          Requiring that all dependencies be in Maven harms software projects:

          • It prevents good features from being added. For example, the most popular Tika issue (Outlook support) is just hung up on this stupid stuff (TIKA-623).
          • It encourages buggy software. Perhaps it's "conventional" that software projects just pass the blame along, but if we have bugs that break our release, we should make our release work instead of passing blame.

          It's a Good Thing™ that Noggit got its release.

          I agree. I uploaded my patch to start using it nearly a month ago. It's too bad no Maven supporters
          have done anything to make it accessible via Maven.

          The fact that it's a real release is good, and the patch is good. It's time to commit it.

          Michael McCandless added a comment -

          I just resolved SOLR-3411 as "Not a Problem".

          OK thanks Steve. I'm glad it's not a real problem.

          From the Apache board perspective, I suspect that this would be viewed as a distinction without a difference;

          True, but I think that's OK. It's a compromise.

          The PMC is supposed to only be voting on source releases anyway, right?

          Legally, yes, but in practice we are also testing and pushing out the
          convenience binaries (and the Maven artifacts) at the same time. They
          are all read-only once published.

          Jan Høydahl added a comment -

          It's been too easy to include questionable libs or non-released libs, and that's the real problem if you ask me. So publishing to mvn-repo actually keeps us accountable in legally being good Apache citizens as well as shipping higher quality, more stable stuff.

          That's bullshit. Being in Maven repositories doesn't make anything more legal.

          I'm not saying that. I'm saying that a positive side effect of publishing all our release artifacts to a broader public is that it helps detect bad and hacky practices in our own code. If we feel we need to hide the truth about our dependencies or build artifacts, then it is better to put a bright light on why than to shuffle things under the carpet.

          Once in a while we may judge that it is still more gain than pain to include some unreleased lib or a patched version of a lib in our distro (after having first tried to get it fixed upstream), and that's fine with me, provided we repackage it properly under a new namespace and include it as a (temporary) custom dependency, both in our binary distro and therefore also in the Maven repos. But we should try to replace these custom deps with official release versions when possible.

          Robert Muir added a comment -

          But you need to realize a lot of software has official releases; they just don't care about Maven.

          A great example of that is the noggit release. Again, I've had a patch up for a month, and I think
          it makes our release cleaner to depend on this real release than to have code copied from Apache Labs.

          But I've waited this long in the hope that someone will step up and put the thing in Maven; I've detailed
          the reasons on SOLR-3296.

          In this case, Maven is making things less legal. I hope everyone sees that!

          Jan Høydahl added a comment -

          But you need to realize a lot of software has official releases; they just don't care about Maven.

          A great example of that is the noggit release. Again, I've had a patch up for a month, and I think it makes our release cleaner to depend on this real release than to have code copied from Apache Labs.

          I don't think Noggit is a good example. It is written by Yonik and was prohibited from releasing anything since it's part of Apache Labs, so probably no one knows about it. If it had instead started its life as part of Lucene's source code and later been spun off as its own project, it would have gotten more love and care, would have had Javadocs, some documentation etc. So having Noggit distributed to Maven is as close as asking your colleague to publish it.

          I would rather state that most Java libraries do care about Maven.

          Ryan McKinley added a comment -

          We are a bit lost on what we are talking about - I don't expect we will all agree on the best Maven strategy.

          Something mentioned over and over in this thread is the concern that Sonatype's Maven Central is somehow the repository. That is nonsense; there is no reason to do crazy plugins to try to pretend stuff is there when we can just add (or suggest adding) other potential repositories. If we are worried about supporting the 1-off crazy patched jar, we can point it to something as crazy as:

          <pluginRepositories>
             <pluginRepository>
               <id>maven-timestamp</id>
               <url>http://maven-timestamp-plugin.googlecode.com/svn/trunk/repository</url>
             </pluginRepository>
          </pluginRepositories>
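
          Note that a pluginRepositories entry only tells Maven where to look for build plugins. For ordinary dependency jars (such as a one-off patched jar), the corresponding element is repositories. A minimal sketch follows; the id and URL are placeholders invented for illustration, not a real repository:

          ```xml
          <repositories>
            <repository>
              <!-- hypothetical id and URL, for illustration only -->
              <id>patched-deps</id>
              <url>https://repo.example.org/maven2</url>
            </repository>
          </repositories>
          ```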
          

          but I feel like I am just adding more noise to an issue without focus.

          Robert Muir added a comment -

          If we are worried about supporting the 1-off crazy patched jar, we can point it to something as crazy as:

          Really? Then you can also tell infra to disable the release mirroring system: hey, it's useless, we just have svn.

          Somehow I don't think that would go over well: they would probably just delete the jar.

          We still don't have:

          • a way to handle patched dependencies for maven
          • a way to handle dependencies that are not in maven
          • a packaging system for maven consistent with our other packaging.

          In other words: maven is out of control.

          I'm now with Mike, I think we have to get this out from under our PMC and do it some other way.

          Steve Rowe added a comment -

          I'm now with Mike, I think we have to get this out from under our PMC and do it some other way.

          What changed your mind? (Serious question)

          Robert Muir added a comment -

          What changed your mind? (Serious question)

          Seriously: I want our releases clean and bulletproof from problems.

          People can say we only vote on the source release, but we can't pretend that we are not
          responsible for binary/maven artifacts we produce too. The commons-csv issue showed that
          as a PMC we get hassled about these things too!

          So when we put stuff up in people.apache.org/~whoever/staging_area/lucene-solr-XXX.YYY,
          I want everything in that folder to be packaged correctly, not illegal, not causing
          problems to other projects, etc, etc.

          It's unrelated to the benefits of Maven. I just want this stuff clean.

          So I got frustrated with some of the responses/suggestions here that seem like maybe
          people aren't taking this stuff as seriously as we should be.

          We are held responsible for the stuff we put out, so if people feel "anything goes"
          for the maven artifacts as long as they work, then I don't know how we as a PMC are
          supposed to have any confidence at all that they are clean!

          You can say I'm being overly anal or a policeman or whatever, but I feel I have to
          be watching this Maven stuff like a hawk right now (even though I don't really understand
          it).

          So it starts to become clear to me, that not everyone cares so much about the maven
          artifacts being proper and correct. With that being the case, I don't want to be
          responsible for it, I'd just as soon absolve myself of it, get back to working on
          search engines, and let someone else (not our PMC) be held to the fire for it.

          Steve Rowe added a comment -

          So I got frustrated with some of the responses/suggestions here that seem like maybe people aren't taking this stuff as seriously as we should be.

          I'm taking this stuff seriously.

          • patched dependencies: There is no patched-dependencies solution for Maven at this point, but putting patched dependencies up as forked projects with "download jar" links on GitHub makes them exactly like other non-mavenized dependencies, so if Lucene/Solr goes that route independent of Maven concerns, then it isn't a separate issue for Maven.
          • non-mavenized dependencies: the standard Maven-proponent answer (i.e. "just put them in Maven") may work some of the time, but it certainly isn't a panacea, and Lucene/Solr needs to cover all bases. I think ivy-maven-plugin could address most, and maybe all, of the cases where "just put them in Maven" doesn't work.
          • packaging: I would split this into two concerns:
            • Maven binary jar/war artifacts should be identical (bit for bit) to the official binary artifacts.
            • Maven POMs should require the same dependencies that Solr ships with. In other words, as I stated previously on this issue: POMs for Solr jars/war published on Maven Central should never require (i.e., have a non-optional dependency on) a third party artifact if that third party dependency is not directly included in the binary package; the contents of the war don't count as "inclusion in the binary package".

          This issue is supposed to be about this last point. I don't agree with the idea myself.

          Here's why: Maven POMs should list the dependencies required to use the associated artifact. I seriously don't understand why it matters if this differs from the 3rd party libraries shipped (directly, not in the war) with the convenience binary package.
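
          For reference, the required/optional distinction at stake here is expressed in a POM with the optional element. The sketch below reuses the solr-commons-csv coordinates quoted earlier in this issue; marking it optional is a hypothetical illustration, not what the published Solr POMs actually do.

          ```xml
          <dependency>
            <groupId>org.apache.solr</groupId>
            <artifactId>solr-commons-csv</artifactId>
            <version>3.5.0</version>
            <!-- hypothetical: an optional dependency is not passed on transitively
                 to projects that depend on this artifact -->
            <optional>true</optional>
          </dependency>
          ```

          Without the optional element (or with it set to false), any project depending on the artifact pulls this dependency in transitively.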

          And, as Ryan has stated on this issue, what's included in the convenience binary package is subject to change - we could just start including all 3rd party libraries in the Solr convenience distribution. Why not?

          Steve Rowe added a comment -

          Bulk move 4.4 issues to 4.5 and 5.0

          Uwe Schindler added a comment -

          Move issue to Solr 4.9.


            People

            • Assignee: Unassigned
            • Reporter: Robert Muir
            • Votes: 0
            • Watchers: 1
