Nutch / NUTCH-891

Nutch build should not depend on unversioned local deps

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      The fix in NUTCH-873 introduces an unknown variable to the build process. Since local ivy artifacts are unversioned, different people that install Gora jars at different points in time will use the same artifact id but in fact the artifacts (jars) will differ because they will come from different revisions of Gora sources. Therefore Nutch builds based on the same svn rev. won't be repeatable across different environments.

      As much as it pains the ivy purists, until Gora publishes versioned artifacts I'd like to revert the fix in NUTCH-873 and add back Gora jars built from a known external rev. We can add a README that contains the commit id from Gora.
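The repeatability problem described above can be checked directly: two jars with the same artifact id and version are only interchangeable if their contents match. The following is a minimal sketch, not part of the Nutch or Gora builds; the helper names `sha1_of` and `same_artifact` are hypothetical, and it assumes you have two copies of the jar from different environments to compare.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(jar_a: Path, jar_b: Path) -> bool:
    """True only if both files have byte-identical contents."""
    return sha1_of(jar_a) == sha1_of(jar_b)
```

Two files that are both named gora-0.1.jar but were built from different Gora revisions would normally make `same_artifact` return False, which is exactly the situation the file name alone cannot reveal.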

      1. gora-49_v1.patch
        2 kB
        Enis Soztutar
      2. gora.build.patch
        2 kB
        Chris A. Mattmann

        Activity

        Doğacan Güney added a comment -

        Can't we put Gora jars somewhere? I would like to put jars up somewhere, revision them with commit id (so they will look like gora-core-78ab312.jar), and make nutch depend on a gora version with a commit id... Would this be difficult to do with ivy?
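The naming scheme proposed in this comment can be sketched as follows. This is an illustration under stated assumptions, not something taken from the Gora build: the function names are hypothetical, the seven-character abbreviation simply mirrors the gora-core-78ab312.jar example, and `current_commit` assumes `git` is installed and `repo_dir` is a Gora checkout.

```python
import re
import subprocess


def versioned_jar_name(artifact: str, commit_id: str, length: int = 7) -> str:
    """Build a jar file name that embeds an abbreviated commit id."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_id):
        raise ValueError(f"not a hex commit id: {commit_id!r}")
    return f"{artifact}-{commit_id[:length]}.jar"


def current_commit(repo_dir: str) -> str:
    """Ask git for the full commit id of HEAD in repo_dir."""
    out = subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=repo_dir, text=True
    )
    return out.strip()
```

For example, `versioned_jar_name("gora-core", "78ab312")` yields the file name used in the comment. Because the name is derived from the commit alone, everyone who builds the same Gora revision gets the same artifact name.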

        Chris A. Mattmann added a comment -

        Hi Andrzej:

        Can I get some clarification on this? First, local Ivy jars are versioned, by artifact id and by version #. So, we're talking about gora-0.1.jar here, right? So, your point is, if I'm off gitting and developing on Gora, at any point in time, I can run ant on gora and then it updates my local Ivy repo with a gora-0.1.jar file, right? And your point is, this file is different than the previous gora-0.1.jar file (from N minutes ago), and thus Nutch isn't really depending on a stable version, right?

        If the above is true, what it suggests to me is that perhaps the process of installing Gora as a local Ivy dependency (independent of Nutch's deps) needs a bit more discipline. I'd say, why not make the Gora Ant build publish a gora-0.1-<some snapshot id aka SVN rev or UUID or whatever>.jar? In that fashion, you could develop on Gora, without fear of changing anything in the way that Nutch depends on it (because the 0.1 version that Nutch depends on could be frozen as is).

        If the above isn't true or doesn't make sense, I'd also be a fan of just uploading Gora to Maven Central. Can we try that?

        Cheers,
        Chris

        P.S. I'm not trying to be difficult about NUTCH-873 b/c I was the one who did it. If in the end the consensus is to revert it, no egos here, go ahead. I'm just trying to figure out a solution to the problem that allows us to use Ivy as it should be and to not have to make exceptions. My other thought along these lines is that if we can't wrap our heads around Ivy, or getting to Maven Central in any short amount of time, then what about pulling Gora into Nutch SVN? It's ASLv2 licensed and there is nothing against doing this. From there, there would be a clean path to move to Incubation since the code would already be in Apache SVN anyways...

        Andrzej Bialecki added a comment -

        So, your point is [..]

        Yes, that's exactly my point.

        I'd say, why not make the Gora Ant build publish a gora-0.1-<some snapshot id aka SVN rev or UUID or whatever>.jar?

        Sure, that would solve the problem for now - I'll bother the Gora devs, and you can create the patch, ok? Ultimately we should go with the other solution (publish to Maven), but it requires more involvement from Gora devs.

        I'm not trying to be difficult about NUTCH-873 ...

        Neither am I, no egos here - I just find the current situation after the fix to be intractable, especially when doing bugfixing and testing - because even if APIs stay the same, hidden bugs may not be the same across revisions...

        Chris A. Mattmann added a comment -

        Sure, that would solve the problem for now - I'll bother the Gora devs, and you can create the patch, ok? Ultimately we should go with the other solution (publish to Maven), but it requires more involvement from Gora devs.

        I like it! lol. Sure, I'll try and create a patch to make it do that. Installing Gora forced me to figure out how to use git the other day, so why not figure out how to patch Gora!

        Neither am I, no egos here - I just find the current situation after the fix to be intractable, especially when doing bugfixing and testing - because even if APIs stay the same, hidden bugs may not be the same across revisions...

        I hear ya. OK, let me think on this. We definitely need a solution here. In the meantime I'll try and figure out how to patch the Gora Ant build to make it version the jar on the Ivy install in a more meaningful way.

        Enis Soztutar added a comment -

        Of course, the best way for Nutch to use Gora is for Gora to publish its artifacts to Maven and for Nutch to use ivy to fetch the jars. But Gora is still in heavy development and we need some more time to make a first release.

        Until then I think we can use the last commit sha1 in git as the revision number. We use this convention when uploading jars to GitHub. Would that make sense?

        Andrzej Bialecki added a comment -

        Yes, this would help.

        Chris A. Mattmann added a comment -

        Hey Guys,

        OK, I finally had time to do this. I went ahead and added a ${now} parameter to the gora jar file names. I could have appended something like a sha1 to the jar name, but kept running into a chicken and egg problem. The attached patch works great and just uses the same ${now} format used in the Ivy build-common.xml part. I don't know how to get this contributed to gora, so I'm attaching it here. Feel free to pull it down into Gora.

        Cheers,
        Chris
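The timestamp approach in this comment can be sketched as below. It is a rough illustration only: the function name is hypothetical, and the `yyyyMMddHHmmss`-style layout is an assumption about what the ${now} property expands to, since the thread does not show the actual pattern.

```python
from datetime import datetime


def timestamped_jar_name(artifact: str, version: str, built_at: datetime) -> str:
    """Build a jar file name that embeds the build time (assumed layout)."""
    stamp = built_at.strftime("%Y%m%d%H%M%S")
    return f"{artifact}-{version}-{stamp}.jar"
```

A timestamp makes every local build distinguishable, but two people building the same Gora revision at different times still end up with different names, whereas a commit id gives the same name for the same sources.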

        Enis Soztutar added a comment -

        Nice patch, but I think changing the artifact name causes some problems in the other parts of the build (namely test and publish). I am attaching another patch for Gora which adds jar-snapshot and test-jar-snapshot targets to the top level build.

        Nutch can use:
        $ ant test-jar-snapshot

        and copy the resulting jars at will. Is this acceptable?

        BTW, gora uses github's issue tracker, you can also comment there. http://github.com/enis/gora/issues/issue/49

        Chris A. Mattmann added a comment -

        +1. Great patch, Enis, I think we can use this. Are you going to apply it to Gora at Github?

        Also thanks for the link to the issue tracker there!

        Cheers,
        Chris

        Enis Soztutar added a comment -

        I have applied the patch to gora via http://github.com/enis/gora/issues/issue/49.
        As part of this issue, I think we can replace unversioned gora jars with the snapshot versions.

        Julien Nioche added a comment -

        Probably not an issue anymore. Marking it as 2.x to triage unversioned issues; will check later.

        Lewis John McGibbney added a comment -

        Gora is now published to Maven Central and we have moved to Maven for builds over there as well.


          People

          • Assignee:
            Unassigned
            Reporter:
            Andrzej Bialecki
          • Votes:
            1
            Watchers:
            4
