Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: RPM
    • Labels: None

      Description

      Here's the deal – during the RPM upgrade sequence there's a point in time when files that have different names in the new and old packages exist side by side. What this means for our style of hadoop packaging is that we'll have both /usr/lib/hadoop/foo-<old version>.jar and /usr/lib/hadoop/foo-<new version>.jar getting onto the classpath when we issue a condrestart for any service.

      This is pretty bad.
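
      For context, here is a minimal sketch of the scriptlet ordering RPM follows during an upgrade (this is standard RPM behavior; the package name is illustrative):

      # rpm -U hadoop-<new>.rpm proceeds roughly as:
      #   1. %pre of the new package runs
      #   2. the new files are installed                      <-- old and new jars now coexist
      #   3. %post of the new package runs
      #   4. %preun of the old package runs
      #   5. files left only in the old package are removed   <-- coexistence window ends
      #   6. %postun of the old package runs

      Any condrestart issued between steps 2 and 5 sees both foo-<old version>.jar and foo-<new version>.jar on disk.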

      At this point my knee-jerk reaction is to re-evaluate why we need versioned jars to begin with. Do you guys think there's any value in something like:

      • /usr/lib/hadoop/hadoop-common-2.0.0.jar

      vs a simple:

      • /usr/lib/hadoop/hadoop-common.jar

        Activity

        Konstantin Boudnik added a comment -

        I would be really hesitant to do so. I will hold my reasons until later, though.

        Sean Mackrory added a comment -

        I don't know how practical this is, or even whether we would be able to update the symlinks by that point in the upgrade sequence, but what about having versioned JARs in separate directories and non-versioned symlinks to them on the classpath? I do think that dropping the versions from the names would make it annoying to debug and ascertain which version of each artifact is being used, so if we do need to drop them to enable smoother upgrades, then we should definitely provide another easy way to find that out.
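
        A minimal sketch of the layout being suggested, with hypothetical directory and file names:

        /usr/lib/hadoop/hadoop-2.0.3/hadoop-common-2.0.3.jar
        /usr/lib/hadoop/hadoop-common.jar -> hadoop-2.0.3/hadoop-common-2.0.3.jar

        Only the unversioned symlink lives in the directory the scripts put on the classpath, so a single copy of each artifact is picked up, while the link target still reveals the exact version in use.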

        Bruno Mahé added a comment -

        Roman, could you try the patch I am attaching and tell me whether it fixes the upgrade path?

        Your issue sounds very similar to BIGTOP-367.
        Furthermore, looking at the spec file, it hit me that we split the jars into multiple packages (hadoop-hdfs, hadoop-yarn...). So Requires(pre) should be applied to them as well, so that the old jars get removed before the services are restarted.
        Or at least, that is what I suspect.

        Peter Linnell added a comment -

        +1 for Bruno's approach. It is relatively safe and, while it still needs testing, it is a simple way to fix this.

        Konstantin Boudnik added a comment -

        No objections to this approach. Looks good.

        Roman Shaposhnik added a comment -

        Bruno, that was my first thought as well, but that's not what the problem is. Now, applying your patch would be useful anyway, but it doesn't fix this problem. Let me try to explain: the issue here is not that we don't have the right dependencies in place; it is that we have two copies of the jars from different versions in the same location. IOW, suppose you're upgrading from hadoop-hdfs 2.0.2 to hadoop-hdfs 2.0.3 – what I'm saying is that during the brief period of time when RPM tries to call condrestart, your /usr/lib/hadoop-hdfs is going to look like this:

        /usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.2.jar
        /usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.3.jar
        /usr/lib/hadoop-hdfs/hadoop-hdfs.jar -> hadoop-hdfs-2.0.3.jar
        

        and since the hadoop scripts use globbing, all 3 of the jars (well, 2 really, since the versionless one is a symlink) will end up on the classpath – bad, bad stuff.
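
        As a minimal sketch of that glob-based classpath construction (paths and variable names illustrative, not the actual launcher script):

        # every jar in the directory is appended, versioned or not
        for jar in /usr/lib/hadoop-hdfs/*.jar; do
          CLASSPATH=${CLASSPATH}:${jar}
        done

        During the window above, both hadoop-hdfs-2.0.2.jar and hadoop-hdfs-2.0.3.jar match the glob.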

        Konstantin Boudnik added a comment -

        Bad stuff indeed.
        Drifting away from versioned jars will certainly squash this issue during upgrades, but it might make things worse by no longer exposing the version of the libs in the product.

        Bruno Mahé added a comment -

        Roman> Have you tried my patch?
        Because you are describing the very same issue as BIGTOP-367, which was fixed for Apache Hadoop 1.X. The issue is re-appearing because the jars got split into multiple packages and we did not update the dependencies.
        Check the description of BIGTOP-367; I narrowed it down to the RPM steps used to upgrade packages.
        So how is your issue different?

        And the point of my patch is not just to be correct; it is precisely to ensure that the old jars are removed before we restart the services.
        See the documentation of Requires(pre):
        http://www.rpm.org/max-rpm-snapshot/s1-rpm-depend-manual-dependencies.html (section "Context Marked Dependencies")
        http://rpm.org/api/4.4.2.2/tsort.html

        In effect, Requires(pre) sort of tells RPM that the given dependency must be dealt with before the current package. So by specifying Requires(pre) for all packages containing needed jars, we ensure no old jar will be there when we restart the services.

        At the heart of the issue, this is just a dependency problem.
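
        As a minimal sketch of the kind of spec change being described (the subpackage and macro names are illustrative; the actual patch may differ):

        # in the package whose scriptlets restart services: per the
        # semantics above, RPM must deal with hadoop-hdfs before this
        # package, so its old jars are gone by the time we restart
        Requires(pre): hadoop-hdfs = %{version}-%{release}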

        Could you confirm that my patch does not fix your issue?
        If my patch does not fix your issue, I would be really curious to see some detailed logs.

        Roman Shaposhnik added a comment -

        My only explanation for why it didn't work is that I tried it on CentOS 5 (the oldest RPM we have to deal with). Another possibility is that I saw a different issue with condrestart not working (the one currently masked by BIGTOP-844) and concluded that the patch didn't help. Let me re-run my experiments and attach the full logs.

        Bruno Mahé added a comment -

        What was your testing process?

        • Did you try from an already existing RPM to a newly built and patched one?
        • Did you try from a patched version (my patch applied) to a new version with my patch as well?

        I am asking about the testing process because there are some differences between the two paths; for instance, some parts of the upgrade are run by the old package's spec and some by the new one. Unfortunately, I forget which parts and would have to check.

        I may also try to reproduce the issue tonight or in the coming days, so it would be great if you could detail as much as possible the steps and information needed to reproduce it. The two upgrade paths I have in mind are sketched below.
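
        A minimal sketch of the two paths, with hypothetical package file names and versions:

        # path 1: stock old build -> patched new build
        rpm -ivh hadoop-hdfs-2.0.2-1.x86_64.rpm
        rpm -Uvh hadoop-hdfs-2.0.3-1.x86_64.rpm    # patch applied

        # path 2: patched old build -> patched new build
        rpm -ivh hadoop-hdfs-2.0.2-2.x86_64.rpm    # old version rebuilt with the patch
        rpm -Uvh hadoop-hdfs-2.0.3-1.x86_64.rpm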

        Bruno Mahé added a comment -

        ping.

        Roman Shaposhnik added a comment -

        +1 and [a slightly more conservative version of the patch] committed!


          People

          • Assignee:
            Roman Shaposhnik
            Reporter:
            Roman Shaposhnik
          • Votes:
            0
            Watchers:
            6
