Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Many of the dependencies that Kafka 0.8 uses are old. It would be a good idea to upgrade all of these where possible.

      log4j is set to 1.2.15, but the latest is 1.2.17
      zookeeper is at 3.3.4; there is 3.3.6 (August 2012) and 3.4.5 (Nov 2012). 3.4.x includes several major enhancements and fixes over the 3.3.x line.
      org.slf4j is at 1.6.4, there is 1.6.6 (June 2012) and 1.7.5 (March 2013)
      net.sf.jopt-simple is at 3.2, there is 3.3 (May 2011) and 4.4 (Jan 2013)

      1. 0001-KAFKA-854-Upgrade-Deps-to-latest.patch
        6 kB
        Matt Christiansen
      2. KAFKA-854.patch
        22 kB
        Scott Carey
      3. KAFKA-854.patch
        20 kB
        Scott Carey
      4. KAFKA-854.patch
        23 kB
        Scott Carey

          Activity

          Matt Christiansen added a comment -

          Here is a patch that revs to the latest for all but ZK. Moving to ZK 3.4.x will require quite a few test changes, so it should probably be its own ticket.
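
          As a rough illustration of what such a version bump might look like in sbt (a sketch only, not the contents of the attached patch), the fragment below uses the newer releases named in the description and leaves ZooKeeper at 3.3.4. The `org.apache.zookeeper` coordinates are an assumption, since the ticket does not quote them.

          ```scala
          // Hypothetical fragment for an sbt build definition.
          libraryDependencies ++= Seq(
            "log4j"                % "log4j"        % "1.2.17", // was 1.2.15
            "net.sf.jopt-simple"   % "jopt-simple"  % "4.4",    // was 3.2
            "org.slf4j"            % "slf4j-simple" % "1.7.5",  // was 1.6.4
            "org.apache.zookeeper" % "zookeeper"    % "3.3.4"   // unchanged; 3.4.x left for a separate ticket
          )
          ```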

          Dragos Manolescu added a comment -

          While you're at it, you could also upgrade Jackson. Currently the build file references org.codehaus.jackson, which is old. The upgrade may require some code changes (the API differs in a few places), all trivial though.

          Here's what I use for Scala 2.9.2; these need to be adjusted for 2.8.

          libraryDependencies ++= Seq("com.fasterxml.jackson.core" % "jackson-core" % "2.1.4"
          , "com.fasterxml.jackson.core" % "jackson-databind" % "2.1.4"
          , "com.fasterxml.jackson.module" % "jackson-module-scala_2.9.2" % "2.1.3"
          )
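
          One possible way to avoid hard-coding the Scala version in the artifact name is sbt's `%%` operator, which appends the build's Scala version suffix automatically. This is only a sketch; whether jackson-module-scala 2.1.3 is actually published for every Scala version Kafka cross-builds against is not confirmed here.

          ```scala
          libraryDependencies ++= Seq(
            "com.fasterxml.jackson.core"   %  "jackson-core"         % "2.1.4",
            "com.fasterxml.jackson.core"   %  "jackson-databind"     % "2.1.4",
            // %% appends the Scala version suffix (for example _2.9.2) to the artifact name
            "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.1.3"
          )
          ```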

          Scott Carey added a comment -

          Also, upgrading log4j means that all of the ugly bits with:

          // The issue is going from log4j 1.2.14 to 1.2.15, the developers added some features which required
          // some dependencies on various sun and javax packages.
          override def ivyXML =
            <dependencies>
              <exclude module="javax"/>
              <exclude module="jmxri"/>
              <exclude module="jmxtools"/>
              <exclude module="mail"/>
              <exclude module="jms"/>
            </dependencies>

          can go away; they made all of the funky dependencies "system", "provided", or "optional" so they are not transitively pulled in (as they should have been to begin with).
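
          With a log4j release whose extra dependencies are no longer pulled in transitively, the declaration could presumably shrink to a plain dependency line. A minimal sketch, assuming log4j 1.2.17 and sbt's `libraryDependencies` setting:

          ```scala
          // No ivyXML override or <exclude> entries; this assumes 1.2.17 marks the
          // jmx/jms/mail dependencies so that they are not resolved transitively.
          libraryDependencies += "log4j" % "log4j" % "1.2.17"
          ```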

          Scott Carey added a comment - - edited

          Why are dependencies repeated?

          project/build/KafkaProject.scala defines:

          trait CoreDependencies {
            val log4j = "log4j" % "log4j" % "1.2.15"
            val jopt = "net.sf.jopt-simple" % "jopt-simple" % "3.2"
            val slf4jSimple = "org.slf4j" % "slf4j-simple" % "1.6.4"
          }

          and
          project/Build.scala defines

          libraryDependencies ++= Seq(
          "log4j" % "log4j" % "1.2.15",
          "net.sf.jopt-simple" % "jopt-simple" % "3.2",
          "org.slf4j" % "slf4j-simple" % "1.6.4"
          ),

          Likewise the hadoop settings are repeated.

          Scott Carey added a comment -

          Ick, worse, some dependencies are listed three times – once in SBT format, once in Maven XML format (bulky), and again in Ivy XML format (yuck). The DRY principle gods are angry and are having their vengeance on any who wish to maintain this.

          Neha Narkhede added a comment -

          Scott,

          I think that is a remnant that we forgot to delete when we upgraded to the latest sbt version. We want to keep only the dependencies mentioned in project/Build.scala and the individual project build files, for example core/build.sbt. Feel free to upload a patch to clean up the dependencies; it will be a great help to us while releasing 0.8.

          Scott Carey added a comment -

          The problem is I'm an SBT neophyte, and I keep thinking I could do all this easier with Maven – aside from the cross-version stuff SBT provides. I don't know what is used and what is cruft, what people care about keeping and what can be trimmed.

          Scott Carey added a comment -

          "While you're at it, you could also upgrade Jackson. Currently the build file references org.codehaus.jackson, which is old. The upgrade may require some code changes (the API differs in a few places), all trivial though."

          Where is the Jackson dependency specified?

          find . -name '*.sbt' | xargs grep jackson

          returns nothing, and in project/Build.scala it is only used in relation to the hadoop settings. We could upgrade to the latest 1.x but not 2.x, because other hadoop dependencies use it (like the also ancient avro 1.4).

          What is the status of the contrib hadoop stuff?

          Matt's patch already upgrades everything. If we want to upgrade only the core Kafka dependencies, we can trim it down to those simpler, safer updates.

          (As an SBT newbie) –
          It appears that project/Build.scala is completely unnecessary, and could be significantly simplified if moved to core/build.sbt. Most of the content of Build.scala seems to be cruft or have simpler features available to do the same thing.

          Scott Carey added a comment -

          Cleans up dependencies and SBT build in general (~350 lines fewer than prior).

          Neha Narkhede added a comment -

          Scott,

          Can you please rebase the patch? After checking in KAFKA-826, this patch no longer applies.

          Scott Carey added a comment -

          Rebased to 0.8 branch, 6dbf9212ae4dc6ed7f91fc99135b8a3b35ab5edb

          Big cleanup of sbt. Total build size is now 136 lines.

          Does not support publishing to maven, but that appeared broken previously. See http://www.scala-sbt.org/release/docs/Detailed-Topics/Publishing.

          Does not build a release tarball/zip, but that was also broken.

          Cosmin Lehene added a comment - - edited

          See KAFKA-843 for release-zip / release-tar tasks; it may conflict.

          Scott Carey added a comment -

          This patch has been rebased after the commit to KAFKA-843

          Neha Narkhede added a comment -

          Thanks for the rebased patch!

          1. Can you describe the motivation and purpose of each of your changes? It will make it easier to review.
          2. What is the purpose of removing the core dependencies from core/build.sbt and putting them back into the main Build.sbt?

          Scott Carey added a comment -

          1. It is probably easier to review by looking at the final result than the diffs. The purpose was to simplify the build, following DRY principles and what appears to be the style that the latest sbt encourages:

          • Using .sbt files for settings and .scala files for build definitions and cross-module dependencies. Every sample / tutorial I found seemed to indicate that this was standard convention (none were passing settings in when defining the Project).

          2. All version declarations are in one file, centrally, following the DRY principle. I could not find a way to do this in the base .sbt file since you can only set configuration keys there, not define variables. If you want to see or change a dependency version, there is only one place to go. Unlike Maven, SBT does not provide a standard place to configure your dependency versions to promote sharing across multiple projects.
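
          The idea of keeping every version declaration in one .scala file can be sketched roughly as follows. The file name, object name, and value names are hypothetical and are not taken from the patch; the coordinates and versions are the ones quoted earlier in this ticket.

          ```scala
          // project/Dependencies.scala (hypothetical file and object names)
          import sbt._

          object Dependencies {
            // Single place to see or change a version
            val log4jVersion = "1.2.15"
            val joptVersion  = "3.2"
            val slf4jVersion = "1.6.4"

            // Module definitions built from the versions above
            val log4j       = "log4j"              % "log4j"        % log4jVersion
            val jopt        = "net.sf.jopt-simple" % "jopt-simple"  % joptVersion
            val slf4jSimple = "org.slf4j"          % "slf4j-simple" % slf4jVersion
          }
          ```

          A module's .sbt file could then refer to these values, for example `libraryDependencies ++= Seq(Dependencies.log4j, Dependencies.jopt, Dependencies.slf4jSimple)`, so a version is only ever edited in one place.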
          Matt Christiansen added a comment -

          I tested out this patch in my local Kafka build (I'm trying to piece together a 0.8 for some POC work at my company) and it looks great to me. It is much cleaner, and the resulting release jars are smaller as they now only have the dependencies they need (no more hadoop-core in kafka core).

          To nitpick, it seems like the metrics-annotation dependency isn't needed in any code and, from what I can tell, it also isn't a runtime dependency for metrics-core.

          Also, it's funny you bring up Maven, Scott. I wonder why Maven isn't used in most cases for Scala projects like this. It would have a wider audience of people to maintain it, is more widely known, and (to me) makes more sense in how it functions and how projects are laid out in it.

          Michael Noll added a comment -

          Now that one year has passed since the last update to this ticket in 04/2013, what's the status/plan to upgrade Kafka's dependencies, notably on the ZooKeeper front? Does anyone know?

          Cosmin Lehene added a comment -

          We upgraded our internal branch to 3.4.5 back then. It's been running fine (in production). We'll probably upgrade to the recently released 3.4.6 soon.


            People

            • Assignee: Unassigned
            • Reporter: Scott Carey
            • Votes: 0
            • Watchers: 8
