Hadoop Common / HADOOP-6945

Inclusion of Old Jackson-JSON Breaks tasks using Avro (or any task depending on Jackson JSON)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      HADOOP-6184 added the ability to serialize the Configuration to JSON. However, its inclusion of jackson-1.0.1 means that any map-reduce task that depends (directly or indirectly) on Jackson can only use Jackson 1.0.1 APIs. The 100% fix is to give the task's classpath priority over the core/common classpath (MAPREDUCE-1700/MAPREDUCE-1938) so that a job can use newer revisions of the library.
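
      To check which jar wins on a given classpath, a task can ask the JVM where a class was actually loaded from. The following is a minimal diagnostic sketch, not part of Hadoop or Jackson: the helper class `ClassOrigin` and its `locate` method are hypothetical names, and the default class name assumes Jackson 1.x (the `org.codehaus.jackson` package).

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of Hadoop or Jackson. */
final class ClassOrigin {

    /** Returns the location a class was loaded from, or a short explanation if unavailable. */
    static String locate(String className) {
        try {
            // Load without running static initializers; we only want the origin.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return source == null
                    ? "(bootstrap or unknown source)"
                    : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Assumption: Jackson 1.x, whose classes live under org.codehaus.jackson.
        String name = args.length > 0 ? args[0] : "org.codehaus.jackson.map.ObjectMapper";
        System.out.println(name + " -> " + locate(name));
    }
}
```

      Run from inside a task, the printed location shows whether the class came from the framework's lib directory or from the job's own jar.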

      For a nearer-term fix, is it possible to upgrade the included Jackson library to a recent version (1.5.x+)? The APIs are backwards compatible. Alternatively, it's possible to eliminate the use of Jackson for this minor feature, as the Configuration object could be serialized to JSON without a 3rd party library.
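
      As a rough illustration of the library-free option, key/value pairs can be written as JSON with nothing but the JDK. This is only a sketch under stated assumptions: `PlainJsonConfigWriter` is a hypothetical class, it takes a plain `Map` rather than a real Configuration object, and the `properties`/`key`/`value` layout is an assumed output shape, not necessarily what HADOOP-6184 emits.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;

/** Hypothetical JDK-only JSON writer for configuration-style key/value pairs. */
final class PlainJsonConfigWriter {

    /** Writes {"properties":[{"key":...,"value":...},...]} (assumed layout). */
    static void write(Map<String, String> props, Writer out) throws IOException {
        out.write("{\"properties\":[");
        boolean first = true;
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (!first) {
                out.write(',');
            }
            first = false;
            out.write("{\"key\":");
            writeString(e.getKey(), out);
            out.write(",\"value\":");
            writeString(e.getValue(), out);
            out.write('}');
        }
        out.write("]}");
    }

    /** Writes a JSON string literal, escaping quotes, backslashes and control characters. */
    static void writeString(String s, Writer out) throws IOException {
        if (s == null) {
            out.write("null");
            return;
        }
        out.write('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.write("\\\""); break;
                case '\\': out.write("\\\\"); break;
                case '\n': out.write("\\n");  break;
                case '\r': out.write("\\r");  break;
                case '\t': out.write("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be \\u-escaped in JSON.
                        out.write(String.format("\\u%04x", (int) c));
                    } else {
                        out.write(c);
                    }
            }
        }
        out.write('"');
    }

    /** Convenience wrapper that returns the JSON as a String. */
    static String toJson(Map<String, String> props) {
        StringWriter w = new StringWriter();
        try {
            write(props, w);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected with StringWriter
        }
        return w.toString();
    }
}
```

      With a writer like this in the main code, a JSON library would only be needed on the test classpath to confirm that the output parses.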

      An ancillary issue is that the jackson.version referenced in ivy.xml is never specified in the source tree (ivy/libraries.properties).

          Activity

          Scott Carey added a comment -

          Both CDH3 and 1.0 break if using Avro 1.6.2. Avro 1.6.3 (currently a release candidate) avoids using the Jackson bits that cause the issue when a Jackson library between 1.3.x and 1.5.x is on the classpath (as with CDH3).

          In our cluster we removed Jackson from all task tracker nodes (yet again, it was OK for a while). One day Hadoop won't put a bunch of unnecessary crap on M/R app classpaths and clearly distinguish libraries that are needed in this context from things that are needed only for the framework. Jackson is one of several libraries with similar issues.

          Allen Wittenauer added a comment -

          This was fixed as part of HADOOP-7606 and HADOOP-7470.

          Iván de Prado added a comment -

          What is the status of this ticket? We are having the same trouble. For example, Avro depends on a newer Jackson version, so problems arise when using it.

          Steve Loughran added a comment -

          Given that Jackson is only used to serialize the config, and that generating valid JSON is fairly straightforward, why not get rid of the Jackson dependency altogether and just have a JSON serializer class? Jackson (or something like json-lib or gson) could then be used only to validate the content in the unit tests.

          Soren Macbeth added a comment -

          +1 It's a pain in the butt to have to strip this out of all the nodes in our cluster. Please either remove it or bump the version to a much more recent one.

          Scott Carey added a comment -

          +1. And this is a big enough annoyance to back-port to 0.20. It clearly has very little scope for breaking things (HADOOP-6184). I currently remove the Jackson jar from the Hadoop lib dir on all my nodes (CDH3 0.20).

          As long as so many hadoop jars come first in a task's path, any feature that creates a new jar dependency should be heavily scrutinized before inclusion.


            People

            • Assignee:
              Unassigned
              Reporter:
              Greg Wittel
            • Votes:
              1
              Watchers:
              8
