Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix

    Description

      I have been running some tests to get Hadoop running within an OSGi environment (specifically the Newton framework), and this has uncovered a number of minor bugs that appear when mapred classes are instantiated from an entry point other than their main methods.

      I have created a number of patches, which I'll attach, that solve these issues. These patches could possibly be dealt with as separate issues, but all of them are required to resolve the OSGi issue. Happy to split them up if that is easier to manage, though.

      classpath.patch: rearranges the classloader hierarchies for Task objects so that a Task can resolve API classes when those classes are no longer loaded from the system classloader.
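To illustrate the idea behind classpath.patch (this is not the patch itself), here is a minimal sketch of parenting a task classloader on whichever loader defined an API class rather than on the system classloader. The names `TaskClassLoaderSketch`, `ApiMarker` and `newTaskLoader` are hypothetical and stand in for the real Hadoop classes.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class TaskClassLoaderSketch {
    /** Hypothetical stand-in for a Hadoop API class visible to tasks. */
    public interface ApiMarker {}

    /**
     * Builds a loader for job classes whose parent is the loader that
     * defined the given API class, instead of the system class loader.
     */
    public static URLClassLoader newTaskLoader(URL[] jobClasspath, Class<?> apiClass) {
        ClassLoader parent = apiClass.getClassLoader();
        if (parent == null) {
            // Bootstrap-loaded classes report null; fall back to the system loader.
            parent = ClassLoader.getSystemClassLoader();
        }
        return new URLClassLoader(jobClasspath, parent);
    }

    public static void main(String[] args) throws Exception {
        try (URLClassLoader loader = newTaskLoader(new URL[0], ApiMarker.class)) {
            // The API class is resolved through the parent chain, so the task
            // sees the same Class object as the code that created the loader.
            Class<?> resolved = loader.loadClass(ApiMarker.class.getName());
            System.out.println(resolved == ApiMarker.class);
        }
    }
}
```

In an OSGi container the loader that defined the API class would typically be a bundle classloader, which is why parenting on the system classloader alone is not enough there.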

      tasklog.patch: ensures the log files can be resolved when the child process is launched from a different directory from the parent process.
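The problem tasklog.patch addresses can be sketched roughly as follows: a relative log directory means different things to parent and child if their working directories differ, so it is resolved once against a known base. The class and method names are hypothetical, and the real patch may take a different approach.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class TaskLogPathSketch {
    /**
     * Resolves logDir against the parent's working directory. An absolute
     * logDir is returned unchanged (apart from normalization), so the child
     * can use the result regardless of where it was launched.
     */
    public static Path resolveLogDir(Path parentWorkingDir, String logDir) {
        return parentWorkingDir.toAbsolutePath().resolve(logDir).normalize();
    }

    public static void main(String[] args) {
        Path parentDir = Paths.get("parent-cwd"); // hypothetical directory
        System.out.println(resolveLogDir(parentDir, "logs/userlogs"));
    }
}
```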

      taskrunner.patch: enables the TaskRunner to find a log dir when the parent JVM is not launched by the Hadoop scripts. It also allows a client to specify a substitute main class (which delegates to TaskTracker$Child), in this case for the purpose of resolving OSGi classpaths, though it could perhaps be more general? Finally, it adds some extra logging for cases where things go wrong.
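As a rough sketch of the log dir lookup: the Hadoop launch scripts pass the log directory as the `hadoop.log.dir` system property, so a JVM started some other way (for example inside an OSGi container) may not have it set. The fallback location used below is an assumption made for illustration, not what the patch does, and the class name is hypothetical.

```java
import java.io.File;
import java.util.Properties;

public class LogDirFallbackSketch {
    /**
     * Returns the configured log directory, or a fallback when the property
     * is absent. The fallback ("logs" under user.dir) is an assumption.
     */
    public static File findLogDir(Properties props) {
        String configured = props.getProperty("hadoop.log.dir");
        if (configured != null && !configured.isEmpty()) {
            return new File(configured);
        }
        System.err.println("hadoop.log.dir not set; using fallback log directory");
        return new File(props.getProperty("user.dir", "."), "logs");
    }

    public static void main(String[] args) {
        System.out.println(findLogDir(System.getProperties()));
    }
}
```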

      tasktracker.patch: allows the parent to pass configuration through to the child TaskRunner (specifically, in this case, to pass the classpath and launcher to the TaskRunner).

      Attachments

        1. tasktracker.patch
          0.9 kB
          David Savage
        2. classpath.patch
          11 kB
          David Savage
        3. tasklog.patch
          0.8 kB
          David Savage
        4. taskrunner.patch
          2 kB
          David Savage

          Activity

            cutting Doug Cutting added a comment -

            A single patch for this is probably best. Some comments:

            • indentation is not Hadoop standard (2 spaces per level)
            • non-existent files in the classpath should not throw exceptions, should they?
            • some unit tests would be good to ensure that these changes are maintained
            • patches should not include patch-specific comments
            • I don't like modifying the child's job configuration. Can't this be implemented by using 'final' parameters in the tasktracker's configuration, so that jobs cannot override them?
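On the question of non-existent classpath entries, one possible behaviour is to skip missing entries with a warning instead of throwing. The sketch below assumes that behaviour is wanted; the class and method names are hypothetical and not taken from the attached patches.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClasspathFilterSketch {
    /** Keeps only entries that exist on disk; missing ones are logged and skipped. */
    public static List<File> existingEntries(List<String> entries) {
        List<File> kept = new ArrayList<>();
        for (String entry : entries) {
            File file = new File(entry);
            if (file.exists()) {
                kept.add(file);
            } else {
                System.err.println("Skipping missing classpath entry: " + file);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> entries = new ArrayList<>();
        entries.add(System.getProperty("java.io.tmpdir"));
        entries.add("no-such-dir/missing.jar"); // hypothetical missing entry
        System.out.println(existingEntries(entries));
    }
}
```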

            kryzthov Christophe Taton added a comment -

            After playing with Hadoop inside OSGi containers for some time, here are some complementary comments:

            • there is an issue with the web UI: this is because resources inside Hadoop jars are referred to with OSGi-specific URLs (e.g. jar:bundle://<bundle-id>/path/to/resource) that the embedded Jetty is unable to use.
            • I am thinking Map/Reduce jobs could be packaged as OSGi bundles too: dependencies (like 3rd party libraries) are then directly handled by the containers.

            kryzthov Christophe Taton added a comment -

            It seems there is no easy way to have Jetty5 running inside an OSGi container (more exactly, I did not manage to have it working after a couple of days spent debugging it).
            However, Jetty6 runs without problems in an OSGi environment.

            kryzthov Christophe Taton added a comment -

            Embedded web applications will need to be packaged as war files, so as to have Jetty6/OSGi correctly running: Jetty is only able to use OSGi-specific URLs when reading a jar file (thus a war file).

            jbonofre Jean-Baptiste Onofré added a comment -

            I'm gonna submit a new set of patches, including Karaf features.

            stevel@apache.org Steve Loughran added a comment -

            Revisiting this. I think it's time to close as a WONTFIX, as the trend towards isolation is generally some Linux container. But we also need to embrace Java 9+ module isolation to avoid transitive classpath hell; that is a separate issue, covered elsewhere.


            People

              jbonofre Jean-Baptiste Onofré
              davemssavage David Savage
              Votes: 3
              Watchers: 13
