Bigtop / BIGTOP-423

hadoop package needs to be split into hadoop-client and hadoop-server packages

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.4.0
    • Component/s: General
    • Labels:
      None

      Description

      Currently the hadoop package bundles the dependencies for the Hadoop daemons (HDFS, YARN) together with the client side of the same projects. It would be much nicer to split this functionality into two separate packages so that downstream components (Pig, Hive, Oozie) don't have to depend on more bits than they need.

      1. hadoop-client.list
        1 kB
        Roman Shaposhnik
      2. BIGTOP-423.patch.txt
        5 kB
        Roman Shaposhnik

        Activity

        Roman Shaposhnik added a comment -

        Here's the current proposal (which is a bit less ambitious than the original intent, since Hadoop is not yet ready to be cleanly split into server and client parts at the level of jar files):

        1. we are going to have an extra package called hadoop-client
        2. hadoop-client will depend on hadoop, hadoop-hdfs, hadoop-yarn and hadoop-mapreduce, since they all have client and server bits co-mingled in their respective jar files
        3. the hadoop-client package will install a set of symbolic links under /usr/lib/hadoop/lib/client pointing to all the jar files in the packages it depends upon

        That way we will have a single location to add to the client's class path (/usr/lib/hadoop/lib/client), and we can slowly work on splitting the actual jar files between the client and server packages.

        Thoughts?
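
        As an illustration of step 3, here is a minimal Python sketch of how such a directory of symbolic links could be populated and then turned into a class path. It is not taken from the attached patch. The function names, the directory layout in the usage note, and the assumption that hadoop-client.list boils down to a list of jar file names are all hypothetical.

```python
import os
from pathlib import Path


def link_client_jars(jar_names, search_dirs, client_dir):
    """Symlink each named jar into client_dir; return the names not found.

    jar_names   -- jar file names (assumed to come from hadoop-client.list)
    search_dirs -- directories owned by the packages hadoop-client depends on
    client_dir  -- e.g. /usr/lib/hadoop/lib/client
    """
    client_dir = Path(client_dir)
    client_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for name in jar_names:
        # Use the first directory that actually ships this jar.
        target = next(
            (Path(d) / name for d in search_dirs if (Path(d) / name).is_file()),
            None,
        )
        if target is None:
            missing.append(name)
            continue
        link = client_dir / name
        # Replace a stale or dangling link left over from an earlier install.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(target.resolve())
    return missing


def client_classpath(client_dir):
    """Join every jar found under client_dir into one class path string."""
    jars = sorted(Path(client_dir).glob("*.jar"))
    return os.pathsep.join(str(jar) for jar in jars)
```

        A caller would pass the jar names plus directories such as /usr/lib/hadoop and /usr/lib/hadoop-hdfs (assumed locations), then hand the result of client_classpath("/usr/lib/hadoop/lib/client") to the JVM. Returning the missing names rather than raising lets a package build decide whether an unresolved jar is fatal.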

        Roman Shaposhnik added a comment -

        Attaching a patch that also requires moving us to the tip of branch-0.23.

        Please let me know what you think.

        Once we agree that this type of hadoop-client package looks good, I can transition all the pig/hive/sqoop/etc. dependencies onto it.

        Bruno Mahé added a comment - edited

        Some notes:

        • Could you attach hadoop-client.list so I can see what it looks like?
        • Are you bumping hadoop version on purpose as part of this ticket?
        • Wouldn't a "find" be more appropriate than this for/continue loop?
        • In a spec file, "Requires" statements can be split across multiple lines. It would be easier to read if you could split the Requires.
        Roman Shaposhnik added a comment -

        @Bruno,

        1. I've attached the current version of hadoop-client.list. The thing about it, of course, is that it can change as Hadoop developers see fit, hence we have to ask the Hadoop build for it instead of maintaining our own copy.
        2. Correct, we need the bump specifically because of this ticket. The hadoop-client is there, and so is the fix for MAPREDUCE-3996 (although not the one that is proposed on that JIRA).
        3. Hm. Not sure about the find. If you don't mind, please paste the code you're thinking about and we can decide.
        4. Will absolutely split Requires. I didn't know about that.

        Ok, at this point, barring your find suggestion, I take it that the general idea is acceptable?

        Bruno Mahé added a comment -

        Thanks!

        So given that:

        • we are both aware that the zookeeper jar wouldn't be picked up as is because of the versionless zk jar in hadoop, but you said you would fix that in a coming ticket
        • I am too busy/lazy to come up with the right snippet for 3
        • you will fix the Requires in an updated patch

        +1


          People

          • Assignee:
            Roman Shaposhnik
            Reporter:
            Roman Shaposhnik