Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.4.0
    • Component/s: general
    • Labels:
      None

      Description

      Hue 2.0.0 is a major revision of Hue. It is licensed under APL and is available from GitHub:
      http://cloud.github.com/downloads/cloudera/hue/release-notes-2.0.0-beta.html

      1. BIGTOP-527-bruno-feedback.patch.txt
        7 kB
        Roman Shaposhnik
      2. BIGTOP-527-3.patch.txt
        100 kB
        Roman Shaposhnik
      3. BIGTOP-527.patch.txt
        87 kB
        Roman Shaposhnik
      4. BIGTOP-527.2.patch.txt
        91 kB
        Roman Shaposhnik

        Activity

        Owen O'Malley added a comment -

        Only Apache projects can be added to Bigtop.

        Roman Shaposhnik added a comment -

        Owen, the consensus is far from being reached. Please don't pull the trigger prematurely. Especially on the JIRAs that are not assigned to you. Thanks.

        Bruno Mahé added a comment -

        Can we leave this ticket open until consensus is reached? If no one objects, I will re-open it within a few days.

        Roman Shaposhnik added a comment -

        Attaching a first cut at the patch. Things that are still missing (and will be added later):

        • Puppet deployment code
        • Package testing manifest updates
        • Unification of init.d scripts

        Please let me know what you guys think.

        Bruno Mahé added a comment -

        Great! Thanks a lot!
        I will take a look at it tonight/tomorrow and will give it a try on Mageia as well.

        bc Wong added a comment -

        Would it make more sense to have one package instead of one per-app plus core?

        Bruno Mahé added a comment -

        It depends on:

        • The dependencies. If one app depends on Apache Hive, another one on Apache Flume and another one on some other Apache project, then installing Hue means pulling in all these dependencies. Making these transitive dependencies optional or not required would be quite inconvenient for users, since it would not only require another step to install each dependency but would also force the user to know the name of that dependency (i.e., which package do I need?).
        • The usage pattern of Hue. People may not want to pull in all of Hue, the same way they may not want to pull the entire LibreOffice onto their machine with all its modules and language support. That said, LibreOffice is at the other end of the spectrum, with ~100 packages on my Fedora 16.

        So for now, given the size of the patch and the effort required for such modification, what about focusing on integrating Hue as is and then figuring out how to proceed with the apps later? I can open a ticket if needed.

        Roman Shaposhnik added a comment -

        @bc,

        I guess I'm also interested in finding out the usage pattern. Or to put it slightly differently – how independent are the Hue apps really? Is there something like a core set of apps in Hue that you must have in order for it to be usable? From the outside it seems that perhaps we can bucket them in the following way:

        1. hue-coreapps (hue-about, hue-help, hue-proxy, hue-useradmin, hue-filebrowser, hue-jobbrowser, hue-shell)
        2. hue-beeswax (since it depends on hive, etc).
        3. hue-jobsub (this one I don't really feel strongly about either way)

        Oh, and since we're on this subject of radical changes – I also would like to move all the bits to under /usr/lib/hue to be more consistent with the rest of the Bigtop packaging. Any objections?

        Bruno Mahé added a comment -

        "Oh, and since we're on this subject of radical changes – I also would like to move all the bits to under /usr/lib/hue to be more consistent with the rest of the Bigtop packaging. Any objections?"

        +1.
        The same must be applied to its database (moving it to /var/lib)

        bc Wong added a comment -

        Moving to /usr/lib sounds reasonable.

        @Bruno, a sane admin would typically want to install a monolithic Hue due to ease of management. Note that Hue has its own mechanism to disable (or enable) certain apps.

        "it would also force the user to know the name of that dependency"

        Not if you have packaging. And if it's a non-packaging install, the admin will have to know the dependency chain anyway. Hue itself will still run and function even if the dependencies aren't there. For example, if you don't have Hive, Beeswax will show you an error. At this point, you can go install Hive if you care about Beeswax. Or you can disable Beeswax if you don't care.

        "People may not want to pull all of Hue"

        99% of the time, people want to. (I personally haven't heard of any installation that picks and chooses which apps to install.) We're trading 4 MB of disk space for a major usability and maintenance improvement. I think it's well worth it.

        Bruno Mahé added a comment -

        "@Bruno, a sane admin would typically want to install a monolithic Hue due to ease of management. Note that Hue has its own mechanism to disable (or enable) certain apps."

        What do you mean exactly?
        If each app pulls a different project (one pulls Apache Hive, another Apache Hadoop, another Apache Oozie and another Apache HBase), I am pretty sure any sane admin would be horrified about that.

        "Not if you have packaging."

        We are talking about the case of having a non-explicit dependency between packages.
        So how would the user know about the packages?

        "For example, if you don't have Hive, Beeswax will show you an error. At this point, you can go install Hive if you care about Beeswax. Or you can disable Beeswax if you don't care."

        The point of packaging is to make it easy for users. If I want to install Beeswax, I obviously want Apache Hive to be pulled in as well since using Beeswax without Apache Hive does not make sense.
        And what kind of message does Beeswax display? Does it tell the user which package to install for his/her platform?
        And if you don't care about Beeswax and Apache Hive, why install it in the first place?

        "99% of the time, people want to. (I personally haven't heard of any installation that pick-and-chooses which app to install.) We're trading off 4MB of disk space with major usability and maintenance improvement. I think it's well worth it."

        In that case, it may make sense to merge them. But I am still worried about all the dependencies being pulled in.

        Roman Shaposhnik added a comment - - edited

        I've just inspected dependencies and it seems that my proposal of hue-coreapps and hue-beeswax would separate the dependencies out nicely. And given bc's feedback – I'd propose to add hue-jobsub to hue-coreapps. Thus the final list of packages will be:

        1. hue-common (depends on hadoop-client)
        2. hue-coreapps (depends on hue-common)
        3. hue-beeswax (depends on hue-common and hive)
        4. hue-plugins (depends on hadoop-mapreduce)
        5. hue-server (depends on hue-coreapps)

        By the way, once this happens, I'm not sure we need hue package anymore. Whether we should rename hue-server into hue then, I don't quite know.

        In general, though, I'd agree with Bruno – in the future if we start adding things like hue-pig, etc. I'd rather see them packaged as separate apps for dependencies' sake.

        bc, what do you think?
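The package list above can be read as a small dependency graph. The following is a minimal shell sketch, not part of the patch, that walks that graph to show what each proposed package would pull in transitively. The function names `deps_of`, `closure`, and `resolve` are made up for illustration; only the package names and their direct dependencies come from the comment.

```shell
#!/bin/sh
# Direct dependencies, copied from the proposed package list.
deps_of() {
    case "$1" in
        hue-common)   echo "hadoop-client" ;;
        hue-coreapps) echo "hue-common" ;;
        hue-beeswax)  echo "hue-common hive" ;;
        hue-plugins)  echo "hadoop-mapreduce" ;;
        hue-server)   echo "hue-coreapps" ;;
        *)            : ;;  # packages outside Hue are treated as leaves here
    esac
}

# Print every direct and indirect dependency of $1 (may contain duplicates).
closure() {
    for dep in $(deps_of "$1"); do
        echo "$dep"
        closure "$dep"
    done
}

# Sorted, de-duplicated view of the closure.
resolve() {
    closure "$1" | LC_ALL=C sort -u
}
```

Under these assumptions, `resolve hue-server` lists hadoop-client, hue-common, and hue-coreapps, while hive shows up only in `resolve hue-beeswax`, which is the separation the split is meant to achieve.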

        bc Wong added a comment -

        "If each app pulls a different project (one pulls Apache Hive, another Apache Hadoop, another Apache Oozie and another Apache HBase), I am pretty sure any sane admin would be horrified about that."

        That's an exaggeration. Hue would only need the client-side packages. And even if what you said were true, I'd be comfortable having these bits installed. What's another 100MB when you're running a Hadoop cluster?

        Roman, you might still need the hue meta-package, because you don't want to have to educate people about what all these hue-xyz things are. And then people will have questions about "Where is the hive app?" – "Oh, you need to install hue-beeswax." I'd err on the side of usability and ease of maintenance rather than on upholding a puristic packaging philosophy.

        (But I definitely understand your argument, that it is more technically correct to modularize and split them up. I held that view for 2 years as well.)

        Peter Linnell added a comment -

        I've gotten my jira access sorted. Sorry for the delay in chiming in.

        In general, Roman's list here:

        hue-common (depends on hadoop-client)
        hue-coreapps (depends on hue-common)
        hue-beeswax (depends on hue-common and hive)
        hue-plugins (depends on hadoop-mapreduce)
        hue-server (depends on hue-coreapps)

        Seems sensible to me. I also agree keeping the meta package is user friendly.

        I could also see combining common and coreapps, along with the server into one mega package.

        Romain Rigaux added a comment - - edited

        For me in the current situation it looks simpler to just have one single package pulling all the Hadoop dependencies. I am not sure that saving a 'Hive' install is worth all the splitting? (all the apps are 'core' apps right now and hue-plugin could be dropped with Hadoop 2)

        Even with a big package or refined packages we would still need some manual installs (e.g. shell app might need to install Pig, HBase, Flume. File Browser might need hadoop-httpfs).

        Hue is currently structured in 2 parts only:
        apps: proxy, admin, jobbrowser, beeswax... (there are no non-core apps)
        common: desktop, common hadoop/http libs...

        We might need to create two categories of app in Hue (core and non core), that way we could have a clean packaging.

        If I want to create a new (non-core) app, how easy would it be to add/integrate it to Hue with BigTop?
        e.g. creating a:
        hue-pig
        hue-sqoop
        hue-mahout
        hue-flume....

        Roman Shaposhnik added a comment -

        Attaching a second version of the patch with Puppet and tests this time. Also reducing the # of packages to only 4:

        1. hue
        2. hue-common
        3. hue-server
        4. hue-beeswax

        Roman Shaposhnik added a comment -

        At this point the only thing that is left is smoke testing. I'll provide some in the next patch once we all agree that the current patch qualifies for inclusion from a pure packaging standpoint.

        Roman Shaposhnik added a comment -

        This is a final version of the patch ready for inclusion. It contains unit tests and some of the feedback I've received.

        Peter Linnell added a comment -

        +1 LGTM and thanks for the fixes incorporated.

        Bruno Mahé added a comment - - edited

        Great patch! Thanks a lot Roman.
        But I see some issues as blockers. So here is a -1 since Peter put a +1.
        Blocker issues:

        • It does not build on anything other than Fedora (although that looks like missing dependencies on the build slaves).
        • +Conflicts: cloudera-desktop

          should be killed

        • I don't see any notify/subscribe in the hue puppet module (kind of a blocker, but not really)
        • bigtop-deploy/puppet/modules/hue/templates/hue.in does not have any Apache license header
        • In bigtop-packages/src/common/hue/install_hue.sh, I am pretty sure
          CONF_DIR=${CONF_DIR:-/etc/hue}

          should be

          CONF_DIR=${CONF_DIR:-/etc/hue/conf}

        The most important one would be to see some green builds on jenkins.

        Nice to have issues (can be dealt with separately):

        • There are some missing dependencies on mageia (cyrus*, /sbin/runuser...). But we can deal with that later.
        • Not necessary for this pass, but in the hue.ini being deployed by puppet, I don't see Hue's default sqlite db being overridden to a /var/lib/hue/* location
        • Was
          +ln -fs $LIB_DIR/desktop/libs/hadoop/java-lib/*plugin*jar $PREFIX/$HADOOP_DIR

          from install_hue.sh tested?

        • +# ALL_PYTHON_BORKED=`find $BUNDLED_BUILD_DIR/env/lib/python*/site-packages/ -type f -iname "*"`

          should be killed

        • Package hue should also depend on hue-common for completeness. I understand it gets pulled through beeswax, but it is easy to forget about it and make a mistake
        • Note for later:
          +        mkdir -p /usr/lib/hue/pids/ 

          should be killed (actually the whole debian init script should be killed)

        • If both HUE_BASE_VERSION and HUE_PKG_VERSION are defined, let HUE_PKG_VERSION reuse the version defined by HUE_BASE_VERSION
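One way to answer the question about the `ln -fs` line is to exercise it against a scratch directory. The sketch below is an assumption about how such a check could look, not code from the patch: `link_plugin_jars` is a hypothetical helper, and unlike the original one-liner it quotes its arguments and skips an unmatched glob.

```shell
#!/bin/sh
# Hypothetical helper: symlink every Hue plugin jar found under
# $1 (the Hue lib dir) into $2 (the Hadoop lib dir).
link_plugin_jars() {
    lib_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir"
    for jar in "$lib_dir"/desktop/libs/hadoop/java-lib/*plugin*jar; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$jar" ] || continue
        # -f replaces a stale link from an earlier install, -s makes a symlink.
        ln -fs "$jar" "$dest_dir/"
    done
}
```

Calling it with two directories under a `mktemp -d` scratch area and then checking the destination with `[ -L ... ]` would show whether the link is created and resolves, without touching a real Hadoop install.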
        Roman Shaposhnik added a comment -

        @Bruno,

        thanks a million for the very detailed feedback. I'm attaching a diff that takes care of some of the items, and here are a couple of points explaining the other ones:

        1. slaves need to be updated with additional packages and I'm also updating Bigtop's README file with that info
        2. In bigtop-packages/src/common/hue/install_hue.sh (re. CONF_DIR) – that is actually on purpose (and it works)

        Once again – thanks and please take a look at the new diff
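The `CONF_DIR` line under discussion uses the shell's default-value expansion: `${CONF_DIR:-/etc/hue}` yields the caller's value when `CONF_DIR` is set and non-empty, and `/etc/hue` otherwise. A minimal sketch of that behaviour follows; `conf_dir_for` is a hypothetical wrapper for illustration, not a function in install_hue.sh.

```shell
#!/bin/sh
# Hypothetical wrapper around the expansion quoted from install_hue.sh.
conf_dir_for() {
    CONF_DIR=$1
    # ":-" falls back to the default when CONF_DIR is unset or empty.
    echo "${CONF_DIR:-/etc/hue}"
}

conf_dir_for ""              # no override supplied by the caller
conf_dir_for /etc/hue/conf   # override supplied by the caller
```

With this form, a caller who prefers the location Bruno suggested could set `CONF_DIR=/etc/hue/conf` in the environment before running the script, leaving the default untouched.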

        Bruno Mahé added a comment -

        Awesome!
        +1


          People

          • Assignee:
            Roman Shaposhnik
            Reporter:
            Roman Shaposhnik
          • Votes:
            1
            Watchers:
            8

            Dates

            • Created:
              Updated:
              Resolved:
