Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels:
      None

      Description

      If the tez component is selected, set the MapReduce framework to yarn-tez in mapred-site.xml, and set hive.execution.engine=tez (plus related settings) in hive-site.xml.

      1. BIGTOP-1657.patch
        9 kB
        Evans Ye
      2. BIGTOP-1657.patch
        12 kB
        Evans Ye
      3. BIGTOP-1657.patch
        12 kB
        Evans Ye

        Activity

        cos Konstantin Boudnik added a comment -

        Let's do this - otherwise auto-deployment of Tez is screwed, and it will be released untested.

        evans_ye Evans Ye added a comment - - edited

        Since Tez requires its libraries to be uploaded to HDFS, I added the upload logic to init-hcfs.json and init-hcfs.groovy.
        The patch also adds the Tez configuration to hadoop-env.sh, sets hive.execution.engine to tez, and sets the MapReduce framework to yarn-tez when Tez is deployed.
        This makes testing the patch a little complicated. Here are the test steps I used:

        • Run ./gradlew hadoop-yum bigtop-utils-yum to build the new hadoop and bigtop-utils packages after applying the patch
        • Update bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig.yaml:
          ...
          components: [hadoop, yarn, hive, tez]
          ...
          enable_local_repo: true
          
        • cd bigtop-deploy/vm/vagrant-puppet-docker; ./docker-hadoop -d -c 1 (takes 4 minutes in my environment)
        • vagrant ssh bigtop1
        • su - jenkins
        • hadoop fs -put /etc/passwd
        • hive (get into the hive shell)
        • create table t1 (s string) location '/user/jenkins';
        • select * from t1 order by s;
        • You should see a fancy Tez execution table
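
The engine setting can also be checked without entering the interactive shell. The following is a minimal sketch, assuming the hive CLI is on the PATH of the jenkins user. The function names hive_engine and engine_is_tez are hypothetical and are not part of the patch. It only confirms the configured engine; it does not prove that a query actually ran on Tez.

```shell
#!/usr/bin/env bash
# Print the value of hive.execution.engine as reported by the Hive CLI.
# `hive -e` runs a statement non-interactively, and `set <key>;` prints
# a line of the form key=value.
hive_engine() {
  hive -e 'set hive.execution.engine;' 2>/dev/null \
    | sed -n 's/^hive\.execution\.engine=//p'
}

# Hypothetical check: succeed only when the configured engine is tez.
engine_is_tez() {
  [ "$(hive_engine)" = "tez" ]
}
```

Calling engine_is_tez after the deployment finishes returns a zero exit status when Hive is configured for Tez, which makes it usable in a smoke-test script.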
        cos Konstantin Boudnik added a comment -

        +1 please commit if this is still valid and up-to-date.
        Sorry for not reviewing it faster.

        cos Konstantin Boudnik added a comment -

        I've tried to apply and commit the patch, but it doesn't apply anymore. Could you please rebase it on the latest master? Thanks!

        evans_ye Evans Ye added a comment -

        Sorry, I didn't see your +1 at the time.
        This probably conflicts with BIGTOP-1683.
        I'll rebase it and get back to you soon.

        evans_ye Evans Ye added a comment - - edited

        OK, new patch uploaded. The patch contains the following changes:

        • Add default configuration for tez_conf_dir and tez_jars. Users can override them in site.yaml if needed.
        • Add a tez puppet module, which simply installs the tez package.
        • When the tez component is selected, change mapreduce_framework_name to tez. Users can override it in site.yaml if needed.
        • When the tez component is selected, change hive.execution.engine to tez.
        • When the tez component is selected, export TEZ_CONF_DIR, TEZ_JARS, and HADOOP_CLASSPATH in hadoop-env.sh.
        • When the tez component is selected, install tez before running init-hdfs.sh, so that init-hcfs.groovy uploads the tez jars to HDFS.
        • Add retry logic in Groovy around mkdir and copyFromLocalFile. Without retries, the deploy might fail on a slow computer (like mine), because the namenode may still be in safe mode or the datanode may still be initializing.
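
The retry idea in the last item can be illustrated in shell. This is a sketch only and is not the Groovy code from the patch: the retry function, the attempt count, the delay, and the paths in the usage comments are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the patch: run a command up to
# $1 times, sleeping $2 seconds after each failed attempt.
retry() {
  local attempts=$1 delay=$2
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay}s: $*" >&2
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example usage (target and source paths are assumptions):
#   retry 5 10 hadoop fs -mkdir -p /apps/tez
#   retry 5 10 hadoop fs -copyFromLocal /usr/lib/tez/tez-api.jar /apps/tez/
```

Wrapping each HDFS call separately means a namenode that is still leaving safe mode only delays the deploy instead of failing it on the first attempt.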
        oflebbe Olaf Flebbe added a comment -

        mapreduce_framework_name has to be set to yarn-tez

        Aside from this, it looks very good to me.

        evans_ye Evans Ye added a comment -

        Hey Olaf Flebbe, thanks for spotting a bug in my patch!
        I've fixed this and uploaded a new one.

        oflebbe Olaf Flebbe added a comment -

        Yep, tested. ++1.

        oflebbe Olaf Flebbe added a comment -

        Will commit it ...

        oflebbe Olaf Flebbe added a comment -

        Thank you very much for your work!


          People

          • Assignee:
            evans_ye Evans Ye
            Reporter:
            oflebbe Olaf Flebbe
          • Votes:
            0
            Watchers:
            5
