SQOOP-1192

Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.5
    • Component/s: None
    • Labels:
      None

      Description

      Currently, Sqoop copies the jar files in the %SQOOP_HOME%\lib folder to the job cache every time a Sqoop job is launched. When Oozie launches a Sqoop job, this behavior can be optimized by adding these jars to the Oozie Sqoop sharelib. In this case, the jar files in the share lib only need to be localized to each worker node once and are then reused by all Sqoop jobs launched by Oozie. This can substantially reduce disk I/O on the worker nodes when running Sqoop through Oozie. To enable this, Sqoop needs an option that lets the job skip adding the lib jars to the job cache. For now, this option should only be used by Sqoop jobs started by Oozie. The attached patch introduces the "--skip-dist-cache" option to enable this feature.
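The intended behavior can be pictured with a minimal sketch. This is not the code from the attached patch: the class `JarCacheSketch`, the method `libJarsToCache`, and the jar names are hypothetical and only model the decision described above, namely that the lib jars are added to the job cache unless the job was started with "--skip-dist-cache".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model of the described behavior.
// It does not reproduce Sqoop's actual job setup code.
final class JarCacheSketch {

    // libJars: jar files found under the Sqoop lib folder.
    // skipDistCache: models whether "--skip-dist-cache" was passed.
    // Returns the lib jars that would be added to the job cache.
    static List<String> libJarsToCache(List<String> libJars, boolean skipDistCache) {
        if (skipDistCache) {
            // The jars are assumed to be available already (e.g. via the Oozie share lib).
            return Collections.emptyList();
        }
        // Default behavior: every lib jar is added for each launched job.
        return new ArrayList<>(libJars);
    }

    public static void main(String[] args) {
        // Illustrative jar names only.
        List<String> libJars = Arrays.asList("driver.jar", "avro.jar");
        System.out.println("default: " + libJarsToCache(libJars, false).size() + " jar(s) cached");
        System.out.println("skip:    " + libJarsToCache(libJars, true).size() + " jar(s) cached");
    }
}
```

With the flag set, the job relies entirely on jars that were localized by some other mechanism, which is why the description restricts the option to jobs started by Oozie.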

      1. SQOOP-1192.3.patch
        5 kB
        Shuaishuai Nie
      2. SQOOP-1192.2.patch
        4 kB
        Shuaishuai Nie
      3. SQOOP-1192.1.patch
        4 kB
        Shuaishuai Nie

          Activity

          Hudson added a comment -

          SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop23 #1060 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/1060/)
          SQOOP-1192: Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=84071181265f98959ffdfc41425022f8251d2429)

          • src/java/org/apache/sqoop/mapreduce/JobBase.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/test/com/cloudera/sqoop/TestSqoopOptions.java
          • src/java/org/apache/sqoop/mapreduce/TextExportMapper.java
          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
          • src/docs/user/import.txt
          Hudson added a comment -

          SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop20 #858 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/858/)
          SQOOP-1192: Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=84071181265f98959ffdfc41425022f8251d2429)

          • src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/test/com/cloudera/sqoop/TestSqoopOptions.java
          • src/java/org/apache/sqoop/mapreduce/TextExportMapper.java
          • src/java/org/apache/sqoop/mapreduce/JobBase.java
          • src/docs/user/import.txt
          Hudson added a comment -

          SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop200 #863 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/863/)
          SQOOP-1192: Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=84071181265f98959ffdfc41425022f8251d2429)

          • src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/mapreduce/TextExportMapper.java
          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/mapreduce/JobBase.java
          • src/test/com/cloudera/sqoop/TestSqoopOptions.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          Hudson added a comment -

          SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop100 #821 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/821/)
          SQOOP-1192: Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=84071181265f98959ffdfc41425022f8251d2429)

          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/mapreduce/TextExportMapper.java
          • src/test/com/cloudera/sqoop/TestSqoopOptions.java
          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/java/org/apache/sqoop/mapreduce/JobBase.java
          • src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
          Jarek Jarcec Cecho added a comment -

          Thank you for your contribution, Shuaishuai Nie!

          ASF subversion and git services added a comment -

          Commit 84071181265f98959ffdfc41425022f8251d2429 in branch refs/heads/trunk from Jarek Jarcec Cecho
          [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=8407118 ]

          SQOOP-1192: Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib

          (Shuaishuai Nie via Jarek Jarcec Cecho)

          Shuaishuai Nie added a comment -

          Thanks, Jarek Jarcec Cecho. Updated the patch with the spelling error fixed.

          Shuaishuai Nie added a comment -

          Updated the documentation for the new option in SQOOP-1192.3.patch.

          Shuaishuai Nie added a comment -

          Added a unit test for the new option.

          Venkat Ranganathan added a comment -

          Thanks, Shuaishuai Nie, for uploading the patch. Can you add tests for it? BTW, we also need to do the same for the HCatalog jars and make sure the Sqoop action includes the HCat jars, given that Sqoop has HCat integration now.

          And can you also please add a review board link with the revised patch?


            People

            • Assignee:
              Shuaishuai Nie
              Reporter:
              Shuaishuai Nie
            • Votes:
              0
              Watchers:
              6
