Sqoop / SQOOP-1272

Support importing mainframe sequential datasets

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.4, 1.4.5
    • Fix Version/s: 1.4.6
    • Component/s: connectors
    • Labels:

      Description

      There is a growing need to move data from mainframe to HDFS. This Jira proposes to enhance Sqoop to support moving a set of sequential mainframe datasets to HDFS. The attached document describes a design for this enhancement.

      1. MainframeImport.pdf
        38 kB
        Mariappan Asokan
      2. MainframeImport.pdf
        40 kB
        Mariappan Asokan
      3. 1272.patch
        79 kB
        Mariappan Asokan
      4. 1272.patch
        113 kB
        Mariappan Asokan
      5. 1272.patch
        116 kB
        Mariappan Asokan
      6. 1272.patch
        127 kB
        Mariappan Asokan

          Activity

          chris.teoh@gmail.com Chris Teoh added a comment -

          Hi there,

          Good job on the patch. It seems to only support partitioned datasets rather than sequential datasets. Can a JIRA be raised as an improvement to this patch to support sequential datasets? I have been working on a patch to add this support.

          Kind Regards
          Chris

          SrinivaR Srinivas added a comment - - edited

          Hi Mariappan Asokan. This document was very useful.
          Is there any option to import mainframe PS files into Hadoop through Sqoop?

          Thanks in advance

          masokan Mariappan Asokan added a comment -

          Thanks Venkat for committing this. Thanks to all the reviewers (Gwen, Jarcec, and Venkat.)

          – Asokan

          hudson Hudson added a comment -

          SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop23 #1133 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/1133/)
          SQOOP-1272: Support importing mainframe sequential datasets (venkat: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=268299ee52f016b6872f042c52268731b525ad13)

          • src/docs/user/basics.txt
          • src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java
          • src/docs/man/database-independent-args.txt
          • src/java/org/apache/sqoop/tool/SqoopTool.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputSplit.java
          • src/docs/user/import.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputSplit.java
          • src/docs/man/hive-args.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetRecordReader.java
          • src/docs/man/hbase-args.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java
          • src/docs/user/tools.txt
          • src/docs/man/sqoop-import-mainframe.txt
          • src/test/org/apache/sqoop/manager/TestMainframeManager.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputFormat.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java
          • src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
          • src/docs/man/import-args.txt
          • src/docs/user/SqoopUserGuide.txt
          • src/docs/man/mainframe-connection-args.txt
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/docs/user/intro.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeImportJob.java
          • ivy/libraries.properties
          • src/docs/user/import-mainframe.txt
          • ivy.xml
          • src/docs/man/common-args.txt
          • src/docs/user/distributed-cache.txt
          • src/docs/user/mainframe-common-args.txt
          • src/java/org/apache/sqoop/manager/MainframeManager.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputFormat.java
          • src/docs/man/sqoop.txt
          • src/docs/user/import-mainframe-purpose.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java
          • src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java
          • src/docs/user/connecting-to-mainframe.txt
          • src/java/org/apache/sqoop/tool/MainframeImportTool.java
          • src/docs/man/import-common-args.txt
          • src/docs/user/validation-args.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop200 #936 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/936/)
          SQOOP-1272: Support importing mainframe sequential datasets (venkat: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=268299ee52f016b6872f042c52268731b525ad13)

          • ivy.xml
          • src/docs/man/database-independent-args.txt
          • src/docs/user/validation-args.txt
          • src/java/org/apache/sqoop/tool/MainframeImportTool.java
          • src/docs/man/import-common-args.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputFormat.java
          • src/docs/man/sqoop.txt
          • src/docs/man/import-args.txt
          • src/docs/user/import.txt
          • src/docs/man/common-args.txt
          • src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java
          • src/docs/man/hbase-args.txt
          • src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeImportJob.java
          • src/docs/user/distributed-cache.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputFormat.java
          • src/docs/user/import-mainframe.txt
          • src/docs/man/mainframe-connection-args.txt
          • src/docs/user/mainframe-common-args.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputSplit.java
          • src/docs/man/hive-args.txt
          • src/docs/user/basics.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java
          • src/docs/user/import-mainframe-purpose.txt
          • src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputSplit.java
          • src/docs/user/tools.txt
          • src/java/org/apache/sqoop/tool/SqoopTool.java
          • src/docs/user/intro.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
          • src/docs/user/SqoopUserGuide.txt
          • src/java/org/apache/sqoop/manager/MainframeManager.java
          • src/docs/man/sqoop-import-mainframe.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetRecordReader.java
          • src/test/org/apache/sqoop/manager/TestMainframeManager.java
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java
          • ivy/libraries.properties
          • src/docs/user/connecting-to-mainframe.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Sqoop-ant-jdk-1.6-hadoop20 #930 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/930/)
          SQOOP-1272: Support importing mainframe sequential datasets (venkat: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=268299ee52f016b6872f042c52268731b525ad13)

          • src/docs/user/validation-args.txt
          • src/java/org/apache/sqoop/tool/MainframeImportTool.java
          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetRecordReader.java
          • src/test/org/apache/sqoop/manager/TestMainframeManager.java
          • ivy.xml
          • src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputFormat.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/docs/man/sqoop-import-mainframe.txt
          • src/docs/man/import-args.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputFormat.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java
          • src/docs/man/common-args.txt
          • src/docs/user/distributed-cache.txt
          • src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
          • src/docs/user/connecting-to-mainframe.txt
          • src/docs/user/SqoopUserGuide.txt
          • src/docs/user/import-mainframe-purpose.txt
          • src/java/org/apache/sqoop/manager/MainframeManager.java
          • src/docs/user/basics.txt
          • src/docs/man/sqoop.txt
          • src/docs/man/hbase-args.txt
          • src/docs/man/database-independent-args.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
          • src/docs/user/mainframe-common-args.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputSplit.java
          • src/docs/man/hive-args.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java
          • src/docs/user/import-mainframe.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeImportJob.java
          • src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputSplit.java
          • src/docs/man/mainframe-connection-args.txt
          • ivy/libraries.properties
          • src/docs/man/import-common-args.txt
          • src/docs/user/tools.txt
          • src/docs/user/intro.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java
          • src/java/org/apache/sqoop/tool/SqoopTool.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Sqoop-ant-jdk-1.6-hadoop100 #895 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/895/)
          SQOOP-1272: Support importing mainframe sequential datasets (venkat: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=268299ee52f016b6872f042c52268731b525ad13)

          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/java/org/apache/sqoop/tool/MainframeImportTool.java
          • src/docs/man/mainframe-connection-args.txt
          • src/docs/user/connecting-to-mainframe.txt
          • src/docs/user/intro.txt
          • src/docs/user/import-mainframe-purpose.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetRecordReader.java
          • src/docs/man/sqoop.txt
          • src/docs/man/import-common-args.txt
          • src/docs/man/database-independent-args.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeImportJob.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputFormat.java
          • src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java
          • src/docs/user/tools.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetInputSplit.java
          • src/docs/man/import-args.txt
          • ivy/libraries.properties
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
          • src/docs/man/sqoop-import-mainframe.txt
          • src/docs/user/basics.txt
          • src/docs/user/validation-args.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputFormat.java
          • ivy.xml
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java
          • src/docs/man/common-args.txt
          • src/java/org/apache/sqoop/tool/SqoopTool.java
          • src/docs/user/import-mainframe.txt
          • src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java
          • src/java/org/apache/sqoop/manager/MainframeManager.java
          • src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
          • src/docs/user/mainframe-common-args.txt
          • src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetInputSplit.java
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java
          • src/docs/man/hive-args.txt
          • src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java
          • src/test/org/apache/sqoop/manager/TestMainframeManager.java
          • src/docs/man/hbase-args.txt
          • src/docs/user/distributed-cache.txt
          • src/docs/user/SqoopUserGuide.txt
          venkatnrangan Venkat Ranganathan added a comment -

          Thanks Mariappan Asokan for patiently working on this issue and working through the review comments. Thanks Jarek Jarcec Cecho and Gwen Shapira for reviewing this big contribution.

          jira-bot ASF subversion and git services added a comment -

          Commit 268299ee52f016b6872f042c52268731b525ad13 in sqoop's branch refs/heads/trunk from Venkat Ranganathan
          [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=268299e ]

          SQOOP-1272: Support importing mainframe sequential datasets

          (Mariappan Asokan via Venkat Ranganathan)

          venkatnrangan Venkat Ranganathan added a comment -

          Also, can you upload the new patch to RB so that we can review it there?

          Thanks

          venkatnrangan Venkat Ranganathan added a comment -

          Thanks Mariappan Asokan

          Sorry, it was me who suggested it, but the approach for the connector has since changed, so it is OK to not enable that flag.

          masokan Mariappan Asokan added a comment -

          Jarcec and Venkat,
          I ran some tests connecting to a real mainframe. The tests were failing. I debugged and found out that in MainframeManager.java, I should not override the method isORMFacilitySelfManaged() to return true. The SqoopRecord class has to be generated with one field. When I removed isORMFacilitySelfManaged(), the tests ran fine. I have uploaded the patch with that change.

          Please let me know whether it is okay.

          Thanks.
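
          The reasoning in this comment can be illustrated with a minimal, self-contained Java sketch. It does not use Sqoop's real classes: SketchConnManager, SelfManagedMainframeManager, DefaultMainframeManager, and SketchImportTool are hypothetical stand-ins. The assumption, taken from the comment rather than verified against Sqoop's source, is that the import tool skips generating the record class whenever the manager reports that it manages its own ORM facility.

```java
import java.util.Optional;

/** Small driver for the sketch; all class names here are hypothetical. */
class OrmFacilitySketch {
    public static void main(String[] args) {
        // With the override in place, no record class is generated.
        System.out.println(SketchImportTool.generateRecordClass(
            new SelfManagedMainframeManager(), "EMPLOYEES").isPresent());
        // Without the override, the tool generates the record class.
        System.out.println(SketchImportTool.generateRecordClass(
            new DefaultMainframeManager(), "EMPLOYEES").isPresent());
    }
}

/** Stand-in for a connection manager base class (not Sqoop's ConnManager). */
abstract class SketchConnManager {
    /** By default the tool, not the manager, generates the record class. */
    boolean isORMFacilitySelfManaged() {
        return false;
    }
}

/** Mirrors the failing patch: the manager claims to manage record classes itself. */
class SelfManagedMainframeManager extends SketchConnManager {
    @Override
    boolean isORMFacilitySelfManaged() {
        return true;
    }
}

/** Mirrors the fix: no override, so the default (false) applies. */
class DefaultMainframeManager extends SketchConnManager {
}

/** Stand-in for the import tool's code-generation step. */
final class SketchImportTool {
    private SketchImportTool() { }

    /**
     * Returns the name of the generated record class, or an empty Optional
     * if generation was skipped because the manager is self-managed.
     */
    static Optional<String> generateRecordClass(SketchConnManager manager,
                                                String datasetName) {
        if (manager.isORMFacilitySelfManaged()) {
            return Optional.empty();
        }
        // Assumed naming scheme, for illustration only.
        return Optional.of(datasetName + "Record");
    }
}
```

          In this sketch, removing the override is what lets the tool produce the record class, which matches the behavior described above.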

          masokan Mariappan Asokan added a comment -

          Uploaded a new patch.

          jarcec Jarek Jarcec Cecho added a comment -

          Switching back to "Patch available" state then.

          masokan Mariappan Asokan added a comment -

          Uploaded the latest patch with the updated documentation files.

          jarcec Jarek Jarcec Cecho added a comment -

          We're waiting on an updated patch, so I'm canceling the "Patch available" status for now.

          masokan Mariappan Asokan added a comment -

          Gwen,
          Unfortunately, I could not make MainframeManager inherit from org.apache.sqoop.manager.ConnManager since ConnFactory returns only com.cloudera.sqoop.manager.ConnManager. I made the following changes:

          • Updated ivy.xml and libraries.properties to add dependency on Apache commons-net.
          • Moved mainframe related files to sub-package mainframe.

          I am still working on adding documentation. Please see the review board for the latest patch. Thanks.
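
          The inheritance constraint mentioned at the top of this comment can be sketched in plain Java. The classes below are hypothetical stand-ins, not Sqoop's: NewStyleConnManager plays the role of org.apache.sqoop.manager.ConnManager, LegacyConnManager plays the role of com.cloudera.sqoop.manager.ConnManager (assumed here to be a subclass of the newer class), and SketchConnFactory plays the role of ConnFactory with a return type fixed to the legacy class.

```java
/** Small driver for the sketch; all class names here are hypothetical. */
class ConnManagerHierarchySketch {
    public static void main(String[] args) {
        // A manager built on the legacy base can be handed out by the factory.
        System.out.println(SketchConnFactory.getManager().describe());
        // A manager built only on the newer base is not a LegacyConnManager.
        System.out.println(
            LegacyConnManager.class.isAssignableFrom(NewBasedMainframeManager.class));
    }
}

/** Stand-in for the newer base class. */
abstract class NewStyleConnManager {
    abstract String describe();
}

/** Stand-in for the deprecated base class, assumed to extend the newer one. */
abstract class LegacyConnManager extends NewStyleConnManager {
}

/** A manager that extends the deprecated base, as the patch ended up doing. */
class LegacyBasedMainframeManager extends LegacyConnManager {
    @Override
    String describe() {
        return "mainframe manager (legacy base)";
    }
}

/** A manager that extends only the newer base. */
class NewBasedMainframeManager extends NewStyleConnManager {
    @Override
    String describe() {
        return "mainframe manager (new base)";
    }
}

/** Stand-in for a factory whose declared return type is the legacy class. */
final class SketchConnFactory {
    private SketchConnFactory() { }

    static LegacyConnManager getManager() {
        // Returning new NewBasedMainframeManager() here would not compile,
        // because it is not a subtype of LegacyConnManager.
        return new LegacyBasedMainframeManager();
    }
}
```

          Under these assumptions, the factory's declared return type is what forces the manager onto the deprecated base class.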

          gwenshap Gwen Shapira added a comment -

          Agree about SqoopOptions. Unfortunately, the hierarchy structure looks a bit messy.

          And yes, please add commons-net as an explicit dependency, since we depend on it directly. I'd rather not assume that Hadoop will always use this library.

          masokan Mariappan Asokan added a comment -

          Gwen,
          Thanks for your suggestion. I was able to send a review request. To answer your questions:

          Any reason MainframeManager inherits from the deprecated com.cloudera.sqoop.manager.ConnManager and not from org.apache.sqoop.manager.ConnManager? I noticed you also use the deprecated SqoopOptions.

          I will change MainframeManager to inherit from org.apache.sqoop.manager.ConnManager. It is not a problem. However, I still have to use deprecated SqoopOptions because importing to HBase or Accumulo requires using HBaseImportJob or AccumuloImportJob respectively. The constructors for these classes can take only deprecated SqoopOptions.

          MainframeFTPClientUtils depends on org.apache.commons.net. I didn't see commons-net added as a dependency to ivy.xml.

          When the dependencies for Apache Hadoop jar files are picked up, commons-net is picked up automatically. If you think it is a good idea to update ivy.xml in Sqoop to make it independent of that, I will add it to ivy.xml.

          Jarcec,
          To answer your questions:

          Can we add documentation (stored in src/docs/user) for the new tool?

          Good suggestion. I will add it in the next version of the patch.

          Can we move the Mainframe* files from org.apache.sqoop.mapreduce.* to a special sub-package mainframe, forming org.apache.sqoop.mapreduce.mainframe?

          I thought about it initially. However, it was not clear whether I should do it considering the existing directory structure. There are several *ImportJob.java, *InputFormat.java, and *Mapper.java classes already in org.apache.sqoop.mapreduce. Sure, I can create org.apache.sqoop.mapreduce.mainframe if it avoids cluttering of org.apache.sqoop.mapreduce.

          gwenshap Gwen Shapira added a comment -

          Mariappan,

          I believe you can set either a person or a group as a reviewer. Set the group to "Sqoop" and you are good.

          masokan Mariappan Asokan added a comment -

          Jarcec,
          I am not able to update the reviewer (which is mandatory) when I try to publish the code for review. I need some help. Thanks.

          gwenshap Gwen Shapira added a comment -

          A few notes:

          • Any reason MainframeManager inherits from the deprecated com.cloudera.sqoop.manager.ConnManager and not from org.apache.sqoop.manager.ConnManager? I noticed you also use the deprecated SqoopOptions.
          • MainframeFTPClientUtils depends on org.apache.commons.net. I didn't see commons-net added as a dependency to ivy.xml.

          Other than that, looks good.

          jarcec Jarek Jarcec Cecho added a comment -

          A couple of high-level notes:

          • Can we add documentation (stored in src/docs/user) for the new tool?
          • Can we move the Mainframe* files from org.apache.sqoop.mapreduce.* into a dedicated sub-package, mainframe, forming org.apache.sqoop.mapreduce.mainframe?
          jarcec Jarek Jarcec Cecho added a comment -

          Instructions on how to upload a patch to Review Board are on our How to contribute wiki page.

          jarcec Jarek Jarcec Cecho added a comment -

          It's quite a sizable patch, Mariappan Asokan. Would you mind uploading it to Review Board?

          masokan Mariappan Asokan added a comment -

          Hi Gwen,
          Thanks for your comments. There are various ways to transfer data off a mainframe. FTP is one for which an open source client library is available; the others rely on proprietary technologies with no open source client libraries. The purpose of this Jira is to provide an implementation using only open source client libraries, which is why FTP was chosen. I agree that, in its current form, the implementation is no different from importing data from non-mainframe platforms. However, future enhancements may make this implementation specific to mainframes.

          gwenshap Gwen Shapira added a comment -

          Completely agree that there's a need for mainframe imports.

          This looks like it can work as a generic FTP import. Do you agree? If so, perhaps a small rename will make it even more useful?

          masokan Mariappan Asokan added a comment -

          Uploaded the patch.

          masokan Mariappan Asokan added a comment -

          Updated the latest design document. Will upload the patch shortly.

          jarcec Jarek Jarcec Cecho added a comment -

          I believe that a mainframe connector would be a great addition to Sqoop. I've read the design document and the high-level approach seems good to me. Please go ahead with the implementation, Mariappan Asokan!


            People

            • Assignee:
              masokan Mariappan Asokan
              Reporter:
              masokan Mariappan Asokan
            • Votes:
              0
              Watchers:
              11
