Sqoop
  1. Sqoop
  2. SQOOP-999

Support bulk load from HDFS to PostgreSQL using COPY ... FROM

    Details

    • Type: New Feature New Feature
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.3
    • Fix Version/s: 1.4.4
    • Component/s: connectors/postgresql
    • Labels:
      None
    • Environment:

      PostgreSQL

      Description

      Initially taken from here:
      https://issues.cloudera.org/browse/SQOOP-184
      As Cloudera site is no longer used:

      Sqoop currently supports bulk export from PostgreSQL but not bulk import. PostgreSQL support COPY <table> FROM <filename> syntax and readily accepts CSV data. We should support this so there is --direct support in both directions.

      A note from myself:
      the pg_bulkupload feature is already implemented:
      https://issues.apache.org/jira/browse/SQOOP-390
      pg_builupload is a third-party tool, I guess supporting COPY would be a better case.

      1. export-to-postgresql-with-copy.1.patch
        18 kB
        Masatake Iwasaki
      2. export-to-postgresql-with-copy.patch
        13 kB
        Masatake Iwasaki

        Issue Links

          Activity

          Hide
          Masatake Iwasaki added a comment -

          It is preferable to use not psql (outer process) but Copy API of PostgreSQL JDBC Driver in order to reduce external dependency. I am going to consider its feasibility.
          http://jdbc.postgresql.org/documentation/publicapi/org/postgresql/copy/CopyManager.html

          Show
          Masatake Iwasaki added a comment - It is preferable to use not psql (outer process) but Copy API of PostgreSQL JDBC Driver in order to reduce external dependency. I am going to consider its feasibility. http://jdbc.postgresql.org/documentation/publicapi/org/postgresql/copy/CopyManager.html
          Hide
          Jarek Jarcec Cecho added a comment -

          Thank you Masatake Iwasaki for investigating feasibility of this feature request! The PostgreSQL JDBC driver seems to be distributed with BSD license that is compatible with ASL per the wiki, so I think that it should be fine to use this API.

          Show
          Jarek Jarcec Cecho added a comment - Thank you Masatake Iwasaki for investigating feasibility of this feature request! The PostgreSQL JDBC driver seems to be distributed with BSD license that is compatible with ASL per the wiki , so I think that it should be fine to use this API.
          Hide
          Masatake Iwasaki added a comment -

          Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without test and consideration for corner cases.
          Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?

          Show
          Masatake Iwasaki added a comment - Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without test and consideration for corner cases. Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
          Hide
          Jarek Jarcec Cecho added a comment -

          Hi Masatake Iwasaki, would you mind uploading your patch to the review board?

          Show
          Jarek Jarcec Cecho added a comment - Hi Masatake Iwasaki , would you mind uploading your patch to the review board ?
          Hide
          Masatake Iwasaki added a comment -

          Hi Jarek Jarcec Cecho, I created review request and uploaded the patch.

          Show
          Masatake Iwasaki added a comment - Hi Jarek Jarcec Cecho , I created review request and uploaded the patch.
          Hide
          Venkat Ranganathan added a comment -

          Thanks Masatake Iwasaki - Typically one would create a new review request in the board and add a link to the JIRA so that JIRA users can navigate to the Review board link from the JIRA. I have done that

          Show
          Venkat Ranganathan added a comment - Thanks Masatake Iwasaki - Typically one would create a new review request in the board and add a link to the JIRA so that JIRA users can navigate to the Review board link from the JIRA. I have done that
          Hide
          Masatake Iwasaki added a comment -
          Show
          Masatake Iwasaki added a comment - Thanks Venkat Ranganathan .
          Hide
          Masatake Iwasaki added a comment -

          Attaching the latest version of the patch. Thanks for the reviews.

          Show
          Masatake Iwasaki added a comment - Attaching the latest version of the patch. Thanks for the reviews.
          Hide
          ASF subversion and git services added a comment -

          Commit fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84 in branch refs/heads/trunk from Jarek Jarcec Cecho
          [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=fb29b8f ]

          SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

          (Masatake Iwasaki via Jarek Jarcec Cecho)

          Show
          ASF subversion and git services added a comment - Commit fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84 in branch refs/heads/trunk from Jarek Jarcec Cecho [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=fb29b8f ] SQOOP-999 : Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Masatake Iwasaki via Jarek Jarcec Cecho)
          Hide
          Jarek Jarcec Cecho added a comment -

          Thank you Masatake Iwasaki for your contribution!

          Show
          Jarek Jarcec Cecho added a comment - Thank you Masatake Iwasaki for your contribution!
          Hide
          Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop200 #787 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/787/)
          SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84)

          Result = SUCCESS
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84
          Files :

          • src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java
          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java
          • src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
          • ivy.xml
          • ivy/libraries.properties
          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java
          Show
          Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop200 #787 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/787/ ) SQOOP-999 : Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84) Result = SUCCESS jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84 Files : src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java ivy.xml ivy/libraries.properties src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java
          Hide
          Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop20 #767 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/767/)
          SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84)

          Result = SUCCESS
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84
          Files :

          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java
          • src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java
          • src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
          • ivy.xml
          • ivy/libraries.properties
          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java
          Show
          Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop20 #767 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/767/ ) SQOOP-999 : Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84) Result = SUCCESS jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84 Files : src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java ivy.xml ivy/libraries.properties src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java
          Hide
          Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop23 #966 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/966/)
          SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84)

          Result = SUCCESS
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84
          Files :

          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java
          • src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
          • src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java
          • ivy.xml
          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java
          • ivy/libraries.properties
          Show
          Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop23 #966 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/966/ ) SQOOP-999 : Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84) Result = SUCCESS jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84 Files : src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java ivy.xml src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java ivy/libraries.properties
          Hide
          Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop100 #739 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/739/)
          SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84)

          Result = SUCCESS
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84
          Files :

          • src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java
          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java
          • src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java
          • ivy.xml
          • src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
          • ivy/libraries.properties
          Show
          Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop100 #739 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/739/ ) SQOOP-999 : Support bulk load from HDFS to PostgreSQL using COPY ... FROM (Revision fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84) Result = SUCCESS jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=fb29b8f9fcd45c98857fe44cfc3fe294f2fc6f84 Files : src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java ivy.xml src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java ivy/libraries.properties

            People

            • Assignee:
              Masatake Iwasaki
              Reporter:
              Ruslan Al-Fakikh
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development