Sqoop / SQOOP-448

Boolean fields get nullified during PostgreSQL direct import into Hive.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.4.2
    • Component/s: connectors/postgresql
    • Labels:

      Description

      These diffs address this issue: https://issues.cloudera.org/browse/SQOOP-43.

      I changed DirectPostgresqlManager.java to special-case the COPY command for boolean fields, rewriting it so that Hive sees "true"/"false". The one shortcoming of my patch is that the PostgreSQL JDBC driver cannot tell bit data types from booleans, so direct import will break with SQL cast errors if a table has real bit fields. I am not sure how often people use those, but it will break direct import for anyone who does. If you have an idea for how I can make this better, let me know. Diffs are attached.
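
The rewrite described above can be pictured roughly as follows. This is a minimal sketch, not the attached patch: the class, the method names, and the example table and columns (users, id, active) are assumptions made for illustration. The real COPY command built by Sqoop also carries delimiter and escaping options that are left out here. PostgreSQL's text output for booleans is t/f, which is why the column is wrapped in a CASE expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; class and method names are illustrative, not Sqoop's.
class BooleanCopySketch {

  // Wrap a boolean column in a CASE expression so COPY writes 'true'/'false'.
  // NULL values match neither branch and stay NULL.
  public static String selectExpr(String col, Set<String> booleanCols) {
    if (!booleanCols.contains(col)) {
      return col;
    }
    return "case when " + col + "=true then 'true' when " + col
        + "=false then 'false' end as " + col;
  }

  // Build a simplified COPY command over the given columns.
  // Identifier quoting and delimiter options are omitted in this sketch.
  public static String copyCommand(String table, List<String> cols, Set<String> booleanCols) {
    List<String> exprs = new ArrayList<>();
    for (String col : cols) {
      exprs.add(selectExpr(col, booleanCols));
    }
    return "COPY (SELECT " + String.join(", ", exprs) + " FROM " + table + ") TO STDOUT";
  }

  public static void main(String[] args) {
    System.out.println(copyCommand("users", Arrays.asList("id", "active"),
        Collections.singleton("active")));
  }
}
```

Applied to a bit column, the comparison against true in this expression is what would trigger the SQL error mentioned above.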

      1. sqoop.patch
        3 kB
        Boris Partensky
      2. sqoop_diffs1.txt
        3 kB
        Boris Partensky
      3. sqoop_diffs2.txt
        3 kB
        Boris Partensky

          Activity

          Boris Partensky added a comment -

          diffs for the fix

          Bilung Lee added a comment -

          Thanks for your patch, Boris! Could you please upload your patch to the review board (http://reviews.apache.org)? This would facilitate the review process. Thanks!

          Bilung Lee added a comment -

          The getColumnTypeNames method to be introduced in SQOOP-352 would help distinguish between boolean (type name "bool") and bit (type name "bit") columns.

          Just as a "bool" column is rewritten as "case when %s=true then 'true' when %s=false then 'false' end as %s", a "bit" column can be rewritten as "case when %s=B'1' then 'true' when %s=B'0' then 'false' end as %s".
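
Choosing between the two format strings by type name could look roughly like this. It is a minimal sketch under stated assumptions: the class and method names are hypothetical, the type name is assumed to arrive as a plain string such as "bool", "bit", or "int4", and the code is not taken from the committed patch. Only the two format strings come from the comment above.

```java
import java.util.Locale;

// Hypothetical sketch; names are illustrative and not from the Sqoop source.
class BoolBitExprSketch {

  // Format strings as quoted in the comment; %s is the column name each time.
  static final String BOOL_FMT =
      "case when %s=true then 'true' when %s=false then 'false' end as %s";
  static final String BIT_FMT =
      "case when %s=B'1' then 'true' when %s=B'0' then 'false' end as %s";

  // Return the SELECT expression for one column, given its database type name.
  // Columns of any other type are passed through unchanged.
  public static String selectExpr(String col, String typeName) {
    String type = typeName == null ? "" : typeName.toLowerCase(Locale.ROOT);
    switch (type) {
      case "bool":
        return String.format(BOOL_FMT, col, col, col);
      case "bit":
        return String.format(BIT_FMT, col, col, col);
      default:
        return col;
    }
  }

  public static void main(String[] args) {
    System.out.println(selectExpr("active", "bool"));
    System.out.println(selectExpr("flag", "bit"));
    System.out.println(selectExpr("id", "int4"));
  }
}
```

Neither CASE expression has an ELSE branch, so a NULL in the source column stays NULL in the output.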

          Boris Partensky added a comment -

          Thanks a lot, Bilung, for this pointer. I changed my code to use type names instead and am currently testing it internally. I will upload the diffs as soon as they are verified. I tried merging with the SQOOP-352 diffs, but it turns out those were created against totally different packages: I have com.cloudera.sqoop, and those are made in org.apache. The code is also slightly different.

          Bilung Lee added a comment -

          Thanks for your contribution! The patch will eventually be committed into the trunk, and thus the convention is to base the patch on the version of code in the trunk.

          Boris Partensky added a comment -

          Hi, I am having problems submitting my diffs to reviews.apache.org.

          I specified Sqoop as the repository and "https://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java" as the base, since that is where in my working copy I generated the diffs. When I try to submit the diffs, it tells me:
          The file 'https://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java' (r1298521) could not be found in the repository

          Bilung Lee added a comment -

          This might be related to how you generate your patch.

          Below are the steps you may follow:

          1. Check out the code from the trunk as shown in the link below:
          http://cwiki.apache.org/confluence/display/SQOOP/Setting+up+Development+Environment#SettingupDevelopmentEnvironment-BuildingtheSources

          2. After making the changes, generate the patch as shown in the link below:
          http://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute#HowtoContribute-ProvidingPatches

          3. Upload the patch to the review board:

          • If you are using SVN to generate the patch, choose "Sqoop" as the repository and type in "." as the base directory.
          • If you are using Git to generate the patch, choose "sqoop-git" as the repository.

          Boris Partensky added a comment -

          Thanks. https://reviews.apache.org/r/4263

          Boris Partensky added a comment -

          fix

          Boris Partensky added a comment -

          final patch, reviewed on rb

          Bilung Lee added a comment -

          Patch committed. Thanks, Boris!

          Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6 #97 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6/97/)
          SQOOP-448 boolean fields get nullified during postgres direct import into hive. (Revision 1300345)

          Result = SUCCESS
          blee : http://svn.apache.org/viewvc/?view=rev&rev=1300345
          Files :

          • /incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java

          Boris Partensky added a comment -

          A bug has been discovered in the OR condition in the getCopyCommand method.

          Boris Partensky added a comment -

          This patch addresses a string length logic bug.

          Bilung Lee added a comment -

          Thanks for the quick response and update. The problem is corrected now. Thanks!

          Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6 #98 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6/98/)
          SQOOP-448 boolean fields get nullified during postgres direct import into hive.
          (fix a problem in previous patch) (Revision 1301118)

          Result = SUCCESS
          blee : http://svn.apache.org/viewvc/?view=rev&rev=1301118
          Files :

          • /incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java

            People

            • Assignee: Boris Partensky
            • Reporter: Boris Partensky