Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-16667

PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field is incorrect

Log workAgile BoardRank to TopRank to BottomVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0, 3.0.0
    • Component/s: None
    • Labels:
      None

      Description

      The CLOB JDO type introduced with HIVE-12274 does not work correctly with PostgreSQL. The value is written out-of-band and the LOB handle is written,as an INT, into the table. SELECTs return the INT value, which should had been read via the lo_get PG built-in, and then cast into string.

      Furthermore, the behavior is different between fields upgraded from earlier metastore versions (they retain their string storage) vs. values inserted after the upgrade (inserted as LOB roots).

      Teh code in MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob expects the underlying JDO/Datanucleus to map the column to a Clob but that does not happen, the value is a Java String containing the int which is the LOB root saved by PG.

      This manifests at runtime with errors like:

      hive> select * from srcpart;
      Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Error: type expected at the position 0 of '24030:24031' but '24030' is found.
      

      the 24030:24031 should be 'string:string'.

      repro:

      CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE;
      LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO TABLE srcpart PARTITION (ds="2008-04-09", hr="11");
      select * from srcpart;
      

      I did not see the issue being hit by non-partitioned/textfile tables, but that is just the luck of the path taken by the code. Inspection of my PG metastore shows all the CLOB fields suffering from this issue.

        Attachments

        1. HIVE-16667.2.patch
          2 kB
          Naveen Gangam
        2. HIVE-16667.3.patch
          2 kB
          Naveen Gangam
        3. HIVE-16667.patch
          3 kB
          Naveen Gangam
        4. HiveCLIOutput.txt
          98 kB
          Naveen Gangam
        5. PostgresDBOutput.txt
          3 kB
          Naveen Gangam

        Issue Links

          Activity

          $i18n.getText('security.level.explanation', $currentSelection) Viewable by All Users
          Cancel

            People

            • Assignee:
              ngangam Naveen Gangam Assign to me
              Reporter:
              rusanu Remus Rusanu

              Dates

              • Created:
                Updated:
                Resolved:

                Issue deployment