  1. Apache Trafodion (Retired)
  2. TRAFODION-1165

LP Bug: 1443482 - Accessing hive table with ucs2 encoded field returns 0 rows.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • None
    • 2.2.0, 2.3
    • sql-exe

    Description

      When a Hive table contains a UCS2-encoded field, querying it through our implementation returns 0 rows.
      The cause is the use of strchr() in ExHdfsScanTcb::extractAndTransformAsciiSourceToSqlRow().
      strchr() stops at the first '\0' it sees, before it reaches the line delimiter '\n'. That '\0' may simply be the 0x00 byte of a UCS2 character, but the line delimiter is never found and the line is considered invalid.
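
The failure mode described above can be reproduced outside Trafodion with a few lines of C. This is only a sketch under stated assumptions: the sample row layout, the little-endian byte order of the UCS2 data, and the helper names row_end_strchr and row_end_memchr are all hypothetical and are not taken from the Trafodion source. It contrasts a string-based search, which stops at the first 0x00 byte, with a length-based search, which does not.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample row: "1001|" followed by UCS2 "JBL" and '\n'.
   Little-endian byte order is an assumption. The trailing '\0' is not
   part of the row; it only keeps strchr() inside the array. */
const char sample_row[] = {
    '1', '0', '0', '1', '|',
    'J', 0x00, 'B', 0x00, 'L', 0x00,
    '\n', '\0'
};

/* Number of data bytes in the row, excluding the trailing '\0'. */
const size_t sample_row_len = sizeof(sample_row) - 1;

/* String-based search: strchr() treats the first 0x00 byte as the end
   of the string, so it gives up before reaching the '\n'. */
const char *row_end_strchr(const char *buf)
{
    return strchr(buf, '\n');
}

/* Length-based search: memchr() scans exactly len bytes and does not
   treat 0x00 specially, so it reaches the '\n'. */
const char *row_end_memchr(const char *buf, size_t len)
{
    return (const char *)memchr(buf, '\n', len);
}
```

With this sample, row_end_strchr(sample_row) returns NULL because the scan ends at the 0x00 byte after 'J', while row_end_memchr(sample_row, sample_row_len) returns a pointer to the '\n'. A length-based search is only one possible direction and is not necessarily the fix that was applied for this issue. It also has its own limit: a 0x0A byte can occur as one half of a UCS2 character, so a fully robust scanner would need to be aware of the field encoding.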

      Scripts to reproduce:

      create table sck(
      userId int not null,
      name varchar(20) character set UCS2
      );

      insert into sck values (1001, _ucs2'JBL'), (1002, _ucs2'YS '), (1003, _ucs2'8#RTG');

      unload into '/ucs2test' select * from sck;

      create external table hsck
      (
      id int,
      name string
      ) row format delimited fields terminated by '|'
      location '/ucs2test';

      select * from hive.hive.hsck;

      Assigned to LaunchPad User khaled Bouaziz

          People

            ovis_poly liu ming
            howard Howard Qin