  1. Spark
  2. SPARK-47483

Add support for aggregation and join operations on arrays of collated strings

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      Example of aggregation sequence:

      create table t(a array<string collate utf8_binary_lcase>) using parquet;

      insert into t(a) values(array('a' collate utf8_binary_lcase));
      insert into t(a) values(array('A' collate utf8_binary_lcase));

      select distinct a from t;

      Example of join sequence:

      create table l(a array<string collate utf8_binary_lcase>) using parquet;
      create table r(a array<string collate utf8_binary_lcase>) using parquet;

      insert into l(a) values(array('a' collate utf8_binary_lcase));
      insert into r(a) values(array('A' collate utf8_binary_lcase));

      select * from l join r where l.a = r.a;

      Both queries should return exactly one row: utf8_binary_lcase compares strings case-insensitively, so array('a') and array('A') are considered equal.
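
      The expected behavior can be modeled outside Spark. The following is a minimal Python sketch of the intended semantics, not Spark's implementation. It assumes that equality under utf8_binary_lcase can be approximated by comparing lowercased strings element by element; lcase_key, distinct, and join are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple


def lcase_key(arr: List[str]) -> Tuple[str, ...]:
    """Hypothetical comparison key: lowercase each element of the array.

    Assumption: str.lower() approximates the utf8_binary_lcase collation.
    """
    return tuple(s.lower() for s in arr)


def distinct(rows: List[List[str]]) -> List[List[str]]:
    """Model of SELECT DISTINCT: keep one row per collation key."""
    seen = {}
    for row in rows:
        # Keeps the first row seen per key; this sketch makes no claim
        # about which representative Spark returns.
        seen.setdefault(lcase_key(row), row)
    return list(seen.values())


def join(left: List[List[str]], right: List[List[str]]):
    """Model of an inner join on l.a = r.a under the collation."""
    return [(l, r) for l in left for r in right
            if lcase_key(l) == lcase_key(r)]


# Mirrors the two sequences from the description.
distinct_rows = distinct([["a"], ["A"]])
joined_rows = join([["a"]], [["A"]])

print(len(distinct_rows))  # 1
print(len(joined_rows))    # 1
```

      Under plain binary comparison both counts would differ: the distinct query would return two rows and the join would return none.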

            People

              Assignee: Nikola Mandic (nikolamand-db)
              Reporter: Nikola Mandic (nikolamand-db)
              Votes: 0
              Watchers: 2