SPARK-19731

IN Operator should support arrays

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 1.6.2, 2.0.0, 2.1.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      When the column type and the array element type match, the IN operator should operate on the elements of the array. This is useful for UDFs and predicate subqueries that return arrays.

      (This isn't necessarily extensible to all collections, but certainly applies to arrays.)

      Example:
      select 5 in array(1,2,3) should return false instead of throwing a ParseException, since the element type of the array and the type of the left-hand operand match.
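
The requested behaviour can be modelled outside Spark as a minimal sketch. The helper name in_operator is hypothetical and is not part of Spark. Raising an error on an element type mismatch is an assumption, since the issue only describes the matching case, and SQL NULL semantics are ignored here.

```python
from typing import Any, Sequence


def in_operator(value: Any, operands: Sequence[Any]) -> bool:
    """Model of the proposed IN semantics (hypothetical helper, not Spark code).

    If the only right-hand operand is an array whose elements all have the
    same type as `value`, test membership in that array. Otherwise compare
    `value` against each listed operand, as IN does today.
    """
    if len(operands) == 1 and isinstance(operands[0], (list, tuple)):
        array = operands[0]
        if all(type(elem) is type(value) for elem in array):
            return value in array
        # Assumption: a type mismatch stays an error, as in current Spark.
        raise TypeError("array element type does not match the value type")
    return any(value == operand for operand in operands)


# select 5 in array(1,2,3): proposed result is false rather than an error
print(in_operator(5, [[1, 2, 3]]))
# select val from test where val in (array(1,2,3)), with val = 1
print(in_operator(1, [[1, 2, 3]]))
```

Under this model the array form gives the same answer as array_contains(array(1,2,3), val), which is the behaviour the issue asks IN to provide.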

      create table test (val int);
      insert into test values (1);
      select * from test;
      +------+
      | val  |
      +------+
      | 1    |
      +------+
      select val from test where array_contains(array(1,2,3), val);
      +------+
      | val  |
      +------+
      | 1    |
      +------+

      select val from test where val in (array(1,2,3));
      Error: org.apache.spark.sql.AnalysisException: cannot resolve '(test.`val` IN (array(1, 2, 3)))' due to data type mismatch: Arguments must be same type; line 1 pos 31;
      'Project ['val]
      +- 'Filter val#433 IN (array(1, 2, 3))
         +- MetastoreRelation test (state=,code=0)

      select val from test where val in (select array(1,2,3));
      Error: org.apache.spark.sql.AnalysisException: cannot resolve '(test.`val` = `array(1, 2, 3)`)' due to data type mismatch: differing types in '(test.`val` = `array(1, 2, 3)`)' (int and array<int>).;;
      'Project ['val]
      +- 'Filter predicate-subquery#434 (val#435 = array(1, 2, 3)#436)
         :  +- Project array(1, 2, 3) AS array(1, 2, 3)#436
         :     +- OneRowRelation$
         +- MetastoreRelation test (state=,code=0)

      select val from test where val in (select explode(array(1,2,3)));
      +------+
      | val  |
      +------+
      | 1    |
      +------+

      Note: See SPARK-19730 for how a predicate subquery breaks when applied to the DataSource API.

          People

            Assignee: Unassigned
            Reporter: Shawn Lavelle (azeroth2b)