  1. Spark
  2. SPARK-21228

InSet incorrect handling of structs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.1, 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      In InSet, hset can contain GenericInternalRows while child returns UnsafeRows (or vice versa). InSet uses hset.contains (in both doGenCode and eval), which always returns false in this case, even when the two rows hold the same values.
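      The mismatch can be illustrated without Spark. The sketch below is a simplified model, not Spark's actual classes: GenericRowModel and BinaryRowModel are hypothetical stand-ins for GenericInternalRow and UnsafeRow, and the model assumes that each representation only considers itself equal to rows of its own class.

```scala
// Simplified model of the bug. These classes are hypothetical stand-ins,
// not Spark's GenericInternalRow / UnsafeRow.
object RowMismatchModel {
  sealed trait RowModel { def fields: Seq[Long] }

  // Stand-in for a row stored as an array of objects.
  final class GenericRowModel(val fields: Seq[Long]) extends RowModel {
    override def equals(other: Any): Boolean = other match {
      case g: GenericRowModel => g.fields == fields
      case _                  => false // never equal to another representation
    }
    override def hashCode(): Int = fields.hashCode()
  }

  // Stand-in for a row stored in a binary format.
  final class BinaryRowModel(val fields: Seq[Long]) extends RowModel {
    override def equals(other: Any): Boolean = other match {
      case b: BinaryRowModel => b.fields == fields
      case _                 => false // never equal to another representation
    }
    override def hashCode(): Int = 31 * fields.hashCode() + 7
  }

  // The set is built from literals, so it holds the generic representation.
  val hset: Set[RowModel] = Set(
    new GenericRowModel(Seq(1L, 1L)),
    new GenericRowModel(Seq(2L, 2L)),
    new GenericRowModel(Seq(3L, 3L)))

  // The child expression produces the other representation.
  val childValue: RowModel = new BinaryRowModel(Seq(1L, 1L))

  def main(args: Array[String]): Unit = {
    // Same logical value, different representation: the lookup misses.
    println(hset.contains(childValue))
    // Same representation: the lookup hits.
    println(hset.contains(new GenericRowModel(Seq(1L, 1L))))
  }
}
```

      In this model the first lookup misses and the second hits, which mirrors the empty result in the reproduction.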

      The following code reproduces the problem:

      // The default threshold is 10, which would require a longer query text to reproduce.
      spark.conf.set("spark.sql.optimizer.inSetConversionThreshold", "2")

      spark.range(1, 10).selectExpr("named_struct('a', id, 'b', id) as a").createOrReplaceTempView("A")

      // The Aggregate here returns UnsafeRows, while the list of structs that becomes hset holds GenericInternalRows.
      sql("select * from (select min(a) as minA from A) A where minA in (named_struct('a', 1L, 'b', 1L),named_struct('a', 2L, 'b', 2L),named_struct('a', 3L, 'b', 3L))").show
      +----+
      |minA|
      +----+
      +----+
      

      In.doGenCode uses compareStructs and appears to work. In.eval might also be affected, but I have not found a way to reproduce it.

      // With a threshold of 3, the three-element list is no longer converted to InSet.
      spark.conf.set("spark.sql.optimizer.inSetConversionThreshold", "3")
      sql("select * from (select min(a) as minA from A) A where minA in (named_struct('a', 1L, 'b', 1L),named_struct('a', 2L, 'b', 2L),named_struct('a', 3L, 'b', 3L))").show
      
      +-----+
      | minA|
      +-----+
      |[1,1]|
      +-----+
      

      A fix could either convert between safe and unsafe rows inside InSet, or not trigger the InSet optimization at all in this case.
      Whether In.eval is affected still needs to be investigated.
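      The two options can be sketched as follows. This is a hypothetical model in plain Scala, not Spark's implementation and not necessarily the fix that was merged: RowModel, NormalizingInSet and useSetLookup are invented names, and the "list size greater than threshold" rule is an assumption that is merely consistent with the thresholds used in the reproduction above.

```scala
// Hypothetical sketch of the two proposed options; not Spark code.
object InSetFixSketch {
  // Stand-ins for two physical representations of the same struct value.
  // Neither overrides equals, so instances compare by reference only.
  sealed trait RowModel { def fields: Vector[Long] }
  final class GenericRowModel(val fields: Vector[Long]) extends RowModel
  final class BinaryRowModel(val fields: Vector[Long]) extends RowModel

  // Option 1: normalize both sides to one representation before the lookup.
  final class NormalizingInSet(values: Seq[RowModel]) {
    // Built once from the literal list, keyed by the canonical field values.
    private val canonical: Set[Vector[Long]] = values.map(_.fields).toSet
    def contains(probe: RowModel): Boolean = canonical.contains(probe.fields)
  }

  // Option 2: skip the set-based lookup for structs and fall back to
  // element-by-element comparison (assumed rule: size must exceed threshold).
  def useSetLookup(listSize: Int, threshold: Int, isStruct: Boolean): Boolean =
    listSize > threshold && !isStruct

  def main(args: Array[String]): Unit = {
    val inSet = new NormalizingInSet(Seq(
      new GenericRowModel(Vector(1L, 1L)),
      new GenericRowModel(Vector(2L, 2L)),
      new GenericRowModel(Vector(3L, 3L))))
    // The probe uses the other representation but is found after normalization.
    println(inSet.contains(new BinaryRowModel(Vector(1L, 1L))))
    // Three struct literals with threshold 2: the optimization is skipped.
    println(useSetLookup(listSize = 3, threshold = 2, isStruct = true))
  }
}
```

      Option 1 keeps the fast set lookup at the cost of a conversion per probed row; option 2 avoids the conversion but gives up the optimization for struct-typed lists.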

            People

            • Assignee: bograd Bogdan Raducanu
            • Reporter: bograd Bogdan Raducanu