Spark / SPARK-23316

AnalysisException after max iteration reached for IN query

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0, 2.4.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Query to reproduce:

      spark.range(10).where("(id,id) in (select id, null from range(3))").show
      
      18/02/02 11:32:31 WARN BaseSessionStateBuilder$$anon$1: Max iterations (100) reached for batch Resolution
      org.apache.spark.sql.AnalysisException: cannot resolve '(named_struct('id', `id`, 'id', `id`) IN (listquery()))' due to data type mismatch:
      The data type of one or more elements in the left hand side of an IN subquery
      is not compatible with the data type of the output of the subquery
      Mismatched columns:
      []
      Left side:
      [bigint, bigint].
      Right side:
      [bigint, bigint].;;
      

      The error message includes the last plan, which contains ~100 useless Project nodes.
      This does not happen in branch-2.2.
      The problem appears to be related to TypeCoercion, which makes a futile attempt to change nullability.
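
      To illustrate how a rule that never reaches its goal can exhaust the iteration
      limit and leave a stack of Projects behind, here is a minimal toy model. It is only
      a sketch: Plan, Relation, Project, futileRule, convergingRule and runToFixedPoint
      are hypothetical names invented for this example and are not Spark's Catalyst
      classes or the actual TypeCoercion rule.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
object FixedPointSketch {
  sealed trait Plan
  case object Relation extends Plan
  final case class Project(child: Plan) extends Plan

  // Counts how many Project nodes wrap the leaf relation.
  @annotation.tailrec
  def projectDepth(plan: Plan, acc: Int = 0): Int = plan match {
    case Project(child) => projectDepth(child, acc + 1)
    case Relation       => acc
  }

  // A rule that never reaches its goal: each pass adds another Project,
  // so the plan differs from the previous one on every iteration.
  val futileRule: Plan => Plan = plan => Project(plan)

  // A well-behaved rule for contrast: adds one Project, then leaves the plan alone.
  val convergingRule: Plan => Plan = {
    case p: Project => p
    case other      => Project(other)
  }

  // Applies `rule` until the plan stops changing or `maxIterations` is hit.
  // Returns the final plan and whether a fixed point was reached.
  def runToFixedPoint(start: Plan, rule: Plan => Plan, maxIterations: Int): (Plan, Boolean) = {
    var current = start
    var iteration = 0
    var converged = false
    while (!converged && iteration < maxIterations) {
      val next = rule(current)
      converged = next == current
      current = next
      iteration += 1
    }
    (current, converged)
  }
}
```

      In this model, FixedPointSketch.runToFixedPoint(Relation, futileRule, 100) stops
      without converging and returns a plan wrapped in 100 Projects, while the same call
      with convergingRule reaches a fixed point after adding a single Project.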

            People

            • Assignee: Bogdan Raducanu (bograd)
            • Reporter: Bogdan Raducanu (bograd)