Spark / SPARK-38404

Spark does not find CTE inside nested CTE


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.0, 3.2.1
    • Fix Version/s: 3.3.1, 3.4.0
    • Component/s: SQL
    • Labels: None
    • Environment: tested on:

      • macOS Monterey 12.2.1 (21D62)
      • python 3.9.10
      • pip 22.0.3
      • pyspark 3.2.0 and 3.2.1 (SQL query fails); pyspark 3.0.1 and 3.1.3 (SQL query works)

    Description

      Hello! 

      It seems that when a CTE is defined and then referenced inside a `WITH` clause nested in another CTE, Spark SQL treats the inner reference as a table or view. It cannot find one with that name, so it fails with `Table or view not found: <CTE name>`.

      Steps to reproduce

      1. Run `pip install pyspark==3.2.0` (the problem also occurs with 3.2.1)
      2. Start the pyspark console by typing `pyspark` in the terminal
      3. Run the following SQL with `spark.sql(sql)`

       

        WITH mock_cte__users    AS (
                 SELECT 1 AS id
             ),
             model_under_test          AS (
                   WITH users    AS (
                            SELECT *
                              FROM mock_cte__users
                        )
                 SELECT *
                   FROM users
             )
      SELECT *
        FROM model_under_test;

      Spark will fail with 

       

      pyspark.sql.utils.AnalysisException: Table or view not found: mock_cte__users; line 8 pos 29; 
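The steps above can be collected into a small helper for the `pyspark` console. This is only a sketch: `NESTED_CTE_SQL` and `run_repro` are hypothetical names introduced here, the query is re-indented (so the line and position in any error message will differ from the one quoted above), and `spark` is assumed to be the session object the console creates on startup.

```python
# The nested-CTE query from the report, re-indented and kept as a plain string.
NESTED_CTE_SQL = """
WITH mock_cte__users AS (
    SELECT 1 AS id
),
model_under_test AS (
    WITH users AS (
        SELECT *
        FROM mock_cte__users
    )
    SELECT *
    FROM users
)
SELECT *
FROM model_under_test
"""


def run_repro(spark):
    """Run the query on an existing session and report the outcome.

    `spark` is assumed to be the SparkSession provided by the `pyspark`
    console. Returns the collected rows on success, or None on failure.
    """
    try:
        rows = spark.sql(NESTED_CTE_SQL).collect()
    except Exception as exc:
        # The report shows an AnalysisException here on 3.2.0 and 3.2.1.
        print(f"query failed: {type(exc).__name__}: {exc}")
        return None
    print(f"query succeeded: {rows}")
    return rows
```

In the console, calling `run_repro(spark)` on each installed version makes it easy to compare the behavior reported for 3.0.1 and 3.1.3 against 3.2.0 and 3.2.1.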

      I don't know whether this is a regression or the expected behavior of the new 3.2.* versions. The fix for SPARK-36447, introduced in 3.2.0, might be related: https://issues.apache.org/jira/browse/SPARK-36447

       

       


          People

            Assignee: Peter Toth (petertoth)
            Reporter: Joan Heredia Rius (watxaut)
            Votes: 1
            Watchers: 5
