Spark / SPARK-21212

Can't use Count(*) with Order Clause


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: Windows; external data provided through data source API

    Description

      I don't think this should fail the query:

      Notes: VALUE is a column of table TABLE; column and table names have been redacted. I can produce a simplified test case if needed, but this is easy to reproduce.

      jdbc:hive2://user:port/> select count(*) from table where value between 1498240079000 and cast(now() as bigint)*1000 order by value;
      
      Error: org.apache.spark.sql.AnalysisException: cannot resolve '`value`' given input columns: [count(1)]; line 1 pos 113;
      'Sort ['value ASC NULLS FIRST], true
      +- Aggregate [count(1) AS count(1)#718L]
         +- Filter ((value#413L >= 1498240079000) && (value#413L <= (cast(current_timestamp() as bigint) * cast(1000 as bigint))))
            +- SubqueryAlias table
               +- Relation[field1#411L,field2#412,value#413L,field3#414,field4#415,field5#416,field6#417,field7#418,field8#419,field9#420] com.redacted@16004579 (state=,code=0)
      

      Arguably, the optimizer could ignore the ORDER BY clause here, since an ungrouped count(*) returns a single row anyway, but I leave that to more informed minds than my own.
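The "Not A Problem" resolution follows standard SQL semantics: ORDER BY is resolved against the output of the aggregation, and here that output contains only the `count(1)` column, so `value` is no longer in scope. A minimal sqlite3 sketch of the portable form, with the ORDER BY dropped (the `events` table and its contents are hypothetical, not from the original report):

```python
import sqlite3

# In-memory stand-in for the redacted table in the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (value INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(1,), (2,), (3,)])

# An ungrouped aggregate yields exactly one row, so ordering adds nothing;
# omitting the ORDER BY (or ordering by the aggregate expression itself)
# is the form that analyzers with strict resolution, like Spark's, accept.
count, = conn.execute(
    "SELECT count(*) FROM events WHERE value BETWEEN 1 AND 3"
).fetchone()
print(count)  # 3
```

Note that SQLite itself is more permissive than Spark about bare columns in aggregate queries, so this sketch only demonstrates the rewrite that works everywhere, not the original error.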

      Attachments

        Activity

          People

            Assignee: Unassigned
            Reporter: Shawn Lavelle (azeroth2b)
            Votes: 0
            Watchers: 1
