SPARK-4968: [SparkSQL] java.lang.UnsupportedOperationException when hive partition doesn't exist and order by and limit are used

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.1
    • Fix Version/s: 1.2.1, 1.3.0
    • Component/s: SQL
    • Labels: None
    • Environment:

      Spark 1.1.1
      Scala 2.10.2
      Hive metastore DB: PostgreSQL
      OS: Linux

      Description

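      Create a partitioned table, then run a query that filters on a partition which doesn't exist and that contains both ORDER BY and LIMIT. The query fails with java.lang.UnsupportedOperationException instead of returning an empty result.

      I am running the queries in a HiveContext.

      1. Create Hive table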

      create table if not exists testTable (ID1 BIGINT, ID2 BIGINT,Start_Time STRING, End_Time STRING) PARTITIONED BY (Region STRING,Market STRING)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      LINES TERMINATED BY '\n'
      STORED AS TEXTFILE;
      

      2. Create data file /tmp/input.txt

      1,2,"2014-11-01","2014-11-02"
      2,3,"2014-11-01","2014-11-02"
      3,4,"2014-11-01","2014-11-02"
      

      3. Load data into Hive

      LOAD DATA LOCAL INPATH '/tmp/input.txt' OVERWRITE INTO TABLE testTable PARTITION (Region="North", market='market1');
      

      4. Run query

      SELECT * FROM testTable WHERE market = 'market2' ORDER BY End_Time DESC LIMIT 100;
      
      
      Error trace
      java.lang.UnsupportedOperationException: empty collection
      	at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
      	at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
      	at scala.Option.getOrElse(Option.scala:120)
      	at org.apache.spark.rdd.RDD.reduce(RDD.scala:863)
      	at org.apache.spark.rdd.RDD.takeOrdered(RDD.scala:1136)
      	at org.apache.spark.sql.execution.TakeOrdered.executeCollect(basicOperators.scala:171)
      	at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
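
The trace shows RDD.takeOrdered calling RDD.reduce, which throws on an empty collection. The following is a minimal sketch of that failure shape in plain Scala collections, not Spark's actual code. It assumes that a query on a non-existent Hive partition yields an RDD with zero partitions, and it models partitions as a hypothetical `Seq[Seq[String]]`. The names `TakeOrderedSketch`, `takeOrderedUnsafe` and `takeOrderedSafe` are illustrative only.

```scala
object TakeOrderedSketch {
  // Hypothetical stand-in for an RDD: each inner Seq is one partition.
  type Partitions = Seq[Seq[String]]

  private val descending: Ordering[String] = Ordering[String].reverse

  // Per-partition top-k followed by an unguarded reduce.
  // With zero partitions there is nothing to reduce, so reduce throws
  // java.lang.UnsupportedOperationException.
  def takeOrderedUnsafe(parts: Partitions, k: Int): Seq[String] =
    parts
      .map(p => p.sorted(descending).take(k))
      .reduce((a, b) => (a ++ b).sorted(descending).take(k))

  // Guarded variant (an assumption about a reasonable fix, not the actual patch):
  // zero partitions simply means an empty result.
  def takeOrderedSafe(parts: Partitions, k: Int): Seq[String] =
    if (parts.isEmpty) Seq.empty else takeOrderedUnsafe(parts, k)

  def main(args: Array[String]): Unit = {
    val noPartitions: Partitions = Seq.empty
    try {
      takeOrderedUnsafe(noPartitions, 100)
    } catch {
      case e: UnsupportedOperationException =>
        println("unsafe variant failed: " + e.getClass.getName)
    }
    println("safe variant returned " + takeOrderedSafe(noPartitions, 100).size + " rows")
  }
}
```

In this sketch a partition that exists but holds no rows does not trigger the error, since each partition still contributes one (empty) intermediate result. Only the zero-partition case does, which matches the condition described in this report.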
      

            People

            • Assignee: Unassigned
            • Reporter: sb58 Shekhar Bansal
            • Votes: 2
            • Watchers: 2