  1. Spark
  2. SPARK-4213

SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: SQL
    • Labels: None
    • Environment: CDH5.2, Hive 0.13.1, Spark 1.2 snapshot (commit hash 76386e1a23c)

    Description

      When I issue an HQL query against a HiveContext and the predicate compares a column of string type using one of the LT, LTE, GT, or GTE operators (<, <=, >, >=), I get the following error:

      scala.MatchError: StringType (of class org.apache.spark.sql.catalyst.types.StringType$)

      Looking at the code in org.apache.spark.sql.parquet.ParquetFilters, StringType is absent from the corresponding functions for creating these filters.
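
The failure mode described above can be illustrated without Spark. The following is a minimal sketch, not the actual ParquetFilters code: the `DataType` hierarchy, the `lessThanFilter` function, and the returned strings are hypothetical stand-ins. It only shows how a pattern match that lacks a case for `StringType` throws `scala.MatchError` at runtime, and how adding such a case avoids it. It is not necessarily how the issue was resolved in Spark.

```scala
import scala.util.Try

object MatchErrorSketch {
  // Hypothetical stand-ins for Catalyst data types; not the real Spark classes.
  trait DataType
  case object IntegerType extends DataType
  case object LongType extends DataType
  case object StringType extends DataType

  // Mimics a filter factory whose match covers numeric types only.
  def lessThanFilter(dataType: DataType): String = dataType match {
    case IntegerType => "int less-than filter"
    case LongType    => "long less-than filter"
    // No case for StringType: passing it throws scala.MatchError.
  }

  // Same factory with a case for StringType added.
  def lessThanFilterWithString(dataType: DataType): String = dataType match {
    case IntegerType => "int less-than filter"
    case LongType    => "long less-than filter"
    case StringType  => "string less-than filter"
  }

  // True when the unpatched factory fails with a MatchError for StringType.
  def failsOnString: Boolean =
    Try(lessThanFilter(StringType)).failed.toOption.exists(_.isInstanceOf[MatchError])
}
```

Calling `MatchErrorSketch.lessThanFilter(MatchErrorSketch.StringType)` throws a `scala.MatchError`, while `lessThanFilterWithString` returns normally for the same argument.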

      To reproduce, I created the following table in a Hive 0.13.1 shell (in a database of my choosing):

      create table sparkbug (
      id int,
      event string
      ) stored as parquet;

      Insert some sample data:

      insert into table sparkbug select 1, '2011-06-18' from <some table> limit 1;
      insert into table sparkbug select 2, '2012-01-01' from <some table> limit 1;

      Launch a Spark shell and create a HiveContext connected to the metastore where the table above is located:

      import org.apache.spark.sql._
      import org.apache.spark.sql.hive.HiveContext

      // sc is the SparkContext provided by the Spark shell
      val hc = new HiveContext(sc)
      hc.setConf("spark.sql.shuffle.partitions", "10")
      // Read metastore Parquet tables through Spark SQL's built-in Parquet support
      hc.setConf("spark.sql.hive.convertMetastoreParquet", "true")
      hc.setConf("spark.sql.parquet.compression.codec", "snappy")
      import hc._

      // Range predicate (>=) on the string column
      hc.hql("select * from <db>.sparkbug where event >= '2011-12-01'")

      A scala.MatchError will appear in the output.
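
For reference, the result the query would be expected to return once string comparisons are supported can be worked out by hand. The sketch below assumes the predicate compares the strings lexicographically and uses plain Scala collections in place of the table. The object and value names are made up for illustration.

```scala
object ExpectedRows {
  // Sample rows inserted into sparkbug in the description: (id, event).
  val rows: Seq[(Int, String)] = Seq((1, "2011-06-18"), (2, "2012-01-01"))

  // Lexicographic string comparison, standing in for: where event >= '2011-12-01'
  val matching: Seq[(Int, String)] =
    rows.filter { case (_, event) => event >= "2011-12-01" }
}
```

Under that assumption only the row with id 2 satisfies the predicate, since '2011-06-18' sorts before '2011-12-01'.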

          People

            Assignee: Unassigned
            Reporter: Terry Siu (terry.siu)
            Votes: 0
            Watchers: 4