Spark / SPARK-6742

Spark pushes down filters in old parquet path that reference partitioning columns


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.1
    • Fix Version/s: 1.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Create a table with multiple fields, partitioned on the 'market' column, then run a query like:

      SELECT start_sp_time, end_sp_time, imsi, imei, enb_common_enbid FROM csl_data_parquet WHERE (((technology = 'FDD') AND (bandclass = '800') AND (region = 'R15') AND (market = 'LA metro')) OR ((technology = 'FDD') AND (bandclass = '1900') AND (region = 'R15') AND (market = 'Indianapolis'))) AND start_sp_time >= 1.4158368E9 AND end_sp_time < 1.4159232E9 AND dt >= '2014-11-13-00-00' AND dt < '2014-11-14-00-00' ORDER BY end_sp_time DESC LIMIT 100

      The OR filter references the partitioning column 'market', but it is pushed down to Parquet anyway, resulting in a column-not-found exception from Parquet. Partition columns live only in the directory layout, not in the Parquet files themselves, so Parquet cannot evaluate a filter against them.
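      The fix amounts to partitioning the predicates before pushdown: any filter that mentions a partition column must be evaluated by Spark against the directory layout, never handed to the Parquet reader. A minimal, dependency-free sketch of that check (all names and the predicate encoding are illustrative, not Spark's actual API):

      ```python
      # Hypothetical sketch of the guard described in SPARK-6742: before
      # handing a predicate to the Parquet reader, check whether it
      # references any partitioning column. Partition columns exist only
      # in the directory layout (e.g. .../market=LA metro/...), not in
      # the Parquet files, so pushing such a filter down makes Parquet
      # fail with a column-not-found error.

      PARTITION_COLUMNS = {"market", "dt"}

      def referenced_columns(pred):
          """Collect every column name a (nested) predicate mentions."""
          op = pred[0]
          if op in ("and", "or"):
              return referenced_columns(pred[1]) | referenced_columns(pred[2])
          # leaf comparison encoded as (operator, column, value)
          return {pred[1]}

      def can_push_down(pred, partition_columns=PARTITION_COLUMNS):
          """Safe to push to Parquet only if no partition column is touched."""
          return not (referenced_columns(pred) & partition_columns)

      # The OR branch from the reported query references 'market', so it
      # must be kept out of the pushed-down filter set, while the plain
      # time-range filter on a data column may be pushed down.
      or_branch = ("or",
                   ("and", ("=", "technology", "FDD"), ("=", "market", "LA metro")),
                   ("and", ("=", "technology", "FDD"), ("=", "market", "Indianapolis")))
      time_filter = (">=", "start_sp_time", 1.4158368e9)
      ```

      With this split, Spark prunes partitions from the OR branch itself and pushes only the data-column filters into the Parquet scan.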

      Attachments

        Activity

          People

            Assignee: Yash Datta (saucam)
            Reporter: Yash Datta (saucam)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: