SPARK-14172

Hive table partition predicate not passed down correctly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Incomplete
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: SQL

    Description

      When a Hive SQL query contains a nondeterministic expression such as rand(), the Spark plan does not push the partition predicate down to HiveTableScan. For example:

      -- consider following query which uses a random function to sample rows
      SELECT *
      FROM table_a
      WHERE partition_col = 'some_value'
      AND rand() < 0.01;
      

      Because the partition predicate is not pushed down to HiveTableScan, the query ends up scanning the data of every partition in the table instead of only the partition where partition_col = 'some_value'.
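      The expected behaviour can be sketched with a small toy model. This is not Spark's optimizer code; the names Conjunct and split_for_pruning are hypothetical and exist only for illustration. The idea, under that assumption, is that the AND-ed terms of the WHERE clause are split so that deterministic terms reading only partition columns are used for partition pruning, while nondeterministic terms such as rand() < 0.01 stay behind as a row-level filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conjunct:
    """One AND-ed term of a WHERE clause (hypothetical toy representation)."""
    sql: str
    columns: frozenset  # columns the term reads
    deterministic: bool = True


def split_for_pruning(conjuncts, partition_cols):
    """Split conjuncts into (pruning, residual).

    pruning:  deterministic terms that read only partition columns, so they
              can be checked against partition metadata before any scan.
    residual: everything else, evaluated row by row after the scan.
    """
    pruning, residual = [], []
    for c in conjuncts:
        if c.deterministic and c.columns and c.columns <= partition_cols:
            pruning.append(c)
        else:
            residual.append(c)
    return pruning, residual


# The query from this report, written as two conjuncts.
where = [
    Conjunct("partition_col = 'some_value'", frozenset({"partition_col"})),
    Conjunct("rand() < 0.01", frozenset(), deterministic=False),
]
pruning, residual = split_for_pruning(where, {"partition_col"})
print("used for partition pruning:", [c.sql for c in pruning])
print("kept as row-level filter:", [c.sql for c in residual])
```

      In this model the presence of a nondeterministic term does not stop the partition term from being used for pruning, which is the outcome the report asks for.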

            People

              Assignee: Unassigned
              Reporter: Yingji Zhang (yingjizhang)
              Votes: 1
              Watchers: 11
