Hive / HIVE-20210

Simple Fetch optimizer should lead to MapReduce when filter on non-partition column and conversion is minimal

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0, 2.3.1, 2.3.2
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Query Planning
    • Labels: None

    Description

      When hive.fetch.task.conversion is minimal, simple fetch should be used only when the query filters on partition columns or has no filter at all. However, the optimizer currently produces a simple fetch even when the filter is on a non-partition column. The unit test "select * from srcpart where key > 100 limit 10" in nonmr_fetch.q demonstrates this issue: the expected result of that test is itself incorrect (the plan should be MapReduce, but the test expects Simple Fetch).
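
A minimal sketch of how the behavior above could be checked by hand. It assumes the standard Hive test table srcpart, taken here to be partitioned by ds and hr with key as a regular column; the partition value in the first query is only illustrative.

```sql
-- Sketch only: srcpart is assumed to be partitioned by (ds, hr); key is a non-partition column.
set hive.fetch.task.conversion=minimal;

-- Filter on a partition column only: eligible for simple fetch under minimal.
explain select * from srcpart where ds = '2008-04-08' limit 10;

-- Filter on a non-partition column: per this report the plan should contain
-- a MapReduce stage, but affected versions plan it as a simple fetch.
explain select * from srcpart where key > 100 limit 10;
```

Comparing the two EXPLAIN outputs should show whether the second query is wrongly planned as a fetch-only stage.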

      This issue leads to a serious problem when the data size is huge. When conversion is more and the query filters on both a partition column and a non-partition column, hive.fetch.task.conversion.threshold is not checked, so the query can take hours to finish. This issue doesn't exist in 1.2.1; how it works there is a mystery.
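
The scenario with conversion set to more can be sketched the same way. This is an illustration under assumptions, not the reporter's exact reproduction: srcpart again stands in for a large partitioned table, and the threshold value shown is only an example.

```sql
-- Sketch only: table, partition value, and threshold value are illustrative assumptions.
set hive.fetch.task.conversion=more;
set hive.fetch.task.conversion.threshold=1073741824;  -- input size limit in bytes (example value)

-- Filters on both a partition column (ds) and a non-partition column (key).
-- Per this report, the threshold is not checked for this shape of query,
-- so a very large input is still read through a single fetch task.
explain select * from srcpart where ds = '2008-04-08' and key > 100 limit 10;
```

On a table whose input exceeds the threshold, the expectation described in this report is a MapReduce plan rather than a fetch-only plan.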

      Attachments

        1. HIVE-20210.patch (8 kB, Xilang Yan)
        2. HIVE-20210.4.patch (227 kB, Xilang Yan)
        3. HIVE-20210.3.patch (200 kB, Xilang Yan)
        4. HIVE-20210.2.patch (8 kB, Xilang Yan)

          People

            Assignee: Xilang Yan (xilangyan)
            Reporter: Xilang Yan (xilangyan)
            Votes: 0
            Watchers: 3