Hive / HIVE-4113

Optimize select count(1) with RCFile and Orc

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: File Formats
    • Labels: None

    Description

      select count(1) reads every column of every row when used with RCFile.

      "select count(1) from store_sales_10_rc" gives

      Job 0: Map: 5  Reduce: 1   Cumulative CPU: 31.73 sec   HDFS Read: 234914410 HDFS Write: 8 SUCCESS
      

      Whereas "select count(ss_sold_date_sk) from store_sales_10_rc;" reads far less:

      Job 0: Map: 5  Reduce: 1   Cumulative CPU: 29.75 sec   HDFS Read: 28145994 HDFS Write: 8 SUCCESS
      

      That is about 12% of the data read by the count(1) query (28145994 of 234914410 bytes).

      This was tracked down to the following code in RCFile.java:

            } else {
              // TODO: if no column name is specified e.g, in select count(1) from tt;
              // skip all columns, this should be distinguished from the case:
              // select * from tt;
              for (int i = 0; i < skippedColIDs.length; i++) {
                skippedColIDs[i] = false;
              }
      
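      The TODO in the excerpt above asks for two cases to be told apart: a query that needs no column data (select count(1)) and a query that needs all of it (select *). The following is a minimal sketch of one way to make that distinction. It is not the actual RCFile reader code or the committed patch. The class ColumnProjection, the method buildSkipMask, and the readAllColumns flag are hypothetical names used only for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch of a column skip mask; not taken from RCFile.java. */
final class ColumnProjection {

    /**
     * Builds a skip mask where true means "do not read this column".
     *
     * @param numColumns     number of columns in the table
     * @param neededColIDs   indices of columns the query reads; may be empty
     * @param readAllColumns assumed flag, true only for queries like "select * from tt"
     */
    static boolean[] buildSkipMask(int numColumns, int[] neededColIDs, boolean readAllColumns) {
        boolean[] skipped = new boolean[numColumns]; // defaults to all false (read everything)
        if (readAllColumns) {
            return skipped;
        }
        // Start from "skip everything", then un-skip only the requested columns.
        // An empty neededColIDs (e.g. count(1)) therefore skips every column.
        Arrays.fill(skipped, true);
        for (int id : neededColIDs) {
            if (id >= 0 && id < numColumns) {
                skipped[id] = false;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // select count(1): no columns needed
        System.out.println(Arrays.toString(buildSkipMask(3, new int[0], false)));
        // select count(col0): only column 0 needed
        System.out.println(Arrays.toString(buildSkipMask(3, new int[] {0}, false)));
        // select *: all columns needed
        System.out.println(Arrays.toString(buildSkipMask(3, new int[0], true)));
    }
}
```

      In this sketch an empty list of needed columns no longer falls through to "read everything"; the all-columns case has to be requested explicitly.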

      Attachments

        1. HIVE-4113-0.patch (55 kB, Brock Noland)
        2. HIVE-4113.patch (51 kB, Brock Noland)
        3. HIVE-4113.patch (55 kB, Brock Noland)
        4. HIVE-4113.9.patch (449 kB, Yin Huai)
        5. HIVE-4113.8.patch (130 kB, Yin Huai)
        6. HIVE-4113.7.patch (77 kB, Yin Huai)
        7. HIVE-4113.6.patch (61 kB, Yin Huai)
        8. HIVE-4113.5.patch (60 kB, Yin Huai)
        9. HIVE-4113.4.patch (64 kB, Yin Huai)
        10. HIVE-4113.3.patch (60 kB, Yin Huai)
        11. HIVE-4113.2.patch (54 kB, Yin Huai)
        12. HIVE-4113.11.patch (458 kB, Yin Huai)
        13. HIVE-4113.10.patch (449 kB, Yin Huai)
        14. HIVE-4113.1.patch (55 kB, Yin Huai)

            People

              Assignee: Yin Huai (yhuai)
              Reporter: Gopal Vijayaraghavan (gopalv)
              Votes: 0
              Watchers: 10
