IMPALA / IMPALA-2809

Improve ByteSwap with builtin functions, SSE, or AVX2


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels:
      None

      Description

      When the size does not match any of the switch cases, ByteSwap falls through to an inefficient byte-by-byte for loop:

          uint8_t* d = reinterpret_cast<uint8_t*>(dst);
          const uint8_t* s = reinterpret_cast<const uint8_t*>(src);
          for (int i = 0; i < len; ++i) {
            d[i] = s[len - i - 1];
          }
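
      One possible direction, shown only as a minimal sketch and not as the fix that was actually committed: reverse the buffer in 8-, 4-, and 2-byte chunks using the GCC/Clang builtins `__builtin_bswap64`, `__builtin_bswap32`, and `__builtin_bswap16`, so that lengths not covered by the switch avoid the per-byte loop. The name `ByteSwapChunked` is hypothetical, and the sketch assumes `dst` and `src` do not overlap.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Impala's actual code): writes the `len` bytes of
// `src` into `dst` in reverse order. `dst` and `src` must not overlap.
// Relies on the GCC/Clang byte-swap builtins.
inline void ByteSwapChunked(void* dst, const void* src, int len) {
  uint8_t* d = reinterpret_cast<uint8_t*>(dst);
  const uint8_t* s = reinterpret_cast<const uint8_t*>(src);

  // Take 8 bytes from the tail of the remaining source, reverse them in a
  // register, and append them to the front of the destination.
  while (len >= 8) {
    uint64_t v;
    std::memcpy(&v, s + len - 8, sizeof(v));
    v = __builtin_bswap64(v);
    std::memcpy(d, &v, sizeof(v));
    d += 8;
    len -= 8;
  }
  // At most 7 bytes remain; handle them as 4 + 2 + 1.
  if (len >= 4) {
    uint32_t v;
    std::memcpy(&v, s + len - 4, sizeof(v));
    v = __builtin_bswap32(v);
    std::memcpy(d, &v, sizeof(v));
    d += 4;
    len -= 4;
  }
  if (len >= 2) {
    uint16_t v;
    std::memcpy(&v, s + len - 2, sizeof(v));
    v = __builtin_bswap16(v);
    std::memcpy(d, &v, sizeof(v));
    d += 2;
    len -= 2;
  }
  if (len == 1) *d = *s;
}
```

      The `memcpy` calls keep the unaligned loads and stores well defined; compilers typically lower each one to a single move plus a `bswap` instruction. An SSSE3 or AVX2 byte shuffle could play the same role for 16- or 32-byte chunks, but that variant is not shown here.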
      

      9.5% of CPU time is spent in ByteSwap for the query below:

      select sum(l_quantity), sum(l_extendedprice), sum(l_discount), sum(l_tax) from lineitem;
      

      Function / Call Stack | Effective Time | Spin Time | Overhead Time | Module | Function (Full) | Source File | Start Address
      [Outside any known module] | 12.6% | 0s | 0s | | [Outside any known module] | | 0
      impala::BitUtil::ByteSwap | 9.5% | 0s | 0s | impalad | impala::BitUtil::ByteSwap(void*, void const*, int) | bit-util.h | 0xce7961
      snappy::RawUncompress | 8.1% | 0s | 0s | impalad | snappy::RawUncompress(snappy::Source*, char*) | | 0xd6e6a0
      impala::HdfsParquetScanner::BaseScalarColumnReader::NextLevels<(bool)0> | 5.8% | 0s | 0s | impalad | bool impala::HdfsParquetScanner::BaseScalarColumnReader::NextLevels<(bool)0>(void) | hdfs-parquet-scanner.cc | 0xce3bc0
      impala::HdfsParquetScanner::ScalarColumnReader<impala::DecimalValue<long>, (bool)1>::ReadNonRepeatedValue | 5.7% | 0s | 0s | impalad | impala::HdfsParquetScanner::ScalarColumnReader<impala::DecimalValue<long>, (bool)1>::ReadNonRepeatedValue(impala::MemPool*, impala::Tuple*, bool*) | hdfs-parquet-scanner.cc | 0xce78f0
      GetValue<int> | 5.2% | 0s | 0s | impalad | GetValue<int> | bit-stream-utils.inline.h | 0xce2938
      impala::DictDecoder<impala::DecimalValue<long>>::GetValue | 4.5% | 0s | 0s | impalad | impala::DictDecoder<impala::DecimalValue<long>>::GetValue(impala::DecimalValue<long>*) | dict-encoding.h | 0xce7a40
      ReadRow<false> | 4.3% | 0s | 0s | impalad | ReadRow<false> | hdfs-parquet-scanner.cc | 0xce5d0e
      impala::RleDecoder::Get<int> | 3.8% | 0s | 0s | impalad | bool impala::RleDecoder::Get<int>(int*) | rle-encoding.h | 0xce28f0
      impala::HdfsParquetScanner::AssembleRows<(bool)0, (bool)0> | 3.4% | 0s | 0s | impalad | bool impala::HdfsParquetScanner::AssembleRows<(bool)0, (bool)0>(impala::TupleDescriptor const*, std::vector<impala::HdfsParquetScanner::ColumnReader*, std::allocator<impala::HdfsParquetScanner::ColumnReader*>> const&, int, int, impala::CollectionValueBuilder*) | hdfs-parquet-scanner.cc | 0xce5b80
      impala::BitUtil::TrailingBits | 3.3% | 0s | 0s | impalad | impala::BitUtil::TrailingBits(unsigned long, int) | bit-util.h | 0xce2954
      Get<unsigned char> | 3.3% | 0s | 0s | impalad | Get<unsigned char> | rle-encoding.h | 0xce3cb0
      gettimeofday | 3.2% | 0s | 0s | libc-2.3.4.so | gettimeofday | | 0x81660

            People

            • Assignee:
              hayabusa Youwei Wang
              Reporter:
              zuowang_impala_c24e Zuo Wang
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved: