Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-3638

Remove lazy creation of LLVM codegen module

Agile BoardAttach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 2.2
    • Impala 2.8.0
    • Backend

    Description

      Query

      select *
      FROM (
        SELECT Rank() OVER (
            ORDER BY l_extendedprice
              ,l_quantity
              ,l_discount
              ,l_tax
            ) AS rank
        FROM lineitem
        WHERE l_shipdate < '1992-05-09'
        ) a
      WHERE rank < 10
      

      Plan

      03:SELECT
      |  predicates: rank() < 10
      |  hosts=20 per-host-mem=unavailable
      |  tuple-ids=6,5 row-size=66B cardinality=59999897
      |
      02:ANALYTIC
      |  functions: rank()
      |  order by: l_extendedprice ASC, l_quantity ASC, l_discount ASC, l_tax ASC
      |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
      |  hosts=20 per-host-mem=unavailable
      |  tuple-ids=6,5 row-size=66B cardinality=599998971
      |
      04:MERGING-EXCHANGE [UNPARTITIONED]
      |  order by: l_extendedprice ASC, l_quantity ASC, l_discount ASC, l_tax ASC
      |  hosts=20 per-host-mem=unavailable
      |  tuple-ids=6 row-size=58B cardinality=599998971
      |
      01:SORT
      |  order by: l_extendedprice ASC, l_quantity ASC, l_discount ASC, l_tax ASC
      |  hosts=20 per-host-mem=736.00MB
      |  tuple-ids=6 row-size=58B cardinality=599998971
      |
      00:SCAN HDFS [tpch_1000_decimal_parquet.lineitem, RANDOM]
         partitions=1/1 files=880 size=216.61GB
         predicates: l_shipdate < '1992-05-09'
         table stats: 5999989709 rows total
         column stats: all
         hosts=20 per-host-mem=440.00MB
         tuple-ids=0 row-size=58B cardinality=599998971
      

      Fragment not getting codegened

        Fragment start latencies: Count: 20, 25th %-ile: 1ms, 50th %-ile: 2ms, 75th %-ile: 2ms, 90th %-ile: 2ms, 95th %-ile: 3ms, 99.9th %-ile: 3ms
          Per Node Peak Memory Usage: d2406.halxg.cloudera.com:22000(1.38 GB) d2413.halxg.cloudera.com:22000(1.52 GB) d2405.halxg.cloudera.com:22000(1.34 GB) d2414.halxg.cloudera.com:22000(1.51 GB) d2416.halxg.cloudera.com:22000(1.48 GB) d2404.halxg.cloudera.com:22000(1.37 GB) d2420.halxg.cloudera.com:22000(1.40 GB) d2410.halxg.cloudera.com:22000(1.61 GB) d2412.halxg.cloudera.com:22000(1.22 GB) d2419.halxg.cloudera.com:22000(1.38 GB) d2409.halxg.cloudera.com:22000(1.41 GB) d2407.halxg.cloudera.com:22000(1.10 GB) d2411.halxg.cloudera.com:22000(1.34 GB) d2418.halxg.cloudera.com:22000(1.27 GB) d2408.halxg.cloudera.com:22000(1.56 GB) d2421.halxg.cloudera.com:22000(1.35 GB) d2403.halxg.cloudera.com:22000(1.28 GB) d2415.halxg.cloudera.com:22000(1.53 GB) d2417.halxg.cloudera.com:22000(1.31 GB) d2402.halxg.cloudera.com:22000(1.25 GB) 
           - FiltersReceived: 0 (0)
           - FinalizationTimer: 0.000ns
          Coordinator Fragment F01:(Total: 8m48s, non-child: 3.044ms, % non-child: 0.00%)
            MemoryUsage(16s000ms): 16.00 KB, 33.59 MB, 47.13 MB, 47.41 MB, 46.89 MB, 46.71 MB, 46.93 MB, 46.69 MB, 46.52 MB, 46.67 MB, 46.34 MB, 46.76 MB, 46.63 MB, 46.58 MB, 46.44 MB, 46.64 MB, 46.91 MB, 47.02 MB, 46.60 MB, 46.62 MB, 46.65 MB, 46.60 MB, 46.80 MB, 46.88 MB, 46.69 MB, 46.61 MB, 46.88 MB, 46.58 MB, 46.71 MB, 46.76 MB, 46.49 MB, 46.36 MB, 46.07 MB
             - AverageThreadTokens: 0.00 
             - BloomFilterBytes: 0
             - PeakMemoryUsage: 59.87 MB (62781648)
             - PerHostPeakMemUsage: 0
             - PrepareTime: 138.114us
             - RowsProduced: 9 (9)
             - TotalCpuTime: 8m49s
             - TotalNetworkReceiveTime: 0.000ns
             - TotalNetworkSendTime: 0.000ns
             - TotalStorageWaitTime: 0.000ns
            BlockMgr:
               - BlockWritesOutstanding: 0 (0)
               - BlocksCreated: 71 (71)
               - BlocksRecycled: 1.33K (1333)
               - BufferedPins: 0 (0)
               - BytesWritten: 0
               - MaxBlockSize: 8.00 MB (8388608)
               - MemoryLimit: 242.23 GB (260091396096)
               - PeakMemoryUsage: 736.00 MB (771751936)
               - TotalBufferWaitTime: 0.000ns
               - TotalEncryptionTime: 0.000ns
               - TotalIntegrityCheckTime: 0.000ns
               - TotalReadBlockTime: 0.000ns
            SELECT_NODE (id=3):(Total: 8m48s, non-child: 8s327ms, % non-child: 1.57%)
               - PeakMemoryUsage: 9.01 MB (9449472)
               - RowsReturned: 9 (9)
               - RowsReturnedRate: 0
            ANALYTIC_EVAL_NODE (id=2):(Total: 8m40s, non-child: 5m42s, % non-child: 65.78%)
               - EvaluationTime: 5m42s
               - GetNewBlockTime: 5.963ms
               - PeakMemoryUsage: 26.34 MB (27621376)
               - PinTime: 0.000ns
               - RowsReturned: 169.58M (169575660)
               - RowsReturnedRate: 325.75 K/sec
               - UnpinTime: 3.156ms
            EXCHANGE_NODE (id=4):(Total: 2m58s, non-child: 2m35s, % non-child: 87.26%)
              BytesReceived(16s000ms): 0, 26.87 MB, 135.18 MB, 253.93 MB, 373.50 MB, 493.48 MB, 613.34 MB, 733.28 MB, 852.89 MB, 972.60 MB, 1.07 GB, 1.18 GB, 1.30 GB, 1.42 GB, 1.53 GB, 1.65 GB, 1.77 GB, 1.89 GB, 2.00 GB, 2.12 GB, 2.24 GB, 2.35 GB, 2.47 GB, 2.59 GB, 2.70 GB, 2.82 GB, 2.93 GB, 3.05 GB, 3.17 GB, 3.28 GB, 3.40 GB, 3.52 GB, 3.64 GB
               - BytesReceived: 3.70 GB (3972098336)
               - ConvertRowBatchTime: 0.000ns
               - DeserializeRowBatchTimer: 15s052ms
               - FirstBatchArrivalWaitTime: 22s690ms
               - MergeGetNext: 2m35s
               - MergeGetNextBatch: 814.096ms
               - PeakMemoryUsage: 0
               - RowsReturned: 169.58M (169575660)
               - RowsReturnedRate: 952.03 K/sec
               - SendersBlockedTimer: 8m24s
               - SendersBlockedTotalTimer(*): 2h47m
      

      Wrapping the query in a count is enough to codegen the fragment with the analytic function

      select count(*) from (select *
      FROM (
        SELECT Rank() OVER (
            ORDER BY l_extendedprice
              ,l_quantity
              ,l_discount
              ,l_tax
            ) AS rank
        FROM lineitem
        WHERE l_shipdate < '1992-05-09'
        ) a
      WHERE rank < 10)a
      

      Fix should speedup the query by 2x

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            kwho Michael Ho
            mmokhtar Mostafa Mokhtar
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment