IMPALA-62

performance issue when sending data node-to-node


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Impala 0.5
    • Impala 0.7
    • None

    Description

      A query like:
      select c_custkey from customer order by 1 limit 1;
      must scan the data on every data node and then send all of the results to the coordinator for ordering. Assuming enough memory, the coordinator node should therefore be either CPU bound or inbound-network bound. In my observations, however, the coordinator's inbound network rate is only ~2 MB/s for this query, far below the 1 Gbps (~118 MB/s) rate of the wire. This bottleneck needs to be investigated.
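To put the gap described above in numbers, here is a rough sketch of the arithmetic. It assumes 1 Gbps means 10^9 bits per second and takes the ~118 MB/s usable rate and the ~2 MB/s observed rate directly from the description; the function name is illustrative only.

```python
# Rough arithmetic for the gap between the wire rate and the observed rate.
# Assumption: 1 Gbps = 1e9 bits/s. The 118 MB/s and 2 MB/s figures are
# taken from the issue description, not measured here.

def raw_line_rate_mb_per_s(gbps: float) -> float:
    """Raw line rate in decimal megabytes per second, ignoring protocol overhead."""
    return gbps * 1e9 / 8 / 1e6

raw_rate = raw_line_rate_mb_per_s(1.0)  # 125.0 MB/s before any overhead
usable_rate = 118.0                     # MB/s, as quoted in the description
observed_rate = 2.0                     # MB/s, approximate coordinator inbound rate

utilization = observed_rate / usable_rate
print(f"raw line rate: {raw_rate:.1f} MB/s")
print(f"utilization:   {utilization:.1%} of the usable wire rate")
```

Under these assumptions the coordinator is using under 2% of the available inbound bandwidth.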

      Performance data from the coordinator node. Note the KBIn and cpu values.

      $ collectl -oT -i2
      waiting for 2 second sample...
      
      #         <----CPU[HYPER]-----><----------Disks-----------><----------Network---------->
      #Time     cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut 
      15:20:03    0   0  1620   1711   2056      5    498    120   1933   2198     67     828 
      15:20:05    1   0  1806   1775   1072      3    490    118   2132   2414     61     838 
      15:20:07    0   0  1422   1574   1024      2    532    117   1781   2014     52     722 
      15:20:09    0   0  1699   1777   2112      5    494    119   2049   2325     58     808 
      15:20:11    0   0  1648   1705   1024      2    502    118   2177   2454     61     842 
      15:20:13    0   0  1553   1713   1024      2    494    118   1894   2140     66     810 
      15:20:15    0   0  1665   1759   2112      5    498    120   2172   2451     66     849 
      15:20:17    0   0  1827   2303   1024      2    498    118   2142   2431     63     842 
      15:20:19    0   0  1642   1669   2048      4    498    120   1948   2216     55     764 
      15:20:21    0   0  1698   1683   1024      2    514    119   2129   2412     60     828 
      15:20:23    0   0  1528   1693   1024      2    498    119   1955   2213     69     853 
      15:20:25    0   0  1701   1727   2112      5    492    119   2054   2327     57     795 
      15:20:27    0   0  1597   1710   1024      2    494    117   2140   2421     59     821 
      15:20:29    0   0  1579   1742   1024      2    502    120   2007   2271     62     783 
      15:20:31    0   0  1600   1716   2048      4    486    116   2096   2362     59     818 
      15:20:33    0   0  1520   1743   1024      2    494    119   1868   2113     66     813 
      15:20:35    0   0  1603   1774   2048      4    496    119   2104   2364     60     828 
      15:20:37    0   0  1600   1687   1088      3    498    116   2171   2447     60     842 
      15:20:39    0   0  1537   1701   1024      2    502    120   1920   2173     55     758 
      15:20:41    0   0  1605   1714   2048      4    498    119   2140   2412     60     828 
      15:20:43    0   0  1484   1618   1024      2    492    118   2062   2323     70     870 
      15:20:45    0   0  1619   1755   1024      2    496    119   1953   2200     62     790 
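The KBIn figure can be averaged with a few lines of Python. This is a minimal sketch that assumes whitespace-separated columns in the order shown in the header, and that collectl is reporting per-second rates (its default, as far as I know); only the first three sample rows are reproduced, and `mean_column` is a hypothetical helper.

```python
# Sketch: average one column of `collectl -oT` output.
# Assumptions: columns are whitespace separated and match the "#Time" header;
# values are already per-second rates. Only three sample rows are included.

SAMPLE = """\
#Time     cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut
15:20:03    0   0  1620   1711   2056      5    498    120   1933   2198     67     828
15:20:05    1   0  1806   1775   1072      3    490    118   2132   2414     61     838
15:20:07    0   0  1422   1574   1024      2    532    117   1781   2014     52     722
"""

def mean_column(text: str, column: str) -> float:
    """Return the mean of the named column over all data rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = next(ln for ln in lines if ln.startswith("#Time"))
    names = header.lstrip("#").split()
    idx = names.index(column)
    values = [float(ln.split()[idx]) for ln in lines if not ln.startswith("#")]
    return sum(values) / len(values)

kb_in = mean_column(SAMPLE, "KBIn")
print(f"mean KBIn: {kb_in:.0f} KB/s (~{kb_in / 1024:.2f} MB/s)")
```

Running the same helper over the full sample above, or with `"cpu"` as the column, gives the two values the report asks the reader to note.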
      

      Here is an example. Sending 30M bigint values (the only projected column) to the coordinator takes 4m29s:

      > select s_suppkey from supplier order by 1 limit 1;
      Query: select s_suppkey from supplier order by 1 limit 1
      Query finished, fetching results ...
      1
      Returned 1 row(s) in 269.39s
      
      > select count(*) from supplier;
      Query: select count(*) from supplier
      Query finished, fetching results ...
      30000000
      Returned 1 row(s) in 2.51s
      
      > select max(s_suppkey) from supplier;
      Query: select max(s_suppkey) from supplier
      Query finished, fetching results ...
      30000000
      Returned 1 row(s) in 2.46s
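The timings above can be turned into rates with a short calculation. This sketch assumes each row carries only the 8-byte BIGINT value, so it gives a lower bound on the bytes actually sent; the variable names are illustrative.

```python
# Sketch: convert the query timings above into row and payload rates.
# Assumption: only the 8-byte BIGINT value is counted per row, which ignores
# any serialization overhead added on the wire.

ROWS = 30_000_000
BIGINT_BYTES = 8

timings_s = {
    "order by 1 limit 1": 269.39,
    "count(*)": 2.51,
    "max(s_suppkey)": 2.46,
}

slow = timings_s["order by 1 limit 1"]
rows_per_s = ROWS / slow
payload_mb_per_s = rows_per_s * BIGINT_BYTES / 1e6
slowdown = slow / timings_s["count(*)"]

print(f"rows sent per second: {rows_per_s:,.0f}")
print(f"raw payload rate:     {payload_mb_per_s:.2f} MB/s")
print(f"slowdown vs count(*): {slowdown:.0f}x")
```

Since `count(*)` and `max(s_suppkey)` scan the same table in about 2.5 seconds, the scan itself does not appear to be the limiting step.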
      

      Here is the profile for the problem query:
      select s_suppkey from supplier order by 1 limit 1;

      Query (id=383aa85f9f154673:8c513bcc05d4c8e2):
         - PlanningTime: 3ms
        Query 383aa85f9f154673:8c513bcc05d4c8e2:(987ms 0.00%)
          Aggregate Profile:
          Coordinator Fragment:(4m28s 0.00%)
             - RowsProduced: 1
            CodeGen:
               - CodegenTime: 216K clock cycles
               - CompileTime: 60ms
               - LoadTime: 6ms
               - ModuleFileSize: 44.61 KB
            SORT_NODE (id=1):(4m28s 0.45%)
               - MemoryUsed: 0.00 
               - RowsReturned: 1
               - RowsReturnedRate: 0
            EXCHANGE_NODE (id=2):(4m27s 99.55%)
               - BytesReceived: 572.32 MB
               - ConvertRowBatchTime: 512ms
               - DeserializeRowBatchTimer: 382ms
               - MemoryUsed: 0.00 
               - RowsReturned: 30.00M
               - RowsReturnedRate: 112.19 K/sec
          Averaged Fragment 1:(79ms 0.00%)
            completion times: min:3m25s  max:4m28s  mean: 4m13s  stddev:18s567ms
            execution rates: min:1.92 MB/sec  max:2.40 MB/sec  mean:2.03 MB/sec  stddev:142.38 KB/sec
            split sizes:  min: 394.86 MB, max: 643.30 MB, avg: 517.19 MB, stddev: 62.19 MB
             - RowsProduced: 3.75M
            CodeGen:
               - CodegenTime: 1ms
               - CompileTime: 79ms
               - LoadTime: 5ms
               - ModuleFileSize: 44.61 KB
            DataStreamSender:
               - BytesSent: 71.54 MB
               - DataSinkTime: 4m13s
               - SerializeBatchTime: 193ms
               - ThriftTransmitTime: 4m12s
            HDFS_SCAN_NODE (id=0):(70ms 83.25%)
               - BytesRead: 517.19 MB
               - DelimiterParseTime: 656ms
               - MaterializeTupleTime: 136ms
               - MemoryUsed: 0.00 
               - PerDiskReadThroughput: 1.43 GB/sec
               - RowsReturned: 3.75M
               - RowsReturnedRate: 66.91 M/sec
               - ScanRangesComplete: 6
               - ScannerThreadsReadTime: 356ms
               - TotalReadThroughput: 2.04 MB/sec
          Fragment 1:
            Instance 383aa85f9f154673:8c513bcc05d4c8e4:(68ms 16.75%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/134.22M 1:1/3.40M 3:1/3.38M 6:1/134.22M 7:1/3.47M 8:1/1.59M 9:2/268.44M
               - RowsProduced: 3.80M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 75ms
                 - LoadTime: 6ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 72.58 MB
                 - DataSinkTime: 4m20s
                 - SerializeBatchTime: 207ms
                 - ThriftTransmitTime: 4m20s
              HDFS_SCAN_NODE (id=0):(56ms 83.25%)
                File Formats: TEXT/NONE:8 
                 - BytesRead: 523.28 MB
                 - DelimiterParseTime: 686ms
                 - MaterializeTupleTime: 136ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.37 GB/sec
                 - RowsReturned: 3.80M
                 - RowsReturnedRate: 66.89 M/sec
                 - ScanRangesComplete: 8
                 - ScannerThreadsReadTime: 371ms
                 - TotalReadThroughput: 2.00 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8e5:(178ms 1.04%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/3.44M 1:2/135.70M 3:2/268.44M 7:1/1.46M 8:1/1.56M 9:1/3.45M
               - RowsProduced: 2.86M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 81ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 54.55 MB
                 - DataSinkTime: 3m25s
                 - SerializeBatchTime: 153ms
                 - ThriftTransmitTime: 3m24s
              HDFS_SCAN_NODE (id=0):(176ms 98.96%)
                File Formats: TEXT/NONE:8 
                 - BytesRead: 394.86 MB
                 - DelimiterParseTime: 482ms
                 - MaterializeTupleTime: 96ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.61 GB/sec
                 - RowsReturned: 2.86M
                 - RowsReturnedRate: 16.19 M/sec
                 - ScanRangesComplete: 8
                 - ScannerThreadsReadTime: 239ms
                 - TotalReadThroughput: 1.92 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8e6:(59ms 17.60%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/134.22M 2:1/134.22M 7:2/137.63M 9:1/134.22M
               - RowsProduced: 3.73M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 80ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 71.20 MB
                 - DataSinkTime: 4m18s
                 - SerializeBatchTime: 198ms
                 - ThriftTransmitTime: 4m18s
              HDFS_SCAN_NODE (id=0):(49ms 82.40%)
                File Formats: TEXT/NONE:5 
                 - BytesRead: 515.25 MB
                 - DelimiterParseTime: 648ms
                 - MaterializeTupleTime: 136ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.35 GB/sec
                 - RowsReturned: 3.73M
                 - RowsReturnedRate: 75.78 M/sec
                 - ScanRangesComplete: 5
                 - ScannerThreadsReadTime: 373ms
                 - TotalReadThroughput: 1.99 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8e7:(51ms 21.08%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/134.22M 4:2/137.64M 6:1/134.22M 8:1/2.76M 9:1/134.22M
               - RowsProduced: 3.76M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 80ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 71.71 MB
                 - DataSinkTime: 4m20s
                 - SerializeBatchTime: 183ms
                 - ThriftTransmitTime: 4m19s
              HDFS_SCAN_NODE (id=0):(40ms 78.92%)
                File Formats: TEXT/NONE:6 
                 - BytesRead: 517.90 MB
                 - DelimiterParseTime: 641ms
                 - MaterializeTupleTime: 135ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.54 GB/sec
                 - RowsReturned: 3.76M
                 - RowsReturnedRate: 92.57 M/sec
                 - ScanRangesComplete: 6
                 - ScannerThreadsReadTime: 329ms
                 - TotalReadThroughput: 1.99 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8e8:(53ms 21.41%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 1:1/134.22M 2:2/137.68M 3:1/134.22M 9:1/134.22M 10:1/134.22M
               - RowsProduced: 4.66M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 81ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 88.88 MB
                 - DataSinkTime: 4m28s
                 - SerializeBatchTime: 232ms
                 - ThriftTransmitTime: 4m27s
              HDFS_SCAN_NODE (id=0):(42ms 78.59%)
                File Formats: TEXT/NONE:6 
                 - BytesRead: 643.30 MB
                 - DelimiterParseTime: 901ms
                 - MaterializeTupleTime: 177ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.29 GB/sec
                 - RowsReturned: 4.66M
                 - RowsReturnedRate: 110.87 M/sec
                 - ScanRangesComplete: 6
                 - ScannerThreadsReadTime: 487ms
                 - TotalReadThroughput: 2.40 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8e9:(67ms 15.97%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/396.66K 2:1/134.22M 5:2/137.66M 8:1/134.22M 9:1/134.22M
               - RowsProduced: 3.74M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 79ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 71.41 MB
                 - DataSinkTime: 4m18s
                 - SerializeBatchTime: 198ms
                 - ThriftTransmitTime: 4m18s
              HDFS_SCAN_NODE (id=0):(56ms 84.03%)
                File Formats: TEXT/NONE:6 
                 - BytesRead: 515.66 MB
                 - DelimiterParseTime: 621ms
                 - MaterializeTupleTime: 129ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.38 GB/sec
                 - RowsReturned: 3.74M
                 - RowsReturnedRate: 65.70 M/sec
                 - ScanRangesComplete: 6
                 - ScannerThreadsReadTime: 363ms
                 - TotalReadThroughput: 1.99 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8ea:(84ms 12.98%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 2:1/134.22M 7:2/137.66M 8:1/134.22M 9:1/134.22M
               - RowsProduced: 3.73M
              CodeGen:
                 - CodegenTime: 1ms
                 - CompileTime: 80ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 71.22 MB
                 - DataSinkTime: 4m18s
                 - SerializeBatchTime: 187ms
                 - ThriftTransmitTime: 4m18s
              HDFS_SCAN_NODE (id=0):(73ms 87.02%)
                File Formats: TEXT/NONE:5 
                 - BytesRead: 515.28 MB
                 - DelimiterParseTime: 632ms
                 - MaterializeTupleTime: 135ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.52 GB/sec
                 - RowsReturned: 3.73M
                 - RowsReturnedRate: 50.80 M/sec
                 - ScanRangesComplete: 5
                 - ScannerThreadsReadTime: 331ms
                 - TotalReadThroughput: 1.99 MB/sec
            Instance 383aa85f9f154673:8c513bcc05d4c8eb:(75ms 13.57%)
              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:2/268.44M 5:1/134.22M 6:1/134.22M
               - RowsProduced: 3.71M
              CodeGen:
                 - CodegenTime: 2ms
                 - CompileTime: 80ms
                 - LoadTime: 5ms
                 - ModuleFileSize: 44.61 KB
              DataStreamSender:
                 - BytesSent: 70.76 MB
                 - DataSinkTime: 4m16s
                 - SerializeBatchTime: 185ms
                 - ThriftTransmitTime: 4m16s
              HDFS_SCAN_NODE (id=0):(65ms 86.43%)
                File Formats: TEXT/NONE:4 
                 - BytesRead: 512.00 MB
                 - DelimiterParseTime: 640ms
                 - MaterializeTupleTime: 141ms
                 - MemoryUsed: 0.00 
                 - PerDiskReadThroughput: 1.39 GB/sec
                 - RowsReturned: 3.71M
                 - RowsReturnedRate: 56.51 M/sec
                 - ScanRangesComplete: 4
                 - ScannerThreadsReadTime: 359ms
                 - TotalReadThroughput: 2.00 MB/sec
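As a cross-check on the profile, the per-instance BytesSent counters can be summed and compared with the coordinator's BytesReceived. The sketch below copies the numbers from the profile above and assumes that "MB" in the profile means 2^20 bytes; that unit assumption only affects the bytes-per-row estimate.

```python
# Sketch: cross-check sender and receiver counters from the profile above.
# Assumption: "MB" in the profile is 2**20 bytes (affects bytes_per_row only).

bytes_sent_mb = [72.58, 54.55, 71.20, 71.71, 88.88, 71.41, 71.22, 70.76]
bytes_received_mb = 572.32   # EXCHANGE_NODE BytesReceived
rows_returned = 30_000_000   # EXCHANGE_NODE RowsReturned (30.00M)
exchange_time_s = 4 * 60 + 27  # EXCHANGE_NODE time, 4m27s

total_sent_mb = sum(bytes_sent_mb)
receive_rate_mb_per_s = bytes_received_mb / exchange_time_s
bytes_per_row = bytes_received_mb * 2**20 / rows_returned

print(f"total BytesSent:   {total_sent_mb:.2f} MB")
print(f"BytesReceived:     {bytes_received_mb:.2f} MB")
print(f"receive rate:      {receive_rate_mb_per_s:.2f} MB/s")
print(f"bytes per row:     {bytes_per_row:.1f}")
```

The sender and receiver totals agree to within rounding, and the resulting receive rate of roughly 2.1 MB/s is consistent with the KBIn values seen on the coordinator.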
      

          People

            henryr Henry Robinson
            grahn Greg Rahn