  1. IMPALA
  2. IMPALA-4392

Add HdfsTableSink memory consumption to the query profile

    Details

      Description

      Memory consumed by HdfsTableSink is tracked but not captured in the query profile, which makes it look as if there is untracked memory.

      From the Memz tab

      Process: Limit=201.73 GB Total=4.78 GB Peak=8.66 GB
        Free Disk IO Buffers: Total=1.40 GB Peak=1.84 GB
        RequestPool=root.mmokhtar: Total=2.90 GB Peak=7.80 GB
          Query(aa40c1753a40a8be:9c27185a00000000): Total=2.90 GB Peak=3.20 GB
            Fragment aa40c1753a40a8be:9c27185a00000006: Total=2.90 GB Peak=3.20 GB
              HDFS_SCAN_NODE (id=0): Total=1.51 GB Peak=1.97 GB
              HdfsTableSink: Total=1.38 GB Peak=1.38 GB
                HdfsTableSink Exprs: Total=124.02 MB Peak=124.02 MB
            Block Manager: Limit=161.38 GB Total=0 Peak=0
        RequestPool=root.jenkins: Total=0 Peak=284.69 MB
      

      From the query profile

              HdfsTableSink:(Total: 9m47s, non-child: 9m47s, % non-child: 100.00%)
                 - BytesWritten: 6.79 GB (7290559999)
                 - CompressTimer: 58s128ms
                 - EncodeTimer: 8m6s
                 - FilesCreated: 69 (69)
                 - FinalizePartitionFileTimer: 1m18s
                 - HdfsWriteTimer: 1m16s
                 - PartitionsCreated: 1 (1)
                 - RowsInserted: 426.90K (426901)
                 - TmpFileCreateTimer: 1s253ms
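
      As a rough cross-check of the Memz dump above, the totals of the fragment's direct children can be summed and compared with the fragment total. The sketch below assumes the `Name: Total=<value> <unit>` line format shown in the dump; the helper names `to_gb` and `parse_memz` are hypothetical and not part of Impala.

```python
import re

# Excerpt of the Memz dump, re-indented so the fragment is at column 0.
MEMZ = """\
Fragment aa40c1753a40a8be:9c27185a00000006: Total=2.90 GB Peak=3.20 GB
  HDFS_SCAN_NODE (id=0): Total=1.51 GB Peak=1.97 GB
  HdfsTableSink: Total=1.38 GB Peak=1.38 GB
    HdfsTableSink Exprs: Total=124.02 MB Peak=124.02 MB
"""

# Conversion factors from each unit to GB (binary units assumed).
UNITS = {"B": 1 / 1024**3, "KB": 1 / 1024**2, "MB": 1 / 1024, "GB": 1.0}


def to_gb(value, unit):
    """Convert a printed size such as ('124.02', 'MB') to GB."""
    return float(value) * UNITS[unit]


def parse_memz(text):
    """Return (indent, name, total_gb) for every tracker line."""
    rows = []
    for line in text.splitlines():
        m = re.match(r"(\s*)(.+?): Total=([\d.]+) (B|KB|MB|GB)\b", line)
        if m:
            rows.append((len(m.group(1)), m.group(2), to_gb(m.group(3), m.group(4))))
    return rows


rows = parse_memz(MEMZ)
fragment_total = rows[0][2]
# Direct children of the fragment sit one indentation level (2 spaces) deeper.
direct_children = [r for r in rows[1:] if r[0] == 2]
children_total = sum(r[2] for r in direct_children)
print(f"fragment={fragment_total:.2f} GB, children={children_total:.2f} GB")
```

      The scan node and the sink together account for the fragment total to within rounding, so the memory is tracked; only the sink's counter is absent from the profile.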
      

        Activity

        tarmstrong Tim Armstrong added a comment -

        I'm pretty sure this is a regression introduced by "IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks" - before that the MemTrackers were hooked up to the runtime profiles.

        mmokhtar Mostafa Mokhtar added a comment -

        Tim Armstrong

        Should the HdfsTableSink memory be tracked by the BlockMgr?

            Fragment F01:
              Instance 1b4cb27cecf8423b:5bcf9ac400000004 (host=vd0234.halxg.cloudera.com:22000):(Total: 715.072ms, non-child: 0.000ns, % non-child: 0.00%)
                MemoryUsage(4s000ms): 47.14 MB, 128.44 MB, 235.66 MB, 300.78 MB, 396.31 MB, 475.09 MB, 551.86 MB, 628.97 MB, 720.93 MB, 804.99 MB, 891.56 MB, 969.85 MB, 1.03 GB, 1.11 GB, 1.19 GB, 1.26 GB, 1.32 GB, 1.38 GB, 1.44 GB, 1.50 GB, 1.55 GB, 1.60 GB, 1.65 GB, 1.71 GB, 1.78 GB, 1.86 GB, 1.95 GB, 2.03 GB, 2.11 GB, 2.20 GB, 2.27 GB, 2.35 GB, 2.42 GB, 2.50 GB, 2.57 GB, 2.59 GB, 2.60 GB, 2.61 GB, 2.66 GB, 2.80 GB, 2.94 GB, 3.09 GB, 3.25 GB, 3.38 GB, 3.51 GB, 3.61 GB, 3.76 GB
                ThreadUsage(4s000ms): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
                 - AverageThreadTokens: 1.00 
                 - BloomFilterBytes: 0
                 - PeakMemoryUsage: 3.86 GB (4143801672)
                 - PerHostPeakMemUsage: 3.86 GB (4143801672)
                 - RowsProduced: 31.74M (31738704)
                 - TotalCpuTime: 2m42s
                 - TotalNetworkReceiveTime: 26s562ms
                 - TotalNetworkSendTime: 0.000ns
                 - TotalStorageWaitTime: 0.000ns
                Fragment Instance Lifecycle Timings:
                   - ExecTime: 3m8s
                     - ExecTreeExecTime: 26s612ms
                   - OpenTime: 692.279ms
                     - ExecTreeOpenTime: 691.825ms
                   - PrepareTime: 22.763ms
                     - ExecTreePrepareTime: 22.082ms
                BlockMgr:
                   - BlockWritesOutstanding: 0 (0)
                   - BlocksCreated: 0 (0)
                   - BlocksRecycled: 0 (0)
                   - BufferedPins: 0 (0)
                   - BytesWritten: 0
                   - MaxBlockSize: 8.00 MB (8388608)
                   - MemoryLimit: 161.38 GB (173281017856)
                   - PeakMemoryUsage: 0
                   - ScratchFileUsedBytes: 0
                   - TotalBufferWaitTime: 0.000ns
                   - TotalEncryptionTime: 0.000ns
                   - TotalReadBlockTime: 0.000ns
                CodeGen:(Total: 22.012ms, non-child: 22.012ms, % non-child: 100.00%)
                   - CodegenTime: 0.000ns
                   - CompileTime: 0.000ns
                   - LoadTime: 0.000ns
                   - ModuleBitcodeSize: 1.90 MB (1993816)
                   - NumFunctions: 0 (0)
                   - NumInstructions: 0 (0)
                   - OptimizationTime: 0.000ns
                   - PrepareTime: 21.512ms
                HdfsTableSink:(Total: 2m41s, non-child: 2m41s, % non-child: 100.00%)
                   - BytesWritten: 1.91 GB (2054317637)
                   - CompressTimer: 5s824ms
                   - EncodeTimer: 1m34s
                   - FilesCreated: 235 (235)
                   - FinalizePartitionFileTimer: 36s263ms
                   - HdfsWriteTimer: 17s952ms
                   - PartitionsCreated: 235 (235)
                   - RowsInserted: 31.74M (31738704)
                   - TmpFileCreateTimer: 4s836ms
                EXCHANGE_NODE (id=1):(Total: 27s272ms, non-child: 2s708ms, % non-child: 9.93%)
                  BytesReceived(4s000ms): 9.11 MB, 67.71 MB, 137.14 MB, 209.44 MB, 296.60 MB, 375.02 MB, 447.38 MB, 521.17 MB, 597.50 MB, 670.74 MB, 746.97 MB, 822.89 MB, 892.92 MB, 967.01 MB, 1.02 GB, 1.08 GB, 1.14 GB, 1.19 GB, 1.25 GB, 1.30 GB, 1.35 GB, 1.40 GB, 1.44 GB, 1.49 GB, 1.55 GB, 1.61 GB, 1.69 GB, 1.76 GB, 1.84 GB, 1.91 GB, 1.99 GB, 2.05 GB, 2.12 GB, 2.21 GB, 2.29 GB, 2.32 GB, 2.32 GB, 2.33 GB, 2.34 GB, 2.34 GB, 2.34 GB, 2.34 GB, 2.34 GB, 2.34 GB, 2.34 GB, 2.34 GB, 2.34 GB
                   - BytesReceived: 2.34 GB (2512691829)
                   - ConvertRowBatchTime: 590.198ms
                   - DeserializeRowBatchTimer: 5s236ms
                   - FirstBatchArrivalWaitTime: 691.815ms
                   - PeakMemoryUsage: 0
                   - RowsReturned: 31.74M (31738704)
                   - RowsReturnedRate: 1.16 M/sec
                   - SendersBlockedTimer: 43s782ms
                   - SendersBlockedTotalTimer(*): 4m27s
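
        One way to spot the symptom in a profile like the one above is to list the sections that carry no `PeakMemoryUsage` counter. This is a minimal sketch that assumes the text layout shown in the profile (a `Name:(Total: ...)` header followed by `- Counter: value` lines); the function name `sections_missing_counter` is hypothetical, and the excerpt is abbreviated.

```python
import re

# Abbreviated excerpt of the profile: two sections, a few counters each.
PROFILE = """\
HdfsTableSink:(Total: 2m41s, non-child: 2m41s, % non-child: 100.00%)
   - BytesWritten: 1.91 GB (2054317637)
   - RowsInserted: 31.74M (31738704)
EXCHANGE_NODE (id=1):(Total: 27s272ms, non-child: 2s708ms, % non-child: 9.93%)
   - BytesReceived: 2.34 GB (2512691829)
   - PeakMemoryUsage: 0
"""


def sections_missing_counter(text, counter="PeakMemoryUsage"):
    """Return names of profile sections that do not list `counter`."""
    sections = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*(.+?):\(Total:", line)
        if header:
            # A new section starts; remember its name and collect counters.
            current = header.group(1)
            sections[current] = set()
            continue
        item = re.match(r"\s*- ([\w()*]+):", line)
        if item and current is not None:
            sections[current].add(item.group(1))
    return [name for name, counters in sections.items() if counter not in counters]


missing = sections_missing_counter(PROFILE)
print(missing)
```

        In this excerpt the exchange node reports the counter (with a value of 0) while the sink does not report it at all.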
        
        tarmstrong Tim Armstrong added a comment -

        No, it should appear under HdfsTableSink, and did before my change.

        tarmstrong Tim Armstrong added a comment -

        Change subject: IMPALA-4392: restore PeakMemoryUsage to DataSink profiles
        ......................................................................

        IMPALA-4392: restore PeakMemoryUsage to DataSink profiles

        The join build sink patches refactored the DataSink interface and
        inadvertently removed this counter from the profile.
        The problem was that the sink MemTracker was not initialized with the
        sink's profile.

        The fix is for the sink to create the MemTracker itself.

        Testing:
        Ran core tests. Manually checked profile to make sure the counter
        appeared in HdfsTableSink, DataStreamSender, etc.

        Change-Id: Iaa5db623a84c47d5904033ec26aece74f500a2c9
        Reviewed-on: http://gerrit.cloudera.org:8080/4969
        Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
        Tested-by: Internal Jenkins
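
        The cause and fix described in the commit message can be illustrated with a toy model. This is not Impala's C++ code: the classes below are simplified stand-ins that reuse the names `RuntimeProfile`, `MemTracker`, and `DataSink` for readability, and their constructors and methods are assumptions made for this sketch. The point is only that a tracker created without the sink's profile never publishes `PeakMemoryUsage` there, whereas a sink that creates its own tracker always does.

```python
class RuntimeProfile:
    """Toy stand-in for a runtime profile: a named bag of counters."""

    def __init__(self, name):
        self.name = name
        self.counters = {}


class MemTracker:
    """Toy tracker that publishes PeakMemoryUsage only when given a profile."""

    def __init__(self, label, profile=None, parent=None):
        self.label = label
        self.profile = profile
        self.parent = parent
        self.consumption = 0
        self.peak = 0
        if profile is not None:
            profile.counters["PeakMemoryUsage"] = 0

    def consume(self, num_bytes):
        # Charge this tracker and every ancestor, updating peaks as we go.
        tracker = self
        while tracker is not None:
            tracker.consumption += num_bytes
            tracker.peak = max(tracker.peak, tracker.consumption)
            if tracker.profile is not None:
                tracker.profile.counters["PeakMemoryUsage"] = tracker.peak
            tracker = tracker.parent


class DataSink:
    """Toy sink that creates its own tracker, wired to its own profile."""

    def __init__(self, name, parent_tracker):
        self.profile = RuntimeProfile(name)
        self.mem_tracker = MemTracker(name, profile=self.profile, parent=parent_tracker)


fragment_tracker = MemTracker("Fragment")

# Regression shape: the tracker is built elsewhere, without the sink's profile.
orphan_profile = RuntimeProfile("HdfsTableSink")
orphan_tracker = MemTracker("HdfsTableSink", parent=fragment_tracker)
orphan_tracker.consume(1024)

# Fix shape: the sink builds the tracker itself, so the counter is registered.
sink = DataSink("HdfsTableSink", fragment_tracker)
sink.mem_tracker.consume(2048)

print(orphan_profile.counters)
print(sink.profile.counters)
```

        In both cases the fragment-level tracker still sees the memory, which matches the report: the consumption shows up under Memz but the counter is missing from the sink's profile.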


          People

          • Assignee:
            tarmstrong Tim Armstrong
            Reporter:
            mmokhtar Mostafa Mokhtar