Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-7078

Selective Avro scans of wide tables use more memory than necessary

    Details

      Description

      We have some anecdotal evidence that this got worse somewhere between 2.7 and 2.10, but it may be because of a change in timing rather than a direct consequence of a code change. The scan uses ~500MB of memory, but should probably be running with less.

      Analysis revealed that most of the memory was from the tuple data buffers, which were multiple mb because they had space for 1024 wide rows.

      There are a couple of things we could do:

      • Directly improve Avro selective scans to avoid transferring excessive memory (i.e. something analogous to what IMPALA-4923 did for Parquet)
      • Tweak the row batch queueing to reduce unnecessary queueing. If the consumer is running slow, lots of batches pile up in the queue. We saw the regression on a system with 24 disks, so the queue was ~280 entries, which is excessive.

      Repro is as follows:

      -- In Hive
      create table wide_250_cols (
        col0 STRING, col1 STRING, col2 STRING, col3 STRING, col4 STRING, col5 STRING,
        col6 STRING, col7 STRING, col8 STRING, col9 STRING, col10 STRING, col11 STRING,
        col12 STRING, col13 STRING, col14 STRING, col15 STRING, col16 STRING, col17 STRING,
        col18 STRING, col19 STRING, col20 STRING, col21 STRING, col22 STRING, col23 STRING,
        col24 STRING, col25 STRING, col26 STRING, col27 STRING, col28 STRING, col29 STRING,
        col30 STRING, col31 STRING, col32 STRING, col33 STRING, col34 STRING, col35 STRING,
        col36 STRING, col37 STRING, col38 STRING, col39 STRING, col40 STRING, col41 STRING,
        col42 STRING, col43 STRING, col44 STRING, col45 STRING, col46 STRING, col47 STRING,
        col48 STRING, col49 STRING, col50 STRING, col51 STRING, col52 STRING, col53 STRING,
        col54 STRING, col55 STRING, col56 STRING, col57 STRING, col58 STRING, col59 STRING,
        col60 STRING, col61 STRING, col62 STRING, col63 STRING, col64 STRING, col65 STRING,
        col66 STRING, col67 STRING, col68 STRING, col69 STRING, col70 STRING, col71 STRING,
        col72 STRING, col73 STRING, col74 STRING, col75 STRING, col76 STRING, col77 STRING,
        col78 STRING, col79 STRING, col80 STRING, col81 STRING, col82 STRING, col83 STRING,
        col84 STRING, col85 STRING, col86 STRING, col87 STRING, col88 STRING, col89 STRING,
        col90 STRING, col91 STRING, col92 STRING, col93 STRING, col94 STRING, col95 STRING,
        col96 STRING, col97 STRING, col98 STRING, col99 STRING, col100 STRING, col101 STRING,
        col102 STRING, col103 STRING, col104 STRING, col105 STRING, col106 STRING, col107 STRING,
        col108 STRING, col109 STRING, col110 STRING, col111 STRING, col112 STRING, col113 STRING,
        col114 STRING, col115 STRING, col116 STRING, col117 STRING, col118 STRING, col119 STRING,
        col120 STRING, col121 STRING, col122 STRING, col123 STRING, col124 STRING, col125 STRING,
        col126 STRING, col127 STRING, col128 STRING, col129 STRING, col130 STRING, col131 STRING,
        col132 STRING, col133 STRING, col134 STRING, col135 STRING, col136 STRING, col137 STRING,
        col138 STRING, col139 STRING, col140 STRING, col141 STRING, col142 STRING, col143 STRING,
        col144 STRING, col145 STRING, col146 STRING, col147 STRING, col148 STRING, col149 STRING,
        col150 STRING, col151 STRING, col152 STRING, col153 STRING, col154 STRING, col155 STRING,
        col156 STRING, col157 STRING, col158 STRING, col159 STRING, col160 STRING, col161 STRING,
        col162 STRING, col163 STRING, col164 STRING, col165 STRING, col166 STRING, col167 STRING,
        col168 STRING, col169 STRING, col170 STRING, col171 STRING, col172 STRING, col173 STRING,
        col174 STRING, col175 STRING, col176 STRING, col177 STRING, col178 STRING, col179 STRING,
        col180 STRING, col181 STRING, col182 STRING, col183 STRING, col184 STRING, col185 STRING,
        col186 STRING, col187 STRING, col188 STRING, col189 STRING, col190 STRING, col191 STRING,
        col192 STRING, col193 STRING, col194 STRING, col195 STRING, col196 STRING, col197 STRING,
        col198 STRING, col199 STRING, col200 STRING, col201 STRING, col202 STRING, col203 STRING,
        col204 STRING, col205 STRING, col206 STRING, col207 STRING, col208 STRING, col209 STRING,
        col210 STRING, col211 STRING, col212 STRING, col213 STRING, col214 STRING, col215 STRING,
        col216 STRING, col217 STRING, col218 STRING, col219 STRING, col220 STRING, col221 STRING,
        col222 STRING, col223 STRING, col224 STRING, col225 STRING, col226 STRING, col227 STRING,
        col228 STRING, col229 STRING, col230 STRING, col231 STRING, col232 STRING, col233 STRING,
        col234 STRING, col235 STRING, col236 STRING, col237 STRING, col238 STRING, col239 STRING,
        col240 STRING, col241 STRING, col242 STRING, col243 STRING, col244 STRING, col245 STRING,
        col246 STRING, col247 STRING, col248 STRING, col249 STRING
      ) stored as avro;
      
      -- In Hive
      SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
      SET mapred.output.compression.type=BLOCK;
      SET hive.exec.compress.output=true;
      SET avro.output.codec=snappy;
      
      insert into wide_250_cols
      select
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment
       
      from tpch.lineitem;
      
      -- In Impala
      invalidate metadata wide_250_cols;
      set num_scanner_threads=0; set batch_size=0; set live_summary=1; use default;
      select * from wide_250_cols where col225 = 'foo' order by col244; summary;
      /*
      +---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
      | Operator        	| #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail            	|
      +---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
      | 02:MERGING-EXCHANGE | 1  	| 0ns  	| 0ns  	| 0 	| -1     	| 0 B   	| 0 B       	| UNPARTITIONED     	|
      | 01:SORT         	| 1  	| 0ns  	| 0ns  	| 0 	| -1     	| 24.52 MB  | 12.00 MB  	|                   	|
      | 00:SCAN HDFS    	| 1  	| 68.63s   | 68.63s   | 0 	| -1     	| 433.95 MB | 704.00 MB 	| default.wide_250_cols |
      +---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
      - Setting batch_size reduces peak memory
      - Setting num_scanner_threads to lower value reduces peak memory
      - Setting --max_row_batches lower reduces peak memory.
      - Setting --max_row_batches higher increase peak memory.
      - Logging the batches added to the queue shows they have 3.94MB of memory in the MemPool
        and sometimes 1 8mb I/O buffer.
      4 STRING, col15 STRING, col16 STRING, col17 STRING,
        col18 STRING, col19 STRING, col20 STRING, col21 STRING, col22 STRING, col23 STRING,
        col24 STRING, col25 STRING, col26 STRING, col27 STRING, col28 STRING, col29 STRING,
        col30 STRING, col31 STRING, col32 STRING, col33 STRING, col34 STRING, col35 STRING,
        col36 STRING, col37 STRING, col38 STRING, col39 STRING, col40 STRING, col41 STRING,
        col42 STRING, col43 STRING, col44 STRING, col45 STRING, col46 STRING, col47 STRING,
        col48 STRING, col49 STRING, col50 STRING, col51 STRING, col52 STRING, col53 STRING,
        col54 STRING, col55 STRING, col56 STRING, col57 STRING, col58 STRING, col59 STRING,
        col60 STRING, col61 STRING, col62 STRING, col63 STRING, col64 STRING, col65 STRING,
        col66 STRING, col67 STRING, col68 STRING, col69 STRING, col70 STRING, col71 STRING,
        col72 STRING, col73 STRING, col74 STRING, col75 STRING, col76 STRING, col77 STRING,
        col78 STRING, col79 STRING, col80 STRING, col81 STRING, col82 STRING, col83 STRING,
        col84 STRING, col85 STRING, col86 STRING, col87 STRING, col88 STRING, col89 STRING,
        col90 STRING, col91 STRING, col92 STRING, col93 STRING, col94 STRING, col95 STRING,
        col96 STRING, col97 STRING, col98 STRING, col99 STRING, col100 STRING, col101 STRING,
        col102 STRING, col103 STRING, col104 STRING, col105 STRING, col106 STRING, col107 STRING,
        col108 STRING, col109 STRING, col110 STRING, col111 STRING, col112 STRING, col113 STRING,
        col114 STRING, col115 STRING, col116 STRING, col117 STRING, col118 STRING, col119 STRING,
        col120 STRING, col121 STRING, col122 STRING, col123 STRING, col124 STRING, col125 STRING,
        col126 STRING, col127 STRING, col128 STRING, col129 STRING, col130 STRING, col131 STRING,
        col132 STRING, col133 STRING, col134 STRING, col135 STRING, col136 STRING, col137 STRING,
        col138 STRING, col139 STRING, col140 STRING, col141 STRING, col142 STRING, col143 STRING,
        col144 STRING, col145 STRING, col146 STRING, col147 STRING, col148 STRING, col149 STRING,
        col150 STRING, col151 STRING, col152 STRING, col153 STRING, col154 STRING, col155 STRING,
        col156 STRING, col157 STRING, col158 STRING, col159 STRING, col160 STRING, col161 STRING,
        col162 STRING, col163 STRING, col164 STRING, col165 STRING, col166 STRING, col167 STRING,
        col168 STRING, col169 STRING, col170 STRING, col171 STRING, col172 STRING, col173 STRING,
        col174 STRING, col175 STRING, col176 STRING, col177 STRING, col178 STRING, col179 STRING,
        col180 STRING, col181 STRING, col182 STRING, col183 STRING, col184 STRING, col185 STRING,
        col186 STRING, col187 STRING, col188 STRING, col189 STRING, col190 STRING, col191 STRING,
        col192 STRING, col193 STRING, col194 STRING, col195 STRING, col196 STRING, col197 STRING,
        col198 STRING, col199 STRING, col200 STRING, col201 STRING, col202 STRING, col203 STRING,
        col204 STRING, col205 STRING, col206 STRING, col207 STRING, col208 STRING, col209 STRING,
        col210 STRING, col211 STRING, col212 STRING, col213 STRING, col214 STRING, col215 STRING,
        col216 STRING, col217 STRING, col218 STRING, col219 STRING, col220 STRING, col221 STRING,
        col222 STRING, col223 STRING, col224 STRING, col225 STRING, col226 STRING, col227 STRING,
        col228 STRING, col229 STRING, col230 STRING, col231 STRING, col232 STRING, col233 STRING,
        col234 STRING, col235 STRING, col236 STRING, col237 STRING, col238 STRING, col239 STRING,
        col240 STRING, col241 STRING, col242 STRING, col243 STRING, col244 STRING, col245 STRING,
        col246 STRING, col247 STRING, col248 STRING, col249 STRING
      ) stored as avro;
      
      -- In Hive
      SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
      SET mapred.output.compression.type=BLOCK;
      insert into wide_250_cols
      select
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment, l_comment,
        l_comment, l_comment, l_comment, l_comment, l_comment
       
      from tpch.lineitem;
      
      -- In Impala
      invalidate metadata wide_250_cols;
      set num_scanner_threads=0; set batch_size=0; set live_summary=1; use default;
      select * from wide_250_cols where col225 = 'foo' order by col244; summary;
      /*
      +---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
      | Operator        	| #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail            	|
      +---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
      | 02:MERGING-EXCHANGE | 1  	| 0ns  	| 0ns  	| 0 	| -1     	| 0 B   	| 0 B       	| UNPARTITIONED     	|
      | 01:SORT         	| 1  	| 0ns  	| 0ns  	| 0 	| -1     	| 24.52 MB  | 12.00 MB  	|                   	|
      | 00:SCAN HDFS    	| 1  	| 68.63s   | 68.63s   | 0 	| -1     	| 433.95 MB | 704.00 MB 	| default.wide_250_cols |
      +---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
      - Setting batch_size reduces peak memory
      - Setting num_scanner_threads to lower value reduces peak memory
      - Setting --max_row_batches lower reduces peak memory.
      - Setting --max_row_batches higher increase peak memory.
      - Logging the batches added to the queue shows they have 3.94MB of memory in the MemPool
        and sometimes 1 8mb I/O buffer.
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                tarmstrong Tim Armstrong
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: