Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
Impala 2.9.0
-
None
Description
On gerrit-verify-dryrun jobs, attempting to submit a change that updates the Kudu version seems to cause free-pool-test to run out of memory.
free-pool-test makes some large allocations (I think around 7 GB in total), but when other processes are running, it seems the gerrit jobs may be getting close to the 15 GB CommitLimit on these AWS hosts.
Here's the output from kern.log:
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153878] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153882] java cpuset=/ mems_allowed=0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153884] CPU: 1 PID: 19555 Comm: java Not tainted 3.13.0-100-generic #147-Ubuntu
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153886] Hardware name: Xen HVM domU, BIOS 4.2.amazon 02/16/2017
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153887] 0000000000000000 ffff88066708f970 ffffffff8172a4bb ffff88047b3b1800
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153891] 0000000000000000 ffff88066708f9f8 ffffffff81724a5a 0000000000000000
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153894] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153897] Call Trace:
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153904] [<ffffffff8172a4bb>] dump_stack+0x64/0x82
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153908] [<ffffffff81724a5a>] dump_header+0x7f/0x1f1
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153912] [<ffffffff81155d11>] oom_kill_process+0x201/0x360
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153917] [<ffffffff812dcab5>] ? security_capable_noaudit+0x15/0x20
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153919] [<ffffffff811564a1>] out_of_memory+0x471/0x4b0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153922] [<ffffffff8115c7bc>] __alloc_pages_nodemask+0xa6c/0xb90
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153926] [<ffffffff8119ae83>] alloc_pages_current+0xa3/0x160
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153930] [<ffffffff811527c7>] __page_cache_alloc+0x97/0xc0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153932] [<ffffffff81154235>] filemap_fault+0x185/0x410
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153936] [<ffffffff8117944f>] __do_fault+0x6f/0x530
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153941] [<ffffffff810135db>] ? __switch_to+0x16b/0x4f0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153943] [<ffffffff8117d2a2>] handle_mm_fault+0x482/0xf00
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153947] [<ffffffff81090df7>] ? hrtimer_try_to_cancel+0x47/0x100
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153950] [<ffffffff8172df0e>] ? schedule_hrtimeout_range_clock+0xce/0x170
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153954] [<ffffffff81736644>] __do_page_fault+0x184/0x560
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153957] [<ffffffff8120a45f>] ? ep_poll+0x2ff/0x330
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153961] [<ffffffff8109d2f0>] ? wake_up_state+0x20/0x20
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153964] [<ffffffff81736a3a>] do_page_fault+0x1a/0x70
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153966] [<ffffffff8120b5cc>] ? SyS_epoll_wait+0xac/0x100
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153968] [<ffffffff81732d68>] page_fault+0x28/0x30
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153970] Mem-Info:
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153971] Node 0 DMA per-cpu:
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153973] CPU 0: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153974] CPU 1: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153975] CPU 2: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153976] CPU 3: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153977] CPU 4: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153978] CPU 5: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153979] CPU 6: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153980] CPU 7: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153981] CPU 8: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153982] CPU 9: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153984] CPU 10: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153985] CPU 11: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153986] CPU 12: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153987] CPU 13: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153988] CPU 14: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153989] CPU 15: hi: 0, btch: 1 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153990] Node 0 DMA32 per-cpu:
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153991] CPU 0: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153992] CPU 1: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153993] CPU 2: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153995] CPU 3: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153996] CPU 4: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153997] CPU 5: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153998] CPU 6: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153999] CPU 7: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154001] CPU 8: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154002] CPU 9: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154003] CPU 10: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154004] CPU 11: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154005] CPU 12: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154006] CPU 13: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154007] CPU 14: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154008] CPU 15: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154009] Node 0 Normal per-cpu:
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154010] CPU 0: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154011] CPU 1: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154012] CPU 2: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154013] CPU 3: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154014] CPU 4: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154015] CPU 5: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154016] CPU 6: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154017] CPU 7: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154018] CPU 8: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154019] CPU 9: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154020] CPU 10: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154021] CPU 11: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154023] CPU 12: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154024] CPU 13: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154025] CPU 14: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154026] CPU 15: hi: 186, btch: 31 usd: 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] active_anon:7546116 inactive_anon:3718 isolated_anon:0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] active_file:405 inactive_file:19 isolated_file:0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] unevictable:5 dirty:193 writeback:0 unstable:0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] free:47219 slab_reclaimable:15089 slab_unreclaimable:22997
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] mapped:6952 shmem:7403 pagetables:22940 bounce:0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] free_cma:0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154031] Node 0 DMA free:15904kB min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154035] lowmem_reserve[]: 0 3744 30129 30129
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154038] Node 0 DMA32 free:113892kB min:8392kB low:10488kB high:12588kB active_anon:3689064kB inactive_anon:1468kB active_file:260kB inactive_file:20kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:3915776kB managed:3836720kB mlocked:4kB dirty:104kB writeback:0kB mapped:4468kB shmem:4496kB slab_reclaimable:6640kB slab_unreclaimable:9296kB kernel_stack:3832kB pagetables:10816kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:451 all_unreclaimable? yes
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154041] lowmem_reserve[]: 0 0 26385 26385
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154044] Node 0 Normal free:59080kB min:59152kB low:73940kB high:88728kB active_anon:26495400kB inactive_anon:13404kB active_file:1360kB inactive_file:56kB unevictable:16kB isolated(anon):0kB isolated(file):0kB present:27525120kB managed:27019008kB mlocked:16kB dirty:668kB writeback:0kB mapped:23340kB shmem:25116kB slab_reclaimable:53716kB slab_unreclaimable:82692kB kernel_stack:33392kB pagetables:80944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2483 all_unreclaimable? yes
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154047] lowmem_reserve[]: 0 0 0 0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154049] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15904kB
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154058] Node 0 DMA32: 167*4kB (UEM) 1991*8kB (UEM) 590*16kB (UEM) 275*32kB (UEM) 204*64kB (UEM) 139*128kB (UEM) 66*256kB (UEM) 32*512kB (EM) 11*1024kB (UEM) 2*2048kB (EM) 0*4096kB = 114324kB
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154068] Node 0 Normal: 15098*4kB (UEM) 36*8kB (EM) 3*16kB (EM) 1*32kB (E) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 60760kB
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154076] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154077] 7651 total pagecache pages
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154078] 0 pages in swap cache
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154079] Swap cache stats: add 0, delete 0, find 0/0
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154080] Free swap = 0kB
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154081] Total swap = 0kB
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154082] 7864221 pages RAM
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154083] 0 pages HighMem/MovableOnly
May 9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154084] 126528 pages reserved
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344897] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344918] [ 740] 0 740 4868 49 13 0 0 upstart-udev-br
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344920] [ 747] 0 747 12521 234 27 0 -1000 systemd-udevd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344922] [ 912] 0 912 3814 51 12 0 0 upstart-socket-
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344924] [ 960] 0 960 2554 574 8 0 0 dhclient
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344926] [ 1256] 0 1256 3818 55 13 0 0 upstart-file-br
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344928] [ 1402] 0 1402 3633 41 12 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344930] [ 1405] 0 1405 3633 40 12 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344932] [ 1407] 101 1407 65017 688 29 0 0 rsyslogd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344934] [ 1410] 0 1410 3633 42 12 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344935] [ 1411] 0 1411 3633 40 12 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344937] [ 1413] 0 1413 3633 39 10 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344939] [ 1437] 0 1437 15344 172 34 0 -1000 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344941] [ 1465] 0 1465 4783 40 13 0 0 atd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344943] [ 1466] 0 1466 5912 53 17 0 0 cron
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344944] [ 1481] 0 1481 1091 36 7 0 0 acpid
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344946] [ 1490] 102 1490 9802 100 24 0 0 dbus-daemon
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344948] [ 1501] 0 1501 10861 89 25 0 0 systemd-logind
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344950] [ 1507] 0 1507 4863 112 14 0 0 irqbalance
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344952] [ 1608] 0 1608 26411 252 54 0 0 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344954] [ 1729] 1000 1729 26999 847 56 0 0 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344956] [ 1936] 0 1936 3633 39 12 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344957] [ 1937] 0 1937 3195 38 12 0 0 getty
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344959] [ 2445] 106 2445 7863 151 19 0 0 ntpd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344961] [ 3564] 108 3564 33045 1466 55 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344962] [ 3566] 108 3566 33073 5407 65 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344964] [ 3567] 108 3567 33045 331 54 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344966] [ 3568] 108 3568 33045 528 52 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344968] [ 3569] 108 3569 33253 534 54 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344969] [ 3570] 108 3570 25222 376 49 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344971] [ 3797] 1000 3797 2656388 54163 197 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344973] [ 3840] 1000 3840 2826 97 9 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344975] [14391] 0 14391 26410 245 53 0 0 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344977] [14454] 1000 14454 26410 252 51 0 0 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344979] [14455] 1000 14455 5660 837 16 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344981] [18767] 1000 18767 421689 75153 280 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344982] [18818] 1000 18818 427646 72071 276 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344984] [18845] 1000 18845 422808 76361 287 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344986] [18986] 1000 18986 413940 85847 272 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344988] [19231] 1000 19231 423114 68900 237 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344989] [19255] 1000 19255 422304 98381 297 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344991] [19281] 1000 19281 453916 50170 285 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344993] [19307] 1000 19307 421858 73572 251 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344994] [20004] 1000 20004 2625480 62614 227 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344996] [20059] 1000 20059 2417786 642515 1907 0 0 kudu-tserver
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344998] [20075] 1000 20075 170158 4632 136 0 0 kudu-master
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344999] [20091] 1000 20091 2534506 681406 2028 0 0 kudu-tserver
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345001] [20100] 1000 20100 2480736 678210 1996 0 0 kudu-tserver
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345003] [21180] 1000 21180 2812 85 10 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345004] [21194] 1000 21194 2131760 52636 186 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345006] [21277] 1000 21277 3354 114 11 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345008] [21291] 1000 21291 2176412 96677 367 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345010] [21441] 1000 21441 3354 114 12 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345011] [21455] 1000 21455 2171189 84863 323 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345013] [21619] 1000 21619 3354 115 11 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345015] [21633] 1000 21633 2174403 126755 412 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345016] [21773] 1000 21773 3354 115 10 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345018] [21787] 1000 21787 2165621 105043 368 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345019] [22327] 1000 22327 699864 64647 359 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345021] [22650] 108 22650 34060 1947 62 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345023] [22651] 108 22651 34067 1837 61 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345025] [22695] 108 22695 34564 5014 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345026] [22696] 108 22696 34668 5491 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345028] [22701] 1000 22701 499743 165103 749 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345030] [22966] 1000 22966 404221 82336 258 0 0 java
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345031] [49266] 108 49266 34579 5136 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345033] [49267] 108 49267 34487 5004 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345034] [49434] 108 49434 34298 1980 61 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345036] [49435] 108 49435 34140 1836 60 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345037] [49438] 108 49438 34575 5159 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345039] [49439] 108 49439 34505 5146 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345041] [49441] 108 49441 34590 5236 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345042] [49442] 108 49442 34553 5045 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345044] [49486] 108 49486 34507 5106 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345046] [49487] 108 49487 34573 5240 67 0 0 postgres
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345048] [12270] 1000 12270 3345 105 11 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345049] [12634] 1000 12634 106243 2424 105 0 0 statestored
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345051] [12642] 1000 12642 2177880 69774 301 0 0 catalogd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345052] [12708] 1000 12708 2599753 71304 505 0 0 impalad
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345054] [12775] 1000 12775 2599480 69011 503 0 0 impalad
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345055] [12844] 1000 12844 2599354 70697 503 0 0 impalad
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345057] [13564] 1000 13564 3411 159 12 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345059] [13565] 1000 13565 2623 114 11 0 0 make
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345060] [13568] 1000 13568 6091 149 16 0 0 ctest
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345062] [70617] 0 70617 26410 246 55 0 0 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345063] [71103] 1000 71103 26444 247 53 0 0 sshd
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345065] [71113] 1000 71113 5628 806 16 0 0 bash
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345066] [ 2745] 1000 2745 2286602 253086 810 0 0 buffered-tuple-
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345068] [ 3922] 1000 3921 5818436 3452952 6822 0 0 free-pool-test
May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345070] Out of memory: Kill process 3922 (free-pool-test) score 448 or sacrifice child
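To see which processes hold the memory, the rss column of the OOM-killer table (counted in pages) can be summed per process name. The following is a minimal sketch, not part of the original report: the names `PAGE_KB`, `ROW` and `rss_by_name` are made up for illustration, and it assumes 4 kB pages. The sample rows are copied from the log above.

```python
import re
from collections import defaultdict

PAGE_KB = 4  # assumption: 4 kB pages (the usual x86-64 base page size)

# Matches the per-process rows of the OOM-killer table:
# [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
# The kernel timestamp "[ 7699.344996]" contains a dot, so it is skipped.
ROW = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(\S+)\s*$"
)


def rss_by_name(log_lines):
    """Sum resident pages per process name and convert the totals to GiB."""
    pages = defaultdict(int)
    for line in log_lines:
        match = ROW.search(line)
        if match:
            pages[match.group(9)] += int(match.group(5))  # group 5 is rss
    return {name: n * PAGE_KB / (1024 * 1024) for name, n in pages.items()}


sample = [
    "May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344996] [20059] 1000 20059 2417786 642515 1907 0 0 kudu-tserver",
    "May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344999] [20091] 1000 20091 2534506 681406 2028 0 0 kudu-tserver",
    "May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345001] [20100] 1000 20100 2480736 678210 1996 0 0 kudu-tserver",
    "May 9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345068] [ 3922] 1000 3921 5818436 3452952 6822 0 0 free-pool-test",
]

result = rss_by_name(sample)
for name, gib in sorted(result.items(), key=lambda item: -item[1]):
    print(f"{name}: {gib:.2f} GiB resident")
```

Under the 4 kB page assumption, the free-pool-test row works out to roughly 13.2 GiB resident and the three kudu-tserver processes to roughly 7.6 GiB combined.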
And shortly after, the output of /proc/meminfo:
ubuntu@ip-172-31-7-2:~/Impala/logs/be_tests$ cat /proc/meminfo
MemTotal: 30871632 kB
MemFree: 7011044 kB
Buffers: 40700 kB
Cached: 5438488 kB
SwapCached: 0 kB
Active: 21124408 kB
Inactive: 2125936 kB
Active(anon): 17793440 kB
Inactive(anon): 7516 kB
Active(file): 3330968 kB
Inactive(file): 2118420 kB
Unevictable: 20 kB
Mlocked: 20 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 1132 kB
Writeback: 0 kB
AnonPages: 17781632 kB
Mapped: 138604 kB
Shmem: 29780 kB
Slab: 276868 kB
SReclaimable: 182520 kB
SUnreclaim: 94348 kB
KernelStack: 39152 kB
PageTables: 66140 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 15435816 kB
Committed_AS: 49437784 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 72396 kB
VmallocChunk: 34359655000 kB
HardwareCorrupted: 0 kB
AnonHugePages: 15386624 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 38912 kB
DirectMap2M: 3237888 kB
DirectMap1G: 28311552 kB
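The relationship between CommitLimit and Committed_AS in this output can be checked with a short sketch. It is illustrative only: `parse_meminfo` is a hypothetical helper, and the 50% figure is the kernel's default vm.overcommit_ratio, assumed here because the host's actual sysctl settings are not in the report. Note that the kernel only enforces CommitLimit when strict overcommit accounting is enabled (vm.overcommit_memory=2); otherwise the value is informational.

```python
import re


def parse_meminfo(text):
    """Parse 'Key: value' pairs from /proc/meminfo text (values in kB or counts)."""
    return {key: int(value) for key, value in re.findall(r"(\S+):\s+(\d+)", text)}


# A few fields copied from the meminfo output above.
sample = """MemTotal: 30871632 kB
SwapTotal: 0 kB
HugePages_Total: 0
CommitLimit: 15435816 kB
Committed_AS: 49437784 kB
"""

info = parse_meminfo(sample)

# Assumption: default vm.overcommit_ratio of 50 and no hugetlb reservation,
# so CommitLimit = MemTotal * ratio / 100 + SwapTotal.
OVERCOMMIT_RATIO = 50
expected_limit = info["MemTotal"] * OVERCOMMIT_RATIO // 100 + info["SwapTotal"]

ratio = info["Committed_AS"] / info["CommitLimit"]
print(f"expected CommitLimit: {expected_limit} kB, reported: {info['CommitLimit']} kB")
print(f"Committed_AS is {ratio:.2f}x CommitLimit")
```

With these numbers the reported CommitLimit is exactly half of MemTotal, which is consistent with the default ratio and zero swap, and Committed_AS is about 3.2 times that limit.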
We should probably use larger VMs for these jobs, but we may also need to consider reducing the memory needed by the BE tests.