2025-02-06 mail server crash
The mail server crashed. The cause is not fully clear; it was working until around 15:05. The mail host still responded to pings, but SSH and all mail/LDAP-related services were not working.
The node was moved to pve-gustav (which ran a different qemu-kvm version, see below) the evening before.
Somebody rebooted the VM around 15:08:
Feb 6 15:08:10 mail kernel: [ 0.000000] Linux version 5.10.0-33-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.226-1 (2024-10-03)
<pre>
Feb 6 15:08:10 mail kernel: [ 26.811753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
Feb 6 15:08:10 mail kernel: [ 26.811783] Call Trace:
Feb 6 15:08:10 mail kernel: [ 26.813554] dump_stack+0x6b/0x83
Feb 6 15:08:10 mail kernel: [ 26.814022] dump_header+0x4a/0x1f4
Feb 6 15:08:10 mail kernel: [ 26.814448] oom_kill_process.cold+0xb/0x10
Feb 6 15:08:10 mail kernel: [ 26.814827] out_of_memory+0x1bd/0x4e0
Feb 6 15:08:10 mail kernel: [ 26.815223] alloc_pages_slowpath.constprop.0+0xc02/0xcc0
Feb 6 15:08:10 mail kernel: [ 26.815580] alloc_pages_nodemask+0x2de/0x310
Feb 6 15:08:10 mail kernel: [ 26.815942] pagecache_get_page+0x175/0x390
Feb 6 15:08:10 mail kernel: [ 26.816294] filemap_fault+0x6a2/0x900
Feb 6 15:08:10 mail kernel: [ 26.816655] ? xas_load+0x5/0x80
Feb 6 15:08:10 mail kernel: [ 26.817069] ext4_filemap_fault+0x2d/0x50 [ext4]
Feb 6 15:08:10 mail kernel: [ 26.817430] __do_fault+0x37/0x170
Feb 6 15:08:10 mail kernel: [ 26.817754] handle_mm_fault+0x124d/0x1c00
Feb 6 15:08:10 mail kernel: [ 26.818145] do_user_addr_fault+0x1b8/0x400
Feb 6 15:08:10 mail kernel: [ 26.818484] exc_page_fault+0x78/0x160
Feb 6 15:08:10 mail kernel: [ 26.818785] ? asm_exc_page_fault+0x8/0x30
Feb 6 15:08:10 mail kernel: [ 26.819121] asm_exc_page_fault+0x1e/0x30
Feb 6 15:08:10 mail kernel: [ 26.819439] RIP: 0033:0x7f8e1fe9d386
Feb 6 15:08:10 mail kernel: [ 26.819740] Code: Unable to access opcode bytes at RIP 0x7f8e1fe9d35c.
Feb 6 15:08:10 mail kernel: [ 26.820066] RSP: 002b:00007fff7fde5570 EFLAGS: 00010202
Feb 6 15:08:10 mail kernel: [ 26.820387] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
Feb 6 15:08:10 mail kernel: [ 26.820707] RDX: 00007f8e1feb09a0 RSI: 0000000000000000 RDI: 00005576b32a0270
Feb 6 15:08:10 mail kernel: [ 26.821115] RBP: 00005576b32a0270 R08: 00005576b32a0270 R09: 00007f8e1fe73be0
Feb 6 15:08:10 mail kernel: [ 26.821465] R10: 00005576b32a0170 R11: 0000000000000070 R12: 00007fff7fde55bc
Feb 6 15:08:10 mail kernel: [ 26.821764] R13: 0000000000000004 R14: 0000000000000000 R15: 00007fff7fde5890
Feb 6 15:08:10 mail kernel: [ 26.822111] Mem-Info:
Feb 6 15:08:10 mail kernel: [ 26.822431] active_anon:62 inactive_anon:4025 isolated_anon:0
Feb 6 15:08:10 mail kernel: [ 26.822431] active_file:132 inactive_file:35 isolated_file:0
Feb 6 15:08:10 mail kernel: [ 26.822431] unevictable:0 dirty:0 writeback:0
Feb 6 15:08:10 mail kernel: [ 26.822431] slab_reclaimable:3337 slab_unreclaimable:7300
Feb 6 15:08:10 mail kernel: [ 26.822431] mapped:138 shmem:739 pagetables:315 bounce:0
Feb 6 15:08:10 mail kernel: [ 26.822431] free:11707 free_pcp:0 free_cma:0
Feb 6 15:08:10 mail kernel: [ 26.824222] Node 0 active_anon:248kB inactive_anon:16100kB active_file:284kB inactive_file:88kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:552kB dirty:0kB writeback:0kB shmem:2956kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:2832kB all_unreclaimable? no
Feb 6 15:08:10 mail kernel: [ 26.824849] Node 0 DMA free:4128kB min:788kB low:984kB high:1180kB reserved_highatomic:0KB active_anon:0kB inactive_anon:4kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:5076kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Feb 6 15:08:10 mail kernel: [ 26.825570] lowmem_reserve[]: 0 844 844 844 844
Feb 6 15:08:10 mail kernel: [ 26.825933] Node 0 DMA32 free:42896kB min:42940kB low:53672kB high:64404kB reserved_highatomic:0KB active_anon:248kB inactive_anon:16096kB active_file:48kB inactive_file:292kB unevictable:0kB writepending:0kB present:1032040kB managed:652036kB mlocked:0kB pagetables:1260kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Feb 6 15:08:10 mail kernel: [ 26.827042] lowmem_reserve[]: 0 0 0 0 0
Feb 6 15:08:10 mail kernel: [ 26.827439] Node 0 DMA: 26*4kB (M) 13*8kB (UM) 13*16kB (M) 10*32kB (UM) 1*64kB (M) 4*128kB (UM) 1*256kB (U) 1*512kB (M) 0*1024kB 1*2048kB (M) 0*4096kB = 4128kB
Feb 6 15:08:10 mail kernel: [ 26.828236] Node 0 DMA32: 476*4kB (UME) 201*8kB (ME) 136*16kB (ME) 87*32kB (UME) 86*64kB (UME) 48*128kB (UME) 19*256kB (UME) 3*512kB (UM) 1*1024kB (U) 8*2048kB (UME) 0*4096kB = 43928kB
Feb 6 15:08:10 mail kernel: [ 26.829155] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Feb 6 15:08:10 mail kernel: [ 26.829594] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 6 15:08:10 mail kernel: [ 26.830021] 913 total pagecache pages
Feb 6 15:08:10 mail kernel: [ 26.830427] 0 pages in swap cache
Feb 6 15:08:10 mail kernel: [ 26.830779] Swap cache stats: add 0, delete 0, find 0/0
Feb 6 15:08:10 mail kernel: [ 26.831302] Free swap = 0kB
Feb 6 15:08:10 mail kernel: [ 26.831726] Total swap = 0kB
Feb 6 15:08:10 mail kernel: [ 26.832224] 262008 pages RAM
Feb 6 15:08:10 mail kernel: [ 26.832793] 0 pages HighMem/MovableOnly
Feb 6 15:08:10 mail kernel: [ 26.833272] 97730 pages reserved
Feb 6 15:08:10 mail kernel: [ 26.833657] 0 pages hwpoisoned
Feb 6 15:08:10 mail kernel: [ 26.834067] Unreclaimable slab info:
Feb 6 15:08:10 mail kernel: [ 26.834458] Name Used Total
Feb 6 15:08:10 mail kernel: [ 26.834929] ext4_system_zone 3KB 3KB
Feb 6 15:08:10 mail kernel: [ 26.835368] scsi_sense_cache 400KB 400KB
Feb 6 15:08:10 mail kernel: [ 26.835764] RAWv6 30KB 30KB
Feb 6 15:08:10 mail kernel: [ 26.836221] UDPv6 94KB 94KB
Feb 6 15:08:10 mail kernel: [ 26.836631] mqueue_inode_cache 31KB 31KB
Feb 6 15:08:10 mail kernel: [ 26.837034] UNIX 382KB 382KB
Feb 6 15:08:10 mail kernel: [ 26.837435] RAW 32KB 32KB
Feb 6 15:08:10 mail kernel: [ 26.837808] hugetlbfs_inode_cache 30KB 30KB
Feb 6 15:08:10 mail kernel: [ 26.838234] eventpoll_pwq 47KB 47KB
Feb 6 15:08:10 mail kernel: [ 26.838639] request_queue 411KB 506KB
Feb 6 15:08:10 mail kernel: [ 26.839075] biovec-max 480KB 480KB
</pre>
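The page counts in the OOM report can be converted into sizes to see how much memory the guest had roughly 27 seconds after boot. The following is a minimal sketch, assuming 4 KiB pages (the usual x86-64 default, not stated in the log); the numbers are copied from the "pages RAM", "pages reserved" and per-zone "managed" fields of the trace.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages, the usual x86-64 page size


def pages_to_mib(pages: int) -> float:
    """Convert a page count from the OOM report to MiB."""
    return pages * PAGE_KIB / 1024


total_pages = 262008          # "262008 pages RAM"
reserved_pages = 97730        # "97730 pages reserved"
managed_kib = 5076 + 652036   # "managed:" of zone DMA plus zone DMA32

usable_pages = total_pages - reserved_pages

print(f"total RAM : {pages_to_mib(total_pages):8.1f} MiB")
print(f"reserved  : {pages_to_mib(reserved_pages):8.1f} MiB")
print(f"usable    : {pages_to_mib(usable_pages):8.1f} MiB")
# Cross-check: usable pages should equal the sum of the zones' managed memory.
print("matches managed zones:", usable_pages * PAGE_KIB == managed_kib)
```

Under that assumption the guest saw only about 1 GiB of RAM in total, with roughly 640 MiB usable and no swap. That may be consistent with the memory hotplug problem described here, but the log alone does not prove it.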
Attaching the console in the Proxmox web interface to check for output on the terminal revealed some kind of memory issue.
Fix: Disable the option for memory hotplug!
hotplug: disk,network,usb,cpu # previously also included memory
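The change to the hotplug line amounts to dropping one entry from a comma-separated list. As a rough illustration, the hypothetical helper below (not part of any Proxmox tooling) removes a feature from such a value. The "before" string is an assumption, since the log only records that memory used to be included and not where it stood in the list.

```python
def drop_hotplug_feature(value: str, feature: str) -> str:
    """Return the comma-separated hotplug value without the given feature."""
    kept = [item.strip() for item in value.split(",")
            if item.strip() and item.strip() != feature]
    return ",".join(kept)


# Assumed earlier setting; only the presence of "memory" is documented.
before = "disk,network,usb,cpu,memory"
after = drop_hotplug_feature(before, "memory")
print(f"hotplug: {after}")
```

The helper leaves the order of the remaining features unchanged, so the result matches the line recorded above.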
Installed pve-qemu-kvm versions on the cluster nodes:
pve-donna.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
pve-emil.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
pve-franz.cluster: ii pve-qemu-kvm 9.0.2-4 amd64 Full virtualization on x86 hardware
pve-gustav.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
pve-hans.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
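To spot version differences across nodes at a glance, the per-host package listing can be grouped by version. This is a minimal sketch that assumes each entry has the form `host: status package version architecture description`, as in the listing above; `hosts_by_version` is an illustrative name.

```python
from collections import defaultdict

LISTING = """\
pve-donna.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
pve-emil.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
pve-franz.cluster: ii pve-qemu-kvm 9.0.2-4 amd64 Full virtualization on x86 hardware
pve-gustav.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
pve-hans.cluster: ii pve-qemu-kvm 8.1.5-5 amd64 Full virtualization on x86 hardware
"""


def hosts_by_version(listing: str) -> dict[str, list[str]]:
    """Group host names by the package version shown in the listing."""
    groups: dict[str, list[str]] = defaultdict(list)
    for line in listing.splitlines():
        if not line.strip():
            continue
        host, rest = line.split(":", 1)
        fields = rest.split()
        # Columns after the host: status, package, version, architecture, ...
        groups[fields[2]].append(host.strip())
    return dict(groups)


for version, hosts in sorted(hosts_by_version(LISTING).items()):
    print(version, ", ".join(hosts))
```

With the listing above, four nodes end up under 8.1.5-5 and one under 9.0.2-4.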