Mojira Archive
MC-183518

High CPU usage on hypervisor host possibly due to excessive syscall usage by Minecraft Server

Wanted to host a Minecraft server in a VM for a friend.
I've noticed that the CPU usage on the hypervisor host is really high 100-200%, while the VM is showing pretty low CPU usage.
Running dstat showed that the VM is doing a lot of interrupts and context switching even when the server is completely empty.

Host's top when the Minecraft server is stopped:

Dstat output after starting the server, server is empty and there's not much going on except for the Minecraft Java process on the VM:

On the Hypervisor host, the CPU usage for qemu-kvm skyrocketed:

Let's check thread info from ps:

root@minecraft-igor-vm-1:~# ps -eLl | grep 416
4 S   999   416     1   416  0  81   1 - 1201481 futex_ ?      00:00:00 java
1 S   999   416     1   485  0  81   1 - 1201481 futex_ ?      00:00:03 java
1 S   999   416     1   488  0  81   1 - 1201481 futex_ ?      00:00:01 GC Thread#0
1 S   999   416     1   489  0  81   1 - 1201481 futex_ ?      00:00:00 G1 Main Marker
1 S   999   416     1   490  0  81   1 - 1201481 futex_ ?      00:00:00 G1 Conc#0
1 S   999   416     1   501  0  81   1 - 1201481 futex_ ?      00:00:00 G1 Refine#0
1 S   999   416     1   504  0  81   1 - 1201481 futex_ ?      00:00:00 G1 Young RemSet
1 S   999   416     1   518  0  81   1 - 1201481 futex_ ?      00:00:00 VM Thread
1 S   999   416     1   520  0  81   1 - 1201481 futex_ ?      00:00:00 Reference Handl
1 S   999   416     1   521  0  81   1 - 1201481 futex_ ?      00:00:00 Finalizer
1 S   999   416     1   523  0  81   1 - 1201481 futex_ ?      00:00:00 Signal Dispatch
1 S   999   416     1   524  1  81   1 - 1201481 futex_ ?      00:00:15 C2 CompilerThre
1 S   999   416     1   525  0  81   1 - 1201481 futex_ ?      00:00:03 C1 CompilerThre
1 S   999   416     1   526  0  81   1 - 1201481 futex_ ?      00:00:00 Sweeper thread
1 S   999   416     1   527  0  81   1 - 1201481 futex_ ?      00:00:00 Service Thread
1 R   999   416     1   528  0  81   1 - 1201481 -    ?        00:00:00 VM Periodic Tas
1 S   999   416     1   529  0  81   1 - 1201481 futex_ ?      00:00:00 Common-Cleaner
1 S   999   416     1   602  0  81   1 - 1201481 futex_ ?      00:00:01 GC Thread#1
1 S   999   416     1   603  2  81   1 - 1201481 futex_ ?      00:00:25 Server-Worker-1
1 S   999   416     1   604  0  81   1 - 1201481 futex_ ?      00:00:00 G1 Refine#1
1 S   999   416     1   605  0  81   1 - 1201481 futex_ ?      00:00:00 Snooper Timer
1 S   999   416     1   606  0  81   1 - 1201481 futex_ ?      00:00:00 Server Infinisl
1 S   999   416     1   607  6  81   1 - 1201481 futex_ ?      00:00:59 Server thread
1 S   999   416     1   609  0  81   1 - 1201481 futex_ ?      00:00:00 ObjectCleanerTh
1 S   999   416     1   610  0  81   1 - 1201481 ep_pol ?      00:00:00 Netty Server IO
1 S   999   416     1   628  0  81   1 - 1201481 futex_ ?      00:00:00 chunk IO worker
1 S   999   416     1   629  0  81   1 - 1201481 futex_ ?      00:00:00 poi IO worker
1 S   999   416     1   630  0  81   1 - 1201481 futex_ ?      00:00:00 chunk IO worker
1 S   999   416     1   631  0  81   1 - 1201481 futex_ ?      00:00:00 poi IO worker
1 S   999   416     1   632  0  81   1 - 1201481 futex_ ?      00:00:00 chunk IO worker
1 S   999   416     1   633  0  81   1 - 1201481 futex_ ?      00:00:00 poi IO worker
1 S   999   416     1   634  0  81   1 - 1201481 poll_s ?      00:00:00 RCON Listener #
1 S   999   416     1   635  0  81   1 - 1201481 futex_ ?      00:00:00 Server Watchdog

Let's measure syscall usage on the system:

syscount-perf -d 3 -c
Tracing for 3 seconds...
SYSCALL              COUNT
exit_group               1
nanosleep                1
wait4                    1
close                    2
epoll_wait               3
inotify_add_watch        4
newstat                  5
poll                     7
getrusage               10
ioctl                  632
sched_yield          18531
futex                37244

Let's double check if our minecraft server is to blame:

syscount-perf -c -p 416
Tracing PID 416...
SYSCALL              COUNT
epoll_wait               3
poll                     5
getrusage               10
sched_yield          14747
futex                29638

Let's get a look into strace from 607:

futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99535}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99533}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99491}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99442}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99500}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99484}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99474}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99545}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99520}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99291}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99496}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99542}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99450}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99507}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99489}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99460}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99454}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99442}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99500}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99449}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99453}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99488}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99481}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99520}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99461}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99577}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99527}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99512}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99462}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99449}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0
futex(0x7f8c08feabd8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=99468}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f8c08feab88, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield()                           = 0

Now let's try Minecraft Server 1.13.2:
CPU usage on host is low:

No excessive context switching or interrupts on guest:

Threads:

4 S   999   841     1   841  0  81   1 - 1255053 futex_ ?      00:00:00 java
1 S   999   841     1   842  1  81   1 - 1255053 futex_ ?      00:00:03 java
1 S   999   841     1   843  0  81   1 - 1255053 futex_ ?      00:00:00 GC Thread#0
1 S   999   841     1   844  0  81   1 - 1255053 futex_ ?      00:00:00 G1 Main Marker
1 S   999   841     1   845  0  81   1 - 1255053 futex_ ?      00:00:00 G1 Conc#0
1 S   999   841     1   846  0  81   1 - 1255053 futex_ ?      00:00:00 G1 Refine#0
1 S   999   841     1   847  0  81   1 - 1255053 futex_ ?      00:00:00 G1 Young RemSet
1 S   999   841     1   848  0  81   1 - 1255053 futex_ ?      00:00:00 VM Thread
1 S   999   841     1   849  0  81   1 - 1255053 futex_ ?      00:00:00 Reference Handl
1 S   999   841     1   850  0  81   1 - 1255053 futex_ ?      00:00:00 Finalizer
1 S   999   841     1   851  0  81   1 - 1255053 futex_ ?      00:00:00 Signal Dispatch
1 S   999   841     1   852 12  81   1 - 1255053 futex_ ?      00:00:25 C2 CompilerThre
1 S   999   841     1   853  1  81   1 - 1255053 futex_ ?      00:00:02 C1 CompilerThre
1 S   999   841     1   854  0  81   1 - 1255053 futex_ ?      00:00:00 Sweeper thread
1 S   999   841     1   855  0  81   1 - 1255053 futex_ ?      00:00:00 Service Thread
1 S   999   841     1   856  0  81   1 - 1255053 futex_ ?      00:00:00 VM Periodic Tas
1 S   999   841     1   857  0  81   1 - 1255053 futex_ ?      00:00:00 Common-Cleaner
1 S   999   841     1   859  0  81   1 - 1255053 futex_ ?      00:00:00 GC Thread#1
1 S   999   841     1   860  0  81   1 - 1255053 futex_ ?      00:00:00 G1 Refine#1
1 S   999   841     1  2369  0  81   1 - 1255053 futex_ ?      00:00:00 Snooper Timer
1 S   999   841     1  2370  0  81   1 - 1255053 futex_ ?      00:00:00 Server Infinisl
1 S   999   841     1  2371  5  81   1 - 1255053 futex_ ?      00:00:09 Server thread
1 S   999   841     1  2373  0  81   1 - 1255053 futex_ ?      00:00:00 ObjectCleanerTh
1 S   999   841     1  2374  0  81   1 - 1255053 ep_pol ?      00:00:00 Netty Server IO
1 S   999   841     1  2375  0  81   1 - 1255053 futex_ ?      00:00:00 WorldGen-Schedu
1 S   999   841     1  2376  7  81   1 - 1255053 futex_ ?      00:00:13 WorldGen-Worker
1 S   999   841     1  2377  0  81   1 - 1255053 poll_s ?      00:00:00 RCON Listener #
1 S   999   841     1  2378  0  81   1 - 1255053 futex_ ?      00:00:00 Server Watchdog
1 S   999   841     1  2379  0  81   1 - 1255053 futex_ ?      00:00:01 File IO Thread

Now let's check syscall count:

syscount-perf -d 3 -c
Tracing for 3 seconds...
SYSCALL              COUNT
exit_group               1
wait4                    1
close                    2
epoll_wait               3
poll                     6
getrusage               10
ioctl                  632
futex                 5726

This issue seems to be connected with: MC-149018
Maybe some multithreading changes or something messed up the performance for VMs?
It looks like it all started in 1.14.

Let me know if there's anything more I can add to the report to help you with resolving this.

Fixed

Mateusz Przybyłowicz

[Mojang] slicedlime

2020-05-11, 07:49 PM

2024-07-31, 03:12 PM

2024-07-31, 03:12 PM

29

19

Community Consensus

Important

Platform

Performance

1.15.2 - 1.16.4 Pre-release 11.15.2, 20w28a, 1.16.2, 1.16.3, 1.16.4 Pre-release 1

24w33a