Mojira Archive
MC-106551

Concurrency Issue in ThreadedAnvilChunkStorage / crash when using "/save-all flush"

In the method which logs "ThreadedAnvilChunkStorage ({}): All chunks are saved", a single element attempts to be popped from a Map.

The problem is that between checking for an element in this Map and the element being popped, if /save-all flush is executed, there is a chance that it will be popped and removed by another thread, causing the map to be empty, and the save thread to crash.


Everything from this point down is from research by [Mod] Pokechu22.

Actually encountering this on a real server is very rare (it took me 600 tries with a bot and I've reproduced it randomly a few other times out of a bunch of other times I've run /save-all flush (all to test this bug)), but it does eventually happen. See crash-2016-09-08_20.10.34-server.txt and crash-2016-09-08_20.10.32-server.txt (which, oddly, were generated at roughly the same time - I've never seen the watchdog do that with other crashes. I'm pretty sure that 20.10.34 has the actual relevant stacktrace, though) Also, here's the very end of the log:

[20:10:25] [Server thread/INFO]: ThreadedAnvilChunkStorage (world): All chunks are saved
[20:10:25] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM-1): All chunks are saved
[20:10:25] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM1): All chunks are saved
[20:10:25] [Server thread/INFO]: [TestBot: Saved the world]
[20:10:26] [Server thread/INFO]: [TestBot: 16641 blocks filled]
[20:10:27] [Server thread/INFO]: ThreadedAnvilChunkStorage (world): All chunks are saved
[20:10:27] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM-1): All chunks are saved
[20:10:27] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM1): All chunks are saved
[20:10:27] [Server thread/INFO]: [TestBot: Saved the world]
[20:10:28] [Server thread/INFO]: [TestBot: 16641 blocks filled]
[20:10:30] [Server thread/INFO]: ThreadedAnvilChunkStorage (world): All chunks are saved
[20:10:30] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM-1): All chunks are saved
[20:10:30] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM1): All chunks are saved
[20:10:30] [Server thread/INFO]: [TestBot: Saved the world]
[20:10:31] [Server thread/INFO]: [TestBot: 16641 blocks filled]
[20:10:31] [Server Watchdog/FATAL]: A single server tick took 60.00 seconds (should be max 0.05)
[20:10:31] [Server Watchdog/FATAL]: Considering it to be crashed, server will forcibly shutdown.
[20:10:32] [Server Watchdog/ERROR]: This crash report has been saved to: C:\pokechu22\testserver_1.11\.\crash-reports\crash-2016-09-08_20.10.32-server.txt
[20:10:32] [Server Shutdown Thread/INFO]: Stopping server

Also, of note, occasionally /save-all flush results in multiple "All chunks are saved" messages. It's random and somewhat rare, but I think it happens more often when chunks are being loaded/unloaded than when standing still. These duplicate messages are a second indicator of this concurrency issue. When it happens, it looks like this:

[19:18:05] [Server thread/INFO]: [TestBot: Saved the world]
[19:18:07] [File IO Thread/INFO]: ThreadedAnvilChunkStorage (world): All chunks are saved
[19:18:07] [Server thread/INFO]: ThreadedAnvilChunkStorage (world): All chunks are saved
[19:18:07] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM-1): All chunks are saved
[19:18:07] [Server thread/INFO]: ThreadedAnvilChunkStorage (DIM1): All chunks are saved

This issue can also result in /save-all flush producing "An unknown error occurred while attempting to perform this command". While the stacktrace is not logged in that condition normally, a 1.10.2 server modified to print the stacktrace (only modification) resulted in this:

java.util.NoSuchElementException
        at java.util.concurrent.ConcurrentHashMap$KeyIterator.next(ConcurrentHashMap.java:3416)
        at atj.c(SourceFile:127)
        at atj.b(SourceFile:163)
        at lr.c(SourceFile:178)
        at ls.q(SourceFile:903)
        at ay.a(SourceFile:47)
        at k.a(k.java:89)
        at k.a(k.java:74)
        at me.d(SourceFile:825)
        at me.a(SourceFile:812)
        at im.a(SourceFile:37)
        at im.a(SourceFile:9)
        at fl$1.run(SourceFile:13)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at h.a(SourceFile:45)
        at net.minecraft.server.MinecraftServer.D(SourceFile:598)
        at ld.D(SourceFile:338)
        at net.minecraft.server.MinecraftServer.C(SourceFile:554)
        at net.minecraft.server.MinecraftServer.run(SourceFile:458)
        at java.lang.Thread.run(Thread.java:745)

My bot basically just repeated these commands, with the /fill s being there to have more dirty chunks (which might increase the chance of it happening, but also might not):

/fill ~-64 ~-1 ~-64 ~64 ~-1 ~64 minecraft:stained_hardened_clay color=red
/save-all flush
/fill ~-64 ~-1 ~-64 ~64 ~-1 ~64 minecraft:stained_hardened_clay color=blue
/save-all flush