Poison

System.gc()

Calling System.gc() is widely regarded as a bad habit, so I was surprised to find a call to it in the implementation of the map method of FileChannelImpl in JDK 8. FileChannelImpl contains the following code (FileChannelImpl.java at jdk8-b120):

try {
    // If no exception was thrown from map0, the address is valid
    addr = map0(imode, mapPosition, mapSize);
} catch (OutOfMemoryError x) {
    // An OutOfMemoryError may indicate that we've exhausted memory
    // so force gc and re-attempt map
    System.gc();
    try {
        Thread.sleep(100);
    } catch (InterruptedException y) {
        Thread.currentThread().interrupt();
    }
    try {
        addr = map0(imode, mapPosition, mapSize);
    } catch (OutOfMemoryError y) {
        // After a second OOME, fail
        throw new IOException("Map failed", y);
    }
}

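As the code shows, when map0 throws an OutOfMemoryError, System.gc() is called in an attempt to reclaim memory before the mapping is retried. The JDK source is generally of high quality, so why did the author call System.gc() here? Does it actually help recover from the error? I asked this question, and the reason turns out to be mainly about off-heap memory. The answer I received is quoted below: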

The FileChannelImpl’s case is different. map0 may fail due to insufficient native memory or, in case of 32 bit systems, when running out of address space. In these cases, the heap memory manager did not produce the OutOfMemoryError and it is possible that the garbage collector didn’t run. But to reclaim native memory or address space, the associated ByteBuffer instances must get garbage collected, so their cleaner can run. This is a rare corner case where calling System.gc(); makes sense.

It’s still fragile, as System.gc(); is not guaranteed to collect all objects or to run the garbage collector at all. JEP 383 is supposed to solve this, by providing better control over the lifetime of native allocations.
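
The recovery pattern described in the answer (attempt, request a collection, pause briefly, attempt once more) can be sketched in isolation. This is a minimal illustration, not the JDK's code: GcRetry and callWithGcRetry are hypothetical names, and the OutOfMemoryError in main is simulated rather than produced by a real native allocation failure.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of the JDK): retry an action once after requesting a GC. */
final class GcRetry {
    private GcRetry() {}

    static <T> T callWithGcRetry(Supplier<T> action) {
        try {
            return action.get();
        } catch (OutOfMemoryError first) {
            // Only a hint: the JVM may ignore it or collect only part of the garbage.
            System.gc();
            try {
                // Give cleaners a moment to release native memory or address space.
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore interrupt status
            }
            // Second and final attempt; another OutOfMemoryError propagates to the caller.
            return action.get();
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = callWithGcRetry(() -> {
            if (attempts[0]++ == 0) {
                // Simulated failure standing in for a failed native mapping.
                throw new OutOfMemoryError("simulated: Map failed");
            }
            return "mapped";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

As the quoted answer notes, the retry can only succeed if the collector actually ran and the cleaners of unreachable buffers released their memory in the meantime.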

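The System.gc() call in FileChannelImpl is therefore an attempt to reclaim the native memory (or address space) held by ByteBuffer instances that are no longer in use, which is one of the very rare scenarios where calling System.gc() is justified. Similarly, the Bits class also calls System.gc(), and that call is likewise related to direct memory. The source is in Bits.java at jdk8-b120: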

// These methods should be called whenever direct memory is allocated or
// freed. They allow the user to control the amount of direct memory
// which a process may access. All sizes are specified in bytes.
static void reserveMemory(long size, int cap) {
    synchronized (Bits.class) {
        if (!memoryLimitSet && VM.isBooted()) {
            maxMemory = VM.maxDirectMemory();
            memoryLimitSet = true;
        }
        // -XX:MaxDirectMemorySize limits the total capacity rather than the
        // actual memory usage, which will differ when buffers are page
        // aligned.
        if (cap <= maxMemory - totalCapacity) {
            reservedMemory += size;
            totalCapacity += cap;
            count++;
            return;
        }
    }

    System.gc();
    try {
        Thread.sleep(100);
    } catch (InterruptedException x) {
        // Restore interrupt status
        Thread.currentThread().interrupt();
    }
    synchronized (Bits.class) {
        if (totalCapacity + cap > maxMemory)
            throw new OutOfMemoryError("Direct buffer memory");
        reservedMemory += size;
        totalCapacity += cap;
        count++;
    }

}

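In reserveMemory, when the remaining direct-memory budget cannot satisfy the current request, System.gc() is called to try to reclaim memory, and the reservation is then attempted again. JDK 8 has another call to System.gc(), in GC.java at jdk8-b120: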

public void run() {
    for (;;) {
        long l;
        synchronized (lock) {

            l = latencyTarget;
            if (l == NO_TARGET) {
                /* No latency target, so exit */
                GC.daemon = null;
                return;
            }

            long d = maxObjectInspectionAge();
            if (d >= l) {
                /* Do a full collection. There is a remote possibility
                 * that a full collection will occurr between the time
                 * we sample the inspection age and the time the GC
                 * actually starts, but this is sufficiently unlikely
                 * that it doesn't seem worth the more expensive JVM
                 * interface that would be required.
                 */
                System.gc();
                d = 0;
            }

            /* Wait for the latency period to expire,
             * or for notification that the period has changed
             */
            try {
                lock.wait(l - d);
            } catch (InterruptedException x) {
                continue;
            }
        }
    }
}

The reason for this call is explained in the code comments, so I will not repeat it here.
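
Returning to the direct-memory case: the counters that Bits.reserveMemory updates can be observed from application code through the standard java.lang.management API. The sketch below assumes that the platform buffer pool named "direct" reports that accounting (to my understanding its count and total capacity correspond to the count and totalCapacity fields above, but treat that mapping as an assumption). DirectPoolStats and directPool are hypothetical names.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Hypothetical helper: inspect the JVM's accounting for direct buffers. */
final class DirectPoolStats {
    private DirectPoolStats() {}

    /** Returns the platform buffer pool named "direct", or throws if none is registered. */
    static BufferPoolMXBean directPool() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        throw new IllegalStateException("no 'direct' buffer pool found");
    }

    public static void main(String[] args) {
        BufferPoolMXBean pool = directPool();
        long before = pool.getTotalCapacity();

        // Allocating a direct buffer goes through the reservation logic shown earlier.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20); // 1 MiB

        long after = pool.getTotalCapacity();
        System.out.println("direct buffers: " + pool.getCount()
                + ", capacity before=" + before
                + ", after=" + after
                + ", new buffer capacity=" + buffer.capacity());
    }
}
```

The reported capacity only drops again once unreachable buffers have been collected and their cleaners have run, which is why a reservation that fails against the limit asks for a GC before giving up.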

Spark

Similarly, in the Spark source code, the ContextCleaner class uses periodicGCService to call System.gc() periodically, triggering a full GC. The default interval is 30 minutes. The source is in ContextCleaner.scala at v3.2.1:

private val periodicGCService: ScheduledExecutorService =
  ThreadUtils.newDaemonSingleThreadScheduledExecutor("context-cleaner-periodic-gc")

/**
 * How often to trigger a garbage collection in this JVM.
 *
 * This context cleaner triggers cleanups only when weak references are garbage collected.
 * In long-running applications with large driver JVMs, where there is little memory pressure
 * on the driver, this may happen very occasionally or not at all. Not cleaning at all may
 * lead to executors running out of disk space after a while.
 */
private val periodicGCInterval = sc.conf.get(CLEANER_PERIODIC_GC_INTERVAL)

/** Start the cleaner. */
def start(): Unit = {
  cleaningThread.setDaemon(true)
  cleaningThread.setName("Spark Context Cleaner")
  cleaningThread.start()
  periodicGCService.scheduleAtFixedRate(() => System.gc(),
    periodicGCInterval, periodicGCInterval, TimeUnit.SECONDS)
}
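
The same idea can be expressed in plain Java with a ScheduledExecutorService. This is a rough sketch of the pattern, not Spark's implementation: PeriodicGc, schedule and start are hypothetical names, and the thread name and 30-minute interval in main are illustrative values.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: run a task (by default System.gc()) at a fixed rate on a daemon thread. */
final class PeriodicGc {
    private PeriodicGc() {}

    /** Schedules the task on a single daemon thread; the first run happens after one interval. */
    static ScheduledExecutorService schedule(Runnable task, long interval, TimeUnit unit) {
        ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "periodic-gc");
            thread.setDaemon(true); // must not keep the JVM alive on its own
            return thread;
        });
        service.scheduleAtFixedRate(task, interval, interval, unit);
        return service;
    }

    /** Requests a collection at the given interval; System.gc() remains only a hint to the JVM. */
    static ScheduledExecutorService start(long interval, TimeUnit unit) {
        return schedule(System::gc, interval, unit);
    }

    public static void main(String[] args) {
        ScheduledExecutorService service = start(30, TimeUnit.MINUTES);
        // A long-running application would keep the service for its whole lifetime;
        // this demo simply stops it again.
        service.shutdownNow();
    }
}
```

Passing the task as a Runnable keeps the scheduling logic testable without forcing real collections, and the daemon thread mirrors the daemon executor used in the Spark code above.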

Reference

Is the System.gc() call in sun.nio.ch.FileChannelImpl a bad case? - Stack Overflow
Impact of setting -XX:+DisableExplicitGC when NIO direct buffers are used - Stack Overflow
When does System.gc() do something? - Stack Overflow
[SPARK-8414] Ensure context cleaner periodic cleanups · apache/spark@1ce4adf