
On the load order of jars in the same directory when the classpath uses a wildcard

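If the classpath contains a wildcard entry such as /tmp/jars/*, and different jars in that directory happen to contain classes with the same fully qualified name, which jar's class gets loaded first? I had never really gotten to the bottom of this. My vague recollection was only that the order is undefined, but I did not know why. It was not until a production release, when one of the machines reported class-loading errors, that I traced the problem in detail. Let's start with Java's system class loader. A simple program is enough to check the behavior, for example: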

package me.tianshuang;

public class Test {

    public static void main(String[] args) {
        System.out.println(System.getProperty("java.class.path"));
    }

}

This code is trivial: it only prints the classpath of the running JVM. Running it directly produces the following output:

/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/jaccess.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/jfxrt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/nashorn.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/javaws.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jfxswt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/plugin.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/ant-javafx.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/javafx-mx.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/jconsole.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/packager.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/sa-jdi.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/tools.jar:/Users/tianshuang/IdeaProjects/test/target/test-classes:/Users/tianshuang/IdeaProjects/test/target/classes:/Users/tianshuang/.m2/repository/junit/junit/4.12/junit-4.12.jar:/Users/tianshuang/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar
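The single-line output above is hard to read when the question is which entry comes first. As a minimal sketch (the class name ClassPathEntries and its split helper are hypothetical, not part of the original test program), the same property can be printed one entry per line with its position:

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: prints java.class.path one entry per line, in order.
class ClassPathEntries {

    // Splits a classpath string on the platform path separator, keeping the order.
    static List<String> split(String classPath) {
        return Arrays.asList(classPath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        List<String> entries = split(System.getProperty("java.class.path"));
        for (int i = 0; i < entries.size(); i++) {
            System.out.println(i + ": " + entries.get(i));
        }
    }
}
```

The index matters because entries are kept in exactly this order when the class loader is built from the property.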

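The program above prints the runtime classpath. The main consumer of the classpath is the system class loader, so let's first look at how the system class loader is obtained in JDK 8: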

/**
* Returns the system class loader for delegation. This is the default
* delegation parent for new <tt>ClassLoader</tt> instances, and is
* typically the class loader used to start the application.
*
* <p> This method is first invoked early in the runtime's startup
* sequence, at which point it creates the system class loader and sets it
* as the context class loader of the invoking <tt>Thread</tt>.
*
* <p> The default system class loader is an implementation-dependent
* instance of this class.
*
* <p> If the system property "<tt>java.system.class.loader</tt>" is defined
* when this method is first invoked then the value of that property is
* taken to be the name of a class that will be returned as the system
* class loader. The class is loaded using the default system class loader
* and must define a public constructor that takes a single parameter of
* type <tt>ClassLoader</tt> which is used as the delegation parent. An
* instance is then created using this constructor with the default system
* class loader as the parameter. The resulting class loader is defined
* to be the system class loader.
*
* <p> If a security manager is present, and the invoker's class loader is
* not <tt>null</tt> and the invoker's class loader is not the same as or
* an ancestor of the system class loader, then this method invokes the
* security manager's {@link
* SecurityManager#checkPermission(java.security.Permission)
* <tt>checkPermission</tt>} method with a {@link
* RuntimePermission#RuntimePermission(String)
* <tt>RuntimePermission("getClassLoader")</tt>} permission to verify
* access to the system class loader. If not, a
* <tt>SecurityException</tt> will be thrown. </p>
*
* @return The system <tt>ClassLoader</tt> for delegation, or
* <tt>null</tt> if none
*
* @throws SecurityException
* If a security manager exists and its <tt>checkPermission</tt>
* method doesn't allow access to the system class loader.
*
* @throws IllegalStateException
* If invoked recursively during the construction of the class
* loader specified by the "<tt>java.system.class.loader</tt>"
* property.
*
* @throws Error
* If the system property "<tt>java.system.class.loader</tt>"
* is defined but the named class could not be loaded, the
* provider class does not define the required constructor, or an
* exception is thrown by that constructor when it is invoked. The
* underlying cause of the error can be retrieved via the
* {@link Throwable#getCause()} method.
*
* @revised 1.4
*/
@CallerSensitive
public static ClassLoader getSystemClassLoader() {
    initSystemClassLoader();
    if (scl == null) {
        return null;
    }
    SecurityManager sm = System.getSecurityManager();
    if (sm != null) {
        checkClassLoaderPermission(scl, Reflection.getCallerClass());
    }
    return scl;
}

private static synchronized void initSystemClassLoader() {
    if (!sclSet) {
        if (scl != null)
            throw new IllegalStateException("recursive invocation");
        sun.misc.Launcher l = sun.misc.Launcher.getLauncher();
        if (l != null) {
            Throwable oops = null;
            scl = l.getClassLoader();
            try {
                scl = AccessController.doPrivileged(
                    new SystemClassLoaderAction(scl));
            } catch (PrivilegedActionException pae) {
                oops = pae.getCause();
                if (oops instanceof InvocationTargetException) {
                    oops = oops.getCause();
                }
            }
            if (oops != null) {
                if (oops instanceof Error) {
                    throw (Error) oops;
                } else {
                    // wrap the exception
                    throw new Error(oops);
                }
            }
        }
        sclSet = true;
    }
}

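As shown, getSystemClassLoader calls initSystemClassLoader to initialize the system class loader if that has not happened yet. Inside the initializer, the system class loader is obtained by calling getClassLoader on l, the instance returned by sun.misc.Launcher.getLauncher(). The source of these two methods is: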

public static Launcher getLauncher() {
    return launcher;
}

private ClassLoader loader;

/*
 * Returns the class loader used to launch the main application.
 */
public ClassLoader getClassLoader() {
    return loader;
}

Now look at the constructor of sun.misc.Launcher:

public Launcher() {
    // Create the extension class loader
    ClassLoader extcl;
    try {
        extcl = ExtClassLoader.getExtClassLoader();
    } catch (IOException e) {
        throw new InternalError(
            "Could not create extension class loader", e);
    }

    // Now create the class loader to use to launch the application
    try {
        loader = AppClassLoader.getAppClassLoader(extcl);
    } catch (IOException e) {
        throw new InternalError(
            "Could not create application class loader", e);
    }

    // Also set the context class loader for the primordial thread.
    Thread.currentThread().setContextClassLoader(loader);

    // Finally, install a security manager if requested
    String s = System.getProperty("java.security.manager");
    if (s != null) {
        // init FileSystem machinery before SecurityManager installation
        sun.nio.fs.DefaultFileSystemProvider.create();

        SecurityManager sm = null;
        if ("".equals(s) || "default".equals(s)) {
            sm = new java.lang.SecurityManager();
        } else {
            try {
                sm = (SecurityManager)loader.loadClass(s).newInstance();
            } catch (IllegalAccessException e) {
            } catch (InstantiationException e) {
            } catch (ClassNotFoundException e) {
            } catch (ClassCastException e) {
            }
        }
        if (sm != null) {
            System.setSecurityManager(sm);
        } else {
            throw new InternalError(
                "Could not create SecurityManager: " + s);
        }
    }
}

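Both the extension class loader and the system class loader are created in this constructor. The system class loader is obtained through AppClassLoader.getAppClassLoader(extcl), which is implemented as follows: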

public static ClassLoader getAppClassLoader(final ClassLoader extcl)
    throws IOException
{
    final String s = System.getProperty("java.class.path");
    final File[] path = (s == null) ? new File[0] : getClassPath(s);

    // Note: on bugid 4256530
    // Prior implementations of this doPrivileged() block supplied
    // a rather restrictive ACC via a call to the private method
    // AppClassLoader.getContext(). This proved overly restrictive
    // when loading classes. Specifically it prevent
    // accessClassInPackage.sun.* grants from being honored.
    //
    return AccessController.doPrivileged(
        new PrivilegedAction<AppClassLoader>() {
            public AppClassLoader run() {
                URL[] urls =
                    (s == null) ? new URL[0] : pathToURLs(path);
                return new AppClassLoader(urls, extcl);
            }
        });
}

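This code shows that the system class loader is created with the value of the system property java.class.path as its classpath. So what happens when the classpath passed to the test program contains a wildcard? To find out, I ran the test program with the argument -classpath /Users/tianshuang/IdeaProjects/test/target/test-classes:/tmp/jars/*, that is, with the directory containing the program plus /tmp/jars/* as the classpath. The output was: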

/Users/tianshuang/IdeaProjects/test/target/test-classes:/tmp/jars/activation-1.1.jar:/tmp/jars/xmlenc-0.52.jar:/tmp/jars/sentinel-transport-common-1.8.0.jar:/tmp/jars/spymemcached-2.8.4.jar:/tmp/jars/netty-codec-4.1.25.Final.jar:/tmp/jars/antlr-2.7.7.jar:/tmp/jars/jdbc-redis-1.0-SNAPSHOT.jar:/tmp/jars/commons-pool-1.5.4.jar:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar

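At this point the wildcard has already been expanded: it was replaced by the jars in the corresponding directory, so the wildcard matching is not handled at the Java level. I then looked through the OpenJDK 8 source and found the logic in src/share/bin/wildcard.c. The two most important comment paragraphs are excerpted here: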

/*
* Class-Path Wildcards
*
* Expansion of wildcards is done early, prior to the invocation of a
* program's main method, rather than late, during the class-loading
* process itself. Each element of the input class path containing a
* wildcard is replaced by the (possibly empty) sequence of elements
* generated by enumerating the jar files in the named directory. If
* the directory foo contains a.jar, b.jar, and c.jar,
* e.g., then the class path foo/"*" is expanded into
* foo/a.jar:foo/b.jar:foo/c.jar, and that string would be the value
* of the system property java.class.path.
*
* The order in which the jar files in a directory are enumerated in
* the expanded class path is not specified and may vary from platform
* to platform and even from moment to moment on the same machine. A
* well-constructed application should not depend upon any particular
* order. If a specific order is required then the jar files can be
* enumerated explicitly in the class path.
*/

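The comment states clearly that the enumeration order is not specified: it may vary from platform to platform and even from moment to moment on the same machine, and a well-constructed application should not depend on any particular order. This finally explains why the load order is undefined.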
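To make the expansion step concrete, here is a rough Java sketch of what the launcher comment describes. It is a simplified illustration, not a port of wildcard.c: the class and method names are hypothetical, it skips the launcher's check that the wildcard path does not exist as a real file, and leaving an element untouched when no jar matches is an assumption of this sketch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified emulation of class-path wildcard expansion.
class WildcardExpansion {

    // An element is treated as a wildcard if it is "*" or ends with "<separator>*".
    static boolean isWildcard(String element) {
        int len = element.length();
        if (len == 0 || element.charAt(len - 1) != '*') {
            return false;
        }
        return len == 1
                || element.charAt(len - 2) == '/'
                || element.charAt(len - 2) == File.separatorChar;
    }

    // Replaces each wildcard element with the jars found in its directory.
    // The jars appear in whatever order File.list() returns them: no sorting.
    static List<String> expand(String classPath) {
        List<String> result = new ArrayList<>();
        for (String element : classPath.split(Pattern.quote(File.pathSeparator))) {
            String[] jars = null;
            String prefix = "";
            if (isWildcard(element)) {
                prefix = element.substring(0, element.length() - 1);
                File dir = new File(prefix.isEmpty() ? "." : prefix);
                jars = dir.list((d, name) -> name.endsWith(".jar") || name.endsWith(".JAR"));
            }
            if (jars == null || jars.length == 0) {
                result.add(element); // assumption: keep the element as written
                continue;
            }
            for (String jar : jars) {
                result.add(prefix + jar);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String classPath = args.length > 0 ? args[0] : "/tmp/jars/*";
        System.out.println(String.join(File.pathSeparator, expand(classPath)));
    }
}
```

Because nothing in this flow sorts the directory listing, the position of each jar in the expanded string is decided entirely by the order the filesystem reports.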

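Now back to the class-loading exception that one machine reported during the production release mentioned at the beginning. Investigation showed that it was caused by the load order of the jars in /WEB-INF/lib/. So in what order does Tomcat load different jars from the same directory? I looked at the source and found the following code in org.apache.catalina.loader.WebappClassLoaderBase#start: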

/**
* Start the class loader.
*
* @exception LifecycleException if a lifecycle error occurs
*/
@Override
public void start() throws LifecycleException {

    state = LifecycleState.STARTING_PREP;

    WebResource[] classesResources = resources.getResources("/WEB-INF/classes");
    for (WebResource classes : classesResources) {
        if (classes.isDirectory() && classes.canRead()) {
            localRepositories.add(classes.getURL());
        }
    }
    WebResource[] jars = resources.listResources("/WEB-INF/lib");
    for (WebResource jar : jars) {
        if (jar.getName().endsWith(".jar") && jar.isFile() && jar.canRead()) {
            localRepositories.add(jar.getURL());
            jarModificationTimes.put(
                    jar.getName(), Long.valueOf(jar.getLastModified()));
        }
    }

    state = LifecycleState.STARTED;
}

Tracing the resources.listResources("/WEB-INF/lib") call above eventually leads to org.apache.catalina.webresources.DirResourceSet#list, whose source is:

@Override
public String[] list(String path) {
    checkPath(path);
    String webAppMount = getWebAppMount();
    if (path.startsWith(webAppMount)) {
        File f = file(path.substring(webAppMount.length()), true);
        if (f == null) {
            return EMPTY_STRING_ARRAY;
        }
        String[] result = f.list();
        if (result == null) {
            return EMPTY_STRING_ARRAY;
        } else {
            return result;
        }
    } else {
        if (!path.endsWith("/")) {
            path = path + "/";
        }
        if (webAppMount.startsWith(path)) {
            int i = webAppMount.indexOf('/', path.length());
            if (i == -1) {
                return new String[] {webAppMount.substring(path.length())};
            } else {
                return new String[] {
                        webAppMount.substring(path.length(), i)};
            }
        }
        return EMPTY_STRING_ARRAY;
    }
}

This in turn calls the list() method on a File instance:

/**
* Returns an array of strings naming the files and directories in the
* directory denoted by this abstract pathname.
*
* <p> If this abstract pathname does not denote a directory, then this
* method returns {@code null}. Otherwise an array of strings is
* returned, one for each file or directory in the directory. Names
* denoting the directory itself and the directory's parent directory are
* not included in the result. Each string is a file name rather than a
* complete path.
*
* <p> There is no guarantee that the name strings in the resulting array
* will appear in any specific order; they are not, in particular,
* guaranteed to appear in alphabetical order.
*
* <p> Note that the {@link java.nio.file.Files} class defines the {@link
* java.nio.file.Files#newDirectoryStream(Path) newDirectoryStream} method to
* open a directory and iterate over the names of the files in the directory.
* This may use less resources when working with very large directories, and
* may be more responsive when working with remote directories.
*
* @return An array of strings naming the files and directories in the
* directory denoted by this abstract pathname. The array will be
* empty if the directory is empty. Returns {@code null} if
* this abstract pathname does not denote a directory, or if an
* I/O error occurs.
*
* @throws SecurityException
* If a security manager exists and its {@link
* SecurityManager#checkRead(String)} method denies read access to
* the directory
*/
public String[] list() {
    return normalizedList();
}
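Since File#list() gives no ordering guarantee, code that needs a stable order has to impose one itself. The following is a minimal sketch of that idea (the class SortedListing and its listSorted method are hypothetical names and do not come from the JDK or Tomcat):

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper: a directory listing with a deterministic order.
class SortedListing {

    // Returns the entry names of dir sorted by natural String order,
    // or an empty array if dir is not a readable directory.
    static String[] listSorted(File dir) {
        String[] names = dir.list();
        if (names == null) {
            return new String[0];
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("File.list():  " + Arrays.toString(dir.list()));
        System.out.println("listSorted(): " + Arrays.toString(listSorted(dir)));
    }
}
```

Running it against the same directory on different machines or filesystems may show different output on the first line, while the second line stays the same.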

The full call stack is as follows:

list:1159, File (java.io)
list:126, DirResourceSet (org.apache.catalina.webresources)
list:130, StandardRoot (org.apache.catalina.webresources)
listResources:351, StandardRoot (org.apache.catalina.webresources)
processWebInfLib:581, StandardRoot (org.apache.catalina.webresources)
startInternal:720, StandardRoot (org.apache.catalina.webresources)
start:183, LifecycleBase (org.apache.catalina.util)
resourcesStart:4880, StandardContext (org.apache.catalina.core)
startInternal:5018, StandardContext (org.apache.catalina.core)
start:183, LifecycleBase (org.apache.catalina.util)
addChildInternal:753, ContainerBase (org.apache.catalina.core)
addChild:727, ContainerBase (org.apache.catalina.core)
addChild:695, StandardHost (org.apache.catalina.core)
deployDirectory:1177, HostConfig (org.apache.catalina.startup)
run:1925, HostConfig$DeployDirectory (org.apache.catalina.startup)
call:511, Executors$RunnableAdapter (java.util.concurrent)
run$$$capture:266, FutureTask (java.util.concurrent)
run:-1, FutureTask (java.util.concurrent)

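The Javadoc of java.io.File#list() explicitly notes that the name strings are not guaranteed to be returned in any specific order, and in particular not in alphabetical order. This explains the class-loading error I hit when the same version of the application was deployed to a new instance: /WEB-INF/lib happened to contain two jars with a class of the same name, and on the new instance File#list() returned the jar that should not have been picked at a lower array index. The wrong class was therefore loaded first, which triggered the subsequent failure. We later resolved it by excluding the conflicting dependency.

As for the load order of jars in Tomcat's /WEB-INF/lib directory: in Tomcat 7 and earlier they were in fact loaded in alphabetical order, whereas from Tomcat 8 onward the order depends on the underlying File#list() implementation. For this change, see the bug report: Bug 57129 - Regression. Load WEB-INF/lib jarfiles in alphabetical order.

As for the underlying implementation of java.io.File#list(), the OpenJDK 8 source shows that on Linux the call ends up in the following function in UnixFileSystem_md.c: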

#define readdir64_r readdir_r

JNIEXPORT jobjectArray JNICALL
Java_java_io_UnixFileSystem_list(JNIEnv *env, jobject this,
                                 jobject file)
{
    DIR *dir = NULL;
    struct dirent64 *ptr;
    struct dirent64 *result;
    int len, maxlen;
    jobjectArray rv, old;

    WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) {
        dir = opendir(path);
    } END_PLATFORM_STRING(env, path);
    if (dir == NULL) return NULL;

    ptr = malloc(sizeof(struct dirent64) + (PATH_MAX + 1));
    if (ptr == NULL) {
        JNU_ThrowOutOfMemoryError(env, "heap allocation failed");
        closedir(dir);
        return NULL;
    }

    /* Allocate an initial String array */
    len = 0;
    maxlen = 16;
    rv = (*env)->NewObjectArray(env, maxlen, JNU_ClassString(env), NULL);
    if (rv == NULL) goto error;

    /* Scan the directory */
    while ((readdir64_r(dir, ptr, &result) == 0) && (result != NULL)) {
        jstring name;
        if (!strcmp(ptr->d_name, ".") || !strcmp(ptr->d_name, ".."))
            continue;
        if (len == maxlen) {
            old = rv;
            rv = (*env)->NewObjectArray(env, maxlen <<= 1,
                                        JNU_ClassString(env), NULL);
            if (rv == NULL) goto error;
            if (JNU_CopyObjectArray(env, rv, old, len) < 0) goto error;
            (*env)->DeleteLocalRef(env, old);
        }
#ifdef MACOSX
        name = newStringPlatform(env, ptr->d_name);
#else
        name = JNU_NewStringPlatform(env, ptr->d_name);
#endif
        if (name == NULL) goto error;
        (*env)->SetObjectArrayElement(env, rv, len++, name);
        (*env)->DeleteLocalRef(env, name);
    }
    closedir(dir);
    free(ptr);

    /* Copy the final results into an appropriately-sized array */
    old = rv;
    rv = (*env)->NewObjectArray(env, len, JNU_ClassString(env), NULL);
    if (rv == NULL) {
        return NULL;
    }
    if (JNU_CopyObjectArray(env, rv, old, len) < 0) {
        return NULL;
    }
    return rv;

 error:
    closedir(dir);
    free(ptr);
    return NULL;
}

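So the listing ultimately comes from the C library function readdir_r. readdir_r(3) - Linux manual page says nothing about the order of file names, but it does tell us that glibc has deprecated readdir_r since version 2.24 and recommends the readdir function instead. readdir(3) - Linux manual page contains the following statement about file-name order: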

The order in which filenames are read by successive calls to readdir() depends on the filesystem implementation; it is unlikely that the names will be sorted in any fashion.

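In other words, the order of the returned file names depends on the underlying filesystem implementation and is unlikely to be sorted by file name. For more detail see Chris’s Wiki :: blog/unix/ReaddirOrder, which describes several filesystem implementations, such as array-based and balanced-tree-based directories. When a directory is backed by a balanced tree, entries may be inserted by a hash of the file name, which is why the order often looks random. This is consistent with the comments in the JDK source. In short, a well-constructed application should not depend on any particular file order.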
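Since the order cannot be relied on, it helps to detect the conflict itself. Below is a rough sketch, under stated assumptions, of two checks: one scans a lib directory for class entries that appear in more than one jar, the other reports where an already loaded class came from. The class DuplicateClassFinder and its methods are hypothetical names; only standard JDK APIs (JarFile, ProtectionDomain, CodeSource) are used, and skipping module-info.class is a choice made for this sketch.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper for same-named classes in different jars.
class DuplicateClassFinder {

    // Maps every class entry found in two or more jars of libDir to those jar names.
    static Map<String, List<String>> findDuplicates(File libDir) {
        Map<String, List<String>> owners = new TreeMap<>();
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return owners; // not a directory, or an I/O error occurred
        }
        Arrays.sort(jars); // only to make the report itself deterministic
        for (File jar : jars) {
            try (JarFile jarFile = new JarFile(jar)) {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    if (name.endsWith(".class") && !name.endsWith("module-info.class")) {
                        owners.computeIfAbsent(name, k -> new ArrayList<>()).add(jar.getName());
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        owners.values().removeIf(list -> list.size() < 2);
        return owners;
    }

    // Describes where a loaded class came from, e.g. the URL of the winning jar.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        File libDir = new File(args.length > 0 ? args[0] : "WEB-INF/lib");
        findDuplicates(libDir).forEach((cls, jars) -> System.out.println(cls + " -> " + jars));
        System.out.println("This class was loaded from: " + locationOf(DuplicateClassFinder.class));
    }
}
```

Running findDuplicates against a packaged /WEB-INF/lib before a release would point at the dependency to exclude, and locationOf can confirm at runtime which jar actually won on a given machine.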

Reference

Class Path Wild Cards
Order of loading jar files from lib directory - Stack Overflow
Does readdir() guarantee an order? - Stack Overflow