Poison



OuterJoinBehavior

Posted on 2021-07-22

A business team reported that one of their SQL queries triggered an OOM when run on Hive on Spark. The original SQL, simplified, is as follows:

SELECT c.name, COUNT(DISTINCT u.user_id) AS `uv`, COUNT(*) AS `pv`
FROM u
LEFT JOIN w ON get_json_object(u.param, '$.id') = w.id
LEFT JOIN c ON w.cat_id = c.id
WHERE u.page = 'xxx.htm' AND c.name LIKE '%筛选%' AND u.dt >= '20210715'
GROUP BY c.name
ORDER BY `pv` DESC
LIMIT 10;

Here u.dt is the partition column of table u. By the semantics of this SQL, only partitions with dt >= 20210715 need to be scanned. However, inspecting the visualized execution plan and the logical plan showed that the filter on u.dt was not pushed down to table u; instead, it was applied only after the LEFT JOIN of u and w. As a result the entire u table was scanned, and since u is quite large, the query eventually triggered an OOM.

I then consulted Hive's documentation on outer join behavior: OuterJoinBehavior - Apache Hive - Apache Software Foundation. It describes in detail when predicate pushdown can and cannot be performed (in terms of "During Join" and "After Join" predicates against the preserved-row and null-supplying tables), but all of its examples involve only two tables; it does not explain the pushdown rules for joins over more than two tables. Following the document's guidance, we rewrote the SQL to first LEFT JOIN two of the tables in a subquery and then LEFT JOIN the result with the third table. With that shape the predicate was pushed down successfully and the problem was resolved.

For example, both of the following rewrites achieve predicate pushdown:

SELECT c.name, COUNT(DISTINCT t.user_id) AS `uv`, COUNT(*) AS `pv`
FROM (SELECT u.user_id, w.cat_id
      FROM u
      LEFT JOIN w ON get_json_object(u.param, '$.id') = w.id
      WHERE u.dt >= '20210715' AND u.page = 'xxx.htm') t
LEFT JOIN c ON t.cat_id = c.id AND c.name LIKE '%筛选%'
GROUP BY c.name
ORDER BY `pv` DESC
LIMIT 10;

SELECT t.name, COUNT(DISTINCT u.user_id) AS `uv`, COUNT(*) AS `pv`
FROM u
LEFT JOIN (SELECT w.id, c.name
           FROM w
           LEFT JOIN c ON w.cat_id = c.id
           WHERE c.name LIKE '%筛选%') t
  ON get_json_object(u.param, '$.id') = t.id
WHERE u.dt >= '20210715' AND u.page = 'xxx.htm'
GROUP BY t.name
LIMIT 10;
Update (2021-12-02)

While reading 《Spark SQL内核剖析》 today, I found that the example in Chapter 11, "SQL 写法的陷阱" (pitfalls in how SQL is written), third subsection, is exactly the content of the link cited above: OuterJoinBehavior - Apache Hive - Apache Software Foundation.

References

《Hive 编程指南》 (Programming Hive)
Understanding Spark’s Logical and Physical Plan in layman’s term


On Resource Lookup Inside Zip Files

Posted on 2021-07-09
InputStream inputStream = loader.getResourceAsStream(ipDirInJar + File.separator + dexFileName);

A colleague reported that with the code above, inputStream was null when running locally. The colleague was on Windows, so I initially suspected an encoding or path-separator issue. Local debugging showed that when looking up a file whose name contains path separators inside a jar, the call eventually reaches the following method, which returned 0 on Windows:

/Library/Java/JavaVirtualMachines/openjdk-8.jdk/Contents/Home/src.zip!/java/util/zip/ZipFile.java:335

private static native long getEntry(long jzfile, byte[] name, boolean addSlash);

Consulting the documentation revealed that path separators inside a zip file must be /; the platform-dependent file-system separator cannot be used. Changing the separator to / fixed the problem.

4.4.17.1 The name of the file, with optional relative path. The path stored MUST NOT contain a drive or device letter, or a leading slash. All slashes MUST be forward slashes '/' as opposed to backwards slashes '\' for compatibility with Amiga and UNIX file systems etc. If input came from standard input, there is no file name field.
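The behavior is easy to reproduce without a jar at all, since jars are zip files. The sketch below (a standalone demo, not the original project code) writes a zip with one entry under a sub-directory, then looks it up with both separators:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipSeparatorDemo {
    public static void main(String[] args) throws IOException {
        // Create a temporary zip containing one entry under a sub-directory.
        Path zipPath = Files.createTempFile("demo", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            zos.putNextEntry(new ZipEntry("dir/data.txt")); // entry names always use '/'
            zos.write("hello".getBytes());
            zos.closeEntry();
        }
        try (ZipFile zf = new ZipFile(zipPath.toFile())) {
            // Lookup with '/' succeeds on every platform...
            System.out.println(zf.getEntry("dir/data.txt") != null);  // true
            // ...while a backslash-separated name finds nothing, even on Windows,
            // because '\' is an ordinary character inside a zip entry name.
            System.out.println(zf.getEntry("dir\\data.txt") != null); // false
        }
        Files.delete(zipPath);
    }
}
```

This is why building the resource path with File.separator broke only on Windows: on other platforms File.separator happens to coincide with the zip separator.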

References

.ZIP File Format Specification
Zip file is created with windows path separator


Jar Loading Order Under a Wildcard Classpath Entry

Posted on 2021-07-01

If the classpath contains a wildcard entry such as /tmp/jars/*, and different jars under that directory happen to contain classes with the same fully qualified name, which jar's class gets loaded first? I had never pinned this down; my impression was simply that the order is undefined. But why is it undefined? It took a production deployment, where one machine reported a class-loading error, for me to trace the question through concretely. Let's start with Java's system class loader. We can write a simple program to verify, for example:

package me.tianshuang;

public class Test {

    public static void main(String[] args) {
        System.out.println(System.getProperty("java.class.path"));
    }

}

This code simply prints the runtime classpath. Running it directly produces the following output:

/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/jaccess.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/jfxrt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/nashorn.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/javaws.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jfxswt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/plugin.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/ant-javafx.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/javafx-mx.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/jconsole.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/packager.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/sa-jdi.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/lib/tools.jar:/Users/tianshuang/IdeaProjects/test/target/test-classes:/Users/tianshuang/IdeaProjects/test/target/classes:/Users/tianshuang/.m2/repository/junit/junit/4.12/junit-4.12.jar:/Users/tianshuang/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar
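The underlying reason the order is undefined: the launcher expands a wildcard entry by enumerating the directory, and the JDK documentation states that the order in which jars in a directory are enumerated in the expanded classpath is not specified. The sketch below (a standalone demo, with a hypothetical temp directory standing in for /tmp/jars) shows the Java-side analogue: File.list() makes no ordering guarantee either, so deterministic behavior requires listing or sorting jars yourself.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

public class DirOrderDemo {
    public static void main(String[] args) throws IOException {
        // Simulate a directory of jars referenced as "dir/*" on the classpath.
        File dir = Files.createTempDirectory("jars").toFile();
        for (String name : new String[]{"b.jar", "a.jar", "c.jar"}) {
            new File(dir, name).createNewFile();
        }
        // The enumeration order is whatever the file system returns;
        // the File.list() API explicitly does not guarantee any order.
        String[] names = dir.list();
        System.out.println(Arrays.toString(names));
        // To make class resolution deterministic, enumerate and sort the jars
        // yourself before building the classpath.
        Arrays.sort(names);
        System.out.println(Arrays.toString(names)); // [a.jar, b.jar, c.jar]
    }
}
```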

Continuous Full GC

Posted on 2021-07-01

While the JVM is performing a Full GC, all application threads are paused, so any Java-agent-based monitoring running inside that JVM stops working. With JMX Exporter, for example, the symptom is that the scrape target goes offline: when Prometheus requests data from the agent, the target JVM is busy with the Full GC and cannot serve the request. Thanks to health checks and similar components the impact is usually limited, since an instance probed as unavailable is removed automatically. If you need real-time GC monitoring under these conditions, the monitor simply must not run inside the same JVM instance; eero, for instance, tails gc.log with a Python process and reports from there. If real-time visibility into Full GC is not required, you can use micrometer's JvmGcMetrics, which updates GC data via notifications built on JMX's NotificationEmitter. When a Full GC occurs, the data may not be collected immediately because of the stop-the-world pause, but a later scrape will still pick it up. The source is at JvmGcMetrics.java at v1.8.2:

class GcMetricsNotificationListener implements NotificationListener {
    private final MeterRegistry registry;

    GcMetricsNotificationListener(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void handleNotification(Notification notification, Object ref) {
        CompositeData cd = (CompositeData) notification.getUserData();
        GarbageCollectionNotificationInfo notificationInfo = GarbageCollectionNotificationInfo.from(cd);

        String gcCause = notificationInfo.getGcCause();
        String gcAction = notificationInfo.getGcAction();
        GcInfo gcInfo = notificationInfo.getGcInfo();
        long duration = gcInfo.getDuration();
        if (isConcurrentPhase(gcCause, notificationInfo.getGcName())) {
            Timer.builder("jvm.gc.concurrent.phase.time")
                    .tags(tags)
                    .tags("action", gcAction, "cause", gcCause)
                    .description("Time spent in concurrent phase")
                    .register(registry)
                    .record(duration, TimeUnit.MILLISECONDS);
        } else {
            Timer.builder("jvm.gc.pause")
                    .tags(tags)
                    .tags("action", gcAction, "cause", gcCause)
                    .description("Time spent in GC pause")
                    .register(registry)
                    .record(duration, TimeUnit.MILLISECONDS);
        }

        final Map<String, MemoryUsage> before = gcInfo.getMemoryUsageBeforeGc();
        final Map<String, MemoryUsage> after = gcInfo.getMemoryUsageAfterGc();

        countPoolSizeDelta(before, after);

        final long longLivedBefore = longLivedPoolNames.stream().mapToLong(pool -> before.get(pool).getUsed()).sum();
        final long longLivedAfter = longLivedPoolNames.stream().mapToLong(pool -> after.get(pool).getUsed()).sum();
        if (isGenerationalGc) {
            final long delta = longLivedAfter - longLivedBefore;
            if (delta > 0L) {
                promotedBytes.increment(delta);
            }
        }

        // Some GC implementations such as G1 can reduce the old gen size as part of a minor GC. To track the
        // live data size we record the value if we see a reduction in the long-lived heap size or
        // after a major/non-generational GC.
        if (longLivedAfter < longLivedBefore || shouldUpdateDataSizeMetrics(notificationInfo.getGcName())) {
            liveDataSize.set(longLivedAfter);
            maxDataSize.set(longLivedPoolNames.stream().mapToLong(pool -> after.get(pool).getMax()).sum());
        }
    }

    private void countPoolSizeDelta(Map<String, MemoryUsage> before, Map<String, MemoryUsage> after) {
        if (allocationPoolName == null) {
            return;
        }
        final long beforeBytes = before.get(allocationPoolName).getUsed();
        final long afterBytes = after.get(allocationPoolName).getUsed();
        final long delta = beforeBytes - allocationPoolSizeAfter.get();
        allocationPoolSizeAfter.set(afterBytes);
        if (delta > 0L) {
            allocatedBytes.increment(delta);
        }
    }

    private boolean shouldUpdateDataSizeMetrics(String gcName) {
        return nonGenerationalGcShouldUpdateDataSize(gcName) || isMajorGenerationalGc(gcName);
    }

    private boolean isMajorGenerationalGc(String gcName) {
        return GcGenerationAge.fromGcName(gcName) == GcGenerationAge.OLD;
    }

    private boolean nonGenerationalGcShouldUpdateDataSize(String gcName) {
        return !isGenerationalGc
                // Skip Shenandoah and ZGC gc notifications with the name Pauses due to missing memory pool size info
                && !gcName.endsWith("Pauses");
    }
}

The GC data originates from the GarbageCollectorImpl class, which extends MemoryManagerImpl, which in turn extends NotificationEmitterSupport to support sending GC event notifications; see the links at the end for the relevant source.
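The same notification mechanism can be consumed directly, without micrometer. The sketch below (a minimal standalone demo, assuming a HotSpot JVM where every GarbageCollectorMXBean also implements NotificationEmitter) registers a listener that prints each collection's name, cause, and duration:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;
import com.sun.management.GarbageCollectionNotificationInfo;

public class GcListenerDemo {
    public static void main(String[] args) {
        NotificationListener listener = (Notification n, Object handback) -> {
            if (GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals(n.getType())) {
                GarbageCollectionNotificationInfo info =
                        GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
                System.out.println(info.getGcName() + " (" + info.getGcCause() + "): "
                        + info.getGcInfo().getDuration() + " ms");
            }
        };
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // On HotSpot the GC MXBeans implement NotificationEmitter, as the
            // GarbageCollectorImpl / NotificationEmitterSupport hierarchy shows.
            ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
        }
        System.gc(); // best effort: trigger a collection so the listener fires
    }
}
```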

Similarly, micrometer's JvmHeapPressureMetrics can be used to monitor heap memory pressure: by tracking the post-GC memory usage ratio and the fraction of time spent in GC, it derives a heap-pressure signal. The source is at micrometer/JvmHeapPressureMetrics.java at main · micrometer-metrics/micrometer · GitHub. The idea is similar to TeamCity Memory Monitor | TeamCity On-Premises.
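The "fraction of time spent in GC" half of that signal can be approximated with nothing but the standard MXBeans. A rough sketch (not micrometer's actual implementation, which uses a sliding time window):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverheadDemo {
    public static void main(String[] args) {
        // Cumulative collection time across all collectors, in milliseconds.
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if undefined for this collector
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        // Since-startup average, whereas a heap-pressure metric would use a
        // recent window; still useful as a coarse "GC overhead" indicator.
        double overhead = uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
        System.out.printf("GC overhead: %.4f%n", overhead);
    }
}
```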

References

Garbagedog: How eero does continuous monitoring of Java garbage collection
garbagedog
jdk/GarbageCollectorImpl.java at jdk8-b120 · openjdk/jdk · GitHub
jdk/MemoryManagerImpl.java at jdk8-b120 · openjdk/jdk · GitHub
jdk/NotificationEmitterSupport.java at jdk8-b120 · openjdk/jdk · GitHub


jmap

Posted on 2021-07-01

The source is at JMap.java at jdk8-b120, where you can see the argument parsing and the invocation of the actual command. For a heap dump, the call reaches attachListener.cpp at jdk8-b120:

// Implementation of "dumpheap" command.
// See also: HeapDumpDCmd class
//
// Input arguments :-
//   arg0: Name of the dump file
//   arg1: "-live" or "-all"
jint dump_heap(AttachOperation* op, outputStream* out) {
  const char* path = op->arg(0);
  if (path == NULL || path[0] == '\0') {
    out->print_cr("No dump file specified");
  } else {
    bool live_objects_only = true;   // default is true to retain the behavior before this change is made
    const char* arg1 = op->arg(1);
    if (arg1 != NULL && (strlen(arg1) > 0)) {
      if (strcmp(arg1, "-all") != 0 && strcmp(arg1, "-live") != 0) {
        out->print_cr("Invalid argument to dumpheap operation: %s", arg1);
        return JNI_ERR;
      }
      live_objects_only = strcmp(arg1, "-live") == 0;
    }

    // Request a full GC before heap dump if live_objects_only = true
    // This helps reduces the amount of unreachable objects in the dump
    // and makes it easier to browse.
    HeapDumper dumper(live_objects_only /* request GC */);
    int res = dumper.dump(op->arg(0));
    if (res == 0) {
      out->print_cr("Heap dump file created");
    } else {
      // heap dump failed
      ResourceMark rm;
      char* error = dumper.error_as_C_string();
      if (error == NULL) {
        out->print_cr("Dump failed - reason unknown");
      } else {
        out->print_cr("%s", error);
      }
    }
  }
  return JNI_OK;
}
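For comparison, the same dump can be requested in-process via JMX, without shelling out to jmap. A small sketch (a standalone demo assuming a HotSpot JVM; the dump path is a hypothetical temp-directory file):

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumpDemo {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean diagnostic = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        File dump = new File(System.getProperty("java.io.tmpdir"), "demo.hprof");
        dump.delete(); // dumpHeap fails if the target file already exists
        // The boolean plays the role of jmap's -live flag: true dumps only
        // reachable objects, which first requests a full GC, matching the
        // live_objects_only branch in the C++ code above.
        diagnostic.dumpHeap(dump.getAbsolutePath(), true /* live objects only */);
        System.out.println(dump.length() > 0); // true
        dump.delete();
    }
}
```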
© 2025 Poison 蜀ICP备16000644号