Poison



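Parsing Binary Logical Expressions in Spark SQL

Posted on 2021-12-14

While reading the Spark SQL source today, I noticed that binary logical expressions are parsed into a balanced binary tree in order to avoid the performance degradation caused by left-recursive trees. The grammar rule for boolean expressions is defined in SqlBase.g4 at v2.4.4: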

booleanExpression
    : NOT booleanExpression #logicalNot
    | EXISTS '(' query ')' #exists
    | valueExpression predicate? #predicated
    | left=booleanExpression operator=AND right=booleanExpression #logicalBinary
    | left=booleanExpression operator=OR right=booleanExpression #logicalBinary
    ;

Binary logical expressions correspond to the logicalBinary rule, which is handled by the following code in AstBuilder.scala at v2.4.4:

/**
 * Combine a number of boolean expressions into a balanced expression tree. These expressions are
 * either combined by a logical [[And]] or a logical [[Or]].
 *
 * A balanced binary tree is created because regular left recursive trees cause considerable
 * performance degradations and can cause stack overflows.
 */
override def visitLogicalBinary(ctx: LogicalBinaryContext): Expression = withOrigin(ctx) {
  val expressionType = ctx.operator.getType
  val expressionCombiner = expressionType match {
    case SqlBaseParser.AND => And.apply _
    case SqlBaseParser.OR => Or.apply _
  }

  // Collect all similar left hand contexts.
  val contexts = ArrayBuffer(ctx.right)
  var current = ctx.left
  def collectContexts: Boolean = current match {
    case lbc: LogicalBinaryContext if lbc.operator.getType == expressionType =>
      contexts += lbc.right
      current = lbc.left
      true
    case _ =>
      contexts += current
      false
  }
  while (collectContexts) {
    // No body - all updates take place in the collectContexts.
  }

  // Reverse the contexts to have them in the same sequence as in the SQL statement & turn them
  // into expressions.
  val expressions = contexts.reverseMap(expression)

  // Create a balanced tree.
  def reduceToExpressionTree(low: Int, high: Int): Expression = high - low match {
    case 0 =>
      expressions(low)
    case 1 =>
      expressionCombiner(expressions(low), expressions(high))
    case x =>
      val mid = low + x / 2
      expressionCombiner(
        reduceToExpressionTree(low, mid),
        reduceToExpressionTree(mid + 1, high))
  }
  reduceToExpressionTree(0, expressions.size - 1)
}

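The core logic is the final part: reduceToExpressionTree builds the tree top-down from the flattened expression list, splitting the range in half at each step so that the tree stays as shallow as possible. case 0 and case 1 are the base cases of the recursion. It is a rather neat piece of code.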
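To make the difference in tree height concrete, here is a minimal Java sketch that mirrors the splitting scheme of reduceToExpressionTree and compares it with a plain left-deep fold. The Node class and the method names balanced and leftDeep are hypothetical stand-ins for illustration only; they are not Spark classes.

```java
import java.util.ArrayList;
import java.util.List;

class BalancedTreeSketch {
    // Hypothetical expression node: a leaf carries a name, an inner node has two children.
    static final class Node {
        final String name;
        final Node left;
        final Node right;

        Node(String name) { this.name = name; this.left = null; this.right = null; }
        Node(Node left, Node right) { this.name = null; this.left = left; this.right = right; }

        // Height in edges: 0 for a leaf.
        int height() {
            return left == null ? 0 : 1 + Math.max(left.height(), right.height());
        }
    }

    // Same range-splitting idea as reduceToExpressionTree: halve [low, high] at each level.
    static Node balanced(List<Node> xs, int low, int high) {
        int span = high - low;
        if (span == 0) return xs.get(low);                          // single expression
        if (span == 1) return new Node(xs.get(low), xs.get(high));  // exactly two expressions
        int mid = low + span / 2;
        return new Node(balanced(xs, low, mid), balanced(xs, mid + 1, high));
    }

    // What a naive left-recursive parse would produce: ((a AND b) AND c) AND d ...
    static Node leftDeep(List<Node> xs) {
        Node acc = xs.get(0);
        for (int i = 1; i < xs.size(); i++) acc = new Node(acc, xs.get(i));
        return acc;
    }

    public static void main(String[] args) {
        List<Node> leaves = new ArrayList<>();
        for (int i = 0; i < 1000; i++) leaves.add(new Node("c" + i));
        System.out.println("left-deep height: " + leftDeep(leaves).height());
        System.out.println("balanced height:  " + balanced(leaves, 0, leaves.size() - 1).height());
    }
}
```

For n operands the left-deep tree has height n - 1, while the balanced one has height ceil(log2 n), so any recursive traversal of the tree needs far fewer stack frames.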


MonotonicallyIncreasingID

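Posted on 2021-12-14

While reading the Spark SQL source recently, I came across the implementation of a method I had used long ago to obtain monotonically increasing ids; this post is a brief note on it. I once needed to generate a unique id for every record in an offline (batch) job and used Spark SQL's monotonically_increasing_id method for that. Its source is in functions.scala at v2.4.4: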

/**
 * A column expression that generates monotonically increasing 64-bit integers.
 *
 * The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive.
 * The current implementation puts the partition ID in the upper 31 bits, and the record number
 * within each partition in the lower 33 bits. The assumption is that the data frame has
 * less than 1 billion partitions, and each partition has less than 8 billion records.
 *
 * As an example, consider a `DataFrame` with two partitions, each with 3 records.
 * This expression would return the following IDs:
 *
 * {{{
 * 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
 * }}}
 *
 * @group normal_funcs
 * @since 1.6.0
 */
def monotonically_increasing_id(): Column = withExpr { MonotonicallyIncreasingID() }
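The bit layout described in the doc comment can be reproduced with a few lines of arithmetic. The following Java sketch only illustrates that layout; the class and method names (compose, partitionOf, recordOf) are hypothetical and this is not Spark's implementation.

```java
class MonotonicIdSketch {
    // Number of low bits reserved for the per-partition record number (from the doc comment).
    static final int RECORD_BITS = 33;

    // Partition index goes into the upper bits, record number into the lower 33 bits.
    static long compose(int partitionIndex, long recordNumber) {
        return ((long) partitionIndex << RECORD_BITS) + recordNumber;
    }

    // Recover the partition index from an id built by compose.
    static int partitionOf(long id) {
        return (int) (id >>> RECORD_BITS);
    }

    // Recover the per-partition record number from an id built by compose.
    static long recordOf(long id) {
        return id & ((1L << RECORD_BITS) - 1);
    }

    public static void main(String[] args) {
        // Two partitions with three records each, as in the doc comment's example.
        for (int partition = 0; partition < 2; partition++) {
            for (long record = 0; record < 3; record++) {
                System.out.println(compose(partition, record));
            }
        }
    }
}
```

Running main prints 0, 1, 2, 8589934592, 8589934593, 8589934594, matching the example in the doc comment. Within a partition the ids are consecutive, but every new partition jumps to the next multiple of 1L << 33, which is why the ids are unique and increasing yet not consecutive overall.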

Parsing JOIN Statements in Spark SQL

Posted on 2021-12-14

Spark SQL's grammar is defined in SqlBase.g4 at v2.4.4. The rules related to JOIN are as follows:

fromClause
    : FROM relation (',' relation)* lateralView* pivotClause?
    ;

relation
    : relationPrimary joinRelation*
    ;

joinRelation
    : (joinType) JOIN right=relationPrimary joinCriteria?
    | NATURAL joinType JOIN right=relationPrimary
    ;

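Why Expressions in Spring's @Profile Annotation Did Not Take Effect

Posted on 2021-12-08

Early on we used @Profile("!prepub") to keep scheduled tasks from running in the pre-release (prepub) environment. Later we also needed to disable them in the development environment, so we changed the expression to @Profile("!dev & !prepub"). The environment control immediately stopped working: scheduled tasks ran in every environment. I then tried @Profile("daily | product") and found that it had no effect either, so I stepped through the source briefly and am recording the findings here.

First, from the Profile (Spring Framework API) documentation we know: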

A profile expression allows for more complicated profile logic to be expressed, for example “p1 & p2”. See Profiles.of(String…) for more details about supported formats.

In other words, expressions such as p1 & p2 can be used in the configuration, and Profiles (Spring Framework API) gives examples of other expressions as well.
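For reference, the behavior we expected from the two expressions can be written out as plain boolean logic. The Java sketch below is only a statement of the intended semantics using hypothetical helper names; it does not use or reproduce Spring's own expression parser.

```java
import java.util.Set;

class ProfileExpressionSketch {
    // Intended meaning of "!dev & !prepub": enabled only when neither profile is active.
    static boolean notDevAndNotPrepub(Set<String> activeProfiles) {
        return !activeProfiles.contains("dev") && !activeProfiles.contains("prepub");
    }

    // Intended meaning of "daily | product": enabled when at least one of the two is active.
    static boolean dailyOrProduct(Set<String> activeProfiles) {
        return activeProfiles.contains("daily") || activeProfiles.contains("product");
    }

    public static void main(String[] args) {
        for (String profile : new String[] {"dev", "prepub", "daily", "product"}) {
            Set<String> active = Set.of(profile);
            System.out.println(profile
                    + ": !dev & !prepub = " + notDevAndNotPrepub(active)
                    + ", daily | product = " + dailyOrProduct(active));
        }
    }
}
```

Under these semantics the scheduled tasks should run in daily and product only, which is not what we observed.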


Dubbo #9361

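Posted on 2021-12-06

A business application reported that it could not find a Dubbo service provider. A quick investigation confirmed that the provider side had never registered the service. Looking at the code, the interface has several implementations; each one is exposed as a Dubbo service provider, and they are distinguished by Dubbo's service group feature.

The provider's startup log contains the following output: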

2021-12-06 22:32:48,432 WARN    org.apache.dubbo.config.context.ConfigManager:492 -  [DUBBO] Duplicate ServiceBean found, there already has one default ServiceBean or more than two ServiceBeans have the same id, you can try to give each ServiceBean a different id : <dubbo:service beanName="ServiceBean:me.tianshuang.service.TestService:1.0:group1" unexported="false" exported="false" ref="me.tianshuang.service.impl.TestServiceImpl@6ff7b3d5" interface="me.tianshuang.service.TestService" uniqueServiceName="group1/me.tianshuang.service.TestService:1.0" prefix="dubbo.service.me.tianshuang.service.TestService" deprecated="false" group="group1" dynamic="true" version="1.0" id="me.tianshuang.service.TestService" valid="true" />, dubbo version: 2.7.5, current host: 192.168.1.9

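Other users have already reported the same problem on GitHub: Dubbo 2.7.5: Duplicate ServiceBean found · Issue #5923, and the behavior described there matches what we saw in local testing. The log line above is emitted from ConfigManager.java at dubbo-2.7.5; the corresponding source is: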

static <C extends AbstractConfig> void addIfAbsent(C config, Map<String, C> configsMap, boolean unique)
        throws IllegalStateException {

    if (config == null || configsMap == null) {
        return;
    }

    if (unique) { // check duplicate
        configsMap.values().forEach(c -> {
            checkDuplicate(c, config);
        });
    }

    String key = getId(config);

    C existedConfig = configsMap.get(key);

    if (existedConfig != null && !config.equals(existedConfig)) {
        if (logger.isWarnEnabled()) {
            String type = config.getClass().getSimpleName();
            logger.warn(String.format("Duplicate %s found, there already has one default %s or more than two %ss have the same id, " +
                    "you can try to give each %s a different id : %s", type, type, type, type, config));
        }
    } else {
        configsMap.put(key, config);
    }
}
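The consequence of the final if/else is that a config whose id is already taken by a different config is only logged and never stored. The Java sketch below reproduces just that branch with plain strings. It is a simplified illustration, not Dubbo code: the boolean return value is added for demonstration, the group2 entry is a hypothetical second implementation, and the assumption that both ServiceBeans end up with the interface name as their id is taken from the id attribute shown in the log line above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AddIfAbsentSketch {
    // Simplified stand-in for the last branch of addIfAbsent.
    // Returns true if the config is stored, false if it is rejected with a warning.
    static boolean addIfAbsent(String id, String config, Map<String, String> configs) {
        String existing = configs.get(id);
        if (existing != null && !config.equals(existing)) {
            // Same id, different config: warn and drop the new one.
            System.out.println("WARN Duplicate config found for id " + id + ": " + config);
            return false;
        }
        configs.put(id, config);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        // Assumption: both service beans share the interface name as their id.
        String id = "me.tianshuang.service.TestService";
        addIfAbsent(id, "group2/me.tianshuang.service.TestService:1.0", configs);
        addIfAbsent(id, "group1/me.tianshuang.service.TestService:1.0", configs);
        // Only the first config is present; the second was dropped.
        System.out.println(configs);
    }
}
```

In this sketch whichever config arrives second is dropped, which is consistent with a warning being logged while one of the grouped services never gets registered.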