Poison



Kubernetes #66607

Posted on 2021-11-08

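After deploying some applications to a K8s cluster, we ran into the following problem. Suppose application A and application B both run in the cluster, and application B sits behind a load balancer that exposes its HTTP interface to the public internet over HTTPS, with the load balancer decrypting the HTTPS traffic at layer 7. In this setup, when application A calls the load balancer's HTTPS endpoint, K8s routes the load balancer IP directly to the node running application B, so the traffic never passes through the load balancer. The HTTPS traffic is therefore sent to the port of the plain HTTP service, and the SSL handshake fails.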

Part of the exception stack trace on the Java side:

Caused by: javax.net.ssl.SSLException: Unsupported or unrecognized SSL message
at sun.security.ssl.SSLSocketInputRecord.handleUnknownRecord(SSLSocketInputRecord.java:448)
at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:174)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:110)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1290)
at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1199)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:401)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:373)

The error output from the curl command:

* TCP_NODELAY set
* Connected to lb_domain (39.103.236.188) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* error:1408F10B:SSL routines:ssl3_get_record:wrong version number
* stopped the pause stream!
* Closing connection 0
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number
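
Both errors above are what a TLS client reports when the peer answers in plaintext. As a rough way to reproduce the symptom locally without a cluster, the sketch below starts a plain HTTP-like server on a loopback port and attempts a TLS handshake against it. The class and method names (`TlsToPlainHttpDemo`, `handshakeAgainstPlainHttp`) are made up for illustration, and the exact exception message depends on the JDK version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical demo class: a TLS client talking to a plaintext port.
class TlsToPlainHttpDemo {

    /** Attempts a TLS handshake against a plaintext server and describes the outcome. */
    static String handshakeAgainstPlainHttp() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread plainServer = new Thread(() -> servePlainHttpOnce(server));
            plainServer.setDaemon(true);
            plainServer.start();

            try (SSLSocket client = (SSLSocket) SSLSocketFactory.getDefault()
                    .createSocket(loopback, server.getLocalPort())) {
                client.setSoTimeout(5000);
                client.startHandshake(); // sends ClientHello, expects a TLS record back
                return "handshake unexpectedly succeeded";
            } catch (SSLException e) {
                return e.getClass().getSimpleName() + ": " + e.getMessage();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Accepts one connection and answers with plaintext HTTP, as an HTTP-only backend would. */
    private static void servePlainHttpOnce(ServerSocket server) {
        try (Socket socket = server.accept()) {
            socket.setSoTimeout(5000);
            InputStream in = socket.getInputStream();
            OutputStream out = socket.getOutputStream();
            byte[] buffer = new byte[4096];
            in.read(buffer); // wait for the first bytes of the ClientHello
            out.write("HTTP/1.1 400 Bad Request\r\nConnection: close\r\n\r\n"
                    .getBytes(StandardCharsets.US_ASCII));
            out.flush();
            while (in.read(buffer) != -1) {
                // drain until the client closes, so the reply is not cut short by a reset
            }
        } catch (IOException ignored) {
            // the client aborts the connection once the handshake fails
        }
    }

    public static void main(String[] args) {
        System.out.println(handshakeAgainstPlainHttp());
    }
}
```

On a recent JDK the printed line should resemble the first line of the stack trace shown earlier, since the client receives the bytes `HTTP/...` where it expects a TLS record header.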

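The route can be inspected with the traceroute command. For discussion of the issue and possible solutions, see the links below. Because the issue had not yet been fixed upstream, we ultimately resolved it by introducing an Ingress to route the traffic.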

References

DigitalOcean Kubernetes and SSL wrong version number error for the requests from inside a pod | by ismail yenigül | FAUN Publication
Why kube-proxy add external-lb’s address to node local iptables rule? · Issue #66607 · kubernetes/kubernetes · GitHub
enhancements/keps/sig-network/1860-kube-proxy-IP-node-binding at master · kubernetes/enhancements · GitHub
Add IP mode field to loadbalancer status ingress by Sh4d1 · Pull Request #97681 · kubernetes/kubernetes · GitHub


RateLimiter

Posted on 2021-11-03

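I previously used the offline edition of Alibaba Cloud's IP geolocation library. The SDK used RateLimiter to cap the caller's invocation rate; as I recall, the limit was 150,000. Early versions tried to obtain a permit with tryAcquire, the non-blocking variant. As a result, after we integrated the SDK into a Spark cluster, offline jobs failed whenever they exceeded the QPS limit. After we reported this, they switched the rate limiting to the blocking acquire method, which kept the limit in place while letting offline jobs run normally. This post briefly records how RateLimiter implements rate limiting.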

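First, per the official RateLimiter documentation, RateLimiter distributes permits at a configurable rate. It is safe for concurrent use and restricts the total rate of calls from all threads, but it does not guarantee fairness.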
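
To make the difference between the non-blocking and blocking styles concrete, here is a minimal sketch in plain Java. It is not Guava's implementation, which is more elaborate; `SimpleRateLimiter` and its helper methods are hypothetical names, and the limiter simply spaces permits at a fixed interval.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified limiter: one permit per fixed interval, no stored permits.
class SimpleRateLimiter {
    private final long intervalNanos;
    private long nextFreeNanos; // earliest time at which the next permit may be granted

    SimpleRateLimiter(double permitsPerSecond) {
        this.intervalNanos = (long) (TimeUnit.SECONDS.toNanos(1) / permitsPerSecond);
        this.nextFreeNanos = System.nanoTime();
    }

    /** Non-blocking: returns false immediately if no permit is available yet. */
    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        if (now - nextFreeNanos < 0) {
            return false;
        }
        nextFreeNanos = now + intervalNanos;
        return true;
    }

    /** Blocking: reserves the next slot, then sleeps until that slot arrives. */
    void acquire() {
        long waitNanos;
        synchronized (this) {
            long now = System.nanoTime();
            waitNanos = Math.max(0L, nextFreeNanos - now);
            nextFreeNanos = now + waitNanos + intervalNanos;
        }
        try {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a permit", e);
        }
    }

    /** Counts how many of the records get a permit when the caller never waits. */
    static int processNonBlocking(SimpleRateLimiter limiter, int records) {
        int processed = 0;
        for (int i = 0; i < records; i++) {
            if (limiter.tryAcquire()) {
                processed++; // a real SDK might instead fail the call here
            }
        }
        return processed;
    }

    /** Counts how many of the records get a permit when the caller waits for each one. */
    static int processBlocking(SimpleRateLimiter limiter, int records) {
        int processed = 0;
        for (int i = 0; i < records; i++) {
            limiter.acquire();
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processNonBlocking(new SimpleRateLimiter(1), 5));
        System.out.println(processBlocking(new SimpleRateLimiter(20), 5));
    }
}
```

With the non-blocking style a tight loop such as a batch job loses most of its calls, while the blocking style slows the loop down to the configured rate and every call eventually goes through. That is the behavior change described above for the offline computation case.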


Decorator

Posted on 2021-10-31

While writing scheduled-task logic on top of ScheduledExecutorService (Java Platform SE 8), I found that if a task throws an exception, it is never scheduled again. The documentation says as much:

If any execution of the task encounters an exception, subsequent executions are suppressed.

Take the following code, for example:

package me.tianshuang;

import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ScheduledExecutorServiceTest {

    public static void main(String[] args) {
        Executors.newSingleThreadScheduledExecutor()
                .scheduleAtFixedRate(() -> System.out.println("I'm running..."), 1, 1, TimeUnit.SECONDS);
    }

}
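
One common way to keep a periodic task alive, sketched here as an assumption rather than as the approach the full post takes, is to wrap the task in a decorator that catches and logs exceptions. The names `ExceptionSafeRunnable` and `countRuns` are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical decorator: same Runnable interface, but exceptions no longer escape run().
class ExceptionSafeRunnable implements Runnable {
    private final Runnable delegate;

    ExceptionSafeRunnable(Runnable delegate) {
        this.delegate = delegate;
    }

    @Override
    public void run() {
        try {
            delegate.run();
        } catch (Exception e) {
            // Swallow and log so the scheduler keeps the task; Errors still propagate.
            System.err.println("Scheduled task failed: " + e);
        }
    }

    /** Schedules an always-failing task for a short while and returns how often it ran. */
    static int countRuns(boolean decorate) {
        AtomicInteger runs = new AtomicInteger();
        Runnable failing = () -> {
            runs.incrementAndGet();
            throw new IllegalStateException("boom");
        };
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            Runnable task = decorate ? new ExceptionSafeRunnable(failing) : failing;
            scheduler.scheduleAtFixedRate(task, 0, 20, TimeUnit.MILLISECONDS);
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("undecorated runs: " + countRuns(false));
        System.out.println("decorated runs:   " + countRuns(true));
    }
}
```

The undecorated task runs exactly once and is then suppressed, as the documentation describes, while the decorated task keeps firing on every period.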

Isolated Agent Classloader

Posted on 2021-10-30

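Before I actually developed a Java Agent, I did not understand why an agent needs class-loading isolation. Only when the business required a Java Agent for application monitoring did I, during development, gain a deeper understanding of the class loader hierarchy and class isolation. This post briefly records that.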

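Our early monitoring Java Agent had no class-loading isolation, because the initial implementation was very simple: it only watched for heap dump files and raised an alert, so the agent had no dependencies at all. As the business grew and more and more dependencies were added to the agent, we found that attaching it to JVM applications triggered all kinds of class-loading exceptions, such as "X cannot be cast to X".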


Executor of Tomcat

Posted on 2021-10-19

First, let's look at how java.util.concurrent.ThreadPoolExecutor in the JDK submits a task:

/**
 * Executes the given task sometime in the future.  The task
 * may execute in a new thread or in an existing pooled thread.
 *
 * If the task cannot be submitted for execution, either because this
 * executor has been shutdown or because its capacity has been reached,
 * the task is handled by the current {@code RejectedExecutionHandler}.
 *
 * @param command the task to execute
 * @throws RejectedExecutionException at discretion of
 *         {@code RejectedExecutionHandler}, if the task
 *         cannot be accepted for execution
 * @throws NullPointerException if {@code command} is null
 */
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    else if (!addWorker(command, false))
        reject(command);
}

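From the code and comments above, we can see that in the JDK's thread pool, when the number of workers is below corePoolSize, the pool tries to create a worker to run the task. When the number of workers is at or above corePoolSize, it tries to put the task on the queue, and only if queuing fails does it try to create a worker to run the task. In other words, the JDK implementation prefers to create as few threads as possible and to queue first, which makes it a better fit for CPU-bound tasks.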
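
The queue-first behavior is easy to observe directly. The sketch below uses a pool with corePoolSize 1 and maximumPoolSize 4, submits tasks that block on a latch, and reports the pool size and queue length. The class name `QueueFirstDemo` and the chosen sizes are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of ThreadPoolExecutor's "core threads, then queue, then extra threads" order.
class QueueFirstDemo {

    /** Submits blocking tasks and returns {poolSize, queuedTasks} before releasing them. */
    static int[] submitBlockingTasks(int tasks, int queueCapacity) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(queueCapacity));
        Runnable blocking = () -> {
            try {
                release.await(); // keep the worker busy until the snapshot is taken
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(blocking);
            }
            return new int[] {pool.getPoolSize(), pool.getQueue().size()};
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Roomy queue: the pool never grows beyond the single core thread.
        System.out.println(Arrays.toString(submitBlockingTasks(5, 10))); // prints [1, 4]
        // Tiny queue: extra threads appear only after the queue is full.
        System.out.println(Arrays.toString(submitBlockingTasks(5, 2)));  // prints [3, 2]
    }
}
```

With a queue capacity of 10, all four tasks after the first wait in the queue even though the pool could grow to four threads. With a capacity of 2, the fourth and fifth tasks cannot be queued, so only then are non-core workers created.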
