Pitfalls of Ribbon's AvailabilityFilteringRule (Spring Cloud Finchley.SR2)
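As the title says, this article is based on Spring Cloud Finchley.SR2.
Our project configures AvailabilityFilteringRule as the load-balancing rule for all Ribbon calls. What pitfalls does it have (both easy-to-misread behavior and things to watch out for)?
Let's start with the source code. The core is the choose method: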
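public Server choose(Object key) {
    int count = 0;
    // Pick a server by round robin
    Server server = roundRobinRule.choose(key);
    // Retry about 10 times (count++ <= 10 actually allows up to 11 checks);
    // if no server qualifies, give up and use the parent class's choose
    // Why only about 10 tries?
    // 1. Round-robin results of concurrent requests affect each other, so one request may get the same faulty server on every round-robin call
    // 2. In a large cluster, walking the whole cluster is inefficient. Assuming healthy instances outnumber unhealthy ones, falling back to the parent's choose after 10 misses is also a fail-fast mechanism
    while (count++ <= 10) {
        if (predicate.apply(new PredicateKey(server))) {
            return server;
        }
        server = roundRobinRule.choose(key);
    }
    return super.choose(key);
}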
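The retry-then-fallback shape of choose can be tried out without Ribbon on the classpath. The following is only a minimal sketch under assumptions: `BoundedRetryChooser`, `roundRobin`, `fallback`, and the two counters are hypothetical names, servers are plain strings, and `fallback` is a simple stand-in that does not reproduce what the real parent class does.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical stand-in for the rule; not Ribbon's actual class.
class BoundedRetryChooser {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger(0);
    // Illustration-only counters so the behavior can be observed.
    int predicateChecks = 0;
    int fallbackCalls = 0;

    BoundedRetryChooser(List<String> servers) {
        this.servers = servers;
    }

    // Plain round robin over all servers, healthy or not.
    String roundRobin() {
        int idx = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(idx);
    }

    // Assumed stand-in for super.choose(key): scan the whole list once.
    String fallback(Predicate<String> ok) {
        fallbackCalls++;
        for (String s : servers) {
            if (ok.test(s)) {
                return s;
            }
        }
        return null;
    }

    // Same loop shape as the choose method above.
    String choose(Predicate<String> ok) {
        int count = 0;
        String server = roundRobin();
        while (count++ <= 10) {
            predicateChecks++;
            if (ok.test(server)) {
                return server;
            }
            server = roundRobin();
        }
        return fallback(ok);
    }
}
```

With 12 unhealthy servers listed before a single healthy one, the loop checks 11 candidates, never reaches the healthy server, and only finds it through the fallback path. With a small list the loop usually succeeds on its own and the fallback is never called.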
So how does the round robin actually work, and why do the results affect each other?
Let's look at the source of RoundRobinRule:
// Thread-safe round-robin counter
private int incrementAndGetModulo(int modulo) {
    for (;;) {
        // Current value
        int current = nextServerCyclicCounter.get();
        // New value: (current + 1) modulo the number of instances
        int next = (current + 1) % modulo;
        // Return only once the CAS succeeds
        if (nextServerCyclicCounter.compareAndSet(current, next))
            return next;
    }
}
public Server choose(ILoadBalancer lb, Object key) {
    if (lb == null) {
        log.warn("no load balancer");
        return null;
    }
    Server server = null;
    int count = 0;
    // Again at most 10 tries: the whole cluster is not traversed, so a single request
    // does not spend too long choosing a server (fail fast)
    while (server == null && count++ < 10) {
        List<Server> reachableServers = lb.getReachableServers();
        List<Server> allServers = lb.getAllServers();
        int upCount = reachableServers.size();
        int serverCount = allServers.size();
        if ((upCount == 0) || (serverCount == 0)) {
            log.warn("No up servers available from load balancer: " + lb);
            return null;
        }
        int nextServerIndex = incrementAndGetModulo(serverCount);
        server = allServers.get(nextServerIndex);
        if (server == null) {
            /* Transient. */
            Thread.yield();
            continue;
        }
        // Check the server's state
        if (server.isAlive() && (server.isReadyToServe())) {
            return (server);
        }
        // Next.
        server = null;
    }
    if (count >= 10) {
        log.warn("No available alive servers after 10 tries from load balancer: "
                + lb);
    }
    return server;
}
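To make the "results affect each other" point concrete, here is a small single-threaded simulation. It is a sketch, not Ribbon code: `SharedRoundRobinDemo` and `picksSeenByA` are hypothetical names, and the interleaving is fixed by hand (a given number of other requests always pick between two picks of request A), whereas real interleavings are nondeterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; only the CAS loop mirrors the source above.
class SharedRoundRobinDemo {
    // One counter shared by every caller, as in RoundRobinRule.
    private final AtomicInteger next = new AtomicInteger(0);

    int incrementAndGetModulo(int modulo) {
        for (;;) {
            int current = next.get();
            int n = (current + 1) % modulo;
            if (next.compareAndSet(current, n)) {
                return n;
            }
        }
    }

    // Request A picks `tries` times; after each of A's picks,
    // `othersBetween` other requests advance the shared counter.
    static List<Integer> picksSeenByA(int serverCount, int tries, int othersBetween) {
        SharedRoundRobinDemo rr = new SharedRoundRobinDemo();
        List<Integer> seen = new ArrayList<>();
        for (int i = 0; i < tries; i++) {
            seen.add(rr.incrementAndGetModulo(serverCount));
            for (int j = 0; j < othersBetween; j++) {
                rr.incrementAndGetModulo(serverCount);
            }
        }
        return seen;
    }
}
```

Under this fixed interleaving, `picksSeenByA(2, 5, 1)` gives request A index 1 on all five picks, because the other request consumes index 0 every time. With no interference, `picksSeenByA(3, 3, 0)` walks through 1, 2, 0 as expected. If the server that A keeps landing on is the faulty one, retrying does not help, which is why a bounded number of tries plus a fallback makes sense.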
How does AvailabilityFilteringRule decide whether a Server qualifies?
Let's look at the source of the predicate class AvailabilityPredicate:
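Two configuration properties are involved here:
- niws.loadbalancer.availabilityFilteringRule.filterCircuitTripped: defaults to true; whether to filter out Servers whose circuit is tripped (what "tripped" means is covered later)
- niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit: defaults to Integer.MAX_VALUE; the maximum number of active connections per Server instance (in practice, the number of requests sent from this machine to that Server that have not completed yet)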
public boolean apply(@Nullable PredicateKey input) {
    LoadBalancerStats stats = getLBStats();
    if (stats == null) {
        return true;
    }
    // Check whether the server qualifies
    return !shouldSkipServer(stats.getSingleServerStat(input.getServer()));
}

private boolean shouldSkipServer(ServerStats stats) {
    // Is niws.loadbalancer.availabilityFilteringRule.filterCircuitTripped true?
    if ((CIRCUIT_BREAKER_FILTERING.get() &&
            // Is this Server's circuit tripped?
            stats.isCircuitBreakerTripped())
            // Is the number of unfinished requests from this machine to the Server
            // greater than or equal to the per-Server active connection limit?
            || stats.getActiveRequestsCount() >= activeConnectionsLimit.get()) {
        return true;
    }
    return false;
}
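The skip condition is easy to misread at its boundary, so here is a tiny standalone sketch of it. `SkipServerSketch` and its nested `Stats` class are hypothetical simplifications of Ribbon's ServerStats, and the two settings are passed in as plain parameters instead of being read from dynamic properties.

```java
// Hypothetical, simplified model of the predicate; not Ribbon's classes.
class SkipServerSketch {
    static final class Stats {
        final boolean circuitBreakerTripped;
        final int activeRequests;

        Stats(boolean circuitBreakerTripped, int activeRequests) {
            this.circuitBreakerTripped = circuitBreakerTripped;
            this.activeRequests = activeRequests;
        }
    }

    // Mirrors the boolean expression in shouldSkipServer above.
    static boolean shouldSkip(Stats s, boolean filterCircuitTripped, int activeConnectionsLimit) {
        return (filterCircuitTripped && s.circuitBreakerTripped)
                || s.activeRequests >= activeConnectionsLimit;
    }
}
```

Note the `>=`: with a limit of 2, a server that already has 2 in-flight requests is skipped, so the limit is effectively "fewer than N active requests". With the default limit of Integer.MAX_VALUE the second clause practically never fires, and filtering then depends only on the circuit-tripped check.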