If you are interested in Spring Cloud Hystrix fault-tolerance protection, this article is a good choice: it covers the topic in detail, answers related questions about Spring Cloud's fault-tolerance mechanisms, and includes four companion pieces on Hystrix service degradation and the Hystrix circuit breaker, listed below.
Contents:
- Spring Cloud Hystrix fault-tolerance protection (Spring Cloud fault-tolerance mechanisms)
- Java B2B2C multi-merchant mall on a Spring Cloud architecture: service fault-tolerance protection (Hystrix service degradation)
- Java Spring Cloud + Spring Boot + Redis social e-commerce platform: service fault-tolerance protection (Hystrix circuit breaker)
- Spring Cloud (7) Service fault-tolerance protection: Hystrix service degradation
- Spring Cloud (9) Service fault-tolerance protection: Hystrix circuit breaker
Spring Cloud Hystrix Fault-Tolerance Protection (Spring Cloud Fault-Tolerance Mechanisms)
Reproduced from https://github.com/Netflix/Hystrix/wiki/How-it-Works
In a microservice architecture, the system is split into many service units that depend on each other through service registration and subscription. Because each unit runs in its own process and dependencies are exercised through remote calls, network problems or faults in a dependency can cause calls to fail or be delayed. Those problems in turn delay the caller's own outward-facing service; if requests to the caller keep increasing, tasks pile up while waiting for the faulty dependency to respond, and the caller's own service eventually grinds to a halt.
How It Works
Below is the workflow described on the Hystrix wiki on GitHub.
The following sections will explain this flow in greater detail:
1. Construct a HystrixCommand or HystrixObservableCommand Object
2. Execute the Command
3. Is the Response Cached?
4. Is the Circuit Open?
5. Is the Thread Pool/Queue/Semaphore Full?
6. HystrixObservableCommand.construct() or HystrixCommand.run()
7. Calculate Circuit Health
8. Get the Fallback
9. Return the Successful Response
1. Construct a HystrixCommand or HystrixObservableCommand Object
The first step is to construct a HystrixCommand or HystrixObservableCommand object to represent the request you are making to the dependency. Pass the constructor any arguments that will be needed when the request is made.
Construct a HystrixCommand object if the dependency is expected to return a single response. For example:

```java
HystrixCommand command = new HystrixCommand(arg1, arg2);
```
Construct a HystrixObservableCommand object if the dependency is expected to return an Observable that emits responses. For example:

```java
HystrixObservableCommand command = new HystrixObservableCommand(arg1, arg2);
```
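To make the construction step concrete, here is a minimal sketch in the style of the wiki's CommandHelloWorld example; the group key "ExampleGroup" and the greeting logic are illustrative stand-ins for a real dependency call:

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

// Minimal single-response command: run() executes on a Hystrix-managed thread.
public class CommandHelloWorld extends HystrixCommand<String> {

    private final String name;

    public CommandHelloWorld(String name) {
        // The group key decides which thread pool and metrics bucket this command uses.
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.name = name;
    }

    @Override
    protected String run() {
        // The dependency call would go here; a constant stands in for it.
        return "Hello " + name + "!";
    }
}
```

Invoking it then reduces to, for example, String s = new CommandHelloWorld("World").execute();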
2. Execute the Command
There are four ways you can execute the command, by using one of the following four methods of your Hystrix command object (the first two are only applicable to simple HystrixCommand objects and are not available for the HystrixObservableCommand):
- execute() — blocks, then returns the single response received from the dependency (or throws an exception in case of an error)
- queue() — returns a Future with which you can obtain the single response from the dependency
- observe() — subscribes to the Observable that represents the response(s) from the dependency and returns an Observable that replicates that source Observable
- toObservable() — returns an Observable that, when you subscribe to it, will execute the Hystrix command and emit its responses
```java
K value = command.execute();
Future<K> fValue = command.queue();
Observable<K> ohValue = command.observe();       // hot observable
Observable<K> ocValue = command.toObservable();  // cold observable
```
The synchronous call execute() invokes queue().get(). queue() in turn invokes toObservable().toBlocking().toFuture(). Which is to say that ultimately every HystrixCommand is backed by an Observable implementation, even those commands that are intended to return single, simple values.
3. Is the Response Cached?
If request caching is enabled for this command, and if the response to the request is available in the cache, this cached response will be immediately returned in the form of an Observable. (See "Request Caching" below.)
4. Is the Circuit Open?
When you execute the command, Hystrix checks with the circuit-breaker to see if the circuit is open.
If the circuit is open (or “tripped”) then Hystrix will not execute the command but will route the flow to (8) Get the Fallback.
If the circuit is closed then the flow proceeds to (5) to check if there is capacity available to run the command.
5. Is the Thread Pool/Queue/Semaphore Full?
If the thread-pool and queue (or semaphore, if not running in a thread) that are associated with the command are full then Hystrix will not execute the command but will immediately route the flow to (8) Get the Fallback.
6. HystrixObservableCommand.construct() or HystrixCommand.run()
Here, Hystrix invokes the request to the dependency by means of the method you have written for this purpose, one of the following:
- HystrixCommand.run() — returns a single response or throws an exception
- HystrixObservableCommand.construct() — returns an Observable that emits the response(s) or sends an onError notification
If the run() or construct() method exceeds the command's timeout value, the thread will throw a TimeoutException (or a separate timer thread will, if the command itself is not running in its own thread). In that case Hystrix routes the response through 8. Get the Fallback, and it discards the eventual return value of the run() or construct() method if that method does not cancel/interrupt.
Please note that there's no way to force the latent thread to stop work - the best Hystrix can do on the JVM is to throw it an InterruptedException. If the work wrapped by Hystrix does not respect InterruptedExceptions, the thread in the Hystrix thread pool will continue its work, though the client already received a TimeoutException. This behavior can saturate the Hystrix thread pool, though the load is 'correctly shed'. Most Java HTTP client libraries do not interpret InterruptedExceptions. So make sure to correctly configure connection and read/write timeouts on the HTTP clients.
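Given that warning, it is worth wiring the command timeout explicitly. A minimal sketch, assuming Hystrix 1.4+ setter names; the group key, the 2000ms budget, and the simulated delay are invented for illustration:

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;

public class SlowClientCommand extends HystrixCommand<String> {

    public SlowClientCommand() {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("SlowClientGroup"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        // route to the fallback if run() exceeds 2000ms
                        .withExecutionTimeoutInMilliseconds(2000)
                        // also try to interrupt the latent thread on timeout
                        .withExecutionIsolationThreadInterruptOnTimeout(true)));
    }

    @Override
    protected String run() throws Exception {
        Thread.sleep(5000); // simulated latent dependency; trips the timeout
        return "too late";
    }

    @Override
    protected String getFallback() {
        return "timed out"; // the caller sees this instead of a TimeoutException
    }
}
```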
If the command did not throw any exceptions and it returned a response, Hystrix returns this response after it performs some logging and metrics reporting. In the case of run(), Hystrix returns an Observable that emits the single response and then makes an onCompleted notification; in the case of construct(), Hystrix returns the same Observable returned by construct().
7. Calculate Circuit Health
Hystrix reports successes, failures, rejections, and timeouts to the circuit breaker, which maintains a rolling set of counters that calculate statistics.
It uses these stats to determine when the circuit should “trip,” at which point it short-circuits any subsequent requests until a recovery period elapses, upon which it closes the circuit again after first checking certain health checks.
8. Get the Fallback
Hystrix tries to revert to your fallback whenever a command execution fails: when an exception is thrown by construct() or run() (6.), when the command is short-circuited because the circuit is open (4.), when the command's thread pool and queue or semaphore are at capacity (5.), or when the command has exceeded its timeout length.
Write your fallback to provide a generic response, without any network dependency, from an in-memory cache or by means of other static logic. If you must use a network call in the fallback, you should do so by means of another HystrixCommand or HystrixObservableCommand.
In the case of a HystrixCommand, to provide fallback logic you implement HystrixCommand.getFallback() which returns a single fallback value.
In the case of a HystrixObservableCommand, to provide fallback logic you implement HystrixObservableCommand.resumeWithFallback() which returns an Observable that may emit a fallback value or values.
If the fallback method returns a response then Hystrix will return this response to the caller. In the case of a HystrixCommand.getFallback(), it will return an Observable that emits the value returned from the method. In the case of HystrixObservableCommand.resumeWithFallback() it will return the same Observable returned from the method.
If you have not implemented a fallback method for your Hystrix command, or if the fallback itself throws an exception, Hystrix still returns an Observable, but one that emits nothing and immediately terminates with an onError notification. It is through this onError notification that the exception that caused the command to fail is transmitted back to the caller. (It is a poor practice to implement a fallback implementation that can fail. You should implement your fallback such that it is not performing any logic that could fail.)
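As a sketch of the observable-side fallback hook (the class name and values are invented; the construct() and resumeWithFallback() overrides are the real Hystrix API):

```java
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixObservableCommand;
import rx.Observable;

public class RatingCommand extends HystrixObservableCommand<Integer> {

    public RatingCommand() {
        super(HystrixCommandGroupKey.Factory.asKey("RatingGroup"));
    }

    @Override
    protected Observable<Integer> construct() {
        // Stand-in for a remote call that fails, to force the fallback path.
        return Observable.error(new RuntimeException("dependency down"));
    }

    @Override
    protected Observable<Integer> resumeWithFallback() {
        // Static, network-free fallback value, as recommended above.
        return Observable.just(-1);
    }
}
```

The equivalent for a HystrixCommand is overriding getFallback() to return a single value, as in the degradation examples later in this article.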
The result of a failed or nonexistent fallback will differ depending on how you invoked the Hystrix command:
- execute() — throws an exception
- queue() — successfully returns a Future, but this Future will throw an exception if its get() method is called
- observe() — returns an Observable that, when you subscribe to it, will immediately terminate by calling the subscriber's onError method
- toObservable() — returns an Observable that, when you subscribe to it, will terminate by calling the subscriber's onError method
9. Return the Successful Response
If the Hystrix command succeeds, it will return the response or responses to the caller in the form of an Observable. Depending on how you have invoked the command in step 2, above, this Observable may be transformed before it is returned to you:
- execute() — obtains a Future in the same manner as does .queue() and then calls get() on this Future to obtain the single value emitted by the Observable
- queue() — converts the Observable into a BlockingObservable so that it can be converted into a Future, then returns this Future
- observe() — subscribes to the Observable immediately and begins the flow that executes the command; returns an Observable that, when you subscribe to it, replays the emissions and notifications
- toObservable() — returns the Observable unchanged; you must subscribe to it in order to actually begin the flow that leads to the execution of the command
Sequence Diagram
@adrianb11 has kindly provided a sequence diagram demonstrating the above flows
Circuit Breaker
The following diagram shows how a HystrixCommand or HystrixObservableCommand interacts with a HystrixCircuitBreaker and its flow of logic and decision-making, including how the counters behave in the circuit breaker.
The precise way that the circuit opening and closing occurs is as follows:
1. Assuming the volume across a circuit meets a certain threshold (HystrixCommandProperties.circuitBreakerRequestVolumeThreshold())...
2. And assuming that the error percentage exceeds the threshold error percentage (HystrixCommandProperties.circuitBreakerErrorThresholdPercentage())...
3. Then the circuit-breaker transitions from CLOSED to OPEN.
4. While it is open, it short-circuits all requests made against that circuit-breaker.
5. After some amount of time (HystrixCommandProperties.circuitBreakerSleepWindowInMilliseconds()), the next single request is let through (this is the HALF-OPEN state). If the request fails, the circuit-breaker returns to the OPEN state for the duration of the sleep window. If the request succeeds, the circuit-breaker transitions to CLOSED and the logic in 1. takes over again.
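These three thresholds are plain command properties; a hedged configuration sketch follows (the group name and values are illustrative, while the setter methods are the standard HystrixCommandProperties API):

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;

public class GuardedCommand extends HystrixCommand<String> {

    public GuardedCommand() {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("GuardedGroup"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        // eligible to trip only after 20 requests in the rolling window
                        .withCircuitBreakerRequestVolumeThreshold(20)
                        // ...and only if more than 50% of them failed
                        .withCircuitBreakerErrorThresholdPercentage(50)
                        // stay OPEN for 5s before letting the half-open trial through
                        .withCircuitBreakerSleepWindowInMilliseconds(5000)));
    }

    @Override
    protected String run() {
        return "ok"; // dependency call placeholder
    }
}
```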
Isolation
Hystrix employs the bulkhead pattern to isolate dependencies from each other and to limit concurrent access to any one of them.
Threads & Thread Pools
Clients (libraries, network calls, etc) execute on separate threads. This isolates them from the calling thread (Tomcat thread pool) so that the caller may “walk away” from a dependency call that is taking too long.
Hystrix uses separate, per-dependency thread pools as a way of constraining any given dependency so latency on the underlying executions will saturate the available threads only in that pool.
It is possible for you to protect against failure without the use of thread pools, but this requires the client being trusted to fail very quickly (network connect/read timeouts and retry configuration) and to always behave well.
Netflix, in its design of Hystrix, chose the use of threads and thread-pools to achieve isolation for many reasons including:
- Many applications execute dozens (and sometimes well over 100) different back-end service calls against dozens of different services developed by as many different teams.
- Each service provides its own client library.
- Client libraries are changing all the time.
- Client library logic can change to add new network calls.
- Client libraries can contain logic such as retries, data parsing, caching (in-memory or across network), and other such behavior.
- Client libraries tend to be “black boxes” — opaque to their users about implementation details, network access patterns, configuration defaults, etc.
- In several real-world production outages the determination was “oh, something changed and properties should be adjusted” or “the client library changed its behavior.”
- Even if a client itself doesn’t change, the service itself can change, which can then impact performance characteristics which can then cause the client configuration to be invalid.
- Transitive dependencies can pull in other client libraries that are not expected and perhaps not correctly configured.
- Most network access is performed synchronously.
- Failure and latency can occur in the client-side code as well, not just in the network call.
Benefits of Thread Pools
The benefits of isolation via threads in their own thread pools are:
- The application is fully protected from runaway client libraries. The pool for a given dependency library can fill up without impacting the rest of the application.
- The application can accept new client libraries with far lower risk. If an issue occurs, it is isolated to the library and doesn’t affect everything else.
- When a failed client becomes healthy again, the thread pool will clear up and the application immediately resumes healthy performance, as opposed to a long recovery when the entire Tomcat container is overwhelmed.
- If a client library is misconfigured, the health of a thread pool will quickly demonstrate this (via increased errors, latency, timeouts, rejections, etc.) and you can handle it (typically in real-time via dynamic properties) without affecting application functionality.
- If a client service changes performance characteristics (which happens often enough to be an issue) which in turn cause a need to tune properties (increasing/decreasing timeouts, changing retries, etc.) this again becomes visible through thread pool metrics (errors, latency, timeouts, rejections) and can be handled without impacting other clients, requests, or users.
- Beyond the isolation benefits, having dedicated thread pools provides built-in concurrency which can be leveraged to build asynchronous facades on top of synchronous client libraries (similar to how the Netflix API built a reactive, fully-asynchronous Java API on top of Hystrix commands).
In short, the isolation provided by thread pools allows for the always-changing and dynamic combination of client libraries and subsystem performance characteristics to be handled gracefully without causing outages.
Note: Despite the isolation a separate thread provides, your underlying client code should also have timeouts and/or respond to Thread interrupts so it can not block indefinitely and saturate the Hystrix thread pool.
Drawbacks of Thread Pools
The primary drawback of thread pools is that they add computational overhead. Each command execution involves the queueing, scheduling, and context switching involved in running a command on a separate thread.
Netflix, in designing this system, decided to accept the cost of this overhead in exchange for the benefits it provides and deemed it minor enough to not have major cost or performance impact.
Cost of Threads
Hystrix measures the latency when it executes the construct() or run() method on the child thread as well as the total end-to-end time on the parent thread. This way you can see the cost of Hystrix overhead (threading, metrics, logging, circuit breaker, etc.).
The Netflix API processes 10+ billion Hystrix Command executions per day using thread isolation. Each API instance has 40+ thread-pools with 5–20 threads in each (most are set to 10).
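A per-dependency pool size like those is set on the command itself; a minimal sketch (the group name and core size of 10 are illustrative; 10 also happens to be the Hystrix default):

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixThreadPoolProperties;

public class PerDependencyCommand extends HystrixCommand<String> {

    public PerDependencyCommand() {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("DependencyA"))
                // each group gets its own pool by default; cap it at 10 threads
                .andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter()
                        .withCoreSize(10)));
    }

    @Override
    protected String run() {
        return "response-from-dependency-A"; // dependency call placeholder
    }
}
```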
The following diagram represents one HystrixCommand being executed at 60 requests-per-second on a single API instance (of about 350 total threaded executions per second per server):
At the median (and lower) there is no cost to having a separate thread.
At the 90th percentile there is a cost of 3ms for having a separate thread.
At the 99th percentile there is a cost of 9ms for having a separate thread. Note however that the increase in cost is far smaller than the increase in execution time of the separate thread (network request) which jumped from 2 to 28 whereas the cost jumped from 0 to 9.
This overhead at the 90th percentile and higher for circuits such as these has been deemed acceptable for most Netflix use cases for the benefits of resilience achieved.
For circuits that wrap very low-latency requests (such as those that primarily hit in-memory caches) the overhead can be too high and in those cases you can use another method such as tryable semaphores which, while they do not allow for timeouts, provide most of the resilience benefits without the overhead. The overhead in general, however, is small enough that Netflix in practice usually prefers the isolation benefits of a separate thread over such techniques.
Semaphores
You can use semaphores (or counters) to limit the number of concurrent calls to any given dependency, instead of using thread pool/queue sizes. This allows Hystrix to shed load without using thread pools but it does not allow for timing out and walking away. If you trust the client and you only want load shedding, you could use this approach.
HystrixCommand and HystrixObservableCommand support semaphores in 2 places:
- Fallback: When Hystrix retrieves fallbacks it always does so on the calling Tomcat thread.
- Execution: If you set the property execution.isolation.strategy to SEMAPHORE then Hystrix will use semaphores instead of threads to limit the number of concurrent parent threads that invoke the command.
You can configure both of these uses of semaphores by means of dynamic properties that define how many concurrent threads can execute. You should size them by using similar calculations as you use when sizing a threadpool (an in-memory call that returns in sub-millisecond times can perform well over 5000rps with a semaphore of only 1 or 2 … but the default is 10).
Note: if a dependency is isolated with a semaphore and then becomes latent, the parent threads will remain blocked until the underlying network calls timeout.
Semaphore rejection will start once the limit is hit but the threads filling the semaphore can not walk away.
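A sketch of semaphore isolation for such a fast, in-memory command (the names and the limit of 10 are illustrative; 10 is also the default):

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixCommandProperties.ExecutionIsolationStrategy;

public class CachedLookupCommand extends HystrixCommand<String> {

    public CachedLookupCommand() {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("CachedLookupGroup"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        // execute on the calling thread, guarded by a semaphore
                        .withExecutionIsolationStrategy(ExecutionIsolationStrategy.SEMAPHORE)
                        // reject once 10 callers are inside concurrently
                        .withExecutionIsolationSemaphoreMaxConcurrentRequests(10)));
    }

    @Override
    protected String run() {
        return "value-from-local-cache"; // sub-millisecond, no network call
    }
}
```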
Request Collapsing
You can front a HystrixCommand with a request collapser (HystrixCollapser is the abstract parent) with which you can collapse multiple requests into a single back-end dependency call.
The following diagram shows the number of threads and network connections in two scenarios: first without and then with request collapsing (assuming all connections are “concurrent” within a short time window, in this case 10ms).
Sequence Diagram
@adrianb11 has kindly provided a sequence diagram of request-collapsing.
Why Use Request Collapsing?
Use request collapsing to reduce the number of threads and network connections needed to perform concurrent HystrixCommand executions. Request collapsing does this in an automated manner that does not force all developers of a codebase to coordinate the manual batching of requests.
Global Context (Across All Tomcat Threads)
The ideal type of collapsing is done at the global application level, so that requests from any user on any Tomcat thread can be collapsed together.
For example, if you configure a HystrixCommand to support batching for any user on requests to a dependency that retrieves movie ratings, then when any user thread in the same JVM makes such a request, Hystrix will add its request along with any others into the same collapsed network call.
Note that the collapser will pass a single HystrixRequestContext object to the collapsed network call, so downstream systems must handle this case for this to be an effective option.
User Request Context (Single Tomcat Thread)
If you configure a HystrixCommand to only handle batch requests for a single user, then Hystrix can collapse requests from within a single Tomcat thread (request).
For example, if a user wants to load bookmarks for 300 video objects, instead of executing 300 network calls, Hystrix can combine them all into one.
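The Hystrix wiki illustrates this with a collapser that batches per-key lookups; the sketch below closely follows that CommandCollapserGetValueForKey example, with the stubbed batch response standing in for the real network call:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import com.netflix.hystrix.HystrixCollapser;
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandKey;

public class CommandCollapserGetValueForKey extends HystrixCollapser<List<String>, String, Integer> {

    private final Integer key;

    public CommandCollapserGetValueForKey(Integer key) {
        this.key = key;
    }

    @Override
    public Integer getRequestArgument() {
        return key;
    }

    @Override
    protected HystrixCommand<List<String>> createCommand(Collection<CollapsedRequest<String, Integer>> requests) {
        // All requests collapsed into the current batch window arrive here together.
        return new BatchCommand(requests);
    }

    @Override
    protected void mapResponseToRequests(List<String> batchResponse, Collection<CollapsedRequest<String, Integer>> requests) {
        // Hand each caller its slice of the batch response, in order.
        int count = 0;
        for (CollapsedRequest<String, Integer> request : requests) {
            request.setResponse(batchResponse.get(count++));
        }
    }

    private static final class BatchCommand extends HystrixCommand<List<String>> {

        private final Collection<CollapsedRequest<String, Integer>> requests;

        private BatchCommand(Collection<CollapsedRequest<String, Integer>> requests) {
            super(Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"))
                    .andCommandKey(HystrixCommandKey.Factory.asKey("GetValueForKey")));
            this.requests = requests;
        }

        @Override
        protected List<String> run() {
            // One network call could serve the whole batch; stubbed here.
            List<String> response = new ArrayList<>();
            for (CollapsedRequest<String, Integer> request : requests) {
                response.add("ValueForKey: " + request.getArgument());
            }
            return response;
        }
    }
}
```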
Object Modeling and Code Complexity
Sometimes when you create an object model that makes logical sense to the consumers of the object, this does not match up well with efficient resource utilization for the producers of the object.
For example, given a list of 300 video objects, iterating over them and calling getSomeAttribute() on each is an obvious object model, but if implemented naively can result in 300 network calls all being made within milliseconds of each other (and very likely saturating resources).
There are manual ways to handle this, such as requiring the user, before allowing them to call getSomeAttribute(), to declare which video objects they want attributes for so that they can all be pre-fetched.
Or, you could divide the object model so a user has to get a list of videos from one place, then ask for the attributes for that list of videos from somewhere else.
These approaches can lead to awkward APIs and object models that don’t match mental models and usage patterns. They can also lead to simple mistakes and inefficiencies as multiple developers work on a codebase, since an optimization done for one use case can be broken by the implementation of another use case and a new path through the code.
By pushing the collapsing logic down to the Hystrix layer, it doesn’t matter how you create the object model, in what order the calls are made, or whether different developers know about optimizations being done or even needing to be done.
The getSomeAttribute() method can be put wherever it fits best and be called in whatever manner suits the usage pattern, and the collapser will automatically batch calls into time windows.
What Is the Cost of Request Collapsing?
The cost of enabling request collapsing is an increased latency before the actual command is executed. The maximum cost is the size of the batch window.
If you have a command that takes 5ms on median to execute, and a 10ms batch window, the execution time could become 15ms in a worst case. Typically a request will not happen to be submitted to the window just as it opens, and so the median penalty is half the window time, in this case 5ms.
The determination of whether this cost is worth it depends on the command being executed. A high-latency command won’t suffer as much from a small amount of additional average latency. Also, the amount of concurrency on a given command is key: There is no point in paying the penalty if there are rarely more than 1 or 2 requests to be batched together. In fact, in a single-threaded sequential iteration collapsing would be a major performance bottleneck as each iteration will wait the 10ms batch window time.
If, however, a particular command is heavily utilized concurrently and can batch dozens or even hundreds of calls together, then the cost is typically far outweighed by the increased throughput achieved as Hystrix reduces the number of threads it requires and the number of network connections to dependencies.
Collapser Flow
Request Caching
HystrixCommand and HystrixObservableCommand implementations can define a cache key which is then used to de-dupe calls within a request context in a concurrent-aware manner.
Here is an example flow involving an HTTP request lifecycle and two threads doing work within that request:
The benefits of request caching are:
- Different code paths can execute Hystrix Commands without concern of duplicate work.
This is particularly beneficial in large codebases where many developers are implementing different pieces of functionality.
For example, multiple paths through code that all need to get a user's Account object can each request it like this:

```java
Account account = new UserGetAccount(accountId).execute();
// or
Observable<Account> accountObservable = new UserGetAccount(accountId).observe();
```
The Hystrix RequestCache will execute the underlying run() method once and only once, and both threads executing the HystrixCommand will receive the same data despite having instantiated different instances.
- Data retrieval is consistent throughout a request.
Instead of potentially returning a different value (or fallback) each time the command is executed, the first response is cached and returned for all subsequent calls within the same request.
- Eliminates duplicate thread executions.
Since the request cache sits in front of the construct() or run() method invocation, Hystrix can de-dupe calls before they result in thread execution. If Hystrix didn't implement the request cache functionality then each command would need to implement it themselves inside the construct or run method, which would put it after a thread is queued and executed.
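A sketch of enabling the cache, following the wiki's CommandUsingRequestCache example (the even/odd logic is just a stand-in for real work):

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.strategy.concurrency.HystrixRequestContext;

public class CommandUsingRequestCache extends HystrixCommand<Boolean> {

    private final int value;

    public CommandUsingRequestCache(int value) {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.value = value;
    }

    @Override
    protected Boolean run() {
        return value == 0 || value % 2 == 0;
    }

    @Override
    protected String getCacheKey() {
        // Commands within one request context sharing this key are de-duped.
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        // The cache lives in a HystrixRequestContext, typically initialized
        // per HTTP request by a servlet filter; done by hand here.
        HystrixRequestContext context = HystrixRequestContext.initializeContext();
        try {
            CommandUsingRequestCache first = new CommandUsingRequestCache(2);
            CommandUsingRequestCache second = new CommandUsingRequestCache(2);
            first.execute();                                   // invokes run()
            second.execute();                                  // served from the cache
            System.out.println(second.isResponseFromCache());  // true
        } finally {
            context.shutdown();
        }
    }
}
```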
Java B2B2C Multi-Merchant Mall on a Spring Cloud Architecture: Service Fault-Tolerance Protection (Hystrix Service Degradation)
Hands-On
Before we start using Spring Cloud Hystrix to implement a circuit breaker, we will reuse some previously built pieces as a foundation, including:
- the eureka-server project: the service registry, port 1001
- the eureka-client project: the service provider, two instances (startup ports 2001, …)

Next, copy the previously implemented service consumer eureka-consumer-ribbon and name the copy eureka-consumer-ribbon-hystrix. Now let's start modifying it.
Step 1: add the spring-cloud-starter-hystrix dependency to the dependencies node of pom.xml:
```xml
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-hystrix</artifactId>
</dependency>
```
Step 2: enable Hystrix by annotating the main application class with @EnableCircuitBreaker or @EnableHystrix:
```java
@EnableCircuitBreaker
@EnableDiscoveryClient
@SpringBootApplication
public class Application {

    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        new SpringApplicationBuilder(Application.class).web(true).run(args);
    }
}
```
Note: we can also annotate the main class with Spring Cloud's @SpringCloudApplication annotation, defined as shown below. It bundles the three annotations we used above, which implies that a standard Spring Cloud application should include service discovery and a circuit breaker.
```java
@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@SpringBootApplication
@EnableDiscoveryClient
@EnableCircuitBreaker
public @interface SpringCloudApplication {
}
```
Step 3: rework the service consumption. Add a ConsumerService class and move the logic out of the Controller into it. Finally, annotate the method that performs the actual call with @HystrixCommand to specify the service-degradation (fallback) method, for example:
```java
@RestController
public class DcController {

    @Autowired
    ConsumerService consumerService;

    @GetMapping("/consumer")
    public String dc() {
        return consumerService.consumer();
    }

    class ConsumerService {

        @Autowired
        RestTemplate restTemplate;

        @HystrixCommand(fallbackMethod = "fallback")
        public String consumer() {
            return restTemplate.getForObject("http://eureka-client/dc", String.class);
        }

        public String fallback() {
            return "fallback";
        }
    }
}
```
Now let's verify the basic features Hystrix brings. Start all the services involved and then visit localhost:2101/consumer; you should get a normal response such as: Services: [eureka-consumer-ribbon-hystrix, eureka-client].
To trigger the degradation logic, we can add some delay to the service provider eureka-client, for example:
@GetMapping("/dc") public String dc() throws InterruptedException { Thread.sleep(5000L); String services = "Services: " + discoveryClient.getServices(); System.out.println(services); return services; } |
After restarting eureka-client, visit localhost:2101/consumer again; this time the result is fallback. In the eureka-client console you can see the provider print the result it would have returned, but because the response was delayed by 5 seconds, the consumer hit the service-request timeout and executed the degradation logic specified by the @HystrixCommand annotation, so the request returned fallback. A mechanism like this gives the service basic self-protection and provides an automatic degradation switch for abnormal situations.
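Conversely, if the 5-second delay is expected, the consumer's timeout can be raised so the call succeeds again instead of degrading. A minimal sketch, assuming the standard Hystrix property key in the consumer's application.properties (the 6000ms value is illustrative):

```properties
# Raise the default 1000ms command timeout above the provider's 5s delay
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=6000
```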
Java Spring Cloud + Spring Boot + Redis Social E-Commerce Platform: Service Fault-Tolerance Protection (Hystrix Circuit Breaker)
Circuit Breaker
The circuit breaker pattern comes from Martin Fowler's article Circuit Breaker. A "circuit breaker" is itself a switching device that protects a line from overload: when an appliance on the line short-circuits, the breaker cuts off the faulty circuit in time, preventing overload, overheating, or even fire.
In a distributed architecture, the circuit breaker pattern serves a similar purpose: when a service unit fails (like an appliance short-circuiting), the breaker's fault monitoring (like a blown fuse) cuts off calls to the original main logic directly. The circuit breaker in Hystrix, however, does more than cut off the main logic; it has more elaborate behavior, which we examine below.
Using the service degradation example implemented in "Building a Microservice Architecture with Spring Cloud: Service Fault-Tolerance Protection (Hystrix Service Degradation)", let's look at how the circuit breaker works. Once we added the simulated delay to the service provider eureka-client, the consumer's degradation logic was triggered because the Hystrix command's dependency call timed out. Even so, limited by the Hystrix timeout, calls can still very likely pile up.
That is when the circuit breaker steps in. So under what conditions does it start working? Three important parameters are involved: the snapshot time window, the request volume threshold, and the error percentage threshold. Their roles are:
Snapshot time window: to decide whether to open, the breaker needs to accumulate request and error statistics, and the period over which they are gathered is the snapshot time window, by default the most recent 10 seconds.
Request volume threshold: within the snapshot window, the total number of requests must reach this threshold before the breaker is even eligible to trip. The default is 20, which means that if the Hystrix command is called fewer than 20 times within 10 seconds, the breaker will not open even if every request times out or otherwise fails.
Error percentage threshold: once the request volume within the snapshot window exceeds the volume threshold, say 30 calls of which 16 time out, the error percentage exceeds 50%; with the default threshold of 50%, the breaker opens.
So what happens once the breaker opens? First consider its state before opening: in the earlier example, each request returns the fallback only after the Hystrix timeout expires, so each request is delayed by roughly that timeout; if it is set to 5 seconds, every request waits 5 seconds before returning. When the breaker sees the request count exceed 20 within 10 seconds with an error percentage above 50%, it opens. After that, incoming calls no longer invoke the main logic but go straight to the degradation logic, so they no longer wait 5 seconds for the fallback. Through the circuit breaker, errors are discovered automatically and the degradation logic is switched in as the main path, reducing response latency.
Opening the breaker is not the end of the processing: the degradation logic has become the main path, so how does the original main logic recover? Hystrix implements automatic recovery for this as well. When the breaker opens and trips the main logic, Hystrix starts a sleep window; within that window the degradation logic temporarily serves as the main path. When the sleep window expires, the breaker enters a half-open state and releases a single request to the original main logic: if that request returns normally, the breaker closes again and the main logic recovers; if the request still fails, the breaker returns to the open state and the sleep window restarts.
Through this series of mechanisms, the Hystrix circuit breaker provides circuit-breaking on dependency failures, automatic switchover to the degradation strategy, and automatic recovery of the main logic. Microservices are thus well protected when depending on external services or resources, and business scenarios with degradation logic get automated switching and recovery, which is smarter and more efficient than the traditional approach of manual switches operated by monitoring and operations staff.
Spring Cloud (7) Service Fault-Tolerance Protection: Hystrix Service Degradation
In a microservice architecture the system is split into individual services by business function, and services call one another; in Spring Cloud this can be done with RestTemplate+Ribbon or Feign. To ensure high availability, a single service is usually deployed as a cluster. Due to network or internal problems, a service cannot guarantee 100% availability; if a single service fails, calls to it block threads, and a flood of incoming requests can then exhaust the servlet container's thread resources and paralyze the service. Since services depend on one another, the failure propagates, with potentially catastrophic consequences for the whole system; this is the avalanche effect of service failures.
Spring Cloud Hystrix implements a series of service-protection features such as thread isolation and circuit breaking. It builds on Netflix's open-source Hystrix framework, which aims to provide stronger tolerance of latency and failure by controlling the nodes that access remote systems, services, and third parties. Hystrix offers service degradation, circuit breaking, thread isolation, request caching, request collapsing, service monitoring, and other powerful capabilities.
Add the Hystrix dependency to the pom.xml of the earlier david-ribbon project:
```xml
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-hystrix</artifactId>
</dependency>
```
Enable Hystrix in the startup class with the @EnableCircuitBreaker or @EnableHystrix annotation:
```java
@EnableHystrix
@EnableDiscoveryClient
@SpringBootApplication
public class RibbonApplication {

    @Bean
    @LoadBalanced // enable client-side load balancing
    RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        SpringApplication.run(RibbonApplication.class, args);
    }
}
```
You can also use the @SpringCloudApplication annotation, which is equivalent to the three annotations above:
```java
//@EnableCircuitBreaker
//@EnableDiscoveryClient
//@SpringBootApplication
@SpringCloudApplication
public class RibbonApplication {

    @Bean
    @LoadBalanced // enable client-side load balancing
    RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        SpringApplication.run(RibbonApplication.class, args);
    }
}
```
Add a TestService:
```java
@Service
public class TestService {

    @Autowired
    RestTemplate restTemplate;

    @HystrixCommand(fallbackMethod = "fallback")
    public String consumer() {
        return restTemplate.getForObject("http://david-eureka-client/test", String.class);
    }

    public String fallback() {
        return "fallback";
    }
}
```
Modify TestController:
```java
@RestController
public class TestController {

    @Autowired
    TestService testService;

    @GetMapping("/consumer")
    public String test() {
        return testService.consumer();
    }
}
```
Then start the eureka, eureka-client, and david-ribbon projects and visit http://localhost:8764/consumer, which returns:
test;port:8762
Now stop the client project and access the URL again; the response becomes:
fallback
Spring Cloud (9) Service Fault-Tolerance Protection: Hystrix Circuit Breaker
Circuit Breaker
A circuit breaker is itself a switching device used to protect a line from overload: when an appliance on the line short-circuits, the breaker cuts off the faulty circuit in time, preventing overload, overheating, or even fire.
In a distributed architecture, the circuit breaker pattern serves a similar purpose: when a service fails, the breaker's fault monitoring cuts off calls to the original main logic directly. The Hystrix circuit breaker, however, does more than just cut off the main logic.
Earlier, in the ribbon project, we implemented service degradation: when we stopped the service provider eureka-client, the test method became unreachable and the degradation logic was triggered.
Three parameters determine when the circuit breaker takes effect:
Snapshot time window: to decide whether to open, the breaker needs to accumulate request and error statistics, and the period over which they are gathered is the snapshot time window, by default the most recent 10 seconds.
Request volume threshold: within the snapshot window, the total number of requests must reach this threshold before the breaker is even eligible to trip. The default is 20, which means that if the Hystrix command is called fewer than 20 times within 10 seconds, the breaker will not open even if every request times out or otherwise fails.
Error percentage threshold: once the request volume within the snapshot window exceeds the volume threshold, say 30 calls of which 16 time out, the error percentage exceeds 50%; with the default threshold of 50%, the breaker opens.
Before the breaker opens: if the total number of requests within 10 seconds exceeds 20 and the error rate exceeds 50%, the breaker opens. Once open, the main logic is no longer invoked; the degradation logic is called directly and fallback is returned immediately, reducing response latency.
Opening the breaker is not the end of the processing: the degradation logic has become the main path, so how does the original main logic recover? Hystrix implements automatic recovery. When the breaker opens and trips the main logic, Hystrix starts a sleep window; within that window the degradation logic temporarily serves as the main path. When the sleep window expires, the breaker enters a half-open state and releases a single request to the original main logic: if it returns normally, the breaker closes and the main logic recovers; if it still fails, the breaker returns to the open state and the sleep window restarts.
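All three parameters, plus the sleep window, map to standard Hystrix configuration keys; a hedged sketch spelling out the documented defaults in application.yml:

```yaml
hystrix:
  command:
    default:
      metrics:
        rollingStats:
          timeInMilliseconds: 10000      # snapshot time window
      circuitBreaker:
        requestVolumeThreshold: 20       # request volume threshold
        errorThresholdPercentage: 50     # error percentage threshold
        sleepWindowInMilliseconds: 5000  # sleep window before half-open
```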
The previous example used the circuit breaker with Ribbon; next we use it with Feign.
Feign comes with the circuit breaker built in, but in newer Spring Cloud versions it is disabled by default and must be turned on in the configuration file:
```yaml
feign:
  hystrix:
    enabled: true
```
Then point the annotation on the Feign service interface at a fallback class:
```java
@FeignClient(value = "david-eureka-client", fallback = TestFeignServiceHystric.class)
public interface TestFeignService {

    @GetMapping("/test")
    String consumer();
}
```
Create a TestFeignServiceHystric class that implements the interface:
```java
@Component
public class TestFeignServiceHystric implements TestFeignService {

    @Override
    public String consumer() {
        return "error";
    }
}
```
Start the project and visit http://localhost:8765/test, which returns:
port:8762
Stop eureka-client and refresh; the response becomes:
error