[DATAES-760] Handling of dropped connections Created: 06/Mar/20 Updated: 31/Dec/20 Resolved: 25/Apr/20 |
|
Status: |
Closed |
Project: |
|
Component/s: |
|
Affects Version/s: |
None |
Fix Version/s: |
None |
Type: |
Improvement |
Priority: |
Minor |
Reporter: |
Assignee: |
||
Resolution: |
Won't Do |
Votes: |
0 |
None |
|||
Remaining Estimate: |
Not Specified |
||
Time Spent: |
Not Specified |
||
Original Estimate: |
Not Specified |
Issue Links:
|
|||||||||
Description |
|
When using the REST clients (reactive and non-reactive) it can happen that the connection is lost after longer idle times.
This should be investigated how this can be detected what automatic recovery can be implemented
non-reative For the non-reactive part this can be integrated in the execute method to have a retry mechanism once DATAES-751 is done Edit: non-reactive code might use for example Spring Retry to catch errors and retry to do the failed operation again, for reactive code, the different flavours of Mono.retry() or Flux.retry() can be used.
|
Comments |
|
Comment by sothawo [ 08/Mar/20 ] |
once the communication with the client is done in the execute method, connection error handling can be don in this centralized place |
Comment by Mark Paluch [ 01/Apr/20 ] |
Note that there is Spring Retry that might address some of these issues. Connection validity checking should be part of the HTTP client as there are tons of libraries that build on top of Spring. Having each of them implementing a valid check misses the point. |
Comment by sothawo [ 01/Apr/20 ] |
Comment by Mark Paluch [ 01/Apr/20 ] |
Yeah, that's the alternative. Since we cannot make any kind of assumptions on whether retrying should be used or what the settings should be, we cannot do anything useful here. We discussed an approach that lives within `RestTemplate`/`WebClient` with the team but that is again something for Spring Retry itself. For the reactive case, a simple `retry(…)` operator should be a quick way out. |
Comment by sothawo [ 01/Apr/20 ] |
ok, so I think this issue can be closed. I'll have to find a different solution how I could handle connection problems during runs of the whole tests. |
Comment by jvmlet [ 15/Apr/20 ] |
Would you please include brief guidelines in documentation/FAQ about this and how to configure rest template with retry if you decide to close this issue as 'Won't fix' |
Comment by jvmlet [ 16/Apr/20 ] |
Since we cannot make any kind of assumptions on whether retrying should be used or what the settings should be, we cannot do anything useful here. I tend to disagree, Spring Boot is about conditional configuration with healthy defaults. Customizable. The straightforward solution for this issue is to retry on Connection Reset error. This is how we detect this kind of error :
Exception ex = .....; Predicate<Throwable> shoudRetryPredicate =IOException.class::isInstance .and(t -> Optional.ofNullable(t.getMessage()).orElse("").matches("(?im).*connection\\s+reset\\s+by\\s+peer.*") );
|
Comment by Mark Paluch [ 16/Apr/20 ] |
Note that Spring Boot and Spring Data Elasticsearch different projects. Spring Boot depends on Spring Data Elasticsearch, not the other way round. I suggest filing a ticket in the Spring Boot issue tracker to provide retry support for web clients. Once these are in place, you should be able to configure retry behavior in Spring Boot. Retrying on connection reset might be an assumption that works for your case. In other arrangements that it might be undesired behavior. |
Comment by jvmlet [ 16/Apr/20 ] |
spring-data-elasticsearch is shipped as spring-boot-starter, right ? It means your auto-configuration has full capabilities to create/configure/customize every single bean in app context . Also, spring-data-elasticsearch's ElasticsearchRestTemplate seems to be using org.elasticsearch.client.RestHighLevelClient underneath , how retry support on org.springframework.web.client.RestTemplate can help here ? |
Comment by Mark Paluch [ 16/Apr/20 ] |
You're right about RestTemplate, this was an oversight, it should be the RestHighLevelClient indeed. Spring Data Elasticsearch is shipped independently from Spring Boot. Spring Boot depends on (Spring Data) Elasticsearch. |
Comment by sothawo [ 16/Apr/20 ] |
I just had a deeper look at the org.elasticsearch.client.RestClient that is used by both the low level and the high level Elasticsearch clients. The RestClient has a built-in failure retry. From the Javadocs (https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L87-L91): * The method {@link #performRequest(String, String, Map, HttpEntity, Header...)} allows to send a request to the cluster. When * sending a request, a host gets selected out of the provided ones in a round-robin fashion. Failing hosts are marked dead and * retried after a certain amount of time (minimum 1 minute, maximum 30 minutes), depending on how many times they previously * failed (the more failures, the later they will be retried). In case of failures all of the alive nodes (or dead nodes that * deserve a retry) are retried until one responds or none of them does, in which case an {@link IOException} will be thrown. Every call done from the RestHighLevel client gets down the performRequest method. So if you have more than one node configured, failures will be handled. So when running in a local test or development mode with one node, you might encounter failures. For production use you shouldn't run with one node only anyhow. And if you have a loadbalancer or something like this fronting a multinode cluster, then this should handle failover. |
Comment by jvmlet [ 16/Apr/20 ] |
Would you consider PR that provides extendable failover support with configurable Retry, Noop (Propagate exception) and custom implementation? |
Comment by sothawo [ 16/Apr/20 ] |
Thinking about the change that was implemented with DATAES-751, you can add your own retry-handling pretty easily by providing a custom ElasticsearchRestTemplate:
@Configuration public class RestClientConfig extends AbstractElasticsearchConfiguration { @Override @Bean public RestHighLevelClient elasticsearchClient() { ClientConfiguration clientConfiguration = ClientConfiguration.builder() // .connectedTo("localhost:9200") // .build(); return RestClients.create(clientConfiguration).rest(); } @Override public ElasticsearchOperations elasticsearchOperations(ElasticsearchConverter elasticsearchConverter) { return new ElasticsearchRestTemplate(elasticsearchClient(), elasticsearchConverter) { @Override public <T> T execute(ClientCallback<T> callback) { try { return super.execute(callback); } catch (DataAccessResourceFailureException e) { // retry return super.execute(callback); } } }; } } As all the calls using the client are done via execute(ClientCallback<T>) now, you can just overwrite this method, call the base, handle error retry by calling the base implementation again. So no need for further changes. |
Comment by jvmlet [ 19/Apr/20 ] |
Thanks for the sample, sothawo, but ClientCallback<T> has package private access, please expose as public. |
Comment by sothawo [ 19/Apr/20 ] |
That was recently fixed with DATAES-789 |
Comment by Spring Issues [ 31/Dec/20 ] |
Spring Data has moved from Jira to GitHub Issues. This issue was migrated to spring-projects/spring-data-elasticsearch#1334. |
Generated at Sun Dec 26 03:29:40 UTC 2021 using Jira 8.13.15#813015-sha1:24256f8a55853293c87d0465fdf0407f4975035a.
Tags: [dataes760], handling, dropped, created, 31dec20, updated, connections, 06mar20