
Technical Practice Sharing | Pitfalls encountered when using Keycloak with SpringBoot

author:Flash Gene

1 Problem description

Several of our company's projects use Keycloak for unified identity authentication and permission control. The backend is built on SpringBoot, so SpringBoot integrated with Keycloak is our standard unified login solution; the setup process is covered in the official documentation. This worked without problems until one day a customer reported that the page suddenly would not open and then recovered after a while. Sometimes the service recovered before we had time to locate the problem.

[Figure: the problem as it appeared on the front end]

On one occasion the outage lasted several minutes, and other projects that integrate Keycloak with SpringBoot occasionally hit the same problem. This kind of symptom usually means that an interface is taking too long to respond, so we entered the container and ran jstack to print the current thread stacks. The output is long, so only the problematic part is shown here:

"http-nio-8081-exec-9" #61 daemon prio=5 os_prio=0 tid=0x00007efc702c1000 nid=0x4c waiting for monitor entry [0x00007efc778f6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.keycloak.adapters.rotation.JWKPublicKeyLocator.getPublicKey(JWKPublicKeyLocator.java:60)
        - waiting to lock <0x00000003f1888968> (a org.keycloak.adapters.rotation.JWKPublicKeyLocator)
        at org.keycloak.adapters.rotation.AdapterTokenVerifier.getPublicKey(AdapterTokenVerifier.java:121)
        at org.keycloak.adapters.rotation.AdapterTokenVerifier.createVerifier(AdapterTokenVerifier.java:111)
        at org.keycloak.adapters.rotation.AdapterTokenVerifier.verifyToken(AdapterTokenVerifier.java:47)
        at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticateToken(BearerTokenRequestAuthenticator.java:103)
        at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticate(BearerTokenRequestAuthenticator.java:88)
        at org.keycloak.adapters.RequestAuthenticator.authenticate(RequestAuthenticator.java:67)
        at org.keycloak.adapters.springsecurity.filter.KeycloakAuthenticationProcessingFilter.attemptAuthentication(KeycloakAuthenticationProcessingFilter.java:154)
        at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:212)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter.doFilter(KeycloakPreAuthActionsFilter.java:96)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.header.HeaderWriterFilter.doHeadersAfter(HeaderWriterFilter.java:92)
        at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:77)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
....           

The thread is blocked waiting for a lock on the path where the backend fetches the publicKey from Keycloak.

2 Problem cause analysis

The initial analysis was that the call to Keycloak was the problem. The thread is stuck at line 60 of JWKPublicKeyLocator, where the code is as follows:

// Check if we are allowed to send request
synchronized (this) {
    currentTime = Time.currentTime();
    if (currentTime > lastRequestTime + minTimeBetweenRequests) {
        sendRequest(deployment);
        lastRequestTime = currentTime;
    } else {
        log.debug("Won't send request to realm jwks url. Last request time was " + lastRequestTime);
    }

    return lookupCachedKey(publicKeyCacheTtl, currentTime, kid);
}           

JWKPublicKeyLocator is a singleton, so all threads in the same process share one instance. When one thread cannot leave the synchronized block, every other thread blocks at its entry. Inside the block, the sendRequest function calls a Keycloak interface over HTTP to fetch the publicKey.
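The blocking behaviour can be reproduced without Keycloak. The following is a minimal sketch that uses only the JDK; SlowLocator, getPublicKey(boolean) and observeWaiterState() are hypothetical names chosen for illustration and are not Keycloak code. One thread hangs inside a synchronized method, and a second thread calling the same method should end up in the BLOCKED state seen in the jstack output:

```java
import java.util.concurrent.CountDownLatch;

public class MonitorBlockDemo {

    // Hypothetical stand-in for a shared singleton such as the key locator.
    static class SlowLocator {
        final CountDownLatch entered = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Simulates a remote call that hangs while the monitor is held.
        synchronized String getPublicKey(boolean hang) {
            if (hang) {
                entered.countDown();
                awaitQuietly(release);
            }
            return "key";
        }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns the state of a second caller while the first caller holds the lock.
    public static Thread.State observeWaiterState() {
        SlowLocator locator = new SlowLocator();
        Thread holder = new Thread(() -> locator.getPublicKey(true));
        Thread waiter = new Thread(() -> locator.getPublicKey(false));
        holder.start();
        awaitQuietly(locator.entered); // the holder is now inside the synchronized method
        waiter.start();

        Thread.State state = waiter.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        try {
            // Poll until the waiter has reached the monitor entry (or give up after 5 s).
            while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = waiter.getState();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            locator.release.countDown(); // let the holder finish, which frees the lock
        }
        try {
            holder.join();
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println("waiter state while the lock is held: " + observeWaiterState());
    }
}
```

The holder never calls wait(), so it keeps the monitor for as long as the simulated remote call hangs. That is the same situation as an HTTP request without a timeout inside the synchronized block.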

This raises two questions:

  1. Why does the interface that fetches the publicKey not return for a long time?
  2. Is there a timeout that makes the call fail fast when the interface does not return in time?

Our backend uses SpringBoot auto-configuration for Keycloak, and the configuration file has three main parameters:

keycloak.realm=realmId
keycloak.resource=clientId
keycloak.auth-server-url=http://127.0.0.1:8180/auth           

These parameters are injected automatically by the KeycloakSpringBootProperties configuration class. keycloak.auth-server-url sets the base URL used for calls to Keycloak. In the customer environment this parameter is a domain name rather than an ip+port address, so each call goes through domain name resolution, load balancing, and so on.

If the customer's public network is poor and network jitter occurs, a call made this way may not return for a long time. Code analysis shows that the timeouts for this call are left at their default values. See the org.apache.http.client.config.RequestConfig class, whose default is public static final RequestConfig DEFAULT = (new RequestConfig.Builder()).build(); that is, the default values in the Builder. The Builder has three timeout parameters:

private int connectionRequestTimeout = -1;
private int connectTimeout = -1;
private int socketTimeout = -1;           

A value of -1 means that the timeout is undefined and the system default applies, which in practice means no timeout, so by default our call never times out. When one request hangs and cannot release the lock, no other request can be served; they can only wait for the lock to be released.

3 Problem solving

To summarize, there are two ways to avoid this kind of problem:

  1. Always set a timeout when calling an external interface, so that a single hung call cannot make the whole service unavailable.
  2. If Keycloak is deployed in the same LAN, configure its address as the private IP instead of a domain name or public address, so that calls are not delayed by public network problems.
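The first point can be illustrated with the JDK alone. The sketch below is not the Keycloak adapter's HTTP client; it uses java.net.HttpURLConnection against a local ServerSocket that accepts connections but never answers, and ReadTimeoutDemo and timesOut are hypothetical names. With a read timeout set, the call should fail fast with a SocketTimeoutException instead of hanging:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

public class ReadTimeoutDemo {

    // Returns true if a request to a silent server fails with a timeout.
    public static boolean timesOut(int readTimeoutMillis) {
        InetAddress loopback;
        try {
            loopback = InetAddress.getByName("127.0.0.1");
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        // The OS completes the TCP handshake, but nothing ever reads or writes on the server side.
        try (ServerSocket silent = new ServerSocket(0, 1, loopback)) {
            URL url = new URL("http://127.0.0.1:" + silent.getLocalPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(1000);           // limit for establishing the connection
            conn.setReadTimeout(readTimeoutMillis); // limit for waiting on response data
            try {
                conn.getResponseCode();             // blocks until data arrives or the timeout fires
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + timesOut(200));
    }
}
```

Without the setReadTimeout call the same request would wait indefinitely, and any lock held by the caller would stay held for just as long.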

3.1 Set the timeout period

First, analyze the source code of the calling part in the Keycloak adapter library.

org.keycloak.adapters.rotation.JWKPublicKeyLocator#sendRequest is as follows:

private void sendRequest(KeycloakDeployment deployment) {
      if (log.isTraceEnabled()) {
          log.trace("Going to send request to retrieve new set of realm public keys for client " + deployment.getResourceName());
      }

      HttpGet getMethod = new HttpGet(deployment.getJwksUrl());
      try {
          JSONWebKeySet jwks = HttpAdapterUtils.sendJsonHttpRequest(deployment, getMethod, JSONWebKeySet.class);

          Map<String, PublicKey> publicKeys = JWKSUtils.getKeysForUse(jwks, JWK.Use.SIG);

          if (log.isDebugEnabled()) {
              log.debug("Realm public keys successfully retrieved for client " +  deployment.getResourceName() + ". New kids: " + publicKeys.keySet().toString());
          }

          // Update current keys
          currentKeys.clear();
          currentKeys.putAll(publicKeys);

      } catch (HttpClientAdapterException e) {
          log.error("Error when sending request to retrieve realm keys", e);
      }
   }           

org.keycloak.adapters.HttpAdapterUtils#sendJsonHttpRequest is as follows:

public static <T> T sendJsonHttpRequest(KeycloakDeployment deployment, HttpRequestBase httpRequest, Class<T> clazz) throws HttpClientAdapterException {
      try {
          HttpResponse response = deployment.getClient().execute(httpRequest);
          int status = response.getStatusLine().getStatusCode();
          if (status != 200) {
              close(response);
              throw new HttpClientAdapterException("Unexpected status = " + status);
          }
          HttpEntity entity = response.getEntity();
          if (entity == null) {
              throw new HttpClientAdapterException("There was no entity.");
          }
          InputStream is = entity.getContent();
          try {
              return JsonSerialization.readValue(is, clazz);
          } finally {
              try {
                  is.close();
              } catch (IOException ignored) {

              }
          }
      } catch (IOException e) {
          throw new HttpClientAdapterException("IO error", e);
      }
  }           

Neither HttpGet nor deployment.getClient() sets a timeout in this code, so requests to the Keycloak interface use the default configuration of -1, that is, no timeout. The deployment object is not managed by Spring, but it can be reached through other managed objects, and once created it is globally unique. The idea is therefore to get the global deployment object, get its client, and change the client's settings.

Code analysis shows that the AdapterDeploymentContext instance is managed by Spring, and the deployment instance can be found through it. Next we need to decide how to intercept requests. There are generally two options, a filter or an interceptor. A filter is more convenient in this scenario because the Keycloak adapter library itself defines many filters and supports custom ones, for example org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter. Using that filter as a reference, we wrote our own:

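import java.io.IOException;

import javax.annotation.Resource;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.http.client.params.ClientPNames;
import org.apache.http.params.CoreConnectionPNames;
import org.apache.http.params.HttpParams;
import org.keycloak.adapters.AdapterDeploymentContext;
import org.keycloak.adapters.KeycloakDeployment;
import org.keycloak.adapters.spi.HttpFacade;
import org.keycloak.adapters.springsecurity.facade.SimpleHttpFacade;
import org.springframework.stereotype.Component;

@Component
public class ChangeTimeOutFilter implements Filter {

    @Resource
    private AdapterDeploymentContext deploymentContext;

    private volatile boolean deploymentChanged = false;

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {

        HttpFacade facade = new SimpleHttpFacade((HttpServletRequest)request, (HttpServletResponse)response);
        KeycloakDeployment deployment = deploymentContext.resolveDeployment(facade);

        if (deployment == null) {
            chain.doFilter(request, response);
            return;
        }

        // The deployment is globally unique, so it only needs to be modified once
        if (deploymentChanged) {
            chain.doFilter(request, response);
            return;
        }

        // Set the timeouts (in milliseconds) on the shared HTTP client
        HttpParams params = deployment.getClient().getParams();
        params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, 10000);
        params.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 10000);
        params.setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, 10000L);
        deploymentChanged = true;

        chain.doFilter(request, response);
    }
}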

To verify that the timeout takes effect, we provoked an error deliberately: we set the timeout to a very short value, such as a few milliseconds, so that the call was certain to time out. Once this code was deployed, the error log showed:

Error when sending request to retrieve realm keys
org.keycloak.adapters.HttpClientAdapterException: IO error
        at org.keycloak.adapters.HttpAdapterUtils.sendJsonHttpRequest(HttpAdapterUtils.java:57)
        at org.keycloak.adapters.rotation.JWKPublicKeyLocator.sendRequest(JWKPublicKeyLocator.java:99)
        at org.keycloak.adapters.rotation.JWKPublicKeyLocator.getPublicKey(JWKPublicKeyLocator.java:63)
        at org.keycloak.adapters.rotation.AdapterTokenVerifier.getPublicKey(AdapterTokenVerifier.java:121)
        at org.keycloak.adapters.rotation.AdapterTokenVerifier.createVerifier(AdapterTokenVerifier.java:111)
        at org.keycloak.adapters.rotation.AdapterTokenVerifier.verifyToken(AdapterTokenVerifier.java:47)
        at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticateToken(BearerTokenRequestAuthenticator.java:103)
        at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticate(BearerTokenRequestAuthenticator.java:88)
        at org.keycloak.adapters.RequestAuthenticator.authenticate(RequestAuthenticator.java:67)
        ......
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)           

The above error message indicates that the timeout configuration has taken effect.

3.2 Set the Keycloak access address to the intranet address

The original keycloak configuration information:

keycloak.realm=atlas
keycloak.resource=atlas-assistant
keycloak.auth-server-url=http://122.122.122.122:8180/auth
keycloak.ssl-required=none
keycloak.public-client=true
keycloak.use-resource-role-mappings=true           

Change keycloak.auth-server-url to the private address:

keycloak.auth-server-url=http://172.17.0.1:8180/auth

The backend also has to return the Keycloak address to the front end so that the browser can be redirected to the login page. That address must be the public one, because the browser cannot reach the private address. We therefore added a new configuration item whose value is the public address and which is used only for returning to the front end (previously keycloak.auth-server-url served both purposes; now the two are split):

environment.keycloak.auth-server-url=http://122.122.122.122:8180/auth

After the corresponding code changes were made and deployed to the test environment, the page reported an error:

[Figure: login failure after switching to the private network address]

Let's look at the error log first. Note that this log is at debug level rather than error level, which is a questionable choice in the source code:

2022-04-01 16:45:33.069 [http-nio-8081-exec-3] DEBUG o.k.adapters.BearerTokenRequestAuthenticator - Found [1] values in authorization header, selecting the first value for Bearer. 
2022-04-01 16:45:33.069 [http-nio-8081-exec-3] DEBUG o.k.adapters.BearerTokenRequestAuthenticator - Verifying access_token 
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG o.k.adapters.BearerTokenRequestAuthenticator - Failed to verify token 
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG org.keycloak.adapters.RequestAuthenticator - Bearer FAILED 
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG o.k.a.s.f.KeycloakAuthenticationProcessingFilter - Auth outcome: FAILED 
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG o.k.a.s.f.KeycloakAuthenticationProcessingFilter - Authentication request failed: org.keycloak.adapters.springsecurity.KeycloakAuthenticationException: Invalid authorization header, see WWW-Authenticate header for details 
org.keycloak.adapters.springsecurity.KeycloakAuthenticationException: Invalid authorization header, see WWW-Authenticate header for details
	at org.keycloak.adapters.springsecurity.filter.KeycloakAuthenticationProcessingFilter.attemptAuthentication(KeycloakAuthenticationProcessingFilter.java:162)
	at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:212)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
	at org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter.doFilter(KeycloakPreAuthActionsFilter.java:96)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
	at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:215)
	at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:178)
	at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:358)
	at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:271)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)           

After analyzing the source code, we finally found that the RealmUrlCheck validation was failing.


The check compares two values: one is the private network address and the other is the public network address. Both point to the same place, but the strings differ, so the verification fails. To see why two different addresses appear, start from the interaction between the front end and the back end:

[Figure: interaction flow between the front end and the back end]

In step 1.2 of the flow, the address the backend returns to the browser must be the public address, so the token issued after the front end logs in carries the public address. The realmUrl in the deployment is derived from the configuration (keycloak.auth-server-url), which is now the private address, while the JsonWebToken is parsed from the token that the front end sends to the backend and contains the public address. In step 4.2, when the token is verified, the two addresses do not match, so the backend assumes the token may have been tampered with and throws an exception. We considered two options to solve this:
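The mismatch can be sketched as a plain string comparison between the configured realm URL and the issuer claim in the token. This is an illustration only, not the adapter's RealmUrlCheck source: IssuerCheckDemo and its methods are hypothetical, the token is unsigned and built by hand, the realm path /realms/atlas is assumed from the configuration shown earlier, and the iss claim is extracted with a simple regular expression rather than a JSON parser:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IssuerCheckDemo {

    private static final Pattern ISS = Pattern.compile("\"iss\"\\s*:\\s*\"([^\"]+)\"");

    // Builds an unsigned token whose payload carries only an issuer claim (illustration only).
    public static String fakeToken(String issuer) {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String header = enc.encodeToString("{\"alg\":\"none\"}".getBytes(StandardCharsets.UTF_8));
        String payload = enc.encodeToString(("{\"iss\":\"" + issuer + "\"}").getBytes(StandardCharsets.UTF_8));
        return header + "." + payload + ".";
    }

    // Reads the "iss" claim from the token payload without verifying any signature.
    public static String issuerOf(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a JWT");
        }
        String payload = new String(Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        Matcher m = ISS.matcher(payload);
        if (!m.find()) {
            throw new IllegalArgumentException("no iss claim");
        }
        return m.group(1);
    }

    // The check passes only when the configured realm URL equals the token issuer exactly.
    public static boolean issuerMatches(String configuredRealmUrl, String jwt) {
        return configuredRealmUrl.equals(issuerOf(jwt));
    }

    public static void main(String[] args) {
        String publicRealm = "http://122.122.122.122:8180/auth/realms/atlas";
        String privateRealm = "http://172.17.0.1:8180/auth/realms/atlas";
        String token = fakeToken(publicRealm); // issued via the public address
        System.out.println("backend configured with private URL: " + issuerMatches(privateRealm, token));
        System.out.println("backend configured with public URL:  " + issuerMatches(publicRealm, token));
    }
}
```

Because the comparison is an exact string match, two addresses that reach the same server still count as different issuers.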

Option 1: Extend the original library with a custom KeycloakConfigResolver, KeycloakDeployment, and so on, which is more complicated.

Option 2: In the filter above, modify the deployment data so that the realmUrl address is the public address (the domain name) rather than the private one. This parameter is not used when real requests are made (those use the authServerBaseUrl parameter), so the change does not affect internal calls.

To save time, we chose option 2. The filter becomes:

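import java.io.IOException;
import java.lang.reflect.Field;

import javax.annotation.Resource;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Assumption: StringUtils comes from Apache Commons Lang 3; the original does not show its import
import org.apache.commons.lang3.StringUtils;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.params.CoreConnectionPNames;
import org.apache.http.params.HttpParams;
import org.keycloak.adapters.AdapterDeploymentContext;
import org.keycloak.adapters.KeycloakDeployment;
import org.keycloak.adapters.spi.HttpFacade;
import org.keycloak.adapters.springsecurity.facade.SimpleHttpFacade;
import org.springframework.stereotype.Component;

@Component
public class ChangeTimeOutFilter implements Filter {

    @Resource
    private AdapterDeploymentContext deploymentContext;

    // Project-specific configuration class that holds the private and public Keycloak URLs
    @Resource
    private KeyCloakConfig keyCloakConfig;

    private static Field realmInfoUrlFd;

    private volatile boolean deploymentChanged = false;

    static {
        try {
            // realmInfoUrl has no public setter, so it is made writable through reflection
            ChangeTimeOutFilter.realmInfoUrlFd = KeycloakDeployment.class.getDeclaredField("realmInfoUrl");
            realmInfoUrlFd.setAccessible(true);
        } catch (Exception ex){
            // Ignored: if the field cannot be resolved, the URL is simply left unchanged
        }
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {

        HttpFacade facade = new SimpleHttpFacade((HttpServletRequest)request, (HttpServletResponse)response);
        KeycloakDeployment deployment = deploymentContext.resolveDeployment(facade);

        if (deployment == null) {
            chain.doFilter(request, response);
            return;
        }

        // The deployment is globally unique, so it only needs to be modified once
        if (deploymentChanged) {
            chain.doFilter(request, response);
            return;
        }

        // Change realmInfoUrl from the private address to the public one so that the check passes
        String realmInfoUrl = deployment.getRealmInfoUrl();
        if (!StringUtils.isBlank(realmInfoUrl)) {
            // replace() treats the URL literally; replaceAll() would interpret it as a regex
            realmInfoUrl = realmInfoUrl.replace(keyCloakConfig.getInnerUrl(), keyCloakConfig.getAuthUrl());
            try {
                realmInfoUrlFd.set(deployment, realmInfoUrl);
            } catch (Exception ex){
                // Ignored: on failure the original realmInfoUrl stays in place
            }
        }

        // Set the timeouts (in milliseconds) on the shared HTTP client
        HttpParams params = deployment.getClient().getParams();
        params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, 10000);
        params.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 10000);
        params.setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, 10000L);
        deploymentChanged = true;

        chain.doFilter(request, response);
    }
}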

Changing the field's value by reflection makes the two addresses consistent, so the validation passes.
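The reflection technique itself can be tried in isolation. In the sketch below, Deployment is a hypothetical stand-in with a private field and no setter; it is not Keycloak's KeycloakDeployment, and overrideRealmInfoUrl is an illustrative helper. Unlike the filter above, this version rethrows reflection failures instead of swallowing them, which may make a broken override easier to notice:

```java
import java.lang.reflect.Field;

public class ReflectionOverrideDemo {

    // Hypothetical stand-in for a library class that exposes no setter for its URL.
    public static class Deployment {
        private String realmInfoUrl;

        public Deployment(String realmInfoUrl) {
            this.realmInfoUrl = realmInfoUrl;
        }

        public String getRealmInfoUrl() {
            return realmInfoUrl;
        }
    }

    // Overwrites the private field through reflection and fails loudly if that is not possible.
    public static void overrideRealmInfoUrl(Deployment deployment, String newUrl) {
        try {
            Field field = Deployment.class.getDeclaredField("realmInfoUrl");
            field.setAccessible(true);
            field.set(deployment, newUrl);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not override realmInfoUrl", e);
        }
    }

    public static void main(String[] args) {
        Deployment deployment = new Deployment("http://172.17.0.1:8180/auth/realms/atlas");
        overrideRealmInfoUrl(deployment, "http://122.122.122.122:8180/auth/realms/atlas");
        System.out.println(deployment.getRealmInfoUrl());
    }
}
```

Relying on a private field name ties the code to one library version, so a check like this is worth re-running after upgrading the dependency.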

With the solutions above, we resolved this intermittent failure. In the future I will continue to share stories and lessons learned on the "Guanyuan Data Technology Team" account, and everyone is welcome to follow and join the discussion.

Author: Hangzhou Xiaoding, Guanyuan data back-end development, Internet veteran, has been working in the field of J2EE for a long time, keen to study various open source code and improve personal skills, good at system architecture design and implementation, design patterns, microservices, concurrent programming, domain modeling, etc., and is committed to providing high stability, high efficiency and high scalability of business systems through technology.

Source-WeChat public account: Guanyuan Data Technical Team

Source: https://mp.weixin.qq.com/s/GyTWRV19qrDiUKH9lKTa-Q