1
Problem Description
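Some of our projects use Keycloak for unified identity authentication and access control. The back end is Spring Boot, so the usual setup is Spring Boot with the Keycloak adapter integrated as the single sign-on solution; see the official documentation for the setup steps. This worked without problems until one day a customer reported that pages suddenly would not open and then recovered after a while. Sometimes the service recovered before we had time to locate the problem.
[Figure: how the problem appeared in the front end]
One occurrence lasted several minutes, and other projects (also using Spring Boot with Keycloak integrated) hit the same problem intermittently. Symptoms like this usually mean that some call is slow, so we entered the container and used jstack to print the current thread stacks. The dump is long, so only the relevant part is pasted here: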
"http-nio-8081-exec-9" #61 daemon prio=5 os_prio=0 tid=0x00007efc702c1000 nid=0x4c waiting for monitor entry [0x00007efc778f6000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.keycloak.adapters.rotation.JWKPublicKeyLocator.getPublicKey(JWKPublicKeyLocator.java:60)
- waiting to lock <0x00000003f1888968> (a org.keycloak.adapters.rotation.JWKPublicKeyLocator)
at org.keycloak.adapters.rotation.AdapterTokenVerifier.getPublicKey(AdapterTokenVerifier.java:121)
at org.keycloak.adapters.rotation.AdapterTokenVerifier.createVerifier(AdapterTokenVerifier.java:111)
at org.keycloak.adapters.rotation.AdapterTokenVerifier.verifyToken(AdapterTokenVerifier.java:47)
at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticateToken(BearerTokenRequestAuthenticator.java:103)
at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticate(BearerTokenRequestAuthenticator.java:88)
at org.keycloak.adapters.RequestAuthenticator.authenticate(RequestAuthenticator.java:67)
at org.keycloak.adapters.springsecurity.filter.KeycloakAuthenticationProcessingFilter.attemptAuthentication(KeycloakAuthenticationProcessingFilter.java:154)
at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:212)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter.doFilter(KeycloakPreAuthActionsFilter.java:96)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.springframework.security.web.header.HeaderWriterFilter.doHeadersAfter(HeaderWriterFilter.java:92)
at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:77)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
....
This suggests that the back end was blocked while requesting the public key from Keycloak.
2
Root Cause Analysis
The initial analysis pointed to the call to Keycloak. Line 60 of JWKPublicKeyLocator sits inside the following code:
// Check if we are allowed to send request
synchronized (this) {
    currentTime = Time.currentTime();
    if (currentTime > lastRequestTime + minTimeBetweenRequests) {
        sendRequest(deployment);
        lastRequestTime = currentTime;
    } else {
        log.debug("Won't send request to realm jwks url. Last request time was " + lastRequestTime);
    }
    return lookupCachedKey(publicKeyCacheTtl, currentTime, kid);
}
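JWKPublicKeyLocator is a singleton here: every thread in the process shares the same instance. When one thread cannot leave the synchronized block, every other thread that reaches it can only block. Looking further into sendRequest, that function calls a Keycloak endpoint over HTTP to fetch the public key.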
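The effect can be reproduced with the JDK alone. The following is a minimal sketch, not Keycloak code: SlowKeyLocator and countBlockedThreads are hypothetical names, and a latch stands in for the HTTP request that does not return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class MonitorContentionDemo {

    /** Hypothetical stand-in for a shared singleton whose synchronized block wraps a slow call. */
    public static class SlowKeyLocator {
        final CountDownLatch entered = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        public String getPublicKey() {
            synchronized (this) {
                entered.countDown();
                try {
                    release.await(); // simulates a remote call that hangs while the lock is held
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "key";
            }
        }
    }

    /** Starts the given number of workers and reports how many end up BLOCKED on the monitor. */
    public static int countBlockedThreads(int workers) {
        SlowKeyLocator locator = new SlowKeyLocator();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(locator::getPublicKey, "exec-" + i);
            t.setDaemon(true);
            threads.add(t);
            t.start();
        }
        try {
            locator.entered.await(); // one worker now owns the monitor
            long deadline = System.currentTimeMillis() + 5000;
            int blocked = 0;
            while (System.currentTimeMillis() < deadline) {
                blocked = 0;
                for (Thread t : threads) {
                    if (t.getState() == Thread.State.BLOCKED) {
                        blocked++;
                    }
                }
                if (blocked == workers - 1) {
                    break;
                }
                Thread.sleep(10);
            }
            locator.release.countDown(); // let the "slow call" finish so all workers can exit
            for (Thread t : threads) {
                t.join(5000);
            }
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("BLOCKED threads: " + countBlockedThreads(4));
    }
}
```

With four workers, one thread holds the monitor and the other three should be reported as BLOCKED, which mirrors the "waiting for monitor entry" state in the jstack output.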
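This raises two questions:
- Why would the call that fetches the public key go so long without returning?
- Is a timeout configured so that the call fails fast when the endpoint does not return in time?
Our back end uses the Spring Boot auto-configuration for Keycloak. The configuration file has three main parameters: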
keycloak.realm=realmId
keycloak.resource=clientId
keycloak.auth-server-url=http://127.0.0.1:8180/auth
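These parameters are injected automatically through the KeycloakSpringBootProperties configuration class. keycloak.auth-server-url is the base URL used for calls to Keycloak. In the customer environment it is a domain name rather than ip:port, so each call goes through DNS resolution, load balancing and so on.
If the customer's external network is poor and suffers from jitter, calls made this way can still go a long time without returning. Code analysis shows that the timeout parameters for this call use default values. In org.apache.http.client.config.RequestConfig, the default is built with public static final RequestConfig DEFAULT = (new RequestConfig.Builder()).build(); and the Builder defaults for the three timeout-related parameters are: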
private int connectionRequestTimeout = -1;
private int connectTimeout = -1;
private int socketTimeout = -1;
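A value of -1 means the timeout is undefined and the system default applies, which in our case meant the call effectively never timed out. When one request blocks and cannot release the lock, no other request can respond; they can only wait for the lock to be released.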
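The same behavior exists at the socket level, which a small JDK-only sketch can show. ReadTimeoutDemo and its silent local server are illustrative assumptions, not part of Keycloak or Apache HttpClient. The sketch reports the default SO_TIMEOUT of a plain java.net.Socket (0, which means wait indefinitely) and then shows a read failing fast once a timeout is set.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /** Returns the SO_TIMEOUT of a freshly created socket; 0 means reads wait indefinitely. */
    public static int defaultSoTimeout() {
        try (Socket socket = new Socket()) {
            return socket.getSoTimeout();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Connects to a local server that never sends a byte and attempts a read
     * with the given SO_TIMEOUT in milliseconds. Do not pass 0 here: the read
     * would then block until the process is stopped.
     */
    public static String readWithTimeout(int soTimeoutMillis) {
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            client.setSoTimeout(soTimeoutMillis);
            InputStream in = client.getInputStream();
            in.read(); // no data ever arrives, so this can only end via the timeout
            return "read returned";
        } catch (SocketTimeoutException e) {
            return "timed out";
        } catch (IOException e) {
            return "io error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("default SO_TIMEOUT = " + defaultSoTimeout());
        System.out.println("read with 200 ms timeout: " + readWithTimeout(200));
    }
}
```

A thread that holds a lock during such a read keeps the lock for as long as the read lasts, so bounding the read also bounds how long other threads wait.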
3
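Solution
To summarize, there are two ways to avoid this kind of problem:
- Always set a timeout when calling an external interface, so that a single slow call cannot make the whole service unavailable;
- If Keycloak is deployed in the same LAN, configure the Keycloak address with the internal IP instead of a domain name or external address, so that calls no longer hang because of external network problems.
3.1 Setting the timeout
First, look at the source code of the calling part in the Keycloak SDK.
org.keycloak.adapters.rotation.JWKPublicKeyLocator#sendRequest is as follows: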
private void sendRequest(KeycloakDeployment deployment) {
    if (log.isTraceEnabled()) {
        log.trace("Going to send request to retrieve new set of realm public keys for client " + deployment.getResourceName());
    }
    HttpGet getMethod = new HttpGet(deployment.getJwksUrl());
    try {
        JSONWebKeySet jwks = HttpAdapterUtils.sendJsonHttpRequest(deployment, getMethod, JSONWebKeySet.class);
        Map<String, PublicKey> publicKeys = JWKSUtils.getKeysForUse(jwks, JWK.Use.SIG);
        if (log.isDebugEnabled()) {
            log.debug("Realm public keys successfully retrieved for client " + deployment.getResourceName() + ". New kids: " + publicKeys.keySet().toString());
        }
        // Update current keys
        currentKeys.clear();
        currentKeys.putAll(publicKeys);
    } catch (HttpClientAdapterException e) {
        log.error("Error when sending request to retrieve realm keys", e);
    }
}
org.keycloak.adapters.HttpAdapterUtils#sendJsonHttpRequest is as follows:
public static <T> T sendJsonHttpRequest(KeycloakDeployment deployment, HttpRequestBase httpRequest, Class<T> clazz) throws HttpClientAdapterException {
    try {
        HttpResponse response = deployment.getClient().execute(httpRequest);
        int status = response.getStatusLine().getStatusCode();
        if (status != 200) {
            close(response);
            throw new HttpClientAdapterException("Unexpected status = " + status);
        }
        HttpEntity entity = response.getEntity();
        if (entity == null) {
            throw new HttpClientAdapterException("There was no entity.");
        }
        InputStream is = entity.getContent();
        try {
            return JsonSerialization.readValue(is, clazz);
        } finally {
            try {
                is.close();
            } catch (IOException ignored) {
            }
        }
    } catch (IOException e) {
        throw new HttpClientAdapterException("IO error", e);
    }
}
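In the source, neither HttpGet nor deployment.getClient() sets a timeout, so requests to Keycloak use the default configuration, where -1 leaves the timeout unset. The deployment object is not managed by Spring, but it can be reached through other managed objects, and once created it is globally unique. Our approach is therefore to obtain the global deployment object, get its client, and change the client's settings.
Code analysis shows that the AdapterDeploymentContext instance is managed by Spring and leads to the deployment instance. The next step is deciding how to intercept the request. There are generally two ways, a filter or an interceptor. A filter is more convenient here, because the Keycloak SDK itself defines many filters and supports custom ones, for example the built-in org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter. Modeled on that filter, our custom filter is: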
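import java.io.IOException;
import javax.annotation.Resource;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.params.CoreConnectionPNames;
import org.apache.http.params.HttpParams;
import org.keycloak.adapters.AdapterDeploymentContext;
import org.keycloak.adapters.KeycloakDeployment;
import org.keycloak.adapters.spi.HttpFacade;
import org.keycloak.adapters.springsecurity.facade.SimpleHttpFacade;
import org.springframework.stereotype.Component;

@Component
public class ChangeTimeOutFilter implements Filter {
    @Resource
    private AdapterDeploymentContext deploymentContext;
    private volatile boolean deploymentChanged = false;

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        HttpFacade facade = new SimpleHttpFacade((HttpServletRequest) request, (HttpServletResponse) response);
        KeycloakDeployment deployment = deploymentContext.resolveDeployment(facade);
        if (deployment == null) {
            chain.doFilter(request, response);
            return;
        }
        // The deployment is globally unique, so it only needs to be modified once
        if (deploymentChanged) {
            chain.doFilter(request, response);
            return;
        }
        // Set the timeouts (all values in milliseconds)
        HttpParams params = deployment.getClient().getParams();
        params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, 10000);
        params.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 10000);
        params.setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, 10000L);
        deploymentChanged = true;
        chain.doFilter(request, response);
    }
}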
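To verify that the timeout takes effect, we deliberately set it very short, for example a few milliseconds, so that the call is certain to time out. After deploying the code, the error log shows: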
Error when sending request to retrieve realm keys
org.keycloak.adapters.HttpClientAdapterException: IO error
at org.keycloak.adapters.HttpAdapterUtils.sendJsonHttpRequest(HttpAdapterUtils.java:57)
at org.keycloak.adapters.rotation.JWKPublicKeyLocator.sendRequest(JWKPublicKeyLocator.java:99)
at org.keycloak.adapters.rotation.JWKPublicKeyLocator.getPublicKey(JWKPublicKeyLocator.java:63)
at org.keycloak.adapters.rotation.AdapterTokenVerifier.getPublicKey(AdapterTokenVerifier.java:121)
at org.keycloak.adapters.rotation.AdapterTokenVerifier.createVerifier(AdapterTokenVerifier.java:111)
at org.keycloak.adapters.rotation.AdapterTokenVerifier.verifyToken(AdapterTokenVerifier.java:47)
at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticateToken(BearerTokenRequestAuthenticator.java:103)
at org.keycloak.adapters.BearerTokenRequestAuthenticator.authenticate(BearerTokenRequestAuthenticator.java:88)
at org.keycloak.adapters.RequestAuthenticator.authenticate(RequestAuthenticator.java:67)
......
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
This error shows that the timeout configuration has taken effect.
3.2 Pointing the Keycloak address at the internal network
The original Keycloak configuration:
keycloak.realm=atlas
keycloak.resource=atlas-assistant
keycloak.auth-server-url=http://122.122.122.122:8180/auth
keycloak.ssl-required=none
keycloak.public-client=true
keycloak.use-resource-role-mappings=true
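Change keycloak.auth-server-url to the internal address:
keycloak.auth-server-url=http://172.17.0.1:8180/auth
The front end needs the Keycloak address returned by the back end to redirect the browser to the login page, so that returned address must be the external one (the front end cannot reach the internal address). We therefore added a new configuration item that holds the external address and is only returned to the front end. Previously keycloak.auth-server-url served both purposes; now the two are split:
environment.keycloak.auth-server-url=http://122.122.122.122:8180/auth
After changing the code accordingly and deploying to the test environment, the page reported an error:
[Figure: login failure after switching to the internal address]
First, the error log (it is written at DEBUG level rather than ERROR, which makes us question this part of the source):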
2022-04-01 16:45:33.069 [http-nio-8081-exec-3] DEBUG o.k.adapters.BearerTokenRequestAuthenticator - Found [1] values in authorization header, selecting the first value for Bearer.
2022-04-01 16:45:33.069 [http-nio-8081-exec-3] DEBUG o.k.adapters.BearerTokenRequestAuthenticator - Verifying access_token
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG o.k.adapters.BearerTokenRequestAuthenticator - Failed to verify token
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG org.keycloak.adapters.RequestAuthenticator - Bearer FAILED
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG o.k.a.s.f.KeycloakAuthenticationProcessingFilter - Auth outcome: FAILED
2022-04-01 16:45:33.071 [http-nio-8081-exec-3] DEBUG o.k.a.s.f.KeycloakAuthenticationProcessingFilter - Authentication request failed: org.keycloak.adapters.springsecurity.KeycloakAuthenticationException: Invalid authorization header, see WWW-Authenticate header for details
org.keycloak.adapters.springsecurity.KeycloakAuthenticationException: Invalid authorization header, see WWW-Authenticate header for details
at org.keycloak.adapters.springsecurity.filter.KeycloakAuthenticationProcessingFilter.attemptAuthentication(KeycloakAuthenticationProcessingFilter.java:162)
at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:212)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter.doFilter(KeycloakPreAuthActionsFilter.java:96)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:215)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:178)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:358)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:271)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
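Source analysis finally traced the failure to a RealmUrlCheck validation that did not pass.
The check compares two values: one is the internal address and the other is the external address. They point to the same place, but the strings differ, so the check fails. Why the two addresses differ follows from the interaction between the front end and the back end:
In step 1.2 of that interaction, the address the back end returns to the browser must be the external address, so the token generated after the front end's request carries the external address. The realmUrl in the deployment is parsed from the configuration (keycloak.auth-server-url) and is the internal address, while the JsonWebToken is parsed from the token the front end sends to the back end and contains the external address. When the token is validated in step 4.2, the two addresses do not match, the back end treats the token as possibly tampered with, and throws an exception. We considered two solutions:
Option 1: extend the original SDK with a custom KeycloakConfigResolver, KeycloakDeployment and so on, which is fairly complex;
Option 2: in the filter above, modify the deployment data and replace the realmUrl address from the internal one with the domain name. The actual requests do not use this parameter (they use authServerBaseUrl), so internal calls are not affected.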
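The mismatch behind the failed check can be illustrated in a few lines. This is a simplified sketch and not Keycloak's implementation: IssuerCheckSketch, realmUrl and issuerMatches are hypothetical names, and the URL layout {auth-server-url}/realms/{realm} is assumed from the configuration shown earlier.

```java
public class IssuerCheckSketch {

    /** Builds a realm URL, assuming the layout {auth-server-url}/realms/{realm}. */
    public static String realmUrl(String authServerUrl, String realm) {
        String base = authServerUrl.endsWith("/")
                ? authServerUrl.substring(0, authServerUrl.length() - 1)
                : authServerUrl;
        return base + "/realms/" + realm;
    }

    /** Simplified stand-in for an issuer check: plain string equality, no DNS or routing awareness. */
    public static boolean issuerMatches(String expectedRealmUrl, String tokenIssuer) {
        return expectedRealmUrl.equals(tokenIssuer);
    }

    public static void main(String[] args) {
        // What the back end derives from its (internal) configuration
        String expected = realmUrl("http://172.17.0.1:8180/auth", "atlas");
        // What a token issued through the external address would carry
        String issuer = realmUrl("http://122.122.122.122:8180/auth", "atlas");

        System.out.println(issuerMatches(expected, issuer)); // false: same server, different strings
        System.out.println(issuerMatches(issuer, issuer));   // true once both sides use the external URL
    }
}
```

Because the comparison is purely textual, making the expected value equal to the external URL is enough for the check to pass, which is what both options aim for.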
To save time we chose option 2, and the filter became:
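import java.io.IOException;
import java.lang.reflect.Field;
import javax.annotation.Resource;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.commons.lang3.StringUtils; // assumed: Apache Commons Lang provides isBlank
import org.apache.http.client.params.ClientPNames;
import org.apache.http.params.CoreConnectionPNames;
import org.apache.http.params.HttpParams;
import org.keycloak.adapters.AdapterDeploymentContext;
import org.keycloak.adapters.KeycloakDeployment;
import org.keycloak.adapters.spi.HttpFacade;
import org.keycloak.adapters.springsecurity.facade.SimpleHttpFacade;
import org.springframework.stereotype.Component;

@Component
public class ChangeTimeOutFilter implements Filter {
    @Resource
    private AdapterDeploymentContext deploymentContext;
    // Project-specific configuration bean (not shown here) holding the internal and external URLs
    @Resource
    private KeyCloakConfig keyCloakConfig;
    private static Field realmInfoUrlFd;
    private volatile boolean deploymentChanged = false;

    static {
        try {
            ChangeTimeOutFilter.realmInfoUrlFd = KeycloakDeployment.class.getDeclaredField("realmInfoUrl");
            realmInfoUrlFd.setAccessible(true);
        } catch (Exception ex) {
            // Ignored: realmInfoUrlFd stays null and the rewrite below silently does nothing
        }
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        HttpFacade facade = new SimpleHttpFacade((HttpServletRequest) request, (HttpServletResponse) response);
        KeycloakDeployment deployment = deploymentContext.resolveDeployment(facade);
        if (deployment == null) {
            chain.doFilter(request, response);
            return;
        }
        // The deployment is globally unique, so it only needs to be modified once
        if (deploymentChanged) {
            chain.doFilter(request, response);
            return;
        }
        // Change realmInfoUrl from the internal address to the external one so that the check passes
        String realmInfoUrl = deployment.getRealmInfoUrl();
        if (!StringUtils.isBlank(realmInfoUrl)) {
            // replace() is literal; replaceAll() would treat the dots in the URL as regex wildcards
            realmInfoUrl = realmInfoUrl.replace(keyCloakConfig.getInnerUrl(), keyCloakConfig.getAuthUrl());
            try {
                realmInfoUrlFd.set(deployment, realmInfoUrl);
            } catch (Exception ex) {
                // Ignored: the deployment keeps its original realmInfoUrl
            }
        }
        // Set the timeouts (all values in milliseconds)
        HttpParams params = deployment.getClient().getParams();
        params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, 10000);
        params.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 10000);
        params.setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, 10000L);
        deploymentChanged = true;
        chain.doFilter(request, response);
    }
}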
Changing the field value via reflection makes the two addresses consistent, so the check passes.
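For readers less familiar with the reflection step, here is a minimal JDK-only sketch of the technique. The Deployment class below is a hypothetical stand-in with a getter and no setter; it is not Keycloak's KeycloakDeployment, and the URLs are the example values from the configuration above.

```java
import java.lang.reflect.Field;

public class ReflectiveFieldUpdateDemo {

    /** Hypothetical stand-in for an object that exposes a getter but no setter. */
    public static class Deployment {
        private String realmInfoUrl;

        public Deployment(String realmInfoUrl) {
            this.realmInfoUrl = realmInfoUrl;
        }

        public String getRealmInfoUrl() {
            return realmInfoUrl;
        }
    }

    /** Overwrites a private field by name; fails loudly instead of swallowing the error. */
    public static void overwriteField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot update field " + fieldName, e);
        }
    }

    /** Rewrites the internal base URL to the external one and returns the resulting field value. */
    public static String demo() {
        Deployment deployment = new Deployment("http://172.17.0.1:8180/auth/realms/atlas");
        String rewritten = deployment.getRealmInfoUrl()
                .replace("http://172.17.0.1:8180/auth", "http://122.122.122.122:8180/auth");
        overwriteField(deployment, "realmInfoUrl", rewritten);
        return deployment.getRealmInfoUrl();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Throwing on failure is a deliberate choice in this sketch: if the field is ever renamed in a newer library version, the problem surfaces at startup rather than as a silent authentication failure.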
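With the solutions above, we have resolved this sudden error. I will keep sharing pitfalls and lessons learned on the "觀遠資料技術團隊" (Guandata technical team) account; you are welcome to follow along and join the discussion.
Author: 杭州小丁, back-end developer at 觀遠資料 and an internet industry veteran with long experience in J2EE. He enjoys studying open-source code to improve his own skills and specializes in system architecture design and implementation, design patterns, microservices, concurrent programming and domain modeling, with a focus on building business systems that are highly stable, efficient and scalable.
Source: WeChat official account 觀遠資料技術團隊
Original: https://mp.weixin.qq.com/s/GyTWRV19qrDiUKH9lKTa-Q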