天天看點

HttpClient在多線程環境下踩坑總結netstat -anpt

HttpClient在多線程環境下踩坑總結

問題現場

在多線程環境下使用HttpClient元件對某個HTTP服務發起請求,運作一段時間之後發現用戶端主機CPU使用率呈現出下降趨勢,而不是一個穩定的狀态。

而且,從程式日志中判斷有線程處于夯住的狀态,應該是被阻塞了。

CPU使用率逐漸下降

問題排查

一開始找不到原因,懷疑是多線程并發導緻的死鎖問題,但是通過代碼審查并未定位到任何可能的多線程并發問題。

甚至開始懷疑是否是因為記憶體資源不夠引起JVM頻繁GC到導緻業務線程被暫停,但是從GC的日志輸出結果看,GC是正常的。

于是,進入一種丈二和尚摸不着頭腦頭腦的狀态,再次Review代碼,發現并未設定請求逾時時間,于是設定逾時控制,發現問題依然存在,徹底懵逼了。

最後,dump線程堆棧和記憶體堆棧,再對堆棧資料進行分析。從分析結果看,确認是因為Socket連接配接在讀取資料時被阻塞引起線程夯住。搜尋“httpclient 逾時”關鍵字,找到各式各樣設定HttpClient逾時控制的方式,均嘗試過但是并未生效。

實際上到後來才知道,HttpCient的逾時控制在不同的版本中設定請求逾時參數的方式均各不相同,這才導緻了我使用了網上看到的方式設定之後并未生效。當然,根本原因還是因為對HttpClient這個元件不熟悉導緻的.

問題重制

1.HttpClient版本

<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5.2</version>           

2.Java代碼

public class HttpClientTest {

private AtomicInteger counter = new AtomicInteger(0);
private String url = "http://www.baidu.com/";

public static void main(String[] args) {
    new HttpClientTest().test();
}

// 執行測試
private void test() {
    int number = 100000; // 總請求數
    int concurrent = 50; // 每次并發請求數
    CountDownLatch countDownLatch = new CountDownLatch(number); // 計數器
    ExecutorService threadPool = Executors.newFixedThreadPool(concurrent); // 線程池
    int concurrentPer = concurrent;
    boolean over = false;
    while(!over) {
        number = number - concurrent;
        if(number <= 0) {
            concurrentPer = number + concurrent;
            over = true;
        }

        // 線程池批量送出
        for(int i = 0; i < concurrentPer; i++) {
            threadPool.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        request(url);
                        Thread.sleep(100);
                    } catch (IOException | InterruptedException e) {
                        e.printStackTrace();
                    } finally {
                        countDownLatch.countDown();
                    }
                }
            });
        }
    }
    try {
        countDownLatch.await();
        threadPool.shutdown();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

// 通路指定位址
private void request(String url) throws IOException {
    HttpGet httpGet = new HttpGet(url);
    commnicate(httpGet);
}

// 負責底層通信處理
private void commnicate(HttpRequestBase request) throws IOException {
    ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
        @Override
        public String handleResponse(HttpResponse response) throws ClientProtocolException, IOException {
            return EntityUtils.toString(response.getEntity());
        }
    };

    HttpClient client = HttpClients.createDefault();
    String body = client.execute(request, responseHandler); // 線程可能會在這裡被阻塞
    System.out.println(String.format("body size: %s, counter: %s", body.length(), counter.incrementAndGet()));
}           

}

運作上述代碼一段時間後很容易可以重制出問題,如下為運作控制台資訊:

HttpClient重制報錯

并且線程全部夯住,程序無法正常結束.

檢視端口狀态存在大量請求處于建立連接配接狀态(ESTABLISHED):

netstat -anpt

tcp 0 0 172.17.7.81:56408 180.118.128.179:56033 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:57644 115.202.238.177:27016 ESTABLISHED 3847/java

tcp 0 0 172.17.7.81:36616 117.57.21.0:25719 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:59944 112.245.197.118:57220 ESTABLISHED 3847/java

tcp 0 0 172.17.7.81:48722 218.5.215.10:40835 ESTABLISHED 4007/java

tcp 0 0 172.17.7.81:52734 115.194.17.14:45210 ESTABLISHED 4007/java

tcp 0 0 172.17.7.81:60586 59.32.37.129:16637 ESTABLISHED 3686/java

tcp 0 0 172.17.7.81:36776 222.89.86.109:21667 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:51690 60.161.249.162:59039 ESTABLISHED 3927/java

tcp 0 0 172.17.7.81:42226 58.218.200.59:80 TIME_WAIT -

tcp 0 0 172.17.7.81:56566 117.70.47.194:40879 ESTABLISHED 3686/java

tcp 0 0 172.17.7.81:43266 182.120.202.204:45893 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:55630 60.169.223.16:21280 ESTABLISHED 3927/java

tcp 0 0 172.17.7.81:54922 60.168.81.26:25464 ESTABLISHED 3927/java

tcp 0 0 172.17.7.81:53352 112.252.97.83:53584 ESTABLISHED 3847/java

tcp 0 0 172.17.7.81:52684 113.121.242.43:14447 ESTABLISHED 3927/java

tcp 0 0 172.17.7.81:54750 113.121.241.168:45173 ESTABLISHED 3686/java

tcp 0 0 172.17.7.81:41510 113.105.202.106:47288 ESTABLISHED 4007/java

tcp 0 0 172.17.7.81:38804 121.232.148.62:57938 ESTABLISHED 3847/java

tcp 0 0 172.17.7.81:41468 113.105.202.106:47288 ESTABLISHED 3927/java

tcp 0 0 172.17.7.81:45444 123.163.81.185:22012 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:54810 113.121.241.168:45173 ESTABLISHED 4007/java

tcp 0 0 172.17.7.81:51542 175.153.23.147:20766 ESTABLISHED 3927/java

tcp 0 0 172.17.7.81:45644 218.5.215.10:40835 ESTABLISHED 4007/java

tcp 0 0 172.17.7.81:35730 116.53.197.198:30042 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:54738 113.121.241.168:45173 ESTABLISHED 3686/java

tcp 0 0 172.17.7.81:60600 59.32.37.129:16637 ESTABLISHED 3686/java

tcp 0 0 172.17.7.81:54862 113.121.241.168:45173 ESTABLISHED 4007/java

tcp 0 0 172.17.7.81:40980 115.225.153.215:17292 ESTABLISHED 3686/java

tcp 0 0 172.17.7.81:54166 123.149.162.129:18269 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:60712 120.35.190.184:33054 ESTABLISHED 3766/java

tcp 0 0 172.17.7.81:55802 106.42.211.65:59547 ESTABLISHED 3766/java

同時,分析線程堆棧資訊(jstack -F -l pid > thread_stack.log)可以看到如下資訊:

"pool-1-thread-45" #55 prio=5 os_prio=0 tid=0x00007f78702df000 nid=0x33d5 runnable [0x00007f7830c1d000]

java.lang.Thread.State: RUNNABLE

at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139)
at org.chench.extra.HttpClientTest.commnicate(HttpClientTest.java:106) # 線程在這裡阻塞
at org.chench.extra.HttpClientTest.request(HttpClientTest.java:93)
at org.chench.extra.HttpClientTest.access$100(HttpClientTest.java:31)
at org.chench.extra.HttpClientTest$1.run(HttpClientTest.java:62)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
           

Locked ownable synchronizers:

- <0x0000000086d50638> (a java.util.concurrent.ThreadPoolExecutor$Worker)           

從堆棧日志中可以看到,線程處于RUNNABLE狀态,并且阻塞在at org.chench.extra.HttpClientTest.commnicate(HttpClientTest.java:106)處.

解決方案

通過線程堆棧日志分析可以定位到線程夯住是因為HttpClient在執行通路時被阻塞了,結合源代碼找到阻塞原因是因為未設定請求逾時時間.

上述問題本質上是因為HttpClient元件并未設定請求逾時控制導緻的:雖然連接配接逾時,但是讀取失敗,導緻線程一直被阻塞.

那麼,應該如何設定HttpClient的逾時時間呢?鑒于HttpClient的官方文檔沒有明确說明,并且不同版本的HttpClient元件設定逾時控制的方式不一緻,是以建議直接檢視源碼.

HttpClient執行通路請求時序圖如下:

HttpClient通路請求時序圖

順藤摸瓜,在MainClientExec.java的execute()方法中看到有2處使用了timeout參數,其含義各不相同:

(1)在擷取HttpClientConnection對象時需要讀取配置參數中的ConnectionRequestTimeout值,該參數用于控制擷取連接配接的逾時時間.

timeout參數

(2)擷取到HttpClientConnection對象之後讀取配置參數中的SocketTimeout值,設定Socket逾時時間.

設定Socket逾時

顯然,這2個timeout參數都需要從RequestConfig對象中擷取.

既然找到了使用timeout參數的地方,下一步需要确定該參數是如何設定的.沿着HttpClient的請求時序圖路徑往回查找,在InternalHttpClient.java類的doExecute()方法中可以很清晰地看到設定了RequestConfig對象參數.

@Override

protected CloseableHttpResponse doExecute(
        final HttpHost target,
        final HttpRequest request,
        final HttpContext context) throws IOException, ClientProtocolException {
    Args.notNull(request, "HTTP request");
    HttpExecutionAware execAware = null;
    if (request instanceof HttpExecutionAware) {
        execAware = (HttpExecutionAware) request;
    }
    try {
        final HttpRequestWrapper wrapper = HttpRequestWrapper.wrap(request, target);
        final HttpClientContext localcontext = HttpClientContext.adapt(
                context != null ? context : new BasicHttpContext());
        // 通過RequestConfig對象配置連接配接參數
        RequestConfig config = null; 
        if (request instanceof Configurable) {
            // 如果在HttpRequest對象中設定了RequestConfig屬性,直接使用
            config = ((Configurable) request).getConfig(); 
        }
        if (config == null) {
            // 如果在HttpRequest對象中未設定RequestConfig對象屬性,則擷取HttpParams屬性構造RequestConfig對象
            final HttpParams params = request.getParams(); 
            if (params instanceof HttpParamsNames) {
                if (!((HttpParamsNames) params).getNames().isEmpty()) {
                    config = HttpClientParamConfig.getRequestConfig(params);
                }
            } else {
                config = HttpClientParamConfig.getRequestConfig(params);
            }
        }
        if (config != null) {
            // 使用RequestConfig對象配置連接配接參數
            localcontext.setRequestConfig(config); 
        }
        setupContext(localcontext);
        final HttpRoute route = determineRoute(target, wrapper, localcontext);
        return this.execChain.execute(route, wrapper, localcontext, execAware);
    } catch (final HttpException httpException) {
        throw new ClientProtocolException(httpException);
    }
}           

(3)HttpClient預設使用的連接配接池為PoolingHttpClientConnectionManager,在建立連接配接時(connect()方法)會使用其中的SocketConfig配置參數對Socket進行配置,如下所示:

PoolingHttpClientConnectionManager.java

public void connect(

final HttpClientConnection managedConn,
    final HttpRoute route,
    final int connectTimeout,
    final HttpContext context) throws IOException {
Args.notNull(managedConn, "Managed Connection");
Args.notNull(route, "HTTP route");
final ManagedHttpClientConnection conn;
synchronized (managedConn) {
    final CPoolEntry entry = CPoolProxy.getPoolEntry(managedConn);
    conn = entry.getConnection();
}
final HttpHost host;
if (route.getProxyHost() != null) {
    host = route.getProxyHost();
} else {
    host = route.getTargetHost();
}
final InetSocketAddress localAddress = route.getLocalSocketAddress();
SocketConfig socketConfig = this.configData.getSocketConfig(host);
if (socketConfig == null) {
    // 使用配置參數SocketConfig
    socketConfig = this.configData.getDefaultSocketConfig();
}
if (socketConfig == null) {
    socketConfig = SocketConfig.DEFAULT;
}
this.connectionOperator.connect(
        conn, host, localAddress, connectTimeout, socketConfig, context);           

DefaultHttpClientConnectionOperator.java

public void connect(
        final ManagedHttpClientConnection conn,
        final HttpHost host,
        final InetSocketAddress localAddress,
        final int connectTimeout,
        final SocketConfig socketConfig,
        final HttpContext context) throws IOException {
    final Lookup<ConnectionSocketFactory> registry = getSocketFactoryRegistry(context);
    final ConnectionSocketFactory sf = registry.lookup(host.getSchemeName());
    if (sf == null) {
        throw new UnsupportedSchemeException(host.getSchemeName() +
                " protocol is not supported");
    }
    final InetAddress[] addresses = host.getAddress() != null ?
            new InetAddress[] { host.getAddress() } : this.dnsResolver.resolve(host.getHostName());
    final int port = this.schemePortResolver.resolve(host);
    for (int i = 0; i < addresses.length; i++) {
        final InetAddress address = addresses[i];
        final boolean last = i == addresses.length - 1;

        Socket sock = sf.createSocket(context);
        // 使用socketConfig參數中的逾時時間對Socket進行配置
        sock.setSoTimeout(socketConfig.getSoTimeout());
        sock.setReuseAddress(socketConfig.isSoReuseAddress());
        sock.setTcpNoDelay(socketConfig.isTcpNoDelay());
        sock.setKeepAlive(socketConfig.isSoKeepAlive());
        if (socketConfig.getRcvBufSize() > 0) {
            sock.setReceiveBufferSize(socketConfig.getRcvBufSize());
        }
        if (socketConfig.getSndBufSize() > 0) {
            sock.setSendBufferSize(socketConfig.getSndBufSize());
        }

        final int linger = socketConfig.getSoLinger();
        if (linger >= 0) {
            sock.setSoLinger(true, linger);
        }
        conn.bind(sock);

        // ...
    }
}           

經過源碼解讀可以很明确地知道,在HttpClient 4.5.2版本中,設定連接配接參數有3種方式:

(1)在HttpRequest對象中設定RequestConfig對象屬性

(2)在HttpRequest對象中設定HttpParams對象屬性.

(3)在連接配接池對象中設定SocketConfig對象屬性

既然找到了根源,下面分别通過這3種方式設定逾時參數進行驗證.

方式1: 通過RequestConfig對象設定逾時參數

int timeOut = 5000;

RequestConfig requestConfig = RequestConfig.custom()

.setConnectionRequestTimeout(timeOut) // 擷取連接配接逾時時間
    .setConnectTimeout(timeOut) // 設定HTTP連接配接逾時時間
    .setSocketTimeout(timeOut) // 設定Socket逾時時間
    .build();           

request.setConfig(requestConfig);

方式2: 通過HttpParams對象設定逾時參數

HttpParams params = new BasicHttpParams();

params.setParameter(CoreConnectionPNames.SO_TIMEOUT, timeOut); // 設定Socket逾時時間

params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, timeOut); // 設定HTTP連接配接逾時時間

request.setParams(params);

方式3: 通過連接配接池對象設定逾時參數

PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager();

// 對連接配接池設定SocketConfig對象

connManager.setDefaultSocketConfig(SocketConfig.custom().setSoTimeout(timeOut).build());

client = HttpClients.custom().setConnectionManager(connManager).build();

通過上述3種方式分别設定逾時參數驗證,雖然在運作過程中會有報錯,但是不會導緻線程被阻塞,程序能正常運作結束:

有報錯但是不會出現線程夯住

解決問題之後用戶端CPU使用率恢複正常:

CPU使用率基本穩定

總結/教訓/反思

之是以會遇到這樣的問題,還是因為對HttpClient元件不熟悉導緻的;另外,在發現問題之後如何快速定位問題,并搜尋穩定的解決方案很重要。

HttpClient元件每個版本的API變化都比較大,在使用時一定要徹底清楚目前使用的版本是如何設定逾時時間的。而如何确定知道逾時時間控制,通過源代碼檢視最為妥當.

在Java平台使用Http用戶端元件,可以有多個選擇:

(1)直接使用JDK提供的URL類通路

(2)使用HttpClient元件,有坑,不同版本設定參數的方式變動較大,最好是閱讀一下目前使用版本的源碼實作,正确設定好逾時時間等參數

(3)如果使用了Spring MVC架構,還可以使用Spring MVC提供的RestTemplate元件,底層是使用Netty實作的用戶端

遇到的這個坑本身并不屬于技術難點,但是面對這個問題的解決思路值得總結:

(1)程式日志,運作日志非常關鍵,是定位問題時第一時間需要檢視的

(2)代碼review,逐行逐行地審查,首先排除可能存在的代碼邏輯問題,比如:死鎖等

(3)通過jstack指令檢視線程堆棧資訊: jstack -l -F > stack.log

(4)通過jmap指令檢視記憶體堆棧資訊: jmap -dump:live format=b,file=heap.bin

(5)如果結合搜尋引擎和上述排查步驟依然未能解決問題,應該第一時間考慮直接閱讀元件的源代碼實作,特别是使用了開源元件時這可能才是真正解決問題的最佳路徑

【參考】

https://blog.csdn.net/u011191463/article/details/78664896

HttpClient逾時設定詳解

https://my.oschina.net/jywm/blog/1834702

解決httpclient逾時設定不生效的問題

https://www.jianshu.com/p/4b3e172c4f2d

HttpClient 4.5.2-(四)連接配接逾時的配置

https://www.jianshu.com/p/6a41c95855e3

HttpClient 4.5.2-(五)連接配接池的配置

https://www.jianshu.com/p/c852cbcf3d68

HttpClient高并發下性能優化-http連接配接池

https://blog.csdn.net/u011402596/article/details/44619443

HttpClient 多線程處理

https://field-notes.iteye.com/blog/2383759

多線程消費使用HttpClient引發的坑

http://blog.51cto.com/lihao001/1788490

httpclient4.3 導緻線程阻塞

https://study121007.iteye.com/blog/2304274

HttpClient4.5.2 連接配接管理

https://alafqq.iteye.com/blog/2325041

httpclient 多線程執行(網上版本太多了。。。誤人子弟)

https://gaozzsoft.iteye.com/blog/2352241

HttpClient 4.5.2版本設定連接配接逾時時間-CloseableHttpClient設定Timeout

https://www.cnblogs.com/softidea/p/6964347.html

HttpClient 專題

https://blog.csdn.net/Fhaohaizi/article/details/78217903

httpclient4.5如何確定資源釋放

https://blog.csdn.net/u010634066/article/details/83120122

一場HttpClient調用未關閉流引發的問題

https://www.cnblogs.com/mumuxinfei/p/5066633.html

Apache HttpClient使用之阻塞陷阱

https://issues.apache.org/jira/browse/HTTPCLIENT-1584

CloseableHttpClient - SSL Handshake has no Socket Timeout

https://monkeyissexy.github.io/2016/11/11/httpclient_ssl_handshake_socketTimeout_bug/

httpclient ssl handshake socketTimeout bug 分析解決過程

http://geekerwang.com/2017/10/22/%E8%AE%B0%E5%BD%95%E4%B8%80%E6%AC%A1%E5%9D%91%EF%BC%9AsocketRead-hang/

記錄一次坑:socketRead hang

https://www.cnblogs.com/jessezeng/p/7448636.html

解決: httpclient ssl 驗證導緻死鎖問題

http://itbang.me/solu/detail/201

JAVA線程卡死問題如何定位?

作者:2Simple

出處:

http://www.cnblogs.com/nuccch/