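Background
While troubleshooting, I ran into a case where an inappropriate use of synchronized delayed the responses of an API endpoint. The problematic pseudocode looked like this:
public synchronized Object businessMethod(Object params){
    Object ret = xxxx;
    Object response = httpClient.execute(params);
    // business logic
    ... ...
    return ret;
}
When the remote HTTP service called by this code responded slowly, every thread calling the method was blocked behind the one holding the lock, and the endpoint eventually timed out. In this scenario synchronized was not needed at all.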
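To make the effect concrete, here is a minimal sketch (not the original service code) that replaces the HTTP call in the pseudocode above with a Thread.sleep and compares a synchronized method with an unsynchronized one. The class name SyncDelayDemo, the method names, the thread count, and the 100 ms delay are all assumptions chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; Thread.sleep stands in for the slow httpClient.execute call.
class SyncDelayDemo {
    // Simulated slow call made while holding the monitor of SyncDelayDemo.class.
    static synchronized void lockedCall(long delayMs) throws InterruptedException {
        Thread.sleep(delayMs);
    }

    // The same simulated call without any lock.
    static void unlockedCall(long delayMs) throws InterruptedException {
        Thread.sleep(delayMs);
    }

    // Runs `threads` concurrent callers and returns the elapsed wall-clock time in milliseconds.
    static long run(boolean useLock, int threads, long delayMs) {
        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                try {
                    if (useLock) {
                        lockedCall(delayMs);
                    } else {
                        unlockedCall(delayMs);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("synchronized:   " + run(true, 4, 100) + " ms");
        System.out.println("unsynchronized: " + run(false, 4, 100) + " ms");
    }
}
```

With the lock, the four simulated calls run one after another, so the total time grows with the number of callers. Without it, they overlap and the total stays close to a single call's delay.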
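This raises a broader question: how can we detect thread blocking caused by synchronized or by java.util.concurrent.locks.Lock?
Analysis
Starting from the object
One approach is to start from the object. The monitor attached to an object exposes the following information:
- the thread that holds the object's lock
- the reentry count of that thread, i.e. how many times it has entered the monitor
- the threads contending for the object's lock
- the threads that have called wait and are waiting for notify
JVMTI provides the following interface for retrieving this information: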
typedef struct {
    jthread owner;
    jint entry_count;
    jint waiter_count;
    jthread* waiters;
    jint notify_waiter_count;
    jthread* notify_waiters;
} jvmtiMonitorUsage;
// Get information about the object's monitor.
// The fields of the jvmtiMonitorUsage structure are filled in with information about usage of the monitor.
jvmtiError GetObjectMonitorUsage(jvmtiEnv* env, jobject object,
                                 jvmtiMonitorUsage* info_ptr)
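However, this approach offers no obvious starting point, because we would first need to know which object to inspect.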
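Starting from the thread
If several threads are contending for one lock, the thread that owns that lock is the one blocking all the others. Printing the stack trace of the owning thread is then enough to analyze the blocking problem.
So the two pieces of per-thread information we care about most are:
- the locks the thread already owns
- the lock the thread is currently contending for
JVMTI provides an interface for each of them: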
Get Owned Monitor Info
jvmtiError
GetOwnedMonitorInfo(jvmtiEnv* env,
                    jthread thread,
                    jint* owned_monitor_count_ptr,
                    jobject** owned_monitors_ptr)
Get Current Contended Monitor
jvmtiError
GetCurrentContendedMonitor(jvmtiEnv* env,
                           jthread thread,
                           jobject* monitor_ptr)
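The two JVMTI calls above are native interfaces. From plain Java, comparable per-thread information is available through java.lang.management: ThreadInfo.getLockedMonitors() lists the monitors a thread owns, and ThreadInfo.getLockInfo() together with getLockOwnerName() describes the lock it is waiting for. The following is a minimal sketch of that idea, not a JVMTI agent. The class name ThreadLockInfoDemo and the thread names demo-holder and demo-waiter are assumptions made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: one thread owns a monitor, a second thread contends for it.
class ThreadLockInfoDemo {
    private static final Object LOCK = new Object();

    // Creates the contention, inspects both threads, and returns a one-line summary.
    static String describeContention() {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (LOCK) {
                held.countDown();
                awaitQuietly(release);
            }
        }, "demo-holder");
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                // Entered only after demo-holder has left its synchronized block.
            }
        }, "demo-waiter");
        holder.setDaemon(true);
        waiter.setDaemon(true);
        try {
            holder.start();
            awaitQuietly(held);
            waiter.start();
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            // Poll until the JVM reports demo-waiter as blocked on a lock owned by demo-holder.
            ThreadInfo waiterInfo;
            do {
                Thread.yield();
                waiterInfo = mx.getThreadInfo(waiter.getId());
            } while (waiterInfo == null || waiterInfo.getLockOwnerId() != holder.getId());
            // Locked monitors are requested only when the JVM supports reporting them.
            ThreadInfo holderInfo = mx.getThreadInfo(
                    new long[] {holder.getId()}, mx.isObjectMonitorUsageSupported(), false)[0];
            return waiterInfo.getThreadName() + " waits for " + waiterInfo.getLockInfo()
                    + " held by " + waiterInfo.getLockOwnerName() + "; "
                    + holderInfo.getThreadName() + " owns "
                    + holderInfo.getLockedMonitors().length + " monitor(s)";
        } finally {
            release.countDown(); // let both threads finish
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeContention());
    }
}
```

The summary names the contended lock by class name and identity hash code, which is the same key the arthas code uses to correlate waiting threads with lock owners.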
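Putting this together, the overall approach is:
- get information about all threads
- find the most contended lock object
- map each lock object to the thread that owns it
- report the thread that owns the most contended lock object
arthas already implements this, so let's look at how it does it.
arthas: thread -b
The core logic behind the arthas thread -b command is shown below: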
public static BlockingLockInfo findMostBlockingLock() {
    // Get information about all threads
    ThreadInfo[] infos = threadMXBean.dumpAllThreads(threadMXBean.isObjectMonitorUsageSupported(),
            threadMXBean.isSynchronizerUsageSupported());
    // a map of <LockInfo.getIdentityHashCode, number of threads blocking on this>
    Map<Integer, Integer> blockCountPerLock = new HashMap<Integer, Integer>();
    // a map of <LockInfo.getIdentityHashCode, the ThreadInfo of the thread holding this lock>
    Map<Integer, ThreadInfo> ownerThreadPerLock = new HashMap<Integer, ThreadInfo>();
    // Iterate over the threads to collect:
    // 1. each contended lock object and the number of threads contending for it
    // 2. each lock object that is already held and the thread that owns it
    for (ThreadInfo info: infos) {
        if (info == null) {
            continue;
        }
        LockInfo lockInfo = info.getLockInfo();
        if (lockInfo != null) {
            // the current thread is blocked waiting on some condition
            if (blockCountPerLock.get(lockInfo.getIdentityHashCode()) == null) {
                blockCountPerLock.put(lockInfo.getIdentityHashCode(), 0);
            }
            int blockedCount = blockCountPerLock.get(lockInfo.getIdentityHashCode());
            blockCountPerLock.put(lockInfo.getIdentityHashCode(), blockedCount + 1);
        }
        for (MonitorInfo monitorInfo: info.getLockedMonitors()) {
            // the object monitor currently held by this thread
            if (ownerThreadPerLock.get(monitorInfo.getIdentityHashCode()) == null) {
                ownerThreadPerLock.put(monitorInfo.getIdentityHashCode(), info);
            }
        }
        for (LockInfo lockedSync: info.getLockedSynchronizers()) {
            // the ownable synchronizer currently held by this thread
            if (ownerThreadPerLock.get(lockedSync.getIdentityHashCode()) == null) {
                ownerThreadPerLock.put(lockedSync.getIdentityHashCode(), info);
            }
        }
    }
    // find the thread that is holding the lock that is blocking the largest number of threads,
    // i.e. the owner of the most contended lock object
    int mostBlockingLock = 0; // System.identityHashCode(null) == 0
    int maxBlockingCount = 0;
    for (Map.Entry<Integer, Integer> entry: blockCountPerLock.entrySet()) {
        if (entry.getValue() > maxBlockingCount && ownerThreadPerLock.get(entry.getKey()) != null) {
            // the lock is explicitly held by another thread.
            maxBlockingCount = entry.getValue();
            mostBlockingLock = entry.getKey();
        }
    }
    if (mostBlockingLock == 0) {
        // nothing found
        return EMPTY_INFO;
    }
    BlockingLockInfo blockingLockInfo = new BlockingLockInfo();
    blockingLockInfo.setThreadInfo(ownerThreadPerLock.get(mostBlockingLock));
    blockingLockInfo.setLockIdentityHashCode(mostBlockingLock);
    blockingLockInfo.setBlockingThreadCount(blockCountPerLock.get(mostBlockingLock));
    return blockingLockInfo;
}
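Testing thread -b
The code below simulates blocking caused by synchronized and by Lock:
- the synchronized variant starts 5 threads; only one acquires the lock and the other 4 wait for it;
- the Lock variant also starts 5 threads; only one acquires the lock and the other 4 wait for it;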
import java.lang.management.ManagementFactory;
import java.util.concurrent.locks.ReentrantLock;
public class Main {
    private static final ReentrantLock REENTRANT_LOCK = new ReentrantLock();
    public static void main(String[] args) {
        System.out.println(ManagementFactory.getRuntimeMXBean().getName());
        int num = 5;
        for (int i = 0; i < num; i++) {
            Thread synchronizedT = new Thread(() -> {
                synchronized (Main.class) {
                    sleep();
                }
            });
            synchronizedT.setName("synchronizedT-" + i);
            synchronizedT.start();
            Thread lockT = new Thread(() -> {
                REENTRANT_LOCK.lock();
                try {
                    sleep();
                } finally {
                    REENTRANT_LOCK.unlock();
                }
            });
            lockT.setName("lockT-" + i);
            lockT.start();
        }
    }
    public static void sleep() {
        try {
            Thread.sleep(Long.MAX_VALUE);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
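We expected thread -b to show two threads: the synchronized thread that acquired the monitor and the Lock thread that acquired the lock. In practice, thread -b returned only one thread (the arthas code above leads to the same conclusion, because it keeps only a single lock with the strictly largest blocked count), as shown in the figure:
The result does not match the expectation. Does this count as a small bug in arthas thread -b for this scenario?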
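If reporting every lock holder is the desired behaviour, one possible variant (not part of arthas) is to group waiting threads by the thread that owns the lock they wait for, instead of keeping a single maximum. The sketch below uses a different and simpler technique than the arthas code: it relies on ThreadInfo.getLockOwnerName() rather than on identity-hash maps, and it assumes the JVM reports owners for java.util.concurrent locks as well as for monitors. The class AllBlockingOwners, its methods, and the thread names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper, not part of arthas.
class AllBlockingOwners {
    // Maps each lock-owning thread name to the number of threads waiting for a lock it holds.
    static Map<String, Integer> find() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> waitersPerOwner = new TreeMap<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getLockOwnerName() != null) {
                waitersPerOwner.merge(info.getLockOwnerName(), 1, Integer::sum);
            }
        }
        return waitersPerOwner;
    }

    // Builds a smaller version of the test scenario (2 waiters per lock) and inspects it.
    static Map<String, Integer> demo() {
        Object monitor = new Object();
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch ownersReady = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        Runnable syncTask = () -> {
            synchronized (monitor) {
                ownersReady.countDown();
                await(release);
            }
        };
        Runnable lockTask = () -> {
            lock.lock();
            try {
                ownersReady.countDown();
                await(release);
            } finally {
                lock.unlock();
            }
        };
        try {
            start(syncTask, "sync-owner");
            start(lockTask, "lock-owner");
            await(ownersReady); // both owners now hold their locks
            for (int i = 0; i < 2; i++) {
                start(syncTask, "sync-waiter-" + i);
                start(lockTask, "lock-waiter-" + i);
            }
            // Poll (for at most 5 seconds) until all four waiters are reported as waiting.
            long deadline = System.nanoTime() + 5_000_000_000L;
            Map<String, Integer> result = find();
            while ((result.getOrDefault("sync-owner", 0) < 2
                    || result.getOrDefault("lock-owner", 0) < 2)
                    && System.nanoTime() - deadline < 0) {
                Thread.yield();
                result = find();
            }
            return result;
        } finally {
            release.countDown(); // let every thread finish
        }
    }

    private static void start(Runnable task, String name) {
        Thread t = new Thread(task, name);
        t.setDaemon(true);
        t.start();
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Grouping by owner name means a thread that holds two contended locks gets one combined count. Whether that or a per-lock report is preferable depends on how the output is meant to be read.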
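Another point is that the help text of arthas thread -b does not match the test result:
Summary
- The synchronized keyword and java.util.concurrent.locks.Lock are used to synchronize threads;
- JVMTI provides GetObjectMonitorUsage to inspect an object's monitor, GetOwnedMonitorInfo to list the monitors a thread already owns, and GetCurrentContendedMonitor to get the monitor a thread is contending for;
- ManagementFactory.getThreadMXBean() supports dumpAllThreads, and each returned ThreadInfo exposes getLockInfo, getLockedMonitors, and getLockedSynchronizers.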