laitimes

A few lines of code easily reproduce the druid connection leaked bug PhyTimeout

author:Dayu's footprints

Background

Using arthas to analyze the druid connection pool bug step by step introduces the analysis and troubleshooting process of connection leakage, but the scenario that causes connection leakage is not described in too detail, and the following reproduces the case of connection leakage through code.

Reproduce the process

Connection leakage scenario

This simulates a scenario where the connection leaks with the phyTimeoutMillis parameter set.

The connection leaks simulated code

import com.alibaba.druid.pool.DruidConnectionHolder;
import com.alibaba.druid.pool.DruidDataSource;
import com.alibaba.druid.pool.DruidPooledConnection;

import java.util.List;
import java.util.Map;

public class DruidAbandonedCase4PhyTimeout {
    public static void main(String[] args) throws Exception{
        DruidDataSource dataSource =  new DruidDataSource();
        dataSource.setDriverClassName("com.mysql.cj.jdbc.Driver");
        dataSource.setUsername("root");
        dataSource.setPassword("123456");
        dataSource.setUrl("jdbc:mysql://127.0.0.1:3306/test?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8&useSSL=false");

        dataSource.setMinIdle(2);
        // 保证不会触发DestroyTask,以便手动调用shrink
        dataSource.setTimeBetweenEvictionRunsMillis(1 * 60 * 60 * 1000);
        // 物理连接超时时长(超过该阈值会close掉物理连接)
        dataSource.setPhyTimeoutMillis(1000);
        dataSource.setMinEvictableIdleTimeMillis(500);
        dataSource.setKeepAliveBetweenTimeMillis(501);

        dataSource.init();

        DruidPooledConnection connection1 = dataSource.getConnection();
        DruidConnectionHolder holder1 = connection1.getConnectionHolder();
        System.out.println(holder1);
        try{
            //假装该连接使用了500ms,要小于phyTimeoutMillis,以免归还连接的时候被discard了
            Thread.sleep(500L);
        }catch (Exception e){
            e.printStackTrace();
        }

        DruidPooledConnection connection2 = dataSource.getConnection();
        DruidConnectionHolder holder2 = connection2.getConnectionHolder();
        System.out.println(holder2);
        connection2.close();

        connection1.close();

        System.out.println("调用shrink【之前】,连接池中连接情况,begin");
        List<Map<String, Object>> conns = dataSource.getPoolingConnectionInfo();
        for(Map<String,Object> conn : conns){
            System.out.println(conn.get("id"));
        }
        System.out.println("调用shrink【之前】,连接池中连接情况,end");

        try{
            // 让连接1的物理连接时间超过phyTimeoutMillis,以观察连接1是否会被连接池释放掉
            Thread.sleep(501L);
        }catch (Exception e){
            e.printStackTrace();
        }

        System.out.println();
        System.out.println("物理连接时间,begin");
        long ctm = System.currentTimeMillis();
        System.out.println(holder1 + " phyConnectTimeMillis:" + (ctm - holder1.getConnectTimeMillis()) + " ,idleMillis:" + (ctm - holder1.getLastActiveTimeMillis()));
        System.out.println(holder2 + " phyConnectTimeMillis:" + (ctm - holder2.getConnectTimeMillis()) + " ,idleMillis:" + (ctm - holder2.getLastActiveTimeMillis()));
        System.out.println("物理连接时间,end");

        System.out.println();
        // 手动执行shrink
        dataSource.shrink(true);
        System.out.println("调用shrink【之后】,连接池中连接情况,begin");
        conns = dataSource.getPoolingConnectionInfo();
        for(Map<String,Object> conn : conns){
            System.out.println(conn.get("id"));
        }
        System.out.println("调用shrink【之后】,连接池中连接情况,end");

        System.out.println();

        System.out.println(holder1 + " isClosed:" + holder1.getConnection().isClosed());
        System.out.println(holder2 + " isClosed:" + holder2.getConnection().isClosed());
    }
}           

Run the above program, and the output is as follows:

{ID:26004719, ConnectTime:"2023-10-10 12:40:04", UseCount:1, LastActiveTime:"2023-10-10 12:40:04"}
{ID:31314834, ConnectTime:"2023-10-10 12:40:05", UseCount:1, LastActiveTime:"2023-10-10 12:40:05"}
调用shrink【之前】,连接池中连接情况,begin
31314834
26004719
调用shrink【之前】,连接池中连接情况,end

物理连接时间,begin
{ID:26004719, ConnectTime:"2023-10-10 12:40:04", UseCount:1, LastActiveTime:"2023-10-10 12:40:05"} phyConnectTimeMillis:1011 ,idleMillis:501
{ID:31314834, ConnectTime:"2023-10-10 12:40:05", UseCount:1, LastActiveTime:"2023-10-10 12:40:05"} phyConnectTimeMillis:502 ,idleMillis:501
物理连接时间,end

调用shrink【之后】,连接池中连接情况,begin
26004719
调用shrink【之后】,连接池中连接情况,end

{ID:26004719, ConnectTime:"2023-10-10 12:40:04", UseCount:1, LastActiveTime:"2023-10-10 12:40:05"} isClosed:true
{ID:31314834, ConnectTime:"2023-10-10 12:40:05", UseCount:1, LastActiveTime:"2023-10-10 12:40:05"} isClosed:false           

Expected execution results:

Connections with 26004719 ID should be removed from the connection pool, and connections with 31314834 ID should be managed by the connection pool

Actual execution result:

The ID is 26004719 connection is not removed from the connection pool; The connection with the ID is 31314834 but is deleted from the connection pool, and the connection is not closed, that is, the connection with the ID 31314834 is neither managed by the connection pool nor closed, which is a leaked connection.

The druid 1.1.11, 1.2.0, 1.2.8 ~ 1.2.17, 1.2.18~1.2.20 were tested, of which [1.1.11, 1.2.0, 1.2.8 ~ 1.2.17] all have the above connection leakage problems, and 1.2.18~1.2.20 does not have the above connection leakage problems.