Background
In distributed systems, distributed transactions are a problem that must be solved, and eventual-consistency solutions are currently the most widely used. This article analyzes how Seata's source code implements this, step by step. When reading source code we should start with a bird's-eye view of the whole rather than getting bogged down in every single detail; that makes learning faster and more efficient. What we need to take away from source code is the overall approach and the key points.
I. Introduction to Distributed Transactions
There are many implementations of distributed transactions, but they broadly fall into two categories. In one, you do not need to care about the interaction between branch transactions and the global transaction. In the other, the logic is split into three parts (prepare, commit, rollback) and each branch transaction joins the global transaction. In Seata the former is called AT mode and the latter MT mode.
II. Content
1. Database-level distributed transactions
The MySQL XA scheme is one example; it operates directly at the database level.
Here the RM executes local transaction commit and rollback, while the TM is the core manager of the distributed transaction.
As for drawbacks: first, it is not well suited to microservices; second, because each operation is held open without committing, uncommitted data accumulates and performance degrades. By comparison, an application-layer solution such as Seata performs much better.
2. Seata Distributed Transactions in Detail
Seata is an application-layer distributed transaction solution.
It only relies on the transaction capability of each individual database.
Seata involves three roles:
1> TC (Transaction Coordinator): coordinates and drives commit and rollback of the global transaction
2> TM (Transaction Manager): controls the boundary of the global transaction, starts it, and decides whether it commits or rolls back
3> RM (Resource Manager): manages branch transactions, registers branches, receives commands from the TC, and drives branch commit and rollback
After digesting this I drew a rough diagram of my own, shown below:
3. XA vs Seata AT
AT does not require the XA protocol and fits microservices well. Under XA, locks on transactional resources are held until the second phase completes. AT distinguishes local locks from global locks: local locks are managed by the local transaction, global locks by the global transaction, and as soon as the second phase decides on a global commit, the global lock can be released.
Related concepts
- XID: the unique identifier of a global transaction, composed of ip:port:sequence
- Transaction Coordinator (TC): maintains the running state of global transactions and coordinates and drives their commit or rollback.
- Transaction Manager (TM): controls the boundary of a global transaction, starts it, and ultimately issues the decision to commit or roll back globally.
- Resource Manager (RM): manages branch transactions, handles branch registration and status reporting, receives commands from the TC, and drives commit and rollback of branch (local) transactions.
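To make the XID format above concrete, here is a tiny helper (hypothetical, not a Seata API) that splits an XID into its ip, port, and sequence parts, using an XID taken from the logs later in this article:

```java
// Hypothetical helper illustrating the ip:port:sequence XID layout; not part of Seata.
public class XidParser {
    public static String ip(String xid) {
        return xid.substring(0, xid.indexOf(':'));
    }
    public static int port(String xid) {
        return Integer.parseInt(xid.split(":")[1]);
    }
    public static long sequence(String xid) {
        return Long.parseLong(xid.split(":")[2]);
    }
}
```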
How it works
Seata involves interaction among these three roles. This section uses a flow diagram to walk through the basic interaction flow in AT mode, as groundwork for the analysis that follows.
Assume three microservices A, B, and C, where service A calls services B and C. The interaction among TM, TC, and RM is shown in the diagram below:
- 1. When service A starts, GlobalTransactionScanner applies AOP enhancement to methods annotated with @GlobalTransactional and generates proxies; the enhancement code lives in GlobalTransactionalInterceptor. When an annotated method is called, the enhanced code first registers a global transaction with the TC, marking the start of the global transaction; the TC generates an XID and returns it to the TM;
- 2. When service A calls service B, it passes the XID along;
- 3. After receiving the XID, service B contacts the TC to register a branch transaction and obtains a branch transaction ID from the TC; the TC associates the branch with the global transaction via the XID;
- 4. Service B then executes its SQL. Before executing it snapshots the affected rows, and after executing it snapshots them again; both snapshots are written to the database as rollback records. If no exception occurs, service B commits its local transaction, reports branch success to the TC, and clears its local transaction data;
- 5. After calling service B, service A calls service C;
- 6. The interaction between service C and the TC is exactly the same as for service B;
- 7. If both B and C succeed, service A notifies the TC via the TM that the global transaction succeeded; if something failed, service A notifies the TC of the failure instead;
- 8. The TC tracks every branch transaction under the global transaction. When it receives the global result, if it is a success it notifies the RMs, which then clean up the rollback records they saved in the database earlier; if it is a failure, each RM looks up its saved rollback records and reverses the earlier SQL operations.
Because TM, RM, and TC communicate over the network, disconnections are easy to hit. The TC therefore runs four scheduled thread pools that periodically detect timed-out transactions, asynchronously committing transactions, rollback-retry transactions, and commit-retry transactions. When any of these are found, the TC fetches all branches of the global transaction and drives each one through the appropriate operation, thereby ensuring transaction consistency.
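The idea of those four periodic checks can be sketched with a plain ScheduledThreadPoolExecutor. This is only an illustration of the scheduling pattern; the task names, bodies, and periods here are made up, not taken from Seata's coordinator:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch only: four periodic housekeeping tasks mirroring the TC's timeout check,
// async commit, rollback retry, and commit retry. Periods are illustrative.
public class TcTimers {
    private final ScheduledExecutorService pool = Executors.newScheduledThreadPool(4);

    public void start(Runnable timeoutCheck, Runnable asyncCommitting,
                      Runnable retryRollbacking, Runnable retryCommitting) {
        pool.scheduleAtFixedRate(timeoutCheck,     0, 1000, TimeUnit.MILLISECONDS);
        pool.scheduleAtFixedRate(asyncCommitting,  0, 1000, TimeUnit.MILLISECONDS);
        pool.scheduleAtFixedRate(retryRollbacking, 0, 1000, TimeUnit.MILLISECONDS);
        pool.scheduleAtFixedRate(retryCommitting,  0, 1000, TimeUnit.MILLISECONDS);
    }

    public void stop() {
        pool.shutdownNow();
    }
}
```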
Things to consider:
From the flow above we can see that every SQL operation (except queries) adds three extra database operations; starting each global transaction and each branch transaction involves TM/RM round-trips to the TC; and during the global transaction the data may be briefly inconsistent. All of these are things to weigh when using AT mode.
Project dependencies
Seata uses an XID to identify a distributed transaction. The XID must be propagated across every system involved in a distributed request, so that each can report its branch status to fescar-server and receive commit/rollback commands from it. Using Seata in a distributed system therefore requires solving XID propagation. Seata currently supports all versions of Dubbo, and for Spring Cloud projects the community also provides a corresponding implementation:
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-alibaba-fescar</artifactId>
</dependency>
This component implements XID propagation for RestTemplate- and Feign-based communication; for details see the in-depth integration analysis: Fescar x Spring Cloud.
spring-cloud-alibaba-fescar already includes the fescar-spring dependency, so it does not need to be added separately; see the full pom.xml.
Business logic
The business logic is the classic place-order / deduct-balance / reduce-stock flow, split by module into three independent services, each connected to its own database:
- Order: order-server
- Account: account-server
- Stock: storage-server
Plus the business system that initiates the distributed transaction:
- Business: business-server
The project structure is shown below:
Configuration files
Seata's configuration entry point is registry.conf. From the ConfigurationFactory code we can see that this file name cannot currently be overridden, so it must be registry.conf:
private static final String REGISTRY_CONF = "registry.conf";
public static final Configuration FILE_INSTANCE = new FileConfiguration(REGISTRY_CONF);
registry.conf specifies the concrete configuration form; here we use the default file form. file.conf contains three configuration sections:
1. transport
The transport section maps to the NettyServerConfig class and defines Netty-related parameters; client and server communicate via Netty.
2. service
service {
  #vgroup->rgroup
  vgroup_mapping.my_test_tx_group = "default"
  #address the client uses to connect to the TC
  default.grouplist = "127.0.0.1:8091"
  #degrade current not support
  enableDegrade = false
  #whether to enable seata distributed transactions
  disableGlobalTransaction = false
}
// excerpt
public class GlobalTransactionScanner {
    private final boolean disableGlobalTransaction =
        ConfigurationFactory.getInstance().getBoolean("service.disableGlobalTransaction", false);

    public void afterPropertiesSet() {
        if (disableGlobalTransaction) {
            if (LOGGER.isInfoEnabled()) {
                LOGGER.info("Global transaction is disabled.");
            }
            return;
        }
        initClient();
    }
}
3. client
client {
  #upper bound of the buffer after the RM receives the TC's commit notification
  async.commit.buffer.limit = 10000
  lock {
    retry.internal = 10
    retry.times = 30
  }
}
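Note that nested blocks in file.conf are addressed with dotted keys, as in the "service.disableGlobalTransaction" lookup shown in GlobalTransactionScanner above. A toy flattener (illustrative only; Seata uses its own config parser) shows how nesting maps to dotted keys:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy flattener: turns a nested block structure into dotted keys such as
// "client.lock.retry.times". Illustrative only, not Seata's parser.
public class ConfFlattener {
    public static Map<String, String> flatten(String prefix, Map<String, Object> node) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                out.putAll(flatten(key, child));   // recurse into nested block
            } else {
                out.put(key, String.valueOf(e.getValue()));
            }
        }
        return out;
    }
}
```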
Starting the server
Download the latest Fescar Server release from https://github.com/seata/seata/releases
From the unpacked bin directory, run:
./fescar-server.sh 8091 ../data
On success it prints:
2019-04-09 20:27:24.637 INFO [main]c.a.fescar.core.rpc.netty.AbstractRpcRemotingServer.start:152 -Server started ...
Starting the client
For a Spring Boot project, just run the main method of xxxApplication. Seata's loading entry point is GlobalTransactionAutoConfiguration:
@Configuration
@EnableConfigurationProperties({FescarProperties.class})
public class GlobalTransactionAutoConfiguration {
    private final ApplicationContext applicationContext;
    private final FescarProperties fescarProperties;

    public GlobalTransactionAutoConfiguration(ApplicationContext applicationContext, FescarProperties fescarProperties) {
        this.applicationContext = applicationContext;
        this.fescarProperties = fescarProperties;
    }

    @Bean
    public GlobalTransactionScanner globalTransactionScanner() {
        String applicationName = this.applicationContext.getEnvironment().getProperty("spring.application.name");
        String txServiceGroup = this.fescarProperties.getTxServiceGroup();
        if (StringUtils.isEmpty(txServiceGroup)) {
            txServiceGroup = applicationName + "-fescar-service-group";
            this.fescarProperties.setTxServiceGroup(txServiceGroup);
        }
        return new GlobalTransactionScanner(applicationName, txServiceGroup);
    }
}
It supports one configuration item via FescarProperties, used to set the transaction group name:
spring.cloud.alibaba.fescar.tx-service-group=my_test_tx_group
If it is not specified, a name is generated as spring.application.name + -fescar-service-group, so startup fails if spring.application.name is not set.
@ConfigurationProperties("spring.cloud.alibaba.fescar")
public class FescarProperties {
    private String txServiceGroup;

    public FescarProperties() {
    }

    public String getTxServiceGroup() {
        return this.txServiceGroup;
    }

    public void setTxServiceGroup(String txServiceGroup) {
        this.txServiceGroup = txServiceGroup;
    }
}
With the applicationId and txServiceGroup in hand, a GlobalTransactionScanner object is created; the key part is its initClient method:
private void initClient() {
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("Initializing Global Transaction Clients ... ");
    }
    if (StringUtils.isNullOrEmpty(applicationId) || StringUtils.isNullOrEmpty(txServiceGroup)) {
        throw new IllegalArgumentException(
            "applicationId: " + applicationId + ", txServiceGroup: " + txServiceGroup);
    }
    //init TM
    TMClient.init(applicationId, txServiceGroup);
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info(
            "Transaction Manager Client is initialized. applicationId[" + applicationId + "] txServiceGroup["
                + txServiceGroup + "]");
    }
    //init RM
    RMClient.init(applicationId, txServiceGroup);
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("Resource Manager is initialized. applicationId[" + applicationId + "] txServiceGroup[" + txServiceGroup + "]");
    }
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("Global Transaction Clients are initialized. ");
    }
}
We can see that both a TMClient and an RMClient are initialized: a single service can play both the TM and RM roles, and which one it acts as in a given global transaction depends on where the @GlobalTransactional annotation sits.
Creating the clients results in Netty connections to the TC, so the startup log shows two Netty channels, with transactionRole marked TMROLE and RMROLE respectively:
NettyPool create channel to {"address":"127.0.0.1:8091","message":{"applicationId":"order-service","byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"transactionServiceGroup":"hello-service-fescar-service-group","typeCode":101,"version":"0.4.0"},"transactionRole":"TMROLE"}
NettyPool create channel to {"address":"127.0.0.1:8091","message":{"applicationId":"order-service","byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"resourceIds":"jdbc:mysql://127.0.0.1:3306/db_order?useSSL=false","transactionServiceGroup":"hello-service-fescar-service-group","typeCode":103,"version":"0.4.0"},"transactionRole":"RMROLE"}
Send:RegisterTMRequest{applicationId='order-service', transactionServiceGroup='hello-service-fescar-service-group'}
Send:RegisterRMRequest{resourceIds='jdbc:mysql://127.0.0.1:3306/db_order?useSSL=false', applicationId='order-service', transactionServiceGroup='hello-service-fescar-service-group'}
Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:2
Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:1
com.alibaba.fescar.core.rpc.netty.RmRpcClient@7904cd7c msgId:2, future :com.alibaba.fescar.core.protocol.MessageFuture@4107849f, body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
com.alibaba.fescar.core.rpc.netty.TmRpcClient@68609034 msgId:1, future :com.alibaba.fescar.core.protocol.MessageFuture@527cc144, body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
register success, cost 28 ms, version:0.4.1,role:TMROLE,channel:[id: 0xf45059d4, L:/127.0.0.1:63533 - R:/127.0.0.1:8091]
register RM success. server version:0.4.1,channel:[id: 0xb7674b6a, L:/127.0.0.1:63534 - R:/127.0.0.1:8091]
register success, cost 37 ms, version:0.4.1,role:RMROLE,channel:[id: 0xb7674b6a, L:/127.0.0.1:63534 - R:/127.0.0.1:8091]
The log shows that after the connections were established, registration requests were sent and responses received; RmRpcClient and TmRpcClient were instantiated successfully.
TM processing flow
In this example, business-service plays the TM role, because BusinessService's purchase method is annotated with @GlobalTransactional:
@Service
public class BusinessService {
    @Autowired
    private StorageFeignClient storageFeignClient;
    @Autowired
    private OrderFeignClient orderFeignClient;

    @GlobalTransactional
    public void purchase(String userId, String commodityCode, int orderCount) {
        storageFeignClient.deduct(commodityCode, orderCount);
        orderFeignClient.create(userId, commodityCode, orderCount);
    }
}
Issue a GET request to 127.0.0.1:8084/purchase?userId=1001&commodityCode=2001&orderCount=1 and see what happens.
Starting the global transaction
The first thing to look at is the effect of the @GlobalTransactional annotation: it is intercepted and handled in GlobalTransactionalInterceptor:
// excerpt
public class GlobalTransactionalInterceptor implements MethodInterceptor {
    @Override
    public Object invoke(final MethodInvocation methodInvocation) throws Throwable {
        Class<?> targetClass = (methodInvocation.getThis() != null ? AopUtils.getTargetClass(methodInvocation.getThis()) : null);
        Method specificMethod = ClassUtils.getMostSpecificMethod(methodInvocation.getMethod(), targetClass);
        final Method method = BridgeMethodResolver.findBridgedMethod(specificMethod);
        // get the method's GlobalTransactional annotation
        final GlobalTransactional globalTransactionalAnnotation = getAnnotation(method, GlobalTransactional.class);
        final GlobalLock globalLockAnnotation = getAnnotation(method, GlobalLock.class);
        // if the method has the GlobalTransactional annotation, handle it accordingly
        if (globalTransactionalAnnotation != null) {
            return handleGlobalTransaction(methodInvocation, globalTransactionalAnnotation);
        } else if (globalLockAnnotation != null) {
            return handleGlobalLock(methodInvocation);
        } else {
            return methodInvocation.proceed();
        }
    }

    // delegates to TransactionalTemplate
    private Object handleGlobalTransaction(final MethodInvocation methodInvocation,
                                           final GlobalTransactional globalTrxAnno) throws Throwable {
        try {
            return transactionalTemplate.execute(new TransactionalExecutor() {
                @Override
                public Object execute() throws Throwable {
                    return methodInvocation.proceed();
                }

                @Override
                public int timeout() {
                    return globalTrxAnno.timeoutMills();
                }

                @Override
                public String name() {
                    String name = globalTrxAnno.name();
                    if (!StringUtils.isNullOrEmpty(name)) {
                        return name;
                    }
                    return formatMethod(methodInvocation.getMethod());
                }
            });
        } catch (TransactionalExecutor.ExecutionException e) {
            TransactionalExecutor.Code code = e.getCode();
            switch (code) {
                case RollbackDone:
                    throw e.getOriginalException();
                case BeginFailure:
                    failureHandler.onBeginFailure(e.getTransaction(), e.getCause());
                    throw e.getCause();
                case CommitFailure:
                    failureHandler.onCommitFailure(e.getTransaction(), e.getCause());
                    throw e.getCause();
                case RollbackFailure:
                    failureHandler.onRollbackFailure(e.getTransaction(), e.getCause());
                    throw e.getCause();
                default:
                    throw new ShouldNeverHappenException("Unknown TransactionalExecutor.Code: " + code);
            }
        }
    }
}
TransactionalTemplate defines the TM's standard steps for handling a global transaction; the comments make them fairly clear:
public class TransactionalTemplate {
    public Object execute(TransactionalExecutor business) throws TransactionalExecutor.ExecutionException {
        // 1. get or create a transaction
        GlobalTransaction tx = GlobalTransactionContext.getCurrentOrCreate();
        try {
            // 2. begin transaction
            try {
                triggerBeforeBegin();
                tx.begin(business.timeout(), business.name());
                triggerAfterBegin();
            } catch (TransactionException txe) {
                throw new TransactionalExecutor.ExecutionException(tx, txe,
                    TransactionalExecutor.Code.BeginFailure);
            }
            Object rs = null;
            try {
                // Do Your Business
                rs = business.execute();
            } catch (Throwable ex) {
                // 3. any business exception, rollback.
                try {
                    triggerBeforeRollback();
                    tx.rollback();
                    triggerAfterRollback();
                    // 3.1 Successfully rolled back
                    throw new TransactionalExecutor.ExecutionException(tx, TransactionalExecutor.Code.RollbackDone, ex);
                } catch (TransactionException txe) {
                    // 3.2 Failed to rollback
                    throw new TransactionalExecutor.ExecutionException(tx, txe,
                        TransactionalExecutor.Code.RollbackFailure, ex);
                }
            }
            // 4. everything is fine, commit.
            try {
                triggerBeforeCommit();
                tx.commit();
                triggerAfterCommit();
            } catch (TransactionException txe) {
                // 4.1 Failed to commit
                throw new TransactionalExecutor.ExecutionException(tx, txe,
                    TransactionalExecutor.Code.CommitFailure);
            }
            return rs;
        } finally {
            // 5. clear
            triggerAfterCompletion();
            cleanUp();
        }
    }
}
The begin method of DefaultGlobalTransaction is what opens the global transaction:
@Override
public void begin(int timeout, String name) throws TransactionException {
    // the role check here is crucial:
    // it tells us whether we are the initiator (Launcher) of the global transaction or a participant (Participant).
    // If a downstream method in the distributed transaction also carries the GlobalTransactional annotation,
    // its role is Participant, so it skips the begin below and simply returns.
    // Launcher vs Participant is decided by whether the current context already holds an XID:
    // no XID means Launcher, an existing XID means Participant.
    if (role != GlobalTransactionRole.Launcher) {
        check();
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Ignore Begin(): just involved in global transaction [" + xid + "]");
        }
        return;
    }
    if (xid != null) {
        throw new IllegalStateException();
    }
    if (RootContext.getXID() != null) {
        throw new IllegalStateException();
    }
    // the actual begin: obtain the XID returned by the TC
    xid = transactionManager.begin(null, null, name, timeout);
    status = GlobalStatus.Begin;
    RootContext.bind(xid);
    if (LOGGER.isDebugEnabled()) {
        LOGGER.debug("Begin a NEW global transaction [" + xid + "]");
    }
}
DefaultTransactionManager handles TM-to-TC communication, sending the begin, commit, and rollback commands:
@Override
public String begin(String applicationId, String transactionServiceGroup, String name, int timeout)
    throws TransactionException {
    GlobalBeginRequest request = new GlobalBeginRequest();
    request.setTransactionName(name);
    request.setTimeout(timeout);
    GlobalBeginResponse response = (GlobalBeginResponse)syncCall(request);
    return response.getXid();
}
At this point, with the XID returned by the TC in hand, a global transaction has been opened; the log reflects the flow above:
2019-04-09 13:46:57.417 DEBUG 31326 --- [nio-8084-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting : offer message: timeout=60000,transactionName=purchase(java.lang.String,java.lang.String,int)
2019-04-09 13:46:57.417 DEBUG 31326 --- [geSend_TMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : write message:FescarMergeMessage timeout=60000,transactionName=purchase(java.lang.String,java.lang.String,int), channel:[id: 0xa148545e, L:/127.0.0.1:56120 - R:/127.0.0.1:8091],active?true,writable?true,isopen?true
2019-04-09 13:46:57.418 DEBUG 31326 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:FescarMergeMessage timeout=60000,transactionName=purchase(java.lang.String,java.lang.String,int)
2019-04-09 13:46:57.421 DEBUG 31326 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:MergeResultMessage com.alibaba.fescar.core.protocol.transaction.GlobalBeginResponse@2dc480dc,messageId:1196
2019-04-09 13:46:57.421 DEBUG 31326 --- [nio-8084-exec-1] c.a.fescar.core.context.RootContext : bind 192.168.224.93:8091:2008502699
2019-04-09 13:46:57.421 DEBUG 31326 --- [nio-8084-exec-1] c.a.f.tm.api.DefaultGlobalTransaction : Begin a NEW global transaction [192.168.224.93:8091:2008502699]
Once the global transaction is created, business.execute() runs, i.e. the business code storageFeignClient.deduct(commodityCode, orderCount); this takes us into the RM processing flow.
RM processing flow
@GetMapping(path = "/deduct")
public Boolean deduct(String commodityCode, Integer count) {
    storageService.deduct(commodityCode, count);
    return true;
}

@Transactional
public void deduct(String commodityCode, int count) {
    Storage storage = storageDAO.findByCommodityCode(commodityCode);
    storage.setCount(storage.getCount() - count);
    storageDAO.save(storage);
    if (count == 5) {
        throw new RuntimeException("storage branch exception");
    }
}
The storage endpoint and service method contain no Seata-related code or annotations, so how do they join the global transaction? The answer lies in ConnectionProxy, and this is also why DataSourceProxy must be used: only through DataSourceProxy does Seata get the hook, at local-transaction commit time, to report the RM's processing result to the TC.
Because the business code's transaction commit is proxied by ConnectionProxy, committing the local transaction actually executes ConnectionProxy's commit method:
// excerpt
public class ConnectionProxy extends AbstractConnectionProxy {
    @Override
    public void commit() throws SQLException {
        // if we are inside a global transaction, run the global-transaction commit path;
        // whether we are in a global transaction is simply whether the current context holds an XID
        if (context.inGlobalTransaction()) {
            processGlobalTransactionCommit();
        } else if (context.isGlobalLockRequire()) {
            processLocalCommitWithGlobalLocks();
        } else {
            targetConnection.commit();
        }
    }

    private void processGlobalTransactionCommit() throws SQLException {
        try {
            // first register the RM with the TC and obtain the branchId the TC assigns
            register();
        } catch (TransactionException e) {
            recognizeLockKeyConflictException(e);
        }
        try {
            if (context.hasUndoLog()) {
                // write the undolog
                UndoLogManager.flushUndoLogs(this);
            }
            // commit the local transaction; note that the undolog and the business data
            // are written in the same local transaction
            targetConnection.commit();
        } catch (Throwable ex) {
            // notify the TC that the RM's transaction processing failed
            report(false);
            if (ex instanceof SQLException) {
                throw new SQLException(ex);
            }
        }
        // notify the TC that the RM's transaction processing succeeded
        report(true);
        context.reset();
    }

    // register the RM: build a request and send it to the TC over netty,
    // then store the returned branchId in the context
    private void register() throws TransactionException {
        Long branchId = DefaultResourceManager.get().branchRegister(BranchType.AT, getDataSourceProxy().getResourceId(),
            null, context.getXid(), null, context.buildLockKeys());
        context.setBranchId(branchId);
    }
}
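The interception idea above can be illustrated with a plain-JDK dynamic proxy. This is only a sketch of the pattern, not Seata's actual implementation (the real ConnectionProxy extends AbstractConnectionProxy rather than using java.lang.reflect.Proxy), and the beforeCommit hook stands in for branch registration and undo-log flushing:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;

// Illustrative only: intercept commit() on a JDBC Connection, mimicking how a
// connection proxy can run extra work (register branch, flush undo log) before
// delegating the real commit.
public class CommitInterceptor {
    public static Connection wrap(Connection target, Runnable beforeCommit) {
        InvocationHandler h = (proxy, method, args) -> {
            if ("commit".equals(method.getName())) {
                beforeCommit.run();          // e.g. register branch, flush undo log
            }
            return method.invoke(target, args);  // delegate to the real connection
        };
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(), new Class<?>[]{Connection.class}, h);
    }
}
```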
The log confirms the flow above:
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor : xid in RootContext null xid in RpcContext 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] c.a.fescar.core.context.RootContext : bind 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor : bind 192.168.0.2:8091:2008546211 to RootContext
2019-04-09 21:57:48.386 INFO 38933 --- [nio-8081-exec-1] o.h.h.i.QueryTranslatorFactoryInitiator : HHH000397: Using ASTQueryTranslatorFactory
Hibernate: select storage0_.id as id1_0_, storage0_.commodity_code as commodit2_0_, storage0_.count as count3_0_ from storage_tbl storage0_ where storage0_.commodity_code=?
Hibernate: update storage_tbl set count=? where id=?
2019-04-09 21:57:48.673 INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient : will connect to 192.168.0.2:8091
2019-04-09 21:57:48.673 INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient : RM will register :jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false
2019-04-09 21:57:48.673 INFO 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.NettyPoolableFactory : NettyPool create channel to {"address":"192.168.0.2:8091","message":{"applicationId":"storage-service","byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"resourceIds":"jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false","transactionServiceGroup":"hello-service-fescar-service-group","typeCode":103,"version":"0.4.0"},"transactionRole":"RMROLE"}
2019-04-09 21:57:48.677 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:RegisterRMRequest{resourceIds='jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false', applicationId='storage-service', transactionServiceGroup='hello-service-fescar-service-group'}
2019-04-09 21:57:48.680 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:9
2019-04-09 21:57:48.680 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : com.alibaba.fescar.core.rpc.netty.RmRpcClient@7d61f5d4 msgId:9, future :com.alibaba.fescar.core.protocol.MessageFuture@186cd3e0, body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 21:57:48.680 INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient : register RM success. server version:0.4.1,channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:48.680 INFO 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.NettyPoolableFactory : register success, cost 3 ms, version:0.4.1,role:RMROLE,channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:48.680 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting : offer message: transactionId=2008546211,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,lockKey=storage_tbl:1
2019-04-09 21:57:48.681 DEBUG 38933 --- [geSend_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : write message:FescarMergeMessage transactionId=2008546211,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,lockKey=storage_tbl:1, channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091],active?true,writable?true,isopen?true
2019-04-09 21:57:48.681 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:FescarMergeMessage transactionId=2008546211,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,lockKey=storage_tbl:1
2019-04-09 21:57:48.687 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:MergeResultMessage BranchRegisterResponse: transactionId=2008546211,branchId=2008546212,result code =Success,getMsg =null,messageId:11
2019-04-09 21:57:48.702 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.rm.datasource.undo.UndoLogManager : Flushing UNDO LOG: {"branchId":2008546212,"sqlUndoLogs":[{"afterImage":{"rows":[{"fields":[{"keyType":"PrimaryKey","name":"id","type":4,"value":1},{"keyType":"NULL","name":"count","type":4,"value":993}]}],"tableName":"storage_tbl"},"beforeImage":{"rows":[{"fields":[{"keyType":"PrimaryKey","name":"id","type":4,"value":1},{"keyType":"NULL","name":"count","type":4,"value":994}]}],"tableName":"storage_tbl"},"sqlType":"UPDATE","tableName":"storage_tbl"}],"xid":"192.168.0.2:8091:2008546211"}
2019-04-09 21:57:48.755 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting : offer message: transactionId=2008546211,branchId=2008546212,resourceId=null,status=PhaseOne_Done,applicationData=null
2019-04-09 21:57:48.755 DEBUG 38933 --- [geSend_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : write message:FescarMergeMessage transactionId=2008546211,branchId=2008546212,resourceId=null,status=PhaseOne_Done,applicationData=null, channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091],active?true,writable?true,isopen?true
2019-04-09 21:57:48.756 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:FescarMergeMessage transactionId=2008546211,branchId=2008546212,resourceId=null,status=PhaseOne_Done,applicationData=null
2019-04-09 21:57:48.758 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:MergeResultMessage com.alibaba.fescar.core.protocol.transaction.BranchReportResponse@582a08cf,messageId:13
2019-04-09 21:57:48.799 DEBUG 38933 --- [nio-8081-exec-1] c.a.fescar.core.context.RootContext : unbind 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.799 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor : unbind 192.168.0.2:8091:2008546211 from RootContext
- Receive the XID passed from business-service
- Bind the XID to the current context
- Execute the business SQL
- Create this RM's Netty connection to the TC
- Send the branch transaction's details to the TC
- Receive the branchId returned by the TC
- Write the Undo Log records
- Report this transaction's PhaseOne result to the TC
- Unbind the XID from the current context
Steps 1 and 9 are done in FescarHandlerInterceptor, which is not part of seata itself but part of spring-cloud-alibaba-fescar's support for Feign and REST; it binds and unbinds the XID to and from the context. At this point the RM has finished its PhaseOne work; next let's look at the PhaseTwo processing logic.
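The bind/run/unbind pattern of steps 1 and 9 can be sketched with a ThreadLocal. This is a simplified illustration, not FescarHandlerInterceptor's code (the real interceptor reads the XID from the HTTP request and uses Seata's RootContext):

```java
// Sketch of the XID bind/unbind pattern; the class and method names are illustrative.
public class XidContext {
    private static final ThreadLocal<String> XID = new ThreadLocal<>();

    public static void bind(String xid) { XID.set(xid); }
    public static String get()          { return XID.get(); }
    public static void unbind()         { XID.remove(); }

    // handle a request: bind the propagated XID, run the handler, always unbind
    public static void handle(String incomingXid, Runnable handler) {
        bind(incomingXid);
        try {
            handler.run();
        } finally {
            unbind();   // step 9: never leak the XID to the next request on this thread
        }
    }
}
```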
Transaction commit
Since this request went through the normal flow with no exception, the branch transaction commits normally.
When storage-service started, it established the Netty connection to the TC; once the TC has collected the reports from all RMs, it sends each RM a commit or rollback command:
2019-04-09 21:57:49.813 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:xid=192.168.0.2:8091:2008546211,branchId=2008546212,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,applicationData=null,messageId:1
2019-04-09 21:57:49.813 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : com.alibaba.fescar.core.rpc.netty.RmRpcClient@7d61f5d4 msgId:1, body:xid=192.168.0.2:8091:2008546211,branchId=2008546212,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,applicationData=null
2019-04-09 21:57:49.814 INFO 38933 --- [atch_RMROLE_1_8] c.a.f.core.rpc.netty.RmMessageListener : onMessage:xid=192.168.0.2:8091:2008546211,branchId=2008546212,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,applicationData=null
2019-04-09 21:57:49.816 INFO 38933 --- [atch_RMROLE_1_8] com.alibaba.fescar.rm.AbstractRMHandler : Branch committing: 192.168.0.2:8091:2008546211 2008546212 jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false null
2019-04-09 21:57:49.816 INFO 38933 --- [atch_RMROLE_1_8] com.alibaba.fescar.rm.AbstractRMHandler : Branch commit result: PhaseTwo_Committed
2019-04-09 21:57:49.817 INFO 38933 --- [atch_RMROLE_1_8] c.a.fescar.core.rpc.netty.RmRpcClient : RmRpcClient sendResponse branchStatus=PhaseTwo_Committed,result code =Success,getMsg =null
2019-04-09 21:57:49.817 DEBUG 38933 --- [atch_RMROLE_1_8] c.a.f.c.rpc.netty.AbstractRpcRemoting : send response:branchStatus=PhaseTwo_Committed,result code =Success,getMsg =null,channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:49.817 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:branchStatus=PhaseTwo_Committed,result code =Success,getMsg =null
From the log we can see:
- a commit notification is received for XID=192.168.0.2:8091:2008546211, branchId=2008546212
- the commit action is executed
- the commit result is sent back to the TC, with branchStatus PhaseTwo_Committed
Let's look at the commit in detail. Everything before AbstractRMHandler's doBranchCommit method is the process of receiving, unwrapping, and routing the TC's message:
// extract the key parameters (xid, branchId, etc.) from the notification,
// then call the RM's branchCommit
protected void doBranchCommit(BranchCommitRequest request, BranchCommitResponse response) throws TransactionException {
    String xid = request.getXid();
    long branchId = request.getBranchId();
    String resourceId = request.getResourceId();
    String applicationData = request.getApplicationData();
    LOGGER.info("Branch committing: " + xid + " " + branchId + " " + resourceId + " " + applicationData);
    BranchStatus status = getResourceManager().branchCommit(request.getBranchType(), xid, branchId, resourceId, applicationData);
    response.setBranchStatus(status);
    LOGGER.info("Branch commit result: " + status);
}
最終會将branceCommit的請求調用到AsyncWorker的branchCommit方法。AsyncWorker的處理方式是seata架構的一個關鍵部分,大部分事務都是會正常送出的,是以在PhaseOne階段就已經結束了,這樣就可以将鎖最快的釋放。PhaseTwo階段接收commit的指令後,異步處理即可。将PhaseTwo的時間消耗排除在一次分布式事務之外。
// excerpt
public class AsyncWorker implements ResourceManagerInbound {
    private static final List<Phase2Context> ASYNC_COMMIT_BUFFER = Collections.synchronizedList(
        new ArrayList<Phase2Context>());

    // add the XID that needs committing to the list
    @Override
    public BranchStatus branchCommit(BranchType branchType, String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
        if (ASYNC_COMMIT_BUFFER.size() < ASYNC_COMMIT_BUFFER_LIMIT) {
            ASYNC_COMMIT_BUFFER.add(new Phase2Context(branchType, xid, branchId, resourceId, applicationData));
        } else {
            LOGGER.warn("Async commit buffer is FULL. Rejected branch [" + branchId + "/" + xid + "] will be handled by housekeeping later.");
        }
        return BranchStatus.PhaseTwo_Committed;
    }

    // a scheduled task consumes the pending XIDs in the list
    public synchronized void init() {
        LOGGER.info("Async Commit Buffer Limit: " + ASYNC_COMMIT_BUFFER_LIMIT);
        timerExecutor = new ScheduledThreadPoolExecutor(1,
            new NamedThreadFactory("AsyncWorker", 1, true));
        timerExecutor.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                try {
                    doBranchCommits();
                } catch (Throwable e) {
                    LOGGER.info("Failed at async committing ... " + e.getMessage());
                }
            }
        }, 10, 1000 * 1, TimeUnit.MILLISECONDS);
    }

    private void doBranchCommits() {
        if (ASYNC_COMMIT_BUFFER.size() == 0) {
            return;
        }
        Map<String, List<Phase2Context>> mappedContexts = new HashMap<>();
        Iterator<Phase2Context> iterator = ASYNC_COMMIT_BUFFER.iterator();
        // one run of the scheduled task drains all pending entries from ASYNC_COMMIT_BUFFER,
        // grouping them by resourceId (a database connection url, as seen in the logs earlier);
        // the grouping exists to handle applications with multiple data sources
        while (iterator.hasNext()) {
            Phase2Context commitContext = iterator.next();
            List<Phase2Context> contextsGroupedByResourceId = mappedContexts.get(commitContext.resourceId);
            if (contextsGroupedByResourceId == null) {
                contextsGroupedByResourceId = new ArrayList<>();
                mappedContexts.put(commitContext.resourceId, contextsGroupedByResourceId);
            }
            contextsGroupedByResourceId.add(commitContext);
            iterator.remove();
        }
        for (Map.Entry<String, List<Phase2Context>> entry : mappedContexts.entrySet()) {
            Connection conn = null;
            try {
                try {
                    // look up the data source and connection by resourceId
                    DataSourceProxy dataSourceProxy = DataSourceManager.get().get(entry.getKey());
                    conn = dataSourceProxy.getPlainConnection();
                } catch (SQLException sqle) {
                    LOGGER.warn("Failed to get connection for async committing on " + entry.getKey(), sqle);
                    continue;
                }
                List<Phase2Context> contextsGroupedByResourceId = entry.getValue();
                for (Phase2Context commitContext : contextsGroupedByResourceId) {
                    try {
                        // process the undolog: delete the record matching this xid and branchId
                        UndoLogManager.deleteUndoLog(commitContext.xid, commitContext.branchId, conn);
                    } catch (Exception ex) {
                        LOGGER.warn(
                            "Failed to delete undo log [" + commitContext.branchId + "/" + commitContext.xid + "]", ex);
                    }
                }
            } finally {
                if (conn != null) {
                    try {
                        conn.close();
                    } catch (SQLException closeEx) {
                        LOGGER.warn("Failed to close JDBC resource while deleting undo_log ", closeEx);
                    }
                }
            }
        }
    }
}
So for the commit action, the RM only needs to delete the undolog records matching the xid and branchId.
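Concretely, the cleanup amounts to a single DELETE. The column names below assume the conventional undo_log table layout (branch_id and xid columns); verify them against your actual DDL:

```java
// Sketch: the cleanup statement for a committed branch. Column names assume the
// default undo_log table (branch_id, xid) and are not taken from Seata's source.
public class UndoLogCleanup {
    public static String deleteSql(String tableName) {
        return "DELETE FROM " + tableName + " WHERE branch_id = ? AND xid = ?";
    }
}
```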
Transaction rollback
A rollback is triggered in two situations. The first is a branch transaction failing, i.e. the report(false) case in ConnectionProxy.
The second is the TM catching an exception thrown up from a downstream system, i.e. an exception caught in the @GlobalTransactional-annotated method that initiated the global transaction. In the TransactionalTemplate.execute template method shown earlier, the call to business.execute() is wrapped in a catch; on catch, rollback is invoked, and the TM notifies the TC that the transaction for this XID must be rolled back:
public void rollback() throws TransactionException {
    // only the Launcher may initiate this rollback
    if (role == GlobalTransactionRole.Participant) {
        // Participant has no responsibility of committing
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Ignore Rollback(): just involved in global transaction [" + xid + "]");
        }
        return;
    }
    if (xid == null) {
        throw new IllegalStateException();
    }
    status = transactionManager.rollback(xid);
    if (RootContext.getXID() != null) {
        if (xid.equals(RootContext.getXID())) {
            RootContext.unbind();
        }
    }
}
After aggregating the results, the TC sends the rollback command to the participants; the RM receives this rollback notification in AbstractRMHandler's doBranchRollback method:
protected void doBranchRollback(BranchRollbackRequest request, BranchRollbackResponse response) throws TransactionException {
    String xid = request.getXid();
    long branchId = request.getBranchId();
    String resourceId = request.getResourceId();
    String applicationData = request.getApplicationData();
    LOGGER.info("Branch rolling back: " + xid + " " + branchId + " " + resourceId);
    BranchStatus status = getResourceManager().branchRollback(request.getBranchType(), xid, branchId, resourceId, applicationData);
    response.setBranchStatus(status);
    LOGGER.info("Branch rollback result: " + status);
}
然後将rollback請求傳遞到DataSourceManager類的branchRollback方法
public BranchStatus branchRollback(BranchType branchType, String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
    // look up the corresponding data source by resourceId
    DataSourceProxy dataSourceProxy = get(resourceId);
    if (dataSourceProxy == null) {
        throw new ShouldNeverHappenException();
    }
    try {
        UndoLogManager.undo(dataSourceProxy, xid, branchId);
    } catch (TransactionException te) {
        if (te.getCode() == TransactionExceptionCode.BranchRollbackFailed_Unretriable) {
            return BranchStatus.PhaseTwo_RollbackFailed_Unretryable;
        } else {
            return BranchStatus.PhaseTwo_RollbackFailed_Retryable;
        }
    }
    return BranchStatus.PhaseTwo_Rollbacked;
}
Finally UndoLogManager's undo method runs. It is plain JDBC and fairly long, so it is not reproduced here; you can read it on GitHub. The undo flow is:
1. Look up the undolog written in PhaseOne by xid and branchId
2. If found, generate the replay SQL from the data recorded in the undolog and execute it, restoring the data modified in PhaseOne
3. After step 2, delete that undolog record
4. If step 1 finds no matching undolog, insert an undolog record with status GlobalFinished
A missing undolog likely means the PhaseOne local transaction failed and never wrote it. Since xid and branchId form a unique index, the insert in step 4 prevents a late PhaseOne write from succeeding afterwards: PhaseOne would then fail, the business data would not be committed either, and the net effect is that the data is rolled back.
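To make step 2 concrete: from the before image seen in the undo-log excerpt earlier (count was 994 before the update), a compensating statement can be generated roughly as below. This is a simplified sketch with hypothetical names, not Seata's actual code; the real UndoLogManager also validates the after image against the current data and uses prepared statements:

```java
import java.util.Map;

// Simplified sketch of undo-SQL generation from a before image: restore each
// recorded column to its pre-update value, keyed by the primary key.
public class UndoSqlBuilder {
    public static String compensatingUpdate(String table, String pkName, Object pkValue,
                                            Map<String, Object> beforeImage) {
        StringBuilder sql = new StringBuilder("UPDATE ").append(table).append(" SET ");
        boolean first = true;
        for (Map.Entry<String, Object> col : beforeImage.entrySet()) {
            if (!first) sql.append(", ");
            sql.append(col.getKey()).append(" = ").append(col.getValue());
            first = false;
        }
        sql.append(" WHERE ").append(pkName).append(" = ").append(pkValue);
        return sql.toString();
    }
}
```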
Finally, I hope this article was helpful, even if it only sparked one small insight. Feel free to comment with the topics you would like to see covered; the article will keep being updated. If you liked it, likes, follows, and shares are appreciated. Let's keep improving together!