Black-Box AI
Explainability is extremely crucial in systems responsible for mission-critical tasks. For example, in healthcare, if scientists rely on AI models to help them determine whether a patient has cancer, they need to be 100% sure of their diagnoses: a wrong call can result in death, lawsuits and serious damage to trust. The nature of this problem is why explainability sits at its core: the data scientists and the human operators, in this case the doctors, need to understand how the machine learning system behaves and how it arrived at a decision.
Explainable AI is also important in finance, and in fintech in particular, due to the growing adoption of machine learning solutions for credit scoring, loan approval, insurance, investment decisions and so on. Here again, there is a cost associated with wrong decisions by the machine learning system, so there is a huge need to understand how the model actually works.
Using black-box AI increases business risk and exposes businesses to a deep downside, from credit card applications to disease diagnosis to criminal justice.
The reason why black-box models are undesirable becomes clearer when we look at how the business functions as a whole:
- For the business decision-maker, data scientists need to answer why the models can be trusted.
- For IT & Operations, data scientists need to explain how to monitor and debug the system if an error occurs.
- For the data scientists themselves, they need to know how to further improve the accuracy of their models.
- Finally, for regulators and auditors, they need an answer to whether the AI system is fair or not.
Enter explainable AI
Explainable AI aims to provide clear and transparent predictions: an end-to-end system that provides decisions and explanations to the user, and ultimately automated feedback to constantly improve the AI system. Remember, xAI is highly driven by feedback, so it is a two-way interaction between the human and the AI system.
In the end, we should understand the why behind the model, the impact of the model, where the model fails and what recommendations to provide.
Explainx: An open-source, fast and scalable explainable AI platform
explainx is an open-source explainable AI platform created by explainX.ai. Written in Python, explainx aims to help data scientists explain, monitor and debug black-box AI models; the purpose is to help build robust, unbiased and transparent AI applications.

explainx high-level architecture
Within the explainx architecture, explainx provides access to state-of-the-art interpretability techniques in just a single line of code inside our Jupyter notebook.
For this example, we will use the HELOC dataset provided by FICO. The customers in this dataset have requested a credit line in the range of USD 5,000-150,000. Our job is to predict RiskPerformance: whether they will make timely payments over a two-year period. The prediction can then be used to decide whether the homeowner qualifies for a line of credit.
Loan Application Approval — explainx.ai
For this example, we will train a CatBoost classifier model. After the training is done, we will use the explainx xAI module to explain our model and build a narrative for a business user to understand!
Let's start by opening up our Jupyter notebook and installing the explainx library. You can also clone the repository from the link below:
https://github.com/explainX/explainx
pip install explainx
Let's import the relevant packages:
from explainx import *
from catboost import CatBoostClassifier  # CatBoostClassifier is used directly below
from sklearn.model_selection import train_test_split
Let's load and pre-process our dataset for model building. The dataset is already available in the explainx library.
X, y = explainx.dataset_heloc()

# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
Begin training. For the sake of this tutorial, we are going to keep it simple!
# Run the CatBoost model
model = CatBoostClassifier(iterations=500,
                           learning_rate=0.3,
                           depth=2)

# Fit the model
model.fit(X_train.to_numpy(), y_train)
After the training is done, we can simply pass the test data into the explainx function and get our explanations!
explainx.ai(X_test, y_test, model, model_name="catboost")
Once explainx is up and running, all you need to do is point your browser to http://127.0.0.1:8050 and you'll see a very nice user interface called explainX.Dashboard.
Note: If you want to view it inline, simply pass the mode="inline" argument into the explainx function.
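For instance, a sketch of the inline call, assuming mode is accepted alongside the same arguments used above:

# Render the dashboard inside the notebook instead of on a separate port
explainx.ai(X_test, y_test, model, model_name="catboost", mode="inline")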
For this tutorial, we will not go into the nitty-gritty of model building, model metrics and evaluation. Instead, we will dive right into the explanation part, which is the main aim of this tutorial. So let's start opening up the black box!
We will tackle the model at four levels:
Global Level Explanation
Local Prediction Explanation
Scenario Analysis
Feature Interaction & Distributions
Global Level Explanation
We will be using the overall feature importance and overall feature impact graphs to understand the basic underlying logic of the model.
Global Feature Importance
Interpretation: This tells us that, according to the CatBoost model, ExternalRiskEstimate, MSinceMostRecentInq and PercentTradesNeverDelq are the top three variables with the largest impact on RiskPerformance. These three represent risk estimate, credit inquiry and debt level information: extremely important categories when assessing risk.
This information gives us a general idea of feature contribution, but to understand whether each of these features has a positive or a negative impact on RiskPerformance, we need to consult the feature impact graph.
Global Feature Impact
This graph gives us even more insight into the model logic. We can clearly observe that ExternalRiskEstimate impacts the predicted variable positively: it pushes the prediction towards "Good Credit Risk Performance", which matches our intuition as well. We have to use a little bit of our domain knowledge in finance for this: ExternalRiskEstimate is a consolidated version of several risk markers (higher is better), so we automatically learn that this variable will always positively affect the prediction. Then we have NumSatisfactoryTrades: the number of "satisfactory" accounts ("trades") has a significant positive effect on the predicted probability of good credit.
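A SHAP summary plot conveys the same signed, per-feature impact outside the dashboard; a short sketch, reusing the explainer from the sketch above:

# Beeswarm plot: each dot is one customer, colored by feature value and
# positioned by how much that feature pushed the prediction up or down
shap.summary_plot(shap_values, X_test)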
However, at the lower end, delinquency variables push the overall prediction towards 0 (in this case, application denied). This is interesting, and we can even dig deeper to see when this negative effect of delinquency wears off. (Something to try on your own!)
Now that we have an idea of how each feature affects the prediction, we can move on to explaining a single prediction for a specific customer. For that, we will use an impact graph, or decision plot, that will help us get attribution scores for that particular prediction. To further support our analysis, we will find the profiles that most closely resemble the one we are trying to predict.
Local Prediction Explanation
So let's explain the prediction for the customer on row # 9 in our data. For this specific customer, the application was approved because the RiskPerformance was "Good", and our model also classified it correctly!
Customer Number 9 — Local Prediction Explanation
Let's explore the model logic:
Local Feature Impact for Customer # 9
This graph clearly shows the top three positively and top three negatively impacting variables. According to the model, MSinceMostRecentInq had the most positive impact on the prediction. This tells us that a higher value of this variable means there is no penalty for more than one month having passed since the most recent inquiry. Then we have ExternalRiskEstimate, which again plays a positive role in pushing the prediction towards "good credit behaviour". However, PercentTradesNeverDelq affected the prediction negatively: this can happen when the value of this variable is extremely small, because a smaller value lowers the probability of a good credit score.
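For reference, the per-row attribution scores behind a graph like this can also be pulled out directly; a sketch reusing the shap explainer from earlier, again an assumption rather than explainx's exact computation:

# Attributions for the single customer on row 9
row_values = explainer.shap_values(X_test.iloc[[9]])[0]

# Top three negative and top three positive contributors
ranked = sorted(zip(X_test.columns, row_values), key=lambda pair: pair[1])
print("most negative:", ranked[:3])
print("most positive:", ranked[-3:])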
To keep it short, these findings match our mental models, as the attribution score of each variable is correctly assigned. To further support our analysis, we will find similar customers!
Explainx comes with a built-in prototypical analysis function that provides a more well-rounded and comprehensive view of why the decision for the applicant may be justifiable.
The above table depicts the five user profiles closest to the chosen applicant. Based on the importance weight assigned to each profile by the method, we see that the prototype under column zero is by far the most representative user profile. This is (intuitively) confirmed by the feature similarity: more than 50% of the features of this prototype (12 out of 23) are identical to those of the chosen user whose prediction we want to explain. Also, a bank employee looking at the prototypical users and their features can surmise that the approved applicant belongs to a group with high values of ExternalRiskEstimate. This justification gives bank employees more confidence in approving the user's application.
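explainx performs this prototypical analysis itself; as a rough stand-in to build intuition, a nearest-neighbour search can surface similarly behaving applicants. This sketch uses scikit-learn and is an assumption about the idea, not explainx's actual algorithm:

from sklearn.neighbors import NearestNeighbors

# Five training profiles closest to customer # 9 in raw feature space
# (a real analysis would normalize the features first)
nn = NearestNeighbors(n_neighbors=5).fit(X_train)
distances, indices = nn.kneighbors(X_test.iloc[[9]])
similar_profiles = X_train.iloc[indices[0]]
print(similar_profiles)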
Scenario Analysis
Now, let's explore different scenarios and see how the model performs. We can apply data filters within the data table (no need to write SQL queries to filter your data) and explain multiple instances and scenarios very easily.
This is extremely useful when you are trying to understand behaviour on a specific cluster or group of data. For example, we want to see whether the model attributes the same weights to users when ExternalRiskEstimate is greater than 60 and MSinceOldestTradeOpen is greater than 200.
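The dashboard applies these filters through its UI; for reference, the equivalent pandas expression would be:

# Subset matching the scenario filters applied in the dashboard
subset = X_test[(X_test["ExternalRiskEstimate"] > 60) &
                (X_test["MSinceOldestTradeOpen"] > 200)]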
data filters applied — explainx
We can clearly see the dominance of ExternalRiskEstimate from the chart below.
Feature Impact when RiskEstimate > 60 and TradeOpen > 200
When the ExternalRiskEstimate is greater than 60, it is seen as a positive sign, which matches our internal mental model as well: the probability of a bad credit score decreases monotonically as the value of ExternalRiskEstimate increases! So in this cluster, where the RiskEstimate is greater than 60, we will have more "Good" customers who were extended the credit line. We can confirm this by using a feature interaction plot, specifically a partial dependence plot:
Feature Interaction & Distribution
The partial dependence plot validates our assumption. In the plot below, Red = Good RiskPerformance (credit line extended) and Silver = Bad RiskPerformance (credit line denied).
We can clearly see the pattern: as the ExternalRiskEstimate value increases, its impact on the output also increases, and we see more instances of good risk performance. The red dots become more concentrated as the ExternalRiskEstimate value goes up, and that makes a lot of sense!
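A SHAP dependence plot, a close relative of the PDP, can reproduce this view outside the dashboard; a sketch reusing the shap values computed earlier:

# How the attribution for ExternalRiskEstimate changes with its value
shap.dependence_plot("ExternalRiskEstimate", shap_values, X_test)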
Conclusion
Data scientists can use explainx to further explore patterns by looking at interactions between different variables and how they impact the overall prediction. So let's end by summarizing our findings for a business user to understand:
- According to the CatBoost model, ExternalRiskEstimate, MSinceMostRecentInq and PercentTradesNeverDelq are the top three variables with the largest impact on RiskPerformance.
- ExternalRiskEstimate impacts the predicted variable positively, pushing it towards "Good Credit Risk", but delinquency variables push the overall prediction towards "Bad Credit Risk" (in this case, application denied).
- For our customer # 9 (RiskPerformance = Good), ExternalRiskEstimate played a positive role in pushing the prediction towards "good credit behaviour". However, PercentTradesNeverDelq affected the prediction negatively, likely because its value is extremely small, and a smaller value lowers the probability of a good credit score.
- We found customers with behaviour and variable values very similar to our customer # 9, strengthening our hypothesis even more.
- We were able to validate the model's logic by further exploring the PDP, which clearly showed how increasing the value of ExternalRiskEstimate increases the probability of loan approval.
- Data scientists can take a similar process further and build an even more comprehensive data narrative that anyone can easily understand.
I hope you all enjoyed this case study. Explainability is extremely crucial and more relevant than ever today, so the ability to present a narrative that shows your understanding of how AI works is a vital skill for a data scientist. This is the real essence of human-AI understanding and the democratization of AI.
Download explainx: https://github.com/explainX/explainx
Documentation: https://explainx-documentation.netlify.app/
Original article: https://towardsdatascience.com/practical-explainable-ai-loan-approval-use-case-f06d2fba4245