
Yishuo School District (40) | SPSS Statistical Analysis (50): Principal Component Analysis in Practice

Author: LearningYard學苑


"Share interests, spread joy, broaden horizons, and leave good memories!" Hello everyone, this is the editor. Welcome back to LearningYard; we will do our best to keep bringing you more and better content.


In the last issue, we covered the basic theory of principal component analysis and factor analysis. In this issue, we go through the general steps of principal component analysis and a worked example in SPSS.

Steps of principal component analysis:

Suppose the input is a decision table T = (U, C ∪ D, V, F), where U is the universe, X = {x1, x2, x3, x4, …, xn}, and C and D are the condition attribute set and the decision attribute set, respectively. The required output is the set of principal components of the condition attributes, P = {y1, y2, y3, y4, y5, …, ym}. The steps are as follows:

Step 1: Standardize the raw data.

Apply

$$x^{*}_{ij} = \frac{x_{ij} - \bar{x}_j}{\sqrt{\operatorname{Var}(x_j)}}$$

so that every attribute has mean 0 and variance 1. Here x_{ij} is the value of attribute x_j for the i-th object, and \bar{x}_j and Var(x_j) are the mean and variance of x_j.

Step 2: Compute the correlation coefficient matrix. Compute the correlation coefficient matrix R of the standardized data set X obtained in Step 1.

Step 3: Compute the eigenvalues and unit eigenvectors. Compute the eigenvalues λi of R and the corresponding unit eigenvectors ei, i = 1, 2, 3, …, n, and sort the eigenvalues in descending order, that is, λ1 ≥ λ2 ≥ λ3 ≥ … ≥ λn.

Step 4: Compute the variance contribution rate and the cumulative variance contribution rate of the principal components.

The variance contribution rate of the k-th principal component is ak = λk / (λ1 + λ2 + … + λn). Because the eigenvalues are sorted, a1 is the largest, which means that y1 summarizes the information in x1, x2, x3, …, xn better than any other component. The number of principal components m is usually chosen as the smallest number for which the cumulative variance contribution rate is ≥ 80%, or as the number of eigenvalues greater than 1.

Step 5: Compute the principal components. Use the unit eigenvectors that correspond to the first m eigenvalues to compute the principal components y1, y2, y3, …, ym from the standardized data.
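The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it relies on NumPy, the function name `pca_steps` is made up for this example, the demo data are random numbers rather than the article's table, and the number of components `m` is passed in by hand.

```python
import numpy as np


def pca_steps(data, m):
    """Run the five steps on a (objects x attributes) array and keep m components."""
    data = np.asarray(data, dtype=float)

    # Step 1: standardize each column to mean 0 and (sample) variance 1.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

    # Step 2: correlation coefficient matrix R of the attributes (columns).
    r = np.corrcoef(z, rowvar=False)

    # Step 3: eigenvalues and unit eigenvectors of the symmetric matrix R,
    # reordered so that the eigenvalues are in descending order.
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 4: variance contribution rate and cumulative contribution rate.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)

    # Step 5: principal components from the first m unit eigenvectors.
    scores = z @ eigvecs[:, :m]
    return eigvals, contribution, cumulative, scores


# Synthetic demo data: 16 objects, 3 attributes (random, NOT the article's table).
rng = np.random.default_rng(0)
base = rng.normal(size=(16, 2))
demo = np.column_stack([base[:, 0], base[:, 0] + 0.1 * base[:, 1], base[:, 1]])
eigvals, contribution, cumulative, scores = pca_steps(demo, m=2)
print(np.round(cumulative, 3))
```

Since R is a correlation matrix, the eigenvalues add up to the number of attributes, and the sample variance of each column of `scores` equals the corresponding eigenvalue, which is a quick way to check the result.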

SPSS worked example

Next, let's work through a concrete SPSS example.

To give an overall picture of world economic globalization at the end of the 20th century, data for 16 representative countries in 1999 were selected, as shown in the table below; the indicators of each country's degree of participation in economic globalization are given. Use SPSS to analyze which factors mainly determine how deeply a country participates in economic globalization.

[Table: indicator data for the 16 countries, 1999]

The economic meaning of each indicator is as follows:

X1: GDP as a share of global GDP

X2: Goods trade as a share of goods GDP

X3: Foreign affiliates as a share of all affiliates worldwide

X4: Total income generated in the country as a share of GDP

X5: Total income generated in the country as a share of total income generated worldwide

X6: Sum of outward direct investment and inward foreign direct investment as a share of GDP

X7: Foreign direct investment as a share of total domestic investment

X8: The country's direct investment as a share of global direct investment

X9: Cross-border M&A value as a share of global cross-border M&A value

X10: International economic openness (outward orientation)

X11: Foreign trade dependence

X12: Total imports and exports of goods and services as a share of GDP

X13: Total international financial capital flows as a share of GDP

X14: External financial assets and liabilities as a share of GDP

X15: Total international financial capital flows as a share of global international financial capital flows

Step 1: analyze and organize the data. There are 15 indicators in total, but some of them are correlated with one another, and they do not all influence globalization to the same degree, so principal component analysis is appropriate. Define one variable per indicator as listed above, enter the data, and save the file.

Step 2: configure the principal component analysis. Open the "Analyze -> Dimension Reduction -> Factor" dialog box and set it up as shown in the figures below.

[Figures: Factor Analysis dialog settings]

Step 3: main results and analysis.

The factor analysis output includes the table of eigenvalues and variance contributions, the scree plot of the principal components, and the unrotated component (factor loading) matrix. The eigenvalue and variance contribution table shows that the first three principal components explain nearly 86.7% of the total variance, so the first three principal components can be retained for analysis.

[Table: eigenvalues and variance contributions]

In the scree plot, judging by the inflection point of the eigenvalue curve together with the eigenvalues themselves, the line is steep over the first three principal components and flattens out afterwards, which also supports keeping the first three.

[Figure: scree plot of the principal components]

Step 4: use the factor analysis results to carry out the principal component analysis. The calculation process is shown in the figure below and yields the principal component scores and the composite scores. From the charts it is easy to compare the countries' levels of participation in internationalization: the United States ranks highest and India lowest.

[Figure: calculation process, principal component scores and composite scores]
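The calculation itself appears only in the original figure, so the following is a sketch of one common way to do this conversion, not necessarily the exact procedure used here. It assumes that each column of the unrotated component matrix is divided by the square root of its eigenvalue to recover a unit eigenvector, that a component score is the dot product of the standardized indicators with that eigenvector, and that the composite score weights each retained component by its share of the retained eigenvalues. All function names and numbers below are hypothetical.

```python
import math


def unit_eigenvector(loadings, eigenvalue):
    """Convert one column of the component (loading) matrix to a unit eigenvector."""
    return [value / math.sqrt(eigenvalue) for value in loadings]


def component_score(standardized_row, eigenvector):
    """Principal component score of one object: dot product with the eigenvector."""
    return sum(x * e for x, e in zip(standardized_row, eigenvector))


def composite_score(component_scores, eigenvalues):
    """Weight each retained component score by its share of the retained eigenvalues."""
    total = sum(eigenvalues)
    return sum(s * v / total for s, v in zip(component_scores, eigenvalues))


# Made-up two-variable example: eigenvalues 1.6 and 0.4, loadings = eigenvector * sqrt(eigenvalue).
eigenvalues = [1.6, 0.4]
loading_columns = [
    [math.sqrt(0.8), math.sqrt(0.8)],
    [math.sqrt(0.2), -math.sqrt(0.2)],
]
eigenvectors = [unit_eigenvector(col, val) for col, val in zip(loading_columns, eigenvalues)]

row = [1.0, 0.5]  # standardized indicator values of one hypothetical country
scores = [component_score(row, vec) for vec in eigenvectors]
total_score = composite_score(scores, eigenvalues)
print(scores, total_score)
```

Computing `total_score` for every country and sorting gives a ranking of the kind described above. Normalizing the weights by the retained eigenvalues or by all eigenvalues changes the scale of the composite score but not the ranking.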

Next issue: in this issue, we practiced principal component analysis in SPSS. Next time, we will work through factor analysis in practice.

That's all for today. If you have thoughts of your own about this article, please leave us a message. See you tomorrow, and have a happy day!

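References: Baidu Baike (百度百科); 《SPSS 23 統計分析實用教程》 (SPSS 23 Statistical Analysis: A Practical Tutorial)

Translation: Baidu Translate (百度翻譯)

This article is an original work by learningyard新學苑. Some of the text and images come from other sources; if there is any infringement, please contact us for removal.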
