【FinE】Statistical and Econometric Indicator Calculations (Matlab)

CDF function (normal distribution)

Compute the cumulative distribution function under a fitted normal distribution.

function [Fncap] = minux_CDF_normal(returns)
    % Normal cumulative distribution function, the prob. the return
    % is lower than or equal to the return in cell(i, j) of returns where
    % assuming returns follow a normal distribution.
    
    % input:
    % returns: matrix of stock returns
    % output:
    % Fncap: matrix of cumulative distribution function
    
    [n, m]=size(returns);
    Fncap = zeros(n, m);
    for j=1:m
        A = fitdist(returns(:, j), 'Normal'); % fit a normal distribution to the asset returns
        for i = 1:n
            Fncap(i, j)=normcdf(returns(i, j), A.mu, A.sigma);
        end
    end
    plot(sort(returns), sort(Fncap));
end
           

CDF function (t-location-scale distribution)

The t-location-scale distribution is preferred in cases of leptokurtosis, as exhibited by the distribution[1] of returns: the t distribution better captures the sharp peak and fat tails of return series.

The pdf of the $T(x, \mu, \sigma, \nu)$ distribution is

$$T(x, \mu, \sigma, \nu)=\frac{\Gamma(\frac{\nu+1}{2})}{\sigma\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}\bigg[\frac{\nu+(\frac{x-\mu}{\sigma})^2}{\nu}\bigg]^{-\frac{\nu+1}{2}}$$

Because of the $\Gamma$ terms, the t-location-scale CDF cannot be computed directly. Instead, standardize the variable and evaluate the Student t-distribution. The pdfs are related by

$$T(y, \mu, \sigma, \nu)=\frac{1}{\sigma}\text{PDF}_{\text{tStudent}}\Big(\frac{y-\mu}{\sigma}\Big)$$

which gives the relation between the CDFs

$$F_{tStu}\Big(\frac{y-\mu}{\sigma}\Big)=F_{tLoc}(y)$$

function [Ftcap] = minux_CDF_t(returns)
    % t-location scale cumulative distribution function
    % in cell(i,j) there is the prob. that the return is lower than or
    % equal to the return in cell of returns, if returns follow a
    % t-location scale distribution
    
    [n, m] = size(returns);
    Ftcap = zeros(n, m);
    for j = 1:m
        A = fitdist(returns(:, j), 'tLocationScale');
        for i = 1:n
            Ftcap(i, j)=tcdf((returns(i, j)-A.mu)/A.sigma, A.nu);
        end
    end
    plot(sort(returns), sort(Ftcap));
end
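
As a quick sanity check of the standardization identity $F_{tStu}((y-\mu)/\sigma)=F_{tLoc}(y)$, the sketch below compares `tcdf` on standardized values with the CDF of a t-location-scale distribution object. The parameter values are illustrative assumptions, not fitted results from the article, and `makedist` is used only to avoid depending on a data set.

```matlab
% Illustrative parameters (assumed, not fitted values from the article)
mu = 0.001; sigma = 0.02; nu = 4;
y = linspace(-0.1, 0.1, 9)';

% CDF via standardization and the Student t CDF
F_std = tcdf((y - mu) / sigma, nu);

% CDF taken directly from a t-location-scale distribution object
pd = makedist('tLocationScale', 'mu', mu, 'sigma', sigma, 'nu', nu);
F_loc = cdf(pd, y);

% The two should agree up to floating-point rounding
maxDiff = max(abs(F_std - F_loc));
```

If `maxDiff` is at rounding level, the loop in `minux_CDF_t` could equally call `cdf(A, returns(:, j))` on the fitted object.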
           

CAPM model: computing $\beta$

The CAPM model is

$$E(R_i)=R_f+\beta_i[E(R_m)-R_f]+\varepsilon_i$$

$\beta_i$ can be computed either by simple linear regression or from the closed-form formula (covariance with the market divided by the market variance).

function [beta] = minux_beta(returns, retmkt)
    %% generate a column vector that contains the betas of all stocks
    % input:
    % returns: matrix of stocks returns
    % retmkt: column vector of returns of the market
    % output:
    % beta: column vector of stocks betas
    
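    % Regression method: OLS with an intercept, the slope is beta
    [n, m] = size(returns);
    beta = zeros(m, 1);
    X = [ones(n, 1), retmkt]; % regress needs an explicit column of ones
    for i=1:m
        b = regress(returns(:, i), X);
        beta(i, 1) = b(2);
    end
    
    % Covariance method (equivalent): beta_i = cov(R_i, R_m)/var(R_m)
    %{
    for i=1:m
        cov_mat = cov(returns(:, i), retmkt);
        beta(i, 1) = cov_mat(1, 2)/cov_mat(2, 2);
    end
    %}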
end
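
The two routes to $\beta_i$ mentioned above (regression and the covariance formula) give the same number when the regression includes an intercept. The sketch below illustrates this on synthetic data. The series, the "true" beta of 1.3 and the noise levels are all made up for the example, and the backslash operator stands in for `regress` so that only base Matlab is needed.

```matlab
rng(1);                                   % reproducible synthetic data
n = 250;
retmkt = 0.01 * randn(n, 1);              % hypothetical market returns
ret = 0.0002 + 1.3 * retmkt + 0.005 * randn(n, 1);  % hypothetical stock returns

% Regression route: OLS with an intercept, beta is the slope
b = [ones(n, 1), retmkt] \ ret;
betaOLS = b(2);

% Formula route: covariance with the market over market variance
C = cov(ret, retmkt);
betaCov = C(1, 2) / C(2, 2);
```

A regression without the column of ones forces the line through the origin and generally gives a different slope.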
           

Computing the error term $\varepsilon_i$

The error terms $\varepsilon_i$ are modeled as independent random variables drawn from the following distribution:

$$\varepsilon_i=\sigma_iT_{tStu}(\nu_i)$$

function [e] = minux_err(returns, retmkt, beta, rf)
    % generate the error values for the CAPM formula from a process
    % following a fitted t-Location Scale Distribution
    % (the error terms are the residuals of the CAPM model)
    % input:
    % returns: matrix of stock returns
    % retmkt: column vector of market returns
    % beta: column vector of betas
    % rf: risk-free rate
    % output:
    % e, matrix of error terms for each stock
    
    [n, m] = size(returns);
    e = zeros(n, m);
    for i=1:m
        e(:, i) = returns(:, i)-rf-beta(i, 1)*(retmkt-rf);
    end
end
           
The expected value of the market returns is calculated with the same method as the error term. With all the variables in hand, we can follow the CAPM formula and calculate the expected returns of security $i$ following a $t$-location-scale distribution.

The code below simulates scenarios by drawing from the fitted $t$ distributions:

function [rbarStocks] = minux_SCENARIOS(retmkt, beta, rf, e)
    % this function creates the matrix of predicted returns following the
    % CAPM
    % input:
    % retmkt: column vector of market returns
    % beta: column vector of stock betas
    % rf: column vector of risk-free rates (one per period)
    % e: error matrix
    % output:
    % rbarStocks: matrix of predicted stocks returns following a t-location
    % scale
    [n, m] = size(e);
    rbarStocks = zeros(n, m);
    dMarket = fitdist(retmkt, 'tLocationScale'); % fit a t distribution to the market return series
    for i = 1:m
        dErr = fitdist(e(:, i), 'tLocationScale'); % fit a t distribution to the error series
        for j=1:n
            % build stock scenarios from the fitted market distribution, beta and error term
            rbarStocks(j, i) = rf(j, 1)+beta(i, 1)* (random(dMarket)-rf(j, 1))+random(dErr);
        end
    end
end
           

correlation and covariance

One of the major issues with the CAPM model is correlation between the predicted returns and the error term. Such correlation would mean that the model, with its current structure and variables, does not fully explain the behavior of returns.

The code below computes the covariance and correlation between the error terms and the simulated return series:

function [CovER, Cor] = minux_cov(rbarStocks, e)
    % input:
    % rbarStocks: matrix of predicted stock returns
    % e: matrix of error returns
    % output:
    % CovER: covariance between each simulated series and its error term
    % Cor: correlation coefficient between each simulated series and its error term
    
    m = size(rbarStocks, 2);
    CovER = zeros(m, 1);
    Cor = zeros(m, 1);
    for j=1:m
        A = cov(rbarStocks(:, j), e(:, j)); % 2-by-2 covariance matrix of two column vectors
        CovER(j, 1) = A(1, 2);
        Cor(j, 1) = corr(rbarStocks(:, j), e(:, j));
    end
end
           

Fitting series with the $t$ distribution

Scenarios consistent with the $t$-location-scale distribution can also be predicted without using CAPM.

%% t-distribution fit
rbar_tloc = minux_t_fit(returns);
sample_ret = returns(:, 1);
A = fitdist(sample_ret, 'tLocationScale');
xmin = min(sample_ret)-0.01;
xmax = max(sample_ret)+0.01;

xp = xmin: 0.01:xmax;
hold on;
yyaxis left; % activate the left axis
yp = pdf(A, xp);
plot(xp, yp, 'b--');
yyaxis right; % activate the right axis
histogram(sample_ret, 'Normalization', 'probability', 'FaceAlpha', 0.4);
legend('t-fitted', 'returns');
hold off;
           

The fitting function is

function [rbarStocks_tloc] = minux_t_fit(rStocks)
    [n, m]=size(rStocks);
    rbarStocks_tloc = zeros(n, m);
    for j = 1:m
        A = fitdist(rStocks(:, j), 'tLocationScale'); % fit a t distribution to the return series
        for i=1:n
            rbarStocks_tloc(i, j)=A.mu+A.sigma*trnd(A.nu);
        end
    end
end
           

After the $t$ fit, plotting the fitted pdf and the histogram on dual axes shows that the two overlap closely, so the fit captures the sharp peak and fat tails of the returns.

[Figure: fitted t-location-scale pdf over the returns histogram]

For comparison, the fit using a normal distribution is shown below.

[Figure: fitted normal pdf over the returns histogram]

Generalized Hyperbolic Distribution

The probability density function of the Generalized Hyperbolic (GH) distribution is

$$d_{GH(\lambda, \alpha, \beta, \delta, \mu)}(x)=a(\lambda, \alpha, \beta, \delta, \mu)\,(\delta^2+(x-\mu)^2)^{\frac{2\lambda-1}{4}}e^{\beta(x-\mu)}\times K_{\lambda-\frac{1}{2}}\Big(\alpha\sqrt{\delta^2+(x-\mu)^2}\Big)$$

where $\alpha>0$ is the shape parameter, the skewness is governed by $\beta$ through its absolute value, with $0\leq |\beta|<\alpha$, $\mu\in R$ is the location parameter, $\lambda\in R$ characterizes the tail shape, $\delta>0$ is the scale parameter, and the normalizing function $a(\lambda, \alpha, \beta, \delta, \mu)$ is

$$a(\lambda, \alpha, \beta, \delta, \mu)=\frac{(\alpha^2-\beta^2)^{\frac{\lambda}{2}}}{\sqrt{2\pi}\,\alpha^{(\lambda-\frac{1}{2})}\delta^\lambda K_\lambda(\delta\sqrt{\alpha^2-\beta^2})}$$

where $K_\lambda$ denotes the modified Bessel function of the third kind with index $\lambda$.
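
The density can be evaluated directly in Matlab with `besselk`, which computes the same function $K_\lambda$ (Matlab's documentation calls it the modified Bessel function of the second kind). The sketch below is one possible implementation under assumed, purely illustrative parameter values. It uses the exponentially scaled form `besselk(nu, z, 1)` and folds $e^{-z}$ into the exponential to avoid overflow and underflow in the tails, then checks that the density integrates to one.

```matlab
% Illustrative GH parameters (assumed, with |beta| < alpha and delta > 0)
lambda = 1; alpha = 2; beta = 0.5; delta = 1; mu = 0;

gam = sqrt(alpha^2 - beta^2);
% Normalizing constant a(lambda, alpha, beta, delta, mu)
a = gam^lambda / (sqrt(2*pi) * alpha^(lambda - 0.5) * delta^lambda ...
    * besselk(lambda, delta * gam));

% q(x) = sqrt(delta^2 + (x - mu)^2), so (delta^2 + (x-mu)^2)^((2*lambda-1)/4) = q^(lambda-1/2)
q = @(x) sqrt(delta^2 + (x - mu).^2);

% besselk(nu, z, 1) returns K_nu(z)*exp(z); the exp(-alpha*q) factor restores K_nu(z)
dGH = @(x) a .* q(x).^(lambda - 0.5) ...
    .* exp(beta * (x - mu) - alpha * q(x)) ...
    .* besselk(lambda - 0.5, alpha * q(x), 1);

% A valid density should integrate to 1
total = integral(dGH, -Inf, Inf);
```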

The GH distribution builds on the Generalized Inverse Gaussian (GIG) distribution. Setting $\zeta=\delta\sqrt{\alpha^2-\beta^2}$, the expected value of GH is

$$\mathbb{E}(GH)=\mu+\frac{\beta \delta^2}{\zeta}\frac{K_{\lambda+1}(\zeta)}{K_\lambda(\zeta)}=\mu+\beta\mathbb{E}(GIG)$$

and the variance is

$$\mathbb{V}(GH)=\frac{\delta^2}{\zeta}\frac{K_{\lambda+1}(\zeta)}{K_\lambda(\zeta)}+\beta^2\frac{\delta^4}{\zeta^2}\bigg(\frac{K_{\lambda+2}(\zeta)}{K_\lambda(\zeta)}-\frac{K_{\lambda+1}^2(\zeta)}{K_\lambda^2(\zeta)}\bigg)=\mathbb{E}(GIG)+\beta^2\mathbb{V}(GIG)$$
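
The closed-form moments can be cross-checked against numerical integration of the density. The sketch below takes $\zeta=\delta\sqrt{\alpha^2-\beta^2}$, the same argument that appears in the normalizing constant, and uses illustrative parameter values that are assumptions of this example.

```matlab
% Illustrative GH parameters (assumed)
lambda = 1; alpha = 2; beta = 0.5; delta = 1; mu = 0;
zeta = delta * sqrt(alpha^2 - beta^2);

% GH density, written with the exponentially scaled Bessel function for stability
a = (alpha^2 - beta^2)^(lambda / 2) / (sqrt(2*pi) * alpha^(lambda - 0.5) ...
    * delta^lambda * besselk(lambda, zeta));
q = @(x) sqrt(delta^2 + (x - mu).^2);
dGH = @(x) a .* q(x).^(lambda - 0.5) .* exp(beta * (x - mu) - alpha * q(x)) ...
    .* besselk(lambda - 0.5, alpha * q(x), 1);

% Closed-form mean and variance from the Bessel-function ratios
R1 = besselk(lambda + 1, zeta) / besselk(lambda, zeta);
R2 = besselk(lambda + 2, zeta) / besselk(lambda, zeta);
meanClosed = mu + beta * delta^2 / zeta * R1;
varClosed  = delta^2 / zeta * R1 + beta^2 * delta^4 / zeta^2 * (R2 - R1^2);

% The same moments by numerical integration
meanNum = integral(@(x) x .* dGH(x), -Inf, Inf);
varNum  = integral(@(x) (x - meanNum).^2 .* dGH(x), -Inf, Inf);
```

Agreement between the `Closed` and `Num` values is a useful check that the density and the moment formulas were typed in consistently.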

approximation method 1

Construct a strictly increasing function $h(x; u)=F_{GH}(x)-u$, where $u$ is a random draw from the uniform distribution on $(0, 1)$. Solving $h(x; u)=0$ numerically yields a GH-distributed sample $x$:

$$h(x; u)=\int_{-\infty}^{x}d_{GH(\lambda, \alpha, \beta, \delta, \mu)}(t)\,dt-u$$
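
One way to carry out this root search in Matlab is to combine `integral` (for $F_{GH}$) with `fzero`. The sketch below is an assumption-laden illustration: the parameter values, the fixed value of `u` (in practice `u = rand`) and the bracketing interval `[-20, 20]` are all choices made for the example.

```matlab
% Illustrative GH parameters (assumed)
lambda = 1; alpha = 2; beta = 0.5; delta = 1; mu = 0;
gam = sqrt(alpha^2 - beta^2);
a = gam^lambda / (sqrt(2*pi) * alpha^(lambda - 0.5) * delta^lambda ...
    * besselk(lambda, delta * gam));
q = @(x) sqrt(delta^2 + (x - mu).^2);
dGH = @(x) a .* q(x).^(lambda - 0.5) .* exp(beta * (x - mu) - alpha * q(x)) ...
    .* besselk(lambda - 0.5, alpha * q(x), 1);

% CDF by numerical integration of the density
FGH = @(x) integral(dGH, -Inf, x);

u = 0.3;                      % fixed here for reproducibility; use rand to sample
h = @(x) FGH(x) - u;          % strictly increasing in x

% h changes sign on the bracket, so fzero can locate the unique root
xSample = fzero(h, [-20, 20]);
```

Repeating the last three lines with fresh uniform draws produces a GH sample by inverse-transform sampling, at the cost of one numerical integration per function evaluation.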

approximation method 2

Approximate the root with the Newton-Raphson algorithm:

$$\begin{cases} x_1\quad\text{start point}\\ x_{k+1}=x_k-\frac{F_{GH}(x_k)-u}{d_{GH(\lambda, \alpha, \beta, \delta, \mu)}(x_k)} \end{cases}$$
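
A minimal sketch of this iteration follows, using that the derivative of $F_{GH}$ is the density $d_{GH}$. The start point $x_1=\mu$, the stopping tolerance, the iteration cap and the parameter values are assumptions of the example. Newton steps can overshoot when $u$ is far in the tails, where the density is very small, so a production version would likely need safeguards.

```matlab
% Illustrative GH parameters (assumed)
lambda = 1; alpha = 2; beta = 0.5; delta = 1; mu = 0;
gam = sqrt(alpha^2 - beta^2);
a = gam^lambda / (sqrt(2*pi) * alpha^(lambda - 0.5) * delta^lambda ...
    * besselk(lambda, delta * gam));
q = @(x) sqrt(delta^2 + (x - mu).^2);
dGH = @(x) a .* q(x).^(lambda - 0.5) .* exp(beta * (x - mu) - alpha * q(x)) ...
    .* besselk(lambda - 0.5, alpha * q(x), 1);
FGH = @(x) integral(dGH, -Inf, x);

u = 0.3;          % fixed here for reproducibility; use rand to sample
x = mu;           % start point x_1 (assumed choice)
for k = 1:50
    step = (FGH(x) - u) / dGH(x);   % Newton step: h(x_k) / h'(x_k)
    x = x - step;
    if abs(step) < 1e-8
        break;                      % converged
    end
end
```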

Goodness of Fit

Anderson & Darling Test

The Anderson & Darling statistic, which can be used to test any distribution, is

$$AD = \max\limits_{x\in R}\frac{|F_{emp}(x)-F_{est}(x)|}{\sqrt{F_{est}(x)(1-F_{est}(x))}}$$

where $F_{emp}(x)$ is the empirical CDF and $F_{est}(x)$ is the estimated CDF.

Kolmogorov Distance

The supremum of the absolute difference between the two CDFs:

$$KD=\sup_{x\in R}|F_{emp}(x)-F_{est}(x)|$$

L1 Distance

$$L_1=\sum_i|F_{emp}(x_i)-F_{est}(x_i)|$$

L2 Distance

$$L_2=\sqrt{\sum_i|F_{emp}(x_i)-F_{est}(x_i)|^2}$$
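
The four measures can be computed together once both CDFs are evaluated at the sorted observations. The sketch below is illustrative only. It uses a deterministic sample of standard normal quantiles and the known $N(0,1)$ CDF as $F_{est}$, whereas in practice $F_{est}$ would come from a fitted distribution. The maximum and supremum are approximated over the sample points only.

```matlab
% Deterministic sample: standard normal quantiles at mid-point probabilities
n = 200;
p = ((1:n)' - 0.5) / n;
x = sort(norminv(p));            % sorted observations

Femp = (1:n)' / n;               % empirical CDF at each sorted observation
Fest = normcdf(x, 0, 1);         % estimated CDF (here the known N(0,1))

d  = abs(Femp - Fest);           % pointwise CDF differences
AD = max(d ./ sqrt(Fest .* (1 - Fest)));   % Anderson & Darling (weights the tails)
KD = max(d);                               % Kolmogorov distance
L1 = sum(d);                               % L1 distance
L2 = sqrt(sum(d.^2));                      % L2 distance
```

For this constructed sample every difference equals $0.5/n$, so `KD` is $0.5/n$, `L1` is $0.5$ and `L2` is $0.5/\sqrt{n}$, which makes the script easy to verify by hand.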

References

Optimization of Conditional Value-at-Risk

[1] Location-Scale Distributions
