
Python Third-Party Modules: Statistics

I. The statsmodels module

Official documentation: https://www.statsmodels.org/stable/user-guide.html and https://www.statsmodels.org/stable/api.html

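1. Overview

(1) Introduction:

For more features, see: https://zhuanlan.zhihu.com/p/91384305

statsmodels is a Python module for statistical analysis. It grew out of work by Jonathan Taylor, a professor of statistics at
Stanford University, and Skipper Seabold and Josef Perktold formally created the project in 2010. It implements many classical
algorithms from statistics and econometrics, mainly:
① Regression models: linear regression, generalized linear models, robust linear models, linear mixed-effects models, etc.
② Analysis of variance (ANOVA)
③ Time-series analysis and state-space models: AR, ARMA, ARIMA, VAR, etc.
④ Generalized method of moments (GMM)
⑤ Nonparametric methods: kernel density estimation, kernel regression
⑥ Visualization of statistical model results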
           

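(2) Relationship to other modules:

① patsy: inspired by R's formula system, Nathaniel Smith created the patsy project. It provides the formula/model specification framework used by statsmodels
② scikit-learn: statsmodels focuses on statistical inference, whereas sklearn focuses on prediction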
           

(3) Installation:

pip install statsmodels
           

(4) Comparison of the different import paths:

See: https://www.statsmodels.org/stable/api-structure.html#import-paths-and-structure

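2. Cross-Sectional Study: the array-based interface

#Notes:
① This kind of interface is recommended for interactive use
② These classes/functions are actually defined elsewhere; sm only provides an interface to them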
           

(1) Import:

#Usually imported as sm:
import statsmodels.api as sm
           

(2) Regression:

Ordinary Least Squares (OLS): class sm.OLS(<endog>,<exog>[,missing='none',hasconst=None,**kwargs])
  #Defined as class statsmodels.regression.linear_model.OLS
  #Parameters:
    endog: the y values of the data points; 1-D array-like
    exog: the x values of the data points; n×k array-like, where n=len(<endog>) and k is the number of features
    missing: how missing values are handled; 'none' (do not check for NaN) / 'drop' (drop the affected observations) / 'raise' (raise an error)
    hasconst: whether the right-hand side (RHS) includes a user-supplied constant; None/bool
      #If True, no check for a constant is made, k_constant is set to 1, and all result statistics are calculated as if a constant is present
      #If False, no check for a constant is made and k_constant is set to 0
    kwargs: extra arguments passed through when the formula interface is used

######################################################################################################################

Generalized Least Squares (GLS): class sm.GLS(<endog>,<exog>[,sigma=None,missing='none',hasconst=None,**kwargs])
  #Defined as class statsmodels.regression.linear_model.GLS
  #Parameters: the others are the same as for sm.OLS
    sigma: the weighting matrix of the covariance; None/scalar/array
      #The default is None for no scaling
      #If sigma is a scalar, it is assumed that sigma is an n x n diagonal matrix with the given scalar as the
      #value of each diagonal element
      #If sigma is an n-length vector, then sigma is assumed to be a diagonal matrix with the given sigma on the
      #diagonal
      #With a diagonal sigma the result should be the same as WLS

######################################################################################################################

Generalized Least Squares with AR covariance structures: class sm.GLSAR(<endog>,<exog>[,rho=1,missing='none',hasconst=None,**kwargs])
  #Defined as class statsmodels.regression.linear_model.GLSAR

######################################################################################################################

Weighted Least Squares (WLS): class sm.WLS(<endog>,<exog>[,weights=1.0,missing='none',hasconst=None,**kwargs])
  #Defined as class statsmodels.regression.linear_model.WLS
  #Parameters: the others are the same as for sm.OLS
    weights: the weights; scalar/1-D array-like

######################################################################################################################

Recursive Least Squares: class sm.RecursiveLS(<endog>,<exog>[,constraints=None,**kwargs])
  #Defined as class statsmodels.regression.recursive_ls.RecursiveLS

######################################################################################################################

Rolling Ordinary Least Squares: class sm.RollingOLS(<endog>,<exog>[,window=None,min_nobs=None,missing='drop',expanding=False])
  #Defined as class statsmodels.regression.rolling.RollingOLS

######################################################################################################################

Rolling Weighted Least Squares: class sm.RollingWLS(<endog>,<exog>[,window=None,weights=None,min_nobs=None,missing='drop',expanding=False])
  #Defined as class statsmodels.regression.rolling.RollingWLS
           

(3) Imputation (handling missing values):

Bayesian imputation using a Gaussian model: class sm.BayesGaussMI(<data>[,mean_prior=None,cov_prior=None,cov_prior_df=1])
  #Defined as class statsmodels.imputation.bayes_mi.BayesGaussMI
Generalized Linear Mixed Model with Bayesian estimation: class sm.BinomialBayesMixedGLM(<endog>,<exog>,<exog_vc>,<ident>[,vcp_p=1,fe_p=2,fep_names=None,vcp_names=None,vc_names=None])
  #Defined as class statsmodels.genmod.bayes_mixed_glm.BinomialBayesMixedGLM
  #Note: this is a mixed GLM rather than an imputation tool
Factor analysis: class sm.Factor([endog=None,n_factor=1,corr=None,method='pa',smc=True,endog_names=None,nobs=None,missing='drop'])
  #Defined as class statsmodels.multivariate.factor.Factor
  #Note: this is a multivariate model rather than an imputation tool
Multiple Imputation (MI) based on a given imputer: class sm.MI(<imp>,<model>[,model_args_fn=None,model_kwds_fn=None,formula=None,fit_args=None,fit_kwds=None,xfunc=None,burn=100,nrep=20,skip=10])
  #Defined as class statsmodels.imputation.bayes_mi.MI
Multiple imputation with chained equations: class sm.MICE(<model_formula>,<model_class>,<data>[,n_skip=3,init_kwds=None,fit_kwds=None])
  #Defined as class statsmodels.imputation.mice.MICE
Wrap a data set so that sm.MICE can handle its missing values: class sm.MICEData(<data>[,perturbation_method='gaussian',k_pmm=20,history_callback=None])
  #Defined as class statsmodels.imputation.mice.MICEData
           

(4) Generalized Estimating Equations (GEE):

Marginal regression model using GEE: class sm.GEE(<endog>,<exog>,<groups>[,time=None,family=None,cov_struct=None,missing='none',offset=None,exposure=None,dep_data=None,constraint=None,update_dep=True,weights=None,**kwargs])
  #Defined as class statsmodels.genmod.generalized_estimating_equations.GEE
Nominal response marginal regression model using GEE: class sm.NominalGEE(<endog>,<exog>,<groups>[,time=None,family=None,cov_struct=None,missing='none',offset=None,dep_data=None,constraint=None,**kwargs])
  #Defined as class statsmodels.genmod.generalized_estimating_equations.NominalGEE
Ordinal response marginal regression model using GEE: class sm.OrdinalGEE(<endog>,<exog>,<groups>[,time=None,family=None,cov_struct=None,missing='none',offset=None,dep_data=None,constraint=None,**kwargs])
  #Defined as class statsmodels.genmod.generalized_estimating_equations.OrdinalGEE
           

(5) Generalized Linear Models (GLM):

Generalized Linear Models (GLM): class sm.GLM(<endog>,<exog>[,family=None,offset=None,exposure=None,freq_weights=None,var_weights=None,missing='none',**kwargs])
  #Defined as class statsmodels.genmod.generalized_linear_model.GLM
Generalized Additive Models (GAM): class sm.GLMGam(<endog>,<exog>[,smoother=None,alpha=0,family=None,offset=None,exposure=None,missing='none',**kwargs])
  #Defined as class statsmodels.gam.generalized_additive_model.GLMGam
Generalized Linear Mixed Model with Bayesian estimation: class sm.PoissonBayesMixedGLM(<endog>,<exog>,<exog_vc>,<ident>[,vcp_p=1,fe_p=2,fep_names=None,vcp_names=None,vc_names=None])
  #Defined as class statsmodels.genmod.bayes_mixed_glm.PoissonBayesMixedGLM
           

(6) Discrete and Count Models:

Generalized Poisson model: class sm.GeneralizedPoisson(<endog>,<exog>[,p=1,offset=None,exposure=None,missing='none',check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.GeneralizedPoisson
Logit model: class sm.Logit(<endog>,<exog>[,check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.Logit
Multinomial Logit model: class sm.MNLogit(<endog>,<exog>[,check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.MNLogit
Poisson model: class sm.Poisson(<endog>,<exog>[,offset=None,exposure=None,missing='none',check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.Poisson
Probit model: class sm.Probit(<endog>,<exog>[,check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.Probit
Negative Binomial model: class sm.NegativeBinomial(<endog>,<exog>[,loglike_method='nb2',offset=None,exposure=None,missing='none',check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.NegativeBinomial
Generalized Negative Binomial (NB-P) model: class sm.NegativeBinomialP(<endog>,<exog>[,p=2,offset=None,exposure=None,missing='none',check_rank=True,**kwargs])
  #Defined as class statsmodels.discrete.discrete_model.NegativeBinomialP
Zero-inflated Generalized Poisson model: class sm.ZeroInflatedGeneralizedPoisson(<endog>,<exog>[,exog_infl=None,offset=None,exposure=None,inflation='logit',p=2,missing='none',**kwargs])
  #Defined as class statsmodels.discrete.count_model.ZeroInflatedGeneralizedPoisson
Zero-inflated Generalized Negative Binomial model: class sm.ZeroInflatedNegativeBinomialP(<endog>,<exog>[,exog_infl=None,offset=None,exposure=None,inflation='logit',p=2,missing='none',**kwargs])
  #Defined as class statsmodels.discrete.count_model.ZeroInflatedNegativeBinomialP
Zero-inflated Poisson model: class sm.ZeroInflatedPoisson(<endog>,<exog>[,exog_infl=None,offset=None,exposure=None,inflation='logit',missing='none',**kwargs])
  #Defined as class statsmodels.discrete.count_model.ZeroInflatedPoisson
           

(7) Multivariate Models:

Multivariate Analysis of Variance (MANOVA): class sm.MANOVA(<endog>,<exog>[,missing='none',hasconst=None,**kwargs])
  #Defined as class statsmodels.multivariate.manova.MANOVA
Principal Component Analysis (PCA): class sm.PCA(<data>[,ncomp=None,standardize=True,demean=True,normalize=True,gls=False,weights=None,method='svd',missing=None,tol=5e-08,max_iter=1000,tol_em=5e-08,max_em_iter=100])
  #Defined as class statsmodels.multivariate.pca.PCA
           

(8) Misc Models:

Linear Mixed Effects model: class sm.MixedLM(<endog>,<exog>,<groups>[,exog_re=None,exog_vc=None,use_sqrt=True,missing='none',**kwargs])
  #Defined as class statsmodels.regression.mixed_linear_model.MixedLM
Cox Proportional Hazards regression model: class sm.PHReg(<endog>,<exog>[,status=None,entry=None,strata=None,offset=None,ties='breslow',missing='drop',**kwargs])
  #Defined as class statsmodels.duration.hazard_regression.PHReg
Quantile Regression: class sm.QuantReg(<endog>,<exog>[,**kwargs])
  #Defined as class statsmodels.regression.quantile_regression.QuantReg
Robust Linear Model: class sm.RLM(<endog>,<exog>[,M=None,missing='none',**kwargs])
  #Defined as class statsmodels.robust.robust_linear_model.RLM
Estimation and inference for a survival function: class sm.SurvfuncRight(<time>,<status>[,entry=None,title=None,freq_weights=None,exog=None,bw_factor=1.0])
  #Defined as class statsmodels.duration.survfunc.SurvfuncRight
           

(9) Graphics:

Q-Q and P-P probability plots: class sm.ProbPlot(<data>[,dist=scipy.stats.norm,fit=False,distargs=(),a=0,loc=0,scale=1])
  #Defined as class statsmodels.graphics.gofplots.ProbPlot
Plot a reference line for a qqplot: sm.qqline(<ax>,<line>[,x=None,y=None,dist=None,fmt='r-',**lineoptions])
  #Defined as statsmodels.graphics.gofplots.qqline()
Q-Q plot of the quantiles of x versus the quantiles/ppf of a distribution: sm.qqplot(<data>[,dist=scipy.stats.norm,distargs=(),a=0,loc=0,scale=1,fit=False,line=None,ax=None,**plotkwargs])
  #Defined as statsmodels.graphics.gofplots.qqplot()
Q-Q plot of two samples' quantiles: sm.qqplot_2samples(<data1>,<data2>[,xlabel=None,ylabel=None,line=None,ax=None])
  #Defined as statsmodels.graphics.gofplots.qqplot_2samples()
           

(10) Tools:

Run the test suite: sm.test([extra_args=None,exit=False])
  #Defined as statsmodels.__init__.test()
Add a column of ones to an array: sm.add_constant(<data>[,prepend=True,has_constant='skip'])
  #Defined as statsmodels.tools.tools.add_constant()
Load a previously saved object: sm.load_pickle(<fname>)
  #Defined as statsmodels.iolib.smpickle.load_pickle()
List the versions of statsmodels and any installed dependencies: sm.show_versions([show_dirs=True])
  #Defined as statsmodels.tools.print_version.show_versions()
Open a browser and display the online documentation: sm.webdoc([func=None,stable=None])
  #Defined as statsmodels.tools.web.webdoc()
           

3. Time-Series Study

(1) Import:

#Usually imported as tsa:
import statsmodels.tsa.api as tsa
           

(2) Statistics and Tests:

Autocorrelation function: tsa.acf(<x>[,adjusted=False,nlags=None,qstat=False,fft=None,alpha=None,missing='none'])
  #Defined as statsmodels.tsa.stattools.acf()
Estimate autocovariances: tsa.acovf(<x>[,adjusted=False,demean=True,fft=None,missing='none',nlag=None])
  #Defined as statsmodels.tsa.stattools.acovf()
Augmented Dickey-Fuller unit root test: tsa.adfuller(<x>[,maxlag=None,regression='c',autolag='AIC',store=False,regresults=False])
  #Defined as statsmodels.tsa.stattools.adfuller()
BDS test statistic for independence of a time series: tsa.bds(<x>[,max_dim=2,epsilon=None,distance=1.5])
  #Defined as statsmodels.tsa.stattools.bds()
Cross-correlation function: tsa.ccf(<x>,<y>[,adjusted=True])
  #Defined as statsmodels.tsa.stattools.ccf()
Cross-covariance between two series: tsa.ccovf(<x>,<y>[,adjusted=True,demean=True])
  #Defined as statsmodels.tsa.stattools.ccovf()
Test for no-cointegration of a univariate equation: tsa.coint(<y0>,<y1>[,trend='c',method='aeg',maxlag=None,autolag='aic',return_results=None])
  #Defined as statsmodels.tsa.stattools.coint()
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test: tsa.kpss(<x>[,regression='c',nlags=None,store=False])
  #Defined as statsmodels.tsa.stattools.kpss()
Partial autocorrelation estimate: tsa.pacf(<x>[,nlags=None,method='ywadjusted',alpha=None])
  #Defined as statsmodels.tsa.stattools.pacf()
Partial autocorrelation estimated with OLS: tsa.pacf_ols(<x>[,nlags=None,efficient=True,adjusted=False])
  #Defined as statsmodels.tsa.stattools.pacf_ols()
Partial autocorrelation estimated with non-recursive Yule-Walker: tsa.pacf_yw(<x>[,nlags=None,method='adjusted'])
  #Defined as statsmodels.tsa.stattools.pacf_yw()
Ljung-Box Q statistic: tsa.q_stat(<x>,<nobs>[,type=None])
  #Defined as statsmodels.tsa.stattools.q_stat()
           

(3) Univariate Time-Series Analysis:

Autoregressive (AR) model: class tsa.AutoReg(<endog>,<lags>[,trend='c',seasonal=False,exog=None,hold_back=None,period=None,missing='none',deterministic=None,old_names=None])
  #Defined as class statsmodels.tsa.ar_model.AutoReg
Autoregressive Integrated Moving Average (ARIMA) model: class tsa.ARIMA(<endog>[,exog=None,order=(0,0,0),seasonal_order=(0,0,0,0),trend=None,enforce_stationarity=True,enforce_invertibility=True,concentrate_scale=False,trend_offset=1,dates=None,freq=None,missing='none',validate_specification=True])
  #Defined as class statsmodels.tsa.arima.model.ARIMA
Seasonal AutoRegressive Integrated Moving Average model with exogenous regressors (SARIMAX): class tsa.SARIMAX(<endog>[,exog=None,order=(0,0,0),seasonal_order=(0,0,0,0),trend=None,measurement_error=False,time_varying_regression=False,mle_regression=True,simple_differencing=False,enforce_stationarity=True,enforce_invertibility=True,hamilton_representation=False,concentrate_scale=False,trend_offset=1,use_exact_diffuse=False,dates=None,freq=None,missing='none',validate_specification=True,**kwargs])
  #Defined as class statsmodels.tsa.statespace.sarimax.SARIMAX
Compute information criteria for many ARMA models: tsa.arma_order_select_ic(<y>[,max_ar=4,max_ma=2,ic='bic',trend='c',model_kw=None,fit_kw=None])
  #Defined as statsmodels.tsa.stattools.arma_order_select_ic()
Simulate data from an ARMA process: tsa.arma_generate_sample(<ar>,<ma>,<nsample>[,scale=1,distrvs=None,axis=0,burnin=0])
  #Defined as statsmodels.tsa.arima_process.arma_generate_sample()
Theoretical properties of an ARMA process for specified lag polynomials: class tsa.ArmaProcess([ar=None,ma=None,nobs=100])
  #Defined as class statsmodels.tsa.arima_process.ArmaProcess
           

(4) Exponential Smoothing:

Holt-Winters exponential smoothing: class tsa.ExponentialSmoothing(<endog>[,trend=None,damped_trend=False,seasonal=None,seasonal_periods=None,initialization_method=None,initial_level=None,initial_trend=None,initial_seasonal=None,use_boxcox=None,bounds=None,dates=None,freq=None,missing='none'])
  #Defined as class statsmodels.tsa.holtwinters.ExponentialSmoothing
Holt's exponential smoothing: class tsa.Holt(<endog>[,exponential=False,damped_trend=False,initialization_method=None,initial_level=None,initial_trend=None])
  #Defined as class statsmodels.tsa.holtwinters.Holt
Simple exponential smoothing: class tsa.SimpleExpSmoothing(<endog>[,initialization_method=None,initial_level=None])
  #Defined as class statsmodels.tsa.holtwinters.SimpleExpSmoothing
Linear exponential smoothing models (state-space form): class tsa.statespace.ExponentialSmoothing(<endog>[,trend=False,damped_trend=False,seasonal=None,initialization_method='estimated',initial_level=None,initial_trend=None,initial_seasonal=None,bounds=None,concentrate_scale=True,dates=None,freq=None,missing='none'])
  #Defined as class statsmodels.tsa.statespace.exponential_smoothing.ExponentialSmoothing
  #Note: the name tsa.ExponentialSmoothing refers to the Holt-Winters class above
ETS models: class tsa.ETSModel(<endog>[,error='add',trend=None,damped_trend=False,seasonal=None,seasonal_periods=None,initialization_method='estimated',initial_level=None,initial_trend=None,initial_seasonal=None,bounds=None,dates=None,freq=None,missing='none'])
  #Defined as class statsmodels.tsa.exponential_smoothing.ets.ETSModel
           

(5) Multivariate Time Series Models:

Dynamic factor model: class tsa.DynamicFactor(<endog>,<k_factors>,<factor_order>[,exog=None,error_order=0,error_var=False,error_cov_type='diagonal',enforce_stationarity=True,**kwargs])
  #Defined as class statsmodels.tsa.statespace.dynamic_factor.DynamicFactor
Dynamic factor model estimated with the Expectation-Maximization (EM) algorithm: class tsa.DynamicFactorMQ(<endog>[,k_endog_monthly=None,factors=1,factor_orders=1,factor_multiplicities=None,idiosyncratic_ar1=True,standardize=True,endog_quarterly=None,init_t0=False,obs_cov_diag=False,**kwargs])
  #Defined as class statsmodels.tsa.statespace.dynamic_factor_mq.DynamicFactorMQ
Fit a VAR(p) process and select the lag order: class tsa.VAR(<endog>[,exog=None,dates=None,freq=None,missing='none'])
  #Defined as class statsmodels.tsa.vector_ar.var_model.VAR
Vector Autoregressive Moving Average model with exogenous regressors (VARMAX): class tsa.VARMAX(<endog>[,exog=None,order=(1,0),trend='c',error_cov_type='unstructured',measurement_error=False,enforce_stationarity=True,enforce_invertibility=True,trend_offset=1,**kwargs])
  #Defined as class statsmodels.tsa.statespace.varmax.VARMAX
Fit a VAR process and estimate the structural components of A and B: class tsa.SVAR(<endog>,<svar_type>[,dates=None,freq=None,A=None,B=None,missing='none'])
  #Defined as class statsmodels.tsa.vector_ar.svar_model.SVAR
Vector Error Correction Model (VECM): class tsa.VECM(<endog>[,exog=None,exog_coint=None,dates=None,freq=None,missing='none',k_ar_diff=1,coint_rank=1,deterministic='nc',seasons=0,first_season=0])
  #Defined as class statsmodels.tsa.vector_ar.vecm.VECM
Univariate unobserved components time series model: class tsa.UnobservedComponents(<endog>[,level=False,trend=False,seasonal=None,freq_seasonal=None,cycle=False,autoregressive=None,exog=None,irregular=False,stochastic_level=False,stochastic_trend=False,stochastic_seasonal=True,stochastic_freq_seasonal=None,stochastic_cycle=False,damped_cycle=False,cycle_period_bounds=None,mle_regression=True,use_exact_diffuse=False,**kwargs])
  #Defined as class statsmodels.tsa.statespace.structural.UnobservedComponents
           

(6) Filters and Decompositions:

Seasonal decomposition using moving averages: tsa.seasonal_decompose(<x>[,model='additive',filt=None,period=None,two_sided=True,extrapolate_trend=0])
  #Defined as statsmodels.tsa.seasonal.seasonal_decompose()
Season-Trend decomposition using LOESS (STL): class tsa.STL(<endog>[,period=None,seasonal=7,trend=None,low_pass=None,seasonal_deg=0,trend_deg=0,low_pass_deg=0,robust=False,seasonal_jump=1,trend_jump=1,low_pass_jump=1])
  #Defined as class statsmodels.tsa.seasonal.STL
Baxter-King bandpass filter: tsa.bkfilter(<x>[,low=6,high=32,K=12])
  #Defined as statsmodels.tsa.filters.bk_filter.bkfilter()
Christiano-Fitzgerald asymmetric, random walk filter: tsa.cffilter(<x>[,low=6,high=32,drift=True])
  #Defined as statsmodels.tsa.filters.cf_filter.cffilter()
Hodrick-Prescott filter: tsa.hpfilter(<x>[,lamb=1600])
  #Defined as statsmodels.tsa.filters.hp_filter.hpfilter()
           

(7) Markov Regime Switching Models:

Markov switching autoregression model: class tsa.MarkovAutoregression(<endog>,<k_regimes>,<order>[,trend='c',exog=None,exog_tvtp=None,switching_ar=True,switching_trend=True,switching_exog=False,switching_variance=False,dates=None,freq=None,missing='none'])
  #Defined as class statsmodels.tsa.regime_switching.markov_autoregression.MarkovAutoregression
First-order k-regime Markov switching regression model: class tsa.MarkovRegression(<endog>,<k_regimes>[,trend='c',exog=None,order=0,exog_tvtp=None,switching_trend=True,switching_exog=True,switching_variance=False,dates=None,freq=None,missing='none'])
  #Defined as class statsmodels.tsa.regime_switching.markov_regression.MarkovRegression
           

(8) Forecasting:

Model-based forecasting using STL to remove seasonality: class tsa.STLForecast(<endog>,<model>[,model_kwargs=None,period=None,seasonal=7,trend=None,low_pass=None,seasonal_deg=1,trend_deg=1,low_pass_deg=1,robust=False,seasonal_jump=1,trend_jump=1,low_pass_jump=1])
  #Defined as class statsmodels.tsa.forecasting.stl.STLForecast
The Theta forecasting model of Assimakopoulos and Nikolopoulos (2000): class tsa.ThetaModel(<endog>[,period=None,deseasonalize=True,use_test=True,method='auto',difference=False])
  #Defined as class statsmodels.tsa.forecasting.theta.ThetaModel
           

(9) Time-Series Tools:

Return an array with lags included, given an array: tsa.add_lag(<x>[,col=None,lags=1,drop=False,insert=True])
  #Defined as statsmodels.tsa.tsatools.add_lag()
Add a trend and/or constant to an array: tsa.add_trend(<x>[,trend='c',prepend=False,has_constant='skip'])
  #Defined as statsmodels.tsa.tsatools.add_trend()
Detrend an array with a trend of given order along axis 0 or 1: tsa.detrend(<x>[,order=1,axis=0])
  #Defined as statsmodels.tsa.tsatools.detrend()
Create a 2-D array of lags: tsa.lagmat(<x>,<maxlag>[,trim='forward',original='ex',use_pandas=False])
  #Defined as statsmodels.tsa.tsatools.lagmat()
Generate a lag matrix for a 2-D array, columns arranged by variables: tsa.lagmat2ds(<x>,<maxlag0>[,maxlagex=None,dropex=0,trim='forward',use_pandas=False])
  #Defined as statsmodels.tsa.tsatools.lagmat2ds()
Container class for deterministic terms: class tsa.DeterministicProcess(<index>[,period=None,constant=False,order=0,seasonal=False,fourier=0,additional_terms=(),drop=False])
  #Defined as class statsmodels.tsa.deterministic.DeterministicProcess
           

(10) X12/X13 Interface:

Perform x13-arima analysis for monthly or quarterly data: tsa.x13_arima_analysis(<endog>[,maxorder=(2,1),maxdiff=(2,1),diff=None,exog=None,log=None,outlier=True,trading=False,forecast_periods=None,retspec=False,speconly=False,start=None,freq=None,print_stdout=False,x12path=None,prefer_x13=True])
  #Defined as statsmodels.tsa.x13.x13_arima_analysis()
Perform automatic seasonal ARIMA order identification using x12/x13 ARIMA: tsa.x13_arima_select_order(<endog>[,maxorder=(2,1),maxdiff=(2,1),diff=None,exog=None,log=None,outlier=True,trading=False,forecast_periods=None,start=None,freq=None,print_stdout=False,x12path=None,prefer_x13=True])
  #Defined as statsmodels.tsa.x13.x13_arima_select_order()
           

4. Formula Interface

A convenience interface for specifying models using formula strings and DataFrames. This API directly exposes the
from_formula class method of models that support the formula API
           

(1) Import:

#Usually imported as smf:
import statsmodels.formula.api as smf
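
A minimal sketch of the formula interface, assuming a small synthetic DataFrame with made-up column names x and y. The lowercase smf.ols takes a formula string and a DataFrame, and the formula adds the intercept automatically:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = 1.0 + 2.0 * df["x"] + rng.normal(scale=0.5, size=200)

# No add_constant call is needed: "y ~ x" already includes an intercept
smf_res = smf.ols("y ~ x", data=df).fit()
print(smf_res.params)                           # indexed by "Intercept" and "x"
```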
           

(2) Models:

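II. The patsy module

Official documentation: https://pypi.org/project/patsy/

1. Overview

(1) Introduction:

patsy is a Python library for describing statistical models (especially linear models, or models that have linear components) and
for building design matrices. It is inspired by, and compatible with, the formula mini-language used in R and S, and brings the
convenience of R "formulas" to Python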
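
To illustrate what building a design matrix from a formula means, here is a small sketch using patsy's dmatrices function on a made-up DataFrame. The column names and values are assumptions for the example:

```python
import numpy as np
import pandas as pd
from patsy import dmatrices

df = pd.DataFrame({
    "y": [1.0, 2.0, 3.0, 4.0],
    "x1": [0.0, 1.0, 0.0, 1.0],
    "x2": [2.0, 1.0, 4.0, 3.0],
})

# Left of "~" is the response; right of it are the predictors (an intercept is added)
y_mat, X_mat = dmatrices("y ~ x1 + x2", data=df)

print(X_mat.design_info.column_names)
print(np.asarray(X_mat))
```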
           

(2) Installation:

pip install patsy
           

2. Usage