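According to the official website, Kats is a toolkit for analyzing time series data: a lightweight, easy-to-use, and generalizable framework for performing time series analysis. Time series analysis is an essential part of data science and engineering work in industry, from understanding key statistics and characteristics, to detecting regressions and anomalies, to forecasting future trends. Kats aims to be a one-stop shop for time series analysis, covering detection, forecasting, feature extraction/embedding, multivariate analysis, and more. Kats was released by Facebook's Infrastructure Strategy team and can be downloaded from PyPI. Saving it here for later: the next time I need to do time series analysis, this package is ready to use. Nice.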
![](https://img.laitimes.com/img/__Qf2AjLwojIjJCLyojI0JCLiMGc902byZ2P4ADOiNjNyUGM0AzNjFGNhVmY5QzYkFmZyYTM2YjY2U2LcBza5QTcsJja2FXLp1ibj1ycvR3Lc5Wanlmcv9CXt92YucWbp9WYpRXdvRnL2A3Lc9CX6MHc0RHaiojIsJye.jpg)
- Homepage: https://facebookresearch.github.io/Kats/
- Source code repository: https://github.com/facebookresearch/kats
- Contributing: https://github.com/facebookresearch/Kats/blob/master/CONTRIBUTING.md
- Tutorials: https://github.com/facebookresearch/Kats/tree/master/tutorials
- Kats Python package: https://pypi.org/project/kats/0.1/
- Kats website: https://facebookresearch.github.io/Kats/
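Examples:
1 Forecasting
Kats wraps many models, including traditional linear time series models, provides an ensemble API, and can combine time series features for meta-learning.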
import pandas as pd
from kats.consts import TimeSeriesData
from kats.models.prophet import ProphetModel, ProphetParams
# take `air_passengers` data as an example;
# rename the columns because TimeSeriesData looks for a "time" column by default
air_passengers_df = pd.read_csv(
    "../kats/data/air_passengers.csv",
    header=0,
    names=["time", "passengers"],
)
# convert to TimeSeriesData object
air_passengers_ts = TimeSeriesData(air_passengers_df)
# create a model param instance
params = ProphetParams(seasonality_mode='multiplicative') # additive mode gives worse results
# create a prophet model instance
m = ProphetModel(air_passengers_ts, params)
# fit model simply by calling m.fit()
m.fit()
# make prediction for next 30 months ("MS" = month-start frequency)
fcst = m.predict(steps=30, freq="MS")
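The comment about additive mode giving worse results makes sense once you see how the two seasonality modes differ. The following is a minimal sketch in plain Python, not Prophet's or Kats's actual model: `seasonal_series` and `yearly_range` are hypothetical helpers, and the trend and amplitude numbers are made up for illustration. In the additive series the yearly swing keeps a fixed size, while in the multiplicative series it grows with the level, which is the pattern typically seen in airline passenger counts.

```python
import math


def seasonal_series(n_months, mode):
    """Build a toy monthly series with a linear trend and a yearly cycle."""
    values = []
    for t in range(n_months):
        trend = 100.0 + 2.0 * t                 # steadily rising level
        cycle = math.sin(2 * math.pi * t / 12)  # yearly pattern in [-1, 1]
        if mode == "additive":
            values.append(trend + 20.0 * cycle)        # fixed-size swing
        elif mode == "multiplicative":
            values.append(trend * (1 + 0.2 * cycle))   # swing scales with the level
        else:
            raise ValueError(f"unknown mode: {mode}")
    return values


def yearly_range(values, year):
    """Peak-to-trough spread within one 12-month window."""
    window = values[12 * year: 12 * (year + 1)]
    return max(window) - min(window)


additive = seasonal_series(120, "additive")
multiplicative = seasonal_series(120, "multiplicative")

# Compare the spread in the first and the last simulated year.
print("additive:", yearly_range(additive, 0), yearly_range(additive, 9))
print("multiplicative:", yearly_range(multiplicative, 0), yearly_range(multiplicative, 9))
```

If the spread of your own data grows along with its level, that is a hint to try `seasonality_mode='multiplicative'` first.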
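2 Anomaly detection
Run the CUSUM detection algorithm on a simulated dataset to check the time series for anomalies, here a shift in the mean.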
# import packages
import numpy as np
import pandas as pd
from kats.consts import TimeSeriesData
from kats.detectors.cusum_detection import CUSUMDetector
# simulate time series with increase: 30 points around 1, then 30 points around 2
np.random.seed(10)
df_increase = pd.DataFrame(
    {
        'time': pd.date_range('2019-01-01', '2019-03-01'),
        'increase': np.concatenate([np.random.normal(1, 0.2, 30), np.random.normal(2, 0.2, 30)]),
    }
)
# convert to TimeSeriesData object
timeseries = TimeSeriesData(df_increase)
# run detector and find change points
change_points = CUSUMDetector(timeseries).detector()
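To give some intuition for what a CUSUM-style detector looks for, here is a simplified sketch in plain Python. It is not the algorithm Kats implements (which, among other things, also assesses whether a detected change is significant); `cusum_change_point` is a hypothetical helper that only locates the point where the cumulative sum of deviations from the overall mean is most extreme.

```python
import random


def cusum_change_point(values):
    """Return the index where the cumulative sum of deviations from the
    overall mean is most extreme: a simple single-change-point estimate."""
    mean = sum(values) / len(values)
    running = 0.0
    best_index, best_magnitude = 0, 0.0
    for i, v in enumerate(values):
        running += v - mean              # accumulate the deviation from the mean
        if abs(running) > best_magnitude:
            best_index, best_magnitude = i, abs(running)
    return best_index


# Same shape as the simulated data above: 30 points near 1, then 30 points near 2.
rng = random.Random(10)
series = [rng.gauss(1, 0.2) for _ in range(30)] + [rng.gauss(2, 0.2) for _ in range(30)]

change_index = cusum_change_point(series)
print("estimated change point index:", change_index)
```

Before the shift every point sits below the overall mean, so the running sum keeps falling; after the shift it climbs back. The turning point is therefore expected to lie close to the boundary between the two segments.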
3 Feature extraction
Extract meaningful features from a given time series; this yields 65 time-series-related features.
# import the feature extraction class
from kats.tsfeatures.tsfeatures import TsFeatures
# extract features from the `air_passengers_ts` object built in the forecasting example
features = TsFeatures().transform(air_passengers_ts)
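As a rough picture of what "features" means here, the sketch below computes a handful of summary statistics by hand in plain Python. The helpers `lag1_autocorrelation` and `basic_features` and the dictionary keys are illustrative assumptions; they are not taken from Kats and may not match the names or exact definitions it uses.

```python
from statistics import fmean, pvariance


def lag1_autocorrelation(values):
    """Correlation between the series and itself shifted by one step."""
    m = fmean(values)
    numerator = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    denominator = sum((v - m) ** 2 for v in values)
    return numerator / denominator if denominator else 0.0


def basic_features(values):
    """Map a series to a small dictionary of summary features."""
    return {
        "length": len(values),
        "mean": fmean(values),
        "var": pvariance(values),   # population variance
        "lag1_acf": lag1_autocorrelation(values),
    }


toy = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_features = basic_features(toy)
print(toy_features)
```

A fixed-length feature dictionary like this is what makes series of different lengths comparable, which is why such features are useful as inputs for tasks like meta-learning.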
4 Simulation
import numpy as np
from kats.utils.simulator import Simulator
sim = Simulator(n=1000, freq="D", start="2021-12-01") # simulate 1000 days of data
# generate 10 TimeSeriesData with ARIMA dynamics
arima_sim_list = [sim.arima_sim(ar=[0.1, 0.05], ma=[0.04, 0.1], d=1) for _ in range(10)]
# generate 10 TimeSeriesData with trend shifts
trend_sim_list = [
    sim.trend_shift_sim(
        cp_arr=[30, 60, 75],
        trend_arr=[3, 15, 2, 8],
        intercept=30,
        noise=50,
        seasonal_period=7,
        seasonal_magnitude=np.random.uniform(10, 100),
        random_seed=random_seed,  # use the loop index as the per-series seed
    ) for random_seed in range(10)
]
# generate 10 TimeSeriesData with level shifts
level_shift_list = [
    sim.level_shift_sim(
        cp_arr=[30, 60, 75],
        level_arr=[1.35, 1.05, 1.35, 1.2],
        noise=0.05,
        seasonal_period=7,
        seasonal_magnitude=np.random.uniform(0.1, 1.0),
        random_seed=random_seed,  # use the loop index as the per-series seed
    ) for random_seed in range(10)
]
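To make the level-shift parameters more concrete, here is a from-scratch sketch of what such a simulation could look like in plain Python. It is an assumption about the general idea (piecewise-constant level plus a seasonal wave plus Gaussian noise), not a reproduction of how Kats's `Simulator` generates data; `simulate_level_shift` and its parameter names are hypothetical.

```python
import math
import random


def simulate_level_shift(n, change_points, levels, noise,
                         seasonal_period, seasonal_magnitude, seed):
    """Piecewise-constant level + sinusoidal seasonality + Gaussian noise."""
    if len(levels) != len(change_points) + 1:
        raise ValueError("need exactly one more level than change points")
    rng = random.Random(seed)   # seeded generator, so each series is reproducible
    series = []
    segment = 0
    for t in range(n):
        # move on to the next level once a change point is reached
        while segment < len(change_points) and t >= change_points[segment]:
            segment += 1
        seasonal = seasonal_magnitude * math.sin(2 * math.pi * t / seasonal_period)
        series.append(levels[segment] + seasonal + rng.gauss(0, noise))
    return series


# Ten series of 90 points with the same change points and levels as above,
# each generated with a different seed.
simulated = [
    simulate_level_shift(
        n=90,
        change_points=[30, 60, 75],
        levels=[1.35, 1.05, 1.35, 1.2],
        noise=0.05,
        seasonal_period=7,
        seasonal_magnitude=0.1,
        seed=seed,
    )
    for seed in range(10)
]
print(len(simulated), len(simulated[0]))
```

Note that three change points split the series into four segments, which is why `levels` (like `level_arr` and `trend_arr` above) has one more entry than the list of change points.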