[Quant Frontier] Kats: Facebook's Open-Source Time Series Toolkit

Author: 泓裕金工

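According to the official site, Kats is a toolkit for analyzing time series data: a lightweight, easy-to-use, and generalizable framework for time series analysis. Time series analysis is an essential part of data science and engineering work in industry, from understanding key statistics and characteristics, to detecting regressions and anomalies, to forecasting future trends. Kats aims to be a one-stop shop for time series analysis, covering detection, forecasting, feature extraction/embedding, multivariate analysis, and more. Kats was released by Facebook's Infrastructure Strategy team and can be downloaded from PyPI. It is worth bookmarking: for future time series work it offers a ready-made package to build on.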

  • Homepage: https://facebookresearch.github.io/Kats/
  • Source code repository: https://github.com/facebookresearch/kats
  • Contributing: https://github.com/facebookresearch/Kats/blob/master/CONTRIBUTING.md
  • Tutorials: https://github.com/facebookresearch/Kats/tree/master/tutorials
  • Kats Python package: https://pypi.org/project/kats/0.1/
  • Kats website: https://facebookresearch.github.io/Kats/

Examples:

1 Forecasting

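Kats wraps many forecasting models, including traditional linear time series models, provides an ensemble API, and can combine models with time series features for meta-learning.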

import pandas as pd

from kats.consts import TimeSeriesData
from kats.models.prophet import ProphetModel, ProphetParams

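# take `air_passengers` data as an example; rename the columns so that the
# timestamp column is called "time", which TimeSeriesData looks for by default
air_passengers_df = pd.read_csv(
    "../kats/data/air_passengers.csv",
    header=0,
    names=["time", "passengers"],
)
air_passengers_ts = TimeSeriesData(air_passengers_df)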

# create a model param instance
params = ProphetParams(seasonality_mode='multiplicative') # additive mode gives worse results

# create a prophet model instance
m = ProphetModel(air_passengers_ts, params)

# fit model simply by calling m.fit()
m.fit()

# make predictions for the next 30 months
fcst = m.predict(steps=30, freq="MS")
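
The ensemble idea mentioned in this section can be illustrated without Kats. The following is a minimal sketch in plain Python, not Kats's ensemble API: three hypothetical baseline forecasters (`naive_forecast`, `seasonal_naive_forecast`, `drift_forecast`, all names invented here) are combined by taking the median forecast at each step. The toy data and the period of 3 are assumptions for illustration only.

```python
from statistics import median


def naive_forecast(history, steps):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * steps


def seasonal_naive_forecast(history, steps, period):
    """Repeat the most recent `period` observations cyclically."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(steps)]


def drift_forecast(history, steps):
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(steps)]


def median_ensemble(history, steps, period):
    """Combine the three baselines by taking the median at each step."""
    forecasts = [
        naive_forecast(history, steps),
        seasonal_naive_forecast(history, steps, period),
        drift_forecast(history, steps),
    ]
    return [median(step_values) for step_values in zip(*forecasts)]


# toy series with a repeating pattern of length 3 (illustrative data only)
history = [10, 20, 30, 12, 22, 32, 14, 24]
ensemble_forecast = median_ensemble(history, steps=3, period=3)
print(ensemble_forecast)
```

A median is used here because it is less sensitive than a mean to one badly wrong base model; other aggregation rules, such as a weighted average, would fit the same structure.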

2 Anomaly detection

Run the CUSUM detection algorithm on a simulated dataset to find change points, that is, anomalous shifts in the time series.

# import packages
import numpy as np
import pandas as pd

from kats.consts import TimeSeriesData
from kats.detectors.cusum_detection import CUSUMDetector

# simulate time series with increase
np.random.seed(10)
df_increase = pd.DataFrame(
    {
        'time': pd.date_range('2019-01-01', '2019-03-01'),
        'increase':np.concatenate([np.random.normal(1,0.2,30), np.random.normal(2,0.2,30)]),
    }
)

# convert to TimeSeriesData object
timeseries = TimeSeriesData(df_increase)

# run detector and find change points
change_points = CUSUMDetector(timeseries).detector()           
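
To make the CUSUM idea concrete, here is a minimal sketch in plain Python. It is not Kats's implementation and covers only the step that locates the change point: it accumulates deviations from the overall mean and picks the index where the absolute cumulative sum peaks. The function name `cusum_change_point` is hypothetical, and the simulated data only mirrors the shape of the example above (30 points around 1 followed by 30 points around 2).

```python
import random
from statistics import mean


def cusum_change_point(values):
    """Return the index of the last point before the most likely mean shift.

    S_t is the running sum of deviations from the overall mean. For a single
    level shift, |S_t| peaks at the last point of the first regime.
    """
    overall_mean = mean(values)
    running_sum = 0.0
    best_index, best_magnitude = 0, 0.0
    for index, value in enumerate(values):
        running_sum += value - overall_mean
        if abs(running_sum) > best_magnitude:
            best_index, best_magnitude = index, abs(running_sum)
    return best_index


# simulate a level increase after the 30th observation (illustrative data)
random.seed(10)
series = [random.gauss(1, 0.2) for _ in range(30)]
series += [random.gauss(2, 0.2) for _ in range(30)]

change_index = cusum_change_point(series)
print(change_index)
```

A full detector would also test whether the shift is statistically significant before reporting it; this sketch skips that step and always returns an index.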

3 Feature extraction

Extract meaningful features from the given time series data; this yields 65 time-series features.

# Initialize the feature extraction class and apply it to the
# `air_passengers_ts` object created in the forecasting example
from kats.tsfeatures.tsfeatures import TsFeatures

features = TsFeatures().transform(air_passengers_ts)
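
To give a feel for what such features look like, the sketch below computes four simple ones in plain Python. It is an illustration under stated assumptions, not the output of `TsFeatures`: the function name `basic_features` and the dictionary keys are invented here, and the real Kats feature set is much larger.

```python
from statistics import mean, pvariance


def basic_features(values):
    """Compute a few simple summary features of a series.

    Assumes at least two observations and a non-constant series
    (a constant series has zero variance, so the autocorrelation is undefined).
    """
    n = len(values)
    mu = mean(values)
    var = pvariance(values, mu)

    # lag-1 autocorrelation: covariance of neighbouring points over variance
    acf1 = sum(
        (values[i] - mu) * (values[i + 1] - mu) for i in range(n - 1)
    ) / (n * var)

    # least-squares slope of the values against the time index 0..n-1
    t_mean = (n - 1) / 2
    slope = sum((t - t_mean) * (v - mu) for t, v in enumerate(values)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )

    return {
        "mean": mu,
        "variance": var,
        "lag1_autocorrelation": acf1,
        "linear_slope": slope,
    }


example_features = basic_features([1, 2, 3, 4, 5])
print(example_features)
```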

4 Simulation

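import numpy as np
from kats.utils.simulator import Simulator

sim = Simulator(n=1000, freq="D", start="2021-12-01")  # simulate 1000 days of data
random_seed = 100  # arbitrary seed value, reused by the simulations below

# generate 10 TimeSeriesData with arima_sim
np.random.seed(random_seed)
arima_sim_list = [sim.arima_sim(ar=[0.1, 0.05], ma=[0.04, 0.1], d=1) for _ in range(10)]

# generate 10 TimeSeriesData with trend shifts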
trend_sim_list = [
    sim.trend_shift_sim(
        cp_arr = [30, 60, 75],
        trend_arr=[3, 15, 2, 8],
        intercept=30,
        noise=50,
        seasonal_period=7,
        seasonal_magnitude=np.random.uniform(10, 100),
        random_seed=random_seed
    ) for _ in range(10)
]
# generate 10 TimeSeriesData with level shifts
level_shift_list = [
    sim.level_shift_sim(
        cp_arr = [30, 60, 75],
        level_arr=[1.35, 1.05, 1.35, 1.2],
        noise=0.05,
        seasonal_period=7,
        seasonal_magnitude=np.random.uniform(0.1, 1.0),
        random_seed=random_seed
    ) for _ in range(10)
]            
