Ensemble Anomaly Detection
1. Prepare Data
Load a built-in dataset.
from paddlets.datasets.repository import get_dataset
ts_data = get_dataset('NAB_TEMP') # label_col: 'label', feature_cols: 'value'
2. Data Preprocessing
Split off the training set, then standardize the data.
import paddle
import numpy as np
from paddlets.transform import StandardScaler
# Fix the random seeds so that runs are reproducible
seed = 2022
paddle.seed(seed)
np.random.seed(seed)
# Split at the 15% mark; the first part is used for training
train_tsdata, test_tsdata = ts_data.split(0.15)
# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler('value')
scaler.fit(train_tsdata)
train_tsdata_scaled = scaler.transform(train_tsdata)
test_tsdata_scaled = scaler.transform(test_tsdata)
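To make the preprocessing step concrete, the following is a minimal pure-Python sketch of what "fit on the training split, transform both splits" means. It is not PaddleTS code: the helper names `fit_standardizer` and `apply_standardizer`, the toy series, the 40% split ratio, and the use of the population standard deviation are all assumptions made for illustration.

```python
import statistics


def fit_standardizer(values):
    """Return (mean, std) estimated from the given values (hypothetical helper)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std; an assumption for this sketch
    # Guard against a constant series, which would otherwise divide by zero.
    return mean, (std if std > 0 else 1.0)


def apply_standardizer(values, mean, std):
    """Scale values with statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Toy series; the first 40% plays the role of the training split.
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 28.0]
split_at = int(len(series) * 0.4)
train, test = series[:split_at], series[split_at:]

# Statistics come from the training split only, so no test information leaks in.
mean, std = fit_standardizer(train)
train_scaled = apply_standardizer(train, mean, std)
test_scaled = apply_standardizer(test, mean, std)
```

Because the statistics come from the training split alone, the scaled test values are generally not centered on zero, which is expected.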
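3. Prepare Models
Prepare the base models for the ensemble. The in_chunk_len parameter has been lifted into the ensemble model, so it can be omitted from the base models' parameters.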
from paddlets.models.anomaly import AutoEncoder
from paddlets.models.anomaly import VAE
ae_params = {"max_epochs":100}
vae_params = {"max_epochs":100}
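In window-based anomaly models, in_chunk_len is typically the length of the input window that the model sees at each step. The sketch below illustrates that idea only; it is not PaddleTS's internal sampling code, and the `sliding_windows` helper and its `stride` parameter are hypothetical names introduced here.

```python
def sliding_windows(values, in_chunk_len, stride=1):
    """Cut a series into overlapping windows of length in_chunk_len (hypothetical helper)."""
    if in_chunk_len <= 0 or stride <= 0:
        raise ValueError("in_chunk_len and stride must be positive")
    last_start = len(values) - in_chunk_len
    # A series shorter than in_chunk_len yields no windows at all.
    return [values[i:i + in_chunk_len] for i in range(0, last_start + 1, stride)]


windows = sliding_windows([1.0, 2.0, 3.0, 4.0, 5.0], in_chunk_len=2)
```

Sharing one window length across the ensemble keeps every base model's output aligned to the same time steps, which is a plausible reason for setting it once at the ensemble level.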
4. Assemble and Fit the Model
For more information about the ensemble anomaly detection model, see the EnsembleAnomaly doc.
Example
from paddlets.ensemble import WeightingEnsembleAnomaly
model = WeightingEnsembleAnomaly(
    in_chunk_len=2,
    estimators=[(AutoEncoder, ae_params), (VAE, vae_params)],
    mode="voting")
model.fit(train_tsdata_scaled)
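As a rough illustration of what a "voting" combination can look like, the sketch below merges the binary anomaly labels of several detectors by strict majority. This is an assumption about the general technique, not a description of how WeightingEnsembleAnomaly implements its voting mode; the `majority_vote` helper, the three toy label lists, and the rule that ties count as normal are all hypothetical.

```python
def majority_vote(label_rows):
    """Combine per-model 0/1 anomaly labels into one label per time step."""
    if not label_rows:
        raise ValueError("need labels from at least one model")
    n_models = len(label_rows)
    length = len(label_rows[0])
    if any(len(row) != length for row in label_rows):
        raise ValueError("all models must label the same number of time steps")
    combined = []
    for step_labels in zip(*label_rows):
        votes = sum(step_labels)
        # Strict majority: a tie is treated as "normal" (an assumption of this sketch).
        combined.append(1 if votes * 2 > n_models else 0)
    return combined


# Labels from three hypothetical detectors over five time steps.
model_a = [0, 1, 1, 0, 1]
model_b = [0, 1, 0, 0, 1]
model_c = [1, 1, 0, 0, 0]
combined = majority_vote([model_a, model_b, model_c])
```

With only two base models, as in the example above, a strict majority requires both models to agree before a point is flagged, so the tie rule matters in practice.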
5. Predict and Evaluate
Use the trained model to predict, then evaluate the predictions.
from paddlets.metrics import F1, ACC, Precision, Recall
# Predict anomaly labels on the scaled test split
pred_label = model.predict(test_tsdata_scaled)
label_name = pred_label.target.data.columns[0]
# Compare the predictions with the ground-truth labels in the test split
f1 = F1()(test_tsdata, pred_label)
precision = Precision()(test_tsdata, pred_label)
recall = Recall()(test_tsdata, pred_label)
print('f1: ', f1[label_name])
print('precision: ', precision[label_name])
print('recall: ', recall[label_name])
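For readers who want to see what these three metrics measure, here is a small stand-alone sketch that computes them from binary labels using the standard definitions. It does not use or reproduce the paddlets.metrics classes; the `precision_recall_f1` helper and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
def precision_recall_f1(true_labels, pred_labels):
    """Compute precision, recall, and F1 for binary labels where 1 marks an anomaly."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal points flagged by mistake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


true_labels = [0, 1, 1, 0, 1, 0]
pred_labels = [0, 1, 0, 0, 1, 1]
precision_value, recall_value, f1_value = precision_recall_f1(true_labels, pred_labels)
```

Precision penalizes false alarms and recall penalizes missed anomalies; F1 is their harmonic mean, so it is high only when both are high.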