Backtesting
Backtesting simulates predictions that would have been obtained historically with a given model. Backtesting is used to predict the future accuracy of a forecasting method, and is therefore useful in determining which model can be considered the most accurate .
Backtest is an iterative procedure,backtest repeatly predict on dataset with a fixed predict window, and then moves to the end of the training set forward by fixed steps. As in above figure, the orange part is a predict window with length 3. In each new iteration, the window will move forward by 3 points, and the train set length will forward 3 points either. This procedure repeats util moves to the end of the training set.
Example
1)Prepare Dataset
from paddlets.datasets.repository import get_dataset
dataset = get_dataset('WTH')
train_dataset, val_test_dataset = dataset.split(0.8)
val_dataset, test_dataset = val_test_dataset.split(0.5)
train_dataset.plot(add_data=[val_dataset,test_dataset],labels=["val","test"])
2)Fit model
from paddlets.models.forecasting import MLPRegressor
mlp = MLPRegressor(
in_chunk_len = 7 * 96,
out_chunk_len = 96,
max_epochs=100
)
mlp.fit(train_dataset, val_dataset)
3)Backtesting
Five examples are given below. For more bascktest features please read Backtesting API doc .
Backtesting Example 1
Backtest will start at model input_chunk_length and return a MSE score by Default.
from paddlets.utils import backtest
score= backtest(
data=test_dataset,
model=mlp
)
print(score)
#{'WetBulbCelsius': 14.125747060941888}
Backtesting Example 2
User define metrics.
from paddlets.utils import backtest
from paddlets.metrics import MAE
mae = MAE()
score = backtest(
data=test_dataset,
model=mlp,
metric = mae)
print(score)
Backtesting Example 3
If set return_predicts
to True, Backtest will return both score and predictions.
from paddlets.utils import backtest
score, preds_data= backtest(
data=test_dataset,
model=mlp,
return_predicts = True)
val_test_dataset.plot(add_data=preds_data,labels="backtest")
Backtesting Example 4
start
can control the start point of backtest, If set start
to 0.5, Backtest will start at the middle of dataset.
from paddlets.utils import backtest
score, preds_data= backtest(
data=test_dataset,
model=mlp,
start =0.5,
return_predicts = True)
test_dataset.plot(add_data=preds_data,labels="backtest")
Backtesting Example 5
predict_window
is the window for the prediction.(Equal to model.out_chunk_len by default)
stride
is the number of time steps between two consecutive predict window. (Equal to predict_window
by default)
In most situations, predict_window
and stride
should be set to simulate the realy prediction.
from paddlets.utils import backtest
score, preds_data = backtest(
data=test_dataset,
model=mlp,
start =0.5,
predict_window=1,
stride=1,
return_predicts = True)
test_dataset.plot(add_data=preds_data,labels="backtest")
Backtesting Example 6
If set predict_window
!= stride
and return_predicts
= True, backtest will generate a List of TSdataset as predictions.
Because the predict results are overlaped in this situation.
from paddlets.utils import backtest
score, preds_data = backtest(
data=test_dataset,
model=mlp,
predict_window=24,
stride=12,
return_predicts = True)
type(preds_data)
#list[TSdataset]