paddlets.datasets.splitter

class SplitterBase(skip_size: int = 0, verbose: bool = True)

Bases: object

Base class for all splitters.

Parameters
  • skip_size (int) – Number of samples to skip between the training data and the test data; defaults to 0.

  • verbose (bool) – Whether to turn on verbose mode; defaults to True.

Returns

None

Raises

None

split(dataset: TSDataset, return_index: bool = False) → Union[TSDataset, DatetimeIndex, RangeIndex]

Split a TSDataset.

Parameters
  • dataset (TSDataset) – Dataset to be split.

  • return_index (bool) – If True, return the index of each split; if False, return TSDataset objects. Defaults to False.

Returns

TSDataset|pd.DatetimeIndex|pd.RangeIndex

Raises

ValueError

class HoldoutSplitter(test_size: Union[int, float], skip_size: int = 0, verbose: bool = True)

Bases: SplitterBase

Holdout splitter

Split a TSDataset into a training set and a test set.

Parameters
  • test_size (int|float) – Size of the test set. An int is the number of samples; a float is the fraction of the dataset.

  • skip_size (int) – Number of samples to skip between the training data and the test data.

  • verbose (bool) – Whether to turn on verbose mode; defaults to True.

Returns

None

Example

1) With test_size = 5 and all other parameters left at their defaults, a series of 12 samples is split like this:

I * * * * * * * x x x x xI

* = training fold. x = test fold.

2) With test_size = 0.5 and all other parameters left at their defaults, a series of 12 samples is split like this:

I * * * * * * x x x x x xI

* = training fold. x = test fold.
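
The two holdout examples above can be reproduced with a few lines of plain Python. This is only a sketch of the index arithmetic, not the paddlets implementation: holdout_indices is a hypothetical helper, and truncating n_samples * test_size for a float ratio is an assumption that happens to agree with the 12-sample diagrams.

```python
from typing import List, Tuple, Union


def holdout_indices(
    n_samples: int, test_size: Union[int, float], skip_size: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train, test) positions for a single holdout split.

    Hypothetical helper for illustration; not part of paddlets.
    """
    # A float is read as a ratio of the series length, an int as a count.
    # Truncation of the ratio is an assumption made for this sketch.
    if isinstance(test_size, float):
        n_test = int(n_samples * test_size)
    else:
        n_test = test_size
    test_start = n_samples - n_test
    # skip_size samples directly before the test set are left out of training.
    train_end = test_start - skip_size
    if n_test <= 0 or train_end <= 0:
        raise ValueError("test_size and skip_size leave no room for both sets")
    return list(range(train_end)), list(range(test_start, n_samples))


train, test = holdout_indices(12, 5)
print(len(train), len(test))  # 7 5
train, test = holdout_indices(12, 0.5)
print(len(train), len(test))  # 6 6
```

Working on integer positions keeps the sketch independent of whether the real dataset is indexed by a DatetimeIndex or a RangeIndex.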

class ExpandingWindowSplitter(n_splits: int = 5, test_size: Union[None, int] = None, skip_size: int = 0, max_train_size: Optional[int] = None, verbose: bool = True)

Bases: SplitterBase

Expanding Window Splitter

Split a time series repeatedly into a growing training set and a fixed-size test set.

Parameters
  • n_splits (int) – Number of folds; must not be None.

  • test_size (int|None) – Size of the test set in each fold. When None, test_size = n_samples // (n_splits + 1).

  • skip_size (int) – Number of samples to skip between the training data and the test data.

  • max_train_size (int) – Maximum size of the training set.

  • verbose (bool) – Whether to turn on verbose mode; defaults to True.

Returns

None

Raises

None

Example

1) With n_splits = 5 and all other parameters left at their defaults, a series of 12 samples gets test_size = n_samples // (n_splits + 1) = 12 // (5 + 1) = 2. The folds look like this:

I * * x x - - - - - - - -I
I * * * * x x - - - - - -I
I * * * * * * x x - - - -I
I * * * * * * * * x x - -I
I * * * * * * * * * * x xI

* = training fold. x = test fold. - = not used in this fold.

2) With n_splits = 5, test_size = 1, and all other parameters left at their defaults, the folds look like this:

I * * * * * * * x - - - -I
I * * * * * * * * x - - -I
I * * * * * * * * * x - -I
I * * * * * * * * * * x -I
I * * * * * * * * * * * xI

* = training fold. x = test fold. - = not used in this fold.

3) With n_splits = 5, test_size = 1, skip_size = 1, and all other parameters left at their defaults, the folds look like this:

I * * * * * * - x - - - -I
I * * * * * * * - x - - -I
I * * * * * * * * - x - -I
I * * * * * * * * * - x -I
I * * * * * * * * * * - xI

* = training fold. x = test fold. - = not used in this fold.

4) With n_splits = 5, test_size = 1, max_train_size = 5, and all other parameters left at their defaults, the folds look like this:

I * * * * * - - x - - - -I
I * * * * * - - - x - - -I
I * * * * * - - - - x - -I
I * * * * * - - - - - x -I
I * * * * * - - - - - - xI

* = training fold. x = test fold. - = not used in this fold.
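
The fold layout in examples 1 to 3 above can be sketched in plain Python. The helper names expanding_window_folds and render are hypothetical and not part of paddlets; the sketch covers n_splits, test_size and skip_size only and leaves max_train_size out, since it is meant as an illustration of the arithmetic rather than a copy of the library code.

```python
from typing import Iterator, List, Optional, Tuple


def expanding_window_folds(
    n_samples: int,
    n_splits: int = 5,
    test_size: Optional[int] = None,
    skip_size: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) positions, one pair per fold (illustrative only)."""
    # Default test size as documented: n_samples // (n_splits + 1).
    if test_size is None:
        test_size = n_samples // (n_splits + 1)
    # The test windows are packed against the end of the series.
    first_test_start = n_samples - n_splits * test_size
    if test_size <= 0 or first_test_start - skip_size <= 0:
        raise ValueError("not enough samples for the requested folds")
    for test_start in range(first_test_start, n_samples, test_size):
        # Training always starts at position 0 and grows with each fold;
        # skip_size samples right before the test window are dropped.
        train_end = test_start - skip_size
        yield list(range(train_end)), list(range(test_start, test_start + test_size))


def render(n_samples: int, train: List[int], test: List[int]) -> str:
    """Draw one fold in the notation used above (* train, x test, - unused)."""
    cells = ["-"] * n_samples
    for i in train:
        cells[i] = "*"
    for i in test:
        cells[i] = "x"
    return "I " + " ".join(cells) + "I"


# Reproduces the diagram for n_splits = 5, test_size = 1, skip_size = 1.
for train, test in expanding_window_folds(12, n_splits=5, test_size=1, skip_size=1):
    print(render(12, train, test))
```

Because the helper is a generator, the ValueError is raised when the first fold is requested, not when the function is called.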

property get_n_splits: int

Get the number of splits.

Parameters

None

Returns

n_splits

Return type

int

Raises

None

class SlideWindowSplitter(train_size: int, test_size: int, step_size: Optional[int] = None, skip_size: int = 0, verbose: bool = True)

Bases: SplitterBase

Slide Window Splitter

Split a time series repeatedly into a fixed-length training set and a fixed-length test set.

Parameters
  • train_size (int) – Size of the training set; must not be None.

  • test_size (int) – Size of the test set; must not be None.

  • step_size (int|None) – Step between the starts of two consecutive folds; equal to test_size when None.

  • skip_size (int) – Number of samples to skip between the training data and the test data.

  • verbose (bool) – Whether to turn on verbose mode; defaults to True.

Returns

None

Raises

ValueError

Example

1) With train_size = 5, test_size = 2, and all other parameters left at their defaults, the folds look like this:

I * * * * * x x - - - - -I
I - - * * * * * x x - - -I
I - - - - * * * * * x x -I

* = training fold. x = test fold. - = not used in this fold.

2) With train_size = 5, test_size = 2, step_size = 1, and all other parameters left at their defaults, the folds look like this:

I * * * * * x x - - - - -I
I - * * * * * x x - - - -I
I - - * * * * * x x - - -I
I - - - * * * * * x x - -I
I - - - - * * * * * x x -I
I - - - - - * * * * * x xI

* = training fold. x = test fold. - = not used in this fold.

3) With train_size = 5, test_size = 2, skip_size = 1, and all other parameters left at their defaults, the folds look like this:

I * * * * * - x x - - - -I
I - - * * * * * - x x - -I
I - - - - * * * * * - x xI

* = training fold. x = test fold. - = not used in this fold.
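
The sliding-window diagrams above follow from a simple loop over window start positions. The sketch below is an illustration under assumptions, not the paddlets source: slide_window_folds is a hypothetical helper, and it assumes a fold is emitted only when the whole window (training set, skipped samples, test set) fits inside the series, which agrees with the 12-sample examples.

```python
from typing import Iterator, List, Optional, Tuple


def slide_window_folds(
    n_samples: int,
    train_size: int,
    test_size: int,
    step_size: Optional[int] = None,
    skip_size: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) positions for fixed-length windows (illustrative only)."""
    # As documented, the step defaults to test_size.
    if step_size is None:
        step_size = test_size
    # Total span of one fold: training set, skipped samples, test set.
    window = train_size + skip_size + test_size
    if min(train_size, test_size, step_size) <= 0 or skip_size < 0 or window > n_samples:
        raise ValueError("invalid sizes for a series of this length")
    # Assumption: only windows that fit entirely inside the series are used.
    for start in range(0, n_samples - window + 1, step_size):
        train_end = start + train_size
        test_start = train_end + skip_size
        yield list(range(start, train_end)), list(range(test_start, test_start + test_size))


# train_size = 5, test_size = 2 on 12 samples: windows start at 0, 2 and 4.
for train, test in slide_window_folds(12, train_size=5, test_size=2):
    print(train[0], train[-1], test)
```

Unlike the expanding window, the training set here drops its oldest samples as the window moves, so every fold trains on the same number of points.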