paddlets.models.forecasting.dl.adapter.data_adapter
- class DataAdapter
Bases: object
Data adapter, converts paddlets.TSDataset to paddle.io.Dataset and paddle.io.DataLoader.
- to_paddle_dataset(rawdataset: TSDataset, in_chunk_len: int = 1, out_chunk_len: int = 1, skip_chunk_len: int = 0, sampling_stride: int = 1, time_window: Optional[Tuple] = None) -> PaddleDatasetImpl
Converts paddlets.TSDataset to paddle.io.Dataset.
- Parameters
rawdataset (TSDataset) – Raw TSDataset to convert to paddle.io.Dataset.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model.
out_chunk_len (int) – The size of the forecasting horizon, i.e., the number of time steps output by the model.
skip_chunk_len (int, optional) – The number of time steps between in_chunk and out_chunk within a single sample. The skip chunk is used neither as a feature (i.e. X) nor as a label (i.e. Y) for that sample. By default, no time steps are skipped.
sampling_stride (int, optional) – The number of time steps between the start of the i-th sample and the start of the (i+1)-th sample. More precisely, let t be the time index of the target time series, t[i] the start time of the i-th sample, and t[i+1] the start time of the (i+1)-th sample; then sampling_stride equals t[i+1] - t[i].
time_window (Tuple, optional) – A two-element tuple that bounds where the adapter may build samples. time_window[0] is the lower bound of the window and time_window[1] is the upper bound. Each index in the closed interval [time_window[0], time_window[1]] refers to the TAIL index of a sample.
- Returns
A built PaddleDatasetImpl.
- Return type
PaddleDatasetImpl
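The interplay of in_chunk_len, skip_chunk_len, out_chunk_len, sampling_stride and time_window described above can be sketched in plain Python. This is an illustrative sketch only, not the paddlets implementation: the helper name sample_index_ranges is hypothetical, and the default window used here (every tail index that leaves room for one full sample) is an assumption.

```python
from typing import List, Optional, Tuple


def sample_index_ranges(
    series_len: int,
    in_chunk_len: int = 1,
    out_chunk_len: int = 1,
    skip_chunk_len: int = 0,
    sampling_stride: int = 1,
    time_window: Optional[Tuple[int, int]] = None,
) -> List[Tuple[range, range]]:
    """Return one (past_range, future_range) pair of index ranges per sample."""
    sample_len = in_chunk_len + skip_chunk_len + out_chunk_len
    if time_window is None:
        # Assumed default: all tail indices that leave room for a full sample.
        time_window = (sample_len - 1, series_len - 1)
    lower, upper = time_window

    samples = []
    # Both bounds are inclusive; each value is the TAIL index of one sample.
    for tail in range(lower, upper + 1, sampling_stride):
        future_start = tail - out_chunk_len + 1
        past_end = future_start - skip_chunk_len  # skipped steps sit in between
        past_start = past_end - in_chunk_len
        samples.append((range(past_start, past_end), range(future_start, tail + 1)))
    return samples


samples = sample_index_ranges(
    series_len=10, in_chunk_len=3, out_chunk_len=2, skip_chunk_len=1, sampling_stride=2
)
```

For a series of 10 points, this sketch yields three samples with tail indices 5, 7 and 9. The first sample uses indices 0 to 2 as the past chunk, skips index 3, and uses indices 4 and 5 as the future chunk; consecutive samples start 2 steps apart, matching sampling_stride.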
- to_paddle_dataloader(paddle_dataset: PaddleDatasetImpl, batch_size: int, collate_fn: Optional[Callable] = None, shuffle: bool = True) -> DataLoader
Converts paddle.io.Dataset to paddle.io.DataLoader.
- Parameters
paddle_dataset (PaddleDatasetImpl) – The dataset from which to build the paddle.io.DataLoader.
batch_size (int) – The number of samples in a single batch.
collate_fn (Callable, optional) – User-defined collate function applied to each batch.
shuffle (bool, optional) – Whether to shuffle the sample indices before generating batch indices. Defaults to True. TODO: add this argument to the __init__() constructor so that the caller can set its value.
- Returns
A built paddle DataLoader.
- Return type
PaddleDataLoader
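To make the roles of batch_size, collate_fn and shuffle concrete, here is a minimal pure-Python sketch of generic batching. It does not use paddle and is not how paddle.io.DataLoader is implemented; iter_batches and its seed argument are hypothetical names introduced for illustration, and keeping a final partial batch is an assumption of this sketch.

```python
import random
from typing import Any, Callable, Iterator, List, Optional, Sequence


def iter_batches(
    samples: Sequence[Any],
    batch_size: int,
    collate_fn: Optional[Callable[[List[Any]], Any]] = None,
    shuffle: bool = True,
    seed: Optional[int] = None,
) -> Iterator[Any]:
    """Yield batches of samples, optionally shuffled and collated."""
    indices = list(range(len(samples)))
    if shuffle:
        # Shuffle the index order first, then cut it into batches.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = [samples[i] for i in indices[start:start + batch_size]]
        # Without a collate function, a batch is just a list of samples.
        yield collate_fn(batch) if collate_fn is not None else batch


data = list(range(10))
ordered = list(iter_batches(data, batch_size=4, shuffle=False))
summed = list(iter_batches(data, batch_size=4, collate_fn=sum, shuffle=False))
shuffled = list(iter_batches(data, batch_size=4, shuffle=True, seed=0))
```

With shuffle=False the batches follow the dataset order; with shuffle=True every sample still appears exactly once, only the grouping changes.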
Examples
    # Given:
    batch_size = 4
    in_chunk_len = 3
    out_chunk_len = 2
    known_cov_chunk_len = in_chunk_len + out_chunk_len  # 3 + 2 = 5
    observed_cov_chunk_len = in_chunk_len  # 3
    target_col_num = 2  # target column number, e.g. ["t0", "t1"]
    known_cov_col_num = 3  # known covariates column number, e.g. ["k0", "k1", "k2"]
    observed_cov_col_num = 1  # observed covariates column number, e.g. ["obs0"]

    # Built DataLoader instance:
    dataloader = [
        # 1st batch
        {
            "past_target": paddle.Tensor(shape=(batch_size, in_chunk_len, target_col_num)),
            "future_target": paddle.Tensor(shape=(batch_size, out_chunk_len, target_col_num)),
            "known_cov": paddle.Tensor(shape=(batch_size, known_cov_chunk_len, known_cov_col_num)),
            "observed_cov": paddle.Tensor(shape=(batch_size, observed_cov_chunk_len, observed_cov_col_num))
        },
        # ...
        # N-th batch
        {
            "past_target": paddle.Tensor(shape=(batch_size, in_chunk_len, target_col_num)),
            "future_target": paddle.Tensor(shape=(batch_size, out_chunk_len, target_col_num)),
            "known_cov": paddle.Tensor(shape=(batch_size, known_cov_chunk_len, known_cov_col_num)),
            "observed_cov": paddle.Tensor(shape=(batch_size, observed_cov_chunk_len, observed_cov_col_num))
        }
    ]
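The shape bookkeeping in the example above can be written as a small helper, which may be handy when checking a model's input layer against a batch. This is a sketch that only restates the layout shown in the example; expected_batch_shapes is a hypothetical name, not part of paddlets.

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]


def expected_batch_shapes(
    batch_size: int,
    in_chunk_len: int,
    out_chunk_len: int,
    target_col_num: int,
    known_cov_col_num: int,
    observed_cov_col_num: int,
) -> Dict[str, Shape]:
    """Shapes of one full batch, following the layout of the example."""
    return {
        "past_target": (batch_size, in_chunk_len, target_col_num),
        "future_target": (batch_size, out_chunk_len, target_col_num),
        # Known covariates cover in_chunk_len + out_chunk_len time steps.
        "known_cov": (batch_size, in_chunk_len + out_chunk_len, known_cov_col_num),
        # Observed covariates cover in_chunk_len time steps only.
        "observed_cov": (batch_size, in_chunk_len, observed_cov_col_num),
    }


shapes = expected_batch_shapes(
    batch_size=4,
    in_chunk_len=3,
    out_chunk_len=2,
    target_col_num=2,
    known_cov_col_num=3,
    observed_cov_col_num=1,
)
```

With the values from the example, known_cov has 5 time steps per sample (3 + 2), while observed_cov has 3.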