paddlets.pipeline.pipeline

class Pipeline(steps: List[Tuple[object, str]])[source]

Bases: Trainable

The pipeline is designed to build a workflow for time series modeling, which may comprise a set of transformers and a model.

Note: The model is optional.

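Parameters: steps (List[Tuple[object, str]]) – A list of steps, each given as a (class, parameters) tuple as in the example below: the transformers in order, optionally followed by a final model.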

Examples

>>> ...
>>> ksigma_params = {"cols":['example_columns'], "k": 0.5}
>>> mlp_params = {'in_chunk_len': 7, 'out_chunk_len': 3, 'skip_chunk_len': 0, 'eval_metrics': ["mse", "mae"]}
>>> pipeline = Pipeline([(KSigma, ksigma_params), (TimeFeatureGenerator, {}), (MLPRegressor, mlp_params)])

fit(train_tsdataset: Union[TSDataset, List[TSDataset]], valid_tsdataset: Optional[Union[TSDataset, List[TSDataset]]] = None)[source]

Fit the transformers, transform the data with them, and then fit the model on the transformed data.

Parameters

train_tsdataset (Union[TSDataset, List[TSDataset]]) – Train dataset.
valid_tsdataset (Union[TSDataset, List[TSDataset]], optional) – Validation dataset.

Returns

Pipeline with fitted transformers and fitted model.

Return type

Pipeline
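The fit workflow described above can be sketched in plain Python. This is a minimal illustration and not PaddleTS code: Center, LastValueModel, and fit_steps are hypothetical stand-ins, plain lists of floats stand in for TSDataset, and the rule used to tell a model from a transformer (a final step without a transform method) is an assumption made only for this sketch.

```python
from typing import Any, Dict, List, Optional, Tuple


class Center:
    """Toy transformer (hypothetical): subtracts the training mean, then scales."""

    def __init__(self, scale: float = 1.0):
        self.scale = scale
        self.mean = 0.0

    def fit(self, data: List[float]) -> "Center":
        self.mean = sum(data) / len(data)
        return self

    def transform(self, data: List[float]) -> List[float]:
        return [(x - self.mean) * self.scale for x in data]


class LastValueModel:
    """Toy model (hypothetical): remembers the last training value."""

    def __init__(self):
        self.last: Optional[float] = None

    def fit(self, train: List[float], valid: Optional[List[float]] = None) -> "LastValueModel":
        self.last = train[-1]
        return self


def fit_steps(
    steps: List[Tuple[type, Dict[str, Any]]],
    train: List[float],
    valid: Optional[List[float]] = None,
) -> List[Any]:
    """Instantiate each (class, params) step and fit the steps in order."""
    fitted = []
    for i, (cls, params) in enumerate(steps):
        obj = cls(**params)
        is_last = i == len(steps) - 1
        if is_last and not hasattr(obj, "transform"):
            # Assumption: a final step without transform() is the model.
            obj.fit(train, valid)
        else:
            # Fit the transformer, then pass the transformed data downstream.
            obj.fit(train)
            train = obj.transform(train)
            if valid is not None:
                valid = obj.transform(valid)
        fitted.append(obj)
    return fitted


fitted = fit_steps([(Center, {"scale": 2.0}), (LastValueModel, {})], [1.0, 2.0, 3.0])
```

In this sketch the model is fitted on the data after it has passed through every transformer, which is the ordering the description of fit implies.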

transform(tsdataset: Union[TSDataset, List[TSDataset]], inplace: bool = False, cache_transform_steps: bool = False, previous_caches: Optional[List[TSDataset]] = None) → Union[TSDataset, Tuple[TSDataset, List[TSDataset]]][source]

Transform the TSDataset using the fitted transformers in the pipeline.

Parameters

tsdataset (Union[TSDataset, List[TSDataset]]) – Data to be transformed.
inplace (bool) – Set to True to perform inplace transform and avoid a data copy. Default is False.
cache_transform_steps (bool) – Set to True to cache the result of each transform step in a list. Default is False.
previous_caches (List[TSDataset], optional) – Cache of previous transform results.

Returns

The transformed results by default. If cache_transform_steps is True, both the transformed results and the cached result of each transform step.

Return type

Union[TSDataset, Tuple[TSDataset, List[TSDataset]]]
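The two return shapes can be illustrated with a small sketch. It is not the PaddleTS implementation: transform_steps, double, and plus_one are hypothetical names, plain lists stand in for TSDataset, and previous_caches is left out because its semantics are not described here.

```python
from typing import Callable, List, Tuple, Union

Series = List[float]


def transform_steps(
    data: Series,
    transforms: List[Callable[[Series], Series]],
    cache_transform_steps: bool = False,
) -> Union[Series, Tuple[Series, List[Series]]]:
    """Apply each transform in order, optionally caching every intermediate result."""
    caches: List[Series] = []
    for transform in transforms:
        data = transform(data)
        if cache_transform_steps:
            # Store a copy so later steps cannot mutate the cached result.
            caches.append(list(data))
    if cache_transform_steps:
        return data, caches
    return data


def double(xs: Series) -> Series:
    return [2 * x for x in xs]


def plus_one(xs: Series) -> Series:
    return [x + 1 for x in xs]


# Default: only the final transformed result is returned.
result = transform_steps([1.0, 2.0], [double, plus_one])

# With caching: the final result plus one cached result per step.
result_cached, caches = transform_steps(
    [1.0, 2.0], [double, plus_one], cache_transform_steps=True
)
```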

inverse_transform(tsdataset: Union[TSDataset, List[TSDataset]], inplace: bool = False) → TSDataset[source]

The inverse transformation of self.transform. Applies inverse_transform using the fitted transformers in the pipeline. Note that not all transformers implement an inverse_transform method. A transformer that does not implement inverse_transform is skipped and does not inversely transform the input data.

Parameters

tsdataset (Union[TSDataset, List[TSDataset]]) – Data to apply inverse_transform.
inplace (bool) – Set to True to perform inplace transform and avoid a data copy. Default is False.

Returns

Inversely transformed TSDataset.

Return type

TSDataset
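The skipping behaviour can be sketched as follows. This is an illustration under stated assumptions, not PaddleTS code: Scale, Clip, and inverse_transform_steps are hypothetical, plain lists stand in for TSDataset, and applying the inverses in reverse step order is an assumption of the sketch.

```python
from typing import Any, List


class Scale:
    """Toy transformer (hypothetical) that can be inverted."""

    def __init__(self, factor: float):
        self.factor = factor

    def transform(self, data: List[float]) -> List[float]:
        return [x * self.factor for x in data]

    def inverse_transform(self, data: List[float]) -> List[float]:
        return [x / self.factor for x in data]


class Clip:
    """Toy transformer (hypothetical) with no inverse_transform method."""

    def __init__(self, upper: float):
        self.upper = upper

    def transform(self, data: List[float]) -> List[float]:
        return [min(x, self.upper) for x in data]


def inverse_transform_steps(data: List[float], transformers: List[Any]) -> List[float]:
    """Undo the transformers, skipping any that lack inverse_transform."""
    # Assumption: inverses are applied in reverse order of the forward pass.
    for transformer in reversed(transformers):
        inverse = getattr(transformer, "inverse_transform", None)
        if callable(inverse):
            data = inverse(data)
    return data


steps = [Scale(2.0), Clip(10.0)]

forward = [1.0, 4.0, 8.0]
for step in steps:
    forward = step.transform(forward)

restored = inverse_transform_steps(forward, steps)
```

Because Clip is skipped, the last value in the sketch is not recovered: the round trip is exact only for steps that provide an inverse.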

predict(tsdataset: TSDataset) → TSDataset[source]

Transform the TSDataset using the fitted transformers and perform prediction with the fitted model in the pipeline. Only effective when a model exists in the pipeline.

Parameters: tsdataset (TSDataset) – Data to be predicted.
Returns: Predicted results of calling self.predict on the final model.
Return type: TSDataset

predict_proba(tsdataset: TSDataset) → TSDataset[source]

Transform the TSDataset using the fitted transformers and perform probability prediction with the fitted model in the pipeline. Only effective when a model exists in the pipeline.

Parameters: tsdataset (TSDataset) – Data to be predicted.
Returns: Predicted results of calling self.predict_proba on the final model.
Return type: TSDataset

predict_score(tsdataset: TSDataset) → TSDataset[source]

Transform the TSDataset using the fitted transformers and perform anomaly detection score prediction with the fitted model in the pipeline. Only effective when a model exists in the pipeline.

Parameters: tsdataset (TSDataset) – Data to be predicted.
Returns: Predicted results of calling self.predict_score on the final model.
Return type: TSDataset

recursive_predict(tsdataset: TSDataset, predict_length: int) → TSDataset[source]

Apply the self.predict method iteratively for multi-step time series forecasting. The predicted results from the current call are appended to the TSDataset object and appear in the lookback window of the next call. Note that each call of self.predict returns a result of length out_chunk_len, so it is called ceiling(predict_length / out_chunk_len) times to reach the required length.

Parameters

tsdataset (TSDataset) – Data to be predicted.
predict_length (int) – Length of predicted results.

Returns

Predicted results.

Return type

TSDataset
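The call count and the feedback loop can be sketched in a few lines. This is a simplified illustration, not the PaddleTS implementation: the function below and the toy model next_three are hypothetical, plain lists stand in for TSDataset, and trimming the surplus values down to predict_length is an assumption of the sketch.

```python
import math
from typing import Callable, List


def recursive_predict(
    history: List[float],
    predict_chunk: Callable[[List[float]], List[float]],
    out_chunk_len: int,
    predict_length: int,
) -> List[float]:
    """Call predict_chunk repeatedly, feeding each chunk back into the window."""
    n_calls = math.ceil(predict_length / out_chunk_len)
    window = list(history)
    predictions: List[float] = []
    for _ in range(n_calls):
        chunk = predict_chunk(window)
        predictions.extend(chunk)
        # The new chunk becomes part of the input for the next call.
        window.extend(chunk)
    # Assumption: surplus values from the last call are dropped.
    return predictions[:predict_length]


def next_three(window: List[float]) -> List[float]:
    """Toy model with out_chunk_len = 3: continues the sequence by ones."""
    last = window[-1]
    return [last + 1, last + 2, last + 3]


n_calls = math.ceil(7 / 3)
forecast = recursive_predict([1, 2, 3], next_three, out_chunk_len=3, predict_length=7)
```

With predict_length = 7 and out_chunk_len = 3, the toy model is called ceiling(7 / 3) = 3 times and produces 9 values, of which the first 7 are kept.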

recursive_predict_proba(tsdataset: TSDataset, predict_length: int) → TSDataset[source]

Apply the self.predict_proba method iteratively for multi-step time series forecasting. The predicted results from the current call are appended to the TSDataset object and appear in the lookback window of the next call. Note that each call of self.predict_proba returns a result of length out_chunk_len, so it is called ceiling(predict_length / out_chunk_len) times to reach the required length.

Parameters

tsdataset (TSDataset) – Data to be predicted.
predict_length (int) – Length of predicted results.

Returns

Predicted results.

Return type

TSDataset

save(path: str, pipeline_file_name: str = 'pipeline-partial.pkl', model_file_name: str = 'paddlets_model')[source]

Save the pipeline to a directory.

Parameters

path (str) – Output directory path.
pipeline_file_name (str) – Name of pipeline object. This file contains transformers and meta information of pipeline.
model_file_name (str) – Name of model object. See BaseModel.save for more information.

classmethod load(path: str, pipeline_file_name: str = 'pipeline-partial.pkl', model_file_name: str = 'paddlets_model')[source]

Load the pipeline from a directory.

Parameters

path (str) – Input directory path.
pipeline_file_name (str) – Name of pipeline object. This file contains transformers and meta information of pipeline.
model_file_name (str) – Name of model object. See BaseModel.save for more information.

Returns

The loaded pipeline.

Return type

Pipeline