paddlets.models.ml_model_wrapper
- class MLModelBaseWrapper(model_class: Type, in_chunk_len: int, out_chunk_len: int = 1, skip_chunk_len: int = 0, sampling_stride: int = 1, model_init_params: Optional[Dict[str, Any]] = None, fit_params: Optional[Dict[str, Any]] = None, predict_params: Optional[Dict[str, Any]] = None)[source]
Bases: MLBaseModel
Time series model base wrapper for third party models.
- Parameters
model_class (Type) – Class type of the third party model.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model.
out_chunk_len (int) – The size of the forecasting horizon, i.e., the number of time steps output by the model.
skip_chunk_len (int, optional) – The number of time steps between in_chunk and out_chunk for a single sample. The skip chunk is neither used as a feature (i.e. X) nor a label (i.e. Y) for a single sample. By default, it will NOT skip any time steps.
sampling_stride (int, optional) – The number of time steps between the start of the i-th sample and the start of the (i+1)-th sample. More precisely, let t be the time index of the target time series, t[i] the start time of the i-th sample, and t[i+1] the start time of the (i+1)-th sample; then sampling_stride equals t[i+1] - t[i].
model_init_params (Dict[str, Any]) – All params for initializing the third party model.
fit_params (Dict[str, Any], optional) – All params for fitting third party model except x_train / y_train.
predict_params (Dict[str, Any], optional) – All params for forecasting third party model except x_test / y_test.
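To make the relationship between in_chunk_len, skip_chunk_len, out_chunk_len and sampling_stride concrete, here is a minimal sketch in plain Python/NumPy. The helper sample_windows is hypothetical and not part of paddlets. It assumes that sampling starts at index 0 and that incomplete trailing samples are dropped, which may differ from how the library actually builds samples.

```python
import numpy as np


def sample_windows(n_steps, in_chunk_len, out_chunk_len=1,
                   skip_chunk_len=0, sampling_stride=1):
    """Hypothetical helper: list (start, in_range, out_range) per sample.

    Assumes samples start at index 0 and that a sample is kept only if
    in_chunk + skip_chunk + out_chunk fits entirely inside the series.
    """
    sample_len = in_chunk_len + skip_chunk_len + out_chunk_len
    windows = []
    for start in range(0, n_steps - sample_len + 1, sampling_stride):
        in_end = start + in_chunk_len          # end of the lookback window (X)
        out_start = in_end + skip_chunk_len    # skipped steps are in neither X nor Y
        out_end = out_start + out_chunk_len    # end of the forecasting horizon (Y)
        windows.append((start, (start, in_end), (out_start, out_end)))
    return windows


series = np.arange(10)
windows = sample_windows(len(series), in_chunk_len=3, out_chunk_len=2,
                         skip_chunk_len=1, sampling_stride=2)
for start, (a, b), (c, d) in windows:
    print(start, series[a:b], series[c:d])
```

Consecutive start indices differ by exactly sampling_stride, and each sample spans in_chunk_len + skip_chunk_len + out_chunk_len time steps.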
- class SklearnModelWrapper(model_class: Type, in_chunk_len: int, out_chunk_len: int, skip_chunk_len: int = 0, sampling_stride: int = 1, model_init_params: Optional[Dict[str, Any]] = None, fit_params: Optional[Dict[str, Any]] = None, predict_params: Optional[Dict[str, Any]] = None, udf_ml_dataloader_to_fit_ndarray: Optional[Callable] = None, udf_ml_dataloader_to_predict_ndarray: Optional[Callable] = None)[source]
Bases: MLModelBaseWrapper
Time series model wrapper for sklearn third party models.
- Parameters
model_class (Type) – Class type of the third party model.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model.
out_chunk_len (int) – The size of the forecasting horizon, i.e., the number of time steps output by the model.
skip_chunk_len (int, optional) – The number of time steps between in_chunk and out_chunk for a single sample. The skip chunk is neither used as a feature (i.e. X) nor a label (i.e. Y) for a single sample. By default, it will NOT skip any time steps.
sampling_stride (int, optional) – The number of time steps between the start of the i-th sample and the start of the (i+1)-th sample. More precisely, let t be the time index of the target time series, t[i] the start time of the i-th sample, and t[i+1] the start time of the (i+1)-th sample; then sampling_stride equals t[i+1] - t[i].
model_init_params (Dict[str, Any]) – All params for initializing the third party model.
fit_params (Dict[str, Any], optional) – All params for fitting third party model except x_train / y_train.
predict_params (Dict[str, Any], optional) – All params for forecasting third party model except x_test / y_test.
udf_ml_dataloader_to_fit_ndarray (Callable, optional) – User defined function for converting MLDataLoader object to a numpy.ndarray object that can be processed by fit method of the third party model.
udf_ml_dataloader_to_predict_ndarray (Callable, optional) – User defined function for converting MLDataLoader object to a numpy.ndarray object that can be processed by predict method of the third party model.
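As a rough illustration of what a user defined conversion function might look like, the sketch below mirrors the argument list and return type documented for default_sklearn_ml_dataloader_to_fit_ndarray. The batch layout is an assumption: the loader is modelled here as an iterable of dicts with hypothetical keys past_target and future_target holding 3-dim arrays, and a plain Python list stands in for a real MLDataLoader. Check the actual batch structure before adapting this.

```python
import numpy as np


def my_fit_udf(ml_dataloader, model_init_params, in_chunk_len,
               skip_chunk_len, out_chunk_len):
    """Hypothetical UDF: turn one batch into (x_train, y_train) 2-dim arrays."""
    batch = next(iter(ml_dataloader))
    # Assumed shapes: (batch_size, in_chunk_len, n_cols) and
    # (batch_size, out_chunk_len, n_cols). The key names are assumptions.
    past = np.asarray(batch["past_target"])
    future = np.asarray(batch["future_target"])
    x_train = past.reshape(past.shape[0], -1)
    y_train = future.reshape(future.shape[0], -1)
    if y_train.shape[1] == 1:
        # Many sklearn estimators expect a 1-dim y for single-output problems.
        y_train = y_train.ravel()
    return x_train, y_train


# Stand-in for an MLDataLoader: a list containing a single batch dict.
fake_loader = [{
    "past_target": np.zeros((4, 3, 1)),
    "future_target": np.ones((4, 1, 1)),
}]
x, y = my_fit_udf(fake_loader, {}, in_chunk_len=3,
                  skip_chunk_len=0, out_chunk_len=1)
print(x.shape, y.shape)
```

A function of this shape would be passed as udf_ml_dataloader_to_fit_ndarray. The predict counterpart would follow the same pattern, with the second tuple element optional.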
- default_sklearn_ml_dataloader_to_fit_ndarray(ml_dataloader: MLDataLoader, model_init_params: Dict[str, Any], in_chunk_len: int, skip_chunk_len: int, out_chunk_len: int) Tuple[ndarray, Optional[ndarray]][source]
Default function for converting MLDataLoader to a numpy array that can be used for fitting the sklearn model.
- Parameters
ml_dataloader (MLDataLoader) – MLDataLoader object to be converted.
model_init_params (Dict) – Parameters used to initialize the sklearn model. May be used during conversion.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model. May be used during conversion.
skip_chunk_len (int, optional) – The number of time steps between in_chunk and out_chunk for a single sample. The skip chunk is neither used as a feature (i.e. X) nor a label (i.e. Y) for a single sample. By default, it will NOT skip any time steps. May be used during conversion.
out_chunk_len (int) – The size of the forecasting horizon, i.e., the number of time steps output by the model. May be used during conversion.
- Returns
Converted numpy array. The first and second element in the tuple represent x_train and y_train, respectively.
- Return type
Tuple[np.ndarray, Optional[np.ndarray]]
- default_sklearn_ml_dataloader_to_predict_ndarray(ml_dataloader: MLDataLoader, model_init_params: Dict[str, Any], in_chunk_len: int, skip_chunk_len: int, out_chunk_len: int) Tuple[ndarray, Optional[ndarray]][source]
Default function for converting MLDataLoader to a numpy array that can be predicted by the sklearn model.
- Parameters
ml_dataloader (MLDataLoader) – MLDataLoader object to be converted.
model_init_params (Dict) – Parameters used to initialize the sklearn model. May be used during conversion.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model. May be used during conversion.
skip_chunk_len (int, optional) – The number of time steps between in_chunk and out_chunk for a single sample. The skip chunk is neither used as a feature (i.e. X) nor a label (i.e. Y) for a single sample. By default, it will NOT skip any time steps. May be used during conversion.
out_chunk_len (int) – The size of the forecasting horizon, i.e., the number of time steps output by the model. May be used during conversion.
- Returns
Converted numpy array. The first and second element in the tuple represent x and y, respectively, where y is optional.
- Return type
Tuple[np.ndarray, Optional[np.ndarray]]
- class PyodModelWrapper(model_class: Type, in_chunk_len: int, sampling_stride: int = 1, model_init_params: Optional[Dict[str, Any]] = None, predict_params: Optional[Dict[str, Any]] = None, udf_ml_dataloader_to_fit_ndarray: Optional[Callable] = None, udf_ml_dataloader_to_predict_ndarray: Optional[Callable] = None)[source]
Bases: MLModelBaseWrapper
Time series model wrapper for pyod third party models.
- Parameters
model_class (Type) – Class type of the third party model.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model.
sampling_stride (int, optional) – The number of time steps between the start of the i-th sample and the start of the (i+1)-th sample. More precisely, let t be the time index of the target time series, t[i] the start time of the i-th sample, and t[i+1] the start time of the (i+1)-th sample; then sampling_stride equals t[i+1] - t[i].
model_init_params (Dict[str, Any]) – All params for initializing the third party model.
predict_params (Dict[str, Any], optional) – All params for forecasting third party model except x_test / y_test.
udf_ml_dataloader_to_fit_ndarray (Callable, optional) – User defined function for converting MLDataLoader object to a numpy.ndarray object that can be processed by fit method of the third party model.
udf_ml_dataloader_to_predict_ndarray (Callable, optional) – User defined function for converting MLDataLoader object to a numpy.ndarray object that can be processed by predict method of the third party model.
- predict_score(tsdataset: TSDataset) ndarray[source]
Predict raw anomaly scores of tsdataset using the fitted model. Outliers are assigned higher scores.
- Parameters
tsdataset (TSDataset) – The input samples for which anomaly scores will be computed.
- Returns
numpy array of shape (n_samples,), the anomaly score of the input samples.
- Return type
np.ndarray
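Because higher scores indicate outliers, a raw score array of shape (n_samples,) can be turned into binary labels by thresholding. The sketch below uses a percentile threshold driven by an assumed contamination rate. The helper scores_to_labels and the sample scores are hypothetical, and this is not necessarily how paddlets or pyod derive labels internally.

```python
import numpy as np


def scores_to_labels(scores, contamination=0.1):
    """Hypothetical helper: flag the top `contamination` fraction as outliers."""
    # Scores strictly above the (1 - contamination) percentile become 1 (outlier).
    threshold = np.percentile(scores, 100 * (1 - contamination))
    return (scores > threshold).astype(int)


# Made-up scores standing in for the output of predict_score.
scores = np.array([0.10, 0.20, 0.15, 0.12, 0.18, 0.11, 0.13, 0.17, 0.14, 5.0])
labels = scores_to_labels(scores, contamination=0.1)
print(labels)
```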
- default_pyod_ml_dataloader_to_fit_ndarray(ml_dataloader: MLDataLoader, model_init_params: Dict[str, Any], in_chunk_len: int) Tuple[ndarray, Optional[ndarray]][source]
Default function for converting MLDataLoader to a numpy array that can be used for fitting the pyod model.
This function removes the in_chunk_len dimension from the passed data. All pyod models require X to have shape (n_samples, n_features), where n_samples corresponds to batch_size and n_features corresponds to observed_cov_col_num (in the paddlets context, n_samples is defined as batch_size and n_features as observed_cov_col_num for anomaly detection models). However, the samples built by the data adapter are 3-dim ndarrays of shape (batch_size, in_chunk_len, observed_cov_col_num), so the in_chunk_len dimension must be flattened (i.e., removed) to produce a 2-dim array.
- Parameters
ml_dataloader (MLDataLoader) – MLDataLoader object to be converted.
model_init_params (Dict) – Parameters used to initialize the pyod model. May be used during conversion.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model. May be used during conversion.
- Returns
Converted numpy array. The first and second element in the tuple represent x_train and y_train, respectively.
- Return type
Tuple[np.ndarray, Optional[np.ndarray]]
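The shape change described above can be sketched with NumPy alone. This is an illustration under stated assumptions rather than the library's actual code: the helper flatten_in_chunk is hypothetical, and in_chunk_len is set to 1 so that n_features equals observed_cov_col_num as the description states. With a longer window, the same reshape would yield in_chunk_len * observed_cov_col_num features per sample.

```python
import numpy as np


def flatten_in_chunk(samples):
    """Hypothetical helper: (batch_size, in_chunk_len, n_cols) -> 2-dim array."""
    batch_size, in_chunk_len, n_cols = samples.shape
    # Merge the in_chunk_len axis into the feature axis; batch_size is preserved.
    return samples.reshape(batch_size, in_chunk_len * n_cols)


batch_size, in_chunk_len, observed_cov_col_num = 5, 1, 3
samples = np.arange(
    batch_size * in_chunk_len * observed_cov_col_num, dtype=float
).reshape(batch_size, in_chunk_len, observed_cov_col_num)

x_train = flatten_in_chunk(samples)
print(samples.shape, "->", x_train.shape)
```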
- default_pyod_ml_dataloader_to_predict_ndarray(ml_dataloader: MLDataLoader, model_init_params: Dict[str, Any], in_chunk_len: int) Tuple[ndarray, Optional[ndarray]][source]
Default function for converting MLDataLoader to a numpy array that can be predicted by the pyod model.
- Parameters
ml_dataloader (MLDataLoader) – MLDataLoader object to be converted.
model_init_params (Dict) – Parameters used to initialize the pyod model. May be used during conversion.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model. May be used during conversion.
- Returns
Converted numpy array. The first and second element in the tuple represent x and y, respectively, where y is optional.
- Return type
Tuple[np.ndarray, Optional[np.ndarray]]
- make_ml_model(model_class: Type, in_chunk_len: int, out_chunk_len: int = 1, skip_chunk_len: int = 0, sampling_stride: int = 1, model_init_params: Optional[Dict[str, Any]] = None, fit_params: Optional[Dict[str, Any]] = None, predict_params: Optional[Dict[str, Any]] = None, udf_ml_dataloader_to_fit_ndarray: Optional[Callable] = None, udf_ml_dataloader_to_predict_ndarray: Optional[Callable] = None) MLModelBaseWrapper[source]
Make a wrapped time series model based on the third-party model.
- Parameters
model_class (Type) – Class type of the third party model.
in_chunk_len (int) – The size of the lookback window, i.e., the number of time steps fed to the model.
out_chunk_len (int) – The size of the forecasting horizon, i.e., the number of time steps output by the model.
skip_chunk_len (int, optional) – The number of time steps between in_chunk and out_chunk for a single sample. The skip chunk is neither used as a feature (i.e. X) nor a label (i.e. Y) for a single sample. By default, it will NOT skip any time steps.
sampling_stride (int, optional) – The number of time steps between the start of the i-th sample and the start of the (i+1)-th sample. More precisely, let t be the time index of the target time series, t[i] the start time of the i-th sample, and t[i+1] the start time of the (i+1)-th sample; then sampling_stride equals t[i+1] - t[i].
model_init_params (Dict[str, Any]) – All params for initializing the third party model.
fit_params (Dict[str, Any], optional) – All params for fitting third party model except x_train / y_train.
predict_params (Dict[str, Any], optional) – All params for forecasting third party model except x_test / y_test.
udf_ml_dataloader_to_fit_ndarray (Callable, optional) – User defined function for converting MLDataLoader object to a numpy.ndarray object that can be processed by fit method of the third party model. Any third party models that accept numpy array as fit inputs can use this function to build the data for training.
udf_ml_dataloader_to_predict_ndarray (Callable, optional) – User defined function for converting MLDataLoader object to a numpy.ndarray object that can be processed by predict method of the third party model. Any third-party models that accept numpy array as predict inputs can use this function to build the data for prediction.
- Returns
Wrapped time series model object. Currently supports SklearnModelWrapper and PyodModelWrapper.
- Return type