paddlets.transform.base

class BaseTransform

Bases: object

Base class for all data transformation classes (called transformers in this module).

Every transformer must inherit from this base class and implement the fit(), transform() and fit_transform() methods.

fit(dataset: Union[TSDataset, List[TSDataset]])

Learn the parameters from the dataset needed by the transformer.

Any non-abstract class inherited from this class should implement this method.

The parameters fitted by this method are transformer-specific. For example, the MinMaxScaler needs to compute the MIN and MAX, and the StandardScaler needs to compute the MEAN and STD (standard deviation) from the dataset.

Parameters

dataset (Union[TSDataset, List[TSDataset]]) – dataset from which to fit the transformer.
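To make the fitted parameters mentioned above concrete, here is a minimal sketch in plain Python. It is not the paddlets implementation: the function names `fit_min_max` and `fit_mean_std` are hypothetical, and a plain list of floats stands in for one column of a TSDataset. This page does not say whether the library uses the population or the sample standard deviation; the sketch assumes the population form.

```python
import statistics
from typing import Dict, List


def fit_min_max(values: List[float]) -> Dict[str, float]:
    """Parameters a min-max scaler would need: the column MIN and MAX."""
    return {"min": min(values), "max": max(values)}


def fit_mean_std(values: List[float]) -> Dict[str, float]:
    """Parameters a standard scaler would need: the column MEAN and STD.

    Assumption: population standard deviation (statistics.pstdev).
    """
    return {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}


# Hypothetical single column of observations.
column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
min_max_params = fit_min_max(column)
mean_std_params = fit_mean_std(column)
```

In both cases fitting only records statistics of the data. Nothing is rescaled until the transform step runs.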

abstract fit_one(dataset: TSDataset)

Learn the parameters from the dataset needed by the transformer.

Any non-abstract class inherited from this class should implement this method.

The parameters fitted by this method are transformer-specific. For example, the MinMaxScaler needs to compute the MIN and MAX, and the StandardScaler needs to compute the MEAN and STD (standard deviation) from the dataset.

Parameters

dataset (TSDataset) – dataset from which to fit the transformer.
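The abstract-method contract can be illustrated with Python's `abc` module. This sketch does not import paddlets: `ToyBaseTransform`, `ToyMinMax` and `Incomplete` are hypothetical stand-ins, a list of floats stands in for a TSDataset, and returning `self` from `fit_one` is a choice made for the sketch.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class ToyBaseTransform(ABC):
    """Hypothetical stand-in for a base class with an abstract fit_one()."""

    @abstractmethod
    def fit_one(self, dataset: List[float]) -> "ToyBaseTransform":
        """Learn transformer-specific parameters from one dataset."""


class ToyMinMax(ToyBaseTransform):
    """Concrete subclass: stores the MIN and MAX seen during fitting."""

    def __init__(self) -> None:
        self.min_: Optional[float] = None
        self.max_: Optional[float] = None

    def fit_one(self, dataset: List[float]) -> "ToyMinMax":
        self.min_, self.max_ = min(dataset), max(dataset)
        return self


class Incomplete(ToyBaseTransform):
    """Subclass that leaves fit_one() unimplemented."""


# Python refuses to instantiate a class that still has abstract methods.
try:
    Incomplete()
    instantiation_failed = False
except TypeError:
    instantiation_failed = True

scaler = ToyMinMax().fit_one([3.0, 1.0, 2.0])
```

The `TypeError` is raised at instantiation time, so a missing implementation is caught before any data is processed.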

transform(dataset: Union[TSDataset, List[TSDataset]], inplace: bool = False) → Union[TSDataset, List[TSDataset]]

Apply the fitted transformer to the dataset.

Any non-abstract class inherited from this class should implement this method.

Parameters
  • dataset (Union[TSDataset, List[TSDataset]]) – dataset to be transformed.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

transformed dataset.

Return type

Union[TSDataset, List[TSDataset]]
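The `inplace` flag and the single-or-list input can be sketched as follows. This shows the general pattern under stated assumptions, not the paddlets source: `ToyDataset` and `ToyShift` are hypothetical, and the list case is assumed to apply `transform_one` to each element in turn.

```python
import copy
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ToyDataset:
    """Hypothetical stand-in for TSDataset: a single column of values."""
    values: List[float]


class ToyShift:
    """Hypothetical transformer that subtracts a fixed offset."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def transform_one(self, dataset: ToyDataset, inplace: bool = False) -> ToyDataset:
        # Work on a deep copy unless the caller asked for in-place modification.
        target = dataset if inplace else copy.deepcopy(dataset)
        target.values = [v - self.offset for v in target.values]
        return target

    def transform(
        self, dataset: Union[ToyDataset, List[ToyDataset]], inplace: bool = False
    ) -> Union[ToyDataset, List[ToyDataset]]:
        # Assumption: a list input is handled one dataset at a time.
        if isinstance(dataset, list):
            return [self.transform_one(d, inplace) for d in dataset]
        return self.transform_one(dataset, inplace)


shift = ToyShift(offset=1.0)
original = ToyDataset([1.0, 2.0, 3.0])

copied = shift.transform(original)              # original is left untouched
same = shift.transform(original, inplace=True)  # original itself is modified
batch = shift.transform([ToyDataset([5.0]), ToyDataset([6.0])])
```

With `inplace=False` the caller keeps an unmodified original at the cost of a copy. With `inplace=True` the returned object is the same object that was passed in.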

abstract transform_one(dataset: TSDataset, inplace: bool = False) → TSDataset

Apply the fitted transformer to the dataset.

Any non-abstract class inherited from this class should implement this method.

Parameters
  • dataset (TSDataset) – dataset to be transformed.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

transformed dataset.

Return type

TSDataset

transform_n_rows(dataset: TSDataset, n_rows: int, inplace: bool = False) → TSDataset

Apply the fitted transformer to part of the dataset.

Parameters
  • dataset (TSDataset) – dataset to be transformed.

  • n_rows (int) – number of rows to be transformed.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

transformed dataset.

Return type

TSDataset

fit_transform(dataset: Union[TSDataset, List[TSDataset]], inplace: bool = False) → Union[TSDataset, List[TSDataset]]

Combine fit and transform into one method: first fit the transformer on the dataset, then apply the fitted transformer to the same dataset.

Any non-abstract class inherited from this class should implement this method.

Parameters
  • dataset (Union[TSDataset, List[TSDataset]]) – dataset to process.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

transformed data.

Return type

Union[TSDataset, List[TSDataset]]
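The equivalence between fit_transform and a separate fit followed by transform can be sketched with a toy scaler. `ToyMinMaxScaler` is hypothetical and operates on a plain list of floats. It is not the paddlets MinMaxScaler, and its handling of a constant column is an assumption made for the sketch.

```python
from typing import List


class ToyMinMaxScaler:
    """Hypothetical scaler mapping values to [0, 1]."""

    def fit(self, values: List[float]) -> "ToyMinMaxScaler":
        self.min_, self.max_ = min(values), max(values)
        return self

    def transform(self, values: List[float]) -> List[float]:
        span = self.max_ - self.min_
        # Assumption: a constant column maps to zeros to avoid dividing by zero.
        if span == 0:
            return [0.0 for _ in values]
        return [(v - self.min_) / span for v in values]

    def fit_transform(self, values: List[float]) -> List[float]:
        # Fit first, then apply the freshly fitted parameters to the same data.
        return self.fit(values).transform(values)


data = [10.0, 15.0, 20.0]
one_step = ToyMinMaxScaler().fit_transform(data)
two_step = ToyMinMaxScaler().fit(data).transform(data)
```

Both calls produce the same result. The combined method is a convenience for the common case where the data being fitted is also the data being transformed.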

inverse_transform(dataset: Union[TSDataset, List[TSDataset]], inplace: bool = False) → Union[TSDataset, List[TSDataset]]

Inversely transform the dataset output by the transform method.

Unlike the abstract methods of this class, this method is not decorated with abc.abstractmethod. Not every transformation can be inverted, so subclasses are not required to implement this method, and some cannot.

Other modules, such as Pipeline, may call this method without knowing whether the transformer instance implements it. For that reason the default implementation raises NotImplementedError instead of using a bare pass statement as a placeholder. Callers (e.g. Pipeline) can then use try-except to identify the transformers that do not implement this method.

Parameters
  • dataset (Union[TSDataset, List[TSDataset]]) – dataset to be inversely transformed.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

inversely transformed dataset.

Return type

Union[TSDataset, List[TSDataset]]

Raises

NotImplementedError
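The try-except pattern described above can be sketched as follows. None of these names come from paddlets: `ToyTransform`, `ToyShift`, `ToyClip` and `inverse_all` are hypothetical. Skipping a non-invertible step is an assumption about what a caller might do, not a statement of how Pipeline behaves.

```python
from typing import List, Sequence, Tuple


class ToyTransform:
    """Hypothetical base: inverse_transform raises by default."""

    def inverse_transform(self, values: List[float]) -> List[float]:
        raise NotImplementedError


class ToyShift(ToyTransform):
    """Invertible step: overrides inverse_transform."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def inverse_transform(self, values: List[float]) -> List[float]:
        return [v + self.offset for v in values]


class ToyClip(ToyTransform):
    """Non-invertible step: clipping loses information, so no override."""


def inverse_all(
    steps: Sequence[ToyTransform], values: List[float]
) -> Tuple[List[float], List[str]]:
    """Undo steps in reverse order, recording those that cannot be inverted."""
    skipped = []
    for step in reversed(steps):
        try:
            values = step.inverse_transform(values)
        except NotImplementedError:
            # Assumption: the caller skips steps without an inverse.
            skipped.append(type(step).__name__)
    return values, skipped


restored, skipped = inverse_all([ToyShift(1.0), ToyClip()], [0.0, 1.0])
```

A default that silently returned `None` would be indistinguishable from a real result. The exception makes the missing inverse explicit to the caller.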

inverse_transform_one(dataset: TSDataset, inplace: bool = False) → TSDataset

Inversely transform the dataset output by the transform method.

Unlike the abstract methods of this class, this method is not decorated with abc.abstractmethod. Not every transformation can be inverted, so subclasses are not required to implement this method, and some cannot.

Other modules, such as Pipeline, may call this method without knowing whether the transformer instance implements it. For that reason the default implementation raises NotImplementedError instead of using a bare pass statement as a placeholder. Callers (e.g. Pipeline) can then use try-except to identify the transformers that do not implement this method.

Parameters
  • dataset (TSDataset) – dataset to be inversely transformed.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

inversely transformed dataset.

Return type

TSDataset

Raises

NotImplementedError

class UdBaseTransform(ud_transformer: object, in_col_names: Optional[Union[str, List[str]]] = None, per_col_transform: bool = False, drop_origin_columns: bool = False, out_col_types: Optional[Union[str, List[str]]] = None, out_col_names: Optional[List[str]] = None)

Bases: BaseTransform

User-defined base transform.

Parameters
  • ud_transformer (object) – User-defined or third-party transformer object.

  • in_col_names (Optional[Union[str, List[str]]]) – Column name or names to be transformed.

  • per_col_transform (bool) – Whether each column is transformed independently. Default is False.

  • drop_origin_columns (bool) – Whether to drop the original columns. Default is False.

  • out_col_types (Optional[Union[str, List[str]]]) – The types of the output columns. None means the types are inferred automatically from the input.

  • out_col_names (Optional[List[str]]) – The names of the output columns. None means the names are inferred automatically from the input.
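The difference between joint and per-column transformation can be sketched with a toy wrapper. This is an interpretation of the `per_col_transform` parameter, not the paddlets implementation: `ToyMaxAbsScaler`, `ToyUdTransform` and the `Table` alias are hypothetical, a dict of columns stands in for a TSDataset, and `in_col_names` accepts only a list here for brevity.

```python
import copy
from typing import Dict, List, Optional

Table = Dict[str, List[float]]  # stand-in for TSDataset: column name -> values


class ToyMaxAbsScaler:
    """Hypothetical user-defined transformer exposing fit() and transform()."""

    def fit(self, columns: List[List[float]]) -> "ToyMaxAbsScaler":
        # Largest absolute value across all given columns (1.0 if all zeros).
        self.scale_ = max(abs(v) for col in columns for v in col) or 1.0
        return self

    def transform(self, columns: List[List[float]]) -> List[List[float]]:
        return [[v / self.scale_ for v in col] for col in columns]


class ToyUdTransform:
    """Hypothetical wrapper applying a user-defined transformer to columns."""

    def __init__(self, ud_transformer, in_col_names: Optional[List[str]] = None,
                 per_col_transform: bool = False) -> None:
        self.ud_transformer = ud_transformer
        self.in_col_names = in_col_names
        self.per_col_transform = per_col_transform

    def fit_transform(self, table: Table) -> Table:
        names = self.in_col_names or list(table)
        # One group per column when transforming independently, else one group.
        groups = [[n] for n in names] if self.per_col_transform else [names]
        out = dict(table)  # new dict; the input mapping is not modified
        for group in groups:
            transformer = copy.deepcopy(self.ud_transformer)  # fresh state per group
            columns = [table[n] for n in group]
            out.update(zip(group, transformer.fit(columns).transform(columns)))
        return out


table = {"a": [1.0, -2.0], "b": [4.0, 8.0]}
joint = ToyUdTransform(ToyMaxAbsScaler()).fit_transform(table)
per_col = ToyUdTransform(ToyMaxAbsScaler(), per_col_transform=True).fit_transform(table)
```

In the joint case both columns share one scale (8.0 here), while in the per-column case each column is scaled by its own maximum absolute value.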

fit_one(dataset: TSDataset)

Learn the parameters from the dataset needed by the transformer.

Parameters

dataset (TSDataset) – dataset from which to fit the transformer

Returns

self

transform_one(dataset: TSDataset, inplace: bool = False) → TSDataset

Transform or inverse_transform the dataset with the fitted transformer.

Parameters
  • dataset (TSDataset) – dataset to be transformed.

  • inplace (bool) – Whether to replace the original data. Default is False.

Returns

TSDataset

inverse_transform_one(dataset: TSDataset, inplace: bool = False) → TSDataset

Inversely transform the dataset output by the transform method.

Parameters
  • dataset (TSDataset) – dataset to be inversely transformed.

  • inplace (bool) – Set to True to perform inplace operation and avoid data copy.

Returns

Inversely transformed TSDataset.

Return type

TSDataset