paddlets.transform.sklearn_transforms

class OneHot(cols: Union[str, List[str]], dtype: object = numpy.float64, handle_unknown: str = 'error', categories: Union[str, List[str]] = 'auto', drop: bool = False)[source]

Bases: SklearnTransformWrapper

Transform categorical columns with OneHot encoder.

Parameters
  • cols (str|List) – Column(s) to be encoded.

  • handle_unknown (str) – How to handle categories that were not seen during fitting, one of {‘error’, ‘ignore’}, default=‘error’.

  • drop (bool) – Whether to delete the original column, default=False

  • dtype (object) – Data type of the encoded values, default=numpy.float64.

  • categories (str|List) – ‘auto’ or a list of array-likes, default=‘auto’. If categories is ‘auto’, the categories are determined automatically from the dataset.

Returns

None
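To make the encoding itself concrete, the following is a minimal pure-Python sketch of one-hot encoding a single column. It does not use PaddleTS: the helper `one_hot` is hypothetical, and the assumption that ‘ignore’ encodes an unseen category as an all-zero row follows the scikit-learn convention rather than anything stated above.

```python
from typing import Hashable, List, Optional, Sequence


def one_hot(
    values: Sequence[Hashable],
    categories: Optional[List[Hashable]] = None,
    handle_unknown: str = "error",
) -> List[List[float]]:
    """Hypothetical helper: one-hot encode a single column of values."""
    # Mimic categories='auto': derive the sorted set of categories from the data.
    if categories is None:
        categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}

    rows = []
    for value in values:
        row = [0.0] * len(categories)
        if value in index:
            row[index[value]] = 1.0
        elif handle_unknown == "error":
            raise ValueError(f"unknown category: {value!r}")
        # Assumption: with handle_unknown='ignore' the row stays all zeros.
        rows.append(row)
    return rows


# Categories are inferred as ['green', 'red'], one output column per category.
encoded = one_hot(["red", "green", "red"])
```

Each input value becomes one row with a single 1.0 in the column of its category.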

class Ordinal(cols: Union[str, List[str]], dtype: dtype = dtype('float64'), categories: Union[str, List[str]] = 'auto', unknown_value: Union[None, int] = None, handle_unknown: str = 'error', drop: bool = False)[source]

Bases: SklearnTransformWrapper

Encode categorical features as an integer array.

Parameters
  • cols (str|List) – Name(s) of the column(s) to encode.

  • handle_unknown (str) – How to handle categories that were not seen during fitting, one of {‘error’, ‘use_encoded_value’}, default=‘error’.

  • drop (bool) – Whether to delete the original column, default=False.

  • dtype (object) – Numeric type of the encoded values, default=float64.

  • unknown_value (int|None) – int or np.nan, default=None. The encoded value assigned to unknown categories when handle_unknown is ‘use_encoded_value’.

  • categories (str|List) – ‘auto’ or a list of array-likes, default=‘auto’. If categories is ‘auto’, the categories are determined automatically from the training data. If categories is a list, categories[i] holds the categories expected in the i-th column. The passed categories should not mix strings and numeric values, and should be sorted in the case of numeric values.

Returns

None
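As an illustration of what ordinal encoding computes, here is a minimal pure-Python sketch for one column. The helper `ordinal_encode` is hypothetical and not part of PaddleTS; it assumes, following the scikit-learn convention, that each category is mapped to its position in the category list and that `unknown_value` must be supplied when `handle_unknown` is ‘use_encoded_value’.

```python
from typing import Hashable, List, Optional, Sequence


def ordinal_encode(
    values: Sequence[Hashable],
    categories: Optional[List[Hashable]] = None,
    handle_unknown: str = "error",
    unknown_value: Optional[float] = None,
) -> List[float]:
    """Hypothetical helper: map each category to its index in `categories`."""
    # Mimic categories='auto': derive the sorted set of categories from the data.
    if categories is None:
        categories = sorted(set(values))
    index = {cat: float(i) for i, cat in enumerate(categories)}

    encoded = []
    for value in values:
        if value in index:
            encoded.append(index[value])
        elif handle_unknown == "use_encoded_value":
            # Assumption: an explicit unknown_value is required in this mode.
            if unknown_value is None:
                raise ValueError("unknown_value must be set")
            encoded.append(float(unknown_value))
        else:
            raise ValueError(f"unknown category: {value!r}")
    return encoded


levels = ["low", "mid", "high"]
codes = ordinal_encode(["low", "high", "mid", "low"], categories=levels)
unseen = ordinal_encode(
    ["huge"], categories=levels,
    handle_unknown="use_encoded_value", unknown_value=-1,
)
```

Passing an explicit category list fixes the integer assigned to each category, which matters when the order carries meaning.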

fit(dataset: Union[TSDataset, List[TSDataset]])[source]

Learn the parameters from the dataset needed by the transformer.

Any non-abstract class inherited from this class should implement this method.

The parameters fitted by this method are transformer-specific. For example, the MinMaxScaler needs to compute the minimum and maximum, and the StandardScaler needs to compute the mean and standard deviation from the dataset.

Parameters

dataset (Union[TSDataset, List[TSDataset]]) – dataset from which to fit the transformer.

transform(dataset: Union[TSDataset, List[TSDataset]], inplace: bool = False) → Union[TSDataset, List[TSDataset]][source]

Apply the fitted transformer to the dataset.

Any non-abstract class inherited from this class should implement this method.

Parameters
  • dataset (Union[TSDataset, List[TSDataset]]) – dataset to be transformed.

  • inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

transformed dataset.

Return type

Union[TSDataset, List[TSDataset]]
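The fit/transform contract and the effect of `inplace` can be sketched without PaddleTS. In the example below, the class `MeanCenter` is hypothetical and a plain dict of lists stands in for a TSDataset; returning `self` from `fit` is a convenience of this sketch, not something the documentation above specifies.

```python
import copy
from typing import Dict, List, Optional

Dataset = Dict[str, List[float]]  # stand-in for TSDataset, for illustration only


class MeanCenter:
    """Hypothetical transformer that follows the fit/transform contract."""

    def __init__(self, col: str) -> None:
        self.col = col
        self.mean_: Optional[float] = None

    def fit(self, dataset: Dataset) -> "MeanCenter":
        # Learn the transformer-specific parameter (here: the column mean).
        values = dataset[self.col]
        self.mean_ = sum(values) / len(values)
        return self  # assumption of this sketch, to allow chaining

    def transform(self, dataset: Dataset, inplace: bool = False) -> Dataset:
        if self.mean_ is None:
            raise RuntimeError("fit must be called before transform")
        # inplace=False works on a copy so the caller's data is untouched.
        target = dataset if inplace else copy.deepcopy(dataset)
        target[self.col] = [v - self.mean_ for v in target[self.col]]
        return target


data = {"load": [1.0, 2.0, 3.0]}
transformer = MeanCenter("load").fit(data)
centered = transformer.transform(data)           # data is left unchanged
same_obj = transformer.transform(data, inplace=True)  # data is modified
```

With `inplace=False` the result is a new object; with `inplace=True` the returned object is the input itself, now modified.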

class MinMaxScaler(cols: Optional[Union[str, List[str]]] = None, f_range: tuple = (0, 1), clip: bool = False)[source]

Bases: SklearnTransformWrapper

Transform a dataset by scaling the values of the specified column(s) to the expected range: [min, max].

The transformation is done by:

X_std = (X - X.min) / (X.max - X.min)

X_scaled = X_std * (max - min) + min

Parameters
  • cols (str|List) – Column name(s) to be scaled.

  • f_range (tuple) – Desired range (min, max) of the transformed values, default=(0, 1).

  • clip (bool) – Set to True to clip transformed values of held-out data to provided feature range.

Returns

None
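The two formulas above, together with the `clip` option, can be worked through in a short pure-Python sketch. The helpers `fit_min_max` and `min_max_scale` are hypothetical and operate on a plain list rather than a TSDataset; mapping a constant column to the lower bound is an assumption of this sketch.

```python
from typing import List, Sequence, Tuple


def fit_min_max(train: Sequence[float]) -> Tuple[float, float]:
    """Hypothetical helper: learn X.min and X.max from the training values."""
    return min(train), max(train)


def min_max_scale(
    values: Sequence[float],
    data_min: float,
    data_max: float,
    f_range: Tuple[float, float] = (0.0, 1.0),
    clip: bool = False,
) -> List[float]:
    """Hypothetical helper: apply the min-max formulas to a list of values."""
    lo, hi = f_range
    span = data_max - data_min
    scaled = []
    for x in values:
        # X_std = (X - X.min) / (X.max - X.min)
        # Assumption: a constant column (span == 0) maps to the lower bound.
        x_std = (x - data_min) / span if span else 0.0
        # X_scaled = X_std * (max - min) + min
        x_scaled = x_std * (hi - lo) + lo
        if clip:
            # Keep held-out values inside the requested range.
            x_scaled = min(max(x_scaled, lo), hi)
        scaled.append(x_scaled)
    return scaled


data_min, data_max = fit_min_max([10.0, 20.0, 30.0])
in_sample = min_max_scale([10.0, 20.0, 30.0], data_min, data_max)
held_out = min_max_scale([40.0], data_min, data_max)
held_out_clipped = min_max_scale([40.0], data_min, data_max, clip=True)
```

Without clipping, a held-out value above the fitted maximum lands outside the target range (here 1.5); with `clip=True` it is capped at the upper bound.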

class StandardScaler(cols: Optional[Union[str, List[str]]] = None, with_mean: bool = True, with_std: bool = True)[source]

Bases: SklearnTransformWrapper

Transform a dataset by scaling the values of the specified column(s) to zero mean and unit variance.

The transformation is done by: z = (x - u) / s,

where u is the mean of the column (or zero if with_mean=False) and s is its standard deviation (or one if with_std=False).

Parameters
  • cols (str|List) – Column name or names to be scaled.

  • with_mean (bool) – If True, center the data before scaling.

  • with_std (bool) – If True, scale the data to unit variance.

Returns

None
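The standardization formula and the effect of `with_mean` and `with_std` can be sketched in pure Python. The helpers `fit_standard` and `standard_scale` are hypothetical; the sketch assumes the population standard deviation (divisor n), which is the scikit-learn convention, and treats a zero standard deviation as one to avoid division by zero.

```python
import math
from typing import List, Sequence, Tuple


def fit_standard(train: Sequence[float]) -> Tuple[float, float]:
    """Hypothetical helper: learn the mean and standard deviation."""
    n = len(train)
    mean = sum(train) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in train) / n)
    return mean, std


def standard_scale(
    values: Sequence[float],
    mean: float,
    std: float,
    with_mean: bool = True,
    with_std: bool = True,
) -> List[float]:
    """Hypothetical helper: apply z = (x - u) / s to a list of values."""
    u = mean if with_mean else 0.0
    # Assumption: fall back to s = 1 for a constant column (std == 0).
    s = std if (with_std and std) else 1.0
    return [(x - u) / s for x in values]


mean, std = fit_standard([2.0, 4.0, 6.0])
z = standard_scale([2.0, 4.0, 6.0], mean, std)
centered_only = standard_scale([2.0, 4.0, 6.0], mean, std, with_std=False)
```

With `with_std=False` the values are only centered, so the example yields -2.0, 0.0 and 2.0.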