paddlets.transform.sklearn_transforms

class OneHot(cols: Union[str, List[str]], dtype: object = <class 'numpy.float64'>, handle_unknown: str = 'error', categories: Union[str, List[str]] = 'auto', drop: bool = False)

Bases: SklearnTransformWrapper

Transform categorical columns with OneHot encoder.

Parameters

cols (str|List) – Column(s) to be encoded.
handle_unknown (str) – {'error', 'ignore'}, default='error'.
drop (bool) – Whether to delete the original column, default=False.
dtype (object) – Data type of the encoded output, default=numpy.float64.
categories (str|List) – 'auto' or a list of array-likes, default='auto'. If categories is 'auto', the categories are determined automatically from the dataset.

Returns

None
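To make the handle_unknown options concrete, here is a minimal plain-Python sketch of one-hot encoding. It is not the PaddleTS implementation: fit_categories and one_hot are hypothetical helper names, and the all-zero row for unknown values under 'ignore' follows the scikit-learn convention that this wrapper is assumed to inherit.

```python
def fit_categories(values):
    """Collect the sorted distinct categories seen while fitting ('auto' mode)."""
    return sorted(set(values))


def one_hot(values, categories, handle_unknown="error"):
    """Encode each value as a row of 0.0/1.0 with one slot per category."""
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0.0] * len(categories)
        if value in index:
            row[index[value]] = 1.0
        elif handle_unknown == "error":
            raise ValueError(f"unknown category: {value!r}")
        # handle_unknown == "ignore": the row stays all zeros (assumed behavior)
        rows.append(row)
    return rows


categories = fit_categories(["sun", "rain", "sun", "snow"])
encoded = one_hot(["rain", "sun", "fog"], categories, handle_unknown="ignore")
```

Here "fog" was never seen during fitting, so with 'ignore' it encodes to a row of zeros, while the default 'error' raises instead.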

class Ordinal(cols: Union[str, List[str]], dtype: dtype = dtype('float64'), categories: Union[str, List[str]] = 'auto', unknown_value: Union[None, int] = None, handle_unknown: str = 'error', drop: bool = False)

Bases: SklearnTransformWrapper

Encode categorical features as an integer array.

Parameters

cols (str|List) – Name(s) of the column(s) to encode.
handle_unknown (str) – {'error', 'use_encoded_value'}, default='error'.
drop (bool) – Whether to delete the original column, default=False.
dtype (object) – Numeric type of the encoded output, default=numpy.float64.
unknown_value (int|None) – int or np.nan, default=None.
categories (str|List) – 'auto' or a list of array-likes, default='auto'. If categories is 'auto', the categories are determined automatically from the training data. If categories is a list, categories[i] holds the categories expected in the ith column. The passed categories should not mix strings and numeric values, and should be sorted in the case of numeric values.

Returns

None
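As an illustration of how categories, handle_unknown and unknown_value interact, the following plain-Python sketch encodes a single column. ordinal_encode is a hypothetical helper rather than part of PaddleTS, and mapping unseen values to unknown_value under 'use_encoded_value' is assumed to mirror scikit-learn's OrdinalEncoder.

```python
def ordinal_encode(values, categories, handle_unknown="error", unknown_value=None):
    """Map each value to the position of its category, as a float code."""
    index = {cat: float(i) for i, cat in enumerate(categories)}
    codes = []
    for value in values:
        if value in index:
            codes.append(index[value])
        elif handle_unknown == "use_encoded_value":
            # Unseen categories receive the caller-supplied placeholder code.
            codes.append(unknown_value)
        else:
            raise ValueError(f"unknown category: {value!r}")
    return codes


# An explicit category list fixes the code order: low=0, mid=1, high=2.
categories = ["low", "mid", "high"]
codes = ordinal_encode(
    ["mid", "low", "huge"],
    categories,
    handle_unknown="use_encoded_value",
    unknown_value=-1,
)
```

Choosing a placeholder such as -1 that cannot collide with a real code keeps unseen categories distinguishable downstream.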

fit(dataset: Union[TSDataset, List[TSDataset]])

Learn the parameters from the dataset needed by the transformer.

Any non-abstract class inherited from this class should implement this method.

The parameters fitted by this method are transformer-specific. For example, the MinMaxScaler needs to compute the minimum and maximum, and the StandardScaler needs to compute the mean and standard deviation from the dataset.

Parameters: dataset (Union[TSDataset, List[TSDataset]]) – dataset from which to fit the transformer.

transform(dataset: Union[TSDataset, List[TSDataset]], inplace: bool = False) → Union[TSDataset, List[TSDataset]]

Apply the fitted transformer to the dataset.

Any non-abstract class inherited from this class should implement this method.

Parameters

dataset (Union[TSDataset, List[TSDataset]]) – Dataset to be transformed.
inplace (bool, optional) – Set to True to perform inplace transformation. Default is False.

Returns

The transformed dataset.

Return type

Union[TSDataset, List[TSDataset]]
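The fit-then-transform contract described above can be sketched with a toy transformer. Everything here is illustrative: MeanCenter is a hypothetical class, a plain dict of lists stands in for TSDataset, and returning a deep copy when inplace=False is an assumption about what "inplace" means rather than a statement about the PaddleTS internals.

```python
import copy


class MeanCenter:
    """Hypothetical transformer that subtracts the fitted mean from one column."""

    def __init__(self, col):
        self.col = col
        self.mean_ = None  # learned by fit()

    def fit(self, dataset):
        # Learn the transformer-specific parameter (here: the column mean).
        values = dataset[self.col]
        self.mean_ = sum(values) / len(values)

    def transform(self, dataset, inplace=False):
        if self.mean_ is None:
            raise RuntimeError("call fit() before transform()")
        # Work on the caller's object only when inplace=True.
        target = dataset if inplace else copy.deepcopy(dataset)
        target[self.col] = [v - self.mean_ for v in target[self.col]]
        return target


train = {"load": [1.0, 2.0, 3.0]}
transformer = MeanCenter("load")
transformer.fit(train)
centered = transformer.transform(train)  # train itself is left untouched
```

Calling transform(train, inplace=True) in this sketch would instead overwrite train and return the same object.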

class MinMaxScaler(cols: Optional[Union[str, List[str]]] = None, f_range: tuple = (0, 1), clip: bool = False)

Bases: SklearnTransformWrapper

Transform a dataset by scaling the values of the specified column(s) to the expected range [min, max], where (min, max) is given by f_range.

The transformation is done by:

X_std = (X - X.min) / (X.max - X.min)

X_scaled = X_std * (max - min) + min

Parameters

cols (str|List) – Column name(s) to be scaled.
f_range (tuple) – Desired range (min, max) of the transformed values, default=(0, 1).
clip (bool) – Set to True to clip transformed values of held-out data to the provided feature range, default=False.

Returns

None
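The two formulas above can be checked by hand with a small plain-Python sketch. fit_min_max and min_max_scale are hypothetical helpers, not the PaddleTS API, and the sketch assumes a non-constant column (X.max differs from X.min) so the division is defined.

```python
def fit_min_max(values):
    """Learn X.min and X.max from the training column."""
    return min(values), max(values)


def min_max_scale(values, data_min, data_max, f_range=(0, 1), clip=False):
    """Apply X_std = (X - X.min) / (X.max - X.min), then rescale to f_range."""
    lo, hi = f_range
    scaled = []
    for x in values:
        x_std = (x - data_min) / (data_max - data_min)
        x_scaled = x_std * (hi - lo) + lo
        if clip:
            # Keep held-out values inside [lo, hi].
            x_scaled = min(max(x_scaled, lo), hi)
        scaled.append(x_scaled)
    return scaled


train = [10.0, 20.0, 30.0]
data_min, data_max = fit_min_max(train)
train_scaled = min_max_scale(train, data_min, data_max)
held_out = min_max_scale([40.0], data_min, data_max)
held_out_clipped = min_max_scale([40.0], data_min, data_max, clip=True)
```

The held-out value 40.0 lies outside the fitted range, so it scales to 1.5 unless clip=True pulls it back to the upper bound.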

class StandardScaler(cols: Optional[Union[str, List[str]]] = None, with_mean: bool = True, with_std: bool = True)

Bases: SklearnTransformWrapper

Transform a dataset by scaling the values of the specified column(s) to zero mean and unit variance.

The transformation is done by: z = (x - u) / s.

where u is the mean of the fitted data, or zero if with_mean=False, and s is the standard deviation of the fitted data, or one if with_std=False.

Parameters

cols (str|List) – Column name(s) to be scaled.
with_mean (bool) – If True, center the data before scaling, default=True.
with_std (bool) – If True, scale the data to unit variance, default=True.

Returns

None
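A short plain-Python sketch of z = (x - u) / s shows the effect of the two flags. fit_mean_std and standard_scale are hypothetical helpers, and the use of the population standard deviation (dividing by n, as scikit-learn's StandardScaler does) is an assumption about the wrapped behavior.

```python
import math


def fit_mean_std(values):
    """Learn the mean and the population standard deviation of a column."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, math.sqrt(variance)


def standard_scale(values, mean, std, with_mean=True, with_std=True):
    """Apply z = (x - u) / s, with u = 0 or s = 1 when the flag is off."""
    u = mean if with_mean else 0.0
    s = std if with_std else 1.0
    return [(x - u) / s for x in values]


train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = fit_mean_std(train)  # mean 5.0, std 2.0 for this data
z = standard_scale(train, mean, std)
scaled_only = standard_scale(train, mean, std, with_mean=False)
```

With with_mean=False the values are only divided by the standard deviation, which preserves zeros and can be preferable for sparse columns.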