paddlets.transform.sklearn_transforms

class OneHot(cols: Union[str, List[str]], dtype: object = numpy.float64, handle_unknown: str = 'error', categories: Union[str, List[str]] = 'auto', drop: bool = False)

Bases: SklearnTransformWrapper

Transform categorical columns with a one-hot encoder.

Parameters
  • cols (str|List) – Column(s) to be encoded.

  • handle_unknown (str) – How to handle categories not seen during fitting: {'error', 'ignore'}, default='error'.

  • drop (bool) – Whether to delete the original column, default=False

  • dtype (object) – Data type of the encoded output, default=numpy.float64.

  • categories (str|List) – 'auto' or a list of array-like, default='auto'. If categories is 'auto', the categories are determined automatically from the dataset.

Returns

None
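
The effect of handle_unknown and categories can be illustrated with a minimal sketch in plain Python. This is not the paddlets implementation: the helper one_hot and the sample data are hypothetical, and the all-zeros row for 'ignore' assumes the behaviour of the scikit-learn encoder that this class appears to wrap.

```python
from typing import List, Sequence


def one_hot(values: Sequence[str], categories: Sequence[str],
            handle_unknown: str = "error") -> List[List[float]]:
    """One-hot encode `values` against a fixed, ordered list of categories.

    Hypothetical helper for illustration only.
    """
    if handle_unknown not in ("error", "ignore"):
        raise ValueError("handle_unknown must be 'error' or 'ignore'")
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0.0] * len(categories)
        if value in index:
            row[index[value]] = 1.0
        elif handle_unknown == "error":
            raise ValueError(f"unknown category: {value!r}")
        # handle_unknown == "ignore": the row stays all zeros (assumed behaviour)
        rows.append(row)
    return rows


# categories='auto' means the categories are learned from the data;
# this sketch uses the sorted unique values.
train = ["rain", "sun", "rain", "snow"]
cats = sorted(set(train))       # ['rain', 'snow', 'sun']
encoded = one_hot(train, cats)  # first row: [1.0, 0.0, 0.0]
```

Each input column becomes one output column per category, which is why a wide category set noticeably increases the number of columns.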

class Ordinal(cols: Union[str, List[str]], dtype: dtype = dtype('float64'), categories: Union[str, List[str]] = 'auto', unknown_value: Union[None, int] = None, handle_unknown: str = 'error', drop: bool = False)

Bases: SklearnTransformWrapper

Encode categorical features as an integer array.

Parameters
  • cols (str|List) – Name(s) of the column(s) to encode.

  • handle_unknown (str) – How to handle categories not seen during fitting: {'error', 'use_encoded_value'}, default='error'.

  • drop (bool) – Whether to delete the original column, default=False.

  • dtype (object) – Numeric type of the encoded output, default=float64.

  • unknown_value (int|None) – int or np.nan, default=None. Encoded value assigned to unknown categories when handle_unknown='use_encoded_value'.

  • categories (str|List) – 'auto' or a list of array-like, default='auto'. If categories is 'auto', the categories are determined automatically from the training data. If categories is a list, categories[i] holds the categories expected in the i-th column. The passed categories should not mix strings and numeric values, and should be sorted in the case of numeric values.

Returns

None
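
The interplay of categories, handle_unknown and unknown_value can be sketched in plain Python as follows. The helper ordinal_encode and the sample data are hypothetical and only illustrate the idea; they are not the library's code.

```python
from typing import List, Optional, Sequence


def ordinal_encode(values: Sequence[str], categories: Sequence[str],
                   handle_unknown: str = "error",
                   unknown_value: Optional[float] = None) -> List[float]:
    """Map each value to the position of its category (hypothetical helper)."""
    index = {cat: float(i) for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        if value in index:
            encoded.append(index[value])
        elif handle_unknown == "use_encoded_value":
            if unknown_value is None:
                # Assumption of this sketch: a replacement value is required.
                raise ValueError("unknown_value must be set")
            encoded.append(float(unknown_value))
        else:
            raise ValueError(f"unknown category: {value!r}")
    return encoded


# An explicit category list fixes the integer assigned to each category.
cats = ["low", "mid", "high"]
codes = ordinal_encode(["mid", "low", "high"], cats)  # [1.0, 0.0, 2.0]

# Unseen categories are replaced by unknown_value instead of raising.
held_out = ordinal_encode(["mid", "huge"], cats,
                          handle_unknown="use_encoded_value",
                          unknown_value=-1)           # [1.0, -1.0]
```

Choosing an unknown_value outside the range of the fitted codes (for example -1) keeps unseen categories distinguishable from known ones.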

class MinMaxScaler(cols: Optional[Union[str, List[str]]] = None, f_range: tuple = (0, 1), clip: bool = False)

Bases: SklearnTransformWrapper

Transform a dataset by scaling the values of the specified column(s) to the expected range: [min, max].

The transformation is done by:

X_std = (X - X.min) / (X.max - X.min)

X_scaled = X_std * (max - min) + min

Parameters
  • cols (str|List) – Column name(s) to be scaled.

  • f_range (tuple) – Desired range of the transformed values as a tuple (min, max), default=(0, 1).

  • clip (bool) – Set to True to clip transformed values of held-out data to the provided feature range, default=False.

Returns

None
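
The two formulas above, together with the clip option, can be worked through in a short plain-Python sketch. The function min_max_scale is a hypothetical stand-in, not the paddlets class; the minimum and maximum are taken from the data used for fitting, and rejecting a constant column is a simplification of this sketch.

```python
from typing import List, Sequence, Tuple


def min_max_scale(values: Sequence[float], fit_values: Sequence[float],
                  f_range: Tuple[float, float] = (0.0, 1.0),
                  clip: bool = False) -> List[float]:
    """Scale `values` using the min and max of `fit_values` (hypothetical helper)."""
    lo, hi = f_range
    x_min, x_max = min(fit_values), max(fit_values)
    span = x_max - x_min
    if span == 0:
        # Simplification: a constant column has no range to scale.
        raise ValueError("fit_values must contain at least two distinct values")
    scaled = []
    for x in values:
        x_std = (x - x_min) / span          # X_std = (X - X.min) / (X.max - X.min)
        x_scaled = x_std * (hi - lo) + lo   # X_scaled = X_std * (max - min) + min
        if clip:
            x_scaled = min(max(x_scaled, lo), hi)
        scaled.append(x_scaled)
    return scaled


train = [10.0, 20.0, 30.0]
in_range = min_max_scale(train, train)                  # [0.0, 0.5, 1.0]
symmetric = min_max_scale(train, train, (-1.0, 1.0))    # [-1.0, 0.0, 1.0]

# Held-out data can fall outside the fitted range unless clip=True.
unclipped = min_max_scale([40.0], train)                # [1.5]
clipped = min_max_scale([40.0], train, clip=True)       # [1.0]
```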

class StandardScaler(cols: Optional[Union[str, List[str]]] = None, with_mean: bool = True, with_std: bool = True)

Bases: SklearnTransformWrapper

Transform a dataset by scaling the values of the specified column(s) to zero mean and unit variance.

The transformation is done by:

z = (x - u) / s

where u is the mean, or zero if with_mean=False, and s is the standard deviation, or one if with_std=False.

Parameters
  • cols (str|List) – Column name or names to be scaled.

  • with_mean (bool) – If True, center the data before scaling.

  • with_std (bool) – If True, scale the data to unit variance.

Returns

None
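
The formula z = (x - u) / s and the two flags can be checked with a small plain-Python sketch. The function standard_scale is hypothetical and is not the paddlets class. It assumes the population standard deviation (the convention scikit-learn's StandardScaler uses), and falling back to s = 1 for a constant column is a choice made for this sketch.

```python
import statistics
from typing import List, Sequence


def standard_scale(values: Sequence[float], fit_values: Sequence[float],
                   with_mean: bool = True, with_std: bool = True) -> List[float]:
    """Apply z = (x - u) / s with u and s estimated from `fit_values`."""
    u = statistics.fmean(fit_values) if with_mean else 0.0
    s = statistics.pstdev(fit_values) if with_std else 1.0
    if s == 0:
        s = 1.0  # sketch-level choice: avoid dividing by zero for constant data
    return [(x - u) / s for x in values]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean 5, population std 2
z = standard_scale(data, data)
# z is approximately [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

centered_only = standard_scale(data, data, with_std=False)   # x - 5
scaled_only = standard_scale(data, data, with_mean=False)    # x / 2
```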