paddlets.transform.sklearn_transforms
- class OneHot(cols: Union[str, List[str]], dtype: object = <class 'numpy.float64'>, handle_unknown: str = 'error', categories: Union[str, List[str]] = 'auto', drop: bool = False)[source]
Bases: SklearnTransformWrapper
Transform categorical columns with a one-hot encoder.
- Parameters
cols (str|List) – Column(s) to be encoded.
handle_unknown (str) – How to treat categories that were not seen during fitting: one of {‘error’, ‘ignore’}, default=‘error’.
drop (bool) – Whether to delete the original column, default=False.
dtype (object) – Data type of the encoded output, default=numpy.float64.
categories (str|List) – ‘auto’ or a list of array-likes, default=‘auto’. If categories is ‘auto’, the categories are determined automatically from the dataset.
- Returns
None
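The encoding described above can be illustrated with a small pure-Python sketch. It does not call PaddleTS; `fit_categories` and `one_hot` are hypothetical helpers, and the behaviour shown for `handle_unknown='ignore'` (an all-zero row) is assumed to follow the scikit-learn encoder that this class wraps.

```python
from typing import Dict, List, Sequence


def fit_categories(values: Sequence[str]) -> List[str]:
    """Mimic categories='auto': collect the sorted unique values seen during fit."""
    return sorted(set(values))


def one_hot(values: Sequence[str], categories: List[str],
            handle_unknown: str = "error") -> List[List[float]]:
    """Encode each value as a row with a single 1.0 at its category's position."""
    index: Dict[str, int] = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0.0] * len(categories)
        if value in index:
            row[index[value]] = 1.0
        elif handle_unknown == "error":
            raise ValueError(f"unknown category: {value!r}")
        # Otherwise (assumed 'ignore' behaviour): keep the all-zero row.
        rows.append(row)
    return rows


train = ["sunny", "rainy", "sunny", "cloudy"]
cats = fit_categories(train)
encoded = one_hot(["rainy", "snowy"], cats, handle_unknown="ignore")
print(cats)
print(encoded)
```

With `handle_unknown='error'`, the unseen value `"snowy"` would raise instead of producing a row of zeros.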
- class Ordinal(cols: Union[str, List[str]], dtype: dtype = dtype('float64'), categories: Union[str, List[str]] = 'auto', unknown_value: Union[None, int] = None, handle_unknown: str = 'error', drop: bool = False)[source]
Bases: SklearnTransformWrapper
Encode categorical features as an integer array.
- Parameters
cols (str|List) – Name(s) of the column(s) to encode.
handle_unknown (str) – How to treat categories that were not seen during fitting: one of {‘error’, ‘use_encoded_value’}, default=‘error’.
drop (bool) – Whether to delete the original column, default=False.
dtype (object) – Numeric type of the encoded output, default=float64.
unknown_value (int|None) – An int or np.nan, default=None.
categories (str|List) – ‘auto’ or a list of array-likes, default=‘auto’. If categories is ‘auto’, the categories are determined automatically from the training data. If categories is a list, categories[i] holds the categories expected in the ith column. The passed categories should not mix strings and numeric values, and numeric values should be sorted.
- Returns
None
- fit(dataset: Union[TSDataset, List[TSDataset]])[source]
Learn the parameters from the dataset needed by the transformer.
Any non-abstract class inherited from this class should implement this method.
The parameters fitted by this method are transformer-specific. For example, the MinMaxScaler needs to compute the MIN and MAX, and the StandardScaler needs to compute the MEAN and STD (standard deviation) from the dataset.
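For the Ordinal encoder, the parameters learned by fit are the category-to-integer mapping. The sketch below shows that fit/transform split in pure Python. `OrdinalSketch` is a hypothetical stand-in, not the PaddleTS class, and it assumes the scikit-learn convention that unseen categories are replaced by `unknown_value` when `handle_unknown='use_encoded_value'`.

```python
from typing import Dict, List, Optional, Sequence


class OrdinalSketch:
    """Hypothetical stand-in illustrating ordinal encoding; not the PaddleTS class."""

    def __init__(self, handle_unknown: str = "error",
                 unknown_value: Optional[float] = None) -> None:
        self.handle_unknown = handle_unknown
        self.unknown_value = unknown_value
        self.mapping: Dict[str, int] = {}

    def fit(self, values: Sequence[str]) -> "OrdinalSketch":
        # categories='auto': sorted unique values from the training data.
        self.mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
        return self

    def transform(self, values: Sequence[str]) -> List[Optional[float]]:
        out: List[Optional[float]] = []
        for value in values:
            if value in self.mapping:
                out.append(float(self.mapping[value]))
            elif self.handle_unknown == "use_encoded_value":
                # Assumed behaviour: substitute the configured placeholder.
                out.append(self.unknown_value)
            else:
                raise ValueError(f"unknown category: {value!r}")
        return out


encoder = OrdinalSketch(handle_unknown="use_encoded_value", unknown_value=-1.0)
encoder.fit(["low", "high", "medium", "low"])
codes = encoder.transform(["medium", "low", "extreme"])
print(encoder.mapping)
print(codes)
```

Because the mapping is fixed at fit time, the same integer codes are reused for any later dataset passed to transform.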
- class MinMaxScaler(cols: Optional[Union[str, List[str]]] = None, f_range: tuple = (0, 1), clip: bool = False)[source]
Bases: SklearnTransformWrapper
Transform a dataset by scaling the values of the specified column(s) to the expected range: [min, max].
The transformation is done by:
X_std = (X - X.min) / (X.max - X.min)
X_scaled = X_std * (max - min) + min
- Parameters
cols (str|List) – Column name(s) to be scaled.
f_range (tuple) – Desired range of the transformed values, given as a tuple (min, max), default=(0, 1).
clip (bool) – Set to True to clip transformed values of held-out data to the provided feature range, default=False.
- Returns
None
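The two-step formula above can be checked by hand with a short pure-Python sketch. `fit_min_max` and `min_max_scale` are hypothetical helpers rather than PaddleTS functions, and the sketch assumes the fitted maximum is strictly greater than the fitted minimum.

```python
from typing import List, Sequence, Tuple


def fit_min_max(values: Sequence[float]) -> Tuple[float, float]:
    """Learn X.min and X.max from the training values."""
    return min(values), max(values)


def min_max_scale(values: Sequence[float], data_min: float, data_max: float,
                  f_range: Tuple[float, float] = (0.0, 1.0),
                  clip: bool = False) -> List[float]:
    """Apply X_std = (X - X.min) / (X.max - X.min), then rescale to f_range."""
    low, high = f_range
    span = data_max - data_min  # assumed non-zero in this sketch
    scaled = []
    for x in values:
        x_std = (x - data_min) / span
        x_scaled = x_std * (high - low) + low
        if clip:
            # Keep held-out values inside the requested range.
            x_scaled = min(max(x_scaled, low), high)
        scaled.append(x_scaled)
    return scaled


data_min, data_max = fit_min_max([10.0, 20.0, 30.0])
held_out = [10.0, 20.0, 30.0, 40.0]
unclipped = min_max_scale(held_out, data_min, data_max)
clipped = min_max_scale(held_out, data_min, data_max, clip=True)
print(unclipped)
print(clipped)
```

The held-out value 40.0 lies outside the fitted range, so it maps above the upper bound unless clip is enabled.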
- class StandardScaler(cols: Optional[Union[str, List[str]]] = None, with_mean: bool = True, with_std: bool = True)[source]
Bases: SklearnTransformWrapper
Transform a dataset by scaling the values of the specified column(s) to zero mean and unit variance.
The transformation is done by: z = (x - u) / s,
where u is the MEAN, or zero if with_mean=False, and s is the standard deviation, or one if with_std=False.
- Parameters
cols (str|List) – Column name or names to be scaled.
with_mean (bool) – If True, center the data before scaling.
with_std (bool) – If True, scale the data to unit variance.
- Returns
None
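The formula z = (x - u) / s and the effect of the two flags can be sketched in pure Python. `fit_mean_std` and `standardize` are hypothetical helpers, not PaddleTS functions, and the sketch assumes the population standard deviation (dividing by n), which is the convention scikit-learn's scaler uses.

```python
import math
from typing import List, Sequence, Tuple


def fit_mean_std(values: Sequence[float]) -> Tuple[float, float]:
    """Learn the mean u and the population standard deviation s."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, math.sqrt(variance)


def standardize(values: Sequence[float], mean: float, std: float,
                with_mean: bool = True, with_std: bool = True) -> List[float]:
    """Apply z = (x - u) / s, with u = 0 or s = 1 when a flag is disabled."""
    u = mean if with_mean else 0.0
    s = std if with_std else 1.0
    return [(x - u) / s for x in values]


train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = fit_mean_std(train)
z_scores = standardize(train, mean, std)
centered_only = standardize(train, mean, std, with_std=False)
print(mean, std)
print(z_scores)
```

Disabling with_std leaves the values centered but in their original units, while disabling with_mean only divides by the standard deviation.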