Mapping
- class getml.preprocessors.Mapping(aggregation: List[str] = <factory>, min_freq: int = 30, multithreading: bool = True)
Bases: _Preprocessor
A mapping preprocessor maps categorical values, discrete values and individual words in a text field to numerical values. These numerical values are retrieved by aggregating targets in the relational neighbourhood.
You are particularly encouraged to use the mapping preprocessor in combination with FastProp. Refer to the User guide for more information.
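The idea of mapping values to aggregated targets can be illustrated with a minimal, library-free sketch. It is not getML's implementation: it assumes a single flat table instead of a relational neighbourhood, uses the mean as a stand-in for the default AVG aggregation, and assumes that values seen with fewer than `min_freq` targets are encoded as NaN. The helper names `build_mapping` and `apply_mapping` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def build_mapping(values, targets, min_freq=30, aggregate=mean):
    """Map each distinct value to an aggregate of the targets observed with it."""
    grouped = defaultdict(list)
    for value, target in zip(values, targets):
        grouped[value].append(target)
    # Keep only values backed by at least `min_freq` targets (assumed semantics).
    return {
        value: aggregate(observed)
        for value, observed in grouped.items()
        if len(observed) >= min_freq
    }


def apply_mapping(values, mapping, default=float("nan")):
    """Replace each value by its mapped number; unmapped values get `default`."""
    return [mapping.get(value, default) for value in values]


# Toy data: a categorical column and a numerical target.
colors = ["red", "red", "blue", "red", "green", "blue"]
targets = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]

mapping = build_mapping(colors, targets, min_freq=2)
encoded = apply_mapping(colors, mapping)
```

Here "green" occurs only once, so with `min_freq=2` it is left out of the mapping, while "red" and "blue" are replaced by the average target of the rows they appear in. Passing a different `aggregate` callable (for example `max` or `statistics.median`) would correspond to choosing a different aggregation.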
- Args:
- aggregation (List[aggregations], optional):
The aggregation functions to use over the targets. Must be from aggregations.
- min_freq (int, optional):
The minimum number of targets required for a value to be included in the mapping. Range: [0, \(\infty\)]
- multithreading (bool, optional):
Whether to use multithreading.
- Example:
mapping = getml.preprocessors.Mapping()

pipe = getml.Pipeline(
    population=population_placeholder,
    peripheral=[order_placeholder, trans_placeholder],
    preprocessors=[mapping],
    feature_learners=[feature_learner_1, feature_learner_2],
    feature_selectors=feature_selector,
    predictors=predictor,
    share_selected_features=0.5
)
- Note:
Not supported in the getML community edition.
Attributes Summary
Methods Summary
validate([params])
Checks both the types and the values of all instance variables and raises an exception if something is off.
Attributes Documentation
- agg_sets: ClassVar[_Aggregations] = _Aggregations(All=['AVG', 'COUNT', 'COUNT DISTINCT', 'COUNT DISTINCT OVER COUNT', 'COUNT MINUS COUNT DISTINCT', 'KURTOSIS', 'MAX', 'MEDIAN', 'MIN', 'MODE', 'NUM MAX', 'NUM MIN', 'Q1', 'Q5', 'Q10', 'Q25', 'Q75', 'Q90', 'Q95', 'Q99', 'SKEW', 'STDDEV', 'SUM', 'VAR', 'VARIATION COEFFICIENT'], Default=['AVG'], Minimal=['AVG'])
- min_freq: int = 30
- multithreading: bool = True
Methods Documentation