Mapping

class getml.preprocessors.Mapping(aggregation=None, min_freq=30)

Bases: getml.preprocessors.preprocessor._Preprocessor

A mapping preprocessor maps categorical values, discrete values and individual words in a text field to numerical values. These numerical values are retrieved by aggregating targets in the relational neighbourhood.
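The idea can be illustrated with a minimal sketch in plain Python. This is not getML's implementation: it works on a single flat table rather than a relational neighbourhood, and the helper name `build_mapping` and the toy data are hypothetical. It assumes that a value is kept when it has at least `min_freq` associated targets, and it uses the mean as the aggregation, mirroring the 'AVG' default.

```python
from collections import defaultdict
from statistics import mean


def build_mapping(values, targets, aggregation=mean, min_freq=30):
    """Map each distinct value to an aggregate of the targets seen with it.

    Hypothetical helper for illustration only. Values observed with fewer
    than `min_freq` targets are left out of the mapping (assumed semantics).
    """
    grouped = defaultdict(list)
    for value, target in zip(values, targets):
        grouped[value].append(target)
    return {
        value: aggregation(group)
        for value, group in grouped.items()
        if len(group) >= min_freq
    }


# Toy data: a categorical column and a numerical target.
categories = ["a", "a", "a", "b", "b", "c"]
targets = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]

# "c" occurs only once, so it is dropped when min_freq=2.
mapping = build_mapping(categories, targets, min_freq=2)
```

In this sketch each retained category is replaced by the average target of the rows that carry it, which gives a downstream learner a numerical column instead of a categorical one.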

You are particularly encouraged to use the mapping preprocessor in combination with FastPropModel and FastPropTimeSeries.

Refer to the User guide for more information.

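import getml

# The placeholders, feature learners, feature selector and predictor
# used below are assumed to have been defined beforehand.
mapping = getml.preprocessors.Mapping()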

pipe = getml.pipeline.Pipeline(
    population=population_placeholder,
    peripheral=[order_placeholder, trans_placeholder],
    preprocessors=[mapping],
    feature_learners=[feature_learner_1, feature_learner_2],
    feature_selectors=feature_selector,
    predictors=predictor,
    share_selected_features=0.5
)

Args:

aggregation (List[aggregations], optional):

The aggregation functions to apply to the targets.

Each entry must be from aggregations (see agg_sets for the supported values).

min_freq (int, optional):

The minimum number of targets required for a value to be included in the mapping. Range: [0, \(\infty\))

Attributes Summary

agg_sets

Attributes Documentation

agg_sets = Aggregations(All=['AVG', 'COUNT', 'COUNT ABOVE MEAN', 'COUNT BELOW MEAN', 'COUNT DISTINCT', 'COUNT DISTINCT OVER COUNT', 'COUNT MINUS COUNT DISTINCT', 'KURTOSIS', 'MAX', 'MEDIAN', 'MIN', 'MODE', 'NUM MAX', 'NUM MIN', 'Q1', 'Q5', 'Q10', 'Q25', 'Q75', 'Q90', 'Q95', 'Q99', 'SKEW', 'STDDEV', 'SUM', 'VAR', 'VARIATION COEFFICIENT'], Default=['AVG'], Minimal=['AVG'])