EmailDomain

class getml.preprocessors.EmailDomain

Bases: _Preprocessor

The EmailDomain preprocessor extracts the domain from e-mail addresses.

For instance, if the e-mail address is ‘some.guy@domain.com’, the preprocessor will automatically extract ‘@domain.com’.
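
The extraction described above can be illustrated with a minimal pure-Python sketch. The helper `extract_email_domain` is hypothetical and is not part of getML; it only mirrors the documented behavior of keeping everything from the ‘@’ onward. Returning an empty string for input without an ‘@’ is an assumption made for this sketch, since the documentation does not say how such values are treated.

```python
def extract_email_domain(address: str) -> str:
    """Hypothetical helper: return the domain part of an e-mail address,
    including the leading '@', as in the documented example."""
    # Use the last '@' so that the local part may itself contain one.
    at_position = address.rfind("@")
    if at_position == -1:
        # Assumption for this sketch: no '@' means no domain to extract.
        return ""
    return address[at_position:]


domain = extract_email_domain("some.guy@domain.com")
print(domain)
```

In a pipeline you would not call such a helper yourself; the preprocessor handles the columns marked with the e-mail subroles.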

The preprocessor will be applied to all text columns that were assigned one of the subroles getml.data.subroles.include.email or getml.data.subroles.only.email.

It is recommended that you assign getml.data.subroles.only.email, because it is unlikely that the e-mail address itself is interesting.

Example:
my_data_frame.set_subroles("email", getml.data.subroles.only.email)

domain = getml.preprocessors.EmailDomain()

pipe = getml.Pipeline(
    population=population_placeholder,
    peripheral=[order_placeholder, trans_placeholder],
    preprocessors=[domain],
    feature_learners=[feature_learner_1, feature_learner_2],
    feature_selectors=feature_selector,
    predictors=predictor,
    share_selected_features=0.5
)

Methods Summary

validate([params])

Checks both the types and the values of all instance variables and raises an exception if something is off.

Methods Documentation

validate(params=None)

Checks both the types and the values of all instance variables and raises an exception if something is off.

Args:
params (dict, optional):

A dictionary containing the parameters to validate. If nothing is passed, the preprocessor's own parameters will be validated.