TextFieldSplitter

class getml.preprocessors.TextFieldSplitter

Bases: getml.preprocessors.preprocessor._Preprocessor

A TextFieldSplitter splits columns with role getml.data.roles.text into relational bag-of-words representations. This allows the feature learners to learn patterns based on the presence of certain words within the text fields.

Refer to the User guide for more information.
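
As a rough illustration of what a relational bag-of-words representation could look like, the following plain-Python sketch turns each text field into one row per word and links every row back to its originating record through a row number. The helper split_text_field, the rownum key and the regex-based tokenization are assumptions made for this example only; they are not part of the getML API and do not describe how the preprocessor is implemented internally.

```python
import re


def split_text_field(rows, text_column):
    """Hypothetical helper: return one {rownum, word} record per word."""
    bag = []
    for rownum, row in enumerate(rows):
        # Assumed tokenization: lowercase, then keep runs of letters/digits.
        words = re.findall(r"[a-z0-9]+", row[text_column].lower())
        for word in words:
            # Each word becomes its own row, joined to the parent via rownum.
            bag.append({"rownum": rownum, text_column: word})
    return bag


# Made-up example data with a single text column.
population = [
    {"id": 1, "description": "Red car, low mileage"},
    {"id": 2, "description": "Blue car"},
]

words = split_text_field(population, "description")

# A feature based on word presence then reduces to a simple aggregation
# over the word table, e.g. "does this record contain the word 'car'?"
contains_car = {w["rownum"] for w in words if w["description"] == "car"}
```

In this form, a condition such as "the text contains a given word" becomes an ordinary aggregation over a joined table, which is the kind of pattern relational feature learners are built to find.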

# Instantiate the preprocessor.
text_field_splitter = getml.preprocessors.TextFieldSplitter()

# The placeholders, feature learners, feature selector and predictor
# are assumed to have been defined beforehand.
pipe = getml.pipeline.Pipeline(
    population=population_placeholder,
    peripheral=[order_placeholder, trans_placeholder],
    preprocessors=[text_field_splitter],
    feature_learners=[feature_learner_1, feature_learner_2],
    feature_selectors=feature_selector,
    predictors=predictor,
    share_selected_features=0.5,
)