TextFieldSplitter
- class getml.preprocessors.TextFieldSplitter
Bases: _Preprocessor
A TextFieldSplitter splits columns with role getml.data.roles.text into relational bag-of-words representations, allowing the feature learners to learn patterns based on the presence of certain words within the text fields.
Text fields are split on whitespace or any of the following characters:
; , . ! ? - | " \t \v \f \r \n % ' ( ) [ ] { }
Refer to the User guide for more information.
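The splitting rule described above can be illustrated with a short standalone sketch. It is not getML's implementation: the helper names `split_text_field` and `to_relational_rows` are hypothetical, the plain space is assumed to stand in for "whitespace", and dropping empty tokens and keeping the original letter case are assumptions as well.

```python
import re

# Separator characters copied from the list above. The leading plain space
# is an assumption standing in for "a whitespace".
SEPARATORS = " ;,.!?-|\"\t\v\f\r\n%'()[]{}"

# One or more consecutive separators act as a single split point.
_SPLIT_PATTERN = re.compile("[" + re.escape(SEPARATORS) + "]+")


def split_text_field(text):
    """Split a text field into words, dropping empty tokens (illustrative only)."""
    return [token for token in _SPLIT_PATTERN.split(text) if token]


def to_relational_rows(texts):
    """Return one (row_id, word) pair per word, mimicking a relational
    bag-of-words layout in which each word refers back to its source row."""
    return [
        (row_id, word)
        for row_id, text in enumerate(texts)
        for word in split_text_field(text)
    ]


print(split_text_field("Hello, world! (test)"))
# ['Hello', 'world', 'test']
print(to_relational_rows(["red car", "blue-car"]))
# [(0, 'red'), (0, 'car'), (1, 'blue'), (1, 'car')]
```

Each text field thus becomes several rows in a separate table keyed by the originating row, which is the kind of structure relational feature learners can aggregate over.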
- Example:
text_field_splitter = getml.preprocessors.TextFieldSplitter()

pipe = getml.Pipeline(
    population=population_placeholder,
    peripheral=[order_placeholder, trans_placeholder],
    preprocessors=[text_field_splitter],
    feature_learners=[feature_learner_1, feature_learner_2],
    feature_selectors=feature_selector,
    predictors=predictor,
    share_selected_features=0.5
)
Methods Summary
validate([params]): Checks both the types and the values of all instance variables and raises an exception if something is off.
Methods Documentation