Features
class getml.pipeline.Features(pipeline: str, targets: Sequence[str], data: Optional[Sequence[Feature]] = None)
Container that holds a pipeline’s features. Features can be accessed by name, by index, or with a numpy array. The container supports slicing and can be sorted and filtered.
The container also provides container-wide methods for requesting the features’ importances, their correlations, and their transpiled SQL representation.
Note:
The container is iterable, so in addition to filter() you can also use Python list comprehensions for filtering.
Example:
    all_my_features = my_pipeline.features
    first_feature = my_pipeline.features[0]
    second_feature = my_pipeline.features["feature_1_2"]
    all_but_last_10_features = my_pipeline.features[:-10]
    important_features = [feature for feature in my_pipeline.features if feature.importance > 0.1]
    names, importances = my_pipeline.features.importances()
    names, correlations = my_pipeline.features.correlations()
    sql_code = my_pipeline.features.to_sql()
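The access patterns in the example can be tried without a fitted pipeline using a small stand-in container. The sketch below is not getML's implementation: `ToyFeature` and `ToyFeatures` are hypothetical names, the importance and correlation values are made up, and access with a numpy array is left out. It only illustrates how index, name, and slice access, filtering, and sorting can coexist on one iterable container.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Union


@dataclass(frozen=True)
class ToyFeature:
    """Hypothetical stand-in for a single feature (not getML's Feature)."""
    name: str
    importance: float
    correlation: float


class ToyFeatures:
    """Hypothetical stand-in container that mimics the access patterns only."""

    def __init__(self, data: List[ToyFeature]) -> None:
        self._data = list(data)

    def __iter__(self) -> Iterator[ToyFeature]:
        # Being iterable is what makes list comprehensions work.
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)

    def __getitem__(self, key: Union[int, str, slice]):
        if isinstance(key, int):      # access by position
            return self._data[key]
        if isinstance(key, slice):    # slicing returns a new container
            return ToyFeatures(self._data[key])
        if isinstance(key, str):      # access by feature name
            for feature in self._data:
                if feature.name == key:
                    return feature
            raise KeyError(key)
        raise TypeError(f"unsupported key type: {type(key).__name__}")

    def filter(self, conditional: Callable[[ToyFeature], bool]) -> "ToyFeatures":
        # Keep only the features for which the conditional is true.
        return ToyFeatures([f for f in self._data if conditional(f)])

    def sort(self, key: Callable[[ToyFeature], float],
             descending: bool = False) -> "ToyFeatures":
        # Return a new container ordered by the given key.
        return ToyFeatures(sorted(self._data, key=key, reverse=descending))


# Made-up values, for illustration only.
features = ToyFeatures([
    ToyFeature("feature_1_1", importance=0.50, correlation=0.40),
    ToyFeature("feature_1_2", importance=0.05, correlation=0.10),
    ToyFeature("feature_1_3", importance=0.45, correlation=-0.30),
])

by_index = features[0]
by_name = features["feature_1_2"]
all_but_last = features[:-1]
via_filter = features.filter(lambda f: f.importance > 0.1)
via_comprehension = [f for f in features if f.importance > 0.1]
ranked = features.sort(key=lambda f: f.importance, descending=True)
```

In this sketch, `filter()` and slicing return a new container while the list comprehension returns a plain list; whether the real container behaves the same way should be checked against the getML documentation.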
Methods
correlations([target_num, sort])
    Returns the data for the feature correlations, as displayed in the getML monitor.
filter(conditional)
    Filters the Features container.
importances([target_num, sort])
    Returns the data for the feature importances, as displayed in the getML monitor.
sort([by, key, descending])
    Sorts the Features container.
Returns all information related to the features in a pandas data frame.
to_sql([targets, subfeatures, dialect, ...])
    Returns SQL statements visualizing the features.
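The example's tuple unpacking suggests that importances() and correlations() each return two parallel sequences, names and values. Under that assumption, a minimal sketch of ranking the result in plain Python might look like the following. The helper `top_k` is hypothetical and not part of getML, and the values are made-up placeholders.

```python
from typing import List, Sequence, Tuple


def top_k(names: Sequence[str], values: Sequence[float],
          k: int) -> List[Tuple[str, float]]:
    """Pair names with values and return the k largest (hypothetical helper)."""
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    # Sort the (name, value) pairs by value, largest first.
    paired = sorted(zip(names, values), key=lambda pair: pair[1], reverse=True)
    return paired[:k]


# Made-up placeholders standing in for the two returned sequences.
names = ["feature_1_1", "feature_1_2", "feature_1_3"]
importances = [0.50, 0.05, 0.45]

top_two = top_k(names, importances, k=2)
```

If the method's sort argument already returns the values in the desired order, a helper like this is unnecessary; it is shown only to make the paired structure of the result explicit.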
Attributes
Holds the correlations of a Pipeline's features.
Holds the correlations of a Pipeline's features.
Holds the names of a Pipeline's features.
Holds the names of a Pipeline's features.