Features

class getml.pipeline.Features(name, targets)

Bases: object

Custom class for handling the features generated by the pipeline.

Example

names, importances = my_pipeline.features.importances()

names, correlations = my_pipeline.features.correlations()

sql_code = my_pipeline.features.to_sql()

Methods Summary

correlations([target_num, sort])

Returns the data for the feature correlations, as displayed in the getML monitor.

importances([target_num, sort])

Returns the data for the feature importances, as displayed in the getML monitor.

to_pandas()

Returns all information related to the features in a pandas data frame.

to_sql()

Returns the features expressed as SQL statements.

Methods Documentation

correlations(target_num=0, sort=True)

Returns the data for the feature correlations, as displayed in the getML monitor.

Parameters
  • target_num (int) – Indicates for which target you want to view the correlations. (Pipelines can have more than one target.)

  • sort (bool) – Whether you want the results to be sorted.

Returns

  • The first array contains the names of the features.

  • The second array contains the correlations with the target.

Return type

(numpy.ndarray, numpy.ndarray)
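
As a rough illustration of working with the returned pair, the sketch below keeps only the features whose absolute correlation reaches a threshold. The helper strongest_correlations and the sample values are hypothetical and not part of getML. With a fitted pipeline, the two inputs would come from my_pipeline.features.correlations().

```python
def strongest_correlations(names, correlations, threshold=0.1):
    """Keep features whose absolute correlation is at least `threshold`.

    Hypothetical helper, not part of getML. Accepts any two equal-length
    sequences, such as the two arrays returned by features.correlations().
    """
    if len(names) != len(correlations):
        raise ValueError("names and correlations must have the same length")
    pairs = [(str(n), float(c)) for n, c in zip(names, correlations)]
    kept = [(n, c) for n, c in pairs if abs(c) >= threshold]
    # Strongest first, regardless of sign.
    return sorted(kept, key=lambda pair: abs(pair[1]), reverse=True)


# Stand-in values for illustration only. With a fitted pipeline:
# names, correlations = my_pipeline.features.correlations()
names = ["feature_1", "feature_2", "feature_3"]
correlations = [0.05, -0.42, 0.31]

selected = strongest_correlations(names, correlations, threshold=0.1)
```

Ranking by absolute value treats strongly negative correlations as just as informative as strongly positive ones, which is usually what you want when screening features.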

importances(target_num=0, sort=True)

Returns the data for the feature importances, as displayed in the getML monitor.

Parameters
  • target_num (int) – Indicates for which target you want to view the importances. (Pipelines can have more than one target.)

  • sort (bool) – Whether you want the results to be sorted.

Returns

  • The first array contains the names of the features.

  • The second array contains their importances. By definition, all importances add up to 1.

Return type

(numpy.ndarray, numpy.ndarray)
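
Because the importances add up to 1, a cumulative sum tells you what share of the total a subset of features covers. The sketch below picks the fewest features needed to reach a given share. The helper features_covering and the sample values are hypothetical and not part of getML. With a fitted pipeline, the two inputs would come from my_pipeline.features.importances().

```python
def features_covering(names, importances, share=0.9):
    """Return the most important features whose importances reach `share`.

    Hypothetical helper, not part of getML. Accepts any two equal-length
    sequences, such as the two arrays returned by features.importances().
    """
    # Rank explicitly so the result does not depend on the `sort` argument.
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    selected, covered = [], 0.0
    for name, importance in ranked:
        if covered >= share:
            break
        selected.append(str(name))
        covered += float(importance)
    return selected, covered


# Stand-in values for illustration only. With a fitted pipeline:
# names, importances = my_pipeline.features.importances()
names = ["feature_1", "feature_2", "feature_3", "feature_4"]
importances = [0.125, 0.5, 0.125, 0.25]

selected, covered = features_covering(names, importances, share=0.75)
```

Here the two most important features already account for three quarters of the total importance, so the remaining two are left out.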

to_pandas()

Returns all information related to the features in a pandas data frame.

to_sql()

Returns the features expressed as SQL statements.

Examples

my_pipeline.features.to_sql()

Raises
  • IOError – If the pipeline could not be found on the engine or the pipeline could not be fitted.

  • KeyError – If an unsupported instance variable is encountered.

  • TypeError – If any instance variable is of the wrong type.

Returns

SQLCode

Object representing the features.

Note

Only a fitted pipeline (see fit()) holds trained features that can be returned as SQL statements. The SQL dialect is based on the SQLite3 standard.