getml.pipeline

Contains handlers for all steps involved in a data science project after data preparation:

automated feature learning
automated feature selection
training and evaluation of machine learning (ML) algorithms
deployment of the fitted models

Example:

This example assumes that you have already set up your data model using Placeholder, your feature learners (see feature_learning), and your feature selectors and predictors (see predictors, whose classes can be used both for prediction and for feature selection).

pipe = getml.pipeline.Pipeline(
    tags=["multirel", "relboost", "31 features"],
    population=population_placeholder,
    peripheral=[order_placeholder, trans_placeholder],
    feature_learners=[feature_learner_1, feature_learner_2],
    feature_selectors=feature_selector,
    predictors=predictor,
    share_selected_features=0.5
)

# The keys "order" and "trans" are the names of the
# peripheral placeholders.
pipe.check(
    population_table=population_training,
    peripheral_tables={"order": order, "trans": trans}
)

pipe.fit(
    population_table=population_training,
    peripheral_tables={"order": order, "trans": trans}
)

pipe.score(
    population_table=population_testing,
    peripheral_tables={"order": order, "trans": trans}
)
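
The check, fit, and score calls above all receive the same peripheral mapping. As a minimal sketch, the sequence could be wrapped so that the mapping is written only once. The helper name `check_fit_score` is hypothetical and not part of getML; it relies only on the keyword arguments shown above and assumes the return value of `score` is worth passing back to the caller.

```python
def check_fit_score(pipe, population_training, population_testing, peripheral_tables):
    """Run check, fit, and score on `pipe` with one shared peripheral mapping.

    Hypothetical convenience helper, not a getML function. `pipe` is expected
    to offer check(), fit(), and score() with the keyword arguments used in
    the example above.
    """
    # Check the data model against the training data before fitting.
    pipe.check(
        population_table=population_training,
        peripheral_tables=peripheral_tables,
    )
    # Fit on the training data.
    pipe.fit(
        population_table=population_training,
        peripheral_tables=peripheral_tables,
    )
    # Score on the held-out data and hand back whatever score() returns.
    return pipe.score(
        population_table=population_testing,
        peripheral_tables=peripheral_tables,
    )
```

With the objects from the example, this would be called as `check_fit_score(pipe, population_training, population_testing, {"order": order, "trans": trans})`.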

Classes

`Columns`(pipeline, targets, peripheral[, data])	Container which holds a pipeline’s columns.
`Features`(pipeline, targets[, data])	Container which holds a pipeline’s features.
`Metrics`(name)	Custom class for handling the metrics generated by the pipeline.
`Pipeline`([population, peripheral, …])	A Pipeline is the main class for feature learning and prediction.
`Pipelines`([data])	Container which holds all pipelines associated with the currently running project.
`Scores`(data, latest)	Container which holds the history of all scores associated with a given pipeline.
`SQLCode`(code)	Custom class for handling the SQL code of the features generated by the pipeline.

Functions

`delete`(name)	If a pipeline named ‘name’ exists, it is deleted.
`exists`(name)	Returns true if a pipeline named ‘name’ exists.
`list_pipelines`()	Lists all pipelines present in the engine.
`load`(name)	Loads a pipeline from the getML engine into Python.
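
The functions above can be combined to load a pipeline only when it exists. The following is a minimal sketch: `load_if_exists` is a hypothetical helper, not part of getML. It takes the module as a parameter (pass `getml.pipeline`) so that it depends only on the `exists` and `load` functions listed above.

```python
def load_if_exists(pipeline_module, name):
    """Return the pipeline called `name`, or None if no such pipeline exists.

    Hypothetical helper. `pipeline_module` is expected to be `getml.pipeline`
    or any object offering exists(name) and load(name).
    """
    # Guard the load with an existence check, as described in the table above.
    if pipeline_module.exists(name):
        return pipeline_module.load(name)
    return None
```

A call such as `load_if_exists(getml.pipeline, "my_pipeline")`, where "my_pipeline" is a placeholder name, would presumably require a running getML engine, since `load` fetches the pipeline from the engine.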