tune_predictors

getml.hyperopt.tune_predictors(pipeline, population_table_training, population_table_validation, peripheral_tables=None, n_iter=0, score=None, num_threads=0)

A high-level interface for optimizing the predictors of a getml.Pipeline.

Optimizes the hyperparameters of the pipeline's predictors (from getml.predictors) in a sequential, multi-step process: each predictor's hyperparameter space is broken down into carefully curated subspaces, and the hyperparameters of each subspace are optimized in turn. For details on the recipes behind the tuning routines, refer to tuning routines.
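The idea of tuning one subspace at a time can be illustrated with a small, self-contained sketch. It does not reproduce getML's actual recipes: the hyperparameter names, the bounds, the toy loss function and the use of plain random search are all assumptions made for illustration only.

```python
import random


def toy_score(params):
    """Hypothetical validation loss (lower is better).

    Stands in for fitting a pipeline and scoring it on the validation set.
    """
    return (
        (params["learning_rate"] - 0.1) ** 2
        + (params["max_depth"] - 6) ** 2 / 100.0
        + (params["reg_lambda"] - 1.0) ** 2 / 10.0
    )


# Hypothetical subspaces: each step tunes only the listed hyperparameters,
# while all others keep their current best values.
SUBSPACES = [
    {"learning_rate": (0.01, 0.3)},
    {"max_depth": (2, 10)},
    {"reg_lambda": (0.0, 5.0)},
]


def sample(bounds, rng):
    """Draw an int for integer bounds, otherwise a float."""
    low, high = bounds
    if isinstance(low, int) and isinstance(high, int):
        return rng.randint(low, high)
    return rng.uniform(low, high)


def tune_sequentially(base_params, subspaces, score_fn, n_iter=20, seed=0):
    """Random search over one subspace after another, keeping the best point."""
    rng = random.Random(seed)
    best = dict(base_params)
    best_score = score_fn(best)
    for subspace in subspaces:
        for _ in range(n_iter):
            candidate = dict(best)
            for name, bounds in subspace.items():
                candidate[name] = sample(bounds, rng)
            candidate_score = score_fn(candidate)
            if candidate_score < best_score:
                best, best_score = candidate, candidate_score
    return best, best_score


base = {"learning_rate": 0.3, "max_depth": 3, "reg_lambda": 0.0}
tuned, tuned_score = tune_sequentially(base, SUBSPACES, toy_score)
```

Because a candidate is only accepted when it improves the score, the result is never worse than the base configuration, which mirrors the role of the base pipeline as the starting point of the search.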

Args:

pipeline (Pipeline): Base pipeline used to derive all models fitted and scored during the hyperparameter optimization. It defines the data schema and any hyperparameters that are not optimized.
population_table_training (DataFrame): The population table that pipelines will be trained on.
population_table_validation (DataFrame): The population table that pipelines will be evaluated on.
peripheral_tables (DataFrame, list or dict): The peripheral tables used to provide additional information for the population tables.
n_iter (int, optional): The number of iterations.
score (str, optional): The score to optimize. Must be from scores.
num_threads (int, optional): The number of parallel threads to use. If set to 0, the number of threads will be inferred.

Example:

We assume that you have already set up your Pipeline. Moreover, we assume that you have defined a training set and a validation set as well as the peripheral tables.
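import getml

# base_pipeline, training_set, validation_set and peripheral_tables
# are assumed to have been defined beforehand.
tuned_pipeline = getml.hyperopt.tune_predictors(
    pipeline=base_pipeline,
    population_table_training=training_set,
    population_table_validation=validation_set,
    peripheral_tables=peripheral_tables)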

Returns: A Pipeline containing tuned predictors.
Raises: TypeError: If any instance variable is of a wrong type.
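Since a wrongly typed argument raises a TypeError, the optional arguments can be checked up front before the call is made. The helper below is a hypothetical sketch, not part of getML: build_tuning_kwargs is an invented name, and its checks only mirror the types and defaults listed in the signature above.

```python
def build_tuning_kwargs(n_iter=0, score=None, num_threads=0):
    """Hypothetical helper: validate the optional arguments of tune_predictors."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    for name, value in (("n_iter", n_iter), ("num_threads", num_threads)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(
                f"'{name}' must be an int, got {type(value).__name__}"
            )
    if score is not None and not isinstance(score, str):
        raise TypeError(f"'score' must be a str, got {type(score).__name__}")
    return {"n_iter": n_iter, "score": score, "num_threads": num_threads}


# Illustrative values; 50 iterations and 4 threads are arbitrary choices.
kwargs = build_tuning_kwargs(n_iter=50, num_threads=4)
```

The resulting dictionary could then be unpacked into the call, for example getml.hyperopt.tune_predictors(pipeline=base_pipeline, population_table_training=training_set, population_table_validation=validation_set, peripheral_tables=peripheral_tables, **kwargs).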