transform

Pipeline.transform(population_table, peripheral_tables=None, df_name='', table_name='')

Translates new data into the trained features.

Transforms the data provided in population_table and peripheral_tables into features, which can be used to drive machine learning models. In addition to returning them as a numerical array, this method can also write the results to a database table called table_name.

Args:
population_table (getml.data.DataFrame):

Main table corresponding to the population Placeholder instance variable. Its target variable(s) will be ignored.

peripheral_tables (List[getml.data.DataFrame], optional):

Additional tables corresponding to the peripheral Placeholder instance variable. They have to be provided in the exact same order as their corresponding placeholders. A single DataFrame will be wrapped into a list internally.

df_name (str, optional):

If not an empty string, the resulting features will be written into a newly created DataFrame of that name.

table_name (str, optional):

If not an empty string, the resulting features will be written into a database table of the same name. See Unified import interface for further information.
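
The wrapping behavior described for peripheral_tables can be pictured with the sketch below. The helper name as_table_list is hypothetical and is not getML's actual implementation; plain strings stand in for DataFrame objects. It only illustrates how None, a single table, or a list could be normalized to a list while preserving the caller's order.

```python
def as_table_list(peripheral_tables):
    """Normalize a peripheral_tables argument to a list (illustrative only)."""
    if peripheral_tables is None:
        # No peripheral tables were passed.
        return []
    if isinstance(peripheral_tables, (list, tuple)):
        # Keep the caller's order; it must match the placeholder order.
        return list(peripheral_tables)
    # A single table is wrapped into a one-element list.
    return [peripheral_tables]


# Strings are placeholders for getml.data.DataFrame objects.
single = as_table_list("orders")
several = as_table_list(["orders", "visits"])
```

Because the order of the list has to match the order of the peripheral placeholders, the sketch never sorts or deduplicates its input.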

Raises:
IOError: If the pipeline could not be found on the engine or the pipeline could not be fitted.

TypeError: If any input argument is not of proper type.

KeyError: If an unsupported instance variable is encountered.

TypeError: If any instance variable is of wrong type.

ValueError: If any instance variable does not match its possible choices (string) or is out of the expected bounds (numerical).

Return:
numpy.ndarray:

Resulting features provided in an array of shape (number of rows in population_table, number of selected features).

or getml.data.DataFrame:

A DataFrame containing the resulting features.

Note:

Only fitted pipelines (fit()) can transform data into features.
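
The call pattern can be sketched as follows. A real call needs a running getML engine and a fitted pipeline, so this example uses a hypothetical stand-in class, FakePipeline, that only mimics the documented signature, the fit-before-transform rule, and the documented array shape. It is not part of getML, and the lists of dicts are placeholders for getml.data.DataFrame objects.

```python
import numpy as np


class FakePipeline:
    """Hypothetical stand-in that mimics the documented transform() signature.

    NOT part of getML; it exists only so this sketch runs without an engine.
    """

    def __init__(self, num_features):
        self.num_features = num_features
        self.fitted = False

    def fit(self, population_table, peripheral_tables=None):
        # A real pipeline would learn features here; the stand-in only
        # records that fitting has happened.
        self.fitted = True
        return self

    def transform(self, population_table, peripheral_tables=None,
                  df_name='', table_name=''):
        if not self.fitted:
            # Mirrors the note above: only fitted pipelines can transform.
            raise IOError("The pipeline has not been fitted.")
        # One row per population row, one column per selected feature.
        return np.zeros((len(population_table), self.num_features))


# Placeholder data: three population rows and one peripheral table.
population = [{"id": 1}, {"id": 2}, {"id": 3}]
peripheral = [{"id": 1, "value": 0.5}]

pipe = FakePipeline(num_features=4).fit(population, [peripheral])
features = pipe.transform(population, [peripheral])
print(features.shape)  # (3, 4)
```

With a real pipeline, the same two calls would apply, and passing a non-empty df_name or table_name would additionally store the features as described in the argument list.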