target

getml.data.roles.target = 'target'

Numerical response predicted using the resulting features

The associated columns contain the variables we intend to describe and predict in our data science project. They are included neither in the data model nor in the feature engineering algorithm, since they will be unknown in all future events. They are nevertheless such an important part of the analysis that every population table must contain at least one of them (see Tables). Target columns may also be present in peripheral tables, but they won’t be considered during fitting.

The content of these columns must be numerical. For regression problems this is straightforward: provide the numerical target variable. For classification, however, all possible values must be encoded as numbers. Don’t worry, the getML engine does not assume an internal ordering for this kind of data. In addition, the associated columns must not contain NULL values.
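The encoding requirement can be illustrated with a minimal sketch in plain Python. It deliberately uses no getML calls: the rows, the column name `outcome`, and the helper `encode_targets` are hypothetical and exist only for this example. It shows one way to turn class labels into numbers and to reject NULL values before the data is handed to the engine.

```python
# Hypothetical rows of a population table; the column names are illustrative only.
rows = [
    {"customer_id": 1, "outcome": "churned"},
    {"customer_id": 2, "outcome": "stayed"},
    {"customer_id": 3, "outcome": "stayed"},
    {"customer_id": 4, "outcome": "upgraded"},
]


def encode_targets(rows, column):
    """Encode the class labels in `column` as numbers.

    Returns the list of numerical codes (one per row) and the
    label-to-code mapping. Raises ValueError if any value is None,
    because target columns must not contain NULL values.
    """
    values = [row[column] for row in rows]
    if any(value is None for value in values):
        raise ValueError(f"target column '{column}' contains NULL values")
    # Sorting the labels makes the mapping reproducible across runs.
    mapping = {label: code for code, label in enumerate(sorted(set(values)))}
    return [float(mapping[value]) for value in values], mapping


codes, mapping = encode_targets(rows, "outcome")
```

The numbers assigned here are arbitrary identifiers. Any consistent mapping would do, since no ordering is assumed for this kind of data.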

MultirelModel supports multiple targets out of the box. RelboostModel, on the other hand, can only be trained on one target at a time. If you have several targets, you need to train a separate model for each (either by providing unique names or by using the time-based default names) and set the target_num instance variable of each RelboostModel accordingly. Which number is associated with which target is determined by their order in the target_names instance variable of the DataFrame.
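As a rough sketch of the bookkeeping involved, the snippet below derives one model configuration per target from an ordered list of target names. It is plain Python and does not call getML: the helper `plan_models`, the example target names, and the name prefix are hypothetical, and zero-based numbering of target_num is an assumption that should be checked against the RelboostModel reference.

```python
def plan_models(target_names, prefix="relboost"):
    """Build one configuration per target.

    Each configuration carries a unique model name and the position of
    the target in `target_names`.
    ASSUMPTION: target_num counts from zero, in the order of target_names.
    """
    return [
        {"name": f"{prefix}_{target}", "target": target, "target_num": num}
        for num, target in enumerate(target_names)
    ]


# Hypothetical targets, in the order they appear in target_names.
plans = plan_models(["churn", "revenue"])
```

Each entry in `plans` then corresponds to one separately trained model, so the unique names prevent one model from overwriting another.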