Subset

class getml.data.Subset(container_id: str, peripheral: Dict[str, Union[DataFrame, View]], population: Union[DataFrame, View])

A Subset consists of a population table and one or several peripheral tables.

A Container, StarSchema, or TimeSeries creates the Subset and passes it to the Pipeline.

Example:
container = getml.data.Container(
    train=population_train,
    test=population_test
)

container.add(
    meta=meta,
    order=order,
    trans=trans
)

# train and test are Subsets.
# They contain population_train
# and population_test respectively,
# as well as their peripheral tables
# meta, order and trans.
my_pipeline.fit(container.train)

my_pipeline.score(container.test)

Methods

Attributes

container_id

peripheral

population
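
To picture how the three attributes fit together, here is a minimal sketch in plain Python. It does not use getml and is not the library's implementation: `SubsetSketch` and `make_subsets` are hypothetical names, the tables are replaced by placeholder strings, and the sketch assumes that every subset carries the ID of the container it came from, as the attribute name `container_id` suggests.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class SubsetSketch:
    """Hypothetical stand-in that mirrors the three documented attributes."""

    container_id: str
    peripheral: Dict[str, Any]
    population: Any


def make_subsets(
    container_id: str,
    populations: Dict[str, Any],
    peripheral: Dict[str, Any],
) -> Dict[str, SubsetSketch]:
    """Build one subset per named population table.

    All subsets share the same peripheral tables (assumption of this sketch).
    """
    return {
        name: SubsetSketch(container_id, dict(peripheral), population)
        for name, population in populations.items()
    }


# Placeholder strings stand in for DataFrame or View objects.
subsets = make_subsets(
    container_id="abc123",  # hypothetical ID
    populations={"train": "population_train", "test": "population_test"},
    peripheral={"meta": "meta", "order": "order", "trans": "trans"},
)

print(subsets["train"].population)
print(sorted(subsets["test"].peripheral))
```

In this sketch, the train and test subsets differ only in their population table, which matches the description in the example above: each subset holds its own population table together with the peripheral tables meta, order and trans.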