random

DataFrame.random(seed=5849)

Create a random column.

The numbers will be uniformly distributed from 0.0 to 1.0. This can be used to randomly split a population table into a training set and a test set.

Args:

seed (int): Seed used for the random number generator.

Returns:
VirtualFloatColumn:

VirtualFloatColumn containing the random numbers.

Example:

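import getml
import numpy

population = getml.data.DataFrame('population')
population.add(numpy.zeros(100), 'column_01')
print(len(population))  # 100

# One uniformly distributed random number between 0.0 and 1.0 per row.
idx = population.random(seed=42)

# Rows with idx > 0.7 go to population_train, all other rows to population_test.
population_train = population.where("population_train", idx > 0.7)
population_test = population.where("population_test", idx <= 0.7)
print(len(population_train), len(population_test))  # 27 73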
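
The splitting idea in the example above can also be sketched without getML, using only the Python standard library. This is a minimal illustration, not getML's implementation: the helper names random_column and split_rows are hypothetical, and Python's random module is assumed as the generator, so the group sizes for a given seed will generally differ from the getML output shown above.

```python
import random


def random_column(n_rows, seed=5849):
    """Return n_rows pseudo-random floats drawn uniformly from [0.0, 1.0).

    Hypothetical stand-in for a random column; uses Python's own
    generator, not the one getML uses.
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_rows)]


def split_rows(rows, idx, threshold=0.7):
    """Partition rows by comparing each row's random number to threshold."""
    above = [row for row, r in zip(rows, idx) if r > threshold]
    at_or_below = [row for row, r in zip(rows, idx) if r <= threshold]
    return above, at_or_below


rows = list(range(100))                  # stand-in for 100 table rows
idx = random_column(len(rows), seed=42)  # one random number per row
above, at_or_below = split_rows(rows, idx)

# Every row lands in exactly one group; on average about 30% of the
# rows fall into the "> 0.7" group.
print(len(above), len(at_or_below))
```

Because the two conditions (> 0.7 and <= 0.7) are complementary, the groups never overlap and together cover the whole table. Fixing the seed makes the split reproducible across runs.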