FloatColumn

class getml.data.columns.FloatColumn(name='', role='numerical', num=0, df_name='')

Bases: getml.data.columns._Column

Handler for numerical data in the engine.

This is a handler for all numerical data in the getML engine, including time stamps.

Parameters
  • name (str, optional) – Name of the column.

  • role (str, optional) – Role that the column plays.

  • num (int, optional) – Number of the column.

  • df_name (str, optional) – The name attribute of the DataFrame containing this column.

Note

FloatColumn instances are immutable, so their content cannot be changed directly. Every operation that alters the underlying data returns a new column, which is purely virtual and has to be added to the DataFrame using its add() method.

This class provides a set of data preparation methods. They are still experimental and therefore not covered in the main documentation, but they are widely tested and used internally. Only their signatures might change significantly in future releases.
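The immutability described in this note can be pictured with a small toy model. The sketch below does not use getML at all: ToyColumn, ToyFrame, and their methods are hypothetical stand-ins written only to illustrate the pattern of "an operation returns a new column, and nothing is stored until it is added explicitly".

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ToyColumn:
    """Hypothetical stand-in for an immutable column (not the getML class)."""

    values: Tuple[float, ...]

    def map(self, fn: Callable[[float], float]) -> "ToyColumn":
        # Never mutates self; always builds and returns a new column.
        return ToyColumn(tuple(fn(v) for v in self.values))


class ToyFrame:
    """Hypothetical container that only changes when add() is called."""

    def __init__(self) -> None:
        self.columns: Dict[str, ToyColumn] = {}

    def add(self, col: ToyColumn, name: str) -> None:
        # Explicit step that stores a column under the given name.
        self.columns[name] = col


frame = ToyFrame()
original = ToyColumn((-1.5, 2.0, -3.0))
frame.add(original, "x")

absolute = original.map(abs)  # new column; "x" is untouched
frame.add(absolute, "x_abs")  # the result is stored only after add()

print(frame.columns["x"].values)      # (-1.5, 2.0, -3.0)
print(frame.columns["x_abs"].values)  # (1.5, 2.0, 3.0)
```

In the toy model the new column is computed eagerly; a virtual column in the engine presumably defers the computation, which this sketch does not attempt to reproduce.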

Attributes Summary

length

num_columns

Methods Summary

abs()

Compute absolute value.

acos()

Compute arc cosine.

alias(alias)

Adds an alias to the column.

as_str()

Transforms column to a string.

asin()

Compute arc sine.

assert_equal([alias])

ASSERT EQUAL aggregation.

atan()

Compute arc tangent.

avg([alias])

AVG aggregation.

cbrt()

Compute cube root.

ceil()

Round up value.

cos()

Compute cosine.

count([alias])

COUNT aggregation.

day()

Extract day (of the month) from a time stamp.

erf()

Compute error function.

exp()

Compute exponential function.

floor()

Round down value.

gamma()

Compute gamma function.

hour()

Extract hour (of the day) from a time stamp.

is_inf()

Determine whether the value is infinite.

is_nan()

Determine whether the value is NaN.

lgamma()

Compute log-gamma function.

log()

Compute natural logarithm.

max([alias])

MAX aggregation.

median([alias])

MEDIAN aggregation.

min([alias])

MIN aggregation.

minute()

Extract minute (of the hour) from a time stamp.

month()

Extract month from a time stamp.

round()

Round to nearest.

second()

Extract second (of the minute) from a time stamp.

sin()

Compute sine.

sqrt()

Compute square root.

stddev([alias])

STDDEV aggregation.

sum([alias])

SUM aggregation.

tan()

Compute tangent.

to_numpy([sock])

Transform column to a numpy array.

update(condition, values)

Returns an updated version of this column.

var([alias])

VAR aggregation.

weekday()

Extract day of the week from a time stamp, Sunday being 0.

year()

Extract year from a time stamp.

yearday()

Extract day of the year from a time stamp.

Attributes Documentation

length
num_columns = 0

Methods Documentation

abs()

Compute absolute value.

acos()

Compute arc cosine.

alias(alias)

Adds an alias to the column. This is useful for joins.

Parameters

alias (str) – The name of the column as it should appear in the new DataFrame.

as_str()

Transforms column to a string.

asin()

Compute arc sine.

assert_equal(alias='new_column')

ASSERT EQUAL aggregation.

Throws an exception if not all values inserted into the aggregation are equal.
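The behaviour described for ASSERT EQUAL can be sketched in plain Python. This is only an illustration of the idea, not the engine's implementation: the function name assert_equal_agg is hypothetical, and both the choice to return the shared value and the handling of empty input are assumptions.

```python
from typing import Iterable


def assert_equal_agg(values: Iterable[float]) -> float:
    """Return the single shared value, or raise if the inputs differ."""
    items = list(values)
    if not items:
        # Assumption: an empty input is treated as an error in this sketch.
        raise ValueError("no values to aggregate")
    first = items[0]
    for v in items[1:]:
        if v != first:
            raise ValueError(f"values are not all equal: {first!r} != {v!r}")
    return first


print(assert_equal_agg([2.0, 2.0, 2.0]))  # 2.0

try:
    assert_equal_agg([1.0, 2.0])
except ValueError as err:
    print("rejected:", err)
```

An aggregation of this kind is useful as a sanity check when every row in a group is expected to carry the same value.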

Parameters

alias (str) – Name for the new column.

atan()

Compute arc tangent.

avg(alias='new_column')

AVG aggregation.

Parameters

alias (str) – Name for the new column.

cbrt()

Compute cube root.

ceil()

Round up value.

cos()

Compute cosine.

count(alias='new_column')

COUNT aggregation.

Parameters

alias (str) – Name for the new column.

day()

Extract day (of the month) from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).
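To make the "days since epoch" interpretation concrete, the following sketch converts a fractional day count into a calendar time stamp using only the Python standard library. The helper from_days_since_epoch is hypothetical, and the use of UTC is an assumption; the text above does not state which time zone the engine applies.

```python
from datetime import datetime, timedelta, timezone

# Epoch time as named in the documentation: January 1, 1970 (UTC assumed).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_days_since_epoch(days: float) -> datetime:
    """Convert a (possibly fractional) day count since the epoch to UTC."""
    return EPOCH + timedelta(days=days)


stamp = from_days_since_epoch(18262.5)
print(stamp.isoformat())  # 2020-01-01T12:00:00+00:00
print(stamp.year, stamp.month, stamp.day, stamp.hour)  # 2020 1 1 12
```

The fractional part of the number carries the time of day, which is why the same numerical column can also feed hour(), minute(), and second().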

erf()

Compute error function.

exp()

Compute exponential function.

floor()

Round down value.

gamma()

Compute gamma function.

hour()

Extract hour (of the day) from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).

is_inf()

Determine whether the value is infinite.

is_nan()

Determine whether the value is NaN.

lgamma()

Compute log-gamma function.
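As a reminder of how the gamma and log-gamma functions relate, the sketch below uses Python's math module rather than getML. It assumes that the column methods follow the standard mathematical definitions, which the text above does not spell out.

```python
import math

# For positive arguments, lgamma(x) equals log(gamma(x)).
gamma_5 = math.gamma(5.0)        # gamma(5) = 4! = 24
log_gamma_5 = math.lgamma(5.0)
print(gamma_5)  # 24.0
print(abs(log_gamma_5 - math.log(gamma_5)) < 1e-12)  # True

# gamma() overflows a double for large inputs, while lgamma() stays finite.
try:
    math.gamma(200.0)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)  # True
print(math.isfinite(math.lgamma(200.0)))  # True
```

This is the usual reason to prefer the log-gamma function when the inputs can be large.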

log()

Compute natural logarithm.

max(alias='new_column')

MAX aggregation.

Parameters

alias (str) – Name for the new column.

median(alias='new_column')

MEDIAN aggregation.

Parameters

alias (str) – Name for the new column.

min(alias='new_column')

MIN aggregation.

Parameters

alias (str) – Name for the new column.

minute()

Extract minute (of the hour) from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).

month()

Extract month from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).

round()

Round to nearest.

second()

Extract second (of the minute) from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).

sin()

Compute sine.

sqrt()

Compute square root.

stddev(alias='new_column')

STDDEV aggregation.

Parameters

alias (str) – Name for the new column.

sum(alias='new_column')

SUM aggregation.

Parameters

alias (str) – Name for the new column.

tan()

Compute tangent.

to_numpy(sock=None)

Transform column to a numpy array.

Parameters

sock (optional) – Socket connecting the Python API with the getML engine.

update(condition, values)

Returns an updated version of this column.

All entries for which the corresponding condition is True are updated using the corresponding entry in values.

Parameters
  • condition (Boolean column) – Condition according to which the update is done.

  • values – Values to update with.
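The element-wise semantics of update() can be illustrated with plain Python lists. The function updated below is a hypothetical stand-in, not the getML method, and it assumes that condition and values have the same length as the column.

```python
from typing import List, Sequence


def updated(
    column: Sequence[float],
    condition: Sequence[bool],
    values: Sequence[float],
) -> List[float]:
    """Return a new list: take values[i] where condition[i] is True."""
    if not (len(column) == len(condition) == len(values)):
        raise ValueError("all inputs must have the same length")
    return [v if c else x for x, c, v in zip(column, condition, values)]


column = [1.0, -2.0, 3.0]
condition = [x < 0 for x in column]  # True only for the negative entry
result = updated(column, condition, [0.0, 0.0, 0.0])

print(result)  # [1.0, 0.0, 3.0]
print(column)  # [1.0, -2.0, 3.0], the input is left unchanged
```

As with the other operations on this class, the input is not modified; the result is a separate object.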

var(alias='new_column')

VAR aggregation.

Parameters

alias (str) – Name for the new column.

weekday()

Extract day of the week from a time stamp, Sunday being 0.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).
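The "Sunday being 0" convention differs from Python's own date.weekday(), where Monday is 0. The sketch below shows one way to obtain the Sunday-based numbering from the standard library; weekday_sunday_zero is a hypothetical helper used only for illustration.

```python
from datetime import date


def weekday_sunday_zero(d: date) -> int:
    """Day of the week with Sunday = 0, ..., Saturday = 6."""
    # isoweekday() runs Monday = 1 ... Sunday = 7; modulo 7 moves Sunday to 0.
    return d.isoweekday() % 7


print(weekday_sunday_zero(date(2020, 1, 5)))  # Sunday -> 0
print(weekday_sunday_zero(date(2020, 1, 1)))  # Wednesday -> 3
```

Keeping this offset in mind helps when comparing weekday() results with values computed in Python or pandas.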

year()

Extract year from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).

yearday()

Extract day of the year from a time stamp.

If the column is numerical, that number will be interpreted as the number of days since epoch time (January 1, 1970).