from_db

classmethod DataFrame.from_db(table_name, name=None, roles=None, ignore=False, dry=False, conn=None)

Create a DataFrame from a table in a database.

This method constructs a data frame object in the engine, fills it with the data read from the table table_name in the connected database (see database), and returns a corresponding DataFrame handle.

Args:

table_name (str): Name of the table to be read.

name (str, optional): Name of the data frame to be created. If not passed, table_name will be used.

roles (dict[str, List[str]], optional): A dictionary mapping the roles to the column names. If this is not passed, the roles will be sniffed from the table. The roles dictionary should be in the following format:

>>> roles = {"role1": ["colname1", "colname2"], "role2": ["colname3"]}

ignore (bool, optional): Only relevant when roles is not None. Determines what happens to any column names not mentioned in roles: they are either ignored (True) or read in as unused columns (False).

dry (bool, optional): If set to True, the data will not actually be read. Instead, the method will only return the roles it would have used. This can be used to hard-code roles when setting up a pipeline.

conn (Connection, optional): The database connection to be used. If you don’t explicitly pass a connection, the engine will use the default connection.
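To make the interplay of roles and ignore concrete, here is a minimal pure-Python sketch. The helper resolve_columns and the label "unused" are hypothetical and not part of getML; the sketch only mirrors the behavior described above and never contacts the engine or a database.

```python
from typing import Dict, List


def resolve_columns(
    table_columns: List[str],
    roles: Dict[str, List[str]],
    ignore: bool = False,
) -> Dict[str, List[str]]:
    """Hypothetical illustration of the documented `ignore` semantics.

    Not getML code: it only shows which columns would be kept.
    """
    # Columns that were explicitly assigned a role.
    mentioned = {col for cols in roles.values() for col in cols}
    # Copy so the caller's dictionary is not modified.
    resolved = {role: list(cols) for role, cols in roles.items()}
    # Columns of the table that no role refers to, in table order.
    leftover = [col for col in table_columns if col not in mentioned]
    if leftover and not ignore:
        # ignore=False: keep them as unused columns.
        # "unused" is a placeholder label chosen for this sketch.
        resolved.setdefault("unused", []).extend(leftover)
    # ignore=True: leftover columns are dropped entirely.
    return resolved


columns = ["colname1", "colname2", "colname3", "colname4"]
roles = {"role1": ["colname1", "colname2"], "role2": ["colname3"]}

kept = resolve_columns(columns, roles, ignore=False)
dropped = resolve_columns(columns, roles, ignore=True)
```

Here kept contains colname4 under the placeholder "unused" entry, while dropped contains only the three columns named in roles.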

Raises:

TypeError: If any of the input arguments is of a wrong type.

ValueError: If one of the provided keys in roles does not match a role defined in getml.data.roles.

Returns:

DataFrame: Handle to the underlying data.

Note:

The created data frame object is only held in memory by the getML engine. If you want to use it in later sessions or after switching the project, you have to call the save() method.

In addition to reading data from a table, you can write an existing DataFrame back into a new table in the same database using to_db(), or replace or append to the data of the current instance using the read_db() or read_query() method.
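The two follow-up steps from this note, persisting the frame and writing it back to the database, might be wrapped as in the sketch below. The helper persist_and_export is hypothetical. Only the method names save() and to_db() come from the text above; passing the target table name as the single positional argument of to_db() is an assumption.

```python
def persist_and_export(frame, export_table):
    """Hypothetical helper around the two methods named in the note."""
    # save() persists the data frame, which is otherwise only held in
    # memory by the engine, so it survives the session or a project switch.
    frame.save()
    # to_db() writes the data frame into a table of the connected database.
    # Assumption: the target table name is passed as the only argument.
    frame.to_db(export_table)
    return frame
```

With a frame obtained from from_db(), this could be called as persist_and_export(loan, "loan_copy"), where "loan_copy" is an illustrative table name.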

Example:

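import getml

# Connect to a MySQL database. No conn is passed to from_db() below,
# so the engine uses its default connection.
getml.database.connect_mysql(
    host="relational.fit.cvut.cz",
    port=3306,
    dbname="financial",
    user="guest",
    password="relational"
)

# Read the "loan" table into a new data frame named "data_frame_loan".
loan = getml.data.DataFrame.from_db(table_name="loan", name="data_frame_loan")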
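
The dry argument supports a two-step workflow: sniff the roles first, then read the data with those roles passed explicitly. The sketch below is one possible way to do that. The helper sniff_then_load is hypothetical, and it assumes that the object returned by the dry run can be passed back unchanged as the roles argument.

```python
def sniff_then_load(frame_cls, table_name, name=None, conn=None):
    """Hypothetical helper: dry run first, then read with explicit roles.

    `frame_cls` is expected to be getml.data.DataFrame, or any class
    exposing the same from_db() signature.
    """
    # dry=True: no data is read; only the roles that would have been
    # used are returned.
    roles = frame_cls.from_db(table_name, dry=True, conn=conn)
    # Inspect or edit `roles` here, or copy them into your pipeline code.
    # Assumption: the dry-run result is accepted as the `roles` argument.
    frame = frame_cls.from_db(
        table_name, name=name, roles=roles, ignore=False, conn=conn
    )
    return roles, frame
```

With the connection from the example above in place, this could be called as sniff_then_load(getml.data.DataFrame, "loan", name="data_frame_loan").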