This module provides communication routines to access various databases.

The connect_greenplum(), connect_mariadb(), connect_mysql(), connect_odbc(), connect_postgres(), and connect_sqlite3() functions establish a connection between a database and the getML engine. When data is imported using the read_db() or read_query() methods of a DataFrame instance, or the corresponding from_db() class method, it is loaded directly from the database into the engine without ever passing through the Python interpreter.

In addition, the module provides several auxiliary functions that are useful when analyzing and interacting with the database.


connect_greenplum(dbname, user, password, …)

Creates a new Greenplum database connection.

connect_mariadb(dbname, user, password, host)

Creates a new MariaDB database connection.

connect_mysql(dbname, user, password, host)

Creates a new MySQL database connection.

connect_odbc(server_name[, user, password, …])

Creates a new ODBC database connection.

connect_postgres(dbname, user, password, …)

Creates a new PostgreSQL database connection.

connect_sqlite3([name, time_formats, conn_id])

Creates a new SQLite3 database connection.

copy_table(source_conn, target_conn, …[, …])

Copies a table from one database connection to another.
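To make the idea of copying a table between two connections concrete, here is a minimal sketch that uses Python's standard-library sqlite3 module as a stand-in. It is not how getML implements copy_table(); the helper name copy_table_sketch and the example table population are hypothetical and chosen only for illustration.

```python
import sqlite3


def copy_table_sketch(source, target, table):
    """Copy `table` (schema and rows) from `source` to `target`.

    Hypothetical helper for illustration; works on sqlite3 connections only.
    """
    # SQLite keeps the original CREATE TABLE statement in sqlite_master.
    row = source.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if row is None:
        raise ValueError(f"no such table: {table}")
    target.execute(row[0])  # recreate the schema on the target connection

    rows = source.execute(f'SELECT * FROM "{table}"').fetchall()
    if rows:
        placeholders = ", ".join("?" for _ in rows[0])
        target.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', rows
        )
    target.commit()
    return len(rows)


# Two independent in-memory databases play the role of the two connections.
source_conn = sqlite3.connect(":memory:")
target_conn = sqlite3.connect(":memory:")
source_conn.execute("CREATE TABLE population (id INTEGER, name TEXT)")
source_conn.executemany(
    "INSERT INTO population VALUES (?, ?)", [(1, "a"), (2, "b")]
)
copied = copy_table_sketch(source_conn, target_conn, "population")
```

The sketch fetches all rows at once, which is fine for small tables; a real implementation would more likely stream the data in batches.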

drop_table(name[, conn])

Drops a table from the database.

execute(query[, conn])

Executes an SQL query on the database.

get(query[, conn])

Executes an SQL query on the database and returns the result as a pandas DataFrame.
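The difference between execute() and get() is that the first only runs a statement, while the second also hands the result back to Python. The sketch below mimics that split with the standard-library sqlite3 module. The names execute_sketch and get_sketch are hypothetical, and a plain dictionary of columns stands in for the pandas DataFrame so that the example has no third-party dependencies.

```python
import sqlite3


def execute_sketch(query, conn):
    """Run one or more SQL statements and discard any result."""
    conn.executescript(query)


def get_sketch(query, conn):
    """Run a query and return the result as {column name: list of values}."""
    cursor = conn.execute(query)
    names = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


conn = sqlite3.connect(":memory:")
execute_sketch(
    """
    CREATE TABLE orders (id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 9.5);
    INSERT INTO orders VALUES (2, 20.0);
    """,
    conn,
)
result = get_sketch("SELECT id, amount FROM orders ORDER BY id", conn)
```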

get_colnames(name[, conn])

Lists the column names of a table held in the database.


Returns a list of handles to all connections that are currently active on the engine.


Lists all tables and views currently held in the database.
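Listing the column names of a table and listing all tables and views are both lookups against the database catalog. As a rough illustration, the following sketch performs the equivalent lookups on an SQLite database through the standard-library sqlite3 module. The helper names are hypothetical, and the catalog queries shown are specific to SQLite; other database systems expose this information differently.

```python
import sqlite3


def list_tables_sketch(conn):
    """Return the names of all tables and views, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def get_colnames_sketch(conn, table):
    """Return the column names of `table` in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute(
    "CREATE VIEW big_orders AS SELECT * FROM orders WHERE amount > 100"
)
tables = list_tables_sketch(conn)
colnames = get_colnames_sketch(conn, "orders")
```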

read_csv(name, fnames[, quotechar, sep, …])

Reads a CSV file into the database.

read_s3(name, bucket, keys, region[, sep, …])

Reads a list of CSV files located in an S3 bucket.

sniff_csv(name, fnames[, num_lines_sniffed, …])

Sniffs a list of CSV files.
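Sniffing generally means inspecting the first lines of a file to guess its schema. The sketch below shows one simple way to do that with the standard library, assuming the goal is a CREATE TABLE statement, which the summary line above does not state. The function names guess_type and sniff_sketch are hypothetical, the type rules are deliberately crude, and the input is passed as a string rather than as file names to keep the example self-contained.

```python
import csv
import io


def guess_type(values):
    """Return the narrowest SQL type that fits every sampled value."""
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for value in values:
                cast(value)
            return sql_type
        except ValueError:
            continue  # at least one value did not parse; try a wider type
    return "TEXT"


def sniff_sketch(name, text, sep=",", num_lines_sniffed=1000):
    """Build a CREATE TABLE statement from the first lines of CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    header = next(reader)
    # Read at most num_lines_sniffed data rows after the header.
    sample = [row for _, row in zip(range(num_lines_sniffed), reader)]
    columns = []
    for i, column in enumerate(header):
        sql_type = guess_type([row[i] for row in sample])
        columns.append(f'"{column}" {sql_type}')
    body = ", ".join(columns)
    return f'CREATE TABLE "{name}" ({body});'


csv_text = "id,price,label\n1,2.5,a\n2,3,b\n"
statement = sniff_sketch("orders", csv_text)
```

Because only a sample of lines is inspected, a guessed type can turn out to be too narrow for later rows, so the generated statement is best reviewed before it is used.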

sniff_s3(name, bucket, keys, region[, …])

Sniffs a list of CSV files located in an S3 bucket.



A handle to a database connection on the getML engine.