Greenplum interface

Greenplum [1] is an open-source database system maintained by Pivotal Software, Inc. You can connect it to the getML engine with the function connect_greenplum(). Before doing so, make sure that your database is running, that you have the hostname, port, user name, and password at hand, and that you can reach the database from your command line.
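As a rough illustration of the connection step, the following sketch gathers the connection details from environment variables and passes them to connect_greenplum(). The environment variable names, the default values, and the helper names are hypothetical. The keyword names (dbname, user, password, host, hostaddr, port) and the module path getml.database are assumptions about the getML Python API and should be checked against the version you have installed.

```python
import os


def greenplum_connection_params(env=None):
    """Collect connection details for Greenplum from environment variables.

    The variable names and defaults below are hypothetical placeholders.
    """
    env = os.environ if env is None else env
    return {
        "dbname": env.get("GREENPLUM_DB", "postgres"),
        "user": env.get("GREENPLUM_USER", "gpadmin"),
        "password": env.get("GREENPLUM_PASSWORD", ""),
        "host": env.get("GREENPLUM_HOST", "localhost"),
        "hostaddr": env.get("GREENPLUM_HOSTADDR", "127.0.0.1"),
        "port": int(env.get("GREENPLUM_PORT", "5432")),
    }


def connect(params=None):
    """Connect the getML engine to Greenplum using the collected parameters."""
    # Imported lazily so the helper above can be used without getML installed.
    import getml

    if params is None:
        params = greenplum_connection_params()
    # Assumption: connect_greenplum() accepts these keyword arguments.
    return getml.database.connect_greenplum(**params)
```

Reading the password from the environment keeps it out of scripts and notebooks. Calling connect() requires a running getML engine and a reachable Greenplum instance.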

Import from Greenplum

By passing the name of an existing table in your database to the from_db() class method, you can create a new DataFrame containing all of its data. Alternatively, you can use the read_db() and read_query() methods to replace the content of the current DataFrame instance, or to append further rows to it, based on either a table or a specific query.
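The three import methods could be used roughly as follows. This is a minimal sketch: the helper names are hypothetical, and the keyword names (table_name, name, query, append) are assumptions about the getML Python API rather than something stated on this page. The DataFrame class is passed in as an argument so that the helpers do not import getML themselves.

```python
def load_table(data_frame_cls, table_name):
    """Create a new DataFrame containing all data of an existing table."""
    # Assumption: from_db() takes the source table and a name for the DataFrame.
    return data_frame_cls.from_db(table_name=table_name, name=table_name)


def reload_and_extend(df, table_name, extra_rows_query):
    """Replace the content of df with a table, then append rows from a query."""
    # append=False is assumed to replace the current content of the DataFrame.
    df.read_db(table_name=table_name, append=False)
    # append=True is assumed to add the query result as further rows.
    df.read_query(query=extra_rows_query, append=True)
    return df
```

With getML imported and a connection established through connect_greenplum(), load_table(getml.DataFrame, "population") would be a typical call, where "population" stands for a table that exists in your database.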

Export to Greenplum

You can also write your results back into the Greenplum database. If you provide a name for the destination table in getml.pipeline.Pipeline.transform(), the features generated from your raw data are written to that table. Passing a table name to getml.pipeline.Pipeline.predict() generates predictions of the target variables for new, unseen data and stores the result in the corresponding table.
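A minimal sketch of the export step is shown below. The helper name and the default table names are hypothetical, and the keyword table_name is an assumption about how transform() and predict() accept the destination table. The data arguments are forwarded unchanged because their exact form depends on the getML version in use.

```python
def export_results(pipeline, *data,
                   features_table="features",
                   predictions_table="predictions"):
    """Write generated features and predictions to database tables.

    `pipeline` is expected to be a fitted getml.pipeline.Pipeline and
    `data` whatever data arguments transform() and predict() require.
    """
    # Assumption: table_name names the destination table for the features.
    pipeline.transform(*data, table_name=features_table)
    # Assumption: table_name names the destination table for the predictions.
    pipeline.predict(*data, table_name=predictions_table)
    return features_table, predictions_table
```

Using separate tables for features and predictions keeps the two outputs from overwriting each other.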