to_pyspark

DataFrame.to_pyspark(spark, name=None)

Creates a pyspark.sql.DataFrame from the current instance.

Loads the underlying data from the getML engine and constructs a pyspark.sql.DataFrame.

Args:
spark (pyspark.sql.SparkSession):

The Spark session in which the data frame is created.

name (str or None):

The name of the temporary view created on top of the pyspark.sql.DataFrame, under which it can be referenced in Spark SQL (see pyspark.sql.DataFrame.createOrReplaceTempView()). If None is passed, the name of this getml.DataFrame is used.

Returns:
pyspark.sql.DataFrame:

PySpark equivalent of the current instance, including its underlying data.
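
Example:
The following is a minimal sketch, not part of the getML API, of how the returned data frame and its temporary view might be used together. The helper name count_rows_via_spark_sql and its view_name parameter are hypothetical. It assumes getml_df is a getml.DataFrame (with a name attribute) and spark is a pyspark.sql.SparkSession.

```python
def count_rows_via_spark_sql(getml_df, spark, view_name=None):
    """Export a getml.DataFrame to Spark and count its rows via Spark SQL.

    Hypothetical helper for illustration only. `getml_df` is assumed to be
    a getml.DataFrame and `spark` a pyspark.sql.SparkSession.
    """
    # to_pyspark loads the data from the getML engine and registers a
    # temporary view on top of the resulting pyspark.sql.DataFrame.
    spark_df = getml_df.to_pyspark(spark, name=view_name)

    # As documented above, if no name is passed, the view is named after
    # the getml.DataFrame itself.
    resolved_name = view_name if view_name is not None else getml_df.name

    # Refer to the temporary view by name from Spark SQL.
    query = f"SELECT COUNT(*) AS n FROM `{resolved_name}`"
    n_rows = spark.sql(query).collect()[0]["n"]

    return spark_df, n_rows
```

With a running getML engine, the helper could be called with an existing getml.DataFrame and a session obtained from pyspark.sql.SparkSession.builder.getOrCreate(). Passing view_name explicitly is useful when the getml.DataFrame name would collide with another view in the same session.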