mmtfPyspark.ml.pythonRDDToDataset module

pythonRDDToDataset.py:

This module converts a PythonRDD<Row> to a Dataset<Row>. Only simple data types are supported, and no value may be null.

get_dataset(data, colNames)
Converts a PythonRDD<Row> to a Dataset<Row>. This method supports
only simple data types, and every value must be non-null.
Parameters:

data : PythonRDD

PythonRDD of row objects

colNames : list

names of the columns in a row
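
Because the method accepts only simple, non-null values, it can help to validate a sample row before calling it. The following is a minimal sketch of such a check, not part of mmtfPyspark: the helper name check_row, the set of types treated as "simple" (bool, int, float, str), and the sample column names are all assumptions made for illustration.

```python
# Hypothetical pre-flight check; NOT part of mmtfPyspark.
# Assumption: "simple data types" means bool, int, float and str.
SIMPLE_TYPES = (bool, int, float, str)


def check_row(row, col_names):
    """Pair each column name with the Python type name of its value.

    Raises ValueError for a length mismatch or a null value, and
    TypeError for a value whose type is not in SIMPLE_TYPES.
    """
    if len(row) != len(col_names):
        raise ValueError(
            "expected %d values, got %d" % (len(col_names), len(row))
        )
    described = []
    for name, value in zip(col_names, row):
        if value is None:
            raise ValueError("column %r is null" % name)
        # Exact type match, so subclasses and containers are rejected.
        if type(value) not in SIMPLE_TYPES:
            raise TypeError(
                "column %r has unsupported type %s"
                % (name, type(value).__name__)
            )
        described.append((name, type(value).__name__))
    return described


# Sample row and column names are illustrative only.
sample = ("1ABC", 2.5, 4)
columns = ["structureId", "resolution", "chains"]
summary = check_row(sample, columns)
```

If the check passes for a representative row (for example the first element of the RDD), the same column list can then be passed as colNames to get_dataset(data, colNames).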