mmtfPyspark.ml.datasetClassifier module

datasetClassifier.py

Runs binary and multi-class classifiers on a given dataset. The dataset is read from a Parquet file. It must contain a feature vector column named “features” and a classification column. The name of the classification column must be specified on the command line.
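The input requirements above can be checked before running the classifiers. The following is a minimal sketch in plain Python, not part of mmtfPyspark: the helper name `check_columns` and the example column name `foldType` are hypothetical, and only the “features” column name comes from the description.

```python
# Hypothetical pre-flight check; this helper is not provided by mmtfPyspark.
FEATURES_COLUMN = "features"  # column name required by the module description


def check_columns(columns, label_column):
    """Return a list of problems; an empty list means the columns look usable."""
    problems = []
    if FEATURES_COLUMN not in columns:
        problems.append(f"missing feature vector column '{FEATURES_COLUMN}'")
    if label_column not in columns:
        problems.append(f"missing classification column '{label_column}'")
    elif label_column == FEATURES_COLUMN:
        # The classification column has to be a separate column.
        problems.append("classification column must differ from 'features'")
    return problems


if __name__ == "__main__":
    # 'foldType' is an assumed example name for the classification column.
    print(check_columns(["structureId", "features", "foldType"], "foldType"))
```

With PySpark, the list of column names could be taken from the `columns` attribute of the DataFrame returned by `spark.read.parquet(path)`, assuming `spark` is an active `SparkSession`.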

main(argv)