mmtfPyspark.datasets.polymerSequenceExtractor module

polymerSequenceExtractor.py:

Creates a dataset of polymer sequences using the full sequence used in the experiment (i.e., the “SEQRES” record in PDB files).

get_dataset(structures)

Returns a dataset of the polymer sequences contained in PDB entries, using the full sequence used in the experiment (i.e., the “SEQRES” record in PDB files).

Parameters:

structures : pythonRDD

a set of PDB structures

Returns:

dataset

dataset of polymer sequences
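
To illustrate what "the full sequence used in the experiment" means, here is a minimal sketch that collects one-letter sequences per chain from SEQRES records in PDB-format text. It is not how get_dataset works internally (mmtfPyspark reads MMTF data, not PDB text); the names seqres_sequences, THREE_TO_ONE, and EXAMPLE are hypothetical, and the sample records are made up for illustration. The sketch assumes the standard fixed-column SEQRES layout (chain ID in column 12, residue names from column 20) and maps any non-standard residue to "X".

```python
# Illustrative only: parses SEQRES records from PDB-format text.
# mmtfPyspark itself works on MMTF structures, not on PDB text.

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def seqres_sequences(pdb_text):
    """Return {chain_id: one-letter sequence} built from SEQRES records."""
    chains = {}
    for line in pdb_text.splitlines():
        if not line.startswith("SEQRES"):
            continue
        chain_id = line[11]           # column 12 holds the chain identifier
        residues = line[19:].split()  # residue names start at column 20
        # Unknown or modified residues are mapped to "X" (an assumption).
        letters = "".join(THREE_TO_ONE.get(name, "X") for name in residues)
        # A chain may span several SEQRES lines, so append in file order.
        chains[chain_id] = chains.get(chain_id, "") + letters
    return chains


# Made-up sample records, for illustration only.
EXAMPLE = """\
SEQRES   1 A    5  MET LYS VAL LEU ALA
SEQRES   1 B    3  GLY SER HIS
"""

print(seqres_sequences(EXAMPLE))
```

Because SEQRES lists every residue of the sample that was studied, the resulting sequence can be longer than the set of residues that have coordinates in the structure.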