mmtfPyspark.datasets.secondaryStructureElementExtractor module

secondaryStructureElementExtractor.py

Returns a dataset of continuous segments of protein sequence with the specified DSSP secondary structure code (E, H, C) of a minimum length.

Examples

sequence    label
TFIVTA      E
ALTGTYE     E
get_dataset(structure, label, length=None)

Returns a dataset of continuous segments of protein sequence with the specified DSSP secondary structure code (E, H, C) of a minimum length.

Parameters:

structure : structure

label : str

DSSP secondary structure label (E, H, C)

length : int, optional

minimum length of a secondary structure segment (default: None)

Returns:

dataset

dataset of continuous segments of protein sequence
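
Illustration:

The segment extraction described above can be sketched in plain Python, without Spark or mmtfPyspark. This is not the module's implementation. The helper name extract_segments, its parameters, and the flanking residues in the toy input are hypothetical; only the segments TFIVTA and ALTGTYE come from the example table. The sketch also assumes that the minimum length is inclusive and that the input is a sequence string paired with a per-residue secondary structure string of the same length.

```python
import re


def extract_segments(sequence, sec_struct, label, min_length=1):
    """Return (segment, label) pairs for every continuous run of `label`
    in `sec_struct` that is at least `min_length` residues long.

    Hypothetical helper for illustration only; not part of mmtfPyspark.
    """
    if len(sequence) != len(sec_struct):
        raise ValueError("sequence and sec_struct must have the same length")

    # One or more consecutive occurrences of the requested label.
    run = re.compile(re.escape(label) + "+")

    return [
        (sequence[m.start():m.end()], label)
        for m in run.finditer(sec_struct)
        if m.end() - m.start() >= min_length  # inclusive minimum (assumption)
    ]


# Toy input: the two strand segments from the example table, padded with
# made-up coil residues so that the runs are separated.
sequence = "MK" + "TFIVTA" + "GGS" + "ALTGTYE" + "KK"
sec_struct = "CC" + "E" * 6 + "CCC" + "E" * 7 + "CC"

strands = extract_segments(sequence, sec_struct, "E", min_length=4)
for segment, code in strands:
    print(segment, code)
```

Raising min_length to 7 in this sketch keeps only ALTGTYE, because TFIVTA is six residues long.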