Swiss Model Dataset

This demo shows how to access metadata for SWISS-MODEL homology models.

References

Bienert S, Waterhouse A, de Beer TA, Tauriello G, Studer G, Bordoli L, Schwede T (2017). The SWISS-MODEL Repository - new features and functionality. Nucleic Acids Res. 45(D1):D313-D319. https://dx.doi.org/10.1093/nar/gkw1132

Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Gallo Cassarino T, Bertoni M, Bordoli L, Schwede T (2014). SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 42(W1):W252-W258. https://doi.org/10.1093/nar/gku340

Imports

In [1]:
from pyspark.sql import SparkSession
from mmtfPyspark.datasets import swissModelDataset

Configure Spark Session

In [2]:
# Run Spark locally, using all available cores
spark = (SparkSession.builder
         .master("local[*]")
         .appName("SwissModelDatasetDemo")
         .getOrCreate())

Download metadata for SWISS-MODEL homology models

In [3]:
# List of UniProt IDs to be retrieved from SWISS-MODEL
uniProtIds = ['P36575','P24539','O00244']

ds = swissModelDataset.get_swiss_models(uniProtIds)
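
To give a rough idea of what a lookup per UniProt ID might involve, the sketch below builds one repository URL per accession. The endpoint pattern (`BASE_URL` and `SUFFIX`) is an assumption that does not come from this demo, and `build_request_urls` is a hypothetical helper, not part of mmtfPyspark. The sketch only constructs strings and makes no network request.

```python
# Assumed endpoint pattern for the SWISS-MODEL Repository (not taken from this demo).
BASE_URL = "https://swissmodel.expasy.org/repository/uniprot/"
SUFFIX = ".json?provider=swissmodel"


def build_request_urls(uniprot_ids):
    """Hypothetical helper: return one repository URL per UniProt accession."""
    return [f"{BASE_URL}{ac}{SUFFIX}" for ac in uniprot_ids]


urls = build_request_urls(['P36575', 'P24539', 'O00244'])
for url in urls:
    print(url)
```

In the notebook itself, `get_swiss_models` takes care of retrieval and returns the metadata as a Spark dataset, so none of this is needed to run the demo.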

Show results

In [5]:
df = ds.toPandas()
df.head()
Out[5]:
ac sequence from to qmean qmean_norm gmqe coverage oligo-state method template identity similarity coordinates md5 md5
0 P36575 MSKVFKKTSSNGKLSIYLGKRDFVDHVDTVEPIDGVVLVDPEYLKC... 2 371 -3.206383 0.658018 0.757 0.953608 monomer Homology 1suj.1.A 68.664848 0.504633 https://swissmodel.expasy.org/repository/unipr... 1b2ec664c28f6cde36c416b6a66fc591 1b2ec664c28f6cde36c416b6a66fc591
1 P24539 MLSRVVLSAAATAAPSLKNAAFLGPGVLQATRTFHTGQPHLVPVPP... 76 249 -2.543623 0.669841 0.656 0.679688 monomer Homology 5ara.1.S 84.482758 0.547889 https://swissmodel.expasy.org/repository/unipr... 138e5aeaf02a8fa2e9c52264e5383033 138e5aeaf02a8fa2e9c52264e5383033
2 O00244 MPKHEFSVDMTCGGCAEAVSRVLNKLGGVKYDIDLPNKKVCIESEH... 1 68 1.047134 0.842332 0.987 1.000000 homo-2-mer Homology 1fe4.1.B 100.000000 0.606865 https://swissmodel.expasy.org/repository/unipr... 34f221f64be3395aa958786b84dfc0da 34f221f64be3395aa958786b84dfc0da
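
Once the metadata is in pandas, a common next step is to keep only models that meet some quality criteria. The sketch below is one possible way to do that. To stay self-contained it rebuilds a small frame from the values shown in Out[5] (sequence, coordinates and md5 columns omitted). The thresholds `MIN_COVERAGE` and `MIN_GMQE` are arbitrary illustrative choices, not recommendations from SWISS-MODEL, and the `modelled_length` column is a hypothetical addition.

```python
import pandas as pd

# Values copied from the Out[5] table; long text columns are omitted.
models = pd.DataFrame(
    {
        "ac": ["P36575", "P24539", "O00244"],
        "from": [2, 76, 1],
        "to": [371, 249, 68],
        "qmean_norm": [0.658018, 0.669841, 0.842332],
        "gmqe": [0.757, 0.656, 0.987],
        "coverage": [0.953608, 0.679688, 1.000000],
        "oligo-state": ["monomer", "monomer", "homo-2-mer"],
        "template": ["1suj.1.A", "5ara.1.S", "1fe4.1.B"],
    }
)

# Number of residues spanned by each model (inclusive residue range).
models["modelled_length"] = models["to"] - models["from"] + 1

# Illustrative thresholds only; adjust to the needs of the analysis.
MIN_COVERAGE = 0.9
MIN_GMQE = 0.7

selected = models[
    (models["coverage"] >= MIN_COVERAGE) & (models["gmqe"] >= MIN_GMQE)
]
print(selected[["ac", "template", "modelled_length", "coverage", "gmqe"]])
```

With these thresholds the models for P36575 and O00244 are kept and the one for P24539 is dropped because its coverage is about 0.68. In the notebook, the same filter can be applied directly to `df`.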

Terminate Spark

In [7]:
spark.stop()