This demo shows how to access metadata for SWISS-MODEL homology models.
Bienert S, Waterhouse A, de Beer TA, Tauriello G, Studer G, Bordoli L, Schwede T (2017). The SWISS-MODEL Repository - new features and functionality. Nucleic Acids Res. 45(D1):D313-D319. https://dx.doi.org/10.1093/nar/gkw1132
Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Gallo Cassarino T, Bertoni M, Bordoli L, Schwede T (2014). SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 42(W1):W252-W258. https://doi.org/10.1093/nar/gku340
In [1]:
from pyspark.sql import SparkSession
from mmtfPyspark.datasets import swissModelDataset
In [2]:
spark = SparkSession.builder \
    .master("local[*]") \
    .appName("SwissModelDatasetDemo") \
    .getOrCreate()
In [3]:
# list of UniProt IDs to be retrieved from SWISS-MODEL
uniProtIds = ['P36575','P24539','O00244']
ds = swissModelDataset.get_swiss_models(uniProtIds)
In [5]:
df = ds.toPandas()
df.head()
Out[5]:
| | ac | sequence | from | to | qmean | qmean_norm | gmqe | coverage | oligo-state | method | template | identity | similarity | coordinates | md5 | md5 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | P36575 | MSKVFKKTSSNGKLSIYLGKRDFVDHVDTVEPIDGVVLVDPEYLKC... | 2 | 371 | -3.206383 | 0.658018 | 0.757 | 0.953608 | monomer | Homology | 1suj.1.A | 68.664848 | 0.504633 | https://swissmodel.expasy.org/repository/unipr... | 1b2ec664c28f6cde36c416b6a66fc591 | 1b2ec664c28f6cde36c416b6a66fc591 |
1 | P24539 | MLSRVVLSAAATAAPSLKNAAFLGPGVLQATRTFHTGQPHLVPVPP... | 76 | 249 | -2.543623 | 0.669841 | 0.656 | 0.679688 | monomer | Homology | 5ara.1.S | 84.482758 | 0.547889 | https://swissmodel.expasy.org/repository/unipr... | 138e5aeaf02a8fa2e9c52264e5383033 | 138e5aeaf02a8fa2e9c52264e5383033 |
2 | O00244 | MPKHEFSVDMTCGGCAEAVSRVLNKLGGVKYDIDLPNKKVCIESEH... | 1 | 68 | 1.047134 | 0.842332 | 0.987 | 1.000000 | homo-2-mer | Homology | 1fe4.1.B | 100.000000 | 0.606865 | https://swissmodel.expasy.org/repository/unipr... | 34f221f64be3395aa958786b84dfc0da | 34f221f64be3395aa958786b84dfc0da |
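The metadata columns shown above can be used to shortlist models before downloading any coordinates. The following is a minimal sketch in plain Python, using values copied from the three rows in the table. The cut-offs `MIN_GMQE` and `MIN_COVERAGE` and the helper names `modelled_length` and `select_models` are hypothetical, chosen only for illustration; the sketch also assumes that `from` and `to` are inclusive residue positions.

```python
# Values copied from the three rows of the output table.
models = [
    {"ac": "P36575", "from": 2, "to": 371, "gmqe": 0.757, "coverage": 0.953608},
    {"ac": "P24539", "from": 76, "to": 249, "gmqe": 0.656, "coverage": 0.679688},
    {"ac": "O00244", "from": 1, "to": 68, "gmqe": 0.987, "coverage": 1.000000},
]

# Hypothetical cut-offs, not recommended values.
MIN_GMQE = 0.7
MIN_COVERAGE = 0.9


def modelled_length(model):
    """Number of residues spanned by the model, assuming inclusive ends."""
    return model["to"] - model["from"] + 1


def select_models(models, min_gmqe=MIN_GMQE, min_coverage=MIN_COVERAGE):
    """Return accessions whose GMQE and coverage both meet the cut-offs."""
    return [
        m["ac"]
        for m in models
        if m["gmqe"] >= min_gmqe and m["coverage"] >= min_coverage
    ]


selected = select_models(models)
lengths = {m["ac"]: modelled_length(m) for m in models}
print(selected)
print(lengths)
```

On the pandas DataFrame from the demo, a comparable filter could likely be written with boolean indexing, for example `df[(df["gmqe"] >= 0.7) & (df["coverage"] >= 0.9)]`, assuming both columns are numeric.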
In [7]:
spark.stop()