mmtfPyspark.datasets.myVariantDataset module

myVariantDataset.py

This class uses the MyVariant.info web services to retrieve missense variations for a list of UniProt IDs.

Examples

Get all missense variations for a list of UniProt IDs:

>>> uniprotIds = ['P15056']    # BRAF
>>> ds = MyVariantDataset.get_variations(uniprotIds)
>>> ds.show()

Return missense variations that match a query:

>>> uniprotIds = ['P15056']    # BRAF
>>> query = ("clinvar.rcv.clinical_significance:pathogenic"
...          + " OR clinvar.rcv.clinical_significance:likely pathogenic")
>>> ds = MyVariantDataset.get_variations(uniprotIds, query)
>>> ds.show()
+-------------------+---------+
|        variationId|uniprotId|
+-------------------+---------+
|chr7:g.140454006G>T|   P15056|
|chr7:g.140453153A>T|   P15056|
|chr7:g.140477853C>A|   P15056|
+-------------------+---------+

get_variations(uniprotIds, query='')

Returns a dataset of missense variations for a list of UniProt IDs and a MyVariant.info query.

Parameters:

uniprotIds : list

list of UniProt IDs

query : str

MyVariant.info query string (default: '')

Returns:

dataset

dataset with variation IDs and UniProt IDs, or null if no data are found
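
Because the return value may be null, callers may want to guard against it before using the dataset. The following is a minimal sketch, not part of this module: the helper name `variation_ids` is hypothetical, and it assumes that null surfaces as `None` in Python and that the returned dataset is a Spark DataFrame with a `variationId` column, as in the example output.

```python
def variation_ids(ds):
    """Return the variation IDs in ds as a list.

    Hypothetical helper. Assumes ds is either None (no data found) or a
    Spark DataFrame with a 'variationId' column.
    """
    if ds is None:
        # No data were found for the given UniProt IDs and query.
        return []
    # collect() brings the rows to the driver; fine for small result sets.
    return [row["variationId"] for row in ds.collect()]


# With no data, the helper yields an empty list instead of raising.
print(variation_ids(None))
```

Returning an empty list keeps downstream loops simple; code that must distinguish "no data" from "empty result" could test for `None` directly instead.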

References

Query syntax: http://myvariant.info/docs/

Examples

>>> uniprotIds = ['P15056']    # BRAF
>>> query = ("clinvar.rcv.clinical_significance:pathogenic"
...          + " OR clinvar.rcv.clinical_significance:likely pathogenic")
>>> ds = MyVariantDataset.get_variations(uniprotIds, query)
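
Query strings that combine several terms are easy to get wrong when concatenated by hand, for example by dropping the space before `OR`. The sketch below shows one way to assemble such a query; `build_significance_query` is a hypothetical helper, not part of mmtfPyspark, and it only builds the string for the ClinVar field `clinvar.rcv.clinical_significance` used in the examples.

```python
def build_significance_query(terms, field="clinvar.rcv.clinical_significance"):
    """Join one 'field:term' clause per term with ' OR '.

    Hypothetical helper: it only formats the string and does not
    validate the terms against MyVariant.info.
    """
    clauses = [f"{field}:{term}" for term in terms]
    return " OR ".join(clauses)


query = build_significance_query(["pathogenic", "likely pathogenic"])
print(query)
# prints: clinvar.rcv.clinical_significance:pathogenic OR clinvar.rcv.clinical_significance:likely pathogenic
```

The resulting string can be passed as the `query` argument of `get_variations`.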