Title: | Analyses of Protein Post-Translational Modifications |
---|---|
Description: | Contains utilities for the analysis of post-translational modifications (PTMs) in proteins, with particular emphasis on the sulfoxidation of methionine residues. Features include the ability to download, filter and analyze data from the sulfoxidation database 'MetOSite'. Utilities to search and characterize S-aromatic motifs in proteins are also provided. In addition, functions to analyze sequence environments around modifiable residues in proteins can be found. For instance, 'ptm' allows users to search for amino acids either overrepresented or avoided around the modifiable residues from the proteins of interest. Functions tailored to test statistical hypotheses related to these differential sequence environments are also implemented. Further detailed information regarding the methods in this package can be found in (Aledo (2020) <https://metositeptm.com>). |
Authors: | Juan Carlos Aledo [aut, cre] |
Maintainer: | Juan Carlos Aledo <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2025-02-17 04:00:07 UTC |
Source: | https://github.com/cran/ptm |
Finds the path to an executable.
.get.exepath(prg)
prg |
name of the executable. |
Returns the absolute path.
Gets a web resource.
.get.url(url, n_tries = 3)
url |
url to be reached. |
n_tries |
number of tries. |
Returns the response or an error message.
Returns the residue found at the requested position.
aa.at(at, target, uniprot = TRUE)
at |
the position in the primary structure of the protein. |
target |
a character string specifying the UniProt ID of the protein of interest or, alternatively, the sequence of that protein. |
uniprot |
logical, if TRUE the argument 'target' should be an ID. |
Note that when uniprot is set to FALSE, target can be the string returned by a suitable function, such as get.seq.
Returns a single character representing the residue found at the indicated position in the indicated protein.
Juan Carlos Aledo
is.at(), aa.comp()
## Not run: aa.at(28, 'P01009')
Returns a table with the amino acid composition of the target protein.
aa.comp(target, uniprot = TRUE, reference = 'human', init = FALSE)
target |
a character string specifying the UniProt ID of the protein of interest or, alternatively, the sequence of that protein. |
uniprot |
logical, if TRUE the argument 'target' should be an ID. |
reference |
amino acid frequencies (in percent) of the proteinogenic amino acids to be used as reference. It should be either 'human' or 'up' (composition of proteins in UniProt in 2019). Alternatively, the user can pass any vector with 20 values to be used as reference. |
init |
logical, whether or not to remove the first residue (initiation methionine) from the sequence. |
Returns a list where the first element is a dataframe with the observed and expected frequencies for each amino acid, and the second element is the result of the Chi-squared test. In addition, a plot reflecting potential deviations from the reference standard composition is shown.
Juan Carlos Aledo
is.at(), renum.pdb(), renum.meto(), renum(), aa.at()
aa.comp('MPSSVSWGILLLAGLCCLVPVSLAEDPQGDAAQK', uniprot = FALSE)
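The comparison between observed and expected composition can be sketched in base R, without the package. This is only an illustration and not the code behind aa.comp: the uniform reference (5% per amino acid) is a hypothetical placeholder chosen for the example, whereas aa.comp defaults to the 'human' reference, and no plot is drawn here.

```r
# Illustrative sketch only; this is not the implementation of aa.comp().
seq_string <- "MPSSVSWGILLLAGLCCLVPVSLAEDPQGDAAQK"
aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]  # the 20 proteinogenic amino acids

# Observed counts, keeping a zero for every amino acid absent from the sequence
residues <- strsplit(seq_string, "")[[1]]
observed <- as.vector(table(factor(residues, levels = aa)))

# ASSUMPTION: a uniform reference, used here only as a placeholder
reference <- rep(1 / 20, 20)
expected <- reference * length(residues)

composition <- data.frame(aa, observed, expected)

# Goodness-of-fit test against the reference; with such a short sequence the
# expected counts are small, so the approximation warning is silenced
test <- suppressWarnings(chisq.test(observed, p = reference))

print(composition)
print(test$p.value)
```

With a real reference vector, the 20 values would need to be ordered like `aa` and rescaled to sum to 1 before being passed as `p`.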
Imports a protein sequence from a selected database.
get.seq(id, db = 'uniprot', as.string = TRUE)
id |
the identifier of the protein of interest. |
db |
a character string specifying the desired database; it must be one of 'uniprot' or 'metosite'. |
as.string |
logical, if TRUE the imported sequence will be returned as a character string. |
MetOSite uses the same type of protein ID as UniProt.
Returns a protein sequence either as a character vector or as a character string.
get.seq('P01009')
Checks that an internet resource works properly and fails gracefully when it does not.
gracefully_fail(call, timeout = 10, ...)
call |
url of the resource. |
timeout |
set maximum request time in seconds. |
... |
further named parameters, such as query, headers, etc. |
To be used as an ancillary function.
The response object or NULL when the server does not respond properly.
thefactmachine
https://gist.github.com/thefactmachine/18279b7796c0836d9188
gracefully_fail("http://httpbin.org/delay/2")
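The general idea of failing gracefully (returning NULL with a message instead of stopping with an error) can be sketched with base R's tryCatch. The helper safe_read below is hypothetical and is not part of 'ptm'; it reads a resource with readLines rather than issuing an HTTP request, so that the example runs without a network connection.

```r
# Illustrative sketch only; safe_read() is a hypothetical helper, not part of 'ptm'.
safe_read <- function(resource, timeout = 10) {
  old <- options(timeout = timeout)  # maximum request time, in seconds
  on.exit(options(old), add = TRUE)  # restore the previous setting on exit
  tryCatch(
    readLines(resource, warn = FALSE),
    warning = function(w) {
      message("Resource not available: ", conditionMessage(w))
      NULL
    },
    error = function(e) {
      message("Resource not available: ", conditionMessage(e))
      NULL
    }
  )
}

# A resource that does not exist yields NULL instead of an error
res_missing <- safe_read(file.path(tempdir(), "no_such_resource.txt"))
is.null(res_missing)

# A resource that exists is returned as usual
tmp <- tempfile(fileext = ".txt")
writeLines("ok", tmp)
res_ok <- safe_read(tmp)
```

Callers can then test the result with is.null() and skip the downstream steps when the resource is unavailable.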
A dataset containing data regarding human MetO sites oxidized by H2O2.
hmeto
A data frame with 4472 rows and 15 variables:
UniProt ID of the oxidized protein
the protein's name
the position of the MetO site in the primary structure
conditions under which the oxidation experiment was carried out
array with all the sites oxidized in that protein
primary key identifying the site
sequence environment of the MetO site
sequence environment of a non oxidized Met from the same protein
Intrinsically Disordered Proteins, 0: the protein is not found in DisProt; 1: the protein contains disordered regions; 2: the protein may contain disordered regions but the experimental evidence is ambiguous
Intrinsically Disordered Region, TRUE: the MetO site belongs to the IDR, FALSE: the MetO site doesn't belong to the IDR
protein abundance, in ppm
protein length, in number of residues
number of methionine residues
relative frequency of Met in that protein
whether the protein has been described to be oxidized in vivo, in vitro or under both conditions
Checks if a given amino acid is at a given position.
is.at(at, target, aa = 'M', uniprot = TRUE)
at |
the position in the primary structure of the protein. |
target |
a character string specifying the UniProt ID of the protein of interest or, alternatively, the sequence of that protein. |
aa |
the amino acid of interest. |
uniprot |
logical, if TRUE the argument 'target' should be an ID. |
Note that when uniprot is set to FALSE, target can be the string returned by a suitable function, such as get.seq.
Returns a logical value: TRUE if the residue is present at that position, FALSE otherwise.
Juan Carlos Aledo
aa.at(), aa.comp()
## Not run: is.at(28, 'P01009', 'Q')
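When the sequence is already available as a string (the uniprot = FALSE case), the position lookup and the residue check reduce to a substring comparison. The sketch below uses hypothetical helpers, residue_at and residue_is, that are not part of 'ptm', together with the sequence fragment from the aa.comp example, so it runs offline.

```r
# Illustrative sketch only; these helpers are hypothetical and not part of 'ptm'.
residue_at <- function(at, sequence) {
  substr(sequence, at, at)  # one-letter code found at position 'at'
}

residue_is <- function(at, sequence, aa = "M") {
  residue_at(at, sequence) == aa
}

fragment <- "MPSSVSWGILLLAGLCCLVPVSLAEDPQGDAAQK"

residue_at(28, fragment)
residue_is(28, fragment, "Q")
residue_is(1, fragment)  # default check: is the residue a methionine?
```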
Lists proteins found in MetOSite with names matching the keyword.
meto.list(keyword)
keyword |
a character string corresponding to the keyword |
This function returns a dataframe with the UniProt ID, the protein name and the species, for those proteins present in MetOSite whose name contains the keyword.
Juan Carlos Aledo
Valverde et al. 2019. Bioinformatics 35:4849-4850 (PMID: 31197322)
meto.search(), meto.scan()
meto.list('inhibitor')
Scans a given protein in search of MetO sites.
meto.scan(up_id, report = 1)
up_id |
a character string corresponding to the UniProt ID. |
report |
it should be a natural number between 1 and 3. |
When the 'report' parameter is set to 1, this function returns a brief report providing the position, the function category and literature references concerning the residues detected as MetO, if any. To obtain a more detailed report, set report = 2. Finally, to obtain a detailed and printable report (saved in the current directory), set report = 3.
This function returns a report regarding the MetO sites found, if any, in the protein of interest.
Juan Carlos Aledo
Valverde et al. 2019. Bioinformatics 35:4849-4850 (PMID: 31197322)
meto.search(), meto.list()
meto.scan('P01009')
Searches for specific MetO sites filtering MetOSite according to the selected criteria.
meto.search(highthroughput.group = TRUE, bodyguard.group = TRUE, regulatory.group = TRUE, gain.activity = 2, loss.activity = 2, gain.ppi = 2, loss.ppi = 2, change.stability = 2, change.location = 2, organism = -1, oxidant = -1)
highthroughput.group |
logical, when FALSE the sites described in a high-throughput study (unknown effect) are filtered out. |
bodyguard.group |
logical, when FALSE the sites postulated to function as ROS sink (because when oxidized no apparent effect can be detected) are filtered out. |
regulatory.group |
logical, when FALSE the sites whose oxidation affects the properties of the protein (and therefore may be involved in regulation) are filtered out. |
gain.activity |
introduce 1 or 0 to indicate whether the oxidation of the selected sites implies a gain of activity or not, respectively. If we do not wish to use this property to filter, introduce 2. |
loss.activity |
introduce 1 or 0 to indicate whether the oxidation of the selected sites implies a loss of activity or not, respectively. If we do not wish to use this property to filter, introduce 2. |
gain.ppi |
introduce 1 or 0 to indicate whether the oxidation of the selected sites implies a gain of protein-protein interaction or not, respectively. If we do not wish to use this property to filter, introduce 2. |
loss.ppi |
introduce 1 or 0 to indicate whether the oxidation of the selected sites implies a loss of protein-protein interaction or not, respectively. If we do not wish to use this property to filter, introduce 2. |
change.stability |
introduce 1 or 0 to indicate whether the oxidation of the selected sites leads to a change in the protein stability or not, respectively. If we do not wish to use this property to filter, introduce 2. |
change.location |
introduce 1 or 0 to indicate whether the oxidation of the selected sites implies a change of localization or not, respectively. If we do not wish to use this property to filter, introduce 2. |
organism |
a character string indicating the scientific name of the species of interest, or -1 if we do not wish to filter by species. |
oxidant |
a character string indicating the oxidant, or -1 if we do not wish to filter by oxidants. |
Note that all the arguments of this function are optional. We only pass an argument to the function when we want to use that parameter to filter. Thus, meto.search() will return all the MetO sites found in the database MetOSite.
This function returns a dataframe with a line per MetO site.
Juan Carlos Aledo
Valverde et al. 2019. Bioinformatics 35:4849-4850 (PMID: 31197322)
meto.scan(), meto.list()
meto.search(organism = 'Homo sapiens', oxidant = 'HClO')
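The filtering convention described above (1 or 0 to filter on a property, 2 to ignore it, and -1 to skip a character filter) can be sketched on a toy data frame. Everything in this example is hypothetical: the rows are invented purely for illustration, the column names are assumptions, and filter_sites is not a 'ptm' function; it only mimics the convention and does not query MetOSite.

```r
# Illustrative sketch only; the data are invented and do not come from MetOSite.
toy_sites <- data.frame(
  site          = c("s1", "s2", "s3", "s4"),
  gain_activity = c(1, 0, 0, 1),
  loss_activity = c(0, 1, 0, 0),
  organism      = c("Homo sapiens", "Homo sapiens", "Bos taurus", "Homo sapiens"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 2 means "do not filter on this property",
# -1 means "do not filter by organism"
filter_sites <- function(sites, gain_activity = 2, loss_activity = 2, organism = -1) {
  keep <- rep(TRUE, nrow(sites))
  if (gain_activity != 2) keep <- keep & sites$gain_activity == gain_activity
  if (loss_activity != 2) keep <- keep & sites$loss_activity == loss_activity
  if (!identical(organism, -1)) keep <- keep & sites$organism == organism
  sites[keep, , drop = FALSE]
}

all_sites <- filter_sites(toy_sites)  # no argument passed: nothing is filtered out
human_gain <- filter_sites(toy_sites, gain_activity = 1, organism = "Homo sapiens")
print(human_gain)
```

Because every argument defaults to its "do not filter" value, only the criteria that are explicitly passed narrow the result, which mirrors the behaviour described for meto.search().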
Computes the pairwise distance matrix between two sets of points.
pairwise.dist(a, b, squared = TRUE)
a, b |
matrices (NxD) and (MxD), respectively, where each row represents a D-dimensional point. |
squared |
logical, if TRUE the returned matrix contains squared Euclidean distances. |
Euclidean distance matrix (NxM). An attribute "squared", set to the value of the parameter squared, is provided.
pairwise.dist(matrix(1:9, ncol = 3), matrix(9:1, ncol = 3))
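A pairwise distance matrix of this kind can be computed in base R from the identity |x - y|^2 = |x|^2 + |y|^2 - 2 x·y. The function below, pairwise_sq_dist, is a hypothetical sketch written for this example; it is not claimed to be how pairwise.dist is implemented.

```r
# Illustrative sketch only; not the implementation of pairwise.dist().
pairwise_sq_dist <- function(a, b, squared = TRUE) {
  stopifnot(ncol(a) == ncol(b))  # both point sets must share the dimension D
  # |x - y|^2 = |x|^2 + |y|^2 - 2 * (x . y), evaluated for all N x M pairs
  d <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  d[d < 0] <- 0                  # guard against tiny negative rounding errors
  if (!squared) d <- sqrt(d)
  attr(d, "squared") <- squared  # record which kind of distance is stored
  d
}

a <- matrix(1:9, ncol = 3)  # rows: (1,4,7), (2,5,8), (3,6,9)
b <- matrix(9:1, ncol = 3)  # rows: (9,6,3), (8,5,2), (7,4,1)

d2 <- pairwise_sq_dist(a, b)
print(d2)
```

For instance, the first rows of a and b differ by (-8, -2, 4), so the corresponding squared distance is 64 + 4 + 16 = 84.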
Computes distances to the closest aromatic residues.
saro.dist(pdb, threshold = 7, rawdata = FALSE)
pdb |
either the path to the PDB file of interest or the 4-letter identifier. |
threshold |
distance in ångströms, between the S atom and the aromatic ring centroid, used as threshold. |
rawdata |
logical to indicate whether we also want the raw distance matrix between delta S and aromatic ring centroids. |
For each methionyl residue, this function computes the distances to the closest aromatic ring from Y, F and W. When that distance is equal to or lower than the threshold, it is counted as an S-aromatic motif.
The function returns a dataframe with as many rows as methionyl residues are found in the protein. The distances in ångströms to the closest tyrosine, phenylalanine and tryptophan are given in the columns, as well as the number of S-aromatic motifs detected with each of these amino acids. A raw distance matrix can also be provided.
Juan Carlos Aledo
Reid, Lindley & Thornton, FEBS Lett. 1985, 190, 209-213.
saro.motif(), saro.geometry()
## Not run: saro.dist('1CLL')
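The thresholding step described above can be sketched with base R once the coordinates are known. In this illustration the coordinates of the sulfur atoms and of the ring centroids are hypothetical values chosen for easy arithmetic; in practice they would be derived from a PDB file, which this sketch does not attempt.

```r
# Illustrative sketch only; coordinates are hypothetical, not taken from a PDB file.
met_s <- rbind(M1 = c(0, 0, 0),
               M2 = c(20, 0, 0))      # delta sulfur atoms of two Met residues
centroids <- rbind(Y10 = c(3, 4, 0),
                   F30 = c(0, 0, 10),
                   W50 = c(20, 6, 0)) # centroids of three aromatic rings

threshold <- 7  # ångströms

# Distances between every sulfur atom (rows) and every centroid (columns)
all_d <- as.matrix(dist(rbind(met_s, centroids)))
raw <- all_d[rownames(met_s), rownames(centroids)]

closest <- colnames(raw)[apply(raw, 1, which.min)]  # nearest aromatic residue
closest_d <- apply(raw, 1, min)                     # distance to it
n_motifs <- rowSums(raw <= threshold)               # rings within the threshold

data.frame(met = rownames(raw), closest, closest_d, n_motifs)
```

Here the matrix `raw` plays the role of the raw distance matrix mentioned for rawdata = TRUE.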
Computes distances and angles of S-aromatic motifs.
saro.geometry(pdb, rA, chainA = 'A', rB, chainB = 'A')
pdb |
either the path to the PDB file of interest or the 4-letter identifier. |
rA |
numeric position of one of the two residues involved in the motif. |
chainA |
a character indicating the chain to which the first residue belongs. |
rB |
numeric position of the second residue involved in the motif. |
chainB |
a character indicating the chain to which the second residue belongs. |
The distance between the delta sulfur atom and the centroid of the aromatic ring is computed, as well as the angle between the sulfur-centroid vector and the normal to the plane containing the aromatic ring. Based on the distance (d) and the angle (theta), the user decides whether the two residues are considered to be S-bonded or not (usually when d < 7 Å and theta < 60°).
The function returns a dataframe providing the coordinates of the sulfur atom and the centroid (centroids when the aromatic residue is tryptophan), as well as the distance (ångströms) and the angle (degrees) mentioned above.
Juan Carlos Aledo
Reid, Lindley & Thornton, FEBS Lett. 1985, 190, 209-213.
saro.motif(), saro.dist()
## Not run: saro.geometry('1CLL', rA = 141, rB = 145)
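The two quantities described above, the distance d and the angle theta, can be worked out in base R for an idealized case. The coordinates below are hypothetical: a planar hexagonal ring lying in the xy-plane and a sulfur atom placed so that the expected answer is easy to check by hand. This is a sketch of the geometry, not the code of saro.geometry.

```r
# Illustrative sketch only; coordinates are hypothetical, not taken from a PDB file.
cross3 <- function(u, v) {
  c(u[2] * v[3] - u[3] * v[2],
    u[3] * v[1] - u[1] * v[3],
    u[1] * v[2] - u[2] * v[1])
}

# Idealized six-membered ring in the xy-plane, centred on the origin
angles <- seq(0, 300, by = 60) * pi / 180
ring <- cbind(1.39 * cos(angles), 1.39 * sin(angles), 0)
s_atom <- c(3, 0, 3)  # hypothetical position of the delta sulfur

centroid <- colMeans(ring)

# Unit normal to the ring plane, from two in-plane vectors
normal <- cross3(ring[1, ] - centroid, ring[2, ] - centroid)
normal <- normal / sqrt(sum(normal^2))

v <- s_atom - centroid  # sulfur-centroid vector
d <- sqrt(sum(v^2))     # distance, in the units of the coordinates

# abs() makes the angle independent of which way the normal points
theta <- acos(abs(sum(v * normal)) / d) * 180 / pi

is_motif <- d < 7 && theta < 60  # the usual criterion quoted above
c(d = d, theta = theta)
```

With the sulfur at (3, 0, 3) and the ring normal along z, d is the square root of 18 (about 4.24) and theta is 45 degrees, so the pair satisfies the usual criterion.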
Searches for S-aromatic motifs in proteins.
saro.motif(pdb, threshold = 7, onlySaro = TRUE)
pdb |
either the path to the PDB file of interest or the 4-letter identifier. |
threshold |
distance in ångströms, between the S atom and the aromatic ring centroid, used as threshold. |
onlySaro |
logical, if FALSE the output includes information about Met residues that are not involved in S-aromatic motifs. |
For each methionyl residue taking part in an S-aromatic motif, this function computes the aromatic residues involved, the distance between the delta sulfur and the aromatic ring's centroid, as well as the angle between the sulfur-aromatic vector and the normal vector of the plane containing the aromatic ring.
The function returns a dataframe reporting the S-aromatic motifs found for the protein of interest.
Juan Carlos Aledo
Reid, Lindley & Thornton, FEBS Lett. 1985, 190, 209-213.
saro.dist(), saro.geometry()
## Not run: saro.motif('1CLL')
Computes the cross product of two vectors in three-dimensional Euclidean space.
xprod(...)
... |
vectors involved in the cross product. |
This function returns a vector that is orthogonal to the plane containing the two vectors used as arguments.
Juan Carlos Aledo
xprod(c(1,1,1), c(1,2,1))
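For reference, the cross product of two three-dimensional vectors can be written directly from its component formula. The function cross3 below is a hypothetical stand-in used for illustration, not the implementation of xprod; the final lines check the orthogonality property stated above.

```r
# Illustrative sketch only; cross3() is a hypothetical stand-in for xprod().
cross3 <- function(u, v) {
  stopifnot(length(u) == 3, length(v) == 3)
  c(u[2] * v[3] - u[3] * v[2],
    u[3] * v[1] - u[1] * v[3],
    u[1] * v[2] - u[2] * v[1])
}

u <- c(1, 1, 1)
v <- c(1, 2, 1)
w <- cross3(u, v)
print(w)

# The result is orthogonal to both inputs: both dot products are zero
sum(w * u)
sum(w * v)
```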