Description of the methodology used
Sequence based predictions
The DynaMine predictions of backbone and sidechain dynamics and of conformational propensities are described in:
From protein sequence to dynamics and disorder with DynaMine
Nature Communications 4, 3741 (2013).
The DynaMine webserver: Predicting protein dynamics from sequence
Nucleic Acids Research 42, W264-W270 (2014).
The EFoldMine early folding predictions are described in:
Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins
Scientific Reports 7, 8826 (2017).
The DisoMine disorder predictions are described in:
Prediction of disordered regions in proteins with recurrent Neural Networks and protein dynamics
bioRxiv 2020.05.25.115253 (2020).
The Agmata beta-sheet aggregation predictions are described in:
Accurate prediction of protein beta-aggregation with generalized statistical potentials
Bioinformatics 36, 2076-2081 (2020).
With these predictions we try to capture the 'emergent' properties of proteins, that is, the inherent biophysical propensities encoded in the sequence, rather than the behavior of a final folded state. This is relevant because proteins are dynamic even when folded, and might not fold at all (as with intrinsically disordered proteins).
These predictions are single-sequence based. For a multiple sequence alignment (MSA) of a target protein, the predictions are run on each sequence separately and mapped back to the MSA; the median/quartile/outlier information in the plots then shows the 'biophysical variation' observed in evolution in relation to the residues of the original protein. This MSA-based information is currently not available for Agmata, as the method is computationally too expensive.
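The MSA procedure described above can be illustrated with a minimal Python sketch. It is not the server's actual implementation: the function names, the gap character, the toy predictor, and the use of Tukey's 1.5 x IQR rule for outliers are all assumptions made for illustration. Any real single-sequence predictor would take the place of the hypothetical toy_predict.

```python
from statistics import quantiles

GAP = "-"  # assumed gap character in the alignment


def toy_predict(sequence):
    """Hypothetical stand-in for a single-sequence predictor.

    Returns one float per residue; it is NOT a real biophysical predictor.
    """
    return [1.0 if aa in "AILMFVW" else 0.0 for aa in sequence]


def map_to_alignment(aligned_seq, predictions):
    """Place per-residue predictions on alignment columns; gaps become None."""
    values = iter(predictions)
    return [None if ch == GAP else next(values) for ch in aligned_seq]


def column_statistics(msa, target_id, predict=toy_predict):
    """Summarise predictions per MSA column, for columns where the target has a residue.

    msa: dict mapping sequence id -> aligned sequence (all of equal length).
    """
    # Step 1: predict on each ungapped sequence, then map back to the MSA.
    mapped = {
        seq_id: map_to_alignment(aligned, predict(aligned.replace(GAP, "")))
        for seq_id, aligned in msa.items()
    }
    stats = []
    for col, residue in enumerate(msa[target_id]):
        if residue == GAP:
            continue  # report only in relation to residues of the target protein
        values = [m[col] for m in mapped.values() if m[col] is not None]
        # Step 2: quartiles of the values observed in this column.
        if len(values) >= 2:
            q1, median, q3 = quantiles(values, n=4, method="inclusive")
        else:
            q1 = median = q3 = values[0]
        # Step 3: flag outliers (assumed convention: beyond 1.5 x IQR).
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        stats.append({
            "column": col,
            "residue": residue,
            "target_value": mapped[target_id][col],
            "q1": q1,
            "median": median,
            "q3": q3,
            "outliers": [v for v in values if v < low or v > high],
        })
    return stats


if __name__ == "__main__":
    example_msa = {"target": "MK-LV", "homA": "MKALV", "homB": "-KAL-"}
    for row in column_statistics(example_msa, "target"):
        print(row)
```

Columns where the target has a gap are skipped, so each reported row corresponds to one residue of the original protein.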
The ShiftCrypt method calculates, on the basis of NMR chemical shift values for a protein, a single per-residue value that reflects that residue's biophysical behavior. The following article contains further examples and information about the meaning of the ShiftCrypt values:
G. Orlando, D. Raimondi, W.F. Vranken. Auto-encoding NMR chemical shifts from their native vector space to a residue-level biophysical index
Nature Communications 10, 2511 (2019).