Mobi 2.0


Mobi 2.0 is a completely re-written version of the Mobi software for the determination of intrinsically disordered and mobile regions from PDB structures. Mobi 2.0 is designed to aggregate information from different PDB structures mapping to the same UniProt sequence. See methods for more information.

Missing residues are calculated by comparing the sequence used in the experiment (SEQRES in PDB files) and the resolved residues (ATOM field in PDB). Missing residues are evaluated for all types of experiments (X-ray, NMR, etc.).
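
The comparison above can be sketched in a few lines of Python. This is only an illustration, not the Mobi 2.0 implementation: the function name missing_residues and the example sequence are hypothetical, and it assumes the resolved residues have already been aligned to 1-based SEQRES positions.

```python
def missing_residues(seqres_length, observed_positions):
    """Return the 1-based SEQRES positions that have no resolved coordinates.

    Assumes observed_positions are already expressed in SEQRES numbering.
    """
    observed = set(observed_positions)
    return [pos for pos in range(1, seqres_length + 1) if pos not in observed]


# Hypothetical 10-residue construct whose termini are not resolved.
seqres = "MKTAYIAKQR"
resolved = [3, 4, 5, 6, 7, 8]
print(missing_residues(len(seqres), resolved))
```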

Mobi 2.0 exploits SIFTS for both retrieving missing residues and mapping PDB structures to UniProt entries. In particular it considers the following lines of SIFTS XML files:

<residueDetail dbSource="PDBe" property="Annotation">Not_Observed</residueDetail>

to detect missing residues, and:

<crossRefDb dbSource="PDB" dbCoordSys="PDBresnum" dbAccessionId="3q26" dbResNum="15" dbResName="ASP" dbChainId="A"/>
<crossRefDb dbSource="UniProt" dbCoordSys="UniProt" dbAccessionId="P0AEX9" dbResNum="40" dbResName="D"/>

to map PDB residues to UniProt.
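
As a rough sketch of how such lines could be read, the Python snippet below parses a simplified fragment with the standard library. It is not the parser used by Mobi 2.0. The fragment omits the XML namespace and the outer nesting that real SIFTS files have, the second residue is invented for the example, and the function name parse_residues and the record keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment for illustration only. The first residue
# reuses the lines quoted above; the second residue is invented for the example.
SNIPPET = """<listResidue>
  <residue>
    <crossRefDb dbSource="PDB" dbCoordSys="PDBresnum" dbAccessionId="3q26" dbResNum="15" dbResName="ASP" dbChainId="A"/>
    <crossRefDb dbSource="UniProt" dbCoordSys="UniProt" dbAccessionId="P0AEX9" dbResNum="40" dbResName="D"/>
  </residue>
  <residue>
    <crossRefDb dbSource="PDB" dbCoordSys="PDBresnum" dbAccessionId="3q26" dbResNum="16" dbResName="GLY" dbChainId="A"/>
    <crossRefDb dbSource="UniProt" dbCoordSys="UniProt" dbAccessionId="P0AEX9" dbResNum="41" dbResName="G"/>
    <residueDetail dbSource="PDBe" property="Annotation">Not_Observed</residueDetail>
  </residue>
</listResidue>"""


def parse_residues(xml_text):
    """Return one record per <residue>: PDB/UniProt numbering and a missing flag."""
    records = []
    for residue in ET.fromstring(xml_text).iter("residue"):
        # Index the cross-references of this residue by their source database.
        refs = {ref.get("dbSource"): ref for ref in residue.iter("crossRefDb")}
        missing = any(
            detail.get("property") == "Annotation" and detail.text == "Not_Observed"
            for detail in residue.iter("residueDetail")
        )
        pdb = refs.get("PDB")
        uniprot = refs.get("UniProt")
        records.append({
            "pdb_resnum": pdb.get("dbResNum") if pdb is not None else None,
            "uniprot_acc": uniprot.get("dbAccessionId") if uniprot is not None else None,
            "uniprot_pos": int(uniprot.get("dbResNum")) if uniprot is not None else None,
            "missing": missing,
        })
    return records


for record in parse_residues(SNIPPET):
    print(record)
```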

Mobi 2.0 infers High Temperature (HT) disorder by looking at the B-factor values in the PDB file. The B-factor is transformed according to the following formula:

HT = B-factor / (2.0 * Wilson_B * c)

HT is capped at 1.0, and c is an empirical constant (Wilson_b_factor in the software source). Disorder is assigned when HT > 0.5.
Wilson B values map to crystal resolution as in the following table (Resolution (Å) » Wilson B):

0.00 »  10.0
1.00 »  11.0
1.25 »  14.0
1.50 »  18.0
1.75 »  23.0
2.25 »  36.0
2.50 »  44.0
2.75 »  54.0
3.00 »  66.0
3.25 »  82.0
3.50 »  93.0
3.75 » 112.0
4.00 » 135.0
4.25 » 162.0
4.50 » 194.0
4.75 » 233.0
5.00 » 280.0
5.25 » 336.0
5.50 » 404.0
5.75 » 485.0
6.00 » 550.0
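
The HT calculation can be illustrated with a short Python sketch built on the table above. Two points are assumptions rather than facts from the Mobi 2.0 source: the lookup picks the table row with the largest resolution not exceeding the structure's resolution, and c defaults to 1.0 as a placeholder because its actual value is not given here. The function names are hypothetical.

```python
from bisect import bisect_right

# Resolution (Å) and Wilson B values copied from the table above.
RESOLUTIONS = [0.00, 1.00, 1.25, 1.50, 1.75, 2.25, 2.50, 2.75, 3.00, 3.25, 3.50,
               3.75, 4.00, 4.25, 4.50, 4.75, 5.00, 5.25, 5.50, 5.75, 6.00]
WILSON_B = [10.0, 11.0, 14.0, 18.0, 23.0, 36.0, 44.0, 54.0, 66.0, 82.0, 93.0,
            112.0, 135.0, 162.0, 194.0, 233.0, 280.0, 336.0, 404.0, 485.0, 550.0]


def wilson_b(resolution):
    """Wilson B of the closest table row at or below the resolution (assumed rule)."""
    index = bisect_right(RESOLUTIONS, resolution) - 1
    return WILSON_B[max(index, 0)]


def ht_score(b_factor, resolution, c=1.0):
    """HT = B-factor / (2.0 * Wilson_B * c), capped at 1.0. c=1.0 is a placeholder."""
    return min(b_factor / (2.0 * wilson_b(resolution) * c), 1.0)


def is_ht_disordered(b_factor, resolution, c=1.0):
    """Disorder is assigned when HT > 0.5."""
    return ht_score(b_factor, resolution, c) > 0.5


print(ht_score(60.0, 2.5), is_ht_disordered(60.0, 2.5))
```

With c = 1.0 the rule reduces to flagging residues whose B-factor exceeds the Wilson B expected for the resolution.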

Mobi is the module of Mobi 2.0 that finds regions with different conformations among the models of an NMR ensemble. It was published in 2010 (1) as a web server. Mobi has been optimised to replicate the ordered/disordered definition used in CASP8. Mobi superimposes all models in the NMR ensemble using TM-Align. A position is assigned as disordered if its average Scaled Distance (SD) is below a threshold. The SD formula is:

SD = 1/(1 + (d/d0)²)

Where d is the distance between two corresponding Cα atoms and d0 is the normalisation scaled distance factor.
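
A minimal Python sketch of the SD calculation follows. It assumes the models are already superimposed and that the average is taken over all pairs of models, which is one plausible reading and not confirmed by the text. The default d0 of 4.0 Å is the optimal value reported in the Training table; the function names are hypothetical.

```python
import math
from itertools import combinations


def scaled_distance(d, d0=4.0):
    """SD = 1 / (1 + (d / d0)^2); 1.0 for identical positions, towards 0 as d grows."""
    return 1.0 / (1.0 + (d / d0) ** 2)


def average_sd(ca_coords, d0=4.0):
    """Average SD of one residue over all pairs of superimposed models (assumed).

    ca_coords holds one (x, y, z) Cα coordinate per model, at least two models.
    """
    pairs = list(combinations(ca_coords, 2))
    return sum(scaled_distance(math.dist(a, b), d0) for a, b in pairs) / len(pairs)


# Three models of the same residue: two coincide, one is displaced by 4 Å.
models = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(average_sd(models))
```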

Post-processing

Disorder is also assigned when models have different secondary structure or when all models are C (coil) or S (non-hydrogen bond bend).
Patterns on the left are replaced with patterns on the right in order to remove spurious assignments:

1011   »   1111
1101   »   1111
10011  »  11111
11001  »  11111
01010  »  00000
00100  »  00000
001100 » 000000

Also:

110 » 111

This replacement applies if the third position is mobile according to Phi, Psi and Scaled Distance Standard Deviation, and the previous amino acid is mobile according to the Psi definition.

011 » 111

This replacement applies if the non-mobile (first) position is mobile according to Phi, Psi and Scaled Distance Standard Deviation, and the next amino acid is mobile according to the Phi definition.
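
The unconditional pattern replacements listed above can be sketched as plain string substitutions on a mobility string (1 = mobile, 0 = not mobile). This is an illustration only: applying the rules once, in the listed order, with non-overlapping left-to-right matching is an assumption, and the two conditional rules (110 and 011) are left out because they need the per-residue angle and distance statistics. The names are hypothetical.

```python
# (pattern, replacement) pairs copied from the list above, in the same order.
SMOOTHING_RULES = [
    ("1011", "1111"),
    ("1101", "1111"),
    ("10011", "11111"),
    ("11001", "11111"),
    ("01010", "00000"),
    ("00100", "00000"),
    ("001100", "000000"),
]


def smooth(states):
    """Apply each rule once, in order, to a string of '0'/'1' mobility states."""
    for pattern, replacement in SMOOTHING_RULES:
        states = states.replace(pattern, replacement)
    return states


print(smooth("1100111"))    # the isolated gap between mobile stretches is filled
print(smooth("000100000"))  # the isolated mobile residue is removed
```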

Training

Mobi has been trained by measuring the F-score on 18 NMR structures from CASP8. The maximum F-score reached after optimization is 93.9.

Thresholds were selected as follows. For each of the 18 protein structures and each of the two structural alignment programs used (TM-Score and Theseus), a grid search was performed with leave-one-out cross-validation.

Parameter                                  Range          Step    Optimal threshold
Cα Distance, d0 (Å)                        1.0 – 10.0     1.0     = 4.0
Average Scaled Distance                    0.60 – 1.00    0.01    < 0.85
Scaled Distance Standard Deviation         0.01 – 0.20    0.01    > 0.09
Angle (Phi, Psi) Standard Deviation (°)    2.5 – 40.0     2.5     > 20.0

Linear Interacting Peptides (LIPs) correspond to structure fragments that directly interact with another subunit and have a linear, non-globular, structure. The Structural Linearity (SL) is calculated by considering the Residue Interaction Network (RIN) generated by the RING software. SL is calculated for each residue considering a window of 11 consecutive residues and measuring:

SL = inter_sc / (intra + intra_long * 4.0)

Where inter_sc is the number of inter-chain contacts involving at least one side-chain atom, intra is the sum of all intra-chain contacts of the blob, and intra_long corresponds to long-range contacts (sequence separation > 7). The last term makes it possible to filter out linear strands that form β-sheets.
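
The SL formula can be written as a small Python function, shown here only as a sketch. It takes the three contact counts for one 11-residue window as inputs and does not reproduce how RING contacts are counted. The handling of a zero denominator is an assumption, and the function name is hypothetical.

```python
import math


def structural_linearity(inter_sc, intra, intra_long):
    """SL = inter_sc / (intra + intra_long * 4.0) for one residue window."""
    denominator = intra + intra_long * 4.0
    if denominator == 0:
        # Assumed convention: no intra-chain contacts at all means the window is
        # maximally linear if it touches another chain, otherwise not a LIP.
        return math.inf if inter_sc > 0 else 0.0
    return inter_sc / denominator


# Same number of inter-chain contacts, with and without long-range contacts.
print(structural_linearity(10, 5, 0))
print(structural_linearity(10, 5, 5))
```

In the second call the five long-range contacts, weighted by 4.0, lower SL from 2.0 to 0.4, which is how a strand packed into a β-sheet is penalised relative to an isolated linear segment.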

Training

SL parameters were trained by considering the ANCHOR dataset and visually evaluating the overlap of LIPs and ANCHOR examples.

The same consensus strategy is applied separately for each disorder definition (missing residues, high temperature, mobility, LIPs).
Disorder/order assignment is very strict, i.e. 90% agreement is required. All positions that do not reach the threshold are considered “context-dependent”, i.e. behaving as ordered or disordered depending on conditions such as pH, temperature or binding state.
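
A possible reading of this consensus rule, for a single UniProt position observed in several structures, is sketched below in Python. Whether the threshold is inclusive (at least 90%) and how the observations are weighted are assumptions, and the function name and labels are hypothetical.

```python
def consensus_state(votes, threshold=0.9):
    """Classify one position from per-structure votes (True = disordered).

    Returns 'disordered' or 'ordered' when at least `threshold` of the
    structures agree (inclusive threshold assumed), else 'context-dependent'.
    """
    if not votes:
        raise ValueError("at least one observation is required")
    disordered_fraction = sum(1 for vote in votes if vote) / len(votes)
    ordered_fraction = sum(1 for vote in votes if not vote) / len(votes)
    if disordered_fraction >= threshold:
        return "disordered"
    if ordered_fraction >= threshold:
        return "ordered"
    return "context-dependent"


print(consensus_state([True] * 9 + [False]))      # 90% of structures agree
print(consensus_state([True] * 5 + [False] * 5))  # no agreement
```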

  1. Martin AJ, Walsh I, Tosatto SC. MOBI: a web server to define and visualize structural mobility in NMR protein ensembles. Bioinformatics. 2010. 26(22):2916-2917
  2. Potenza E, Di Domenico T, Walsh I, Tosatto SC. MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins. Nucleic Acids Res. 2015. 43(D1):D315-D320
  3. Piovesan D, Tabaro F, Mičetić I, Necci M, Quaglia F, Oldfield CJ, Aspromonte MC, Davey NE, Davidović R, Dosztányi Z, Elofsson A, Gasparini A, Hatos A, Kajava AV, Kalmar L, Leonardi E, Lazar T, Macedo-Ribeiro S, Macossay-Castillo M, Meszaros A, Minervini G, Murvai N, Pujols J, Roche DB, Salladini E, Schad E, Schramm A, Szabo B, Tantos A, Tonello F, Tsirigos KD, Veljković N, Ventura S, Vranken W, Warholm P, Uversky VN, Dunker AK, Longhi S, Tompa P, Tosatto SC. DisProt 7.0: a major update of the database of disordered proteins. Nucleic Acids Res. 2017. 45(D1):D219-D227