IDPs Infrastructure


To address the challenges posed by IDRs and illuminate the unexplored regions of the genome that are inaccessible to standard homology-based methods, we have developed a suite of cutting-edge tools and databases. Among these resources, the DisProt database has been maintained since 2016 and is considered the gold standard for intrinsic disorder annotation. It is meticulously curated from the literature, providing valuable insights into the characteristics and functions of IDPs.

Additionally, we created the MobiDB database, which offers IDR predictions for the entire protein universe. This database incorporates IDR annotations derived from processing the Protein Data Bank (PDB), a valuable resource for structural information. By combining experimental and computational approaches, MobiDB provides a comprehensive view of IDRs and their functional implications.

Our collaboration with the European Bioinformatics Institute (EMBL-EBI) has enabled us to integrate the IDR predictions generated by our MobiDB-lite software into InterPro, a widely used resource for protein sequence analysis. This ensures that IDR annotations are propagated into core data resources, including UniProtKB, enhancing the accessibility and utility of IDR information.

Furthermore, at the BiocomputingUP Lab, we are dedicated to the critical assessment of protein intrinsic disorder. We recognize the importance of rigorous evaluation and validation of computational predictions and experimental data related to IDPs/IDRs. To ensure the reliability and accuracy of our resources, the predictions generated by our tools and algorithms are validated against experimentally determined data, in particular the structural and functional annotations provided by the DisProt database. New methods are included in MobiDB-lite based on the results of the Critical Assessment of protein Intrinsic Disorder (CAID) challenge.

We also understand the challenges associated with studying IDPs/IDRs, including their inherent heterogeneity and the limitations of available experimental techniques. Therefore, we employ a multidisciplinary approach that combines computational methods, experimental data, and expert knowledge to provide a comprehensive and critical assessment of protein intrinsic disorder. To this end, we have maintained and developed the Protein Ensemble Database (PED) since 2021. PED is the main repository for the deposition of structural ensembles of IDPs. It contains descriptors of the qualitative and quantitative properties of the ensembles, along with manually curated metadata.
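To illustrate the kind of validation against reference annotations mentioned above, the following is a minimal sketch of a per-residue comparison between a predicted disorder state and a curated reference. It is not the CAID evaluation protocol or any code from our tools: the string encoding ('1' disordered, '0' ordered, '-' unannotated), the function names, and the example sequences are all assumptions made for illustration.

```python
from math import sqrt


def confusion_counts(predicted, reference):
    """Count (TP, FP, TN, FN) over residues that carry a reference label.

    Hypothetical encoding: '1' = disordered, '0' = ordered,
    '-' in the reference = no annotation (residue is skipped).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")
    tp = fp = tn = fn = 0
    for p, r in zip(predicted, reference):
        if r == "-":
            continue  # unannotated residue: excluded from the evaluation
        if p not in "01" or r not in "01":
            raise ValueError(f"unexpected state: {p!r} / {r!r}")
        if p == "1" and r == "1":
            tp += 1
        elif p == "1" and r == "0":
            fp += 1
        elif p == "0" and r == "0":
            tn += 1
        else:  # p == "0" and r == "1"
            fn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up example: one predicted and one reference state string.
predicted = "1110000011"
reference = "1100000-11"
tp, fp, tn, fn = confusion_counts(predicted, reference)
print(f"F1 = {f1_score(tp, fp, fn):.3f}, MCC = {mcc(tp, fp, tn, fn):.3f}")
```

Skipping unannotated residues reflects the fact that a missing annotation is not evidence of order; how such positions are actually treated depends on the reference set used.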