
Group Leader
CONTACTS
+39 049 827 6260
+39 049 827 6269
BIOGRAPHY
Silvio C. E. Tosatto is currently Full Professor of Bioinformatics and Head of the BioComputing UP lab at the Department of Biomedical Sciences of the University of Padua (Italy). Within ELIXIR, the European infrastructure for life science data, he is deputy Head of Node of ELIXIR Italy, ExCo of the Data Platform, co-lead of the Cellular & Molecular Research priority area, and co-lead of the Machine Learning focus group.
ACADEMIC POSITION
Full professor (since 10/2016)
ACADEMIC CAREER & DEGREES
- 2002 – PhD (Dr. rer. nat., Grade: Magna cum laude) in bioinformatics (computer science), Universität Mannheim – Germany
- 1998 – Graduate in Computer Science & Business Administration (Diplom Wirtschaftsinformatiker), Universität Mannheim – Germany
LANGUAGES
English (Fluent)
Spanish (Fluent)
German (Native)
Italian (Native)
2025
Journal Articles
Typhaine Paysan-Lafosse; Antonina Andreeva; Matthias Blum; Sara Rocio Chuguransky; Tiago Grego; Beatriz Lazaro Pinto; Gustavo A Salazar; Maxwell L Bileschi; Felipe Llinares-López; Laetitia Meng-Papaxanthos; Lucy J Colwell; Nick V Grishin; R. Dustin Schaeffer; Damiano Clementel; Silvio C. E Tosatto; Erik Sonnhammer; Valerie Wood; Alex Bateman
The Pfam protein families database: Embracing AI/ML Journal Article
In: Nucleic Acids Research, vol. 53, no. D1, pp. D523-D534, 2025, (Cited by: 15; Open Access).
@article{SCOPUS_ID:85214397377,
title = {The Pfam protein families database: Embracing AI/ML},
author = {Typhaine Paysan-Lafosse and Antonina Andreeva and Matthias Blum and Sara Rocio Chuguransky and Tiago Grego and Beatriz Lazaro Pinto and Gustavo A Salazar and Maxwell L Bileschi and Felipe Llinares-López and Laetitia Meng-Papaxanthos and Lucy J Colwell and Nick V Grishin and R. Dustin Schaeffer and Damiano Clementel and Silvio C. E Tosatto and Erik Sonnhammer and Valerie Wood and Alex Bateman},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85214397377&origin=inward},
doi = {10.1093/nar/gkae997},
year = {2025},
date = {2025-01-01},
journal = {Nucleic Acids Research},
volume = {53},
number = {D1},
pages = {D523-D534},
publisher = {Oxford University Press},
abstract = {© 2025 The Author(s) 2024. The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.uk/interpro/). This update describes major developments in Pfam since 2020, including decommissioning the Pfam website and integration with InterPro, harmonization with the ECOD structural classification, and expanded curation of metagenomic, microprotein and repeat-containing families. We highlight how AlphaFold structure predictions are being leveraged to refine domain boundaries and identify new domains. New families discovered through large-scale sequence similarity analysis of AlphaFold models are described. We also detail the development of Pfam-N, which uses deep learning to expand family coverage, achieving an 8.8% increase in UniProtKB coverage compared to standard Pfam. We discuss plans for more frequent Pfam releases integrated with InterPro and the potential for artificial intelligence to further assist curation. Despite recent advances, many protein families remain to be classified, and Pfam continues working toward comprehensive coverage of the protein universe.},
note = {Cited by: 15; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Damiano Clementel; Paula Nazarena Arrías; Soroush Mozaffari; Zarifa Osmanli; Ximena Aixa Castro; RepeatsDB Curators; Carlo Ferrari; Andrey V. Kajava; Silvio C. E. Tosatto; Alexander Miguel Monzon
RepeatsDB in 2025: expanding annotations of structured tandem repeats proteins on AlphaFoldDB Journal Article
In: Nucleic Acids Research, vol. 53, no. D1, pp. D575-D581, 2025, (Cited by: 5; Open Access).
@article{SCOPUS_ID:85211995276,
title = {RepeatsDB in 2025: expanding annotations of structured tandem repeats proteins on AlphaFoldDB},
author = {Damiano Clementel and Paula Nazarena Arrías and Soroush Mozaffari and Zarifa Osmanli and Ximena Aixa Castro and RepeatsDB Curators and Carlo Ferrari and Andrey V. Kajava and Silvio C. E. Tosatto and Alexander Miguel Monzon},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85211995276&origin=inward},
doi = {10.1093/nar/gkae965},
year = {2025},
date = {2025-01-01},
journal = {Nucleic Acids Research},
volume = {53},
number = {D1},
pages = {D575-D581},
publisher = {Oxford University Press},
abstract = {© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research. RepeatsDB (URL: https://repeatsdb.org) stands as a key resource for the classification and annotation of Structured Tandem Repeat Proteins (STRPs), incorporating data from both the Protein Data Bank (PDB) and AlphaFoldDB. This latest release features substantial advancements, including annotations for over 34 000 unique protein sequences from >2000 organisms, representing a fifteenfold increase in coverage. Leveraging state-of-the-art structural alignment tools, RepeatsDB now offers faster and more precise detection of STRPs across both experimental and predicted structures. Key improvements also include a redesigned user interface and enhanced web server, providing an intuitive browsing experience with improved data searchability and accessibility. A new statistics page allows users to explore database metrics based on repeat classifications, while API enhancements support scalability to manage the growing volume of data. These advancements not only refine the understanding of STRPs but also streamline annotation processes, further strengthening RepeatsDB’s role in advancing our understanding of STRP functions.},
note = {Cited by: 5; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Alessio Del Conte; Hamidreza Ghafouri; Damiano Clementel; Ivan Mičetić; Damiano Piovesan; Silvio C. E Tosatto; Alexander Miguel Monzon
DRMAAtic: Dramatically improve your cluster potential Journal Article
In: Bioinformatics Advances, vol. 5, no. 1, 2025, (Cited by: 0; Open Access).
@article{SCOPUS_ID:105008238034,
title = {DRMAAtic: Dramatically improve your cluster potential},
author = {Alessio Del Conte and Hamidreza Ghafouri and Damiano Clementel and Ivan Mičetić and Damiano Piovesan and Silvio C. E Tosatto and Alexander Miguel Monzon},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-105008238034&origin=inward},
doi = {10.1093/bioadv/vbaf112},
year = {2025},
date = {2025-01-01},
journal = {Bioinformatics Advances},
volume = {5},
number = {1},
publisher = {Oxford University Press},
abstract = {© 2025 The Author(s). Motivation: The accessibility and usability of high-performance computing (HPC) resources remain significant challenges in bioinformatics, particularly for researchers lacking extensive technical expertise. While Distributed Resource Managers (DRMs) optimize resource utilization, the complexities of interfacing with these systems often hinder broader adoption. DRMAAtic addresses these challenges by integrating the Distributed Resource Management Application API (DRMAA) with a user-friendly RESTful interface, simplifying job management across diverse HPC environments. This framework empowers researchers to submit, monitor, and retrieve computational jobs securely and efficiently, without requiring deep knowledge of underlying cluster configurations. Results: We present DRMAAtic, a flexible and scalable tool that bridges the gap between web interfaces and HPC infrastructures. Built on the Django REST Framework, DRMAAtic supports seamless job submission and management via HTTP calls. Its modular architecture enables integration with any DRM supporting DRMAA APIs and offers robust features such as role-based access control, throttling mechanisms, and dependency management. Successful applications of DRMAAtic include the RING web server for protein structure analysis, the CAID Prediction Portal for disorder and binding predictions, and the Protein Ensemble Database deposition server. These deployments demonstrate DRMAAtic's potential to enhance computational workflows, improve resource efficiency, and facilitate open science in life sciences.},
note = {Cited by: 0; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Matthias Blum; Antonina Andreeva; Laise Cavalcanti Florentino; Sara Rocio Chuguransky; Tiago Grego; Emma Hobbs; Beatriz Lazaro Pinto; Ailsa Orr; Typhaine Paysan-Lafosse; Irina Ponamareva; Gustavo A Salazar; Nicola Bordin; Peer Bork; Alan Bridge; Lucy Colwell; Julian Gough; Daniel H Haft; Ivica Letunic; Felipe Llinares-López; Aron Marchler-Bauer; Laetitia Meng-Papaxanthos; Huaiyu Mi; Darren A Natale; Christine A Orengo; Arun P Pandurangan; Damiano Piovesan; Catherine Rivoire; Christian J. A Sigrist; Narmada Thanki; Françoise Thibaud-Nissen; Paul D Thomas; Silvio C. E Tosatto; Cathy H Wu; Alex Bateman
InterPro: The protein sequence classification resource in 2025 Journal Article
In: Nucleic Acids Research, vol. 53, no. D1, pp. D444-D456, 2025, (Cited by: 57; Open Access).
@article{SCOPUS_ID:85214359849,
title = {InterPro: The protein sequence classification resource in 2025},
author = {Matthias Blum and Antonina Andreeva and Laise Cavalcanti Florentino and Sara Rocio Chuguransky and Tiago Grego and Emma Hobbs and Beatriz Lazaro Pinto and Ailsa Orr and Typhaine Paysan-Lafosse and Irina Ponamareva and Gustavo A Salazar and Nicola Bordin and Peer Bork and Alan Bridge and Lucy Colwell and Julian Gough and Daniel H Haft and Ivica Letunic and Felipe Llinares-López and Aron Marchler-Bauer and Laetitia Meng-Papaxanthos and Huaiyu Mi and Darren A Natale and Christine A Orengo and Arun P Pandurangan and Damiano Piovesan and Catherine Rivoire and Christian J. A Sigrist and Narmada Thanki and Françoise Thibaud-Nissen and Paul D Thomas and Silvio C. E Tosatto and Cathy H Wu and Alex Bateman},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85214359849&origin=inward},
doi = {10.1093/nar/gkae1082},
year = {2025},
date = {2025-01-01},
journal = {Nucleic Acids Research},
volume = {53},
number = {D1},
pages = {D444-D456},
publisher = {Oxford University Press},
abstract = {© 2025 The Author(s) 2024. InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known as signatures, from multiple member databases to classify sequences into families and predict the presence of domains and significant sites. The InterPro database provides annotations for over 200 million sequences, ensuring extensive coverage of UniProtKB, the standard repository of protein sequences, and includes mappings to several other major resources, such as Gene Ontology (GO), Protein Data Bank in Europe (PDBe) and the AlphaFold Protein Structure Database. In this publication, we report on the status of InterPro (version 101.0), detailing new developments in the database, associated web interface and software. Notable updates include the increased integration of structures predicted by AlphaFold and the enhanced description of protein families using artificial intelligence. Over the past two years, more than 5000 new InterPro entries have been created. The InterPro website now offers access to 85 000 protein families and domains from its member databases and serves as a long-term archive for retired databases. InterPro data, software and tools are freely available.},
note = {Cited by: 57; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Maria Cristina Aspromonte; Alessio Del Conte; Shaowen Zhu; Wuwei Tan; Yang Shen; Yexian Zhang; Qi Li; Maggie Haitian Wang; Giulia Babbi; Samuele Bovo; Pier Luigi Martelli; Rita Casadio; Azza Althagafi; Sumyyah Toonsi; Maxat Kulmanov; Robert Hoehndorf; Panagiotis Katsonis; Amanda Williams; Olivier Lichtarge; Su Xian; Wesley Surento; Vikas Pejaver; Sean D. Mooney; Uma Sunderam; Rajgopal Srinivasan; Alessandra Murgia; Damiano Piovesan; Silvio C. E. Tosatto; Emanuela Leonardi
CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs) Journal Article
In: Human Genetics, vol. 144, no. 2, pp. 227-242, 2025, (Cited by: 1; Open Access).
@article{SCOPUS_ID:85217180047,
title = {CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs)},
author = {Maria Cristina Aspromonte and Alessio Del Conte and Shaowen Zhu and Wuwei Tan and Yang Shen and Yexian Zhang and Qi Li and Maggie Haitian Wang and Giulia Babbi and Samuele Bovo and Pier Luigi Martelli and Rita Casadio and Azza Althagafi and Sumyyah Toonsi and Maxat Kulmanov and Robert Hoehndorf and Panagiotis Katsonis and Amanda Williams and Olivier Lichtarge and Su Xian and Wesley Surento and Vikas Pejaver and Sean D. Mooney and Uma Sunderam and Rajgopal Srinivasan and Alessandra Murgia and Damiano Piovesan and Silvio C. E. Tosatto and Emanuela Leonardi},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85217180047&origin=inward},
doi = {10.1007/s00439-024-02722-w},
year = {2025},
date = {2025-01-01},
journal = {Human Genetics},
volume = {144},
number = {2},
pages = {227-242},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {© The Author(s) 2025. The Genetics of Neurodevelopmental Disorders Lab in Padua provided a new intellectual disability (ID) Panel challenge for computational methods to predict patient phenotypes and their causal variants in the context of the Critical Assessment of the Genome Interpretation, 6th edition (CAGI6). Eight research teams submitted a total of 30 models to predict phenotypes based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by Neurodevelopmental Disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions, with onset in infant age. Here, we assess the ability and accuracy of computational methods to predict comorbid phenotypes based on clinical features described in each patient and their causal variants. We also evaluated predictions for possible genetic causes in patients without a clear genetic diagnosis. Like the previous ID Panel challenge in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia), and variants (Pathogenic/Likely Pathogenic, Variants of Uncertain Significance and Risk Factors) were provided. The phenotypic traits and variant data of 150 patients from the CAGI5 ID Panel Challenge were provided as training set for predictors. The CAGI6 challenge confirms CAGI5 results that predicting phenotypes from gene panel data is highly challenging, with AUC values close to random, and no method able to predict relevant variants with both high accuracy and precision. However, a significant improvement is noted for the best method, with recall increasing from 66% to 82%. Several groups also successfully predicted difficult-to-detect variants, emphasizing the importance of variants initially excluded by the Padua NDD Lab.},
note = {Cited by: 1; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}