
Group Leader
CONTACTS
+39 049 827 6260
+39 049 827 6269
BIOGRAPHY
Silvio C. E. Tosatto is currently Full Professor of Bioinformatics and Head of the BioComputing UP lab at the Department of Biomedical Sciences of the University of Padua (Italy). Within ELIXIR, the European infrastructure for life science data, he is deputy Head of Node of ELIXIR Italy, ExCo of the Data Platform, co-lead of the Cellular & Molecular Research priority area as well as co-lead of the Machine Learning focus group.
ACADEMIC POSITION
Full professor
since (10/2016)
ACADEMIC CAREER & DEGREES
- 2002 – PhD (Dr. rer. nat., Grade: Magna cum laude) in bioinformatics (computer science)
  Universität Mannheim – Germany
- 1998 – Graduate in Computer Science & Business Administration (Diplom Wirtschaftsinformatiker)
  Universität Mannheim – Germany
LANGUAGES
English (Fluent)
Spanish (Fluent)
German (Native)
Italian (Native)
2025
Journal Articles
Typhaine Paysan-Lafosse; Antonina Andreeva; Matthias Blum; Sara Rocio Chuguransky; Tiago Grego; Beatriz Lazaro Pinto; Gustavo A Salazar; Maxwell L Bileschi; Felipe Llinares-López; Laetitia Meng-Papaxanthos; Lucy J Colwell; Nick V Grishin; R. Dustin Schaeffer; Damiano Clementel; Silvio C. E Tosatto; Erik Sonnhammer; Valerie Wood; Alex Bateman
The Pfam protein families database: Embracing AI/ML Journal Article
In: Nucleic Acids Research, vol. 53, no. D1, pp. D523-D534, 2025, (Cited by: 10; Open Access).
@article{SCOPUS_ID:85214397377,
title = {The Pfam protein families database: Embracing AI/ML},
author = {Typhaine Paysan-Lafosse and Antonina Andreeva and Matthias Blum and Sara Rocio Chuguransky and Tiago Grego and Beatriz Lazaro Pinto and Gustavo A Salazar and Maxwell L Bileschi and Felipe Llinares-López and Laetitia Meng-Papaxanthos and Lucy J Colwell and Nick V Grishin and R. Dustin Schaeffer and Damiano Clementel and Silvio C. E Tosatto and Erik Sonnhammer and Valerie Wood and Alex Bateman},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85214397377&origin=inward},
doi = {10.1093/nar/gkae997},
year = {2025},
date = {2025-01-01},
journal = {Nucleic Acids Research},
volume = {53},
number = {D1},
pages = {D523-D534},
publisher = {Oxford University Press},
abstract = {© 2025 The Author(s) 2024. The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.uk/interpro/). This update describes major developments in Pfam since 2020, including decommissioning the Pfam website and integration with InterPro, harmonization with the ECOD structural classification, and expanded curation of metagenomic, microprotein and repeat-containing families. We highlight how AlphaFold structure predictions are being leveraged to refine domain boundaries and identify new domains. New families discovered through large-scale sequence similarity analysis of AlphaFold models are described. We also detail the development of Pfam-N, which uses deep learning to expand family coverage, achieving an 8.8% increase in UniProtKB coverage compared to standard Pfam. We discuss plans for more frequent Pfam releases integrated with InterPro and the potential for artificial intelligence to further assist curation. Despite recent advances, many protein families remain to be classified, and Pfam continues working toward comprehensive coverage of the protein universe.},
note = {Cited by: 10; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Damiano Clementel; Paula Nazarena Arrías; Soroush Mozaffari; Zarifa Osmanli; Ximena Aixa Castro; RepeatsDB Curators; Carlo Ferrari; Andrey V. Kajava; Silvio C. E. Tosatto; Alexander Miguel Monzon
RepeatsDB in 2025: expanding annotations of structured tandem repeats proteins on AlphaFoldDB Journal Article
In: Nucleic Acids Research, vol. 53, no. D1, pp. D575-D581, 2025, (Cited by: 4; Open Access).
@article{SCOPUS_ID:85211995276,
title = {RepeatsDB in 2025: expanding annotations of structured tandem repeats proteins on AlphaFoldDB},
author = {Damiano Clementel and Paula Nazarena Arrías and Soroush Mozaffari and Zarifa Osmanli and Ximena Aixa Castro and RepeatsDB Curators and Carlo Ferrari and Andrey V. Kajava and Silvio C. E. Tosatto and Alexander Miguel Monzon},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85211995276&origin=inward},
doi = {10.1093/nar/gkae965},
year = {2025},
date = {2025-01-01},
journal = {Nucleic Acids Research},
volume = {53},
number = {D1},
pages = {D575-D581},
publisher = {Oxford University Press},
abstract = {© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research. RepeatsDB (URL: https://repeatsdb.org) stands as a key resource for the classification and annotation of Structured Tandem Repeat Proteins (STRPs), incorporating data from both the Protein Data Bank (PDB) and AlphaFoldDB. This latest release features substantial advancements, including annotations for over 34 000 unique protein sequences from >2000 organisms, representing a fifteenfold increase in coverage. Leveraging state-of-the-art structural alignment tools, RepeatsDB now offers faster and more precise detection of STRPs across both experimental and predicted structures. Key improvements also include a redesigned user interface and enhanced web server, providing an intuitive browsing experience with improved data searchability and accessibility. A new statistics page allows users to explore database metrics based on repeat classifications, while API enhancements support scalability to manage the growing volume of data. These advancements not only refine the understanding of STRPs but also streamline annotation processes, further strengthening RepeatsDB’s role in advancing our understanding of STRP functions.},
note = {Cited by: 4; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Matthias Blum; Antonina Andreeva; Laise Cavalcanti Florentino; Sara Rocio Chuguransky; Tiago Grego; Emma Hobbs; Beatriz Lazaro Pinto; Ailsa Orr; Typhaine Paysan-Lafosse; Irina Ponamareva; Gustavo A Salazar; Nicola Bordin; Peer Bork; Alan Bridge; Lucy Colwell; Julian Gough; Daniel H Haft; Ivica Letunic; Felipe Llinares-López; Aron Marchler-Bauer; Laetitia Meng-Papaxanthos; Huaiyu Mi; Darren A Natale; Christine A Orengo; Arun P Pandurangan; Damiano Piovesan; Catherine Rivoire; Christian J. A Sigrist; Narmada Thanki; Françoise Thibaud-Nissen; Paul D Thomas; Silvio C. E Tosatto; Cathy H Wu; Alex Bateman
InterPro: The protein sequence classification resource in 2025 Journal Article
In: Nucleic Acids Research, vol. 53, no. D1, pp. D444-D456, 2025, (Cited by: 13; Open Access).
@article{SCOPUS_ID:85214359849,
title = {InterPro: The protein sequence classification resource in 2025},
author = {Matthias Blum and Antonina Andreeva and Laise Cavalcanti Florentino and Sara Rocio Chuguransky and Tiago Grego and Emma Hobbs and Beatriz Lazaro Pinto and Ailsa Orr and Typhaine Paysan-Lafosse and Irina Ponamareva and Gustavo A Salazar and Nicola Bordin and Peer Bork and Alan Bridge and Lucy Colwell and Julian Gough and Daniel H Haft and Ivica Letunic and Felipe Llinares-López and Aron Marchler-Bauer and Laetitia Meng-Papaxanthos and Huaiyu Mi and Darren A Natale and Christine A Orengo and Arun P Pandurangan and Damiano Piovesan and Catherine Rivoire and Christian J. A Sigrist and Narmada Thanki and Françoise Thibaud-Nissen and Paul D Thomas and Silvio C. E Tosatto and Cathy H Wu and Alex Bateman},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85214359849&origin=inward},
doi = {10.1093/nar/gkae1082},
year = {2025},
date = {2025-01-01},
journal = {Nucleic Acids Research},
volume = {53},
number = {D1},
pages = {D444-D456},
publisher = {Oxford University Press},
abstract = {© 2025 The Author(s) 2024. InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known as signatures, from multiple member databases to classify sequences into families and predict the presence of domains and significant sites. The InterPro database provides annotations for over 200 million sequences, ensuring extensive coverage of UniProtKB, the standard repository of protein sequences, and includes mappings to several other major resources, such as Gene Ontology (GO), Protein Data Bank in Europe (PDBe) and the AlphaFold Protein Structure Database. In this publication, we report on the status of InterPro (version 101.0), detailing new developments in the database, associated web interface and software. Notable updates include the increased integration of structures predicted by AlphaFold and the enhanced description of protein families using artificial intelligence. Over the past two years, more than 5000 new InterPro entries have been created. The InterPro website now offers access to 85 000 protein families and domains from its member databases and serves as a long-term archive for retired databases. InterPro data, software and tools are freely available.},
note = {Cited by: 13; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Maria Cristina Aspromonte; Alessio Del Conte; Shaowen Zhu; Wuwei Tan; Yang Shen; Yexian Zhang; Qi Li; Maggie Haitian Wang; Giulia Babbi; Samuele Bovo; Pier Luigi Martelli; Rita Casadio; Azza Althagafi; Sumyyah Toonsi; Maxat Kulmanov; Robert Hoehndorf; Panagiotis Katsonis; Amanda Williams; Olivier Lichtarge; Su Xian; Wesley Surento; Vikas Pejaver; Sean D. Mooney; Uma Sunderam; Rajgopal Srinivasan; Alessandra Murgia; Damiano Piovesan; Silvio C. E. Tosatto; Emanuela Leonardi
CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs) Journal Article
In: Human Genetics, 2025, (Cited by: 1; Open Access).
@article{SCOPUS_ID:85217180047,
title = {CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs)},
author = {Maria Cristina Aspromonte and Alessio Del Conte and Shaowen Zhu and Wuwei Tan and Yang Shen and Yexian Zhang and Qi Li and Maggie Haitian Wang and Giulia Babbi and Samuele Bovo and Pier Luigi Martelli and Rita Casadio and Azza Althagafi and Sumyyah Toonsi and Maxat Kulmanov and Robert Hoehndorf and Panagiotis Katsonis and Amanda Williams and Olivier Lichtarge and Su Xian and Wesley Surento and Vikas Pejaver and Sean D. Mooney and Uma Sunderam and Rajgopal Srinivasan and Alessandra Murgia and Damiano Piovesan and Silvio C. E. Tosatto and Emanuela Leonardi},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-85217180047&origin=inward},
doi = {10.1007/s00439-024-02722-w},
year = {2025},
date = {2025-01-01},
journal = {Human Genetics},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {© The Author(s) 2025. The Genetics of Neurodevelopmental Disorders Lab in Padua provided a new intellectual disability (ID) Panel challenge for computational methods to predict patient phenotypes and their causal variants in the context of the Critical Assessment of the Genome Interpretation, 6th edition (CAGI6). Eight research teams submitted a total of 30 models to predict phenotypes based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by Neurodevelopmental Disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions, with onset in infant age. Here, we assess the ability and accuracy of computational methods to predict comorbid phenotypes based on clinical features described in each patient and their causal variants. We also evaluated predictions for possible genetic causes in patients without a clear genetic diagnosis. Like the previous ID Panel challenge in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia), and variants (Pathogenic/Likely Pathogenic, Variants of Uncertain Significance and Risk Factors) were provided. The phenotypic traits and variant data of 150 patients from the CAGI5 ID Panel Challenge were provided as training set for predictors. The CAGI6 challenge confirms CAGI5 results that predicting phenotypes from gene panel data is highly challenging, with AUC values close to random, and no method able to predict relevant variants with both high accuracy and precision. However, a significant improvement is noted for the best method, with recall increasing from 66% to 82%. Several groups also successfully predicted difficult-to-detect variants, emphasizing the importance of variants initially excluded by the Padua NDD Lab.},
note = {Cited by: 1; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Maria Cristina Aspromonte; Alessio Del Conte; Roberta Polli; Demetrio Baldo; Francesco Benedicenti; Elisa Bettella; Stefania Bigoni; Stefania Boni; Claudia Ciaccio; Stefano D’Arrigo; Ilaria Donati; Elisa Granocchio; Isabella Mammi; Donatella Milani; Susanna Negrin; Margherita Nosadini; Fiorenza Soli; Franco Stanzial; Licia Turolla; Damiano Piovesan; Silvio C. E. Tosatto; Alessandra Murgia; Emanuela Leonardi
Genetic variants and phenotypic data curated for the CAGI6 intellectual disability panel challenge Journal Article
In: Human Genetics, 2025, (Cited by: 0; Open Access).
@article{SCOPUS_ID:86000084600,
title = {Genetic variants and phenotypic data curated for the CAGI6 intellectual disability panel challenge},
author = {Maria Cristina Aspromonte and Alessio Del Conte and Roberta Polli and Demetrio Baldo and Francesco Benedicenti and Elisa Bettella and Stefania Bigoni and Stefania Boni and Claudia Ciaccio and Stefano D’Arrigo and Ilaria Donati and Elisa Granocchio and Isabella Mammi and Donatella Milani and Susanna Negrin and Margherita Nosadini and Fiorenza Soli and Franco Stanzial and Licia Turolla and Damiano Piovesan and Silvio C. E. Tosatto and Alessandra Murgia and Emanuela Leonardi},
url = {https://www.scopus.com/record/display.uri?eid=2-s2.0-86000084600&origin=inward},
doi = {10.1007/s00439-025-02733-1},
year = {2025},
date = {2025-01-01},
journal = {Human Genetics},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {© The Author(s) 2025. Neurodevelopmental disorders (NDDs) are common conditions including clinically diverse and genetically heterogeneous diseases, such as intellectual disability, autism spectrum disorders, and epilepsy. The intricate genetic underpinnings of NDDs pose a formidable challenge, given their multifaceted genetic architecture and heterogeneous clinical presentations. This work delves into the intricate interplay between genetic variants and phenotypic manifestations in neurodevelopmental disorders, presenting a dataset curated for the Critical Assessment of Genome Interpretation (CAGI6) ID Panel Challenge. The CAGI6 competition serves as a platform for evaluating the efficacy of computational methods in predicting phenotypic outcomes from genetic data. In this study, a targeted gene panel sequencing has been used to investigate the genetic causes of NDDs in a cohort of 415 paediatric patients. We identified 60 pathogenic and 49 likely pathogenic variants in 102 individuals that accounted for 25% of NDD cases in the cohort. The most mutated genes were ANKRD11, MECP2, ARID1B, ASH1L, CHD8, KDM5C, MED12 and PTCHD1. The majority of pathogenic variants were de novo, with some inherited from mildly affected parents. Loss-of-function variants were the most common type of pathogenic variant. In silico analysis tools were used to assess the potential impact of variants on splicing and structural/functional effects of missense variants. The study highlights the challenges in variant interpretation especially in cases with atypical phenotypic manifestations. Overall, this study provides valuable insights into the genetic causes of NDDs and emphasises the importance of understanding the underlying genetic factors for accurate diagnosis, and intervention development in neurodevelopmental conditions.},
note = {Cited by: 0; Open Access},
keywords = {},
pubstate = {published},
tppubtype = {article}
}