Title: Deciphering molecular mechanisms of disease upon mutation via semi-supervised learning

Speaker: Predrag Radivojac

Abstract: A major goal in computational biology is the development of algorithms, analysis techniques, and tools toward a deep mechanistic understanding of life at a molecular level. In the process, computational biology must take advantage of new developments in artificial intelligence and machine learning, and then extend beyond pattern analysis to provide testable hypotheses for experimental scientists. This talk will focus on our contributions to this process and relevant related work. We will first discuss the development of machine learning techniques for partially observable domains such as molecular biology; in particular, methods for accurate estimation of the frequency of occurrence of hard-to-measure and rare events. We will show some identifiability results in parametric and nonparametric situations as well as how such frequencies can be used to correct estimated model accuracies. We will then show how these methods play key roles in inferring protein cellular roles and phenotypic effects of genomic mutations, with an emphasis on understanding the molecular mechanisms of human genetic disease. We further assessed the value of these methods in the wet lab, where we tested the molecular mechanisms behind selected de novo mutations in a cohort of individuals with neurodevelopmental disorders. Finally, we will discuss implications for future research in machine learning, genome interpretation, and precision health.

Bio: Predrag Radivojac joined Northeastern University as a Professor in the Khoury College of Computer Sciences. Prior to joining Northeastern, he was a Professor of Computer Science at Indiana University Bloomington and Associate Chair in the Department of Computer Science.

Prof. Radivojac’s primary research interests include computational biology and machine learning. He is motivated to improve our understanding of life at a molecular level and how molecular events affect higher-level phenotypes. His group addresses such questions through the development of algorithms and analysis techniques related to the function of biological macromolecules, mass spectrometry proteomics, genome interpretation, and precision health; e.g., he is interested in elucidating the molecular mechanisms of disease consequent to genetic variation. In the area of machine learning, Prof. Radivojac’s research addresses foundational and applied problems in semi-supervised learning, structured-output learning, and active learning, and investigates topics such as kernels and distance functions (e.g., metrics) across data types and analysis techniques. He is also interested in performance evaluation of machine learning algorithms, especially in hierarchical structured-output domains and cases of selection bias that often arise in the open-world setting.

Prof. Radivojac received the National Science Foundation CAREER Award in 2007 and is an August-Wilhelm Scheer Visiting Professor at Technical University of Munich (TUM) as well as an honorary member of the Institute for Advanced Study at TUM. Prof. Radivojac co-directed all of data sciences and informatics within the multi-campus Precision Health Initiative of Indiana University. He is currently an Editorial Board member for the journal Bioinformatics, Associate Editor for PLoS Computational Biology, and serves his third (elected) term on the Board of Directors of the International Society for Computational Biology (ISCB).
