Research at the Mostafavi lab focuses on developing computational and statistical approaches for integrating and interpreting diverse types of genomics data, with the ultimate goal of disentangling meaningful molecular associations for common and complex pathologies, such as neurodegenerative and psychiatric disorders.
1. Network-based integration of heterogeneous genomics data to predict gene function:
An unrealized goal of computational molecular biology is to determine the functional roles of all genes (or proteins) in a cell in a context-specific manner. Even in a well-studied organism like budding yeast, the precise function of ~15% of protein-coding genes remains unknown. The situation is more severe in humans, where ~40% of protein-coding genes and the majority of non-coding regulatory RNAs are still functionally uncharacterized. This lack of systematic knowledge about gene function makes it difficult to interpret the results of “disease studies”, and it is a main obstacle to developing pathway- and network-based approaches that can tackle the genetic heterogeneity of complex diseases. To address these challenges, we have been developing network-based data integration methods that combine multiple types of genomics data to make accurate predictions about the functional roles of uncharacterized genes. Current areas of research include cross-species function transfer approaches, inference of function at the isoform and splicing levels, and exploration of the tissue specificity of functional interactions between genes.
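As a rough illustration of what network-based integration can look like, the sketch below combines two toy gene networks into one weighted network and spreads a known functional label across it. It is a generic guilt-by-association example, not the lab's actual method: the function names, the fixed equal weights, and the toy matrices are all assumptions made for illustration.

```python
import numpy as np

def combine_networks(networks, weights):
    """Weighted sum of symmetric affinity matrices, one per data type."""
    combined = np.zeros_like(networks[0], dtype=float)
    for weight, network in zip(weights, networks):
        combined += weight * network
    return combined

def propagate_labels(affinity, labels):
    """Solve (I + L) f = y, where L is the graph Laplacian of `affinity`.

    labels: +1 for genes annotated with the function, 0 for unknown genes.
    Returns one score per gene; higher means closer to annotated genes.
    """
    degree = np.diag(affinity.sum(axis=1))
    laplacian = degree - affinity
    return np.linalg.solve(np.eye(len(labels)) + laplacian, labels)

# Hypothetical toy data: 4 genes, two data sources.
coexpression = np.array([[0, 1, 0, 0],
                         [1, 0, 1, 0],
                         [0, 1, 0, 0],
                         [0, 0, 0, 0]], dtype=float)
interaction = np.array([[0, 0, 0, 0],
                        [0, 0, 0, 0],
                        [0, 0, 0, 1],
                        [0, 0, 1, 0]], dtype=float)

combined = combine_networks([coexpression, interaction], [0.5, 0.5])
labels = np.array([1.0, 0.0, 0.0, 0.0])  # only gene 0 is annotated
scores = propagate_labels(combined, labels)
print(scores)
```

In this toy case, gene 3 is reachable from gene 0 only because the two networks are combined, which is the point of integrating several data types. In practice the network weights would be learned from data rather than fixed by hand.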
2. Modeling the impact of genetic variation on multiple types of cellular traits:
Over the last decade, genome-wide association studies (GWAS) have identified a large number of disease-associated genetic loci. Because most of these loci fall within non-coding regions of the human genome, their functional roles, and the downstream cellular events that lead to disease, are unknown. As part of multiple large-scale studies (ImmVar, DGN, GTEx and ROSMAP), we have been developing methods for predicting the impact of genetic variation on various types of downstream cellular traits and gene networks. We are especially interested in developing integrative approaches that predict the impact of genetic variation on multiple types of cellular traits, such as methylation, histone modification, and acetylation.
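The simplest version of linking a genetic variant to a cellular trait is a single-variant linear regression of the trait on genotype dosage. The sketch below shows that baseline only; it is not one of the integrative methods described above, and the function name and toy values are hypothetical.

```python
import numpy as np

def qtl_effect(dosage, trait):
    """Ordinary least squares fit of trait ~ intercept + beta * dosage.

    dosage: number of copies of the alternate allele per individual (0, 1 or 2).
    Returns (beta, intercept).
    """
    dosage = np.asarray(dosage, dtype=float)
    trait = np.asarray(trait, dtype=float)
    design = np.column_stack([np.ones_like(dosage), dosage])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return coef[1], coef[0]

# Hypothetical toy data: six individuals, one variant, one molecular trait.
dosage = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 3.1, 2.9, 5.0, 5.2]
beta, intercept = qtl_effect(dosage, trait)
print(beta, intercept)
```

A real analysis would also include covariates and test many variant-trait pairs, which requires correction for multiple testing.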
3. Modeling known and hidden confounding factors to identify meaningful associations in complex disease:
Common diseases (e.g., cancer, diabetes and neurodegenerative diseases like Alzheimer’s) are multifactorial and have contributions from many genetic and environmental factors. Further, these factors act upon intricately organized biological systems that have many components and many interactions among them. Understanding disease origin and pathogenesis, and designing effective treatments, require teasing apart these risk factors and understanding their cascading effects on molecular networks. To address these challenges, we have been developing computational approaches for inferring and modeling known and hidden confounding factors in the context of complex diseases like Major Depressive Disorder and Alzheimer’s disease. As part of this work, we are also interested in developing approaches that model gene regulatory and functional networks to help identify the key (and likely causal) dysregulatory events in a given disease context.
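One common and simple way to approximate hidden confounders in expression data is to treat the top principal components as unobserved factors and subtract them. The sketch below uses that approach as a stand-in, not as the lab's method; the function name, the simulated data, and the choice of a single factor are assumptions for illustration.

```python
import numpy as np

def remove_hidden_factors(expression, n_factors):
    """Subtract the top principal components from a samples x genes matrix.

    Each gene is centered first. Returns the residual matrix and the
    per-sample factor estimates (left singular vectors).
    """
    centered = expression - expression.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    hidden = (u[:, :n_factors] * s[:n_factors]) @ vt[:n_factors]
    return centered - hidden, u[:, :n_factors]

# Simulated toy data: one unobserved batch-like factor affecting all genes.
rng = np.random.default_rng(0)
batch = rng.normal(size=(20, 1))        # hidden per-sample factor
loadings = rng.normal(size=(1, 50))     # effect of the factor on each gene
noise = 0.1 * rng.normal(size=(20, 50))
expression = batch @ loadings + noise

residual, factors = remove_hidden_factors(expression, n_factors=1)
recovered = abs(np.corrcoef(factors[:, 0], batch[:, 0])[0, 1])
print(recovered, residual.var(), expression.var())
```

The caveat with this kind of correction is that the top components can also capture real biological signal, including disease status, so the number of factors removed and their relationship to known covariates need to be checked.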