PhD in Bioinformatics and Integrative Genomics, Division of Health Sciences and Technology, MIT, 2011-2018
Thesis: Evolutionary signatures for unearthing functional elements in the human transcriptome
Faculty advisor: Aviv Regev
Hi, I'm Jenny Chen. I am interested in understanding how genes encode social behavioral traits. I study deer mice of the genus Peromyscus, in which monogamy has independently evolved multiple times, to investigate the molecular mechanisms underlying mating and parental care behaviors. I use techniques spanning classical linkage studies, animal ethology, comparative genomics, and single-cell transcriptomics.
I am currently a MOSAIC K99/R00 scholar and formerly a Harvard Data Science Fellow. My work is a collaboration between the Hoekstra Lab and the Eddy Lab. Prior to this, I received my PhD from MIT, where I was advised by Aviv Regev. My publications are listed on my Google Scholar profile and my code is available on GitHub.
The evolution of innate behaviors is in part due to genetic variation acting in the nervous system. Gene regulation may be particularly important because it can evolve in a modular, brain-region-specific fashion through the concerted action of cis- and trans-regulatory changes. To investigate transcriptional variation and its regulatory basis across the brain, we perform RNA-seq on ten brain subregions in two sister species of deer mice (Peromyscus maniculatus and P. polionotus), which differ in a range of innate behaviors including their social system, and in their F1 hybrids. We find that interspecific differential expression is pervasive but varies considerably across brain regions. Through analysis of F1 hybrids, we find that much of this modularity is due to cis-regulatory divergence. Together, these results highlight the modularity of gene expression differences and divergence in the brain, which may be key to explaining how the evolution of brain gene expression can contribute to the astonishing diversity of animal behaviors.
Paper: Evolution of gene expression across brain regions in behaviorally divergent deer mice. bioRxiv, Sept 2023.
Many methods exist for using comparative sequence data to infer gene function, such as using the extent of sequence conservation to predict essentiality. However, few methods exist to do the same for cross-species expression data. Here, I developed a method that uses a model of continuous trait evolution, the Ornstein-Uhlenbeck process, to estimate the evolutionarily optimal distribution of a gene's expression from RNA-seq data across 17 mammals. I showed that these distributions give us clues to a gene's function. For example, essential genes have evolutionary distributions with very small variances, while secreted genes have large variances. I then demonstrated the power of this distribution to infer function from expression patterns, such as detecting expression pathways under directional selection that may underlie lineage-specific phenotypes, or identifying deleterious expression of disease-causing genes from patient data. This work provides a statistical framework for interpreting expression data across species and in disease.
Paper: A quantitative model for characterizing the evolutionary history of mammalian gene expression. Genome Research, Jan 2019.
The development of RNA-sequencing technology led to the discovery that thousands of long noncoding RNAs (lncRNAs) are pervasively transcribed in the mammalian genome. Yet the function of these genes is unknown. In a collaboration with Manuel Garber (UMass Medical), we developed a computational tool, slncky, that identifies lncRNAs from RNA-sequencing data and examines the evolutionary history of these genes. Our tool discovers orthologous lncRNAs and calculates metrics relevant for noncoding transcript evolution, such as transcript-transcript identity (the percentage of the transcript sequence that is also transcribed in another species) and splice site conservation. Our analysis revealed distinct patterns of selection, uncovering several classes of lncRNAs that likely each have distinct functions. Out of the tens of thousands of currently annotated lncRNAs, we identify 233 constrained lncRNAs, which are browsable through the slncky Evolution Browser.
Paper: Evolutionary analysis across mammals reveals distinct classes of long non-coding RNAs. Genome Biology, January 2016.
Thesis: Analysis of Transcription Factor ChIP-Seq Datasets
Faculty advisor: Gill Bejerano
jennifer underscore chen at fas dot harvard dot edu
Rm 1008, 16 Divinity Avenue
Cambridge, MA