About Me

Who Am I?

Hi, I'm Jenny Chen. I am interested in understanding the evolution of gene regulation and gene expression across mammalian species, and in applying my research to improve the diagnosis and treatment of genetic diseases. I am primarily computational, though I have been known to occasionally set foot in the wet lab.

I am currently a Harvard Data Science Postdoctoral Fellow, working with the Hoekstra Lab and the Eddy Lab. Prior to this, I received my PhD from the Bioinformatics and Integrative Genomics (BIG) program at MIT, where I was advised by Aviv Regev. You can check out my publications through my Google Scholar profile.

I also enjoy sharing my science adventures through my Twitter and Instagram. My codebase is available through GitHub.


Published Projects

The evolutionary history of a gene is frequently used to predict its function and relationship to phenotypic traits. However, current comparative genomics methods focus primarily on sequence conservation, and few methods exist for interpreting comparative expression data.

Using RNA-sequencing data from 7 tissues across 17 mammalian species, I show that expression evolution across mammals is a nonlinear process that is accurately described by a mathematical model called the Ornstein-Uhlenbeck (OU) process. I demonstrate how to use this model to identify expression pathways underlying conserved and lineage-specific phenotypes. Furthermore, I show how to use this model to estimate the distribution of each gene's optimal expression level in a given tissue. For example, we may estimate a gene's "optimal expression distribution" to be 25 +/- 1.5 units in the liver, based on our evolutionary data. This distribution can then be used to detect deleterious expression levels (i.e., levels outside the optimal expression distribution) in patient expression data.
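The idea of flagging expression levels outside an optimal distribution can be illustrated with a minimal Python sketch. This is not the code from the paper. It assumes the textbook OU parameterization dX = alpha * (theta - X) dt + sigma dW, whose stationary distribution is normal with mean theta and standard deviation sigma / sqrt(2 * alpha); it reads the "25 +/- 1.5" example as a mean and standard deviation; and the function names and the z-score cutoff of 2 are hypothetical choices made for illustration.

```python
import math


def ou_stationary(theta, sigma, alpha):
    """Mean and standard deviation of the stationary distribution of an
    OU process dX = alpha * (theta - X) dt + sigma dW (assumed form)."""
    return theta, sigma / math.sqrt(2.0 * alpha)


def flag_expression(value, mean, sd, z_cutoff=2.0):
    """Return (is_outlier, z_score) for one observed expression value.

    z_cutoff is an illustrative threshold, not a value from the paper.
    """
    z = (value - mean) / sd
    return abs(z) > z_cutoff, z


# Hypothetical OU parameters chosen so the stationary distribution
# matches the 25 +/- 1.5 liver example in the text.
mean, sd = ou_stationary(theta=25.0, sigma=1.5, alpha=0.5)

# Two made-up patient measurements for the same gene and tissue.
for patient_value in (24.1, 31.0):
    is_outlier, z = flag_expression(patient_value, mean, sd)
    print(f"value={patient_value:5.1f}  z={z:+.2f}  outlier={is_outlier}")
```

With these made-up values, a measurement of 24.1 sits well inside the optimal distribution, while 31.0 lies four standard deviations above the mean and would be flagged for follow-up.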

This work provides a statistical framework for interpreting expression data across species and in disease. Evolutionary estimates of optimal gene expression are available at the EVEE Gene Browser.

Paper: A quantitative model for characterizing the evolutionary history of mammalian gene expression. Genome Research, January 2019.

The development of RNA-sequencing technology led to the discovery that thousands of long noncoding RNAs (lncRNAs) are pervasively transcribed in the mammalian genome. Yet the function of most of these genes is unknown.

In a collaboration with Manuel Garber (UMass Medical), we developed a computational tool, slncky, that identifies lncRNAs from RNA-sequencing data and examines the evolutionary history of these genes. Our tool discovers orthologous lncRNAs and calculates metrics relevant for noncoding transcript evolution such as transcript-transcript identity (what percentage of the transcript sequence is transcribed in another species) and splice site conservation.
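To make the transcript-transcript identity metric concrete, here is a minimal Python sketch. It is not slncky's implementation. It assumes that the exons of both transcripts have already been mapped into a shared coordinate system (for example, by a whole-genome alignment), that intervals are half-open (start, end) pairs, and that identity is simply the percentage of transcript bases overlapped by exons of the orthologous transcript. All function names are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def transcript_identity(exons, ortholog_exons):
    """Percentage of bases in `exons` that are also covered by
    `ortholog_exons`, with both given in the same coordinates."""
    a = merge_intervals(exons)
    b = merge_intervals(ortholog_exons)
    total = sum(end - start for start, end in a)
    if total == 0:
        return 0.0

    shared = 0
    i = j = 0
    # Sweep both sorted interval lists once, summing the overlaps.
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return 100.0 * shared / total


# Made-up example: a two-exon transcript and the aligned exons
# of a candidate ortholog from another species.
exons = [(0, 100), (200, 300)]
ortholog_exons = [(50, 100), (200, 250), (400, 500)]
print(transcript_identity(exons, ortholog_exons))
```

In this made-up example, 100 of the transcript's 200 exonic bases are also transcribed in the other species, so the sketch reports an identity of 50 percent.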

Our analysis revealed distinct patterns of selection, uncovering several classes of lncRNAs that likely each have distinct functions. Out of the tens of thousands of currently annotated lncRNAs, we identified 233 constrained lncRNAs, which are browsable through the slncky Evolution Browser.

Paper: Evolutionary analysis across mammals reveals distinct classes of long non-coding RNAs. Genome Biology, January 2016.

In the summer of 2018, I had the awesome opportunity to visit the first molecular biology lab inside the Amazon Rainforest, and to use nanopore sequencing technology to sequence samples ranging from squirrel monkeys to dung beetles. This project is a collaboration between Mrinalini Watsa and Gideon Erkenswick, founders of The Green Lab, as well as Aaron Pomerantz and Stefan Prost, who are working to establish mobile laboratories to accelerate discovery in conservation and ecological genomics.

You can learn more about this project here:


Research Experience

Graduate Student, Aviv Regev Lab, MIT 2012-2018

Developed statistical models and computational tools for inferring biological function from evolutionary signatures of gene expression and transcription across mammalian species.


Get in touch

jennifer underscore chen at fas dot harvard dot edu

Rm 1008, 16 Divinity Avenue
Cambridge, MA