Bo Li / Core 3

Bo Li’s research interests include next-generation sequencing data analysis (in particular, RNA-Seq data analysis) and statistical learning. He completed his PhD at the University of Wisconsin-Madison, where he developed RSEM, one of the most widely used RNA-Seq transcript abundance estimation tools. RSEM has been used in large national projects such as TCGA. He then moved to Prof. Lior Pachter’s lab as a postdoctoral researcher. In the Pachter lab, he works on a range of topics, such as RNA-Seq systems biology and single-cell RNA-Seq data analysis. His current work focuses on building computational models of DMS-Seq (dimethyl sulphate sequencing) data; DMS-Seq is a transcriptome-wide, in vivo “version” of SHAPE-seq.

For more information please see Research.

James Lloyd / Project 1
Generating a network of NMD-targeted splicing events: Alternative splicing can generate much diversity in the transcriptome and proteome and is regulated by proteins called splicing factors. While many of the targets of some splicing factors are known, many splicing outcomes are hidden by nonsense-mediated mRNA decay (NMD). Splicing events that introduce a premature stop codon are rarely seen in RNA-seq analysis because the resulting transcripts are degraded by NMD. To fully appreciate the range of targets a splicing factor acts on, we will knock down or over-express a given splicing factor in both NMD-competent and NMD-incompetent cell lines. RNA will then be collected and sequenced, and a computational pipeline developed in the Brenner group will be used to identify putative alternative splicing events that depend on the splicing factor under study, including those normally degraded by NMD. These events will be integrated with available physical protein-RNA interaction data to yield a high-confidence list of splicing events directly regulated by the splicing factor. This will be performed on a number of candidates from the SR and hnRNP groups of splicing factors. We will then generate a network depicting the relationships between different splicing factors, and between splicing factors and transcripts. The relationships between splicing factors could be examined using an approach similar to that used in [1]. The prevalence of cross-regulation, where one splicing factor alters the splicing of the primary transcript encoding another splicing factor to produce more of an NMD-targeted variant, will be of particular interest. Machine learning techniques will be applied to our RNA-seq datasets to better understand the cis sequences of the primary transcript that recruit a particular splicing factor. Including isoforms that would normally be lost through NMD will increase the power of such an analysis.
Functional enrichment of the targets of different splicing factors will also be analyzed to gain insights into the biological role of these splicing events.
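
The filtering logic described above can be illustrated with a minimal sketch. This is not the Brenner group pipeline; it only assumes that each splicing event has an inclusion level (PSI, between 0 and 1) measured in control and perturbed cells in both NMD-competent and NMD-incompetent backgrounds, and that a set of event IDs with physical protein-RNA interaction evidence is available. The class, function names, field names, and the 0.1 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplicingEvent:
    """Hypothetical record: PSI (0-1) for one event in four conditions."""
    event_id: str
    psi_ctrl_nmd_on: float   # control cells, NMD-competent
    psi_pert_nmd_on: float   # splicing factor perturbed, NMD-competent
    psi_ctrl_nmd_off: float  # control cells, NMD-incompetent
    psi_pert_nmd_off: float  # splicing factor perturbed, NMD-incompetent


def classify(event: SplicingEvent, min_delta: float = 0.1) -> str:
    """Label an event by where the perturbation changes its PSI.

    min_delta is an assumed, arbitrary threshold on |delta PSI|.
    """
    delta_on = abs(event.psi_pert_nmd_on - event.psi_ctrl_nmd_on)
    delta_off = abs(event.psi_pert_nmd_off - event.psi_ctrl_nmd_off)
    if delta_on >= min_delta:
        return "visible"      # detectable even with NMD active
    if delta_off >= min_delta:
        return "nmd_masked"   # only detectable once NMD is disabled
    return "unchanged"


def high_confidence(events, bound_ids, min_delta=0.1):
    """Keep factor-dependent events that also have binding evidence."""
    result = {}
    for event in events:
        label = classify(event, min_delta)
        if label != "unchanged" and event.event_id in bound_ids:
            result[event.event_id] = label
    return result


# Made-up example values, for illustration only.
events = [
    SplicingEvent("ev1", 0.50, 0.80, 0.50, 0.85),
    SplicingEvent("ev2", 0.05, 0.06, 0.10, 0.45),
    SplicingEvent("ev3", 0.40, 0.41, 0.40, 0.42),
    SplicingEvent("ev4", 0.02, 0.03, 0.05, 0.50),
]
bound = {"ev1", "ev2", "ev3"}  # events with interaction evidence
print(high_confidence(events, bound))
```

In this toy example, ev2 changes only in the NMD-incompetent background, which is the kind of event the NMD-incompetent cell lines are meant to recover, while ev4 is dropped because it lacks binding evidence. A real analysis would use replicate-aware statistics rather than a fixed threshold.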

It is also known that many splicing events are stress-responsive [2]. Once the network detailed above is complete, we can expose cells to stresses such as heat shock, DNA damage and hypoxia, and monitor them for differential alternative splicing in NMD-competent and NMD-incompetent backgrounds. This will reveal additional splicing data that we can combine with our network analysis to predict which splicing factors may have a role in the response to a tested stress. Stress-response assays following knockdown or over-expression of a splicing factor will then test whether our predictions were accurate. Together, these experiments will identify previously missed interactions between splicing factors and transcripts, and link splicing factors to important biological processes.

For more information please see Research.