Friday, February 21, 2014, 11:00am
Sreeram Kannan, Postdoctoral Researcher, University of California, Berkeley
"Optimal Transcriptome Assembly: From Information Theory to Software"
Abstract: High-throughput sequencing of RNA has emerged in the last few years as a powerful method that enables discovery of novel transcripts and alternatively spliced isoforms of genes, along with accurate estimates of gene expression. In this work, we study the fundamental limits of de novo transcriptome assembly using RNA shotgun sequencing, where the sequencing technology extracts short reads from the RNA transcripts. We propose a new linear-time algorithm for transcriptome reconstruction and derive sufficient conditions on the read length under which the algorithm will succeed. We then compare these sufficient conditions with necessary conditions that we derive for reconstruction by any algorithm, and show that the proposed algorithm is near-optimal on a real data set. Along the way, we show that the NP-hard problem of decomposing a flow into the minimum number of paths can be solved in linear time for a family of instances that approximates biologically relevant instances. We also describe the construction of a software package for RNA assembly based on this theory and show that it obtains significant improvements in reconstruction accuracy over state-of-the-art software.
Bio: Sreeram Kannan is currently a postdoctoral researcher at the University of California, Berkeley. He received his Ph.D. in Electrical Engineering and M.S. in Mathematics from the University of Illinois at Urbana-Champaign. He is a co-recipient of the Van Valkenburg research award from UIUC, the Qualcomm Roberto Padovani Scholarship, the Qualcomm Cognitive Radio Contest first prize, the S.V.C. Aiya medal from the Indian Institute of Science, and the Intel India Student Research Contest first prize. His research interests include applications of information theory and approximation algorithms to wireless networks and, more recently, to computational biology.
Hosted by: EECS