Dear all,
Next week, we have the pleasure of hosting Prof. Paul Medvedev, who will give a talk in the colloquium.
The seminar will be held on Monday, February 12th, at 14:00. Location: C220.
The title, abstract and bio appear below.
Looking forward to seeing you,
Sagie and Liat
*Title:* Scalable methods for k-mer based biological sequence analysis
*Abstract:* Scalable analysis of biological sequences often starts by breaking long strings into their constituent k-mers. A k-mer is simply a substring of a short fixed length k. Compact data structures and efficient algorithms for storing and analyzing k-mer datasets have therefore become one of the bottlenecks for biological discovery. In this talk, I will present several techniques we have developed to push the boundaries of what is possible with such datasets. I will present the spectrum-preserving string set representation (RECOMB 2020, best paper award) as well as space-efficient data structures for querying large sequence archives (RECOMB 2017). Time permitting, I will also present our work on the use of sketching algorithms to estimate sequence similarity from k-mer sets (RECOMB 2021 and ISMB 2022).
*Bio:* Paul Medvedev is a Professor in the Department of Computer Science and Engineering and the Department of Biochemistry and Molecular Biology, and the Director of the Center for Computational Biology and Bioinformatics, at the Pennsylvania State University. His research focuses on developing computer science techniques for the analysis of biological data and on answering fundamental biological questions using such methods. Prior to joining Penn State in 2012, he was a postdoc at the University of California, San Diego and a visiting scholar at the Oregon Health & Science University and the University of Bielefeld. He received his Ph.D. from the University of Toronto in 2010, his M.Sc. from the University of Southern Denmark in 2004, and his B.S. from the University of California, Los Angeles in 2002.