Seminar: Building Scalable Big Data Pipelines: Graph Processing and Genome Assembly

Kisung Lee
Assistant Professor, LSU Division of Computer Science and Engineering
Friday, September 4, 2020
3:00 pm
Location: online-only
Abstract
The volume of real-world data in many domains is growing at an unprecedented rate. To address such big data challenges, I have been working on several research projects for designing and building scalable techniques and frameworks. This presentation will focus specifically on two distributed frameworks, one for scalable assembly of third-generation genome sequences and the other for scalable graph data processing using a NoSQL store.
I will first present a distributed genome assembly framework that can assemble large-scale third-generation sequence datasets using thousands of cores, resulting in faster assembly. The framework is built on the map-reduce computation model. I will then describe a distributed graph processing framework for iterative algorithms. The framework utilizes a disk-based NoSQL system to process big graph data in a scalable manner while improving the overall performance through several optimization techniques.
Bio
Dr. Kisung Lee is an assistant professor in the Division of Computer Science and Engineering at Louisiana State University. He received his doctoral degree in computer science from the Georgia Institute of Technology in 2015. During his doctoral study, he spent three summers at IBM Research T.J. Watson as a research intern. His research interests lie at the intersection of big data and distributed data-intensive systems. He is also working on research problems in spatial data management, social network analytics, and bioinformatics. He is a recipient of the Tiger Athletic Foundation Undergraduate Teaching Award in 2020. He served as a Program Committee Vice-Chair for IEEE BigData 2018.