The Chinese University of Hong Kong
Department of Computer Science and Engineering

Seminar

Title: Protein Interaction Module Detection Using Spectral Clustering
Date: December 21, 2006 (Thursday)
Time: 4:00 p.m. - 5:00 p.m.
Venue: Room 1027, 10/F, Ho Sin-hang Engineering Building,
The Chinese University of Hong Kong,
Shatin, N.T.
Speaker: Professor Chris Ding
Lawrence Berkeley National Laboratory
USA

ABSTRACT:

Proteins carry out most cellular processes by working together as protein modules. Systematic identification of these protein functional modules provides essential knowledge linking proteome dynamics to cellular function and phenotype. This is one of the most challenging tasks at present, now that many genomes have been successfully sequenced and their genes identified. We give a brief introduction to the rapidly growing field of genomics and clarify the vital role of protein interaction studies.

We then describe the spectral (eigenvector-related) clustering approach that is used to compute dense clusters from the protein interaction network. Spectral clustering has recently emerged as a state-of-the-art method for data clustering, image segmentation, VLSI circuit placement, etc. After a very brief historical review, we start with K-means clustering and show that its solution is given by principal component analysis (PCA). We then discuss the well-motivated graph clustering objective functions of ratio cut, normalized cut, and min-max cut, and show that the optimal solutions for the cluster membership indicators turn out to be eigenvectors of the graph Laplacian matrix. This matrix-based approach is going through a renaissance, with a large number of new developments. We discuss perturbation analysis, which reveals the self-aggregation property. We also discuss the connectivity matrix approach, which gives a good global solution for extracting cluster structure.
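
As a rough illustration of the eigenvector idea described above, the following minimal sketch splits a small toy graph into two groups using the eigenvector of the second-smallest eigenvalue of the unnormalized graph Laplacian L = D - W. This corresponds to the relaxed ratio cut objective. The toy graph, the function names, and the use of power iteration as the eigensolver are all assumptions made for this example and are not taken from the talk.

```python
# Illustrative sketch only: two-way spectral partitioning of a toy graph.
# The graph, the helper names, and the power-iteration solver are
# assumptions for this example, not the speaker's implementation.

def laplacian(n, edges):
    """Return the unnormalized graph Laplacian L = D - W as a list of rows."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L


def fiedler_vector(L, iterations=500):
    """Approximate the eigenvector of the second-smallest eigenvalue of L.

    Runs power iteration on (c*I - L) while staying orthogonal to the
    constant vector. c = 2 * (max degree) is an upper bound on the largest
    eigenvalue of L, so (c*I - L) has no negative eigenvalues.
    """
    n = len(L)
    c = 2.0 * max(L[i][i] for i in range(n))
    x = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        # Remove the component along the constant vector (eigenvalue 0 of L).
        mean = sum(x) / n
        x = [v - mean for v in x]
        # Multiply by (c*I - L) and renormalize.
        y = [c * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


# Toy "interaction network": two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
L = laplacian(6, edges)
v = fiedler_vector(L)

# The sign of each entry serves as the relaxed cluster membership indicator.
labels = [1 if value > 0 else 0 for value in v]
print(labels)
```

On this toy graph the sign pattern places nodes 0-2 in one group and nodes 3-5 in the other, cutting only the bridging edge. For networks of realistic size, a sparse eigensolver from a numerical library would be the more practical choice.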

We present a large number of results on protein interaction modules discovered in S. cerevisiae (yeast) and in Pyrococcus, Sulfolobus, and Halobacterium, microorganisms that are important for environmental studies. Some of the discovered protein complexes have been independently verified experimentally by our collaborators. We discuss the biological significance of the discovered protein modules.

BIOGRAPHY:

Chris Ding is a staff computer scientist at Lawrence Berkeley National Laboratory. He received a Ph.D. from Columbia University and did research at the California Institute of Technology and the Jet Propulsion Laboratory before joining the Berkeley Lab. His research focuses on bioinformatics and machine learning / data mining. He develops efficient clustering and graph algorithms using matrix computation.

Enquiries: Miss Temmy So, tel. 2609 8444

For more information, please refer to http://www.cse.cuhk.edu.hk/seminar

**** ALL ARE WELCOME ****