PI: Prof. James CHENG (CSE, CUHK)

Project description

This research proposes to develop effective and scalable techniques for managing and analyzing big graph data, and to apply them to searching and analyzing multimedia data, especially data in online social networks (e.g., Facebook, Twitter, Google+, LinkedIn) and online shopping platforms (e.g., eBay, Amazon, Taobao). Graphs are widely used to model online social network structures, as well as many complex data types such as images, videos, and other interactive content. We propose to study fundamental issues in big graph data research, including the elementary substructures and important properties of massive graphs, the modeling of these graphs for prediction and extrapolation, and applications of big graph data that may have an impact in both industry and academia.

Current progress

A new general-purpose distributed graph-computing system, called Pregel+, has been developed to process different types of graphs and to implement different graph algorithms. The system is documented on the Pregel+ webpage: http://www.cse.cuhk.edu.hk/pregelplus/.

Work to be done

The next stage of the project is to use the Pregel+ system to develop a set of graph analytics tools for studying massive real-world graphs, and to apply the research results to industry applications.


Da Yan, James Cheng, Yi Lu, and Wilfred Ng. Effective Techniques for Message Reduction and Load Balancing in Distributed Graph Computation. To appear in WWW 2015.

Yi Lu, James Cheng, Da Yan, and Huanhuan Wu. Large-Scale Distributed Graph Computing Frameworks: An Experimental Evaluation. In PVLDB 2015, Volume 8, Number 3, Pages 281-292, 2014.

Da Yan, James Cheng, Kai Xing, Yi Lu, Wilfred Ng, and Yingyi Bu. Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees. In PVLDB 2014, Volume 7, Number 14, Pages 1821-1832, 2014.