Publications
*: Co-first authorship, $: Co-corresponding authorship
Journal papers:
, Chao Cheng and Mark Gerstein, Machine Learning and Genome Annotation: A Match Meant to be?, Genome Biology, (in press). Preprint
The ENCODE Project Consortium, An Integrated Encyclopedia of DNA Elements in the Human Genome, Nature, vol. 489, no. 7414, pp. 57-74, (2012). Full Text
, Chao Cheng, Nitin Bhardwaj, James B. Brown, Jing Leng, Anshul Kundaje, Joel Rozowsky, Ewan Birney, Peter Bickel, Michael Snyder and Mark Gerstein, Classification of Human Genomic Regions based on Experimentally-determined Binding Sites of More Than 100 Transcription-related Factors, Genome Biology, vol. 13, issue 9, no. R48, (2012). Preprint Full Text
Mark B. Gerstein*$, Anshul Kundaje*, Manoj Hariharan*, Stephen G. Landt*, Koon-Kiu Yan*, Chao Cheng*, Xinmeng Jasmine Mu*, Ekta Khurana*, Joel Rozowsky*, Roger Alexander*, Renqiang Min*, Pedro Alves*, Alexej Abyzov, Nick Addleman, Nitin Bhardwaj, Alan P. Boyle, Philip Cayting, Alexandra Charos, David Z. Chen, Yong Cheng, Declan Clarke, Catharine Eastman, Ghia Euskirchen, Seth Frietze, Yao Fu, Jason Gertz, Fabian Grubert, Arif Harmanci, Preti Jain, Maya Kasowski, Phil Lacroute, Jing (Jane) Leng, Jin Lian, Hannah Monahan, Henriette O'Geen, Zhengqing Ouyang, E. Christopher Partridge, Dorrelyn Patacsil, Florencia Pauli, Debasish Raha, Lucia Ramirez, Timothy E. Reddy, Brian Reed, Minyi Shi, Teri Slifer, Jing Wang, Linfeng Wu, Xinqiong Yang, , Gili Zilberman-Schapira, Serafim Batzoglou, Arend Sidow, Peggy J. Farnham, Richard M. Myers, Sherman M. Weissman and Michael Snyder$, Architecture of the Human Regulatory Network Derived from ENCODE Data, Nature, vol. 489, no. 7414, pp. 91-100, (2012). Full Text
Chao Cheng, Roger Alexander, Renqiang Min, Jing Leng, , Joel Rozowsky, Koon-kiu Yan, Xianjun Dong, Sarah Djebali, Yijun Ruan, Carrie A Davis, Piero Carninci, Timo Lassman, Thomas R. Gingeras, Roderic Guigó Serra, Ewan Birney, Zhiping Weng, Michael Snyder and Mark Gerstein, Understanding Transcriptional Regulation by Integrative Analysis of Transcription Factor Binding Data, Genome Research, vol. 22, no. 9, pp. 1658-1667, (2012). Preprint Full Text
Hai-Lu Zhao*$, Yi Sui*, Chun-Feng Qiao, , Ross K.K. Leung, Stephen K.W. Tsui, Harriet K.T. Wong, Xun Zhu, Jennifer J. Siu, Lan He, Jing Guan, Li-Zhong Liu, Heung-Man Lee, Hong-Xi Xu, Peter C.Y. Tong and Juliana C.N. Chan$, Sustained Antidiabetic Effects of a Berberine-containing Chinese Herbal Medicine through Regulation of Hepatic Gene Expression, Diabetes, vol. 61, no. 4, pp. 933-943, (2012). Preprint Full Text (subscription required)
Chao Cheng, Chong Shou, , Koon-Kiu Yan and Mark B. Gerstein, Genome-wide Analysis of Chromatin Features Identifies Chromatin-Sensitive and Chromatin-Insensitive Classes of Yeast Transcription Factors, Genome Biology, vol. 12, issue 11, no. R111, (2011). Preprint Full Text
, Lukas Utz*, Simon Sitwell, Xihao Hu, Sachdev S. Sidhu, Benjamin E. Turk, Mark Gerstein$ and Philip M. Kim$, Identification of Specificity Determining Residues in Peptide Recognition Domains using an Information Theoretic Approach Applied to Large-Scale Binding Maps, BMC Biology, vol. 9, no. 53, (2011). Preprint Full Text
, Ben Kao, Xinjie Zhu, Chun Kit Chui, Sau Dan Lee and David W. Cheung, Mining Order-Preserving Submatrices from Data with Repeated Measurements, IEEE Transactions on Knowledge and Data Engineering (TKDE), (in press). (Preliminary version: ICDM 2008) Preprint Full Text (subscription required)
The ENCODE Project Consortium, A User's Guide to the Encyclopedia of DNA Elements (ENCODE), PLoS Biology, vol. 9, issue 4, e1001046, (2011). Preprint Full Text
Chao Cheng, Koon-Kiu Yan, , Joel Rozowsky, Roger Alexander, Chong Shou and Mark Gerstein, A Statistical Framework for Modeling Gene Expression using Chromatin Features with Application to modENCODE Datasets, Genome Biology, vol. 12, issue 2, no. R15, (2011). Preprint Full Text
Zhi John Lu*, , Guilin Wang, Chong Shou, LaDeana W. Hillier, Ekta Khurana, Ashish Agarwal, Raymond Auerbach, Joel Rozowsky, Chao Cheng, Masaomi Kato, David M. Miller, Frank Slack, Michael Snyder, Robert H. Waterston, Valerie Reinke and Mark Gerstein, Prediction and Characterization of Non-coding RNAs in C. elegans by Integrating Conservation, Secondary Structure and High Throughput Sequencing and Array Data, Genome Research, vol. 21, no. 2, pp. 276-285, (2011). Preprint Full Text
Justin Jee*, Joel Rozowsky*, , Lucas Lochovsky, Robert Bjornson, Guoneng Zhong, Zhengdong Zhang, Yutao Fu, Jie Wang, Zhiping Weng and Mark Gerstein, ACT: Aggregation and Correlation Toolbox for Analyses of Genome Tracks, Bioinformatics, vol. 27, no. 8, pp. 1152-1154, (2011). Preprint Full Text
Mark B. Gerstein*$, Zhi John Lu*, Eric L. Van Nostrand*, Chao Cheng*, Bradley I. Arshinoff*, Tao Liu*, *, Rebecca Robilotto*, Andreas Rechtsteiner*, Kohta Ikegami*, Pedro Alves*, Aurelien Chateigner*, Marc Perry*, Mitzi Morris*, Raymond K. Auerbach*, Xin Feng*, Jing Leng*, Anne Vielle*, Wei Niu*, Kahn Rhrissorrakrai*, Ashish Agarwal, Roger P. Alexander, Galt Barber, Cathleen M. Brdlik, Jennifer Brennan, Jeremy Jean Brouillet, Adrian Carr, Ming-Sin Cheung, Hiram Clawson, Sergio Contrino, Luke O. Dannenberg, Abby F. Dernburg, Arshad Desai, Lindsay Dick, Andrea C. Dose, Jiang Du, Thea Egelhofer, Sevinc Ercan, Ghia Euskirchen, Brent Ewing, Elise A. Feingold, Reto Gassman, Peter J. Good, Phil Green, Francois Gullier, Michelle Gutwein, Mark S. Guyer, Lukas Habegger, Ting Han, Jorja G. Henikoff, Stefan R. Henz, Angie Hinrichs, Heather Holster, Tony Hyman, A. Leo Iniguez, Judith Janette, Morten Jensen, Masaomi Kato, W. James Kent, Ellen Kephart, Vishal Khivansara, Ekta Khurana, John K. Kim, Paulina Kolasinska-Zwierz, Eric C. Lai, Isabel Latorre, Amber Leahey, Suzanna Lewis, Paul Lloyd, Lucas Lochovsky, Rebecca F. Lowdon, Yaniv Lubling, Rachel Lyne, Michael MacCoss, Sebastian D. Mackowiak, Marco Mangone, Sheldon McKay, Desirea Mecenas, Gennifer Merrihew, David M. Miller III, Andrew Muroyama, John I. Murray, Siew-Loon Ooi, Hoang Pham, Taryn Phippen, Elicia A. Preston, Nikolaus Rajewsky, Gunnar Ratsch, Heidi Rosenbaum, Joel Rozowsky, Kim Rutherford, Peter Ruzanov, Mihail Sarov, Rajkumar Sasidharan, Andrea Sboner, Paul Scheid, Eran Segal, Hyunjin Shin, Chong Shou, Frank J. Slack, Cindie Slightam, Richard Smith, William C. Spencer, E.O. Stinson, Scott Taing, Teruaki Takasaki, Dionne Vafeados, Ksenia Voronina, Guilin Wang, Nicole L. Washington, Christina Whittle, Beijing Wu, Koon-Kiu Yan, Georg Zeller, Zheng Zha, Mei Zhong, Xingliang Zhou, modENCODE Consortium, Julie Ahringer$, Susan Strome$, Kristin C. Gunsalus$, Gos Micklem$, X. Shirley Liu$, Valerie Reinke$, Stuart K. Kim$, LaDeana W. Hillier$, Steven Henikoff$, Fabio Piano$, Michael Snyder$, Lincoln Stein$, Jason D. Lieb$, Robert H. Waterston$, Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project, Science, vol. 330, no. 6012, pp. 1775-1787, (2010). Full Text
Xiyan Li, Tara A. Gianoulis, , Mark Gerstein and Michael Snyder, Extensive in vivo Metabolite-Protein Interactions Revealed by Large-Scale Systematic Analyses, Cell, vol. 143, no. 4, pp. 639-650, (2010). Preprint Full Text (subscription required)
Wang Kay Ngai, Ben Kao, Reynold Cheng, Michael Chau, Sau Dan Lee, David W. Cheung and , Metric and Trigonometric Pruning for Clustering of Uncertain Data in 2D Geometric Space, Information Systems, vol. 36, issue 2, pp. 476-497, (2011). Preprint Full Text (subscription required)
Prianka V. Patel*, Tara A. Gianoulis*, Robert D. Bjornson, , Donald M. Engelman and Mark B. Gerstein, Analysis of Membrane Proteins in Metagenomics: Networks of Correlated Environmental Features and Protein Families, (Chosen as volume cover) Genome Research, vol. 20, no. 7, pp. 960-971, (2010). Preprint Full Text (subscription required)
, Roger P. Alexander, Koon-Kiu Yan and Mark Gerstein, Improved Reconstruction of In Silico Gene Regulatory Networks by Integrating Knockout and Perturbation Data, (DREAM3 in silico challenge best performer paper), PLoS ONE, vol. 5, issue 1, no. e8121, (2010). Preprint Full Text
, Philip M. Kim, Drew McDermott and Mark Gerstein, Multi-level Learning: Improving the Prediction of Protein, Domain and Residue Interactions by Allowing Information Flow between Levels, BMC Bioinformatics, vol. 10, no. 241, (2009). Preprint Full Text
Smith Tsang, Ben Kao, , Wai-Shing Ho and Sau Dan Lee, Decision Trees for Uncertain Data, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 23, no. 1, pp. 64-78, (2011). (Preliminary version: ICDE 2009) Preprint Full Text (subscription required)
and Mark Gerstein, Training Set Expansion: An Approach to Improving the Reconstruction of Biological Networks from Limited and Uneven Reliable Interactions, Bioinformatics, vol. 25, no. 2, pp. 243-250, (2009). Preprint Full Text
Matthew Scotch, and Kei-Hoi Cheung, Development of Grid-like Applications for Public Health using Web 2.0 Mashup Techniques, Journal of the American Medical Informatics Association (JAMIA), vol. 15, no. 6, pp. 783-786, (2008). Preprint Full Text (subscription required)
Kei-Hoi Cheung, , Jeffrey P. Townsend and Matthew Scotch, HCLS 2.0/3.0: Health Care and Life Sciences Data Mashup using Web 2.0/3.0, Journal of Biomedical Informatics (JBI), vol. 41, issue 5, pp. 694-705, (2008). Preprint Full Text
, Lin Cheung, David W. Cheung, Liping Jing and Michael K. Ng, A Semi-supervised Approach to Projected Clustering with Applications to Microarray Data, International Journal of Data Mining and Bioinformatics (IJDMB), vol. 3, no. 3, pp. 229-259, (2009). (Preliminary version: ICDE 2005) Preprint Full Text (subscription required)
, Prianka Patel*, Philip M. Kim, Donald M. Engelman, Drew McDermott and Mark Gerstein, An Integrated System for Studying Residue Coevolution in Proteins, Bioinformatics, vol. 24, no. 2, pp. 290-292, (2008). Preprint Full Text
Long J. Lu, Andrea Sboner, Yuanpeng J. Huang, Hao Xin Lu, Tara A. Gianoulis, , Philip M. Kim, Gaetano T. Montelione and Mark B. Gerstein, Comparing Classical Pathways and Modern Networks: Towards the Development of an Edge Ontology, Trends in Biochemical Sciences (TIBS), vol. 32, issue 7, pp. 320-331, (2007). Preprint Full Text (subscription required)
Minghua Zhang, Ben Kao, David W. Cheung and , Mining Periodic Patterns with Gap Requirement from Sequences, ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 1, issue 2, article 7, (2007). (Preliminary version: SIGMOD 2005) Preprint PDF (subscription required)
Andrew K. Smith, Kei-Hoi Cheung, , Martin Schultz and Mark B. Gerstein, LinkHub: A Semantic Web System for Efficiently Handling Complex Graphs of Proteomics Identifier Relationships that Facilitates Cross-database Queries and Information Retrieval, BMC Bioinformatics, vol. 8 (Suppl 3), no. S5, (2007). (Preliminary version: SeS 2006) Preprint Full Text
Lin Cheung, , David W. Cheung, Ben Kao and Michael K. Ng, On Mining Micro-array Data by Order-Preserving Submatrix, International Journal of Bioinformatics Research and Applications (IJBRA), vol. 3, no. 1, pp. 42-64, (2007). (Preliminary version: BMDE 2006) Preprint Full Text (subscription required)
, Haiyuan Yu, Philip M. Kim, Martin Schultz and Mark Gerstein, The tYNA Platform for Comparative Interactomics: A Web Tool for Managing, Comparing and Mining Multiple Networks, (Featured in Journal of Proteome Research and Canadian Bioinformatics Help Desk Newsletter) Bioinformatics, vol. 22, no. 23, pp. 2968-2970, (2006). Preprint Full Text
Eric Lo, , King-Ip Lin and David W. Cheung, Progressive Skylining over Web-Accessible Databases, Data & Knowledge Engineering (DKE), vol. 57, issue 2, pp. 122-147, (2006). Preprint Full Text (subscription required)
Kei-Hoi Cheung, , Andrew Smith, Remko de Knikker, Andy Masiar and Mark Gerstein, YeastHub: A Semantic Web Use Case for Integrating Data in the Life Sciences Domain, Bioinformatics, vol. 21, supp 1, pp. i85-i96, (2005). (Also in ISMB 2005) Preprint PDF
S. I. Ao, , Michael Ng, David Cheung, Pui-Yee Fong, Ian Melhado and Pak C Sham, CLUSTAG: Hierarchical Clustering and Graph Methods for Selecting Tag SNPs, Bioinformatics, vol. 21, no. 8, pp. 1735-1736, (2005). Preprint Full Text
Kei-Hoi Cheung, Remko de Knikker, Youjun Guo, Guoneng Zhong, Janet Hager, , Albert K. H. Kwan, Peter Li and David W. Cheung, Biosphere: the Interoperation of Web Services in Microarray Cluster Analysis, Applied Bioinformatics, vol. 3, no. 4, pp. 253-256, (2004). Preprint Full Text
, David W. Cheung, Michael K. Ng and Kei-Hoi Cheung, Identifying Projected Clusters from Gene Expression Profiles, Journal of Biomedical Informatics (JBI), vol. 37, issue 5, pp. 345-357, (2004). (Preliminary version: BIBE 2004) Preprint Full Text (subscription required)
, David W. Cheung and Michael K. Ng, HARP: A Practical Projected Clustering Algorithm, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 16, no. 11, pp. 1387-1397, (2004). Preprint Full Text (subscription required)
Remko de Knikker, Youjun Guo, Jin-long Li, Albert K. H. Kwan, , David W. Cheung and Kei-Hoi Cheung, A Web Services Choreography Scenario for Interoperating Bioinformatics Applications, BMC Bioinformatics, vol. 5, no. 25, (2004). Preprint Full Text
, Michael K. Ng and David W. Cheung, A Review on Projected Clustering Algorithms, International Journal of Applied Mathematics (IJAM), vol. 13, no. 1, pp. 35-47, (2003). Preprint
Conference/workshop proceedings:
Combining Multiple Models in Reconstructing in silico Regulatory Networks, in The 5th Annual RECOMB Satellite on Regulatory Genomics, the 4th Annual RECOMB Satellite on Systems Biology, and the 3rd Annual DREAM Reverse Engineering Challenges, (2008). (Invited abstract for best performer) Preprint
Smith Tsang, Ben Kao, , Wai-Shing Ho and Sau Dan Lee, Decision Trees for Uncertain Data, in IEEE International Conference on Data Engineering (ICDE), (2009). Preprint Full Text (subscription required)
Chun Kit Chui, Ben Kao, and Sau Dan Lee, Mining Order-Preserving Submatrices from Data with Repeated Measurements, in IEEE International Conference on Data Mining (ICDM), (2008). Preprint Full Text (subscription required)
Songhua Xu, Suchao Chen, , Francis C. M. Lau and Xueying Qin, A Two-Stage Audio Retrieval Method for Searching Unannotated Audio Clips, in The IEEE International Symposium on Multimedia (ISM), (2008). Preprint Full Text (subscription required)
, Michael K. Ng and David W. Cheung, Input Validation for Semi-supervised Clustering, in Workshop on Foundations of Data Mining and Novel Techniques in High Dimensional Structural and Unstructured Data (FDM), (2006). Preprint Full Text (subscription required)
Wang Kay Ngai, Ben Kao, Chun Kit Chui, Reynold Cheng, Michael Chau and , Efficient Clustering of Uncertain Data, in IEEE International Conference on Data Mining (ICDM), (2006). Preprint Full Text (subscription required)
Andrew K. Smith, Kei-Hoi Cheung, , Martin Schultz and Mark B. Gerstein, LinkHub: A Semantic Web System for Efficiently Handling Complex Graphs of Proteomics Identifier Relationships that Facilitates Cross-database Queries and Information Retrieval, in International Workshop on Semantic e-Science, (2006). Preprint
, Peishen Qi, Martin Schultz, David W. Cheung and Kei-Hoi Cheung, SemBiosphere: A Semantic Web Approach to Recommending Microarray Clustering Services, in The Pacific Symposium on Biocomputing (PSB), (2006). Preprint PDF
Kei-Hoi Cheung, , Andrew Smith, Remko de Knikker, Andy Masiar and Mark Gerstein, YeastHub: A Semantic Web Use Case for Integrating Data in the Life Sciences Domain, in International Conference on Intelligent Systems for Molecular Biology (ISMB), (2005). Preprint PDF
Minghua Zhang, Ben Kao, David Cheung and , Mining Periodic Patterns with Gap Requirement from Sequences, in ACM SIGMOD International Conference on Management of Data, (2005). Preprint Full Text (subscription required)
Lin Cheung, , David W. Cheung, Ben Kao and Michael K. Ng, On Mining Micro-Array Data by Order-Preserving Submatrix, in International Workshop on Biomedical Data Engineering (BMDE), (2005). Preprint Full Text (subscription required)
, David W. Cheung and Michael K. Ng, On Discovery of Extremely Low-Dimensional Clusters using Semi-Supervised Projected Clustering, in IEEE International Conference on Data Engineering (ICDE), (2005). Preprint Full Text (subscription required)
Kei-Hoi Cheung, Andrew Smith, , Michael Seringhaus, Shawn M. Douglas and Mark Gerstein, A Semantic Web Approach to Integrating Heterogeneous Yeast Genome Data, in W3C Workshop on Semantic Web for Life Sciences, (2004). Preprint PDF
, David W. Cheung, Michael K. Ng and Kei-Hoi Cheung, Identifying Projected Clusters from Gene Expression Profiles, in IEEE Symposium on BioInformatics and BioEngineering (BIBE), (2004). Preprint Full Text (subscription required)
, David W. Cheung and Michael K. Ng, A Highly-Usable Projected Clustering Algorithm for Gene Expression Profiles, in ACM SIGKDD Workshop on Data Mining in Bioinformatics (BIOKDD), (2003). Preprint PDF
Conference/workshop abstracts/posters:
Mapping of Two Human Genomes with a Single Molecule Nanochannel Array Platform for Genome-wide Structural Variation Analysis and de novo Sequence Assembly, European Conference of Human Genetics (ESHG), (2013).
Book chapters:
Semantic Web Approach to Database Integration in the Life Sciences, in Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences (eds. C Baker and K Cheung, Springer, NY), (2007).
Theses:
Computational Reconstruction of Biological Networks, PhD Thesis, Yale University, (2009).
, HARP: A Practical Projected Clustering Algorithm for Mining Gene Expression Data, Master Thesis, Department of Computer Science, The University of Hong Kong, (2003). Preprint Full Text
Technical reports:
Mining Order-Preserving Submatrices from Data with Repeated Measurements, Technical Report TR-2008-16, Department of Computer Science, The University of Hong Kong, (2008). Preprint PDF
, Learning from Instance-level Relationships, Technical Report, Department of Computer Science, Yale University, (2006). Preprint
, David W. Cheung and Michael K. Ng, On Discovery of Extremely Low-Dimensional Clusters using Semi-Supervised Projected Clustering, Technical Report TR-2004-08, Department of Computer Science, The University of Hong Kong, (2004). Preprint PDF