====== CSC4170 Web Intelligence and Social Computing ======

[ [[:teaching:csc4170:discussions:2009|Discussion Forum]] | [[:teaching:csc4170:blogs:2009|Blogs]] ]

==== Breaking News ====

  * **December 14, 2009**. The CSC4170 project demo will be held this Wednesday (December 16) in Room 924B of the Ho Sin-Hang Engineering Building. Each group must sign up for a timeslot at the link below. Each group has 15 minutes and must arrive 15 minutes before its timeslot. The hard-copy report must be submitted at the demo. \\ http://spreadsheets.google.com/ccc?key=0ApPqLXd5MIzkdGIwOGJsOXlsbHE1Mnd1QXA1eUc3Unc&hl=en
  * **November 24, 2009**. The sample answer for assignment 2 has been released.
  * **November 11, 2009**. Please make sure that you register for the upcoming Public Lecture by Dr. Craig Mundie on "Rethinking Computing" on November 16, 2009 at 4:00 pm in LT6, Teaching Complex at Western Campus (TCWC). This talk is mandatory for the class, so please sign up [[http://www.cuhk.edu.hk/ms-cu-jl/|here]].
  * **November 8, 2009**. The class on Monday (November 9) is canceled because Prof. King is out of town for a conference.
  * **October 27, 2009**. Our guest speaker, Dr. Raman Chandrasekar from Microsoft Research, will speak on Monday, November 2, 2009, from 10:30 am to 11:30 am on "//Page Hunt: Improving Search Engines Using Human Computation Games//" in Room 121. Everyone is required to attend.
  * **October 14, 2009**. The specification of programming assignment 1 has been updated, and more test cases are provided.
  * **September 15, 2009**. Because the No. 8 typhoon signal was still hoisted at 8:00 am, today's class has been canceled. The tutorial will still be held in SHB 507 at 6:30 pm.
  * **September 28, 2009**. I have updated the class notes on Social Network Analysis and uploaded the latest notes on Graph Mining.
  * **September 28, 2009**. The project due date has been moved back one week because of the various delays we have had.

==== Extra Credit Assignments ====

===== 2009-10 Term 1 =====

| ^ Lecture I ^ Lecture II ^ Tutorial I ^ Tutorial II ^
^ Time | M5, 12:30 pm - 1:15 pm | T3-T4, 10:30 am - 12:15 pm | T11, 6:30 pm - 7:15 pm | TBA |
^ Venue | ERB 706 | ERB 408 | SHB 507 | TBA |

The Golden Rule of CSC4170: No member of the CSC4170 community shall take unfair advantage of any other member of the CSC4170 community.

====== Course Description ======

This course introduces fundamental as well as applied computational techniques for collaborative and collective intelligence of group behaviours on the Internet. The course topics include, but are not limited to: web intelligence, web data mining, knowledge discovery on the web, web analytics, web information retrieval, learning to rank, ranking algorithms, relevance feedback, collaborative filtering, recommender systems, human/social computation, social games, opinion mining, sentiment analysis, models and theories about social networks, large graph and link-based algorithms, social marketing, monetization of the web, security/privacy issues related to web intelligence and social computing, etc.

===== Learning Objectives =====

===== Learning Outcomes =====

===== Learning Activities =====

  - Lectures
  - Tutorials
  - Web resources
  - Videos
  - Quizzes
  - Examinations

====== Personnel ======

| ^ Lecturer ^ Tutor ^ Tutor ^
^ Name | [[:home|Irwin King]] | Tom Chao Zhou | Xin Xin |
^ Email | king AT cse.cuhk.edu.hk | czhou AT cse.cuhk.edu.hk | xxin AT cse.cuhk.edu.hk |
^ Office | Room 908 | Room 114A | Room 101 |
^ Telephone | 2609 8398 | | |
^ Office Hour(s) | * M10, Monday 4:30 to 5:30\\ \\ * T3, Tuesday 10:30 to 11:30 | Tuesday 15:30 to 16:30 | Tuesday 15:30 to 16:30 |

Note: This class will be taught in English. Homework assignments and examinations will be conducted in English.

====== Syllabus ======

The PDF files were created with Acrobat 6.0.
Please obtain the correct version of the [[http://www.adobe.com/prodindex/acrobat/readstep.html#reader | Acrobat Reader]] from Adobe.

^ Week ^ Date ^ Topics ^ Tutorials ^ Homework & Events ^ Resources ^
| 1 | 7/9 | Introduction to Web Intelligence and Social Computing\\ Web 2.0\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-01-Introduction.pdf|01-Introduction.pdf]] | | | [[http://www.cse.cuhk.edu.hk/~king/PUB/podcasts/Tim%20O'Reilly%20on%20Web%202.0.mp3|Tim O'Reilly on Web 2.0, The Economist, 20/3/2009]] |
| 2 | 14/9 | Introduction to Web Intelligence and Social Computing\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PUB/CSC4170-090914-01.aiff|Podcast090914-01]] | {{:teaching:csc4170:web-crawler.ppt|Regular Expressions Web Crawler}} | | [[http://www.analytictech.com/networks.pdf|Introduction to Social Networks]] |
| 3 | 21/9 | Social Networks-Theory\\ Graph Theory\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-02-SNA.pdf|02-SNA.pdf]] OLD!\\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-02-SNA-01.pdf|02-SNA-01.pdf]] OLD!\\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-02-SNA-02.pdf|02-SNA-02.pdf]] NEW! | {{:teaching:csc4170:graph_visualization.ppt|Graph Visualization}} | [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/hw1_09.pdf|HW #1]]\\ (Due on or before 6:30 pm, Friday, 2 October, 2009)\\ {{:teaching:csc4170:csc4170-asgn1-noname1207.xls|grading-asgn1}} | [[http://si.umich.edu/~rfrost/courses/SI110/readings/In_Out_and_Beyond/Granovetter.pdf|SWT Theory]] |
| 4 | 28/9 | Graph Mining\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-03-GraphMining-01.pdf|03-GraphMining-01.pdf]]\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PUB/CSC4170-090928-01.aiff|Podcast090928-01]]\\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PUB/CSC4170-090929-01.aiff|Podcast090929-01]] | Graph Mining Algorithms\\ {{:teaching:csc4170:hits.ppt|}} | | [[http://demonstrations.wolfram.com/SamplesOfRandomGraphs/|Generating Random Graphs]]\\ [[http://www.geocities.com/dharwadker/clique/|The Clique Algorithm]] |
| 5 | 5/10 | Link Analysis\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-04-LinkAnalysis-01.pdf|04-LinkAnalysis-01.pdf]] NEW!\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PUB/CSC4170-091006-01.aiff|Podcast091006-01]] | PageRank, HITS, etc. | [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/hw2_09.pdf|HW #2]]\\ {{:teaching:csc4170:hw2_sampleanswer.pdf|hw2 sample answer}}\\ {{:teaching:csc4170:csc4170-asgn2-noname1207.xls|grading-asgn2}}\\ {{:teaching:csc4170:programming1_09.pdf|HW Programming #1}} {{:teaching:csc4170:newprogramming1_testcases.rar|HW Programming #1 Testcases}}\\ \\ (Due on or before 6:30 pm, Monday, 19 October, 2009)\\ {{:teaching:csc4170:csc4170-programming-noname1207.xls|grading-programming}} | [[http://nlp.stanford.edu/IR-book/|Introduction to Information Retrieval]] |
| 6 | 12/10 | Learning to Rank\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-05-Learning2Rank-01.pdf|05-Learning2Rank-01.pdf]] OLD!\\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-05-Learning2Rank-02.pdf|05-Learning2Rank-02.pdf]] NEW!\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PUB/CSC4170-091012-01.aiff|Podcast091012-01]]\\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PUB/CSC4170-091013-01.aiff|Podcast091013-01]] | {{:teaching:csc4170:pagerank.ppt|PageRank}} | {{:teaching:csc4170:project_09.pdf|Project Specification}}\\ {{:teaching:csc4170:ml-data_0.zip|Movie Dataset}} | |
| 7 | 19/10 | Recommender Systems I\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-06-Recommender-01.pdf|06-Recommender-01.pdf]] | {{:teaching:csc4170:evaluation.ppt|Evaluation Methods}} | | |
| 8 | 26/10 | Recommender Systems II\\ Query Expansion\\ \\ [[http://www.cse.cuhk.edu.hk/~king/PUB/CIKM2008-QuerySuggestion.pdf|CIKM2008 Query Suggestion]] | {{:teaching:csc4170:deng_entropybiasedmodel_sigir_talk.ppt|QF/IQF}} | {{:teaching:csc4170:hw3_09R.pdf|HW #3}}\\ (Due: Monday, 23 November, 18:30) | |
| 9 | 2/11 | Human Computation/Social Games\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-07-HumanComputation-01.pdf|07-HumanComputation-01.pdf]] NEW! | {{:teaching:csc4170:humancomputation.ppt|}} | | Guest Speaker |
| 10 | 9/11 | Crowdsourcing | {{:teaching:csc4170:languagemodel.ppt|language model}} | | |
| 11 | 16/11 | Q&A\\ Virtual Communities\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-08-QandA.pdf|CSC4170-08-QandA.pdf]] | Wikis, Blogs, etc. | {{:teaching:csc4170:hw4_09.pdf|HW #4}}\\ (Due: Friday, 4 December 2009, 18:30)\\ {{:teaching:csc4170:csc4170-asgn4-noname1208.xls|gradings asgn4}}\\ {{:teaching:csc4170:hw4_sampleanswer.pdf|hw4 sample answer}} | |
| 12 | 23/11 | Privacy and Security of Information\\ Education, Policy\\ \\ [[http://www.cse.cuhk.edu.hk/~king/csc4170/PDF/CSC4170-09-Security.pdf|09-Security.pdf]] NEW! | | | |
| 13 | 30/11 | Wrap Up\\ \\ Project Presentations | | | [[http://edutechwiki.unige.ch/en/EduTech_Wiki:Books/Social_computing_in_education|EduTech on Social Computing in Education]] |

  * Web 2.0
    * Ajax, CSS
  * Social Media
    * blogs, microblogs, wikis, mashup

====== Class Project ======

===== Class Project Presentation Schedule =====

  - TING KAM CHEUNG & MA MING CHAO
  - YANG NGAI KEUNG
  - YAU MING HIU & CHOW TSZ YEUNG
  - LAM KA LOK
  - TUNG WO HOU
  - WONG YUK KI & TO KA CHUN
  - LI WAI WA & TSO XIN
  - TSANG HO KWAN & TANG CHI CHIU
  - ZACK BUSH
  - HO CHUN KIU
  - LEE WING HUNG

===== Class Project Presentation Requirements =====

  - Each group has 15 minutes in total: 12 minutes for the talk and 3 minutes for Q&A. Presentations will follow the order above. Since the class will last until all presentations are finished, tell us if your slot does not suit you and we can change the order.
  - The presentation does not include a demo. The demo is a separate process held in two sessions. The first session will be held during the tutorial on December 1, when all groups should demo their programs to the two tutors, who will advise you on revising them. The second session will be held on Wednesday, December 16, when Prof. King will check your program before the final submission of your code.
  - Groups implementing graph algorithms should explain one algorithm in as much detail as possible in the presentation. Give an example showing the structure of the nodes, the values, and your calculations. You also need to analyze the complexity of your algorithms and test whether they can be applied to large graphs. Other groups should focus on three aspects: the motivation for your idea, the detailed algorithms, and an experimental justification of your methods compared with naive methods.

====== Examination Matters ======

===== Examination Schedule =====

| ^ Time ^ Venue ^ Notes ^
^ Midterm Examination\\ Written | TBD | TBD | TBD |
^ Midterm Examination\\ Programming | TBD | TBD | TBD |
^ Final Examination | 9/12/2009 Wed.\\ 9:30 am to 11:30 am | Room 103, John Fulton Centre | The final examination covers all materials presented in the class. |

  * [[http://rgsntl.rgs.cuhk.edu.hk/rws_prd_life/main1.asp|CUHK Registration and Examination]]

===== Written Midterm Matters =====

  - The midterm will test your knowledge of the materials.
  - Answer all questions using the answer booklet. More booklets will be available at the venue if needed.
  - Write legibly. Anything we cannot decipher will be considered incorrect.

====== Grade Assessment Scheme ======

^ Homework\\ Assignments ^ Project Report ^ Project Presentation ^ Final Examination ^
| 20% | 20% | 10% | 50% |

  - Assignments (20%)
    - Written assignment
    - Optional quizzes
  - Project (30%)
    - Report (20%)
    - Presentation (10%)
  - Final Examination (50%)
  - Extra Credit (There is no penalty for not doing the extra credit problems. Extra credit will only help you in borderline cases.)

====== Required Background ======

  - Prerequisites: CSC 1110 or 1130, or equivalent. (Not for students who have taken CSC 2520.)

====== Reference Books ======

====== Book Sources ======

  - **Academic & Professional Book Centre**, 1H Cheong Ming Bldg., 80-86 Argyle St., Kowloon, 2398-2191, 2391-7430 (fax)
  - **Caves Books (H. K.)**, 4B Ferry St., G/F., Yaumatei, Kowloon, 2780-0987, 2771-2298
  - **Man Yuen Book Company**, 45 Parkes Street, Jordan Road, Kowloon, Hong Kong, 2366-0594. Not very large; Asian edition books, fair prices, wide range, some at a 10% discount.
  - **Swindon Book Co. Ltd**, 13-15 Lock Road, Tsim Sha Tsui, Kowloon, 2366-8001. One of the largest bookstores in Hong Kong; the exchange rate is not favorable.
  - **Hongkong Book Centre**, 522-7064. A branch of the Swindon book shop.

====== FAQ ======

  - **Q: What is the departmental guideline on plagiarism?**\\ A: If a student is found plagiarizing, his/her case will be reported to the Department Discipline Committee. If the case is proven after deliberation, the student will automatically fail the course in which he/she committed plagiarism. The definition of plagiarism includes copying the whole or parts of written assignments, programming exercises, reports, quiz papers, and mid-term examinations. The penalty will apply to both the one who copies the work and the one whose work is being copied, unless the latter can prove that his/her work was copied unwittingly. Furthermore, including others' work or results without citation in assignments and reports is also regarded as plagiarism, with a similar penalty for the offender. A student caught plagiarizing during tests or examinations will be reported to the Faculty Office and appropriate disciplinary authorities for further action, in addition to failing the course.
  - **Q: What is ACM ICPC?**\\ A: The Association for Computing Machinery International Collegiate Programming Contest. Teams from CUHK have done quite well in previous years. More information on the CSE programming team can be found at http://www.cse.cuhk.edu.hk/~acmprog.
  - **Q: What are some of the common mistakes made in online and real-time contests?**\\ A: There are a few common mistakes. Please check out [[http://www.acm.org/crossroads/xrds7-5/contests.html|this site]] for more information.

====== Resources ======

  - [[http://pajek.imfm.si/doku.php|Pajek, a network analysis and visualization program.]]
  - [[http://vlado.fmf.uni-lj.si/pub/networks/data/default.htm|Package for Large Network Analysis]]
  - [[http://www.analytictech.com/downloaduc6.htm|UCINET 6]]
  - [[http://www.analytictech.com/Netdraw/netdraw.htm|Netdraw]]
  - [[http://stat.gamma.rug.nl/stocnet/|StOCNET]]

===== Social Networks-Theory Graph Theory =====

http://www.cs.purdue.edu/homes/neville/courses/aaai08-tutorial.html \\
http://cs.stanford.edu/people/jure/icml09networks/ \\
http://www.ofcom.org.uk/advice/media_literacy/medlitpub/medlitpubrss/socialnetworking/report.pdf

===== Graph Mining =====

http://www.cs.cmu.edu/~deepay/mywww/papers/csur06.pdf \\
http://cs.stanford.edu/people/jure/talks/www08tutorial/ \\
http://www.xifengyan.net/tutorial/KDD08_graph_partI.pdf \\
http://www.xifengyan.net/tutorial/KDD08_graph_partII.pdf

===== Link Analysis =====

http://analytics.ijs.si/events/Tutorial-TextMiningLinkAnalysis-KDD2007-SanJose-Aug2007/ \\
http://www.sigkdd.org/explorations/issues/7-2-2005-12/1-Getoor.pdf \\
http://www.ncjrs.gov/pdffiles1/nij/grants/219552.pdf \\
http://delab.csd.auth.gr/~dimitris/papers/ENVO07LARskm.pdf

===== Learning to Rank =====

http://www2009.org/pdf/T7A-LEARNING%20TO%20RANK%20TUTORIAL.pdf \\
http://radlinski.org/papers/LearningToRank_NESCAI08.pdf \\
http://www.aclweb.org/anthology/P/P09/P09-5005.pdf \\
http://www.cse.iitb.ac.in/~soumen/doc/www2007/TutorialSlides.pdf

===== Recommender Systems =====

http://en.wikipedia.org/wiki/Recommender_system \\
http://www.deitel.com/ResourceCenters/Web20/RecommenderSystems/RecommenderSystemsTutorialsandWebcasts/tabid/1313/Default.aspx \\
http://www.computer.org/portal/web/csdl/doi/10.1109/TKDE.2005.99 \\
http://www.springerlink.com/content/n881136032u8k111/ \\
http://www.csd.abdn.ac.uk/~jmasthof/Publications/WPRSIUI07.pdf

===== Q & A =====

http://lml.bas.bg/ranlp2005/tutorials/magnini.ppt \\
http://tcc.itc.it/research/textec/topics/question-answering/Tut-Prager.ppt \\
http://en.wikipedia.org/wiki/Question_answering \\
http://trec.nist.gov/pubs/trec9/papers/webclopedia.pdf \\
http://domino.watson.ibm.com/library/CyberDig.nsf/papers/D12791EAA13BB952852575A1004A055C/$File/rc24789.pdf \\
http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Katz_EACL2003_tutorial.pdf \\
http://answers.yahoo.com/ \\
http://zhidao.baidu.com/ \\
http://wenda.tianya.cn/wenda/ \\
http://hk.knowledge.yahoo.com/

===== Human Computation/Social Games =====

http://www.gwap.com/gwap/ \\
http://www.cs.cmu.edu/~biglou/

===== Opinion Mining/Sentiment Analysis =====

http://www.cs.uic.edu/~liub/FBS/opinion-mining-sentiment-analysis.pdf \\
http://www.cs.cornell.edu/home/llee/omsa/omsa-published.pdf \\
http://www.cs.cmu.edu/~wcohen/10-802/sentiment-sep-4.ppt

===== Visualization =====

  - [[http://manyeyes.alphaworks.ibm.com/manyeyes/|Many Eyes Visualization]]

===== Programming =====

  - [[http://networkx.lanl.gov/|NetworkX, a Python package for complex networks]]
  - [[http://www.wolfram.com/|Mathematica from Wolfram]]
  - [[http://demonstrations.wolfram.com/|Wolfram Demonstrations]]