====== Document of CLANS ======

===== Information on Person =====

^ API ^ Description ^
| getRelatedPerson(personName) | Return a list of people whose names are related to personName |
| getPersonName(pid) | |
| getPersonAge(pid) | |
| getPersonBirthday(pid) | |
| getPersonGender(pid) | |
| getPersonHometown(pid) | |
| getPersonIntroduction(pid) | |
| getPersonTimeline(pid, duration) | |
| getPersonTimeline(pid, latestEventNumber) | |
| getSchoolMatesSocialNetwork(pid, duration, neighbour_number, layer) | |
| getWorkMatesSocialNetwork(pid, duration, neighbour_number, layer) | |
| getFellowTownsmenMatesSocialNetwork(pid, duration, neighbour_number, layer) | |
| getFriendsSocialNetwork(pid, duration, neighbour_number, layer) | |

===== Information on Company =====

^ API ^ Description ^
| getRelatedCompany(companyName) | Return a list of companies whose names are related to companyName |
| getCompanyStockName(stock_id) | |
| getCompanyAddress(stock_id) | |
| getCompanyFullName(stock_id) | |
| getCompanyIndustry(stock_id) | |
| getCompanyRevenue(stock_id, year) | |
| getCompanyPerformance(stock_id, year) | |
| getCompanyStockReturn(stock_id, year) | |
| getCompanyEmployeeChangeTimeline(stock_id, position_rank, duration) | |
| getEmployeesSocialNetwork(stock_id, position_list, duration, neighbour_number, layer) | |
| getCompanySocialNetwork(stock_id, duration, neighbour_number, layer) | |
| getSectorSocialNetwork(stock_id, sector, duration, neighbour_number, layer) | |

===== Example =====

  - [[projs:clans:docs:example_function_2|Example Function]]
  - {{:projs:clans:docs:s_3.png?500| Page name}}

===== Search (Alex)=====

==== Function Description ====

We use Lucene to build an index over our company and person data, and implement the search function on top of that index.
  * [[projs:clans:docs:alex:index|Build Index]]
  * [[projs:clans:docs:alex:search|Search]]

==== Related Information ====

Lucene -- http://lucene.apache.org/

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

==== Change Log ====

===== Data Schema (Jingyi)=====

==== Function Description ====

  * [[projs:clans:docs:data_schema_jingyi:addinfo|AddInfo]]
  * [[projs:clans:docs:data_schema_jingyi:addemployee|AddEmployee]]

==== Data Schema ====

  * [[projs:clans:docs:data_schema_jingyi:personxml|person xml file]]
  * [[projs:clans:docs:data_schema_jingyi:companyxml|company xml file]]

===== Entity Disambiguation (Xu) =====

==== Function Description ====

Named entity disambiguation is done in two main steps:

----

First, we use the training set to extract filter keywords for a given person.

  * [[projs:clans:docs:entity_disambiguation_xu:extractkeywords|extractkeywords]]

Then, we use the extracted filter keywords to judge whether each post is related to the person in our database.
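
The two steps above can be sketched roughly as follows. This is only an illustration, not the project's actual ''extractkeywords'' or ''testrelevancy'' code: the function names, the ''top_k'' and ''min_hits'' parameters, the selection rule (words that occur in posts about the person but never in posts about namesakes), and the sample posts are all assumptions made for this example. It also assumes space-delimited text; Chinese posts would need a word segmenter in place of the simple tokenizer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumes space-delimited text)."""
    return re.findall(r"\w+", text.lower())


def extract_keywords(related_posts, unrelated_posts, top_k=5):
    """Step 1 (hypothetical rule): keep words that appear in posts about the
    person but never in posts about other people with the same name."""
    related = Counter(w for p in related_posts for w in set(tokenize(p)))
    unrelated = Counter(w for p in unrelated_posts for w in set(tokenize(p)))
    candidates = {w: c for w, c in related.items() if unrelated[w] == 0}
    # Most frequent first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_k]]


def is_relevant(post, keywords, min_hits=1):
    """Step 2: judge a post relevant if it contains at least min_hits keywords."""
    tokens = set(tokenize(post))
    return sum(1 for k in keywords if k in tokens) >= min_hits


# Made-up training posts for a made-up person name.
related_posts = [
    "Zhang Wei appointed chairman of Acme Steel",
    "Acme Steel chairman Zhang Wei reports earnings",
]
unrelated_posts = [
    "Zhang Wei wins the badminton final",
    "Badminton star Zhang Wei retires",
]

keywords = extract_keywords(related_posts, unrelated_posts, top_k=3)
print(keywords)  # ['acme', 'chairman', 'steel']
print(is_relevant("Zhang Wei visits the Acme plant", keywords))  # True
print(is_relevant("Zhang Wei loses the badminton semifinal", keywords))  # False
```

Note that the person's own name never becomes a keyword here, because it appears in both the related and the unrelated posts; only context words that separate the two groups survive the filter.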
  * [[projs:clans:docs:entity_disambiguation_xu:testrelevancy|testrelevancy]]

==== Related Information ====

Related works

  * [[http://www.sciencedirect.com/science/article/pii/S0957417413001516]]
  * [[http://acl.ldc.upenn.edu/W/W03/W03-0405.pdf]]
  * [[http://wwwconference.org/2005a/cdrom/docs/p463.pdf]]
  * [[http://www.di.unipi.it/~ferragin/cikm2010.pdf]]
  * [[http://edgar.meij.pro/wp-content/papercite-data/pdf/wsdm-2012-meij.pdf]]

===== Social Network Modeling by Search Engine (Xu) =====

===== Visualization (Qian, Ken, Sunny)=====

==== Additional Features ====

  * Loading sign
  * Drag button
  * Share on Weibo, Facebook, Twitter, and so on

==== Home Page ====

=== Description ===

Two main functions:

  * User sign in/out/up
  * Search function

=== More to improve ===

  * Handling of misspelled queries
  * Handling of queries such as 'xiaosuining'

==== Search Result ====

=== Description ===

Two columns of search results, one for persons and one for companies.

=== More to improve ===

  * Show news containing the keywords in the search results.
  * Show a description of each person/company in the search results.

==== Person Show ====

=== Features ===

  * Basic information
  * //Relationship network weight trend chart//
  * Photos
  * //Life trajectory map// (hometown, places of study, workplaces, etc.; Google Maps)
  * Timeline
  * Schoolmate network graph
  * Work network graph
  * Multi-select network graph (node attributes: centrality value, community)
  * //Shortest path// (P2P and P2C)
  * //Similarity chart// (P2P and P2C)
  * //Weibo posts// (Top & Hot)
  * //media//

==== Company Show ====

=== Features ===

  * Basic information
  * Photos
  * //Current stock trend and price history//
  * //Profit and loss trend chart//
  * //Map of company location and branch distribution//
  * Timeline
  * Internal company network
  * External company network
  * Multi-select network graph (node attributes: centrality value, community)
  * //Shortest path// (C2C and C2P)
  * //Similarity comparison chart// (C2C and C2P)
  * //Weibo posts// (Top & Hot)
  * //media//

===== Prediction (Zhi)=====

  * [[projs/clans/docs/prediction_lib_utilanalz|Utilanalz]], a library for plotting and analyzing data.

===== Information Processing (Zhi, Hang)=====

  * [[projs/clans/docs/infoPro_lib_utilData|UtilData]], a library for simple data processing.
  * [[projs/clans/docs/infoPro_lib_stanfordParserPipeCn|stanfordParserPipeCn]], a Python interface for the Stanford Parser.
===== Social Media Crawling (Junfeng)=====

**//[[projs:clans:docs:social_media_crawling_junfeng:versions|Version Change Log]]//**

==== Code File and Function Description ====

=== sqlite2cookie.py ===

//Parses cookies from the browser's cookie SQLite database (for Firefox and Chrome)//

//[[projs:clans:docs:social_media_crawling_junfeng:sqlite2cookie|sqlite2cookie()]]//

=== dbOperP.py ===

//[[projs:clans:docs:social_media_crawling_junfeng:dboperp|Class: dbOperator]]//

//Operates on the database (select, insert)//

=== sinaMBCSearchBug.py ===

//[[projs:clans:docs:social_media_crawling_junfeng:sinambcsearchbug|Class: WeiBoCBug]]//

//The crawler for companies//

=== sinaMBPSearchBug.py ===

//[[projs:clans:docs:social_media_crawling_junfeng:sinambpsearchbug|Class: WeiBoPBug]]//

//The crawler for persons//

=== sinaBug.py ===

//The crawler master: controls the crawlers and keeps them running continuously//

// Linux run command: //

// 'nohup /local/fdm/python27/bin/python /local/fdm/weibobug/sinaBug.py &' //

===== Social Network Analysis (Lily) =====

  * [[projs:clans:docs:snbf|Class: SocialNetwork]]
  * [[projs:clans:docs:snd|Class: Distribution]]
  * [[projs:clans:docs:sps|Class: ShortestPathServer]]

===== Data Source (Lily, Zhi -database) =====

Describes the data and the data schema in the CLANS database.
==== Database tables ====

  * [[projs/clans/docs/dataset_db_companyinfor|db::companyinfor]]
  * [[projs/clans/docs/dataset_db_company_rank|db::company_rank]]
  * [[projs/clans/docs/dataset_db_company_employee|db::company_employee]]
  * [[projs/clans/docs/dataset_db_relation_company|db::relation_company]]
  * [[projs/clans/docs/dataset_db_relation_company_component|db::relation_company_component]]

=== Prediction related ===

  * [[projs/clans/docs/dataset_db_stock_return|db::stock_return]]

=== Personal information related ===

  * [[projs/clans/docs/dataset_db_newdata_people_edu|db::newdata_people_edu]]
  * [[projs/clans/docs/dataset_db_newdata_people_work|db::newdata_people_work]]
  * [[projs/clans/docs/dataset_db_newdata_peopleinfor|db::newdata_peopleinfor]]