Table of Contents

Document of CLANS

Information on Person

API Description
getRelatedPerson(personName): Returns a list of people whose names are related to personName
getPersonName(pid)
getPersonAge(pid)
getPersonBirthday(pid)
getPersonGender(pid)
getPersonHometown(pid)
getPersonIntroduction(pid)
getPersonTimeline(pid, duration)
getPersonTimeline(pid, latestEventNumber)
getSchoolMatesSocialNetwork(pid, duration, neighbour_number, layer)
getWorkMatesSocialNetwork(pid, duration, neighbour_number, layer)
getFellowTownsmenMatesSocialNetwork(pid, duration, neighbour_number, layer)
getFriendsSocialNetwork(pid, duration, neighbour_number, layer)

Information on Company

API Description
getRelatedCompany(companyName): Returns a list of companies whose names are related to companyName
getCompanyStockName(stock_id)
getCompanyAddress(stock_id)
getCompanyFullName(stock_id)
getCompanyIndustry(stock_id)
getCompanyRevenue(stock_id, year)
getCompanyPerformance(stock_id, year)
getCompanyStockReturn(stock_id, year)
getCompanyEmployeeChangeTimeline(stock_id, position_rank, duration)
getEmployeesSocialNetwork(stock_id, position_list, duration, neighbour_number, layer)
getCompanySocialNetwork(stock_id, duration, neighbour_number, layer)
getSectorSocialNetwork(stock_id, sector, duration, neighbour_number, layer)

Example

  1.  Page name

Search (Alex)

Function Description

We use Lucene to build an index over our company and person data, and the search function is implemented on top of that index.
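Lucene is a Java library, so the Python sketch below does not use its API. It only illustrates the inverted-index idea that this kind of full-text search builds on, and it groups hits into person and company lists in the way the Search Result page shows two columns. The class name TinyIndex, the record ids, and whitespace tokenization are all assumptions made for illustration.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: token -> set of record ids (not Lucene itself)."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of records containing it
        self.kinds = {}                   # record id -> "person" or "company"

    def add(self, record_id, kind, text):
        """Index one record under every lower-cased, whitespace-split token."""
        self.kinds[record_id] = kind
        for token in text.lower().split():
            self.postings[token].add(record_id)

    def search(self, query):
        """Return ids matching ALL query tokens, grouped by record kind."""
        result = {"person": [], "company": []}
        tokens = query.lower().split()
        if not tokens:
            return result
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        for record_id in sorted(hits):
            result[self.kinds[record_id]].append(record_id)
        return result
```

A real Lucene index adds analyzers, scoring, and on-disk segments; the sketch only shows why a lookup by token is fast once the index has been built.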

Related Information

Lucene – http://lucene.apache.org/

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Change Log

Data Schema (Jingyi)

Function Description

AddInfo

AddEmployee

Data Schema

person xml file

company xml file

Entity Disambiguation (Xu)

Function Description

Named entity disambiguation has two main steps:

First, we use the training set to extract filter keywords for a given person.

Then, we use the extracted filter keywords to judge whether each post is related to the person in our database.
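The two steps can be sketched as follows. This is a minimal illustration, not the project's actual method: the source does not say how keywords are scored, so the sketch assumes labeled related and unrelated posts, whitespace tokenization (real Weibo posts would need a Chinese word segmenter), and a simple document-frequency difference. The function names and parameters are hypothetical.

```python
from collections import Counter


def _doc_freq(posts):
    """Count, for each token, how many posts contain it at least once."""
    return Counter(tok for post in posts for tok in set(post.lower().split()))


def extract_filter_keywords(related_posts, unrelated_posts, top_k=5):
    """Step 1: keep tokens seen in more related posts than unrelated ones."""
    related = _doc_freq(related_posts)
    unrelated = _doc_freq(unrelated_posts)
    scores = {tok: n - unrelated.get(tok, 0) for tok, n in related.items()}
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda tok: (-scores[tok], tok))
    return {tok for tok in ranked[:top_k] if scores[tok] > 0}


def is_related(post, keywords, min_hits=1):
    """Step 2: a post is judged related if it contains enough filter keywords."""
    tokens = set(post.lower().split())
    return len(tokens & keywords) >= min_hits
```

Tokens that appear equally often for both people, such as the shared name itself, score zero and are dropped, which is what makes the remaining keywords useful as a filter.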

Related Information

Related works

Social Network Modeling by Search Engine (Xu)

Visualization (Qian, Ken, Sunny)

Additional Features

Home Page

Description

Two main functions:

More to improve

Search Result

Description

Two columns of search results, one for person and one for company.

More to improve

Person Show

Features

Company Show

Features

Prediction (Zhi)

Information Processing (Zhi, Hang)

Social Media Crawling (Junfeng)

Version Change Log

Code File and Function Description

sqlite2cookie.py

Parses cookies from the browser's cookie SQLite database (for the Firefox and Chrome browsers)

sqlite2cookie()
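As a rough sketch of what reading cookies from a browser database can look like, the code below queries a Firefox-style cookies.sqlite file, which stores cookies in a moz_cookies table with name, value, and host columns. The helper names read_firefox_cookies and cookie_header are hypothetical and are not the actual sqlite2cookie() implementation. Chrome keeps its cookies in a different table layout and may store values encrypted, so it would need separate handling that is not shown here.

```python
import sqlite3


def read_firefox_cookies(db_path, host_suffix):
    """Return {name: value} for cookies whose host ends with host_suffix."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name, value FROM moz_cookies WHERE host LIKE ?",
            ("%" + host_suffix,),
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)


def cookie_header(cookies):
    """Format a cookie dict as the value of an HTTP Cookie header."""
    return "; ".join("{}={}".format(k, v) for k, v in sorted(cookies.items()))
```

The browser may hold a lock on the live database file, so working on a copy of cookies.sqlite is usually safer.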

dbOperP.py

Class: dbOperator

Operates on the database (select, insert)
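A thin wrapper of this kind might look like the sketch below. The source does not say which database engine dbOperator talks to, so the sketch assumes SQLite through Python's standard sqlite3 module; the class name DbOperator and its method signatures are illustrative only.

```python
import sqlite3


class DbOperator:
    """Minimal select/insert wrapper around one SQLite connection (assumed engine)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def insert(self, sql, params=()):
        """Run a parameterized INSERT, commit it, and return the new row id."""
        with self.conn:  # commits on success, rolls back on error
            cursor = self.conn.execute(sql, params)
        return cursor.lastrowid

    def select(self, sql, params=()):
        """Run a parameterized SELECT and return all rows as a list of tuples."""
        return self.conn.execute(sql, params).fetchall()

    def close(self):
        self.conn.close()
```

Passing values through params rather than formatting them into the SQL string keeps crawled text, which may contain quotes, from breaking the statement.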

sinaMBCSearchBug.py

Class: WeiBoCBug

The crawler for companies

sinaMBPSearchBug.py

Class: WeiBoPBug

The crawler for persons

sinaBug.py

The crawler master: controls the crawlers and keeps them running continuously

Linux run command:

nohup /local/fdm/python27/bin/python /local/fdm/weibobug/sinaBug.py &
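The "keep it running" behaviour of a crawler master can be sketched as a supervising loop that calls each crawler in turn, survives individual failures, and pauses between rounds. This is an assumption about the general pattern, not the actual sinaBug.py code; run_forever, its parameters, and the injected sleep function are hypothetical names used for illustration.

```python
import time


def run_forever(crawlers, pause_seconds=60, max_rounds=None, sleep=time.sleep):
    """Call every crawler once per round, forever unless max_rounds is given.

    Returns (rounds_completed, failures_seen) when max_rounds is reached.
    """
    rounds = 0
    failures = 0
    while max_rounds is None or rounds < max_rounds:
        for crawl in crawlers:
            try:
                crawl()
            except Exception as exc:
                # One failing crawler must not stop the others or the loop.
                failures += 1
                print("crawler failed, retrying next round:", exc)
        rounds += 1
        sleep(pause_seconds)
    return rounds, failures
```

With max_rounds left as None the loop never returns, which matches a process started under nohup; the max_rounds and sleep parameters exist only so the loop can be exercised in a test.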

Social Network Analysis (Lily)

Class: SocialNetwork

Class: Distribution

Class: ShortestPathServer

Data Source (Lily, Zhi - database)

Describes the data and the data schema in the CLANS database.

database tables

prediction related

personal information related