System Code

The G-thinker system code is a set of header files. Include them in your application code and compile with GCC. The current system code runs on Linux and must be compiled with C++11 enabled (for example, with GCC's -std=c++11 flag).

Toy Graph

Each line represents a vertex.

We provide two versions of the toy graph here: one with a label on each vertex, and one without labels.

Line format (no label):    vertex-ID    \t    number-of-neighbors    neighbor1-ID    neighbor2-ID    ...

Line format (with label):    vertex-ID    label    \t    neighbor1-ID    label    neighbor2-ID    label    ...
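
As an illustration of the unlabeled line format, here is a minimal C++11 sketch that reads one line into a small record. It is not part of the G-thinker headers: the ToyVertex struct and the parse_toy_line function are hypothetical names, and the sketch assumes that IDs fit in an int and that fields are separated by tabs or spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for illustration; not a G-thinker class.
struct ToyVertex {
    int id;
    std::vector<int> neighbors;
};

// Parses one unlabeled line:
//   vertex-ID \t number-of-neighbors neighbor1-ID neighbor2-ID ...
// Assumes whitespace-separated integer fields.
// Returns false if the header fields are missing or if the line
// lists fewer neighbors than it declares.
bool parse_toy_line(const std::string& line, ToyVertex& out) {
    std::istringstream in(line);
    int count = 0;
    if (!(in >> out.id >> count) || count < 0) return false;
    out.neighbors.clear();
    for (int i = 0; i < count; ++i) {
        int nb;
        if (!(in >> nb)) return false;
        out.neighbors.push_back(nb);
    }
    return true;
}
```

A caller would typically read toy.txt line by line with std::getline and pass each line to parse_toy_line, skipping lines for which it returns false.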

To run the application code on the toy graph, upload the data to HDFS as follows:

hadoop fs -mkdir /toyFolder

hadoop fs -put { your_local_directory_for_toy.txt }/toy.txt    /toyFolder

hadoop fs -put { your_local_directory_for_label_toy.txt }/label_toy.txt    /toyFolder

To check the result, type the following command:

hadoop fs -cat /toyOutput/*

Partition Code

G-thinker first partitions the graph into N blocks. The application program then runs on N workers, with each worker responsible for one block.
We provide three basic partitioners here. HashPartitioner is a baseline that assigns vertices with a hash function on vertex.ID. BalancePartitioner first partitions the graph into small pieces and then distributes them to their assigned workers, taking the balance of each worker's capacity into account. LDGPartitioner also balances load and, in addition, follows the Linear Deterministic Greedy algorithm with respect to the jump-edges between workers.
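
To make the difference between the baseline and the greedy strategy concrete, here is a rough C++11 sketch of both assignment rules. It does not reproduce the actual HashPartitioner or LDGPartitioner code. The function names hash_block and ldg_block are hypothetical, the hash is assumed to be a plain modulo on the vertex ID, and the LDG step uses the generic scoring from the published Linear Deterministic Greedy heuristic: the number of already-placed neighbors in a block, weighted by the block's remaining capacity. It also assumes that capacity is positive and that block sizes do not exceed it.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Baseline sketch: map a vertex ID to one of num_workers blocks by modulo.
int hash_block(int vertex_id, int num_workers) {
    // The second modulo keeps the result non-negative for negative IDs.
    return ((vertex_id % num_workers) + num_workers) % num_workers;
}

// Generic LDG step (an assumption, not G-thinker's code): pick the block
// with the highest score
//   (neighbors already placed in the block) * (1 - block_size / capacity).
// Ties go to the smaller block.
int ldg_block(const std::vector<int>& neighbors,
              const std::unordered_map<int, int>& assigned,  // vertex ID -> block
              const std::vector<std::size_t>& block_sizes,
              std::size_t capacity) {
    // Count how many neighbors of the vertex already sit in each block.
    std::vector<std::size_t> shared(block_sizes.size(), 0);
    for (std::size_t i = 0; i < neighbors.size(); ++i) {
        std::unordered_map<int, int>::const_iterator it = assigned.find(neighbors[i]);
        if (it != assigned.end()) ++shared[it->second];
    }
    int best = 0;
    double best_score = -1.0;
    for (std::size_t b = 0; b < block_sizes.size(); ++b) {
        double score = shared[b] * (1.0 - double(block_sizes[b]) / capacity);
        if (score > best_score ||
            (score == best_score && block_sizes[b] < block_sizes[best])) {
            best = int(b);
            best_score = score;
        }
    }
    return best;
}
```

Placing a vertex next to its already-assigned neighbors tends to reduce the number of edges that cross blocks, while the capacity factor discourages any single block from growing much larger than the others.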

Application Code