Preprocessing Sample Data

We illustrate how to run the programs using the USA road network dataset on the Blogel download page. In this dataset, each vertex has only two coordinates (x, y), so we need to give each vertex three coordinates (x, y, 0), since our terrain programs take coordinates of the form (x, y, z).

For this purpose, one may upload the downloaded dataset to HDFS and run this Java MapReduce program over it. We assume the processed data is under HDFS path "/usaxyz".
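
As a rough illustration of the per-line transformation such a job performs, here is a minimal sketch in plain Java. It is not the linked MapReduce program. The line layout "<id> <x> <y><TAB><adjacency list>" is an assumption made only for this example, and the class and method names (AddZCoordinate, addZ) are hypothetical; adjust the parsing to the actual format of the downloaded dataset.

```java
// Hypothetical helper: the input line layout is an assumption for illustration,
// not the confirmed format of the Blogel USA road network dataset.
final class AddZCoordinate {
    /**
     * Assumed input layout:  "<id> <x> <y>\t<adjacency list>"
     * Produced layout:       "<id> <x> <y> 0\t<adjacency list>"
     * Everything after the first tab is copied through unchanged.
     */
    static String addZ(String line) {
        int tab = line.indexOf('\t');
        String head = (tab >= 0) ? line.substring(0, tab) : line;
        String tail = (tab >= 0) ? line.substring(tab) : "";

        // The part before the tab must hold exactly the ID and two coordinates.
        String[] fields = head.trim().split("\\s+");
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected '<id> <x> <y>' but got: " + head);
        }
        // Append z = 0 so the vertex becomes (x, y, 0).
        return fields[0] + " " + fields[1] + " " + fields[2] + " 0" + tail;
    }
}
```

Under these assumptions, a MapReduce version would typically apply the same logic to every input line in a map-only job (no reducer is needed, because each line is converted independently).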

We then partition the graph into blocks using Blogel's 2D partitioner. Please run this code, which is compiled together with the Quegel system code. We assume the partitioned graph is under HDFS path "/usaxyz_2".

Since our Quegel program is vertex-centric, we need to merge all vertices of a block into a single line, so that the whole block is treated as one vertex in our Quegel program. For this purpose, run this Java MapReduce program over the partitioned graph under "/usaxyz_2". We assume the merged data is under HDFS path "/usaxyz_3", which can be used as input to our Quegel program.
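
The merging step can be pictured as a group-by on the block ID. The following plain-Java sketch shows that idea only; it is not the linked MapReduce program. It assumes each input line has the form "<blockId><TAB><vertex record>" and joins the records of a block with '|'. Both the layout and the separator are placeholders chosen for this example, and the names MergeBlocks and mergeByBlock are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: line layout and '|' separator are assumptions for illustration.
final class MergeBlocks {
    /**
     * Assumed input:  one vertex per line, "<blockId>\t<vertex record>".
     * Output:         one line per block,  "<blockId>\t<record1>|<record2>|...".
     * Blocks are emitted in lexicographic order of their ID strings;
     * records inside a block keep their input order.
     */
    static List<String> mergeByBlock(List<String> vertexLines) {
        Map<String, StringBuilder> blocks = new TreeMap<>();
        for (String line : vertexLines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("missing block ID separator: " + line);
            }
            String blockId = line.substring(0, tab);
            String record = line.substring(tab + 1);

            StringBuilder merged = blocks.get(blockId);
            if (merged == null) {
                // First vertex seen for this block.
                blocks.put(blockId, new StringBuilder(record));
            } else {
                merged.append('|').append(record);
            }
        }

        List<String> result = new ArrayList<>();
        for (Map.Entry<String, StringBuilder> entry : blocks.entrySet()) {
            result.add(entry.getKey() + "\t" + entry.getValue());
        }
        return result;
    }
}
```

In a MapReduce setting, the same grouping would usually be expressed by emitting the block ID as the map output key and concatenating the vertex records of each key in the reducer.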