Pregel+ Communication Primitives

All the communication primitives of our system are defined in utils/communication.h. These functions are implemented using MPI and are called by the system (e.g., in the worker's run() function). If you want to change or extend the logic of our system, you may also need to call these functions.

The first set of functions aggregate a value contributed by each worker and return the aggregated result:

Every worker calls: int all_sum(int my_copy)

Every worker calls: long long master_sum_LL(long long my_copy)

Every worker calls: long long all_sum_LL(long long my_copy)

Every worker calls: all_bor(char my_copy)

For example, in function Worker::run() of basic/Worker.h, every worker executes active_vnum() = all_sum(active_count) to sum up the number of active vertices local to each worker (i.e., active_count) and to obtain the total number of active vertices (i.e., the return value of all_sum(.)). This total is then used to decide the termination condition. For master_sum_LL(.), the return value is only valid on the master process.
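The difference between the "all" and "master" reductions can be illustrated with a small single-process model. This is only a sketch of the semantics described above, not the MPI code in utils/communication.h: the names simulate_all_sum, simulate_master_sum_LL, MasterOnlyResult and the master_rank parameter are hypothetical, and element i of each input vector stands for the my_copy argument passed by worker i.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical single-process model (not the MPI implementation):
// my_copies[i] is the argument that worker i passes to the call.

// all_sum-style reduction: every worker receives the same total.
std::vector<int> simulate_all_sum(const std::vector<int>& my_copies) {
    assert(!my_copies.empty());
    const int total = std::accumulate(my_copies.begin(), my_copies.end(), 0);
    return std::vector<int>(my_copies.size(), total);
}

// What one worker sees after a master-only reduction.
struct MasterOnlyResult {
    bool valid;       // true only on the master
    long long value;  // meaningful only when valid is true
};

// master_sum_LL-style reduction: only the worker at index master_rank
// (an assumed parameter of this sketch) obtains a usable value.
std::vector<MasterOnlyResult> simulate_master_sum_LL(
        const std::vector<long long>& my_copies, std::size_t master_rank) {
    assert(master_rank < my_copies.size());
    const long long total =
        std::accumulate(my_copies.begin(), my_copies.end(), 0LL);
    std::vector<MasterOnlyResult> results(my_copies.size(),
                                          MasterOnlyResult{false, 0});
    results[master_rank] = MasterOnlyResult{true, total};
    return results;
}
```

With three workers holding 3, 0 and 5 active vertices, every entry returned by simulate_all_sum is 8, whereas simulate_master_sum_LL marks only the master's entry as valid.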

The second set of functions are called by the workers to exchange contents:

Every worker calls: all_to_all(vector<T>& to_exchange)

Every worker calls: all_to_all(vector<T>& to_send, vector<T1>& to_get)

Before calling the first function, one should make sure that to_exchange contains N elements of type T (N is the number of workers), where to_exchange[i] stores the object to be sent to worker i. After the function returns, to_exchange[i] stores the object received from worker i.

The second function is similar, except that the contents to send are stored in to_send and the contents received are stored in to_get. Note that on worker i, to_get[i] is left empty; the content that worker i addresses to itself must be read directly from to_send[i]. If T is different from T1, we require that the serial representation of a T-typed object be the same as that of a T1-typed object (e.g., vector and hash_set have the same serial representation according to utils/serialization.h).

These functions are used by the system to implement the message passing logic; see basic/MessageBuffer.h for more details.
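The exchange pattern described above amounts to transposing a matrix of per-worker boxes. The following is a minimal single-process sketch of that pattern, not the system's MPI code: the names Boxes, simulate_exchange_in_place and simulate_exchange_split are hypothetical, std::string stands in for the element type T, and row w models the vector owned by worker w. The sketch assumes that the in-place variant leaves a worker's own slot untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// boxes[w][i] is the object that worker w addresses to worker i.
using Boxes = std::vector<std::vector<std::string>>;

// One-argument variant: afterwards boxes[w][i] holds what worker w
// received from worker i. Assumption: the self slot boxes[w][w] is kept.
void simulate_exchange_in_place(Boxes& boxes) {
    const std::size_t n = boxes.size();
    for (const auto& row : boxes) assert(row.size() == n);
    for (std::size_t w = 0; w < n; ++w)
        for (std::size_t i = w + 1; i < n; ++i)
            std::swap(boxes[w][i], boxes[i][w]);
}

// Two-argument variant: to_send is left unchanged and the result plays the
// role of to_get. Slot [w][w] stays empty, so worker w would read its own
// content from to_send[w][w].
Boxes simulate_exchange_split(const Boxes& to_send) {
    const std::size_t n = to_send.size();
    for (const auto& row : to_send) assert(row.size() == n);
    Boxes to_get(n, std::vector<std::string>(n));
    for (std::size_t w = 0; w < n; ++w)
        for (std::size_t i = 0; i < n; ++i)
            if (i != w) to_get[w][i] = to_send[i][w];
    return to_get;
}
```

In both variants, the object that worker 1 placed in slot 0 ends up in slot 1 of worker 0; only the treatment of the self slot differs.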

The final set of functions are called between the master and the slaves to gather/scatter information:

Master calls: masterScatter(vector<T>& to_send);        Slave calls: slaveScatter(T& to_get)

Master calls: masterBcast(T& to_send);        Slave calls: slaveBcast(T& to_get)

Master calls: masterGather(vector<T>& to_get);        Slave calls: slaveGather(T& to_send)

The first pair of functions are called to let the master distribute contents to each worker. For the master, before calling masterScatter(.), to_send[i] stores the object to be sent to worker i. For each slave, after calling slaveScatter(.), to_get stores the object received from the master.

The second pair of functions are similar, except that the master sends the same content to_send to all slaves.

The third pair of functions are called to let the master gather contents from each worker. For each slave, before calling slaveGather(.), to_send stores the object to be sent to the master. For the master, after calling masterGather(.), to_get[i] stores the object received from worker i.

These functions are used by the system to implement the aggregator logic; see Worker::agg_sync() in basic/Worker.h for more details.
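A gather followed by a broadcast is the natural shape of an aggregator synchronization. The sketch below models that flow in a single process and is not taken from Worker::agg_sync(): WorkerState, simulate_agg_sync and master_rank are hypothetical names, summation is an assumed combine step, and the sketch assumes the master fills its own to_get slot from its local value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-worker state for this sketch (not a Pregel+ type).
struct WorkerState {
    int partial;  // value aggregated locally by this worker
    int global;   // value received back from the master
};

void simulate_agg_sync(std::vector<WorkerState>& workers,
                       std::size_t master_rank) {
    assert(master_rank < workers.size());

    // Gather step: to_get[i] on the master holds worker i's to_send.
    std::vector<int> to_get(workers.size());
    for (std::size_t i = 0; i < workers.size(); ++i) {
        if (i == master_rank) continue;
        to_get[i] = workers[i].partial;  // models slaveGather on worker i
    }
    // Assumption: the master supplies its own slot locally.
    to_get[master_rank] = workers[master_rank].partial;

    // The master combines the partial values (summation is an assumption).
    int combined = 0;
    for (int v : to_get) combined += v;

    // Bcast step: the master sends the same value to every slave.
    for (WorkerState& w : workers) w.global = combined;
}
```

For three workers with partial values 1, 2 and 3, every worker's global field is 6 after the call, while the partial fields are left unchanged.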