Building a parallel evolutionary simulation using the Message Passing Interface (MPI)


Say you have an evolutionary algorithm that takes bloody ages to complete. Each evaluation takes a minute or so, and you have 100 population members, so the whole evolutionary process takes an age to complete. On the other hand, suppose you have access to 50 networked computers. Each computer in the network could then evaluate 2 population members; that is, the evaluations can occur in parallel.

This is where the Message Passing Interface comes in handy. The idea is very simple: you distribute the computation over multiple nodes, which affords a drastic decrease in overall simulation time. Let's look at a very simple example (disclaimer: I'm not saying this is how you're meant to 'do' MPI, but it worked for me).

#include <.....whatever else you need to include....>
#include <mpi.h> // This is the important one for MPI

int main(int argc, char *argv[])
{
    // MPI Initialization (for multicore stuff)
    int numprocs; // number of mpi nodes we have (ideally popSize % numprocs==0)
    int myid; // the id number of this node
    MPI_Init(&argc,&argv);

    // set numprocs and myid
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
     
    // If we are the root node (node 0), we initialize the chromosome data
    // and send it to the receiving nodes. The receiving nodes
    // all have an id > 0 and can be regarded as 'worker' nodes
    if(myid==0)
    {
        initializeData();
        sendData(numprocs);
    }
    else // we are a worker node to receive data
    {
        receiveData();
    }

    // every node runs the main loop; inside it, the root node does the
    // optimisation and the worker nodes do the evaluations
    int generations = 1000;
    for(int gen = 0;gen<generations;gen++)
    {

        // block until all nodes have reached this point. This synchronises everything
        // and indicates when all worker nodes have finished their evaluations
        MPI_Barrier(MPI_COMM_WORLD);

        // if we are the root node, we receive data from the worker nodes and do optimisations
        if(myid==0)
        {
            receiveFitnessValues(numprocs);
            applyOptimisations();

            // send data back again
            sendData(numprocs);
        }
        else // we are a worker node, so evaluate and send back results
        {
            evaluate();
            sendFitnessValue();

            // get ready to receive new data
            receiveData();
        }

        MPI_Barrier(MPI_COMM_WORLD);

    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();

    return 0;

}
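In case it's useful, a program like the one above would typically be compiled and launched along these lines (assuming an MPI implementation such as Open MPI is installed; the file names evo.cpp and hosts are made up for illustration):

```shell
# compile with the MPI C++ wrapper compiler
mpic++ evo.cpp -o evo

# launch 100 processes (1 root + 99 workers); a hostfile
# tells mpirun which networked machines to spread them over
mpirun -np 100 --hostfile hosts ./evo
```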


The MPI send and receive functions then look as follows:


// for the root node, this function is used to send chromosome parameters to worker nodes
void sendData(int numprocs)
{
    // note we begin at an index of 1 since 0 is the root node
    for(int n = 1;n<numprocs;n++)
    {
        MPI_Send (reinterpret_cast<char *>(&data), sizeof(data),
                MPI_CHAR, n, 0, MPI_COMM_WORLD);
    }

}

Above, data is some variable that you wish to send. As with the prior serialization example, it is cast to bytes (chars) before sending. You can send any number of variables in this manner, casting each as shown. The data is then sent to worker node n.



// for the worker nodes, this function is used to receive chromosome parameters
void receiveData()
{
    MPI_Status stat;
    MPI_Recv (reinterpret_cast<char *>(&data), sizeof(data),
                 MPI_CHAR, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);

}

Above, we reverse the sending procedure, receiving the data and storing it in the variable data. Again, we might receive other variables in the same way.

Implementing receiveFitnessValues and sendFitnessValue would then work in a similar way.
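For what it's worth, here is roughly how I'd expect those two functions to look. This is a sketch only: the globals fitness and fitnessValues, the array size, and the use of MPI_ANY_TAG are my assumptions rather than anything fixed above.

```cpp
#include <mpi.h>

double fitness;           // assumed global: this worker's evaluation result
double fitnessValues[64]; // assumed global: one slot per node, held at the root

// worker side: send our single fitness value back to the root node (node 0)
void sendFitnessValue()
{
    MPI_Send (reinterpret_cast<char *>(&fitness), sizeof(fitness),
              MPI_CHAR, 0, 0, MPI_COMM_WORLD);
}

// root side: collect one fitness value from each worker node in turn
void receiveFitnessValues(int numprocs)
{
    MPI_Status stat;
    for (int n = 1; n < numprocs; n++)
    {
        MPI_Recv (reinterpret_cast<char *>(&fitnessValues[n]),
                  sizeof(fitnessValues[n]),
                  MPI_CHAR, n, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
    }
}
```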





A simple binary serialization header

For convenience, it's often handy to read and write data to disk in binary format. To do this, we reinterpret every variable we wish to write as a sequence of chars (each of size 1 byte) and write those bytes to disk. This is the header; just include it in your code:

// convenience code for serializing a variable to disk / reading in a var from disk
// Ben Jones 2011
// www.bhjones.com

#ifndef SERIALIZER_H
#define SERIALIZER_H

#include <iostream>
#include <fstream>

template <class T>
void serializeVariable(T var, std::ofstream &outFile)
{
    outFile.write (reinterpret_cast < char *>(&var),sizeof (var));
}

template <class T>
void readSerializedVariable(T &var, std::ifstream &inFile)
{
    inFile.read (reinterpret_cast < char *>(&var),sizeof (var));
}

#endif


To use this code, simply include it. Say you have a class called Bob with a function called serialize; you'd do something like

#include "serializer.h"

void
Bob::serialize(std::ofstream& outFile)
{
    ::serializeVariable<int>(integerVariable,outFile);
    ::serializeVariable<double>(doubleVariable,outFile);
}



in order to write an integer variable called integerVariable and a double variable called doubleVariable to disk.