As you can tell by my posts lately (thoughts on parallel MOEAs, and thoughts on compiling MODFLOW), I am working on a project where we are going to have to link a lot of codes together that may not like each other. I guess it’s like little kids in the backseat of a car on a long trip! So to continue this I wanted to talk about an interesting idea on extending the “Frankenstein’s monster” of these linked codes a step further…

## The workflow

A parallel MOEA is coded in C. It runs a master-worker paradigm in which all the workers are dutifully evaluating their simulations. **Note that each worker must work on its own individual input and output files!**

The worker has to call on a MATLAB wrapper, which massages some input data and writes some files that are needed to run a simulation. **MATLAB has to somehow figure out what node it is running on!**

Finally, MATLAB calls an executable that runs a simulation code written in a Mysterious Language. The executable communicates with the rest of the world via input and output text files.

## A problem

We know that we can call MATLAB using a system command. Consider trying to run a MATLAB file called myMatlabFunction.m. The command to invoke MATLAB is simply:

    matlab -r myMatlabFunction

But note! MATLAB doesn’t have the luxury of the MPI setup, so it doesn’t know its process rank. In fact, MATLAB is inherently ‘stupid’, in the sense that it is a new process spawned by the worker, with no knowledge of the calculations that the worker is doing. So how do we tell the MATLAB process its rank? This is needed, of course, because the input filename for the simulation contains the rank of the process.

In the C code, all we need to do to determine the rank of the process is:

    #include <mpi.h>

    int rank;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get current process rank */

My first idea was to add a ‘rank’ parameter to the MATLAB function and then simply pass the value in parentheses in the call to MATLAB. I would post where I found out how to do this, but I’m not going to because (wait for it) this is not a valid approach in Unix!

In fact, to do what we need to do, according to this helpful post on the MathWorks website, we need to set environment variables in the worker process. MATLAB can then read those environment variables.

## The Solution

In your C code, first convert the rank to a char array, and then use the setenv function to set an environment variable. Note! For debugging purposes, it is helpful to save **every** input file that you create. Granted, this means that thousands of files will be saved, but the benefit is that you get to see that the input files are created properly before you go destroying them. Therefore, there’s a second environment variable that you need to set, which is the iteration. Essentially you will have a matrix of input files: ‘rank 2, iteration 5’ is the 5th iteration that rank 2 has worked on, etc. So:

```
sprintf(rankAsString, "%d", rank);
sprintf(iterationAsString, "%d", myCounter);

setenv("RANK", rankAsString, 1);
setenv("ITERATION", iterationAsString, 1);
```
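If it helps, the conversion and the two setenv calls can be bundled with the launch itself. The following is only a sketch under assumptions: `run_with_rank_env` is a hypothetical helper name, the buffer sizes are my own choice, and the launch uses the standard `system()` call, which hands the command to the shell. A child started this way inherits a copy of the caller's environment, which is what lets MATLAB see `RANK` and `ITERATION`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes setenv() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): export the rank and
 * iteration as environment variables, then run `command` in a child process.
 * Returns the raw status from system(), or -1 if setenv() failed. */
int run_with_rank_env(const char *command, int rank, int iteration)
{
    char rankAsString[16];
    char iterationAsString[16];

    assert(command != NULL);

    /* snprintf bounds the write to the buffer size, unlike sprintf. */
    snprintf(rankAsString, sizeof rankAsString, "%d", rank);
    snprintf(iterationAsString, sizeof iterationAsString, "%d", iteration);

    /* The final argument 1 means "overwrite any existing value". */
    if (setenv("RANK", rankAsString, 1) != 0 ||
        setenv("ITERATION", iterationAsString, 1) != 0) {
        return -1;
    }

    /* The child process inherits a copy of this process's environment. */
    return system(command);
}
```

A worker might then call `run_with_rank_env("matlab -r myMatlabFunction", rank, myCounter);`. Before involving MATLAB at all, a shell command such as `echo $RANK` makes a quick stand-in for checking that the variables arrive.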

To clarify, myCounter is a global variable that you increment every time your evaluation function is called. So now the worker process has environment variables, inherited by any program it launches, that indicate the rank and iteration. This is important because your input file has those values embedded in its name:

    sprintf(theFilename, "input_rank-%d_iteration-%d.txt", rank, myCounter);
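The post doesn't show how the input file itself gets written, so here is one possible sketch. The helper name `write_input_file` and the one-number-per-line layout are assumptions on my part; the only thing taken from the post is the filename pattern. Whitespace-separated numbers are what a `%f` scan on the reading side expects.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): write n decision
 * variables, one per line, to input_rank-<rank>_iteration-<iteration>.txt.
 * Returns 0 on success, -1 on any failure. */
int write_input_file(int rank, int iteration, const double *vars, int n)
{
    char theFilename[64];
    int len, i;
    FILE *fp;

    assert(vars != NULL && n >= 0);

    len = snprintf(theFilename, sizeof theFilename,
                   "input_rank-%d_iteration-%d.txt", rank, iteration);
    if (len < 0 || (size_t)len >= sizeof theFilename)
        return -1;                    /* name did not fit in the buffer */

    fp = fopen(theFilename, "w");
    if (fp == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        /* %.17g keeps enough digits to round-trip a double exactly. */
        if (fprintf(fp, "%.17g\n", vars[i]) < 0) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Because the rank and iteration are both in the name, no two workers ever write to the same file, and every file survives for later inspection.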

Now let’s look at the MATLAB code. To test initially, I just did something silly. All I did was have the MATLAB code read in the variables, and then spit them back out in a new file. If it can do that, it can obviously do more complicated things. Luckily MATLAB has a function that can read environment variables (getenv). You can also craft yourself some test output that you can look at to make sure the process is working:

    function myMatlabFunction
    rank = getenv('RANK');
    myCounter = getenv('ITERATION');
    disp('Rank is:');
    disp(rank);
    disp('Iteration is:');
    disp(myCounter);
    filename = ['input_rank-' num2str(rank) '_iteration-' num2str(myCounter) '.txt'];
    outFilename = ['output_rank-' num2str(rank) '_iteration-' num2str(myCounter) '.txt'];
    fid = fopen(filename);
    for i = 1:11
        vars(i) = fscanf(fid, '%f', 1);
    end
    fclose(fid);
    disp(vars);
    outid = fopen(outFilename, 'w');
    fprintf(outid, '%f %f %f %f %f %f %f %f %f %f %f\n', vars);
    fclose(outid);
    exit

Note here that the number of variables is hard coded as 11, but this can be easily changed.
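Back on the C side, the worker eventually has to read the output file that MATLAB wrote. The post doesn't cover that step, so the following is a sketch under assumptions: `read_output_file` is a hypothetical name, and the file is assumed to hold whitespace-separated numbers as produced by the `fprintf` call in the MATLAB test function. Rather than hard coding 11, it takes a maximum count and reports how many values it actually found.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): read up to max_vars
 * numbers from output_rank-<rank>_iteration-<iteration>.txt into out.
 * Returns the number of values read, or -1 if the file could not be opened. */
int read_output_file(int rank, int iteration, double *out, int max_vars)
{
    char theFilename[64];
    int len, count = 0;
    FILE *fp;

    assert(out != NULL && max_vars >= 0);

    len = snprintf(theFilename, sizeof theFilename,
                   "output_rank-%d_iteration-%d.txt", rank, iteration);
    if (len < 0 || (size_t)len >= sizeof theFilename)
        return -1;                    /* name did not fit in the buffer */

    fp = fopen(theFilename, "r");
    if (fp == NULL)
        return -1;

    /* fscanf skips spaces and newlines between the numbers. */
    while (count < max_vars && fscanf(fp, "%lf", &out[count]) == 1)
        count++;

    fclose(fp);
    return count;
}
```

Comparing the return value against the expected number of variables gives a cheap check that the MATLAB step finished and wrote a complete file.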

## Conclusion

As you can see, just because you need to use source code to run a parallel MOEA, that doesn’t mean that you can’t employ wrappers and executable simulation codes within your workflow. Granted, the use of wrappers is going to slow you down, but for a model with a long evaluation time, this likely won’t matter that much. As usual, please comment and question below.

Hey,

Could you explain a little more about what exactly a wrapper is/does?

Thanks!

Hi Catherine, thanks for reading. A wrapper in this case just refers to a model that’s external to the optimization algorithm. The process is usually:

1. Algorithm figures out some decision variables that it thinks will perform well.

2. Algorithm sends the decision variables to the evaluation function.

3. Evaluation function returns a set of performance objectives.

If your evaluation function happens to be written in the same language as the optimization algorithm, then you might not need to worry about this! But in the example in this post, the algorithm is in C while the evaluation function is in Matlab. So these are some tips for getting the external Matlab function (or wrapper) to be able to communicate with the optimization algorithm.

Thank you!