
Version 12 (modified by abeham, 14 years ago)


Optimizing External Applications

Sometimes it is not possible to write a new problem directly for HeuristicLab and integrate it through the plugin system. Some people already have an application of which at least a part represents an NP-hard problem that they would like to solve. This guide explains how to use the ExternalEvaluationProblem, available in HeuristicLab 3.3, to optimize problems that are written in a language other than C# or built on a different framework, to name just two possibilities. First the architecture is described, then the API is examined in more detail, and finally a short tutorial gives the reader an idea of how to apply this to their own case.

The most important part of any optimization problem is the evaluation function. Without knowing the quality of a given solution configuration, the algorithm cannot approach an optimal solution. In NP-hard problems evaluating a solution is usually a rather simple task, whereas finding the best solution is extremely difficult. Of course there are complex problems which require high computational effort to calculate the quality of a solution, but in many cases the evaluation is rather straightforward. So, if the problem is not a HeuristicLab plugin, we assume that it is available in some other executable form, for example as an application of its own or as part of another framework. We thus have a situation where we need inter-process communication (IPC). There are several ways to do IPC; in the following we explain the approach we offer in HeuristicLab 3.3.

Architecture Overview

Technology & Background

Among the many technologies that have emerged as a basis for "distributed computing" in a wide sense, one of the first was Remote Procedure Call (RPC). The idea is very simple: instead of calling a local procedure to do some kind of calculation, a procedure is called that is not defined within the same executable or one of its dependencies. The client thus hands the parameters of the method to another program or server, waits for the computation, and then reads back the return value. In such a broad sense this is how most web applications work nowadays, and indeed it was not until the invention of Representational State Transfer (REST) that there was a revolutionary change from this early RPC paradigm.

Recently Google, one of the biggest players on the World Wide Web, released parts of its core RPC technology to the general public. The framework, which Google calls Protocol Buffers, combines a domain-specific language (DSL) for describing messages with an RPC framework that is used to pass these messages among different servers. The RPC framework itself has not been released, but the DSL for describing messages and translating them to objects of several supported programming languages was released under the BSD license. Google directly supports C++, Java, and Python, but many developers have written ports of Protocol Buffers to other languages such as C#, Visual Basic, Objective-C, Perl, Haskell, and many more. The documentation on Protocol Buffers is extremely helpful in understanding the framework. A similar framework called Thrift, which backs the operation of Facebook, has also been released to the public. These technologies provide convenient ways to define messages, manipulate them, and serialize them to small sizes at high speeds.

In HeuristicLab, remote procedure calls fit very well with what we are trying to achieve in "exporting" the problem definition. To an optimizer, the problem is basically the evaluation function, and the solution representation is the knowledge that both sides share. HeuristicLab provides a client as well as a framework that enables developers to write a service which exposes the evaluation function to HeuristicLab. Using this service, applications and problems written in other languages can communicate with HeuristicLab and have their parameters optimized by HeuristicLab's optimization library. Please note that while we are talking about RPC, the provided frameworks are not compatible with the RPC standard, but are rather simplified to ease application. Think of RPC as a paradigm rather than a standard.

Communication also requires a medium over which to exchange the information. So far HeuristicLab offers two choices for the underlying medium:

  1. The external program can be started as a process from HeuristicLab and the communication occurs via the process' stdin and stdout. This requires that the external program can be executed under Windows. If the developer controls the standard input and output streams and does not need to write or read other data through them, this might be a simple solution.
  2. The external program is started independently from HeuristicLab and opens a TCP/IP port for communicating over a network. This is independent of the platform and a universal solution that should work in most cases.
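Both options come down to a pair of byte streams: a child process exposes stdin and stdout, and a TCP connection exposes the input and output streams of its socket. The following Java sketch illustrates that idea with a hypothetical `StreamTransportSketch` class that exchanges plain text lines. It is not part of the HeuristicLab service framework, which exchanges Protocol Buffers messages; the class name and the line-based format are assumptions made purely for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not a class of the HeuristicLab service framework.
final class StreamTransportSketch {
  private final BufferedReader in;
  private final Writer out;

  // Works with any stream pair: System.in/System.out or the streams of a socket.
  StreamTransportSketch(InputStream in, OutputStream out) {
    this.in = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    this.out = new OutputStreamWriter(out, StandardCharsets.UTF_8);
  }

  // Reads one line; returns null when the other side has closed the stream.
  String receive() {
    try {
      return in.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Writes one line and flushes so the other side sees it immediately.
  void send(String message) {
    try {
      out.write(message);
      out.write('\n');
      out.flush();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // In-memory streams stand in for stdin/stdout or a TCP connection.
    InputStream fakeInput =
        new ByteArrayInputStream("0.25 0.75\n".getBytes(StandardCharsets.UTF_8));
    ByteArrayOutputStream fakeOutput = new ByteArrayOutputStream();
    StreamTransportSketch transport = new StreamTransportSketch(fakeInput, fakeOutput);

    String solution = transport.receive();
    transport.send("quality for [" + solution + "]");
    System.out.print(new String(fakeOutput.toByteArray(), StandardCharsets.UTF_8));
  }
}
```

For the first option the same class would be constructed with `System.in` and `System.out`; for the second with `getInputStream()` and `getOutputStream()` of a connected `java.net.Socket`.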

Regardless of the chosen medium, receiving solutions and sending qualities are abstracted from the developer through our service framework. The framework supports two types of services: Push Service and Poll Service. Which one is more suitable depends on the application. If the developer is in charge of the control flow, the push service is a plausible choice; if the application flow cannot be fully controlled by the developer, the poll service is the better option.

Push Service

As the name suggests, with a push service the solutions are pushed into the evaluation method. The developer has to provide a class that performs the evaluation. This class implements a method that takes a SolutionMessage and returns a double value indicating the quality of the given solution. The framework calls this method in a new thread whenever it receives a new solution.
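The push style can be sketched in a few lines of Java. The class `PushSketch`, its `Evaluator` interface, and the use of a `double[]` instead of a SolutionMessage are hypothetical stand-ins chosen for this illustration; the sketch only shows the idea of handing each received solution to a callback on a new thread and is not the framework's PushService.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of the push style; not the framework's PushService.
final class PushSketch {
  // Stand-in for the evaluation callback supplied by the developer.
  interface Evaluator {
    double evaluate(double[] solution);
  }

  private final Evaluator evaluator;

  PushSketch(Evaluator evaluator) {
    this.evaluator = evaluator;
  }

  // Would be called by the receiving loop for every incoming solution.
  // The evaluation runs on a new thread; the future completes with the quality.
  CompletableFuture<Double> onSolutionReceived(double[] solution) {
    CompletableFuture<Double> quality = new CompletableFuture<>();
    Thread worker = new Thread(() -> {
      try {
        quality.complete(evaluator.evaluate(solution));
      } catch (RuntimeException e) {
        quality.completeExceptionally(e);
      }
    });
    worker.start();
    return quality;
  }

  public static void main(String[] args) {
    // Example evaluator: sum of squares of the solution vector.
    PushSketch service = new PushSketch(solution -> {
      double sum = 0.0;
      for (double v : solution) {
        sum += v * v;
      }
      return sum;
    });
    System.out.println(service.onSolutionReceived(new double[] {1.0, 2.0}).join());
  }
}
```

Because every solution is evaluated on its own thread, several evaluations may run at the same time, so an evaluator written in this style should not depend on unsynchronized shared state.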

Poll Service

With a poll service the received solutions have to be polled. The service receives solutions in its own thread and puts them into a queue, where they wait for the developer to process them. It provides two public methods: one returns the next solution from the queue, blocking until a solution becomes available, and the other sends the quality back to HeuristicLab.
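The queue-based behavior described above can be sketched with a `BlockingQueue` from the Java standard library. The class `PollSketch` and its method bodies are assumptions made for illustration and do not reproduce the framework's PollService; in particular, sending the quality is only simulated by recording it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of the poll style; not the framework's PollService.
final class PollSketch {
  private final BlockingQueue<double[]> received = new LinkedBlockingQueue<>();
  private final BlockingQueue<Double> sentQualities = new LinkedBlockingQueue<>();

  // Called from the receiving thread whenever a solution arrives.
  void enqueue(double[] solution) {
    received.add(solution);
  }

  // Blocks the caller until a solution is available.
  double[] getSolution() {
    try {
      return received.take();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for a solution", e);
    }
  }

  // Stand-in for sending the quality back; here it is only recorded.
  void sendQuality(double[] solution, double quality) {
    sentQualities.add(quality);
  }

  // Returns the next recorded quality, or null if none has been sent yet.
  Double pollSentQuality() {
    return sentQualities.poll();
  }

  public static void main(String[] args) {
    PollSketch service = new PollSketch();
    // Simulated receiving thread that delivers one solution.
    new Thread(() -> service.enqueue(new double[] {0.5, 0.5})).start();

    double[] solution = service.getSolution(); // waits for the delivery
    service.sendQuality(solution, solution[0] + solution[1]);
    System.out.println(service.pollSentQuality());
  }
}
```

In this style the application decides when to ask for the next solution, which is why it suits programs whose control flow the developer cannot restructure around a callback.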

Application Scenarios

One of the application scenarios that we had in mind when designing this interface is the field of simulation-based optimization. There, a simulation model defines a number of parameters which need to be adjusted such that a measured output of the model improves. Examples are inventory sizes in a supply-chain scenario, buffer sizes in an assembly line, or the weights in a neural network simulator. Numerous optimization problems are implemented as simulation models, and one of the main difficulties is talking to them. Many different frameworks exist with which one can conveniently build, run, and test a simulation model, and most of them already have some support for optimization. However, that support is often proprietary and little information is available on how these methods perform. HeuristicLab aims to provide an open source alternative, and this interface allows simulation experts to use HeuristicLab for their optimization tasks.

Naturally, there are several more reasons why a problem cannot be modeled in HeuristicLab, such as language or platform dependency, and for these cases this interface should provide a solution.

Architecture Details

Optimizing external applications divides into two parts:

  1. Providing an evaluation service for HeuristicLab in the external application
  2. Preparing an ExternalEvaluationProblem in HeuristicLab

These two tasks are described in this section in more detail.

Providing an evaluation service

The following class diagram displays the classes and interfaces present in the Java service framework. On the one hand there is an abstract base class Channel, which provides methods for sending and receiving messages and has several concrete implementations. On the other hand there is an abstract base class Service, with the concrete implementations PollService and PushService. Each Channel has a corresponding factory which implements IChannelFactory.

[Image not available: ExternalEvaluationServiceCD.png, class diagram of the Java service framework]

In the source you will also find some test applications that provide very simple examples of how to use the service framework. They do not really evaluate a solution, but return a random number in the interval [0;1) for every received solution.

[Image not available: ExternalEvaluationServiceTestCD.png, class diagram of the test applications]

Usage examples

Here is the Java code for the RandomStreamingPollEvaluator:

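import java.io.IOException;
import java.util.Random;
// The service framework classes (StreamChannelFactory, PollService,
// SolutionMessage) must be imported as well.

public class RandomStreamingPollEvaluator {

  public static void main(String[] args) {
    // Communicate with HeuristicLab over the stdin and stdout of this process
    StreamChannelFactory factory = new StreamChannelFactory(System.in, System.out);
    PollService service = new PollService(factory, 1);
    service.start();
    
    Random random = new Random();
    while (true) {
      // Blocks until the next solution becomes available
      SolutionMessage msg = service.getSolution();
      // parse the message and retrieve the variables there
      try {
        // A real evaluator would compute the quality from msg
        service.sendQuality(msg, random.nextDouble());
      } catch (IOException e) {
        // Sending failed, so leave the evaluation loop
        break;
      }
    }
    
    service.stop();
  }
}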

This code shows a concrete example of how to use the PollService and how easily it can be integrated into an external Java application. Using a PushService is slightly different, as can be seen in the following example. As mentioned, which of the two is better suited depends on the application.

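import java.util.Random;
// The service framework classes (ServerSocketChannelFactory, PushService,
// IEvaluationService, SolutionMessage) must be imported as well.

public class RandomSocketPushEvaluator {
  private PushService service;
  
  public static void main(String[] args) {
    RandomSocketPushEvaluator main = new RandomSocketPushEvaluator();
    main.run();
    System.out.println("Service is running, terminate by pressing <Return> or <Enter>.");
    // Note: System.console() returns null if no interactive console is attached
    System.console().readLine();
    main.terminate();
  }
  
  private void run() {
    // The service is reachable over TCP/IP on port 8843
    ServerSocketChannelFactory factory = new ServerSocketChannelFactory(8843);
    service = new PushService(factory, 1, new RandomEvaluator());
    service.start();
  }
  
  private void terminate() {
    service.stop();
  }
  
  // Evaluation callback: called by the framework for every received solution
  private class RandomEvaluator implements IEvaluationService {
    Random random;
    
    public RandomEvaluator() {
      random = new Random();
    }
    
    @Override
    public double evaluate(SolutionMessage msg) {
      // A real evaluator would compute the quality from msg
      return random.nextDouble();
    }
    
  }
}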

Preparing an ExternalEvaluationProblem

In the HeuristicLab Optimizer a number of algorithms and problems are available which can be created, viewed, and parameterized. A problem in HeuristicLab consists of several parameters whose values describe the problem instance, and of several operators. The two most important operators are the Evaluator, which is used to evaluate a solution, and the SolutionCreator, which is used to create new solution configurations from scratch. Although solutions are usually created randomly, for some problems one might use certain heuristics and start with partly optimized solutions. Finally, every problem also contains a list of operators that are known to work with the problem's solution representation. In many problems this list is hidden, because it is populated automatically, for example by discovering all operators that implement a certain interface.

Among these problems there is a special one called ExternalEvaluationProblem, which is designed for calling an external evaluation function. It allows the user to define a customized solution representation and to configure the operator list with the operators that are available to solve the problem. The following screenshot shows the default parameters of this problem.

Parameter list of the `ExternalEvaluationProblem`

The available parameters are briefly described below:

  • BestKnownQuality - Displays the best known quality of the problem; it is updated through a special analyzer.
  • BestKnownSolution - Holds the best solution known so far, that is, the scope that contains the solution representation.
  • Client - The client that transmits the solution to the external application. The user configures the client by defining the appropriate channel and channel connection information.
  • Evaluator - The evaluator is an operator that collects variables from the scope and includes them in a message which will be sent to the external application. The user has to configure the evaluator such that it can find the variables to collect.
  • Maximization - Tells the algorithm whether it should minimize or maximize the quality value.
  • Operators - Contains a list of operators that the problem provides to the algorithm. Any Crossover, Manipulation, or other operator in this list is passed to the algorithm and made selectable there.
  • SolutionCreator - An operator that creates a solution. It can also be adjusted by the user as needed. One can use the representation generators included in the HeuristicLab.Encodings namespace, or generators from one's own plugin.

In the tutorial section it will be shown how to configure these parameters to solve a real problem.

For completeness, a class diagram of the HeuristicLab.Problems.ExternalEvaluation namespace is also given, followed by an explanation of how the classes interact and a description of the operators.

Class diagram of the external evaluation problem in HeuristicLab

The equivalents of the channels in the Java service framework are also implemented in C# for use in HeuristicLab. An interface IEvaluationChannel defines a channel. As can be seen from the class diagram, the methods that each channel has to provide are exactly the same as in the Java service framework: a channel can be opened and closed, and there are two functions for sending and receiving. In HeuristicLab, however, there is a third channel, the EvaluationProcessChannel, which provides a convenience wrapper around an EvaluationStreamChannel that is attached to the stdin and stdout streams of a process started from HeuristicLab.

Also shown in the class diagram are a number of converters that convert HeuristicLab data types, such as the variables that contain the solution representation, to a SolutionMessage. The SolutionMessageBuilder is the class that contains a number of converters and allows the ExternalEvaluator to construct the message. Which converters are used can be configured in the GUI when creating the ExternalEvaluationProblem, and the set of converters is extensible: a user can define a custom converter that implements IItemToSolutionMessageConverter and add it to the SolutionMessageBuilder in the GUI. By default the SolutionMessageBuilder is configured with all the converters shown in the diagram, which in turn cover all the data types in the HeuristicLab.Data namespace. The solution representations in HeuristicLab.Encodings are derived from those, so the converters cover them as well. Any solution representation that is not derived directly from one of the types in HeuristicLab.Data requires its own converter (HeuristicLab.Encodings.SymbolicExpressionTree is such a type). Next in the class diagram is the EvaluationServiceClient, which holds a channel and allows the ExternalEvaluator to communicate with the remote process. Finally, there is a class that describes the ExternalEvaluationProblem itself, as well as the QualityMessage (the expected answer from the service) and the SolutionMessage. These two message classes are generated automatically by processing the ExternalEvaluationMessages.proto file with the compiler tools of Google's Protocol Buffers framework.
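The idea of a builder that delegates to a list of pluggable converters can be sketched as follows. The sketch is written in Java for consistency with the listings above, although the HeuristicLab classes are implemented in C#. The names `MessageBuilderSketch` and `Converter`, as well as the text-based message format, are hypothetical and do not reflect the actual SolutionMessageBuilder or IItemToSolutionMessageConverter API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a builder with pluggable converters;
// not the C# SolutionMessageBuilder of HeuristicLab.
final class MessageBuilderSketch {
  // A converter returns a text entry for values it understands, otherwise null.
  interface Converter {
    String convert(String name, Object value);
  }

  private final List<Converter> converters = new ArrayList<>();

  void addConverter(Converter converter) {
    converters.add(converter);
  }

  // Asks the converters in order; fails if none accepts a variable.
  List<String> build(Map<String, Object> variables) {
    List<String> message = new ArrayList<>();
    for (Map.Entry<String, Object> variable : variables.entrySet()) {
      String entry = null;
      for (Converter converter : converters) {
        entry = converter.convert(variable.getKey(), variable.getValue());
        if (entry != null) {
          break;
        }
      }
      if (entry == null) {
        throw new IllegalArgumentException("no converter for variable " + variable.getKey());
      }
      message.add(entry);
    }
    return message;
  }

  public static void main(String[] args) {
    MessageBuilderSketch builder = new MessageBuilderSketch();
    // A converter for a basic data type.
    builder.addConverter((name, value) ->
        value instanceof Double ? name + "=" + value : null);
    // An additional converter for a type the first one does not cover.
    builder.addConverter((name, value) ->
        value instanceof double[] ? name + "=" + Arrays.toString((double[]) value) : null);

    Map<String, Object> variables = new LinkedHashMap<>();
    variables.put("Quality", 0.5);
    variables.put("RealVector", new double[] {0.1, 0.9});
    System.out.println(builder.build(variables));
  }
}
```

A variable whose type no registered converter accepts makes the build fail, which mirrors the point above that a representation outside the covered data types needs a converter of its own.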

Tutorial

In this tutorial we're going to connect the HeuristicLab 3.3 Optimizer with the Race Car Setup Optimization Competition that was organized for the EvoStar 2010 conference. Unfortunately the submission is already closed for this year, but let's hope there will be another competition in 2011.

The problem is a classic "simulation-based optimization problem": there is a simulator, a simulation model, and several configuration parameters that have to be optimized. Additionally, there might or might not be noise present in the evaluation of a configuration. More concretely, the simulator is called TORCS, the simulation model consists of a simulated driver that tries to race as fast as possible, and the configuration parameters are various settings of the race car that can be tuned. The challenging part of this problem is the limited amount of simulation time. Each optimization run is given a certain amount of simulation time in which the best parameter configuration has to be found, and each solution is evaluated with a certain slice of that time. The size of this slice can be chosen: a shorter slice means more noise on the quality values but a larger number of evaluated solutions, whereas a longer slice means less noise but fewer generations for the algorithm. The driver in the simulation model starts where it left off after the last configuration, so some of the noise depends on the driver's position on the track and some depends on the previous configuration, which could have sent the driver off the track.
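The trade-off between slice size and the number of evaluations is simple arithmetic, sketched below in Java. All numbers (a budget of 1,000,000 time units, slices of 500 and 2,000 units, a population of 50) are made up for illustration and are not taken from the competition rules; the class `BudgetSketch` is likewise hypothetical.

```java
// Hypothetical illustration of the time-budget trade-off; all numbers are made up.
final class BudgetSketch {
  // Number of solutions that fit into the total simulation budget.
  static long evaluations(long totalTime, long timePerEvaluation) {
    if (totalTime < 0 || timePerEvaluation <= 0) {
      throw new IllegalArgumentException("budget must be >= 0 and slice must be > 0");
    }
    return totalTime / timePerEvaluation;
  }

  // Number of full generations for a population-based algorithm.
  static long generations(long evaluations, int populationSize) {
    if (populationSize <= 0) {
      throw new IllegalArgumentException("population size must be > 0");
    }
    return evaluations / populationSize;
  }

  public static void main(String[] args) {
    long budget = 1_000_000; // assumed total simulation time units
    int populationSize = 50; // assumed population size
    for (long slice : new long[] {500, 2_000}) {
      long evals = evaluations(budget, slice);
      System.out.println("slice=" + slice + " evaluations=" + evals
          + " generations=" + generations(evals, populationSize));
    }
  }
}
```

With these assumed numbers, quadrupling the slice from 500 to 2,000 units reduces the run from 2,000 evaluations (40 generations) to 500 evaluations (10 generations), in exchange for less noisy quality values.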

To prepare for this tutorial there are a couple of things that we need to obtain.

  1. Download the simulator TORCS from http://torcs.sourceforge.net/
  2. Download the Windows Server, Java Client and the manual from http://cig.dei.polimi.it/?page_id=103
  3. Follow the instructions in the excellent manual that you just downloaded to learn more about the problem situation and how to perform the setup

Once you're done, get the latest copy of HeuristicLab 3.3 and unzip it into a folder, or build it from source using Visual Studio (see DevelopersManual, section BuildSources). We're now going to configure the problem in HeuristicLab, then take a look at the client, and finally let it run.

Configuring the Race Car Configuration Problem in HeuristicLab

In the Optimizer we select New from the File menu and choose the External Evaluation Problem from the list of Creatables among which we find the section Problems. You may have to scroll down a little until you reach the Problems.

Now we are presented with a new External Evaluation Problem like the one we have seen in the architecture description above. It is shown here again.

Parameter list of the `ExternalEvaluationProblem`

So, first let's give the problem a proper name and a proper description so that when we load the problem later we know what this is all about.

  • In the title type "Race Car Configuration Problem (EvoStar 2010)"
  • In the description type "Problem that represents race car configuration challenge of EvoStar 2010"

Now we want to configure the solution representation. For this purpose we click the operator called SolutionCreator and get the following screen.

We can examine the operator's parameters by selecting the "Parameters" tab, where we see that only a "Successor" parameter is defined, and that it is null. This parameter defines the next operator that would be executed after the SolutionCreator, but we don't need to bother with it in this case, so we switch back to the "Operators" tab. There is a button with a yellow plus icon that we're going to press.

The type selector appears. It is able to show every HeuristicLab object, but in this case, because the SolutionCreator exposes a list of IOperators only, we see a dialog with only operators to choose from. We're interested in creating our solution representation, and given that our simulation model expects a vector of real values that describes the car's configuration, we look into the HeuristicLab.Encodings.RealVectorEncoding namespace. Click on the "+" in front of that namespace to list all the available operators there.

That is quite a list of operators, but don't get confused. The names tell us what they do: there are several crossovers, manipulators, and a few other operators. Each of these operators has a description that you can see when you click on it. The description explains what the operator does, and for operators taken from the literature there are even references to publications. The operator we are looking for to create a vector of real values is the UniformRandomRealVectorCreator. Look for it near the bottom of the list and click OK.

Views are usually nested in each other and when you click on the operator you should see its view opening to the right. If you can't see the operator's details due to low screen resolution you can double click the operator and it will open in a completely new view and in a new tab next to the tab of the problem.

What we have to do next is configure this operator. We see in the operator's details that it has five parameters: Bounds, Length, Random, RealVector, and Successor. Like before we can ignore the Successor parameter, but the other four are interesting to us. You can click on any of these parameters to get a helpful description in the parameter's own view (which is nested again); you can also double click a parameter to open a view in a new tab. The Bounds parameter tells us that it represents "A 2 column matrix specifying the lower and upper bound for each dimension. If there are less rows than dimension the bounds vector is cycled." This is very helpful information indeed! We know from reading the challenge's manual that the solution vector is bounded in every dimension by 0 and 1, and now we have two choices. We could click on the button with the pencil icon and create a new Bounds matrix directly in the parameter of this operator, or we could set the Actual Name of this parameter to the name of a parameter that we create at a higher level. If we created the matrix directly in the operator's parameter, other operators would not have access to it. Because the Bounds are also needed in other operators such as Crossover and Mutation, it is a better idea to create the bounds matrix at the problem level.

So we click the yellow plus icon in the parameters list of our problem (marked in the following screenshot). If you have opened the operator in a new tab, switch to the tab of our problem (it starts with "Race Car Configuration...").

Again we see a dialog similar to the type selector.

This time it is restricted to all kinds of IParameters, several of which are found in the HeuristicLab.Parameters namespace. Of the several kinds of parameters, two types are important:

  1. Lookup parameters
  2. Value parameters

Lookup parameters do not have a value of their own, but rather search for this value in a higher scope. Value parameters, on the other hand, contain a value of their own and do not look for the value in a higher scope. The combination of the two is the ValueLookupParameter: if it has a value it will not look in a higher scope, and if it does not have a value it will. So it's the best of both worlds. Usually, when a parameter can be either set directly or looked up, a ValueLookupParameter is well suited. The Bounds parameter of the solution creator, for example, is such a parameter. In this case, however, we're at the problem scope and there's no higher scope, so what we need is a new ValueParameter<T>. The <T> states that we have to define the type of the parameter: we could make a parameter of IOperator, for example, but in our case we need a ValueParameter<DoubleMatrix>. So we select the ValueParameter, after which a new field becomes visible.
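The difference between the three kinds of parameters can be sketched with a small scope chain. The sketch is in Java for consistency with the listings above, and the names `ParameterSketch`, `Scope`, and `resolve` are hypothetical; it simplifies the lookup to a walk through parent scopes and does not reproduce HeuristicLab's actual parameter classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of value, lookup, and value-lookup behavior;
// not HeuristicLab's parameter implementation.
final class ParameterSketch {
  // A scope holds named values and may have a parent (higher) scope.
  static final class Scope {
    final Scope parent;
    final Map<String, Object> variables = new HashMap<>();

    Scope(Scope parent) {
      this.parent = parent;
    }

    // Lookup behavior: walk upwards until the name is found, else return null.
    Object lookup(String name) {
      for (Scope s = this; s != null; s = s.parent) {
        if (s.variables.containsKey(name)) {
          return s.variables.get(name);
        }
      }
      return null;
    }
  }

  // Value-lookup behavior: prefer the parameter's own value,
  // otherwise look the actual name up in the scope chain.
  static Object resolve(Object ownValue, String actualName, Scope scope) {
    return ownValue != null ? ownValue : scope.lookup(actualName);
  }

  public static void main(String[] args) {
    Scope problemScope = new Scope(null);
    problemScope.variables.put("Bounds", "bounds defined at the problem level");
    Scope operatorScope = new Scope(problemScope);

    // No own value: the name is resolved in the higher scope.
    System.out.println(resolve(null, "Bounds", operatorScope));
    // Own value set: the higher scope is not consulted.
    System.out.println(resolve("bounds set directly", "Bounds", operatorScope));
  }
}
```

Defining Bounds once at the top of the chain is what lets every operator below it find the same matrix, which is the reason the tutorial creates the parameter at the problem level.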

Click the "T" in this box and then the pencil button on the right to set the type. You're presented with a new window of the type selector that you already know. You can type "matrix" in the search text box, which limits your choices to just a few types, among them the DoubleMatrix. Double click the DoubleMatrix and see how the text box now says "T: DoubleMatrix". Finally we need to give the parameter a name. Because our real vector solution creator expects it to be called Bounds, it is less configuration effort if we also name this parameter Bounds. We can also add a description to remind us what this parameter describes. Let's type: "Defines the upper and lower bounds for the solution vector." It should look like the next screenshot before you click OK.

Now we need to assign the Bounds, so we click the newly created parameter in the parameter list and click the pencil icon in its detail view just under "Value".

That's simple! Select OK.

Now you should configure the Bounds matrix. Remember the description of the Bounds parameter of the SolutionCreator: each row specifies the lower and upper bound for one dimension, and if there are fewer rows than dimensions the rows are cycled. Because all our parameters have the same bounds, we just need 1 row and 2 columns, with 0 and 1 in the respective cells. It should look like the next screenshot.

Now we have the bounds for our vector defined at the problem level, and every operator that needs these bounds can find them if it has a lookup parameter defined for them. Let's go back to our UniformRandomRealVectorCreator. Click on SolutionCreator and there it is in its operators list. If you opened the tab earlier you can also just switch to that tab. The next parameter that we're going to examine is the Length parameter, which defines the length of our solution vector. This parameter will not be used by other operators that we encounter in this tutorial, so we just set the value directly there. Click the pencil icon. If that pencil icon is hidden, again due to screen resolution, just double click the Length parameter and perform the actions in the new tab. Close this tab again when you're done.

In the dialog that appears, confirm that it is an IntValue and click OK to create the value. Our solution vector is of length 22, because we have 22 values to optimize, so enter 22 in the value text box. If you can't see the value text box, double click on Length.

Let's look at the other parameters. There's Random next and then there is RealVector. If we click on RealVector we see that this is the name of the variable that will house our solution vector. We can change this to another name by setting the Actual Name, but since this is our only vector we can leave it at that.
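What the configured creator is expected to produce can be sketched in Java: a vector of the given length in which each position is drawn uniformly between its lower and upper bound, with the rows of the bounds matrix reused cyclically. The class `VectorCreatorSketch` is a hypothetical illustration based on the parameter descriptions quoted above, not the implementation of UniformRandomRealVectorCreator.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; not HeuristicLab's UniformRandomRealVectorCreator.
final class VectorCreatorSketch {
  // Samples each position uniformly between its lower and upper bound.
  // Rows of the bounds matrix are reused cyclically when there are fewer
  // rows than dimensions.
  static double[] create(Random random, int length, double[][] bounds) {
    if (length < 0 || bounds.length == 0) {
      throw new IllegalArgumentException(
          "length must be >= 0 and bounds must have at least one row");
    }
    double[] vector = new double[length];
    for (int i = 0; i < length; i++) {
      double[] row = bounds[i % bounds.length]; // {lower, upper}
      vector[i] = row[0] + random.nextDouble() * (row[1] - row[0]);
    }
    return vector;
  }

  public static void main(String[] args) {
    double[][] bounds = {{0.0, 1.0}}; // one row, cycled over all dimensions
    double[] realVector = create(new Random(42), 22, bounds);
    System.out.println(Arrays.toString(realVector));
  }
}
```

The four arguments and the result correspond to the Random, Length, Bounds, and RealVector parameters of the operator; the fixed seed in `main` is only there to make the sketch reproducible.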

So we have defined our SolutionCreator! Now we need to define the Evaluator, specify some operators to manipulate that real vector and finally specify the connection to our evaluation service.
