4.1Performance

We have performed several tests to measure the system performance. We have compared the performance of the genetic algorithm hosted on a single processor versus performance of the genetic algorithm using diffusion method of parallelization. We know that to perform a really good test of such a complex system is very difficult, maybe impossible. In several cases we also compare things very different and hardly comparable.

In this section we assume reader's good knowledge of the system Antares. For detailed information see the system documentation.

4.1.1 Test Conditions

Our aim was to achieve the same or at least similar conditions to all tested variants.

Problem Specification

We have chosen a function optimization as a test problem. Each gene has represented a single real value x from range <-100,100>. The value was encoded in 32 bits by Gray coding. For the chosen test function see Fig.4.1-1. The task was to find value of x that gives the maximum value as an argument of the test function (x=0). The test function has several points of local maximum to make the problem more difficult to the genetic algorithm.


Figure 4.1-1: Test function used for optimization.



Parameters

Every test variant has been initialized with the same population. There were only certain population sizes used during the test: 49, 100, 200, 294, 400 and 600 genes in a population. Populations of these sizes was generated randomly in advance and they were used as an initial population for both the serial and the parallel test variants. These are the parameter files containing these populations:

The system kernel has too many parameters, to test the performance for all their settings. Only several very important parameters were tested for a range of values (variable parameters). Other parameters were set to be constant for all variants of the test (constant parameters). We have performed a few tests (not documented) to get the optimal value of at least the key constant parameters. The rest of the parameters was set to their default values.

Termination Condition

The computation was stopped whenever the fitness value of any gene was greater then 0.999999999. In addition was the computation stopped after the number of executed genetic operators has exceeded 6000.

Tested Values

Every test variant was performed five times. The test results were averaged and the variances were computed. The tested values for all variants were:

4.1.2Serial Genetic Algorithm Test

We have performed two types of a serial genetic algorithm. The first, referred as dynamic, is more similar to the genetic algorithm performed by the diffusion method. The second, referred as classic, represents more "classic" genetic algorithm. For details about the used genetic operators see section Genetic Algorithm.

The only variable parameter of the serial test was the size of the population.

Results

Fig. 4.1-2 shows the test results of the dynamic variant of genetic algorithm. Fig. 4.1-3 shows the test results of the classic variant of genetic algorithm.

Figure 4.1-2: Result of the dynamic variant of genetic algorithm.
Population Optimum Run Time Variation
49 20% 88 46.57
100 60% 108.6 45.16
200 20% 186.8 25.67
294 100% 121.4 24.67
400 40% 213.4 91.81
600 80% 242.2 90.99

Figure 4.1-3: Result of the classic variant of genetic algorithm.
Population Optimum Run Time Variation
49 100% 90.6 40.23
100 100% 66.8 11.43
200 80% 136.2 62.05
294 40% 201.4 100.46
400 20% 280.6 87.62
600 20% 333.8 145.66

4.1.3Parallel Genetic Algorithm Test

We have tested the diffusion parallelization method. The variable parameters were the subpopulation size and the number of child processes. Each child process was hosted on its own processor from the actual PVM configuration. We have tested these variants during the night to avoid the network traffic. The last variable parameter was ShadowDmax (referred as Shadow in the figures). This parameter, in fact, determines the number of gene shadow copies kept by the child process for its every direct neighbour. This parameter affects the the amount of inter-process communication (the degree of cooperation of child processes). The used neighbourhood type was 4 neighbours grid.

Results

Fig. 4.1-4 shows the test results of the diffusion method performed with a single child process. Fig. 4.1-5 shows the test results of the diffusion method performed with two child processes. Fig. 4.1-6 shows the test results of the diffusion method performed with four child processes. Fig. 4.1-7 shows the test results of the diffusion method performed with six child processes.

Figure 4.1-4: Result of the test variants of the diffusion method (one processor).
Subpopulation Optimum Run Time Variation Init Time Variation Time
49 20% 75.4 4.56 6.6 0 82
100 60% 91.6 56.37 6.6 0.55 98.2
200 20% 74.4 21.72 6.4 0.55 80.8
294 100% 52.4 19.93 3 0 55.4
400 100% 58.4 27.5 4.8 2.4 63.2
600 80% 67.6 30.51 5.80 0.84 73.4

Figure 4.1-5: Result of the test variants of the diffusion method (two processors).
Subpopulation Shadow Optimum Run Time Variation Init Time Variation Time
49 1 100% 87.6 54.46 6.6 0.89 94.2
49 2 100% 27 16.5 7 0.7 34
49 3 100% 40 30.33 6.6 0.55 46.6
100 1 100% 53.8 26.68 6.4 0.55 60.2
100 2 100% 54.8 56.3 6.8 0.45 61.6
100 3 100% 96.4 70.51 6.6 0.55 103
200 1 100% 55.4 21.52 7.6 0.89 63
200 2 100% 51.4 22.37 6.6 0.89 58
200 3 100% 55 26.7 6.6 0.55 61.6

Figure 4.1-6: Result of the test variants of the diffusion method (four processors).
Subpopulation Shadow Optimum Run Time Variation Init Time Variation Time
49 1 100% 37.6 20 6.8 0.45 44.4
49 2 100% 52.8 21.68 6.6 0.55 59.4
49 3 100% 45 25.91 6.2 0.45 51.2
100 1 100% 25.4 11.46 7.6 1.14 33
100 2 100% 40 9.19 7 0.7 47
100 3 100% 17.2 6.76 7.2 1.3 24.4
200 1 100% 25.8 10.3 15.8 4.65 41.6
200 2 100% 27 9.41 15.4 3.05 42.4
200 3 100% 26.8 12.64 15.8 3.27 42.6

Figure 4.1-7: Result of the test variants of the diffusion method (six processors).
Subpopulation Shadow Optimum Run Time Variation Init Time Variation Time
49 1 100% 12.6 10.41 6.6 0.54 19.2
49 2 100% 23.6 15.72 6.1 0.45 29.7
49 3 100% 17.8 6.3 6.4 0.55 24.2
100 1 100% 16.4 6.07 9.6 0.89 26
100 2 100% 16.6 4.83 10 1.41 26.6
100 3 100% 23.8 9.39 9.8 1.48 33.6
200 1 80% 14 4.95 25.8 2.95 39.8
200 2 80% 18.8 10.8 27.6 2.3 46.4
200 3 100% 24 11.89 26.6 1.67 50.6

4.1.4Test Conclusions

The first conclusion is that the parameter ShadowDmax affects the performance of the diffusion method a lot. This parameter must be chosen very carefully with respect to the number of child processes, size of their subpopulation and the network capacity. The following figures show the relation between the value of the ShadowDmax parameter, the size of subpopulation and the run time.


Figure 4.1-8: Relation between the value of the ShadowDmax parameter,the size of subpopulation and the run time (2 processors).


Figure 4.1-9: Relation between the value of the ShadowDmax parameter,the size of subpopulation and the run time (4 processors).


Figure 4.1-10: Relation between the value of the ShadowDmax parameter,the size of subpopulation and the run time (6 processors).

We have compared the performance of diffusion method performed with a single child process hosted on a single processor versus the serial genetic algorithms (dynamic and classic). In fact, this is a comparison of three serial algorithms. The diffusion method has, in many cases, better performance than the dynamic and classic variant of genetic algorithm. We assume the reason of this fact is, that genetic operators executed during the diffusion method can change genes only in the limited part of the grid structure of the sbpopulation. Therefore the diffusion method seems more robust to the danger of suboptimal result. See Fig. 4.1-11 and Fig. 4.1-12 for comparison of these genetic algorithm variants.


Figure 4.1-11: Comparison of time for diffusion (one processor), dynamic and classic GA variants.


Figure 4.1-12: Comparison of optimum ratio for diffusion (one processor), dynamic and classic GA variants.

The last comparison we have made was the performance of the diffusion method with respect to the number of child processes i.e. processors. See Fig. 4.1-13 for the comparison.


Figure 4.1-13: Comparison of time for the diffusion method with respect to the number of processors.

System OverviewProject Antares