4 Application Preparation

This chapter guides you through creating a new application for the Antares system. No programming knowledge is required; you only have to modify an existing parameter file or create a new one.

Each of the following sections is one step of the creation procedure, in order.

4.1 Gene Coding

4.2 Fitness Definition

4.3 Algorithms

4.4 Stopping Conditions

Do not forget to add all required parameters if you have created a new parameter file.

4.1 Gene Coding

The problem description very often consists of more than one specification. You must decide which of these specifications to optimize and how to encode them into the gene. For each chosen specification you must specify the following attributes: accuracy, minimal and maximal value, type of specification (real or integer number) and type of encoding. A specification can be either a single value or a data array (vector). Each specification is stored in the gene at a given position called a gene part. The optimization process runs over the whole gene according to the fitness function (see fitness construction). When it is done, the gene parts are decoded back into problem specifications. The parameter GenCode (String), which describes the gene part(s), must be specified in the parameter file. It has the following format:

name(accuracy,type,minimum,maximum,dimension)

where:

name is the name of the specification (each specification must have a name, and the names must be different)
accuracy is an integer value that says how many bits are used to store the specification (the more bits, the better the accuracy)
type is one of the following:

0 - integer stored using binary encoding
1 - integer stored using Gray encoding
2 - real stored using binary encoding
3 - real stored using Gray encoding

minimum and maximum are either integers or real numbers; the value will not exceed these bounds
dimension is the vector dimension when using a vector specification, otherwise 1

When using more than one specification, use ":" as their separator:
first(.....):second(.....):third(.....) etc.

Example problem: optimize a real function of two variables on <0,1>x<0,1>.
The parameter file should contain something like this:

GenCode_Type=String
GenCode_Value=x(32,3,0,1,2)

(the variables are x[0] and x[1]; both are reals stored using Gray encoding with 32-bit accuracy, and the bounds are 0 and 1)

or

GenCode_Type=String
GenCode_Value=x(32,3,0,1,1):y(16,2,0,1,1)

(the variables are x and y; x is a real stored using Gray encoding with 32-bit accuracy, y is a binary-encoded real with 16-bit accuracy, and both have bounds 0 and 1)
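The GenCode format can be illustrated with a short Python sketch. It is not part of Antares; the class and function names are hypothetical, and the linear mapping of the bit pattern onto the bounds (with rounding for integer types) is an assumption about how decoding could work, not a description of the kernel.

```python
import re
from dataclasses import dataclass


@dataclass
class GenePart:
    name: str
    bits: int          # "accuracy": number of bits per element
    kind: int          # 0/1 = integer, 2/3 = real; 1 and 3 use Gray encoding
    minimum: float
    maximum: float
    dimension: int     # 1 for a single value, otherwise the vector length


def parse_gencode(value):
    """Split a GenCode_Value string into its gene part descriptions."""
    pattern = r"\s*(\w+)\((\d+),([0-3]),([^,]+),([^,]+),(\d+)\)\s*"
    parts = []
    for item in value.split(":"):
        match = re.fullmatch(pattern, item)
        if match is None:
            raise ValueError(f"malformed gene part: {item!r}")
        name, bits, kind, low, high, dim = match.groups()
        parts.append(GenePart(name, int(bits), int(kind),
                              float(low), float(high), int(dim)))
    return parts


def gray_to_binary(n):
    """Convert a reflected Gray code value to the plain binary value."""
    result = 0
    while n:
        result ^= n
        n >>= 1
    return result


def decode(raw, part):
    """Map the raw bit pattern of one element onto <minimum,maximum>.

    Assumption: the pattern is scaled linearly between the bounds.
    """
    if part.kind in (1, 3):
        raw = gray_to_binary(raw)
    span = part.maximum - part.minimum
    scaled = part.minimum + span * raw / (2 ** part.bits - 1)
    return scaled if part.kind >= 2 else round(scaled)


parts = parse_gencode("x(32,3,0,1,1):y(16,2,0,1,1)")
y_max = decode(2 ** 16 - 1, parts[1])   # all bits set maps to the upper bound
```

Under these assumptions a gene part with more bits simply divides the same interval into finer steps, which is why a higher accuracy value gives a better resolution.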

4.2 Fitness Definition

4.2.1 Main

4.2.2 How to use gene parts

4.2.3 How to use named constants

4.2.4 How to use neural networks

4.2.1 Main

The fitness is the rule that says how good a gene is. It is in fact a function and can be built from elementary functions. Gene parts are the variables of this function (see gene parts in expression). You can also use named constants (see named constant in expression) and neural networks (see neural network in expression).
The fitness expression is a string describing this function in linear form (one row with many parentheses).
The minimum and the maximum of the fitness function must be computed in advance.
The parameter "FitExpr(String)" describing the fitness expression must be specified in the parameter file.

Elementary functions supported by the system:

+                plus
-                minus
*                multiply
/                divide
^                power
SIN(x)           sine
COS(x)           cosine
TAN(x)           tangent
PI()             pi
SUM(i,from,to,x) sum
SQRT(x)          square root
LOG(x)           log10
LN(x)            natural logarithm
EXP(x)           e to the power x
MOD(x,y)         modulo
SGN(x)           signum
SGNP(x)          when x>=0 returns 1; otherwise 0
SGNN(x)          when x>0 returns 1; otherwise 0

Example functions (as written in parameter file):

FitExpr_Type=String
FitExpr_Value=x+(LOG(x+y)*2)

FitExpr_Type=String
FitExpr_Value=SIN(y^x)+(3*PI())

FitExpr_Type=String
FitExpr_Value=SUM(i,0,2,x[i]*c[i])
; this means x0*c0+x1*c1+x2*c2

Note that x, y and c are variables or constants and must be defined (see gene parts and named constants).
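To check a fitness expression by hand, it can help to write the same computation in an ordinary language. The following Python sketch mirrors the first and third example; the helper SUM and all sample values are hypothetical, and Antares itself evaluates the expression string directly. Note that in Python the power operator is ** rather than ^.

```python
import math


def SUM(start, stop, term):
    """Mirror of SUM(i,from,to,x): add term(i) for i = from..to inclusive."""
    return sum(term(i) for i in range(start, stop + 1))


# Hypothetical decoded gene parts and named constants.
xs, ys = 4.0, 6.0
x = [0.5, 0.25, 1.0]
c = [2.0, 4.0, 1.0]

# FitExpr_Value=x+(LOG(x+y)*2), where LOG is the base-10 logarithm
first = xs + (math.log10(xs + ys) * 2)

# FitExpr_Value=SUM(i,0,2,x[i]*c[i]), i.e. x0*c0+x1*c1+x2*c2
third = SUM(0, 2, lambda i: x[i] * c[i])
```

With these sample values the first expression gives 6 and the third gives 3, which can be compared with the fitness reported by the system for the same inputs.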

4.2.2 How To Use Gene Parts

If you have defined the gene encoding (see encoding information into gene), you can use the names of the parameters in the fitness expression as variables of the fitness function. If the parameter is a single value, you can simply put its name in the place where the variable belongs. If it is a vector, you must add [i] after the name to identify which vector element should be used as the variable (e.g. x[0]). Vector elements are indexed from 0.
Note that a single parameter is a vector parameter with only one element (so x is equal to x[0]).

4.2.3 How To Use Named Constants

You can define your own constants, called named constants. A named constant can be a single constant, a vector of constants or an n-dimensional field. If the named constant is single, you can simply put its name in the place where the constant belongs. If it is a vector, you must add [i] after the name to identify which vector element should be used as the constant (e.g. c[0]). If you are using an n-dimensional field of constants, you must add [i1,i2,....in] to the name (e.g. c[4,3,6]). All elements are indexed from 0.
Note that a single constant is a vector constant with only one element (so c is equal to c[0]).
The parameter "FitCon(String)" must be specified in the parameter file. In addition, all parameters corresponding to the constants, named "FCname[...]" (any numeric type), must exist in the parameter file.

single constant (must be defined as one element vector!):
name[1]

vector of constants:
name[dim]

n-dimensional field of constants:
name[dim1,dim2,.....dimn]

when using more than one named constant, use ":" as their separator:
first[.....]:second[.....]:third[.....] etc.

example: constant c=2.3456 and constant vector pr=[3,2]
The parameter file should contain something like this:

FitCon_Type=String
FitCon_Value=c[1]:pr[2]
FCc[0]_Type=Float
FCc[0]_Value=2.3456
FCpr[0]_Type=Integer
FCpr[0]_Value=3
FCpr[1]_Type=Integer
FCpr[1]_Value=2
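When there are many constants, the parameter lines can be generated rather than typed. The following Python sketch is only a convenience outside Antares; the function name is hypothetical and it handles single constants and vectors only, not n-dimensional fields.

```python
def constant_lines(constants):
    """Build FitCon and FCname[...] parameter lines from a dict of lists."""
    declaration = ":".join(
        f"{name}[{len(values)}]" for name, values in constants.items()
    )
    lines = ["FitCon_Type=String", f"FitCon_Value={declaration}"]
    for name, values in constants.items():
        for index, value in enumerate(values):
            kind = "Integer" if isinstance(value, int) else "Float"
            lines.append(f"FC{name}[{index}]_Type={kind}")
            lines.append(f"FC{name}[{index}]_Value={value}")
    return lines


# The example above: single constant c and vector pr.
lines = constant_lines({"c": [2.3456], "pr": [3, 2]})
print("\n".join(lines))
```

A single constant is passed as a one-element list, matching the rule that it must be declared as a one-element vector.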

4.2.4 How To Use Neural Network

If you have a trained neural network (see training of the neural network), you can use it as a function in the fitness expression. This function has the same number of input parameters as the neural network has inputs, and its output is the output of the neural network. The input values must be from the range <0,1>; the output is also from <0,1>.
The name of the neural network function must be specified in the parameter "NeuNet(String)" in the parameter file, and the file with the weights of the neurons (see how to train neural network) must be included in the parameter file.

#include "weights.dta"
NeuNet_Type=String
NeuNet_Value=net

example: a trained network (named "net") has topology 4-2-2-1, and the gene contains gene parts a, b, c and d (all from the range <0,1>).
The fitness expression can then look like this:
net(a,b,c,d)

4.3 Algorithms

Through the parameter file you can change the way the population of genes evolves. It is not too complicated: you only specify which operation is executed, when and how.

4.3.1 Operators

4.3.2 Setting Up Tables

4.3.1 Operators

4.3.1.1 Basics

4.3.1.2 GAtables

4.3.1.3 Remarks

4.3.1.4 Specifications

4.3.1.5 Parameters

4.3.1.1 Basics

The operators (GAtools) are the means of evolving the population. They change the gene sets (populations) or the genes themselves.

Basically three kinds of operators exist. They differ mostly in the number of parameters they need for their execution:

GAunoms
Require one parameter, one population. The changes are made inside this one population.
GAfunctions
Require two parameters, two populations. The result is always placed in the second population. It depends on the specific GAfunction whether only the first population or both of them are used for the execution.
GAselectors
These operators select the specified number of genes. There are four parameters: the source population, the destination population, the number of genes to be selected and a flag saying whether the genes should be copied or moved.
GAtables
This is a specific part of the GAfunctions group. They are explained later.

4.3.1.2 GAtables

These specific GAfunctions do not make any changes directly. Instead, they consist of one or two (GAdualtables) sets of GAtools and one or more sets of temporary populations. Executing a GAtable means executing every GAtool in the set, one after another. Every table can load its set specification from the parameter file. This, together with the fact that every genetic algorithm includes one such table, allows a wide variety of computation algorithms without any programming.
Note: The temporary populations are cleared after each execution.

4.3.1.3 Remarks

Using the same population for both parameters of a GAfunction or a GAselector causes no problems, although for some operators it does not make sense.

If a GAselector is asked to select more genes than actually exist in the incoming population, all genes are selected.

4.3.1.4 Specifications

GAunoms:

Empty
Deletes all genes in the population.

Sizer
The number of genes is changed to equal the number specified by the parent genetic algorithm. Either the weighted-best genes are cloned or the worst ones are deleted.

Elitism
In the first run a number of the best genes are copied (the number is the elitism size). In every following run the weighted-worst genes are replaced by the genes saved in the previous run. Then a new copy is made.

GAfunctions:

Crossover
Pairs of genes are taken from the incoming population and a standard crossover is performed. If the number of genes is odd, the remaining gene is only moved to the output.

Mutation
Every gene from the incoming population is taken. Every bit of the gene can be altered with the mutation probability. The result is moved into the output.

Pump
The size of the output population is changed to equal the size specified by the parent genetic algorithm. Missing genes are taken from the incoming population (weighted best); surplus ones (the worst) are deleted.

Copy
All genes from the input are copied to the output.

Move
All genes from the input are moved to the output.
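The Crossover and Mutation descriptions above can be sketched in Python as follows. This is an illustration only: genes are modelled as lists of bits, the function names are hypothetical, and single-point crossover is assumed because the text only says "standard crossover".

```python
import random


def crossover(incoming, rng):
    """Pair up genes and swap their tails at a random cut point.

    With an odd number of genes, the last one passes through unchanged.
    Assumes every gene has at least two bits.
    """
    output = []
    for a, b in zip(incoming[0::2], incoming[1::2]):
        cut = rng.randrange(1, len(a))
        output.append(a[:cut] + b[cut:])
        output.append(b[:cut] + a[cut:])
    if len(incoming) % 2:
        output.append(incoming[-1])
    return output


def mutation(incoming, mut_prob, rng):
    """Flip every bit of every gene independently with probability mut_prob."""
    return [
        [bit ^ 1 if rng.random() < mut_prob else bit for bit in gene]
        for gene in incoming
    ]


rng = random.Random(0)
genes = [[0] * 8, [1] * 8, [0, 1] * 4]
children = crossover(genes, rng)
mutated = mutation(children, 0.05, rng)
```

Here mut_prob plays the role of the MutProb parameter; the random generator is passed in explicitly so that a run can be repeated with the same seed.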

GAselectors:

Random
The genes are selected randomly.

Weighted Random
The higher the fitness of a gene, the higher its chance of being selected.

Inverted Weighted Random
The lower the fitness of a gene, the higher its chance of being selected.

Bests
The best genes are selected.

Worsts
The worst genes are selected.
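Two of the selectors can be sketched in Python to show the four GAselector parameters (source, destination, count, copy or move). The sketch is hypothetical: a population is modelled as a list of (fitness, gene) pairs, fitness values are assumed to be positive, and selection without repetition is an assumption the text does not confirm.

```python
import random


def select_bests(source, dest, count, move):
    """Bests: take the genes with the highest fitness."""
    count = min(count, len(source))          # asking for too many selects all
    chosen = sorted(source, key=lambda g: g[0], reverse=True)[:count]
    dest.extend(chosen)
    if move:
        for gene in chosen:
            source.remove(gene)


def select_weighted_random(source, dest, count, move, rng):
    """Weighted Random: the selection chance grows with the fitness."""
    pool = list(source)
    for _ in range(min(count, len(pool))):
        weights = [fitness for fitness, _ in pool]
        pick = rng.choices(range(len(pool)), weights=weights)[0]
        chosen = pool.pop(pick)
        dest.append(chosen)
        if move:
            source.remove(chosen)


population = [(1.0, "a"), (3.0, "b"), (2.0, "c")]
best = []
select_bests(population, best, 2, move=True)

source = [(1.0, "a"), (3.0, "b"), (2.0, "c")]
picked = []
select_weighted_random(source, picked, 5, move=False, rng=random.Random(1))
```

The second call asks for five genes from a population of three, so all three are selected, as described in the remarks.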

GAtables:

Single
The set of tools is simply executed.

Cycle
The set of tools is executed a specified number of times. Only then are the temporary populations emptied.

GAdualtables:

These tables contain two sets of tools. Basically they differ in the way they select which set will be executed.

Random Fork
The first set is selected with the specified probability.

If Fork
The second set is selected if the gene with the best/worst fitness in the incoming population exceeds the border fitness.

Count Fork
The first set is used for the specified number of generations, then the second one.

Alternate
The first set is used for the specified number of generations, then the second one for a specified number of generations, then the first again, and so on.
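The difference between Count Fork and Alternate can be shown with a small Python sketch. The function names are hypothetical, and counting generations from 0 is an assumption.

```python
def count_fork(generation, first_run_count):
    """Count Fork: set 1 for the first generations, set 2 from then on."""
    return 1 if generation < first_run_count else 2


def alternate(generation, first_run_count, second_run_count):
    """Alternate: set 1 and set 2 take turns in a repeating cycle."""
    phase = generation % (first_run_count + second_run_count)
    return 1 if phase < first_run_count else 2


fork_schedule = [count_fork(g, 2) for g in range(6)]
alternate_schedule = [alternate(g, 2, 1) for g in range(6)]
```

With a first run count of 2 and a second run count of 1, Count Fork switches to the second set once and stays there, while Alternate keeps returning to the first set.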

4.3.1.5 Parameters

Some GAtools require parameters from the parameter file (e.g. the mutation probability in Mutation).

Every GAtool can have a unique name. If it does, the parameter prefixed with the name of the GAtool is searched for first. If it is not found, the plain parameter without any prefix is searched for. If this is not found either, the operator cannot be executed.
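The lookup order can be sketched in Python as follows. The function is hypothetical, and the way the tool name and the parameter name are joined (direct concatenation) is an assumption; check the parameter files in the installation directory for the real naming.

```python
def find_parameter(params, tool_name, param_name):
    """Look up a tool parameter: prefixed name first, then the plain name."""
    if tool_name:
        prefixed = tool_name + param_name   # assumed form of the prefix
        if prefixed in params:
            return params[prefixed]
    if param_name in params:
        return params[param_name]
    raise KeyError(f"{param_name} not found; the operator cannot be executed")


# Hypothetical values: one named tool overrides the shared MutProb.
params = {"MutProb": 0.01, "strongMutProb": 0.2}
shared = find_parameter(params, None, "MutProb")
named = find_parameter(params, "strong", "MutProb")
fallback = find_parameter(params, "weak", "MutProb")
```

This lets several tools of the same kind share one default value while a named tool receives its own setting.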

Elitism
EliteSize the number of saved genes for the next cycle.

Mutation
MutProb probability of mutation.

All GAtables require their set definitions; besides these, some other parameters may be required.

Cycle
CycleCount number of cycles to make.

Random Fork
FirstTabProb the probability the first set will be selected.

If Fork
DeadLine the border fitness
AllGenes flag saying whether one gene above the border is enough or all of them must be above it.

Count Fork
FirstRunCount the number of generations the first set will be used.

Alternate
FirstRunCount the number of generations the first set will be used.
SecondRunCount the number of generations the second set will be used.

4.3.2 Setting Up Tables

Every GAtable is defined in one group of the parameter file.

A group called default must exist. It defines the default table contained in every child.
The definition of any other table is placed in a group named after the GAtable with '1' appended; GAdualtables also have a second group with '2' appended.
For now we will call such a group a "tablegroup".

Every tablegroup must contain:

number of temporary populations
number of GAtools
definitions of tools

Precisely:

The population count parameter: the number of temporary populations plus 2 for the fixed populations (Incoming = 0, Outgoing = 1).

groupnamePopCount_Type=Integer
groupnamePopCount_Value=2
groupnamePopCount_Change=Yes

Parameter of the tools count

groupnameToolCount_Type=Integer
groupnameToolCount_Value=1
groupnameToolCount_Change=Yes

Every GAtool is then indexed from one to count-1.

The syntax of the parameter value of a GAtool definition is:

groupname<index>_Value= TypeNumTypeName":"[ToolName]"("parameters")"

groupname is the name of the tablegroup
<index> is the index number of the tool in the set
TypeNum is: GAunom=1, GAfunction=2, GAselector=3, GAtable=4, GAdualtable=5
TypeName is the kind of tool
ToolName optional, is the unique name of the tool
parameters differ by the type of the tool and are delimited by commas:

GAunom
index of the In/OutComingPopulation
GAFunction, GAtables, GAdualtables
index of the Incoming population, and index of the Outgoing population
GAselector
index of the Incoming population, index of the Outgoing population, number of genes to be selected and 'M' for move or 'C' for copy

For examples, see the parameter files in the installation directory.
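To check a tool definition before running the system, the syntax above can be parsed with a few lines of Python. This is a sketch, not the parser used by Antares; the sample values passed to it are hypothetical, since the exact TypeName spellings should be taken from the shipped parameter files.

```python
import re

KINDS = {1: "GAunom", 2: "GAfunction", 3: "GAselector",
         4: "GAtable", 5: "GAdualtable"}


def parse_tool(value):
    """Split TypeNumTypeName:[ToolName](parameters) into its parts."""
    match = re.fullmatch(r"\s*([1-5])([^:(]+):([^()]*)\((.*)\)\s*", value)
    if match is None:
        raise ValueError(f"malformed tool definition: {value!r}")
    type_num, type_name, tool_name, params = match.groups()
    return {
        "kind": KINDS[int(type_num)],
        "type_name": type_name.strip(),
        "tool_name": tool_name.strip() or None,      # the name is optional
        "parameters": [p.strip() for p in params.split(",")]
                      if params.strip() else [],
    }


# Hypothetical definitions: an unnamed selector and a named function.
selector = parse_tool("3Bests:(0,1,10,C)")
function = parse_tool("2Mutation:mut1(0,1)")
```

For the selector, the four parameters are the incoming population, the outgoing population, the number of genes and the copy or move flag, in that order.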

4.4 Stopping Conditions

Although the kernel can run until it is stopped by the user, some built-in stopping conditions exist as well.

4.4.1 Child Process

4.4.2 Parent Process

4.4.1 Child Process

The child process can stop due to these conditions:

best gene fitness
average fitness of the population
minimal fitness of the population
number of generations to make
number of operations to make
number of changes to make

operation
Almost every GAtool execution counts as an operation. The exceptions are GAselectors, GAtables and some others (Empty, Copy, Move).
change
Every change in a gene's bit array counts as a change.

Here is a list of the corresponding parameters (a switch is a boolean parameter that enables or disables the condition):

Parameter	Description	Switch
MaxGenerations	Maximal count of generations	CheckMaxGenerations
MaxOps	Maximal count of operations	CheckMaxOps
MaxChngs	Maximal count of changes	CheckMaxChngs
MinEndFit	Minimal fitness to end	CheckMinEndFit
AvgEndFit	Average fitness to end	CheckAvgEndFit
MaxEndFit	Fit of best gene to end	CheckMaxEndFit
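The interplay of limits and switches can be sketched in Python. The parameter names come from the table above; the state keys are hypothetical, and the comparison direction (a condition fires when the observed value reaches the limit) is an assumption.

```python
def child_should_stop(params, state):
    """Return True when any enabled stopping condition is reached."""
    checks = [
        ("CheckMaxGenerations", "MaxGenerations", state["generations"]),
        ("CheckMaxOps", "MaxOps", state["operations"]),
        ("CheckMaxChngs", "MaxChngs", state["changes"]),
        ("CheckMinEndFit", "MinEndFit", state["min_fitness"]),
        ("CheckAvgEndFit", "AvgEndFit", state["avg_fitness"]),
        ("CheckMaxEndFit", "MaxEndFit", state["best_fitness"]),
    ]
    return any(
        params.get(switch, False) and value >= params[limit]
        for switch, limit, value in checks
    )


# Hypothetical settings: only the generation limit is switched on.
params = {"CheckMaxGenerations": True, "MaxGenerations": 100,
          "CheckMaxEndFit": False, "MaxEndFit": 0.99}
state = {"generations": 100, "operations": 5000, "changes": 12000,
         "min_fitness": 0.2, "avg_fitness": 0.6, "best_fitness": 0.995}
stop = child_should_stop(params, state)
```

In this example the best fitness is already above MaxEndFit, but that condition is switched off; the child stops only because the generation limit has been reached.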

4.4.2 Parent Process

If the parent is stopped, it stops all running child processes immediately.

These are the stopping conditions:

a specified number of child processes has stopped
a specified percent of all child processes has stopped
the best fitness of all the populations
the average fitness of all the populations
the minimal fitness of all the populations
the sum of all changes in all population has reached a specified number
the sum of all operations in all population has reached a specified number

Parameter	Description	Switch
CentralMaxOps	Maximal count of operations
CentralMaxChngs	Maximal count of changes
CentralMinEndFit	Minimal fitness to end
CentralAvgEndFit	Average fitness to end
CentralMaxEndFit	Fit of best gene to end
FinalPercent	Percent of stopped child processes to stop all	ComputePercent
FinalCount	Count of stopped child processes to stop all	ComputeCount