Experiments
On this page we describe how to specify experiments in our experiments.yml file. We will see how
to use builds in experiments, redirect the output, and enable options such as repeating experiments or
setting a timeout. Moreover, we will see how to use simexpal together with batch schedulers (e.g.
Slurm, Oracle Grid Engine (formerly SGE), …) and how to set environment variables.
Specifying Experiments
In this section we will see how to specify experiments with different kinds of Instances and Variants by using the keys:
name: name of the experiment
args: list of experiment arguments
Experiments with Local/Remote Instances
Assuming we have a ./my_sort.py file in the same directory as the
experiments.yml that takes a keyword argument --algo and a path to a
single (local/remote) instance
as input, we can define our experiments.yml as follows:
experiments:
  - name: insertion-sort
    args: ['./my_sort.py', '--algo=insertion-sort', '@INSTANCE@']
    stdout: out
In the example above we created an experiment named insertion-sort. As experiment arguments we have
a list of strings (instead of one space-separated string). Note that the @-variable
@INSTANCE@ resolves to the paths of the instances given in the instances stanza, i.e.,
/instance_directory/<instance_name>.
If the instances have Extra Arguments, we further need to add the @-variable
@EXTRA_ARGS@ to the experiment arguments, e.g.,
args: ['./my_sort.py', '--algo=insertion-sort', '@INSTANCE@', '@EXTRA_ARGS@'] for the example above.
@EXTRA_ARGS@ will resolve to the specified extra arguments of the respective instance (and also the used
variants, see below).
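The page does not show what ./my_sort.py looks like. The following is a minimal sketch of such a script, assuming (hypothetically) that an instance is a text file with whitespace-separated integers and that only the insertion-sort algorithm is implemented. The function names and the file format are illustrative assumptions, not part of simexpal.

```python
import argparse
from pathlib import Path


def insertion_sort(values):
    """Return a sorted copy of values using insertion sort."""
    result = list(values)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def parse_args(argv=None):
    """Parse the arguments given in the args key of the experiment."""
    parser = argparse.ArgumentParser(description="Hypothetical my_sort.py")
    parser.add_argument("--algo", default="insertion-sort",
                        choices=["insertion-sort"])
    # simexpal substitutes @INSTANCE@ with the path to the instance.
    parser.add_argument("instance", type=Path)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Assumed instance format: whitespace-separated integers.
    numbers = [int(token) for token in args.instance.read_text().split()]
    for number in insertion_sort(numbers):
        # Printed lines end up in the file selected by the stdout key.
        print(number)
```

To use this as an executable script, call main() under the usual if __name__ == "__main__": guard.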
Experiments with Multiple Extension Instances
Specifying experiments with multiple extension instances works similarly to
specifying experiments with local/remote instances. They
only differ in the used @-variable in the experiment arguments. Here, we use the
@-variable @INSTANCE:<ext>@, where <ext> is an extension that is specified in the
extensions key of an instance in the instances stanza.
Assuming you have an algorithm that takes a path to a .graph and a .xyz file as input, you can
specify your experiment as follows:
experiments:
  - name: graph-algorithm
    args: ['./algorithm.py', '@INSTANCE:graph@', '@INSTANCE:xyz@']
    stdout: out
The @INSTANCE:graph@ variable will resolve to /instance_directory/<instance_name>.graph during
runtime. Analogously for the @INSTANCE:xyz@ variable.
Extra Arguments are handled analogously to the case of Experiments with Local/Remote Instances.
Experiments with Arbitrary Input File Instances
Specifying experiments with arbitrary input file instances works similarly to
specifying experiments with multiple extension instances.
They only differ in the used @-variable in the experiment arguments. Here, we use the
@-variable @INSTANCE:<index>@, where <index> is the index of the desired file specified in the
files key of an instance in the instances stanza. Note that indices start at 0.
Assuming you have an algorithm that takes two input files and you want to pass it the path to the first
file of the files key followed by the path to the second file, you can specify your experiment
as follows:
experiments:
  - name: algorithm
    args: ['./algorithm.py', '@INSTANCE:0@', '@INSTANCE:1@']
    stdout: out
The @INSTANCE:0@ variable will resolve to /instance_directory/files[0], where files[0] is
the first filename of the files key. Analogously for the @INSTANCE:1@ variable.
Extra Arguments are handled analogously to the case of Experiments with Local/Remote Instances.
Experiments with Variants
To specify experiments with Variants we need to add the @EXTRA_ARGS@ variable to the experiment
arguments:
experiments:
  - name: algorithm
    args: ['./algorithm.py', '@INSTANCE@', '@EXTRA_ARGS@']
    stdout: out
The @EXTRA_ARGS@ variable resolves to the extra arguments of all variants (and also the used instance, see
above) of the experiment during runtime. For example, assume we have the following variants stanza:
variants:
  - axis: 'block-algo'
    items:
      - name: 'ba-insert'
        extra_args: ['insertion_sort']
      - name: 'ba-bubble'
        extra_args: ['bubble_sort']
  - axis: 'block-size'
    items:
      - name: 'bs32'
        extra_args: ['32']
      - name: 'bs64'
        extra_args: ['64']
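Then each experiment uses one variant per axis, and @EXTRA_ARGS@ will resolve to the extra arguments of the variant combinations
ba-bubble, bs32
ba-bubble, bs64
ba-insert, bs32
ba-insert, bs64
in the respective experiments (for example, 'bubble_sort', '32' for the combination ba-bubble, bs32).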
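To make the combinations concrete, the following sketch enumerates one variant per axis with itertools.product and concatenates the extra arguments of each choice. It only illustrates the cross product described above. The dictionary layout and the function name are assumptions made for this example, and the order in which simexpal itself enumerates combinations is not specified here.

```python
from itertools import product

# The variants stanza from above, written as a plain dictionary:
# axis name -> {variant name -> extra_args}.
variants = {
    "block-algo": {"ba-insert": ["insertion_sort"], "ba-bubble": ["bubble_sort"]},
    "block-size": {"bs32": ["32"], "bs64": ["64"]},
}


def variant_combinations(axes):
    """Yield (variant names, extra arguments) for one variant per axis."""
    per_axis = [list(items.items()) for items in axes.values()]
    for combo in product(*per_axis):
        names = [name for name, _ in combo]
        # Concatenate the extra_args lists of all chosen variants.
        extra_args = [arg for _, args in combo for arg in args]
        yield names, extra_args


combos = list(variant_combinations(variants))
for names, extra_args in combos:
    print(", ".join(names), "->", extra_args)
```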
Use Builds
On the Builds page we explained how to set up automated builds. In order to use those builds for our experiments we need to specify them with the
use_builds: list of used build names
key. Assuming that we have defined build1 in our builds stanza, we can link the build to
the experiment as follows:
experiments:
  - name: experiment1
    args: ['<name_of_executable_of_build1>', ...]
    use_builds: [build1]
    ...
In this way simexpal will check the installation directory and the extra_paths
of the builds specified in use_builds for the executable. If a build
requires other builds and they are properly specified in the requires key, then
simexpal will also check the installation directories and extra_paths of those builds.
Output
To redirect the output of an experiment to the ./output/ folder, we specify the
stdout: extension of the output file
output: dictionary containing all output file extensions
keys.
Assume the following experiments stanza in our experiments.yml:
experiments:
  - name: experiment1
    ...
    stdout: 'out'
    output:
      extensions: ['out', 'foo']
Simexpal will then store the outputs in <instance_name>.out files, which are located in the
./output/<experiment_name>~<variant_names>@<revision_name>
directory.
Note
In previous versions of simexpal we would specify the output key with 'stdout' as value, i.e.,
output: 'stdout', to achieve the behaviour above. This is deprecated and might be removed in
future versions.
The substring ~<variant_names> only appears if the experiment has variants. <variant_names>
will then be a comma-separated enumeration of the used variants. The suffix @<revision_name>
appears if the experiment uses builds and shows the name of the used revision.
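The naming rule can be summarized in a few lines of code. The helper below is a hypothetical illustration based only on the description above, not simexpal's implementation. In particular, joining the variant names with a bare comma (no space) is an assumption.

```python
def output_dir_name(experiment, variant_names=(), revision=None):
    """Compose <experiment_name>~<variant_names>@<revision_name>."""
    name = experiment
    if variant_names:
        # The ~ part only appears if the experiment has variants.
        name += "~" + ",".join(variant_names)
    if revision is not None:
        # The @ part only appears if the experiment uses builds.
        name += "@" + revision
    return name


print(output_dir_name("experiment1"))
print(output_dir_name("experiment1", ["ba-insert", "bs32"], "main"))
```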
To access the output files with other extensions, we can use the @-variable
@OUTPUT:<ext>@, where <ext> is an extension specified in the extensions key. This
@-variable can be used in the args key of experiments and is useful for use cases like the following:
The experiments that we are running store all intermediate steps and results. Thus, when taking a look at
the output files, we could encounter thousands (or even more) lines of information even though we might
only be interested in the last couple of lines. To avoid this, we add another input parameter, which takes a
file path, to our experiments. We then store the final experiment results in this file. Our experiment args
could then look like this:
args: ['experiments.py', '@INSTANCE@', '@OUTPUT:foo@'],
where the first file path is the path to the instance and the second file path is the path to the output file
that contains the final results (@OUTPUT:foo@ will resolve to the output file with extension .foo).
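A script following this pattern might look like the sketch below. The computation (counting lines) is a placeholder, and the function name run is hypothetical. The point is only that intermediate steps go to standard output while the final result is written to the path that simexpal substitutes for @OUTPUT:foo@.

```python
from pathlib import Path


def run(instance_path, result_path):
    """Process an instance, log every step, and store only the final result."""
    lines = Path(instance_path).read_text().splitlines()
    for index, line in enumerate(lines):
        # Intermediate information: ends up in the <instance_name>.out file.
        print(f"step {index}: processed {line!r}")
    # Final result: written to the file passed as the second argument.
    Path(result_path).write_text(f"final line count: {len(lines)}\n")
    return len(lines)
```

With the args line above, the script would call run(sys.argv[1], sys.argv[2]) after importing sys.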
Repeat
Sometimes it might be useful to validate experiment results by repeating the experiment. In order to
avoid duplicating an experiments entry we can use the
repeat: integer - number of times an experiment is repeated
key. To repeat an experiment twice we define our experiments stanza as follows:
experiments:
  - name: experiment1
    ...
    repeat: 2
The default value of repeat is 1.
Timeout
Use the timeout key in the experiments section to specify the time in
seconds an experiment is allowed to run. When the timeout is exceeded, the
experiment will be terminated forcefully. The following is an example of how to
set a timeout of 7200 seconds (2 hours):
experiments:
  - name: experiment1
    ...
    timeout: 7200
After the experiment has reached the limit of the specified timeout, the signal
SIGXCPU is sent to the running process. SIGXCPU can be handled by the
process first, and after a grace period the signal SIGTERM is sent to the
process for the final termination.
Setting Environment Variables
When using APIs like OpenMP it is sometimes necessary to specify settings as environment variables. Thus, simexpal supports setting environment variables in experiments by specifying the
environ: dictionary of (environment variable, value)-pairs
key. For example, you can specify the OMP_NUM_THREADS environment variable as follows:
experiments:
  - name: experiment1
    args: ...
    ...
    environ:
      OMP_NUM_THREADS: 2
  - name: experiment2
    args: ...
    ...
    environ:
      OMP_NUM_THREADS: 4
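Inside the experiment, such a variable is read like any other environment variable. The sketch below shows one way a Python experiment could pick up OMP_NUM_THREADS. The helper name num_threads and the fallback to 1 are assumptions, and list-valued settings (as OpenMP allows for nested parallelism) are not handled.

```python
import os


def num_threads(default=1):
    """Return OMP_NUM_THREADS as a positive integer, or `default` if unusable."""
    raw = os.environ.get("OMP_NUM_THREADS", "")
    try:
        value = int(raw)
    except ValueError:
        # Variable unset or not a plain integer.
        return default
    return value if value > 0 else default


print("threads:", num_threads())
```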
Slurm
sbatch: --ntasks-per-node, -c, -N
When using a job scheduler like Slurm it might be useful to run your software using different node/CPU settings.
Currently, simexpal supports the following sbatch parameters by using its own keywords in
the experiments.yml:
procs_per_node: number of tasks to invoke on each node (slurm: --ntasks-per-node=n)
num_threads: number of cpus required per task (slurm: -c, --cpus-per-task=ncpus)
num_nodes: number of nodes on which to run (N = min[-max]) (slurm: -N, --nodes=N)
exclusive: boolean flag to run an experiment exclusively on specified computing resources (slurm: --exclusive)
experiments:
  - name: experiment1
    ...
    num_nodes: 1
    procs_per_node: 24
    num_threads: 2
  - name: experiment2
    ...
    num_nodes: 2
    procs_per_node: 24
    num_threads: 2
When launching your experiments with Slurm, the arguments -N 1 --ntasks-per-node 24 -c 2
will be appended to the sbatch command for experiment1, and analogously for experiment2.
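The mapping from experiments.yml keys to sbatch flags can be sketched as follows. This is a hypothetical helper that mirrors the description above. It is not simexpal's actual implementation, and the function name sbatch_flags is made up for this example.

```python
def sbatch_flags(experiment):
    """Translate the supported experiment keys into a list of sbatch arguments."""
    mapping = [
        ("num_nodes", "-N"),
        ("procs_per_node", "--ntasks-per-node"),
        ("num_threads", "-c"),
    ]
    flags = []
    for key, flag in mapping:
        if key in experiment:
            flags += [flag, str(experiment[key])]
    if experiment.get("exclusive"):
        # Boolean key: the flag takes no value.
        flags.append("--exclusive")
    return flags


experiment1 = {"name": "experiment1", "num_nodes": 1,
               "procs_per_node": 24, "num_threads": 2}
print(" ".join(sbatch_flags(experiment1)))  # -N 1 --ntasks-per-node 24 -c 2
```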
Arbitrary sbatch Arguments
In the previous section, we saw how to set the values of the sbatch arguments that have their own
simexpal keywords. In this section, we will see how to pass arbitrary sbatch arguments. To do so, we
use the
slurm_args: list of additional sbatch arguments
key. For example, we can set the job name of an experiment by using the -J parameter of the
sbatch command:
experiments:
  - name: experiment1
    ...
    slurm_args: ['-J', 'arbitrary_jobname']
Next
To get a more detailed understanding of experiment variants and fully set up your experiments, you can visit the Variants page. If you do not plan on having variants, you can visit the Run Matrix page to modify the experiment combinations that you want to run.