Evaluated artifact for "quantifying software reliability via model-counting"

TechnicalRemarks: # counterSharp Experiment and Play Environment

This repository contains the reproducible experimental evaluation of the counterSharp tool. The repository contains a Docker file which configures the counterSharp tool, two model counters (ApproxMC and Ganak) as well as the tool by Dimovski et al. for our experiments. Furthermore the repository contains the benchmarks on which we ran our experiments as well as the logs of our experiments and scripts for transforming the log files into LaTeX tables.

Getting Started

In order to pull and run the docker container from docker hub you should execute docker run. Or, you can load the archived and evaluated artifact into docker with docker load < countersharp-experiments.tar.gz

If the image is loaded, docker run opens a shell allowing the execution of further commands:

bash docker run -it -v `pwd`/results:/experiments/results samweb/countersharp-experiments

By using a volume, the results are written to the host system rather than the docker container You can remove the volume mounting option (-v ...), and create /experiments/results inside the container if you can spare the results. If you are using the volume and run into permission problems, then you need to give rights via SELinux: chcon -Rt svirt_sandbox_file_tpwd/results.

This will create a writable folder results in your current folder which will hold any logs from experiments.

A minimal example can be executed by running (this takes approximately 70 seconds): bash ./showcase.sh This will create benchmark log files for the benchmarks for_bounded_loop1.c and overflow.c in the folder results. For example, /experiments/results/for_bounded_loop1.c/0X/ contains five folder for the five repeated runs of the experiments on this file. Each folder /experiments/results/for_bounded_loop1.c/0X/ contains the folders for the different tools, which includes the log and output files.

A full run can be executed by running (this takes approximately a little under 2 days): bash ./run-all.sh

Additionally single benchmarks can be executed through the following commands: bash run-instance approx program.c "[counterSharp arguments]" # Runs countersharp with ApproxMC on program.c run-instance ganak program.c "[counterSharp arguments]" # Runs countersharp with Ganak on program.c Probab.native -single -domain polyhedra program.c # Runs the tool by Dimovski et al. for deterministic programs Probab.native -single -domain polyhedra -nondet program.c # Runs the tool by Dimovski et al. for nondeterministic programs For example we can execute run-instance approx /experiments/benchmarks/confidence.c "--function testfun --unwind 1" to obtain the outcome of counterSharp and ApproxMC for the benchmark confidence.c. Note that the time information produced by runlim is always only for one part of the entire execution (i.e. for counterSharp or one ApproxMC run or one Ganak run). The script run-instance is straightforwarded, we have the call to our tool counterSharp:

bash python3 -m counterSharp --amm /tmp/amm.dimacs --amh /tmp/amh.dimacs --asm /tmp/asm.dimacs --ash /tmp/ash.dimacs --con /tmp/con.dimacs -d $3 $2 which is followed by the call of ApproxMC oder ganak.

Benchmarks

The benchmarks are contained in the folder benchmarks which also includes an overview on the sources and modifications to the benchmarks
Note that benchmark versions for the tool by Dimovski et al. are contained in folder benchmarks-dimovski

Benchmark Results

The results are contained in the folder results in which all logs from benchmark runs reside. The log files from the evaluation are not available in the Docker Image, but just on GitHub. The logs are split-up by benchmark instance (first level folder), run number (second level folder) and tool (third level folder)
For example, the file results/bwd_loop1a.c/01/approxmc/stdout.log contains the stdout and stderr of running approxmc on the instance bwd_loop1a.c in run 01

Machine Details

All runs were executed on a Linux machine housing an Intel(R) Core(TM) i5-6500 CPU (3.20GHz) and 16GB of memory. Note that for every benchmark log 01/counterSharp/init.log contains information on the machine used for benchmark execution as well as on the commits used in the experiments.

Running benchmarks

For all cases of automated benchmark execution we assume a CSV file containing relevant information on the instances to run: The first column is the benchmark's name, the second column are parameters passed to counterSharp (see instances.csv) or the tool by Dimovski (see instances-dimovski.csv). All scripts produce benchmarking results for "missing" instances, i.e. instances for which no folder can be found in the results folder.

  • Run counterSharp on the benchmarks:
    run-counterSharp instances.csv
  • Run ApproxMC on benchmarks:
    run-approxmc instances.csv
  • Only after counterSharp has been run
  • Run GANAK on benchmarks:
    run-ganak instances.csv
  • Only after counterSharp has been run
  • Run Dimovski's tool on benchmarks:
    run-dimovski instances-dimovski.csv

Log summarization

Summarization is possible through the python script in logParsing/parse.py within the container. The script takes as input a list of benchmarks to process and returns (parts of) a LaTeX table. Note, that there must exist logs for all benchmarks provided in the CSV file for the call to succeed! - To obtain (sorted) results for deterministic benchmarks:
cat logParsing/deterministic-sorted.csv| python3 logParsing/parse.py results aggregate2 - To obtain (sorted) results for nondeterministic benchmarks:
cat logParsing/nondeterministic-sorted.csv| python3 logParsing/parse.py results nondet

Building the docker container

All tools are packaged into a Dockerfile which makes any installation unnecessary. There is, however, the need for a running Docker installation. The Dockerfile build depends on the accessibility of the following GitHub Repositories: - CryptoMiniSat - ApproxMC - Ganak - Probab_Analyzer - counterSharp

The Docker image is hosted at Dockerhub.

Cite this as

Teuber, Samuel, Weigl, Alexander (2023). Dataset: Evaluated artifact for "quantifying software reliability via model-counting". https://doi.org/10.35097/1520

DOI retrieved: 2023

Additional Info

Field Value
Imported on August 4, 2023
Last update August 4, 2023
License Other
Source https://doi.org/10.35097/1520
Author Teuber, Samuel
More Authors
Weigl, Alexander
Source Creation 2023
Publishers
Karlsruhe Institute of Technology
Production Year 2021
Publication Year 2023
Subject Areas
Name: Computer Science