ESBMC

An Efficient SMT-based Bounded Model Checker.

GitHub

Documentation
News
Publications
SV-COMP
Test-Comp
People
Applications
Download Archive
Third Party Contributions
Index of /benchmarks

Documentation

Installing ESBMC
ESBMC Features
Illustrative Examples
Witness Generation
Unwinding Assertions
Modeling with non-determinism
Falsification
Incremental BMC
k-Induction proof rule
Verification of modules that rely on larger structures
ESBMC Support

A demonstration is available here.

You can also use ESBMC via our web interface.

A set of slides about the detection of software vulnerabilities using ESBMC: Part I, Part II, and Part III.

This software security course describes further implementation details about ESBMC.

Installing ESBMC

As a prerequisite, we recommend installing ESBMC on a GNU/Linux OS (the support for MacOS is still experimental). In order to install ESBMC on your machine, you should download the ESBMC-Linux.sh (or ESBMC-Darwin.sh) and save it on your disk. After that, you should open the terminal prompt and run the shell script using the following commands.

For Linux:

$chmod +x ESBMC-Linux.sh
$./ESBMC-Linux.sh

For MacOS (experimental):

$chmod +x ESBMC-Darwin.sh
$./ESBMC-Darwin.sh

Once the user runs the above shell-script, he/she should read the license before proceeding with the installation. The ESBMC distribution is split into two directories:

If the user wants to use other SMT solvers (e.g., MathSAT, Yices, CVC4), we recommend checking out the ESBMC source code, which is hosted on GitHub, and then follow the instructions in the BUILDING file.

ESBMC Features

ESBMC aims to support all of C11, and detects errors in software by simulating a finite prefix of the program execution with all possible inputs. Classes of problems that can be detected include:

Concurrent software (using the pthread API) is verified by explicit exploration of interleavings, thus producing one symbolic execution per interleaving. By default, only normal errors will be checked for; one can also specify options to check concurrent programs for:

By default ESBMC performs a ''lazy'' depth first search of interleavings, but can also encode (explicitly) all interleavings into a single SMT formula. Currently, a number of SMT solvers are supported:

In addition, ESBMC can be configured to use the SMTLIB interactive text format with a pipe, to communicate with an arbitrary solver process, although not-insignificant overheads are involved. A limited subset of C++98 is supported too -- a library modeling the STL is also available.

ESBMC uses clang as its front-end, which brings a number of advantages:

To check all available options of the ESBMC tool, type:

$esbmc --help

Illustrative Examples

As an illustrative example to show some of the ESBMC features concerning floating-point numbers, consider the following C code:

#include <math.h>
int main() {
  unsigned int N = nondet_uint();
  double x = nondet_double();
  if(x <= 0 || isnan(x))
    return 0;
  unsigned int i = 0;
  while(i < N) {
    x = (2*x);
    assert(x>0);
    ++i;
  }
  assert(x>0);
  return 0;
}

Here, ESBMC is invoked as follows:

$esbmc file.c --floatbv --k-induction

where file.c is the C program to be checked, --floatbv indicates that ESBMC will use floating-point arithmetic to represent the program's float and double variables, and --k-induction selects the k-induction proof rule. The user can select the SMT solver, property, and verification strategy. For this particular C program, ESBMC provides the following output as the verification result:

*** Checking inductive step
Starting Bounded Model Checking
Unwinding loop 2 iteration 1 file ex5.c line 8 function main
Not unwinding loop 2 iteration 2 file ex5.c line 8 function main
Symex completed in: 0.001s (40 assignments)
Slicing time: 0.000s (removed 16 assignments)
Generated 2 VCC(s), 2 remaining after simplification (24 assignments)
No solver specified; defaulting to Boolector
Encoding remaining VCC(s) using bit-vector/floating-point arithmetic
Encoding to solver time: 0.005s
Solving with solver Boolector 3.2.0
Encoding to solver time: 0.005s
Runtime decision procedure: 0.427s
BMC program time: 0.435s

VERIFICATION SUCCESSFUL

Solution found by the inductive step (k = 2)

As an illustrative example to show some of the ESBMC features concerning pointer safety, consider the following C code:

#include <stdlib.h>
int *a, *b;
int n;
#define BLOCK_SIZE 128
void foo () {
  int i;
  for (i = 0; i < n; i++)
    a[i] = -1;
  for (i = 0; i < BLOCK_SIZE - 1; i++)
    b[i] = -1;
}
int main () {
  n = BLOCK_SIZE;
  a = malloc (n * sizeof(*a));
  b = malloc (n * sizeof(*b));
  *b++ = 0;
  foo ();
  if (b[-1])
  { free(a); free(b); }
  else
  { free(a); free(b); }
  return 0;
}
  

Here, ESBMC is invoked as follows:

$esbmc file.c --memory-leak-check

where file.c is the C program to be checked and --memory-leak-check indicates that ESBMC will check for memory leaks. For this particular C program, ESBMC produces the following counterexample:

Counterexample:

State 1 file ex2.c line 14 function main thread 0
----------------------------------------------------
  a = (signed int *)(&dynamic_1_array[0])

State 2 file ex2.c line 15 function main thread 0
----------------------------------------------------
  b = (signed int *)0

State 3 file ex2.c line 16 function main thread 0
----------------------------------------------------
  b = 0 + 1

State 6 file ex2.c line 16 function main thread 0
----------------------------------------------------
Violated property:
  file ex2.c line 16 function main
  dereference failure: NULL pointer

In the counterexample shown above, State 1 indicates that memory has been allocated, which is identified by 'dynamic_1_array'. State 2 indicates that the call to the malloc function did not succeed, and thus returned NULL, i.e., the memory was not allocated. Note that ESBMC allows the user not to check for malloc/new failure via --force-malloc-success. State 3 represents an assignment to pointer b. Lastly, State 6 reports a failure to dereference pointer b.

As an illustrative example to show some of the ESBMC features concerning concurrency, consider the following C code:

#include <pthread.h>
int n=0; //shared variable
pthread_mutex_t mutex;
void* P(void* arg) {
  int tmp, i=1;
  while (i<=10) {
    pthread_mutex_lock(&mutex);
    tmp = n;
    n = tmp + 1;
    pthread_mutex_unlock(&mutex);
    i++;
  }
  return NULL;
}
int main (void) {
  pthread_t id1, id2;
  pthread_mutex_init(&mutex, NULL);
  pthread_create(&id1, NULL, P, NULL);
  pthread_create(&id2, NULL, P, NULL);
  pthread_join(id1, NULL);
  pthread_join(id2, NULL);
  assert(n == 20);
}
  

Here, we create two threads id1 and id1; both threads will run the same code as implemented in P. Note that these two threads communicate via the shared memory n, which is protected by a mutex via pthread_mutex_lock and pthread_mutex_unlock. Note further that the thread main contains two joining points via pthread_join for id1 and id2.

ESBMC can be invoked as follows:

$esbmc file.c --context-bound 2

where file.c is the C program to be checked and --context-bound nr limits number of context switches for each thread. For this particular C program, ESBMC produces the following verification result:

*** Thread interleavings 612 ***
Unwinding loop 1 iteration 10 file test3.c line 6 function P
Unwinding loop 1 iteration 1 file test3.c line 6 function P
Unwinding loop 1 iteration 2 file test3.c line 6 function P
Unwinding loop 1 iteration 3 file test3.c line 6 function P
Unwinding loop 1 iteration 4 file test3.c line 6 function P
Unwinding loop 1 iteration 5 file test3.c line 6 function P
Unwinding loop 1 iteration 6 file test3.c line 6 function P
Unwinding loop 1 iteration 7 file test3.c line 6 function P
Unwinding loop 1 iteration 8 file test3.c line 6 function P
Unwinding loop 1 iteration 9 file test3.c line 6 function P
Unwinding loop 1 iteration 10 file test3.c line 6 function P
Symex completed in: 0.031s (431 assignments)
Slicing time: 0.001s (removed 183 assignments)
Generated 149 VCC(s), 7 remaining after simplification (248 assignments)
No solver specified; defaulting to Boolector
Encoding remaining VCC(s) using bit-vector/floating-point arithmetic
Encoding to solver time: 0.004s
Solving with solver Boolector 3.2.0
Encoding to solver time: 0.004s
Runtime decision procedure: 0.001s
BMC program time: 0.040s

VERIFICATION SUCCESSFUL

Witness Generation

When ESBMC refutes a property, it produces a counterexample that can be used to debug the program to find the root cause of the problem. For this purpose, ESBMC can produce the counterexample in graphml format to make its evaluation easier (e.g., by building a tool that allows graphical visualization).

As an illustrative example, consider the following fragment of C code, where we declare two bit-vectors of size 10 each: x and y, and then check whether x == y.

#include <assert.h>

int main() {
  _ExtInt(10) x = nondet_float();
  _ExtInt(10) y = nondet_int();
  assert(x == y);
  return 0;
}
  

If we call ESBMC as esbmc main.c --witness-output main.graphml, where main.c is the C program we want to verify while main.graphml stores the counterexample in graphml format, then ESBMC will produce the following output:

        

<?xml version="1.0" encoding="utf-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <key id="frontier" attr.name="isFrontierNode" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="violation" attr.name="isViolationNode" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="entry" attr.name="isEntryNode" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="sink" attr.name="isSinkNode" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="cyclehead" attr.name="cyclehead" attr.type="boolean" for="node">
    <default>false</default>
  </key>
  <key id="sourcecodelang" attr.name="sourcecodeLanguage" attr.type="string" for="graph"/>
  <key id="programfile" attr.name="programfile" attr.type="string" for="graph"/>
  <key id="programhash" attr.name="programhash" attr.type="string" for="graph"/>
  <key id="creationtime" attr.name="creationtime" attr.type="string" for="graph"/>
  <key id="specification" attr.name="specification" attr.type="string" for="graph"/>
  <key id="architecture" attr.name="architecture" attr.type="string" for="graph"/>
  <key id="producer" attr.name="producer" attr.type="string" for="graph"/>
  <key id="sourcecode" attr.name="sourcecode" attr.type="string" for="edge"/>
  <key id="startline" attr.name="startline" attr.type="int" for="edge"/>
  <key id="startoffset" attr.name="startoffset" attr.type="int" for="edge"/>
  <key id="control" attr.name="control" attr.type="string" for="edge"/>
  <key id="invariant" attr.name="invariant" attr.type="string" for="node"/>
  <key id="invariant.scope" attr.name="invariant.scope" attr.type="string" for="node"/>
  <key id="assumption" attr.name="assumption" attr.type="string" for="edge"/>
  <key id="assumption.scope" attr.name="assumption" attr.type="string" for="edge"/>
  <key id="assumption.resultfunction" attr.name="assumption.resultfunction" attr.type="string" for="edge"/>
  <key id="enterFunction" attr.name="enterFunction" attr.type="string" for="edge"/>
  <key id="returnFromFunction" attr.name="returnFromFunction" attr.type="string" for="edge"/>
  <key id="endline" attr.name="endline" attr.type="int" for="edge"/>
  <key id="endoffset" attr.name="endoffset" attr.type="int" for="edge"/>
  <key id="threadId" attr.name="threadId" attr.type="string" for="edge"/>
  <key id="createThread" attr.name="createThread" attr.type="string" for="edge"/>
  <key id="witness-type" attr.name="witness-type" attr.type="string" for="graph"/>
  <graph edgedefault="directed">
    <data key="producer">ESBMC 6.7.0</data>
    <data key="sourcecodelang">C</data>
    <data key="architecture">64bit</data>
    <data key="programfile">main.c</data>
    <data key="programhash">7ba149c407ef7ae9e971bbc937b37a624575d6a5</data>
    <data key="specification">CHECK( init(main()), LTL(G ! call(__VERIFIER_error())) )</data>
    <data key="creationtime">2021-06-07T13:37:38</data>
    <data key="witness-type">violation_witness</data>
    <node id="N0">
      <data key="entry">true</data>
    </node>
    <node id="N1"/>
    <edge id="E0" source="N0" target="N1">
      <data key="enterFunction">main</data>
      <data key="createThread">0</data>
    </edge>
    <node id="N2"/>
    <edge id="E1" source="N1" target="N2">
      <data key="startline">4</data>
      <data key="assumption">x = -512;</data>
      <data key="threadId">0</data>
    </edge>
    <node id="N3"/>
    <edge id="E2" source="N2" target="N3">
      <data key="startline">5</data>
      <data key="assumption">y = -166;</data>
      <data key="threadId">0</data>
    </edge>
    <node id="N4">
      <data key="violation">true</data>
    </node>
    <edge id="E3" source="N3" target="N4">
      <data key="startline">93</data>
      <data key="threadId">0</data>
    </edge>
  </graph>
</graphml>


We recommend reading Exchange Format for Violation Witnesses and Correctness Witnesses to obtain further information about violation and correctness witnesses in graphml format.

Unwinding Assertions

In ESBMC, all loops are "unwound", i.e., replaced by several guarded copies of the loop body; the same happens for backward "gotos" and recursive functions. Soundness requires that ESBMC insert a so-called unwinding assertion at the end of the loop. As an example, consider the simple C code fragment illustrated below:

1 unsigned int x=∗;
2 while ( x>0) x−−;
3 assert ( x==0);
  

Note that the loop in line 2 runs an unknown number of times, depending on the initial non-deterministic value assigned to x in line 1. The assertion in line 3 holds independent of x's initial value. BMC tools typically fail to verify programs that contain such loops. In particular, BMC tools introduce an unwinding assertion at the end of the loop, as illustrated in line 5 of this C code fragment.

1 unsigned int x=∗;
2 if(x>0)
3   x−−;   // k copies
4   ...
5 assert (!(x>0));
6 assert(x==0);
  

This unwinding assertion in line 5 causes the BMC tool to fail if k is too small as follows:

1 #include <assert.h>
2 unsigned int nondet_uint();
3 int main() {
4   unsigned int x=nondet_uint();
5   while(x>0) x--;
6   assert(x==0);
7   return 0;
8 }
  
$esbmc file.c --unwind 3
Counterexample:

State 1 file file.c line 4 function main thread 0
----------------------------------------------------
  x = 3170305 (00000000 00110000 01100000 00000001)

State 2 file file.c line 5 function main thread 0
----------------------------------------------------
  x = 3170304 (00000000 00110000 01100000 00000000)

State 3 file file.c line 5 function main thread 0
----------------------------------------------------
  x = 3170303 (00000000 00110000 01011111 11111111)

State 4 file file.c line 5 function main thread 0
----------------------------------------------------
Violated property:
  file file.c line 5 function main
  unwinding assertion loop 

Modeling with non-determinism

ESBMC extends C with three modeling features:

__ESBMC_assert(e): aborts execution when e is false.

void __ESBMC_assert (e, "some message here");

nondet_X(): returns non-deterministic X-value, with X in {bool, char, int, float, double, loff_t, long, pchar, pthread_t, sector_t, short, size_t, u32, uchar, uint, ulong, unsigned, ushort} (no side effects, pointer for void *, etc.). ESBMC assumes that the functions are implemented according to the following template:

X nondet_X () { X val; return val; }

__ESBMC_assume(e): "ignores" execution when e is false, no-op otherwise.

void __ESBMC_assume(e);

__ESBMC_atomic_begin(), __ESBMC_atomic_end(): For modeling an atomic execution of a sequence of statements in a multi-threaded run-time environment, those statements can be placed between two function calls.

__ESBMC_atomic_begin();
//shared memory
__ESBMC_atomic_end();

As an illustrative example to show some of the ESBMC features to model non-determinism, consider the following C code:

1 int main() {
2   int x=nondet_int(),y=nondet_int(),z=nondet_int();
3   __ESBMC_assume(x > 0 && y > 0 && z > 0);
4   __ESBMC_assume(x < 16384 && y < 16384 && z < 16384);
5   assert(x*x + y*y != z*z);
6   return 0;
7 }
  

Here, ESBMC is invoked as follows:

$esbmc file.c

For this particular C program, ESBMC produces the following counterexample:

Counterexample:

State 1 file file.c line 2 function main thread 0
----------------------------------------------------
  x = 252 (00000000 00000000 00000000 11111100)

State 2 file file.c line 2 function main thread 0
----------------------------------------------------
  y = 561 (00000000 00000000 00000010 00110001)

State 3 file file.c line 2 function main thread 0
----------------------------------------------------
  z = 615 (00000000 00000000 00000010 01100111)

State 6 file file.c line 5 function main thread 0
----------------------------------------------------
Violated property:
  file file.c line 5 function main
  assertion
  (_Bool)(x * x + y * y != z * z)

VERIFICATION FAILED

Falsification

$esbmc file.c --falsification

Our falsification approach (--falsification) uses an iterative technique and verifies the program for each unwind bound up to either a maximum default value of 50 (which can be changed via --max-k-step nr), or indefinitely (until it exhausts the time or memory limits). Intuitively, we aim to find a counterexample with up to k loop unwindings. The algorithm relies on the symbolic execution engine to increasingly unwind the loop after each iteration.

This approach replaces all unwinding assertions (e.g., assertions to check if a loop was completely unrolled) with unwinding assumptions. Normally, this would lead to unsound behaviour but, since the falsification algorithm cannot provide correctness validation, it will not affect the search for bugs. This approach is focused on bug finding and does not care if a loop was not completely unrolled; it only cares if the current number of unwindings will lead to a property violation.

The falsification algorithm also offers the option to change the granularity of the increment; the default value is 1, but can be increased in order to meet any expected behaviour via --k-step nr. Note that changing the value of the increment can lead to slower verification time and might not present the shortest counterexample possible for a property violation.

Incremental BMC

$esbmc file.c --incremental-bmc

Our incremental BMC approach (--incremental-bmc) uses an iterative technique and verifies the program for each unwind bound up to either a maximum default value of 50, which can be modified via --max-k-step nr, or indefinitely (until it exhausts the time or memory limits). Intuitively, we aim to either find a counterexample with up to k loop unwinding or to fully unwind all loops so we can provide a correct result. The algorithm relies on the symbolic execution engine to increasingly unwind the loop after each iteration of the algorithm.

The approach is divided in two steps: one that tries to find property violations and one that checks if all the loops were fully unwound. When searching for property violation, the tool replaces all unwinding assertions (e.g., assertions to check if a loop was completely unrolled) with unwinding assumptions. Normally, this would lead to unsound behaviour, however, the first step can only find property violations and reporting an unwinding assertion failure is not a real bug. The next step is to check if all loops in the program were fully unrolled. This is done by checking if all the unwinding assertions are unsatisfiable; note that checking any other assertion in the program, for the current k, is not necessary as they were already verified.

The algorithm also offers the option to change the granularity of the increment; the default value is 1, but can be increased in order to meet any expected behaviour via --k-step nr. Note that changing the value of the increment can lead to slower verification time and might not present the shortest counterexample possible for the property violation.

k-Induction proof rule

$esbmc file.c --k-induction

The original k-induction algorithm (--k-induction) presented by Sheeran et al. [1] was used to prove safety properties in hardware verification. The algorithm was later refined by Alastair et al. [2] and applied to the verification of general C programs. Our algorithm is a combination of both approaches. It can be summarized as follows:

\begin{equation} \begin{array}{rcrrl} & \neg & B(k) & \rightarrow & \text{program contains bug} \\ B(k) & \wedge & F(k) & \rightarrow & \text{program is correct} \\ B(k) & \wedge & I(k) & \rightarrow & \text{program is correct} \end{array} \end{equation}

Here B(k) is the base case, F(k) is the forward condition and I(k) is the inductive step; k is the number of loop unwinding used for each step. For the base case we use the plain BMC technique, hence we can only find property violations here. If the base case error check is satisfiable, then the algorithm presents a counterexample of length k. For the forward condition and inductive step, the base case must be checked for satisfiability before the result is presented. This is a soundness requirement of the technique.

The forward condition attempts to prove that all loops in the program were fully unrolled; this is achieved by adding unwinding assertions after all loops. The forward condition is further optimized to only check the unwinding assertions, as all program assertions are already proven to be unsatisfiable by the base case, for the current value of k. The inductive step attempts to prove that, if the property is valid for k iterations, then it must be valid for the next iteration; this is achieved by assigning nondeterministic values to all variable written inside a loop body, assuming k-1 invariants and checking if the invariant holds at the kth iteration.

The algorithm starts with k = 1 and increases it up to a maximum number of iterations, incrementally analysing the program, until it either finds a bug (i.e., the base case is satisfiable for some k), proves correctness (i.e., the base case is unsatisfiable and either the forward condition or inductive step is unsatisfiable for some k), or exhausts either time or memory constraints.

[1] Mary Sheeran, Satnam Singh, Gunnar Stålmarck: Checking Safety Properties Using Induction and a SAT-Solver. FMCAD 2000: 108-125

[2] Alastair F. Donaldson, Leopold Haller, Daniel Kroening, Philipp Rümmer: Software Verification Using k-Induction. SAS 2011: 351-368

Verification of modules that rely on larger structures

ESBMC can verify code that relies on existing infrastructures and must be compliant with those. Consider the following C program where the verification engineer wants to check whether the assert-statement in line 8 holds.

1 #include "lib.h"
2 // Running with esbmc  --overflow-check main.c lib.c
3 int main() {
4   int64_t a;
5   int64_t b;
6   int64_t r;
7   if (mul(a, b, &r)) {
8     __ESBMC_assert(r == a * b, "Expected result from multiplication");
9   }
10   return 0;
11 }
  
The function mul is implemented in the library "lib.h", which is located under "/lib". Here, ESBMC is invoked as follows:
$esbmc main.c --overflow-check -I lib/ lib/lib.c

where main.c is the C program to be checked, --overflow-check enables arithmetic over- and underflow check, and -I path sets the include path. For this particular C program, ESBMC produces the following counterexample:

Counterexample:

State 1 file lib.c line 14 function mul thread 0
----------------------------------------------------
Violated property:
  file lib.c line 14 function mul
  arithmetic overflow on mul
  !overflow("*", a, b)

VERIFICATION FAILED

The library header and implementation files located under /lib are:

1 #include
2 _Bool mul(const int64_t a, const int64_t b, int64_t *res);
  
 1 #include "lib.h"
 2 _Bool mul(int64_t a, int64_t b, int64_t *res) {
 3   // Trivial cases
 4   if((a == 0) || (b == 0)) {
 5     *res = 0;
 6     return 1;
 7   } else if(a == 1) {
 8     *res = b;
 9     return 1;
10   } else if(b == 1) {
11     *res = a;
12     return 1;
13   }
14   *res = a * b; // there exists an overflow
15   return 1;
16 }  

ESBMC Support

We are still increasing the robustness of ESBMC and also continuously implementing new features, more optimizations and experiencing new encodings. For any question about ESBMC, please contact us via https://github.com/esbmc/esbmc.