Tips on framing, loops, and debugging

Feb 22, 2013 at 9:49 PM
This message explains some things about framing, using a particular program as a starting point, highlighting some common errors, giving some tips on how to avoid the complaints, and showing a useful debugging technique. The program I'm using as a starting point is found here.
Feb 22, 2013 at 9:51 PM
The first issue with the program relates to framing. Frames in Dafny (modifies clauses on methods and loops, and reads clauses on functions, for example) are done at the granularity of objects. That is, a modifies clause on a method specifies a set of objects whose fields are allowed to be modified. (It is actually possible to give some frames at the field granularity, but let me ignore that here, because it is usually not needed.) By default, Dafny applies the modifies clause of a method to the loops inside the method, which is often what one wants. But if you want something smaller, you can write loop invariants about what goes unchanged or specify a modifies clause directly on the loop.

The example program looks like:
class ArraySet {
  // ...
  var nextArray: array<int>;
  ghost var freeSeq: seq<int> ; 

  method init(size: int) 
    modifies this; 
  //...
  {
    nextArray := new int[size];
    // ...
    while (i < nextArray.Length - 1) 
        modifies nextArray;
        modifies freeSeq;
    // ...
    { 
       freeSeq := freeSeq + [i+1]; 
       nextArray[i] := i+1;
       i := i + 1;
    }
    // ...
  }
}
This program gives a complaint about "modifies freeSeq", because "this.freeSeq" does not denote a single object or a set of objects. In fact, there is no reason to mention it in the modifies clause--what you'd want instead is permission to modify the fields of "this".

Having removed "modifies freeSeq" from the loop, the next complaint is that the assignment to "this.freeSeq" is not allowed. This is because the loop restricts updates to "modifies nextArray", which does not allow fields of "this" to be modified. There are two options here. One is to add "this" to the modifies clause of the loop, like:
modifies this, nextArray;
The other option is to remove the modifies clause on the loop altogether, instead relying on the modifies clause of the enclosing method ("modifies this"). I'd go for the latter, since that is simpler. But in either case, you will now get a complaint that the update of "nextArray[i]" inside the loop is not allowed. Let's see why.

In addition to the explicitly given modifies clause, a method is always allowed to allocate new objects and modify their state. We refer to any such object as "fresh". More precisely, an object is "fresh" in a method if the object was allocated (by the method itself or by something that the method called) after the method was invoked. In your program, "nextArray" is assigned a newly allocated value at the beginning of the method. This means that you are allowed to modify the elements of "nextArray". So then, why do you get a complaint? The reason is that the loop allows modifications of the fields of "this", so Dafny thinks that loop iterations may potentially modify the value of the field "this.nextArray", and if the loop set "this.nextArray" to point to an array that was not newly allocated, then an assignment to "nextArray[i]" may not be allowed.

To fix this, you need to add a loop invariant that says that "this.nextArray" is not changed inside the loop. Actually, you can state something weaker, which is that "this.nextArray" is invariantly a freshly allocated array:
  invariant  fresh(nextArray);
With this invariant on the loop, and without any modifies clauses on the loop, the errors related to framing go away.
Feb 22, 2013 at 10:16 PM
The verifier can tell, by a syntactic inspection, what local variables may be modified by a loop. But if the loop makes any modification of the heap (updating a field, updating an array elements, allocating an object, or invoking a method, even a method with an empty modifies clause since such a method is still allowed to allocate objects), then it will appear to the verifier that the entire heap may change. In such a case, the governing modifies clause helps, because it restricts the places in the heap where modifications may occur. Whenever anything changes in a loop, one expects to have to write a loop invariant that confines the changes. If heap modifications are involved, this can sometimes get complicated. For this reason, if it happens to be simple to avoid heap modifications in a loop, then that would lead to a simpler-to-specify program. Also, if loops can be avoided, then that too simplifies the program.

As it turns out, the loop in this program can be simplified in both ways. The effect of the loop is to build "this.freeSeq" into a particular integer sequence, and at the same time to initialize "this.nextArray" with particular array elements. Let's consider each in turn.

The sequence to which "this.freeSeq" should be initialize requires some iteration to build up. If we modify "this.freeSeq" inside the loop body, then the verifier will see that the heap changes and will then require loop invariants (such as "fresh(nextArray)", as discussed above). If this is unavoidable, then it is sometimes possible to move other assignments (like "nextArray := new int[size]") to after the loop, so that there's no need to remember, across the loop iterations, what had been done before the loop. But to initialize "this.freeSeq" in the program at hand, we can do something even simpler: Use a loop to initialize a local variable to the appropriate sequence, and then, after the loop, assign that local variable to "this.freeSeq". This can be done with:
    var fs, i := [], 0;
    while (i < nextArray.Length)
      invariant 0 <= i <= nextArray.Length;
      invariant |fs| == i;
      invariant forall j :: 0 <= j < i ==> fs[j] == j;
    {
      fs, i := fs + [i], i + 1;
    }
    freeSeq := fs;
It's a bit unfortunate that Dafny does not have a sequence-comprehension expression; hence the need for this loop. Again, since this loop does not modify the heap, the loop invariant only needs to talk about the local variables that are changed, which simplifies the loop invariant dramatically (compared to the original program).

Let's now consider the initialization of the array "nextArray". Here, Dafny provides a construct that avoids the loop altogether, namely the aggregation statement "parallel". It lets us perform an assignment to each array element simultaneously. Not only does this make for a more readable program, but it also avoids the loop which in turn avoids the need for a loop invariant. Here's one possible way to do it:
    parallel (j | 0 <= j < nextArray.Length - 1) {
      nextArray[j] := j + 1;
    }
    nextArray[nextArray.Length - 1] := -1;
Here's another, equivalent, way:
    parallel (j | 0 <= j < nextArray.Length) {
      nextArray[j] := if j < nextArray.Length - 1 then j + 1 else -1;
    }
Feb 22, 2013 at 11:10 PM
We've now simplified the code of the "init" method substantially. Here's what my version looks like now:
  method init(size: int) 
    modifies this; 
    requires size > 0;
    ensures nextArray != null;
    ensures forall j :: 0 <= j < nextArray.Length ==> -1 <= nextArray[j] < nextArray.Length;
    ensures arrayInv();
  {
    elementArray := new int[size];

    nextArray := new int[size];
    parallel (j | 0 <= j < nextArray.Length) {
      nextArray[j] := if j < nextArray.Length - 1 then j + 1 else -1;
    }
    
    usedHd := -1;
    freeHd := 0;
    usedSeq := []; 

    var fs, i := [], 0;
    while (i < nextArray.Length)
      invariant 0 <= i <= nextArray.Length;
      invariant |fs| == i;
      invariant forall j :: 0 <= j < i ==> fs[j] == j;
    {
      fs, i := fs + [i], i + 1;
    }
    freeSeq := fs;
  }
What remains in the verification of this program is the hardest part--but as we'll soon see, once we figure out what the problem is (that is the hard part), the solution is simple.

The next problem is that the postcondition "ensures arrayInv()" does not verify. With different versions of the "init" method body, sometimes I get the postcondition violation reported, but sometimes I also received a time-out from the verifier. The time-out is not so helpful, because it does not pin-point where the verifier was spending its time when time ran out. To debug a time-out, you currently have to temporarily comment out various things and see if that gets around the time-out. Often, but not always, you'll find that the problem is something that Dafny has problems verifying. The verifier may then try extra hard to prove it, so hard that it gets totally bogged down and runs out of time.

Dafny reports a postcondition violation for "ensures arrayInv()", where "arrayInv()" is a user-defined predicate. Dafny also highlights, inside the body of "arrayInv()", which conjunct(s) it cannot prove. In this case, it points us to the line:
    (forall j :: 0 <= j < nextArray.Length ==> ((j in usedSeq) || (j in freeSeq))) && 
Perhaps this is a bug in our program, perhaps it indicates an error in the specification, or perhaps the error report comes from the verifier's inability to construct an appropriate proof. Here's how I would debug this problem.

First, copy the offending line to an assert statement at the end of the method body:
    assert forall j :: 0 <= j < nextArray.Length ==> j in usedSeq || j in freeSeq;
What you should get now is a complaint about this assertion. But assuming that the asserted condition holds, we expect the postcondition to be verified. That is indeed the case here. So, what remains in our debugging is to figure out why we're getting a complaint about this assertion.

Looking at the program, we expect that what the loop invariant says about "fs" should hold about "freeSeq" after the assignment "freeSeq := fs;". And knowing "freeSeq[j] == j" about each index j ought to mean that "j in freeSeq" holds for each index, which in turn ought to imply the assert we added. To test this hypothesis, I would now write these things down. I'll write two new assert statements, so we end up with the following 3 assertions at the end of the "init()" method body:
    assert forall j :: 0 <= j < nextArray.Length ==> freeSeq[j] == j;
    assert forall j :: 0 <= j < nextArray.Length ==> j in freeSeq;
    assert forall j :: 0 <= j < nextArray.Length ==> j in usedSeq || j in freeSeq;
Dafny proves the first one, it complains about the second one, it proves the third one (which follows from the second assertion), and it proves the postcondition of the method (which follows from all the things that were proved automatically before, plus the third assertion). So, what we have now discovered is that either our understanding of why the second assertion follows from the first is wrong, or Dafny is unable to prove that for us.

We're getting closer. Why can Dafny not prove the second assertion? To explore this further, it would be nice to "zoom in" on the proof argument for each j. We can do this with a "parallel" statement. We can replace the second assert with something of the form:
    parallel (j | 0 <= j < nextArray.Length)
      ensures j in freeSeq;
    {
      // ...
    }
(Alternatively, you can precede the second assertion with this parallel statement, to make sure you're stating the right thing. The second assertion gives the same condition that this parallel statement ensures, so we now expect to see a complaint about the "ensures" clause of the parallel statement, but no other complaints.)

Let us now add to the body of the parallel statement what we believe to be true, namely:
      assert freeSeq[j] == j;
Evidently, Dafny proves this assertion and is then able to prove the ensures clause of the parallel statement, which for all j's of the parallel statement implies the second assertion, which implies the third, which makes the whole method verify.

What I have just showed is a fairly typical way to go about debugging what's going wrong. You can closer by introducing assertions. And for this program, it was also useful to "zoom in" on the proof obligation by opening up a scope that allows us to more easily speak about each "j".

What I'm about to say next is more advanced; ignore it if you don't care about the details. How come what we did helped Dafny construct the proof? I did not dive into the details to try to understand what Boogie programs/formulas were generated underneath Dafny's covers or how the underlying SMT solver Z3 dealt with the formulas, but here's my hunch. The expression "j in freeSeq" gives rise to an existential quantifier, something like:
exists k :: 0 <= k < |freeSeq| && freeSeq[k] == j
Often when we deal with existential quantifiers, it is necessary to provide an explicit witness. In the process described above, we may have thought that the invariant freeSeq[j]==j would give that witness. It does, but this witness expression sits inside a universal quantifier (which Z3 has to instantiate). The proof goal (here, the second assertion) is also a universal quantifier; since it's the proof goal, it will be negated and Skolemized, but the negation turns the "exists k..." into a universal quantifier (which Z3 has to instantiate). Z3 then sits there with two contradictory universal quantifiers, but it doesn't know it because it doesn't know how to instantiate either of them. But providing the parallel statement, we get the opportunity to give the witness closer to the existential quantifier, which does the trick. A different way to do this would be to not use the parallel statement and instead replace the second assertion with:
    assert forall j :: 0 <= j < nextArray.Length ==> freeSeq[j] == j ==> j in freeSeq;        // (*)
Here, too, we have placed the "freeSeq[j] == j" expression, which gives the witness, inside one level of universal quantification, thus getting closer to the existential entailed by the "in". But this assertion may not have been as easy to invent, especially not if the formulas involved were more complicated than in this example.

In summary, now that we have figured out the extra information that Dafny needed to complete the proof, we can (if we so choose) delete the extraneous assertions that were needed only temporarily while we were debugging). You would then be left with, at the end of the "init()" method, the parallel statement shown above. Or, instead of the parallel statement, you can give the assertion (*).
Feb 22, 2013 at 11:14 PM
Here is the final program (with various unneeded parentheses removed).
Feb 25, 2013 at 10:06 PM
This is very nice. Thank you for the comprehensive explanation and the well-structured code! One question I have is on the operational meaning of the 'parallel' statement. Can these statements be compiled by the Dafny compiler? Where could I find out a bit more about them?