Nondeterminism in Dafny 1.8 interfering with student assignments

Jun 4, 2014 at 10:41 PM
Hi ---

Dafny 1.8 (see below) is sometimes failing to verify programs that V1.6 verifies every time. For a particular short program but of moderate complexity (involving e.g. arrays and sequences) it's become so common that I have had to make a shell script that keeps running Dafny (with an adjustable timeout) "until it works".

I understand the issues (I believe); but this particular program is a student assignment and (I'll warrant) they would have some trouble with the advice "If your program doesn't verify, just try verifying it a second (third, fourth...) time." ;-)

Luckily they have access to V1.6, which seems to be more consistent. But I thought I'd bring this to your attention, since it might indicate there's some new strategy in V1.8 that's not working as well as you'd hoped.

Below is a clipping of a "dafnies" run (my shell file) that worked only on the fourth try.

=== Thanks for any advice!

// dafnies f.dfy S T
// runs dafny for up to S seconds and, if it fails or times-out,
// runs it again, but only T times maximum.
// Here goes...

Keizer-Karel% clear;dafnies MYcpC.dfy 10 10

------- dafny MYcpC.dfy
Thu 5 Jun 2014 08:30:52 EST
Dafny program verifier version 1.8.2.10419, Copyright (c) 2003-2014, Microsoft.
MYcpC.dfy(54,36): Error BP5005: This loop invariant might not be maintained by the loop.
Execution trace:
(0,0): anon0
MYcpC.dfy(49,2): anon34_LoopHead
(0,0): anon34_LoopBody
MYcpC.dfy(49,2): anon35_Else
MYcpC.dfy(49,2): anon41_Else
(0,0): anon42_Then
(0,0): anon43_Then
(0,0): anon44_Then
(0,0): anon21
(0,0): anon45_Then
(0,0): anon33
Dafny program verifier finished with 5 verified, 1 error

real 0m4.284s
user 0m3.982s
sys 0m0.623s
------- Not yet (1).

------- dafny MYcpC.dfy
Thu 5 Jun 2014 08:30:57 EST
Dafny program verifier version 1.8.2.10419, Copyright (c) 2003-2014, Microsoft.
MYcpC.dfy(54,36): Error BP5005: This loop invariant might not be maintained by the loop.
Execution trace:
(0,0): anon0
MYcpC.dfy(49,2): anon34_LoopHead
(0,0): anon34_LoopBody
MYcpC.dfy(49,2): anon35_Else
MYcpC.dfy(49,2): anon41_Else
(0,0): anon42_Then
(0,0): anon43_Then
(0,0): anon44_Then
(0,0): anon21
(0,0): anon45_Then
(0,0): anon33
Dafny program verifier finished with 5 verified, 1 error

real 0m4.550s
user 0m4.089s
sys 0m0.607s
------- Not yet (2).

------- dafny MYcpC.dfy
Thu 5 Jun 2014 08:31:01 EST
Dafny program verifier version 1.8.2.10419, Copyright (c) 2003-2014, Microsoft.
MYcpC.dfy(54,36): Error BP5005: This loop invariant might not be maintained by the loop.
Execution trace:
(0,0): anon0
MYcpC.dfy(49,2): anon34_LoopHead
(0,0): anon34_LoopBody
MYcpC.dfy(49,2): anon35_Else
MYcpC.dfy(49,2): anon41_Else
(0,0): anon42_Then
(0,0): anon43_Then
(0,0): anon44_Then
(0,0): anon21
(0,0): anon45_Then
(0,0): anon33
Dafny program verifier finished with 5 verified, 1 error

real 0m4.298s
user 0m3.994s
sys 0m0.620s
------- Not yet (3).

------- dafny MYcpC.dfy
Thu 5 Jun 2014 08:31:06 EST
Dafny program verifier version 1.8.2.10419, Copyright (c) 2003-2014, Microsoft.

Dafny program verifier finished with 6 verified, 0 errors

real 0m4.294s
user 0m4.003s
sys 0m0.627s
------- Yes (4).

Keizer-Karel%
Coordinator
Jun 11, 2014 at 1:38 AM
Hi,

I ran the program you sent me separately through version 1.8 of Dafny. Unfortunately (!), I get it to verify consistently. I'm using a (64-bit) PC. It could be that there are differences in memory layout etc, as well as some possible differences in Z3 (I'm using version Z3 version 4.1, which is what we recommend, although I've heard reports that version 4.3 also does fine). I also tried out the program on rise4fun which (is backed by a PC running 32-bit .NET and) uses Dafny 1.8 and Z3 4.1, and the program verifies there too.

I'm sympathetic to your complaint. In fact, I've spent a good part of today trying to get a little change to Dafny through the test suite and I'm experiencing the same kind of now-it-verifies/now-it-don't problem, which is very frustrating. If you have other versions of the program you'd like to send me, I'm happy to try them out as well, with the hope that perhaps one of them will reproduce in a way that leads to effective debugging. It would also be nice to build a test suite of your programs, which may let us detect more quickly when some change affects performance.

Sorry to not have more useful information.

Rustan