The multithreading bug in Isabelle is fascinating from a systems design perspective: threads racing ahead assuming a stuck proof will succeed, creating false positives. That's exactly the kind of subtle concurrency issue that makes verification tools need... verification tools.
I don't think it's that subtle: it would be enough to have a continuous integration tool that tries to compile the proofs that are checked into version control and raises an error if it fails.
The linked blog post says "Unfortunately, a student once submitted work containing this error; it was almost entirely incorrect and he had no idea." I guess the student probably was aware that not every proof had gone through, and that the that he saw like "99 out of 100 proofs succeeded" and assumed that meant that he had mostly completed the task, not realizing that a false theorem would be used to the give incorrect proofs for the rest of the file.
The multithreading bug in Isabelle is fascinating from a systems design perspective: threads racing ahead assuming a stuck proof will succeed, creating false positives. That's exactly the kind of subtle concurrency issue that makes verification tools need... verification tools.
I don't think it's that subtle: it would be enough to have a continuous integration tool that tries to compile the proofs that are checked into version control and raises an error if it fails.
The linked blog post says "Unfortunately, a student once submitted work containing this error; it was almost entirely incorrect and he had no idea." I guess the student probably was aware that not every proof had gone through, and that the that he saw like "99 out of 100 proofs succeeded" and assumed that meant that he had mostly completed the task, not realizing that a false theorem would be used to the give incorrect proofs for the rest of the file.