Automated Theorem Proving
Download
Report
Transcript Automated Theorem Proving
Automated Theorem Proving:
A Retrospection
&
Applications of Formal Methods
CS3234
Aquinas Hobor and Martin Henz
Outline
• Reflections on Coq
• Applications of Formal Methods
• How to study for the Final
2
A proverb
“Only those who have been bitten by the snake
can understand how it feels.”
3
Why are theorem provers used?
Very high assurance due to mechanical checking
• Checkers are very through: don’t get tired, don’t
get bored, don’t make mistakes
• If anything, the problem is the opposite – trying
to convince a checker that a true thing is true
can be frustrating.
4
In our class…
• Many times I helped students who were sure
that something was true, and just needed a
little help to understand how to convince Coq.
• Usually it was not more than a line or two…
5
In our class…
• Many times I helped students who were sure
that something was true, and just needed a
little help to understand how to convince Coq.
• Usually it was not more than a line or two…
• … several times the difficulty was that the
thing that they were sure was true was
actually false. (e.g., n – 1 + 1 = n : Not true!)
6
There are some contexts where bugs
have enormous cost.
• Proofs about real software are hard to get right;
Coq found a previously unknown bug in the proof
in the (widely-used, second-edition) textbook.
7
Disadvantages of
Automated Theorem Proving
• Developing the hints / proof by hand can be
very labor-intensive
• It can be very difficult to formalize correctness
– “correct” operating system?
– “correct” web browser?
– “correct” compiler?
• Learning curve to use systems can be steep
8
Disadvantages of
Automated Theorem Proving
• Developing the hints / proof by hand can be
very labor-intensive
• It can be very difficult to formalize correctness
– “correct” operating system?
– “correct” web browser?
– “correct” compiler?
Now you
understand…
• Learning curve to use systems can be steep
9
One more advantage…
they are fun to use!
• A bit like writing software in a scripting language
“Building such scripts is surprisingly addictive,
in a videogame kind of way…”
- Xavier Leroy
• The advantage of never having to worry about bugs
in the finished product
• Can work on math at 3 AM without fear
10
One more advantage…
they are fun to use!
• A bit like writing software in a scripting language
“Building such scripts is surprisingly addictive,
in a videogame kind of way…”
Unclear if this is
- Xavier Leroy
an advantage
• The advantage of never having to worry about bugs
in the finished product
• Can work on math at 3 AM without fear
11
In your own words…
• “I found COQ is really an amazing tool. I like to play
with it now. haha!”
• “I love video games. Coq is the first programming
language I use as a game. I really enjoy it haha!”
• “[Proof completed.] is the best feeling in the world!”
• (and many others)
• Of course, it’s harder to give negative feedback, so…
12
In the words of Nick Benton
Senior Researcher, Microsoft Research, Cambridge UK
Wrote a paper in 2006 titled,
“Machine Obstructed Proof”
http://research.microsoft.com/
en-us/um/people/nick/mop.pdf
13
In the words of Nick Benton
“After years doing programming language theory without
going near a proof assistant, I was finally convinced by the
POPLmark ‘buzz’ and conversations at ICFP’05 that it was time
to try one.”
… [there was a workshop giving an introduction to Coq] …
“The workshop description says “the available tools are [...]
difficult to learn, inadequately documented, and lacking in
specific library facilities required for work in programming
languages”. I can confirm that I have rarely felt as stupid and
frustrated as I did during my first few weeks using Coq.”
14
In the words of Nick Benton
“Scripts are unreadable by themselves, as one has no idea what
the tactics are doing to the proof state, and the documentation for
them is incomprehensible to the novice. The only thing that works
is lots and lots of trial and error in an interactive environment, and
I still couldn’t give a coherent general description of what some of
the tactics I’ve used many times actually do, or how they differ
from half a dozen apparently similar ones. And basic ones are still
missing; I spent days fighting with elim, case, destruct and
variations on induction and still kept finding myself having done
case splits without the information about which branch I was in.
This was so frustrating I gave up on Coq (and spent a week playing
with HOL Light) until Georges Gonthier showed me the magic, and
frankly bizarre, incantation
generalize (refl_equal x); pattern x at -1; case x.”
15
In the words of Nick Benton
“There are bugs. On Windows, CoqIDE falls over,
whilst Proof General only works with the original
2004 version of Coq 8. I spent ages defining
Setoid structures on everything, only to find
Setoid rewriting throws an ‘anomaly’ exception
in interesting contexts.”
(you may recall a HW assignment (hw5) where I
said to use “generalize” instead of “spec”)
16
In the words of Nick Benton
“I had many similar difficulties, but then started to make
progress. Just having intermediate stages of the work in a
computerized form rather than on many pages of paper
proved a major benefit. Far more often than I’d expected,
one can alter definitions and then mildly tweak the
previous version of a proof to keep it up to date. On
paper, I tend to keep going back to the top and doing
everything from scratch to be sure everything is still
consistent.”
17
In the words of Nick Benton
“Automated proving is not just a slightly more fussy version
of paper proving and neither (Curry-Howard notwithstanding)
is it really like programming. It’s a strange new skill, much
harder to learn than a new programming language or
application, or even many bits of mathematics. I’m resistant
to investing significant effort in tools (I don’t write clever TeX or
Emacs macros), but the payoff really came the second time I
used Coq: I was able to prove some elementary but delicate
results for a different paper in just a day or so. Coq is worth
the bother and it, or something like it, is the future, if only we
could make the initial learning experience a few thousand
times less painful.”
18
Of note
• There is a somewhat similar course being
offered in the past year to undergraduates
– Using Coq to examine program semantics
– University of Pennsylvania
– Princeton University
– Harvard University (a bit different)
– Probably others
• For good or evil, you have been on the
bleeding edge of formal methods teaching…
19
So…
• Well done for getting through it!
• Hopefully you have learned a lot (we noticed that
your pen-and-paper induction proofs improved
significantly), and had some fun as well.
• With a bit of luck you don’t hate your Profs (too
much!) for putting it in the course… If you have
suggestions for making the process easier in future
years, please let us know.
20
Outline
• Reflections on Coq
• Applications of Formal Methods
• How to study for the Final
21
How do people use formal methods?
1. Type theory
2. Program Analysis
3. Proof-Carrying Code
4. Certified compilation
22
Type Theory
• You are familiar with languages like Java, where your
variables (& function parameters, etc.) have a
declared type.
– int myInt;
– List<List<Int>> myListofListofInts;
– Object myObject;
• The idea is to classify values into sets. During
compilation, there is a phase called type-checking,
where the compiler looks for type errors, such as
– myListofListofInts := 5; // Oops… what does this mean?
23
Type Theory
• Why do we do this? Because type errors almost always
are mistakes in the program.
– Analogy is to units in science (e.g., a Newton of force is 1
kilogram-meter per second squared).
– You check your units when you do science for the same
reason that it is a good idea to do type checking when
programming.
• We can (usually) do type-checking in ~ linear time, and
can discover many bugs much faster than testing.
– This is a form of proof generation (which we have only
explained very little of in this course, but have used a lot,
e.g., “auto”, “omega”, etc.)
– The idea with a type system is to trade off expressivity for
automatic generation.
24
Type Theory
• The field that studies program typing is called Type Theory.
• It turns out that there is a deep connection between formal
logic and type theory, called the Curry-Howard Isomorphism.
• Languages with strong type systems include
– Java / C#
– SML / oCaml
– Haskell
• Languages with weak type systems include
– C / C++
• Languages without any type systems include
– Python/JavaScript
– Assembly (some have weak typing)
25
Program Analysis
• Another use for formal methods is automatic
program analysis; e.g., bounds checking.
• Lots of code uses integers that are supposed to
stay within certain bounds
– Arrays
– Overflow (max_int + 1 – bad idea)
• It is possible to develop software that examines
source code and automatically proves that (for
example) an array access is always in-bounds.
26
Bounds Checking
• These automatic tools usually require a bit of
help from the programmer.
• But they are very effective: can catch many bugs
long before they would have been caught
through testing.
• Companies that use this kind of technology:
– Airbus
– Microsoft
– etc.
27
Bounds Checking
How it works (one choice)
• Programmer puts annotations in his code
– int x __inrange(0,10);
– int n __lessthan(sizeof(myArray));
• The tool then uses a form of Hoare logic to
propagate these bounds in the code
• If it is unable to prove that the invariant is
obeyed, it complains
• To run on real code, tools must be very complex
– Pointer arithmetic, aliasing, etc. etc.
28
Proof-Carrying Code
• Problem: we would like to have a way to get
code from an untrusted source, which we
then want to run on our computer
– e.g., Web Browser
• This sounds like a bad idea…
• But (it turns out) people really want to do it.
29
Proof-Carrying Code
• So, we will require that the untrusted code
provider sends us both
– The code, which we will run only after checking
– A proof that the code obeys some safety policy
• Simplest version you are familiar with:
– Java web applets
30
Proof-Carrying Code
• How it works:
– Modify compiler to take a proof at the source
code level (maybe just a proof that the source
program is well-typed) and transform that proof
as it compiles into a proof at the machine level.
– Modify runtime system to accept both code and
proof; it must check proof, then run code
31
Certified Compilation
• Bugs in a program that are due to a mistake in
the compiler (“compiler bugs”) are some of
the most difficult to find – can take weeks or
months.
• The problem is, the source code is right.
• Compilers are some of the most complicated
pieces of software there are.
– Hundreds of thousands of lines, highly
complicated algorithms
32
Certified Compilation
• The goal of a certified compiler is to have a
compiler with no bugs.
– That is, the source program is always correctly
translated into the target language.
• We have seen in lecture 12 some of the tools
used: you give both the source and target
languages a formal semantics, and then prove:
– ¾ ¾’
)
C(¾) * C(¾’)
– That is, that the compiler C preserves the
semantics of the program.
33
Outline
• Reflections on Coq
• Applications of Formal Methods
• How to study for the Final
34
Final
• There is a paper part and a Coq part
– Total: 90 points
– Paper: 50 points, 60%
– Coq: 40 points, 40%
• You will have two hours (combined) to do
both parts.
– You can switch back and forth if you like.
35
Paper part: Format
• Fair game topics:
– Everything covered in lectures 1-12, midterm, quizzes
1-3, homework 1-10
• Not multiple choice; instead:
36
Paper part: Format
• Fair game topics:
– Everything covered in lectures 1-12, midterm, quizzes
1-3, homework 1-10
• Not multiple choice; instead:
• Proofs! (Oh boy!)
– Natural deduction
– Induction
– Semantic proofs
– Hoare proofs
– etc.
37
Coq part: Format
• Five Questions
1.
2.
3.
4.
5.
First-order logic *
Induction
Modal logic
Modal logic *
Hoare logic
* Similar to questions 1 and 2 on the Coq quiz
38
Coq part: Format
• Each problem is worth 9 points (out of 90, so 40%)
• Maximum 36 points: pick 4 out of the 5
– Skip one if you get stuck!
• No extra credit
– Spend the extra time on the paper part
• We are assuming that it will take about an hour
– Around 15 minutes per question
– Quiz: 2 questions, 23 minutes (11.5 minutes each)
39
Coq part: study strategy
• Make sure that you can do the quiz in a reasonable
amount of time (again, two questions on the final are
similar to the two quiz questions. If you get them
done quickly then you have 20% of the points already.)
• Review major frameworks (Modal logic in HW5 and
Hoare logic in HW10). You will not have enough time
to figure them out again during the exam.
• You can redo some problems to remind yourself how
those parts work.
40
Test strategy
• Remember: you can skip one Coq problem
– If you get stuck, move on: maybe another one is easier.
– Don’t Panic! If you get stuck then do paper part for
awhile, maybe an idea for Coq part will occur.
• Watch the time. It’s not a short exam.
• Don’t get so absorbed with the Coq part that you
neglect paper part, or vise versa.
41
Last but not least…
Good Luck on your finals
(including ours)
and
Thanks for taking CS3234!
– Martin and Aquinas
42