An Introduction to Proof-Carrying Code Amy Felty CSI 5110 The Problem: Code Safety Code Producer Code Consumer Does no harm? Native Code Source Program Compiler load r3, 4(r2) add r2,r4,r1 store 1, 0(r7) store r1, 4(r7) add r7,0,r3 add r7,8,r7 beq r3, .-20 Execute 2 References ● Proof-Carrying Code http://www.cs.berkeley.edu/~necula/pcc.html George Necula & Peter Lee, Proof-Carrying Code, Technical Report CMU-CS-96-165, 1996. Available from course web page. George Necula, Proof-Carrying Code, Symposium on Principles of Programming Languages (POPL), 1997. George Necula & Peter Lee, The Design and Implementation of a Certifying Compiler, PLDI 1998. ● Other References – See web page for course project. 3 Proof-Carrying Code Code Producer Implements a program and compiles it to native machine code C. The verification condition safe(C) is sent to a prover which proves it (automatically) and outputs a proof P. The compiler also sends hints to the prover. ● The code producer communicates the code and proof to the code consumer. ● Code Consumer Checks that P is a proof of safe(C). If successful, executes C as needed. ● Safety Policy Set ahead of time by the code consumer. Defined by a set of inference rules. 4 Proof-Carrying Code Code Producer Code Consumer Native Code C Source Program Certifying Compiler load r3, 4(r2) add r2,r4,r1 store 1, 0(r7) store r1, 4(r7) add r7,0,r3 add r7,8,r7 beq r3, .-20 Hints Safety Proof of safe(C) Prover ∃-i( ∀-i(... →-r ( ...) ) ) Execute OK Trusted Code Base Checker 5 A Closer Look at the Prover and Checker Code Producer Code Consumer Native Code and Hints VCGen VCGen Safety Safety Predicate Predicate Prover Safety Proof Checker 6 What can go wrong? ● The proof gets corrupted during transfer from producer to consumer. The proof is unlikely to check. If it does, the new proof is an alternate proof of safety. ● The code gets corrupted during transfer from producer to consumer. The proof is unlikely to check. If it does, the new code is still safe (though it may not do what it was intended to do). 7 Advantages of Proof-Carrying Code ● Trusted Code Base is quite small; includes only the checker. ● No need to trust compiler or prover. ● The safety policy (meaning of safe) can be general and flexible. Can use types, dataflow, induction, or any other provable property. ● Automated proof is possible for a large class of properties. Safety properties of interest are relatively simple. Hints from the compiler provide help. 8 Safety ● ● ● ● ● Type safety and memory safety Prohibit accessing private data Prevent overwriting important data Prevent accessing unauthorized resources Avoid consuming too many resources 9 Outline ● ● ● ● ● ● An Example Program Certifying Compiler Verification Condition Generator (VCGen) Checker Prover Executing the Program 10 An ML Program to Add the Values in a List datatype intlist = nil of () | cons of int * intlist; fun addnums nil = 0 | addnums (cons(n,ns)) = n + (addnums ns); 11 Outline: Certifying Compiler ● Layout of datatypes in memory ● Instruction set ● Formulas used to express hints and safety policies (propositional logic) ● The abstract machine Important for proving correctness of the PCC technique ● Example program compiled to machine code (with hints) 12 Compiler Layout of Data Structures datatype intlist = nil of () | cons of int * intlist nil tag 0 cons tag 1 int 37 v intlist 13 Example Instruction Set ● ● ● ● ● ● ● ● ADD rd := rs1 + rs2 ADDC rd := rs + c LD rd := m(rs + c) ST m(rd + c) := rs BEQ (rs1 = rs2) c BGT (rs1 > rs2) c RET INV p 14 Registers, Memory, and Expressions ● Register bank: r ::== R | upd(r,n,e) ● Memory: m ::== M | upd(m,e1, e2) ● Expressions: e ::== n | e1 + e2 | rn | pc | ret | m(e) 15 Formulas of Safety Policy ● Types: τ ::== int | intlist | … ● Atomic Formulas: A ::== e :m τ | e1 = e2 | e1 < e2 | readable(e) | writable(e) ● Formulas: P ::== A | ⊥ | P1 ∧ P2 | P1 ∨ P2 | P1 → P2 | ¬P 16 The Abstract Machine: One Computation Step (r,m) (r’,m’) if (r,m) evaluates to (r’,m’) by executing one instruction. Let r’’=upd(r,pc,rpc+1). 17 The Abstract Machine (1) (r,m) evaluates to: Instruction ADD rd := rs1 + rs2 ADDC rd := rs + c LD rd := m(rs + c) and readable(rs + c) ST m(rd + c) := rs and writable(rd + c) RET INV p r’ m’ upd(r’’,d, rs1 + rs2) m upd(r’’,d, rs + c) m upd(r’’,d, m(rs + c)) m r’’ upd(r,pc,ret) r’’ upd(m, rd + c, rs) m m 18 The Abstract Machine (2) (r,m) evaluates to: Instruction BEQ (rs1 = rs2) and (rs1 = rs2) BEQ (rs1 = rs2) and (rs1 ≠ rs2) BGT (rs1 > rs2) and (rs1 > rs2) BGT (rs1 > rs2) and (rs1 ≤ rs2) c r’ upd(r,pc, rpc + c) m’ m c r’’ m c upd(r,pc, rpc + c) m c r’’ m 19 Compiled Program with Hints Precondition: r0 :m intlist ∧ r3 = 0 ADD r1 := r3 + r3 %initialize total to 0 INV r0 :m intlist ∧ r1 :m int ∧ r3 = 0 L1 LD r5 := m(r0 + 0) %r5 gets list tag BEQ (r5 = r3) L2 %jump if list tag is 0 LD r2 := m(r0 + 1) %load next int in r2 LD r0 := m(r0 + 2) %r0 gets pointer to rest ADD r1 := r1 + r2 %add next int to total BEQ (r3 = r3) L1 %jump back INV r1 :m int L2 ADDC r0 := r1 + 0 %put total in r0 RET 20 Outline: Verification Condition Generator ● The Verification Condition ● Motivating the VCGen Algorithm Hoare Logic Hoare Logic for Machine Instructions ● The VCGen algorithm ● Correctness of VCGen ● Example safety predicate generated by the algorithm 21 The Verification Condition ● A predicate in propositional logic with the property that its validity with respect to the inference rules of the safety policy is a sufficient condition for ensuring compliance with the safety policy. ● Includes: proof that loop invariant holds when loop first entered proof that invariant is preserved around the loop proof that postcondition follows from invariant 22 Hoare Logic for Program Verification (Ch. 4) (|Q [e/x]|) x:=e (Q |) (|P ∧ B |) S1(|Q |) e.g., (|x=0 |) y:=0 (|x=y |) (|P ∧ ¬B |) S2(| Q |) (|P |) if B then S1 else S2(|Q |) (|P|)S1(|P’ |) (|P’ |) S2(|Q |) (|P |) S1;S2(|Q |) P → P’ (|P’ |) S(|Q’ |) Q’ → Q (|P |) S(|Q |) 23 Hoare Logic for Machine Instructions (1) Pre → Q1 (|Q1|)S1(|Q2|) (|Q2|)S2(|Q3|) … (|Qn|)Sn(|Qn+1|) Qn+1 → Post (|Pre|)S1;S2;…;Sn(|Post|) P → P’ (|P’ |) S (|Q’ |) (|P |) S (|Q |) Q’ → Q (|Q[(rs1+rs2)/rd] |) ADD rd := rs1 + rs2 (|Q|) e.g., (|r1+r2=0|) ADD r3 := r1 + r2 (| r3=0|) (|Q[(rs+c)/rd] |) ADDC rd := rs + c (|Q|) 24 Hoare Logic for Machine Instructions (2) (| Q[m(rs+c)/rd] ∧ readable(rs+c)|) LD rd:=m(rs+c)(|Q|) (|Q[upd(m,rd+c,rs)/m]∧ writable(rd+c)|) ST m(rd+c):=rs (|Q|) (|(rs1 = rs2 → Qc)∧ (¬(rs1 = rs2) → Q)|) BEQ (rs1=rs2) c (|Q|) (|(rs1 > rs2 → Qc)∧ (¬(rs1 > rs2) → Q)|) BGT (rs1>rs2) c (|Q|) (|Q|) RET (|Q|) 25 Definition of Verification Condition Generator ● Let Π be the list of instructions output by the certifying compiler. Let Πi be the instruction at position i in Π. ● Note: VCi+1 is needed to compute VCi. . VCi= VCi+1[(rs1+rs2)/rd] if Πi is ADD rd:=rs1+rs2 VCi+1[(rs+c)/rd] if Πi is ADDC rd:=rs+c VCi+1[m(rs+c)/rd]∧ readable(rs+c) if Πi is LD rd:=m(rs+c) VCi+1[upd(m,rd+c,rs)/m] ∧ writable(rd+c) if Πi is ST m(rd+c):=rs 26 Definition of VCG (continued) VCi= (rs1 = rs2 → VCi+c-1)∧ (¬(rs1 = rs2) → VCi+1) if Πi is BEQ (rs1=rs2) c (rs1 > rs2 → VCi+c-1)∧ (¬(rs1 > rs2) → VCi+1) if Πi is BGT (rs1>rs2) c post if Πi is RET p if Πi is INV p ● post is the postcondition. ● Every jump point must be proceeded by an INV statement. 27 Verification Condition ● Let Inv be the set of line numbers containing INV machine instructions. Also, 0∈ Inv. ● Inv0 is the precondition. ● Invi denotes the formula at line i. ● SP is the function computing the safety predicate (verification condition) from the code. SP(Π,Inv,post) = ∧ ∈ i Inv Invi → VCi+1 28 Another Look at the Role of VCGen ● The VCGen provides an algorithmic way to compute the safety predicate from the native machine code instructions. ● It insures that the proof that is checked really is a proof about the code that is executed. Code Producer Native Code Code Consumer and Hints VCGen VCGen Safety Safety Predicate Predicate Prover Safety Proof Checker 29 Correctness of VCGen Approach ● If the verification condition (safety theorem) generated from the program by VCGen is provable, then in every step of the abstract machine, a load will always be from a readable address, and a store will always be to a writable address. ● Necula & Lee [1996] 30 VCGen Applied to Example Program 0: r0 :m intlist ∧ r3 = 0 ... 2: INV r0 :m intlist ∧ r1 :m int ∧ r3 = 0 ... 9: INV r1 :m int (Inv0 → VC1) ∧ (Inv2 → VC3) ∧ (Inv9 → VC10) 31 Computing the Last Three VCs VCi := VCi+1[(rs+c)/rd] post p if Πi is ADDC rd:=rs+c if Πi is RET if Πi is INV p INV r1 :m int L2 ADDC r0 := r1 + 0 RET %put total in r0 VC11 is т VC10 is VC11[(r1+0)/r0] ≡ т VC9 is r1 :m int (Inv9 → VC10) ≡ (r1 :m int → т) 32 Computing the Next VC VCi := (rs1 = rs2 → VCi+c-1)∧ (¬(rs1 = rs2) → VCi+1) if Πi is BEQ (rs1=rs2) c INV r0 :m intlist ∧ r1 :m int ∧ r3 = 0 L1 LD r5 := m(r0 + 0) %r5 gets list tag : BEQ (r3 = r3) L1 %jump back VC9 is r1 :m int VC8 is (r3 = r3 → VC2)∧ (¬(r3 = r3) → VC9)≡ (r3 = r3 → (r0 :m intlist ∧ r1 :m int ∧ r3 = 0))∧ (¬(r3 = r3) → (r1 :m int)) 33 Computing the Next VC VCi := VCi+1[(rs1+rs2)/rd] ADD r1 := r1 + r2 if Πi is ADD rd:=rs1+rs2 %add next int to total VC8 is (r3 = r3→ (r0 :m intlist ∧ r1 :m int ∧ r3 = 0))∧ (¬(r3 = r3) → (r1 :m int)) VC7 is VC8[(r1+r2)/r1] ≡ (r3 = r3→ (r0 :m intlist ∧ ((r1+r2) :m int) ∧ r3 = 0))∧ (¬(r3 = r3) → ((r1+r2) :m int)) 34 Conjunct 2 of the Verification Condition ● Exercise: Compute (Inv2 → VC3) ● Solution: (r0 :m intlist ∧ r1 :m int ∧ r3 = 0) → ((m(r0+0) = r3 → r1 :m int) ∧ (¬(m(r0+0) = r3) → ((r3=r3 → (m(r0+2) :m intlist ∧ (r1 +m(r0+1)):m int ∧ r3 = 0)) ∧ (¬ (r3=r3) → (r1 +m(r0+1)):m int) ∧ readable(r0+2) ∧ readable(r0+1))) ∧ readable(r0+0)) 35 Conjunct 1 of the Verification Condition ● Exercise: Compute (Inv0 → VC1) ● Solution: (r0 :m intlist ∧ r3 = 0) → (r0 :m intlist ∧ ((r3 + r3) :m int) ∧ r3 = 0) 36 Outline: Checker ● The safety policy (inference rules) implemented by the checker includes: 1. Propositional logic with integers 2. Typing rules 3. Safety rules 4. Interface Rules ● The Trusted Code Base (TCB) includes: All these rules The implementation of the checker which checks proofs built from these rules 37 Safety Policy: (1) Basic Logic ● Propositional logic with arithmetic (integers) A ● Propositional Rules: : A B A∧B A∧B ⊥ ¬i ∧i ∧e2 ∧e1 A∧B ¬A B A A B A A∨ B ∨i1 A : B A→B →i B A∨ B A ∨i2 : : A∨ B C C C A→B B →e A ¬A ¬e ⊥ ⊥ ∨e A ⊥e ¬ ¬ A ¬¬e A 38 Safety Policy (2): Typing Rules ●Integers: 0:mint x :m int x+1 :m int x :m int y :m int x+y :m int 39 Safety Policy (2): Typing Rules ●Integer lists: cons nil tag 0 tag 1 int 37 v intlist m(v) = 0 v :m intlist m(v)=1 m(v+1) :m int m(v+2) :m intlist v :m intlist v :m intlist m(v) = 0 ∨ m(v) = 1 v :m intlist m(v)=1 m(v+1) :m int v :m intlist m(v)=1 m(v+2) :m intlist 40 Safety Policy (3) ●Safety rules, e.g., nil tag 0 cons tag 1 int 37 v intlist v :m intlist readable(v) v :m intlist m(v)=1 readable(v+1) v :m intlist m(v)=1 readable(v+2) 41 Safety Policy (4) ● Interface rules: describe the calling conventions between the code consumer and the foreign code 42 Outline: Prover ● Safety predicates have a regular form (a small subset of formulas). This class of formulas is easy to automate. ● We don’t discuss the automated prover here. ● Instead, we show part of a proof of safety (for our running example) that is generated by such a prover. 43 Proof of Safety ●Part 1: Proof of (Inv0 → VC1) 1. (r0 :m intlist ∧ r3 = 0) assump 2. r0 :m intlist ∧ e1 1 3. r3 = 0 ∧ e2 1 4. 0:mint typing axiom 5. r3 :m int arith 3,4 6. (r3 + r3) :m int type rule 5,5 ∧ i 2,6,3 7. (r0 :m intlist ∧ ((r3 + r3) :m int) ∧ r3 = 0) 8. (r0 :m intlist ∧ r3 = 0) → (r0 :m intlist ∧ ((r3 + r3) :m int) ∧ r3 = 0) 44 Outline: Executing the Program ● One step left: checking the precondition 45 Checking the Precondition 1 38 r0 200 201 202 ... r1 r2 r3 0 ... ● After checking the proof, the program can be executed as many times as needed, but the precondition must be checked (proved) each time. ● For this example, there is a simple proof of: r0 :m intlist ∧ r3 = 0 1 45 0 1 37 46
© Copyright 2024 ExpyDoc