• Uncertainty
• The axioms of probability
• Bayes' rule
• Belief networks

1 Uncertainty

Reasoning in propositional or first-order logic assumes that
• facts are known to be true,
• facts are known to be false,
• or nothing is known.

In general, however, people are not sure whether the relevant facts are true or false, and things become unpredictable due to
• partial observation (e.g. road state, other drivers' plans)
• noisy sensors (e.g. radio traffic reports)
• uncertainty in action outcomes (flat tire, accident)

Rules may therefore give unreliable conclusions, e.g.
• At = leaving for the airport t minutes before the flight; will At get me to the airport on time?
• Toothache ⇒ cavity? Or gum disease?

2 Probabilistic Reasoning

• Probability theory provides a way of dealing rationally with uncertainty: it assigns a numerical degree of belief between 0 and 1 to sentences. For example, P(cavity) = 0.1 indicates that a patient has a cavity with probability 0.1 (a 10% chance).
• Degree of truth, as opposed to degree of belief, is the subject of fuzzy logic.
• Probabilistic reasoning may be used in three types of situations:
  - The world really is random.
  - The relevant world is not random given enough data, but we do not always have access to that much data.
  - The world appears random because we have not described it at the right level.

3 Necessary formulas

• Definition of conditional probability:
    P(A|B) = P(A ∧ B) / P(B)
  where P(A|B) is read as "the probability of A given that all we know is B". For example, P(Cavity|Toothache) = 0.8 indicates that if a patient is observed to have a toothache and there is no other information, then the probability of the patient having a cavity is 0.8 (an 80% chance).
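As a quick numeric check of the definition above, here is a minimal Python sketch; the joint and marginal values are assumed for illustration (they happen to match the toothache table used later in the lecture):

```python
# Conditional probability from the definition: P(A|B) = P(A and B) / P(B).
def conditional(p_a_and_b, p_b):
    """Return P(A|B) given the joint P(A and B) and the marginal P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

# Assumed values for illustration: P(Cavity and Toothache) and P(Toothache).
p_cavity_and_toothache = 0.04
p_toothache = 0.05
print(round(conditional(p_cavity_and_toothache, p_toothache), 3))  # 0.8
```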
• The product rule gives an alternative formulation:
    P(A ∧ B) = P(A|B) P(B)
  It comes from the fact that for A and B both to be true, we need B to be true, and then A to be true given B. You can also write
    P(A ∧ B) = P(B|A) P(A)

4 The axioms of probability

For any propositions A, B:
• 0 ≤ P(A) ≤ 1 — all probabilities are between 0 and 1.
• P(True) = 1 and P(False) = 0 — necessarily true propositions have probability 1, and necessarily false propositions have probability 0.
• P(A ∨ B) = P(A) + P(B) − P(A ∧ B) — the total probability of A ∨ B is the sum of the probabilities assigned to A and B, with P(A ∧ B) subtracted so that those cases are not counted twice.
[Figure: Venn diagram showing A, B, and their overlap A ∧ B inside the rectangle True; P(A ∨ B) is the sum of the three colored regions.]

5 Chain rule

It is derived by successive application of the product rule:
    P(X1, …, Xn) = P(X1, …, Xn−1) P(Xn | X1, …, Xn−1)
                 = P(X1, …, Xn−2) P(Xn−1 | X1, …, Xn−2) P(Xn | X1, …, Xn−1)
                 = …
                 = P(X1) P(X2|X1) … P(Xn−1 | X1, …, Xn−2) P(Xn | X1, …, Xn−1)
                 = ∏i=1..n P(Xi | X1, …, Xi−1)

For example, we can calculate the probability of the event that the alarm has sounded but neither a burglary nor an earthquake has occurred, and both John and Mary call (using the conditional probabilities on the next page):
    P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(J|A) P(M|A) P(A | ¬B ∧ ¬E) P(¬B) P(¬E)
                            = 0.90 × 0.70 × 0.001 × 0.999 × 0.998
                            ≈ 0.00063

6 [The burglary belief network]

Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.

    P(B) = 0.001        P(E) = 0.002

    B  E | P(A)         A | P(J)        A | P(M)
    T  T | 0.95         T | 0.90        T | 0.70
    T  F | 0.94         F | 0.05        F | 0.01
    F  T | 0.29
    F  F | 0.001

7 Bayes' rule

• Recall the two forms of the product rule:
    P(A ∧ B) = P(A|B) P(B)
    P(A ∧ B) = P(B|A) P(A)
  Equating the right-hand sides and dividing by P(A) gives Bayes' rule:
    P(B|A) = P(A|B) P(B) / P(A)

Why is Bayes' rule useful? It can be used for assessing diagnostic probability from causal probability.
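Returning to the chain-rule example: the burglary computation can be checked numerically. A minimal Python sketch, using the CPT values from the slides' burglary network (the variable names are my own):

```python
# P(J, M, A, not B, not E) = P(J|A) P(M|A) P(A|not B, not E) P(not B) P(not E),
# using the chain rule plus the network's conditional independences.
# CPT values are the ones given in the slides' burglary network.
P_B = 0.001                      # prior P(Burglary)
P_E = 0.002                      # prior P(Earthquake)
P_A = {(True, True): 0.95,       # P(Alarm | Burglary, Earthquake)
       (True, False): 0.94,
       (False, True): 0.29,
       (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}  # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(MaryCalls | Alarm)

p = P_J[True] * P_M[True] * P_A[(False, False)] * (1 - P_B) * (1 - P_E)
print(round(p, 5))  # 0.00063 (the slides truncate this to 0.00062)
```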
    P(Cause|Effect) = P(Effect|Cause) P(Cause) / P(Effect)

E.g., let C be cavity (the cause) and T be toothache (the effect):
    P(C|T) = P(T|C) P(C) / P(T) = (0.5 × 0.0001) / 0.1 = 0.0005

8 Normalization

Consider the equation for calculating the probability of a cavity given a toothache:
    P(C|T) = P(T|C) P(C) / P(T)
Suppose we are also concerned with the possibility that the patient is suffering from gum disease G, given a toothache:
    P(G|T) = P(T|G) P(G) / P(T)
Comparing these two equations, we see that in order to compute the relative likelihood of cavity and gum disease given a toothache, we need not assess the prior probability P(T), since
    P(C|T) / P(G|T) = P(T|C) P(C) / (P(T|G) P(G)) = (0.5 × 0.0001) / (0.8 × 0.005) = 1/80
That is, gum disease is 80 times more likely than a cavity, given a toothache.

9 Conditioning

• Introducing a variable as an extra condition:
    P(X|Y) = Σz P(X | Y, Z=z) P(Z=z | Y)
  e.g.,
    P(RunOver|Cross) = P(RunOver | Cross, Light=green) P(Light=green | Cross)
                     + P(RunOver | Cross, Light=yellow) P(Light=yellow | Cross)
                     + P(RunOver | Cross, Light=red) P(Light=red | Cross)
• When Y is absent, we have
    P(X) = Σz P(X | Z=z) P(Z=z) = Σz P(X, Z=z)
  e.g.,
    P(RunOver) = P(RunOver | Light=green) P(Light=green)
               + P(RunOver | Light=yellow) P(Light=yellow)
               + P(RunOver | Light=red) P(Light=red)
  The above equation expresses the conditional independence of RunOver and Cross given Light.

10 Full joint distributions

• Any proposition φ defined on the random variables is true or false in each world wi.
• φ is equivalent to the disjunction of the wi's in which φ(wi) is true, hence
    P(φ) = Σ{wi : φ(wi)} P(wi)
That is, the unconditional probability of any proposition is computable as the sum of entries from the full joint distribution. Conditional probability can be computed in the same way, as a ratio:
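The normalization trick from slide 8 can be sketched in Python; the likelihoods and priors are the ones given in the slides:

```python
# Relative likelihood of cavity vs. gum disease given a toothache:
# P(C|T) / P(G|T) = (P(T|C) P(C)) / (P(T|G) P(G)); the prior P(T) cancels out.
p_t_given_c, p_c = 0.5, 0.0001   # likelihood and prior for cavity
p_t_given_g, p_g = 0.8, 0.005    # likelihood and prior for gum disease

ratio = (p_t_given_c * p_c) / (p_t_given_g * p_g)
print(round(ratio, 6))  # 0.0125, i.e. 1/80: gum disease is 80 times more likely
```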
    P(α|β) = P(α ∧ β) / P(β)

E.g., suppose Toothache and Cavity are the random variables, with the full joint distribution:

                     Toothache = True    Toothache = False
    Cavity = True         0.04                0.06
    Cavity = False        0.01                0.89

Let w1 = Cavity and w2 = Toothache. Then
    P(w1) = 0.04 + 0.06 = 0.10
    P(w2) = 0.04 + 0.01 = 0.05
    P(Cavity|Toothache) = P(Cavity ∧ Toothache) / P(Toothache) = 0.04 / (0.04 + 0.01) = 0.8

11 Independence

• Two random variables A and B are (absolutely) independent iff
    P(A|B) = P(A), or equivalently P(A, B) = P(A|B) P(B) = P(A) P(B)
  e.g., if A and B are two coin flips:
    P(A=head, B=head) = P(A=head) P(B=head) = 0.5 × 0.5 = 0.25
• If n Boolean variables are independent, the full joint is
    P(X1, …, Xn) = ∏i=1..n P(Xi)
  Absolute independence is a very strong requirement, seldom met!
• Conditional independence:
    P(A | B, C) = P(A|C)
  We say that A is conditionally independent of B given C.
  e.g., P(Catch | Toothache, Cavity) = P(Catch | Cavity)
        P(Catch | Toothache, ¬Cavity) = P(Catch | ¬Cavity)
  This means that whether or not a patient has a cavity, the probability that the probe catches in it does not depend on whether the patient has a toothache.

12 Belief networks

• A simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions.
• Belief networks are also called "causal nets", "Bayes nets", or "influence diagrams".
For example, the figure on page 7 (the burglary network with its conditional probability tables) shows a belief network.
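The full-joint query worked on slide 10 (P(Cavity|Toothache) from the 2×2 table) can be written as a short Python sketch:

```python
# Full joint distribution over (Cavity, Toothache); entries from the slides.
joint = {(True, True): 0.04, (True, False): 0.06,
         (False, True): 0.01, (False, False): 0.89}

# P(Toothache): sum the joint entries in which Toothache is true.
p_toothache = sum(p for (cavity, toothache), p in joint.items() if toothache)

# P(Cavity | Toothache) = P(Cavity and Toothache) / P(Toothache).
p_cavity_given_toothache = joint[(True, True)] / p_toothache
print(round(p_cavity_given_toothache, 3))  # 0.8
```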
Its variables are Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls.

13 Syntax and Semantics

Syntax:
• a set of nodes, one per variable;
• a directed, acyclic graph (a link means "directly influences");
• a conditional probability distribution for each node given its parents:
    P(Xi | Parents(Xi))

Semantics:
• "Global" semantics defines the full joint probability distribution as the product of the local conditional distributions:
    P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi))
  e.g., P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(J|A) P(M|A) P(A | ¬B ∧ ¬E) P(¬B) P(¬E)

14 Constructing Bayes nets

• The global-semantics equation defines what a given belief network means. It does not explain how to construct one.
• The equation implies certain conditional independence relationships that can be used to guide construction of the network's topology:
    P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi))    ----(1)
  If we rewrite the joint in terms of conditional probabilities using the definition of conditional probability:
    P(x1, …, xn) = P(xn | xn−1, …, x1) P(xn−1, …, x1)
                 = …
                 = P(xn | xn−1, …, x1) P(xn−1 | xn−2, …, x1) … P(x2|x1) P(x1)
                 = ∏i=1..n P(xi | xi−1, …, x1)    ----(2)
• Comparing (1) and (2), we see that the specification of the joint is equivalent to the general assertion that
    P(Xi | Xi−1, …, X1) = P(Xi | Parents(Xi))
  provided that Parents(Xi) ⊆ {Xi−1, …, X1}, i.e. each node's parents are selected from among Xi−1, …, X1.

15 Constructing Bayes nets (continued)

The general procedure for incremental network construction is as follows:
1. Choose the set of relevant variables Xi that describe the domain.
2. Choose an ordering for the variables.
3. While there are variables left:
   (a) Pick a variable Xi and add a node to the network for it.
   (b) Set Parents(Xi) to some minimal set of nodes already in the net such that the conditional independence property is satisfied.
   (c) Define the conditional probability table for Xi.

For example, choose the ordering M, J, A, B, E in the burglary example: add MaryCalls, then JohnCalls, then Alarm, then Burglary, then Earthquake.

P(J|M) = P(J)? No. If Mary calls, that probably means the alarm has gone off, which of course makes it more likely that John calls.
P(A | J, M) = P(A|J)? No. P(A | J, M) = P(A)? No. If both call, it is more likely that the alarm has gone off than if just one or neither calls.
P(B | A, J, M) = P(B)? No. P(B | A, J, M) = P(B|A)? Yes. The alarm gives us information about a burglary.
P(E | B, A, J, M) = P(E|A)? No. P(E | B, A, J, M) = P(E | A, B)? Yes. If the alarm is on, it is more likely that there has been an earthquake; but if we also know there has been a burglary, that changes the probability that an earthquake explains the alarm.

16 Exercises

Ex1. Using the following tables, calculate the probability of the event that the alarm has sounded, a burglary has occurred but an earthquake has not occurred, and both John and Mary call.

    P(B) = 0.95        P(E) = 0.002

    B  E | P(A)         A | P(J)        A | P(M)
    T  T | 0.95         T | 0.90        T | 0.70
    T  F | 0.94         F | 0.05        F | 0.01
    F  T | 0.29
    F  F | 0.001

17 Exercises

Ex2. (optional) Construct a belief network for the burglary example, choosing the ordering M, J, E, B, A.
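The global semantics from slide 13 reduces Ex1 to multiplying five CPT entries. A sketch that could be used to check your answer; note that the Ex1 table as printed appears to set P(B) = 0.95, which is taken at face value here:

```python
# Ex1: P(J, M, A, B, not E) = P(J|A) P(M|A) P(A | B, not E) P(B) P(not E).
# CPT values follow the Ex1 table as printed in the slides.
P_B = 0.95                       # P(Burglary), as printed in the Ex1 table
P_E = 0.002                      # P(Earthquake)
P_A = {(True, True): 0.95,       # P(Alarm | Burglary, Earthquake)
       (True, False): 0.94,
       (False, True): 0.29,
       (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}  # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(MaryCalls | Alarm)

p = P_J[True] * P_M[True] * P_A[(True, False)] * P_B * (1 - P_E)
print(p)
```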