人工智能_贝叶斯网络.ppt

Artificial Intelligence: Bayesian Networks

Graphical Models
- If no assumption of independence is made, an exponential number of parameters must be estimated for sound probabilistic inference. No realistic amount of training data is sufficient to estimate so many parameters.
- If a blanket assumption of conditional independence is made, efficient training and inference are possible, but such a strong assumption is rarely warranted.
- Graphical models use directed or undirected graphs over a set of random variables to explicitly specify variable dependencies. This allows less restrictive independence assumptions while limiting the number of parameters that must be estimated.
  - Bayesian Networks: directed acyclic graphs that indicate causal structure.
  - Markov Networks: undirected graphs that capture general dependencies.

Bayesian Networks
- Directed Acyclic Graph (DAG)
  - Nodes are random variables.
  - Edges indicate causal influences.

Conditional Probability Tables
- Each node has a conditional probability table (CPT) that gives the probability of each of its values given every possible combination of values for its parents (a conditioning case).
- Roots (sources) of the DAG, which have no parents, are given prior probabilities.
[Figure: network over the nodes Burglary, Earthquake, Alarm, JohnCalls, MaryCalls]

CPT Comments
- The probability of false is not given, since each row must sum to 1.
- The example requires 10 parameters rather than 2^5 − 1 = 31 for specifying the full joint distribution.
- The number of parameters in the CPT for a node is exponential in the number of parents (fan-in).

Joint Distributions for Bayes Nets
- A Bayesian Network implicitly defines a joint distribution: P(x1, …, xn) = ∏i P(xi | Parents(Xi)).
- Therefore an inefficient approach to inference is:
  1) Compute the joint distribution using this equation.
  2) Compute any desired conditional probability using the joint distribution.

Naïve Bayes as a Bayes Net
- Naïve Bayes is a simple Bayes Net.
[Figure: Y is the parent of X1, X2, …, Xn]
- The priors P(Y) and conditionals P(Xi | Y) for Naïve Bayes provide the CPTs for the network.
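The two-step "inefficient approach" and the Naïve Bayes network can be sketched together in a few lines of Python. This is a minimal illustration, not code from the slides: the three binary features and every probability below are made-up placeholder values, and the names `p_y`, `p_x_given_y`, `joint`, and `posterior_y` are hypothetical.

```python
from itertools import product

# Hypothetical CPTs for a Naive Bayes net: binary class Y, three binary features.
# All numbers are illustrative assumptions; none come from the slides.
p_y = 0.3                                # prior P(Y = True)
p_x_given_y = [                          # P(X_i = True | Y = y), one CPT per feature
    {True: 0.9, False: 0.2},
    {True: 0.6, False: 0.5},
    {True: 0.1, False: 0.4},
]

def joint(y, xs):
    """P(Y=y, X=xs): product of each node's CPT entry given its parent."""
    p = p_y if y else 1.0 - p_y
    for x, cpt in zip(xs, p_x_given_y):
        p_true = cpt[y]
        p *= p_true if x else 1.0 - p_true
    return p

def posterior_y(xs):
    """P(Y=True | X=xs), computed by normalizing the joint over both values of Y."""
    num = joint(True, xs)
    return num / (num + joint(False, xs))

n = len(p_x_given_y)
# Sanity check: the factored joint sums to 1 over all 2^(n+1) assignments.
total = sum(joint(y, xs)
            for y in (True, False)
            for xs in product((True, False), repeat=n))

cpt_parameters = 1 + 2 * n               # one prior plus two rows per feature CPT
full_joint_parameters = 2 ** (n + 1) - 1 # unrestricted joint over n + 1 binary variables
```

With three features the network needs 7 numbers where an unrestricted joint needs 15, and the gap grows exponentially with the number of features, which is the same effect as the 10 versus 31 count noted for the burglary example.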

Independencies in Bayes Nets
- If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S).
- However, this criterion is too strict for conditional independence: two nodes should still be considered independent when the only thing connecting them is some variable that depends on both. For example, Burglary and Earthquake should be considered independent, since they are connected only because both cause Alarm.
- P(Xi | Xj, S) = P(Xi | S) is equivalent to P(Xi, Xj | S) = P(Xi | S) P(Xj | S). How to prove?
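One way to answer the question is with the product rule. The derivation below is a sketch that assumes P(Xj | S) > 0, so that the conditional probabilities are well defined.

```latex
\begin{align*}
P(X_i, X_j \mid S) &= P(X_i \mid X_j, S)\, P(X_j \mid S)
    && \text{(product rule)} \\
                   &= P(X_i \mid S)\, P(X_j \mid S)
    && \text{(substituting } P(X_i \mid X_j, S) = P(X_i \mid S)\text{)}
\end{align*}
```

The converse runs the same steps backwards: if P(Xi, Xj | S) = P(Xi | S) P(Xj | S), dividing both sides by P(Xj | S) and applying the product rule on the left gives P(Xi | Xj, S) = P(Xi | S).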

Independencies in Bayes Nets (cont.)
- Unless we know something about a common effect of two “independent causes”, or about a descendant of a common effect, the causes can be considered independent. For example, if we know nothing else, Earthquake and Burglary are independent.
- However, if we have information about a common effect (or a descendant thereof), then the two “independent” causes become probabilistically linked, since evidence for one cause can “explain away” the other.
- For example, if we know that the alarm went off, or that someone called about the alarm, then Earthquake and Burglary become dependent: evidence for an earthquake decreases belief in a burglary, and vice versa.

Bayes Net Inference
- Given known values for some evidence variables, determine the posterior probability of some query variables.
- Example: Given that John calls, what is the probability that there is a Burglary?
- John calls 90% of the time there is an Alarm, and the Alarm detects 94% of Burglaries, so people generally think the probability should be fairly high. However, this ignores the prior probability of John calling.
- John also calls 5% of the time when there is no Alarm. So over 1,000 days we expect 1 Burglary, and John will probably call. However, he will also call with a false report 50 times on average. So the call is about 50 times more likely to be a false report: P(Burglary | JohnCalls) ≈ 0.02.
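The back-of-the-envelope estimate can be written out as a ratio of expected counts over the 1,000 days. It assumes, as the slide does, that John does call on the one day with a burglary and that the roughly 50 false reports are the only other calls.

```latex
P(\text{Burglary} \mid \text{JohnCalls})
  \approx \frac{\text{calls with a burglary}}{\text{all calls}}
  = \frac{1}{1 + 50}
  = \frac{1}{51}
  \approx 0.02
```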

- The actual probability of Burglary is 0.016, since the alarm is not perfect (an Earthquake could have set it off, or it could have gone off on its own). On the other hand, even if there was no alarm and John called incorrectly, there could have been an undetected Burglary anyway, but this is unlikely.

Types of Inference
[Diagram not recovered from the slide]

Sample Inferences
- Diagnostic (evidential, abductive): from effect to cause.
  - P(Burglary | JohnCalls) = 0.016
  - P(Burglary | JohnCalls ∧ MaryCalls) = 0.29
  - P(Alarm | JohnCalls ∧ MaryCalls) = 0.76
  - P(Earthquake | JohnCalls ∧ MaryCalls) = 0.18
- Causal (predictive): from cause to effect.
  - P(JohnCalls | Burglary) = 0.86
  - P(MaryCalls | Burglary) = 0.67
- Intercausal (explaining away): between causes of a common effect.
  - P(Burglary | Alarm) = 0.376
  - P(Burglary | Alarm ∧ Earthquake) = 0.003
- Mixed: two or more of the above combined.
  - (diagnostic and causal) P(Alarm | JohnCalls ∧ ¬Earthquake) = 0.03
  - (diagnostic and intercausal) P(Burglary | JohnCalls ∧ ¬Earthquake) = 0.017
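Queries like these can be checked by brute-force enumeration of the joint distribution. The sketch below is not from the slides. The CPT figure did not survive extraction, so the numbers are assumed from the standard textbook version of the burglary network; they agree with the values quoted in the text (90% and 5% for John, 94% detection, about 1 burglary per 1,000 days). The names `bern`, `joint`, and `query` are hypothetical helpers.

```python
from itertools import product

# ASSUMPTION: CPT values from the standard textbook burglary example; the slide's
# own table was a figure and is not present in the extracted text.
P_B = 0.001                                   # P(Burglary)
P_E = 0.002                                   # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | Burglary, Earthquake)
P_J = {True: 0.90, False: 0.05}               # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}               # P(MaryCalls | Alarm)

VARS = ("B", "E", "A", "J", "M")

def bern(p_true, value):
    """Probability of a binary value given the probability that it is True."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Full joint as the product of each node's CPT entry given its parents."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

def query(target, evidence):
    """P(target=True | evidence), summing the joint over all 2^5 assignments."""
    num = den = 0.0
    for values in product((True, False), repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if any(world[name] != val for name, val in evidence.items()):
            continue                           # inconsistent with the evidence
        p = joint(*values)
        den += p
        if world[target]:
            num += p
    return num / den

p_b_given_j = query("B", {"J": True})               # diagnostic
p_b_given_a = query("B", {"A": True})               # intercausal, before explaining away
p_b_given_a_e = query("B", {"A": True, "E": True})  # intercausal, after explaining away
```

Under these assumed CPTs, `p_b_given_j` comes out near 0.016, and conditioning on Earthquake as well as Alarm drops the burglary probability from roughly 0.37 to roughly 0.003, which is the explaining-away effect. Small differences in the last digit relative to the listed values are possible, since the slide's exact table is not available. Enumeration is exponential in the number of variables, so this only works for tiny networks.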

Assignment: Calculate these results!

Probabilistic Inference in Humans
- People are notoriously bad at doing correct probabilistic reasoning in certain cases.
- One problem is that they tend to ignore the influence of the prior probability of a situation.

Monty Hall Problem
[Figure: three doors labeled 1, 2, 3]
Online demo: http://math.ucsd.edu/~crypto/Monty/monty.html

Multiply Connected Networks
- Networks with undirected loops, i.e. more than one directed path between some pair of nodes.
- In general, inference in such networks is NP-hard.
- Some methods construct one or more polytrees from the given network and perform inference on the transformed graph.

Node Clustering
- Eliminate all loops by merging nodes to create meganodes whose values are the cross-product of the values of the merged nodes.
- The number of values for a merged node is exponential in the number of nodes merged.
- Still reasonably tractable for many network topologies that require relatively little merging to eliminate loops.

Bayes Nets Applications
- Medical diagnosis
  - The Pathfinder system outperforms leading experts in the diagnosis of lymph-node disease.
- Microsoft applications
  - Problem diagnosis: printer problems
  - Recognizing user intents for HCI
- Text categorization and spam filtering
- Student modeling for intelligent tutoring systems

Statistical Revolution
- Across AI there has been a movement from logic-based approaches to approaches based on probability and statistics.
  - Statistical natural language processing
  - Statistical computer vision
  - Statistical robot navigation
  - Statistical learning
- Most approaches are feature-based and “propositional” and do not handle complex relational descriptions with multiple entities, like those typically requiring predicate logic.

Structured (Multi-Relational) Data
- In many domains, data consists of an unbounded number of entities with an arbitrary number of properties and relations between them.
  - Social networks
  - Biochemical compounds

  - Web sites

Biochemical Data
- Predicting mutagenicity [Srinivasan et al., 1995]

Web-KB Dataset [Slattery & Craven, 1998]
[Figure labels: Faculty, Grad Student, Research Project, Other]

Collective Classification
- Traditional learning methods assume that the objects to be classified are independent (the first “i” in the i.i.d. assumption).
- In structured data, the class of an entity can be influenced by the classes of related entities.
- Classes need to be assigned to all objects simultaneously to produce the most probable globally consistent interpretation.

Logical AI Paradigm
- Represents knowledge and data in a binary symbolic logic such as FOPC.
  + Rich representation that handles arbitrary sets of objects, with properties, relations, quantifiers, etc.
  − Unable to handle uncertain knowledge and probabilistic reasoning.

Probabilistic AI Paradigm
- Represents knowledge and data as a fixed set of random variables with a joint probability distribution.
  + Handles uncertain knowledge and probabilistic reasoning.
  − Unable to handle arbitrary sets of objects, with properties, relations, quantifiers, etc.

Statistical Relational Models
- Integrate methods from predicate logic (or relational databases) and probabilistic graphical models to handle structured, multi-relational data.
  - Probabilistic Relational Models (PRMs)
  - Stochastic Logic Programs (SLPs)
  - Bayesian Logic Programs (BLPs)
  - Relational Markov Networks (RMNs)
  - Markov Logic Networks (MLNs)
  - Other TLAs

Conclusions
- Bayesian learning methods are firmly based on probability theory and exploit advanced methods developed in statistics.
- Naïve Bayes is a simple generative model that works fairly well in practice.
- A Bayesian network allows specifying a limited set of dependencies using a directed graph.
- Inference algorithms allow determining the probability of values for query variables given values for evidence variables.
