MLP, Week 03

Outline
- Introduction
- Architecture
- Learning Process
- Learning Issues
- Summary and Further Discussion

Limit of Perceptron
- The XOR problem
[Figure: the four XOR points]

Limit of Perceptron
- Nonlinearly separable problems
[Figure: a positive class and a negative class that no straight line can separate]

Limit of Perceptron
- The perceptron is a linear classifier: its decision boundary is linear
  - 2-dimensional space: a line
  - 3-dimensional space: a plane
  - N-dimensional space: a hyperplane
- Unsolvable problems: the XOR problem and nonlinearly separable problems in general
- Research on neural networks stayed in a "dark age" from 1960 to 1980 (20 years)
- Motivation for inventing the MLP: add an intermediate layer, called the hidden layer
- With the hidden layer, even nonlinearly separable problems become solvable

Limit of Perceptron
- Solution to the XOR problem, part 1: an OR unit (weights 1 and 1, bias -0.5)
  - 0 0 -> 0
  - 0 1 -> 1
  - 1 0 -> 1
  - 1 1 -> 1

Limit of Perceptron
- Solution to the XOR problem, part 2: an AND unit (weights 1 and 1, bias -1.5)
  - 0 0 -> 0
  - 0 1 -> 0
  - 1 0 -> 0
  - 1 1 -> 1

Limit of Perceptron
- Solution to the XOR problem: put the OR and AND units in a hidden layer and add an output unit (bias -0.5) that fires when OR is on but AND is off
  - 0 0 -> 0
  - 0 1 -> 1
  - 1 0 -> 1
  - 1 1 -> 0
[Figure: two-layer network with hidden biases -0.5 (OR) and -1.5 (AND) and output bias -0.5]
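As a sanity check, here is a minimal NumPy sketch of this two-layer construction, assuming step-threshold units. The weights and biases are the ones on the slides, except the -1 weight from the AND unit to the output, which is reconstructed from the truth table.

```python
import numpy as np

def step(x):
    # Threshold unit: fires (1) when the net input is non-negative.
    return (x >= 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
h_or = step(X @ np.array([1, 1]) - 0.5)    # hidden OR unit  (bias -0.5)
h_and = step(X @ np.array([1, 1]) - 1.5)   # hidden AND unit (bias -1.5)
y = step(h_or - h_and - 0.5)               # output unit: OR and not AND
print(y)  # [0 1 1 0] -- XOR
```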

Limit of Perceptron
- Overview of the MLP
  - Add one more layer between the input and output layers
  - The added layer is called the hidden layer
  - The decision boundary goes from linear to quadratic
  - Solvable even for nonlinearly separable classification
  - Approximates any nonlinear function, by the Universal Approximation Theorem

Architecture
[Figure: input layer, hidden layer, output layer]

Architecture
- Input layer
  - Receives the input vector
  - #Nodes = dimension of the input vector
  - Net input: one element of the input vector
  - Output = net input, i.e., a linear (identity) activation function

Architecture
[Figure: input nodes]

Architecture
- Hidden layer
  - An intermediate layer that encodes the input vector into another form
  - Receives its net input as the sum of products of input values and weights
  - Computes its own output and transfers it to the output layer
  - #Nodes: arbitrary
    - Too many nodes -> overfitting and high complexity
    - Too few nodes -> underfitting and poor learning
  - Turns the linear boundary into a quadratic boundary
  - Activation function: the sigmoid function

Architecture
[Figure: hidden nodes]

Architecture
- Output layer
  - Classification
    - Output value = CSV (categorical score value)
    - #Output nodes = #Classes (or categories)
    - The node with the maximum output value gives the classified class or category
  - Regression
    - Univariate regression: #Output nodes = 1
    - Multivariate regression: #Output nodes = #Output variables
    - Output value = estimated output value

Architecture
[Figure: hidden nodes]

Learning Rule: Back Propagation
- Feed-forward: output computation (a sketch follows below)
[Figure: activity flows forward from the input layer through the hidden layer to the output layer]

Learning Rule: Back Propagation
- Weight update
[Figure: the error signal flows backward from the output layer through the hidden layer to the input layer]
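To make the feed-forward pass concrete, here is a minimal NumPy sketch of the architecture just described; the sizes, names, and random weights are illustrative, not from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, h, c = 4, 5, 3                    # input dim, #hidden nodes, #classes
W1, b1 = rng.normal(size=(h, d)), np.zeros(h)  # input -> hidden weights
W2, b2 = rng.normal(size=(c, h)), np.zeros(c)  # hidden -> output weights

x = rng.normal(size=d)               # input nodes pass x through unchanged
hidden = sigmoid(W1 @ x + b1)        # hidden net input: weighted sum, then sigmoid
scores = W2 @ hidden + b2            # output values = categorical scores (CSV)
print("predicted class:", int(np.argmax(scores)))  # max score -> class
```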

Learning Rule: Back Propagation
- Notations
[Two slides of notation, defined by equations; the standard symbols appear in the reconstruction below]

Learning Rule: Back Propagation
- Gradient descent for weights optimization
- The error function E is what learning minimizes
[Figure: error E plotted against a weight w, descending toward the minimum]

Learning Rule: Back Propagation
- Update the weights between the output and hidden layers
[Update equation shown on the slide; see the reconstruction below]

Learning Rule: Back Propagation
- Update the weights between the hidden and input layers
[Figure: the jth input node, the ith hidden node, and output nodes 1 through c; update equation shown on the slide]
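The update equations on these slides were images and did not survive into the text. For reference, the standard back-propagation rules for this setting (sigmoid hidden and output units, squared error) are as follows, with $\eta$ the learning rate, $x_j$ the $j$th input, $o_i$ the $i$th hidden output, $o_c$ the $c$th output value, and $t_c$ its target:

$$E = \frac{1}{2}\sum_{c}(t_c - o_c)^2$$

Weight from hidden node $i$ to output node $c$:

$$\Delta w_{ci} = \eta\,\delta_c\,o_i, \qquad \delta_c = (t_c - o_c)\,o_c\,(1 - o_c)$$

Weight from input node $j$ to hidden node $i$:

$$\Delta w_{ij} = \eta\,\delta_i\,x_j, \qquad \delta_i = o_i\,(1 - o_i)\sum_{c}\delta_c\,w_{ci}$$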

Learning Rule: Back Propagation
- Batch learning: the average error over all examples drives one weight update per iteration

  Input: training examples
  Initialize the weights at random
  Iterate T times:
      for each training example:
          compute the values of the hidden nodes
          compute the values of the output nodes
      compute the average error
      update the weights between output and hidden
      update the weights between hidden and input
  Output: optimized weights

Learning Rule: Back Propagation
- Interactive learning: the weights are updated after every single example (a sketch follows below)

  Input: training examples
  Initialize the weights at random
  Iterate T times:
      for each training example:
          compute the values of the hidden nodes
          compute the values of the output nodes
          compute the error of this example
          update the weights between output and hidden
          update the weights between hidden and input
  Output: optimized weights
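A from-scratch sketch of the interactive (per-example) loop above, assuming sigmoid units and squared error; the XOR data, sizes, and learning rate are my own illustration, and convergence depends on the random initialization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0], [1], [1], [0]], float)        # XOR targets

rng = np.random.default_rng(1)
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)   # input -> hidden
W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)   # hidden -> output
eta = 0.5                                        # learning rate in (0, 1)

for epoch in range(5000):                        # iterate T times
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x + b1)                 # hidden node values
        o = sigmoid(W2 @ h + b2)                 # output node values
        delta_o = (t - o) * o * (1 - o)          # output error term
        delta_h = h * (1 - h) * (W2.T @ delta_o) # back-propagated hidden term
        W2 += eta * np.outer(delta_o, h); b2 += eta * delta_o
        W1 += eta * np.outer(delta_h, x); b1 += eta * delta_h

# Outputs should approach [0 1 1 0]; more epochs or another seed may help.
print(np.round(sigmoid(W2 @ sigmoid(W1 @ X.T + b1[:, None]) + b2[:, None]), 2))
```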

Learning Issues
- Optimization of the architecture
  - #Input nodes <- dimension of the input
  - #Output nodes
    - Binary classification: one node
    - Multi-class classification: #Classes
    - Univariate regression: one node
    - Multivariate regression: #Output variables
  - #Hidden nodes: ?
- Validation set
  - A set of training examples separated from the given training examples
  - Reduces the number of training examples left for training
  - Used for parameter optimization (e.g., the number of learning epochs)
- Falling into local minima
  - Gradient descent keeps reducing the error by following the descent direction
  - Once it reaches a minimum, it stops moving, even if that minimum is only local

Learning Issues
- Parameter settings
  - Learning rate: arbitrary, between 0 and 1
    - Close to 1: fast learning, but fluctuation
    - Close to 0: slow learning, but stability
  - #Hidden nodes
    - Many nodes: more time for learning, overfitting
    - Few nodes: less time for learning, underfitting
  - Training iterations
    - Too many: overfitting
    - Too few: underfitting

Learning Issues
- Validation set (a sketch follows below)
[Figure: the given examples are split into a training set and a test set; the training set (for training the MLP) is further split into a training set and a validation set, while the test set, with target labels hidden during training, is for evaluating performance]
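One common way to use the validation set for parameter optimization is to keep training and pick the epoch count at which validation error bottoms out. A sketch, assuming scikit-learn is available; the toy data and the warm_start trick for epoch-by-epoch fitting are my own choices.

```python
import warnings
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore")                # silence per-epoch convergence warnings

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(int)          # toy nonlinearly separable labels

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1, warm_start=True,
                    random_state=0)              # warm_start: each fit() adds one epoch
best_epochs, best_err = 0, 1.0
for epoch in range(1, 201):
    mlp.fit(X_tr, y_tr)                          # train on the training split only
    err = 1.0 - mlp.score(X_val, y_val)          # error on the held-out validation split
    if err < best_err:
        best_err, best_epochs = err, epoch
print("epoch count chosen on the validation set:", best_epochs)
```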

Learning Issues
- Falling into local minima
[Figure: error E against a weight w, with gradient descent trapped in a local minimum]

Learning Issues
- Other issues of the MLP
  - Obtaining training examples
  - No evidence for a given answer: the trained network cannot explain its output
  - Slow learning
  - Large input dimension in applications to real problems

Summary and Further Discussions
- Summary
  - Multiple perceptrons (the MLP) as the solution to the limits of the perceptron
  - Architecture of the MLP
  - Learning process of the MLP
  - Learning issues of the MLP

Summary and Further Discussions
- Virtual training examples (a sketch follows below)
  - A solution to an insufficient number of training examples
  - Further training examples are derived from the given ones
  - Original training examples: actual ones, labeled with their target outputs from the start
  - Derived training examples: virtual ones, initially without target outputs
  - Their target outputs are assigned by the generalization of the trained MLP
  - Training then uses the actual and the virtual examples together
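A sketch of this procedure, assuming scikit-learn; the slides do not say how virtual examples are derived, so small Gaussian perturbations of the actual inputs stand in for that step here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                     # few actual examples
y = (X[:, 0] + X[:, 2] > 0).astype(int)          # with known target outputs

mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
mlp.fit(X, y)                                    # train on the actual ones first

X_virtual = X + rng.normal(scale=0.1, size=X.shape)  # derived (virtual) inputs
y_virtual = mlp.predict(X_virtual)               # targets via the MLP's generalization

mlp.fit(np.vstack([X, X_virtual]),               # retrain on actual + virtual
        np.concatenate([y, y_virtual]))
```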

Summary and Further Discussions
- Co-learning (a sketch follows below)
  - Two MLPs: MLP 1 and MLP 2
  - Training examples: labeled + unlabeled
  - MLP 1 is trained on the labeled ones
  - MLP 2 is trained on the labeled ones
  - MLP 1 labels the unlabeled training examples by its own generalization -> Set 1
  - MLP 2 labels the unlabeled training examples by its own generalization -> Set 2
  - MLP 1 is then trained on Labeled + Set 2
  - MLP 2 is then trained on Labeled + Set 1
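A sketch of one round of this exchange, assuming scikit-learn; the toy data and the two architectures are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(40, 4))
y_lab = (X_lab[:, 0] > 0).astype(int)            # labeled examples
X_unl = rng.normal(size=(200, 4))                # unlabeled examples

mlp1 = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=1)
mlp2 = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=2)
mlp1.fit(X_lab, y_lab)                           # MLP 1 trained on labeled ones
mlp2.fit(X_lab, y_lab)                           # MLP 2 trained on labeled ones

set1 = mlp1.predict(X_unl)                       # Set 1: MLP 1's labels
set2 = mlp2.predict(X_unl)                       # Set 2: MLP 2's labels

# Each network is retrained on the labeled data plus the *other* one's labels.
mlp1.fit(np.vstack([X_lab, X_unl]), np.concatenate([y_lab, set2]))
mlp2.fit(np.vstack([X_lab, X_unl]), np.concatenate([y_lab, set1]))
```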

Summary and Further Discussions
- Evolutionary neural networks (a sketch follows below)
  - Optimize the weights by evolutionary computation instead of gradient descent
  - Evolutionary computations: genetic algorithms, genetic programming, evolution strategies, evolutionary programming
  - Avoids falling into local minima
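A minimal sketch of evolving the weights of a small MLP instead of running gradient descent; the mutation-and-selection loop is in the spirit of an evolution strategy, and the XOR task, population size, and mutation scale are all chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([0, 1, 1, 0], float)                # XOR targets

def error(w):
    # Squared error of a 2-3-1 MLP whose 13 weights are packed into w.
    W1, b1 = w[:6].reshape(3, 2), w[6:9]
    W2, b2 = w[9:12].reshape(1, 3), w[12:]
    h = sigmoid(X @ W1.T + b1)
    o = sigmoid(h @ W2.T + b2).ravel()
    return np.sum((T - o) ** 2)

rng = np.random.default_rng(0)
pop = rng.normal(size=(30, 13))                  # population of weight vectors
for gen in range(300):
    scores = np.array([error(w) for w in pop])
    parents = pop[np.argsort(scores)[:10]]       # select the fittest third
    children = parents[rng.integers(0, 10, 20)] + rng.normal(scale=0.3, size=(20, 13))
    pop = np.vstack([parents, children])         # survivors + mutated offspring

print("best squared error:", min(error(w) for w in pop))
```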

Summary and Further Discussions
- Application of the MLP to time series prediction (a sketch follows below)
  - Series: X(1), X(2), X(3), ..., X(T)
  - Training examples (d: temporal window size):
    - X(1), ..., X(d) -> X(d+1)
    - X(2), ..., X(d+1) -> X(d+2)
    - ...
    - X(T-d), ..., X(T-1) -> X(T)
  - Test example: X(T-d+1), ..., X(T) -> X(T+1)
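A sketch of building these windowed examples, with a toy sine series standing in for X(1), ..., X(T).

```python
import numpy as np

series = np.sin(np.arange(100) / 5.0)            # X(1) ... X(T), T = 100
d = 4                                            # temporal window size

# Each row of X holds d consecutive values; y holds the value that follows.
X = np.array([series[t:t + d] for t in range(len(series) - d)])
y = series[d:]
test_input = series[-d:]                         # X(T-d+1) ... X(T) -> predict X(T+1)
print(X.shape, y.shape, test_input.shape)        # (96, 4) (96,) (4,)
```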
