1. Early history of CNNs

Convolutional neural networks (CNNs):
- K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980.
- Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551, 1989.
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.

CNN extensions in the deep-learning era:
- A. Krizhevsky, I. Sutskever, G. E. Hinton, "ImageNet classification with deep convolutional neural networks," NIPS 2012.
- Y. Jia et al., "Caffe: Convolutional Architecture for Fast Feature Embedding," ACM MM 2014.
- K. Simonyan, A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, "Going deeper with convolutions," CVPR 2015 (also arXiv:1409.4842, 2014).

Convolution: an example

Convolution: formalization

Why convolution?
1. Sparse interactions (sparse connectivity)
- The kernel is smaller than the input, so each output unit connects to only a few inputs: far fewer connections, easier learning, lower computational cost.
- Fully connecting m input nodes to n output nodes costs O(mn); restricting each output to k (k << m) inputs reduces this to O(kn).
- Hierarchical receptive fields (biologically inspired): neurons in higher layers have larger receptive fields.

2. Parameter sharing (tied weights)
- The same kernel is reused at every position, which further shrinks the number of parameters dramatically.

3. Equivariant representations
- Convolution is equivariant to translation; combined with pooling, this yields (approximate) translation invariance.
- It does not have this property for scale or rotation.

Basic structure of a CNN
A stage consists of three steps:
- Convolution (the pre-synaptic activation, "net")
- Nonlinear activation (the detector)
- Pooling
There are two ways to define a "layer": the complex definition (a whole stage is one layer) and the simple definition (each step is its own layer). Note that some layers have no parameters.

Pooling
- Definition (no learnable parameters): pooling "replaces the output of the net at a certain location with a summary statistic of the nearby outputs."
- Types: max pooling, (weighted) average pooling.

Why pooling?
- It buys invariance to small translations: what matters is whether a feature is present, not exactly where it is.
- This encodes a strong prior: "the function the layer learns must be invariant to small translations."
- Rotation invariance? Use several kernels (templates) at different orientations, e.g. 9 kernels; max pooling over their responses then keeps only the strongest match, whatever the orientation. [Figure: two examples of the 9 kernels' response values; in each, the maximum response identifies the detected orientation.]

Pooling combined with downsampling
- Better translation invariance.
- Higher computational efficiency (fewer neurons in the next layer).

From full connectivity to limited connectivity
- Part of the connection weights are forced to 0; typically only connections between neighboring neurons are kept.
- This is a special case of a fully connected network in which most of the weights are 0.

Why convolution & pooling?
- A prior is "a prior probability distribution over the parameters of a model that encodes our beliefs about what models are reasonable, before we have seen any data."
- In other words (cf. "no free lunch"): before seeing any data, our beliefs and experience tell us which model parameters are plausible.
- Local connections, invariance to translation, and tied weights are priors inspired by biological visual systems.

Origins: Neocognitron (1980)
- Simple and complex cells; lower-order features feeding higher-order ones.
- Local connection.
- Training: layer-wise self-organization; competitive learning (unsupervised), with the output layer trained independently (supervised).
- K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980.

LeCun's CNN (1989) for character recognition
- Simplified the Neocognitron architecture.
- Training: supervised, with backpropagation; tanh activation (converges faster than the sigmoid), sigmoid loss, SGD.
- Applied to zip code recognition and widely deployed.

Architecture:
- Input: 16x16 image.
- L1 (H1): 12 kernels of 5x5, 8x8 neurons per map.
- L2 (H2): 12 kernels of 5x5x8, 4x4 neurons per map.
- L3 (H3): 30 neurons.
- L4 (output): 10 neurons.
- Connections: 5x5x12x64 + 5x5x8x12x16 + 192x30 = 63,360 for the first three layers; with the output layer and biases, roughly 65,000 in total.
- Tied weights: within one feature map, the kernel is identical at every position!
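The three ideas above (sparse interactions, tied weights, and pooling for small-translation invariance) can be sketched in a few lines of NumPy. This is a minimal illustration; the function names and the toy image are my own, not from the slides:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation: one small kernel (tied weights) slides
    over every position, so each output touches only k*k inputs (sparse
    interactions) and all positions share the same parameters."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling: keep the strongest response in each
    size x size window (a summary statistic of nearby outputs)."""
    H2, W2 = x.shape[0] // size, x.shape[1] // size
    return x[:H2 * size, :W2 * size].reshape(H2, size, W2, size).max(axis=(1, 3))

# A bright 2x2 blob, and the same blob shifted right by one pixel.
img = np.zeros((8, 8))
img[2:4, 2:4] = 1.0
shifted = np.roll(img, 1, axis=1)

kernel = np.ones((3, 3)) / 9.0  # a simple blob detector
a = max_pool(conv2d_valid(img, kernel), size=3)
b = max_pool(conv2d_valid(shifted, kernel), size=3)
print(np.allclose(a.max(), b.max()))  # -> True
```

Shifting the blob by one pixel changes the convolution map, but the pooled maximum stays the same: exactly the "the feature is present, never mind exactly where" behavior described above.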
LeNet (1998) for digit/character recognition
- LeNet-5.
- Feature map: "a set of units whose weights are constrained to be identical."
- Example: the number of parameters in layer C3 is (3x6 + 4x9 + 6x1) x 25 + 16 = 1,516.

Follow-up: CNNs for object detection and recognition.

AlexNet for ImageNet (2012)
- A large-scale CNN: 650K neurons, 60M parameters.
- Used a battery of tricks: dropout, data augmentation, ReLU, local response normalization, contrast normalization, ...
- ReLU activation function.
- Implementation: trained across 2 GPU cards; layer sizes: input 150,528, then 253,440, 186,624, 64,896, 64,896, 43,264, 4096, 4096, 1000.
- The ImageNet classification task: 1,000 classes, 1,431,167 images.

Results (top-5 error rates):

Rank  Team         Top-5 error  Description
1     U. Toronto   0.153        Deep learning
2     U. Tokyo     0.261        Hand-crafted features and learning models; bottleneck
3     U. Oxford    0.270
4     Xerox/INRIA  0.271

Krizhevsky, A., Sutskever, I., and Hinton, G. E., "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, 2012.
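The connection and parameter counts quoted above can be checked with a few lines of arithmetic (a sketch; the variable names are my own):

```python
# Connection counts for LeCun's 1989 zip-code network (biases and the
# output layer not counted, matching the slide's formula).
h1 = 5 * 5 * 12 * (8 * 8)      # 12 maps of 5x5 kernels, 8x8 neurons per map
h2 = 5 * 5 * 8 * 12 * (4 * 4)  # 12 maps of 5x5x8 kernels, 4x4 neurons per map
h3 = 192 * 30                  # 12*4*4 = 192 units fully connected to 30
print(h1 + h2 + h3)            # -> 63360

# LeNet-5 layer C3: 6 maps each see 3 of S2's maps, 9 maps see 4, and
# 1 map sees all 6; each connection carries a 5x5 kernel, plus 16 biases.
c3 = (3 * 6 + 4 * 9 + 6 * 1) * 25 + 16
print(c3)                      # -> 1516
```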