Next-Generation Network Computing: Cloud Computing (下一代网络计算-云计算.ppt)

Uploaded by: 无敌 | Document ID: 303370 | Uploaded: 2018-03-26 | Format: PPT | Pages: 110 | Size: 5.48 MB

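Next-Generation Network Computing: Cloud Computing
  Dr. Zhai Yanlong (翟岩龙)
  School of Computer Science, Beijing Institute of Technology
  中教 1034, 2011.09.20

What people have always wanted from computing resources
  Powerful
  Scalable
  Accessible
  Reliable

Outline
  Cloud computing overview
  Cloud computing architecture
  Key cloud technology: virtualization
  Key cloud technology: MapReduce
  Cloud computing at BIT (Beijing Institute of Technology)

Gartner Report: Top 10 Strategic Technology Areas for 2009
  Virtualization
  Cloud Computing
  Servers: Beyond Blades
  Web-Oriented Architectures
  Enterprise Mashups
  Specialized Systems
  Social Software and Social Networking
  Unified Communications
  Business Intelligence
  Green Information Technology

Gartner Report: Top 10 Strategic Technology Areas for 2010
  Cloud Computing
  Advanced Analytics
  Client Computing
  IT for Green
  Reshaping the Data Center
  Social Computing
  Security Activity Monitoring
  Flash Memory
  Virtualization for Availability
  Mobile Applications

Two examples
  The New York Times rented Amazon's cloud computing service and used the open-source software Hadoop to convert 11 million articles published since 1851 into searchable digital documents. The job took only one day; with traditional methods it might have taken months.
  Cloud computing can even give you access to 10 trillion operations per second, enough computing power to simulate nuclear explosions and to forecast climate change and market trends.

Cloud Computing 4 People
  The cloud as a data center (transparent synchronization across devices)
    PC / laptop
    Client browser
    PDA / phone / camera
    Digital photo albums
    CRM
  The cloud as a computing center
    Edit photos right after taking them
    Write documents and reports online
    Keep a journal anytime, anywhere
    Monitor your health anytime, anywhere
  Keywords: data, devices, anywhere, shared

Enterprise cloud computing
  Enterprise cloud computing from Salesforce

Network speed, then and now
  1986: China's first email, sent at 560 bps
  Network speeds today

Origin and evolution of cloud computing
  Growing demand for computing power
  Parallel computing, distributed computing, and grid computing

Parallel Computing
  Parallel computing is the simultaneous use of multiple computing resources to solve a computational problem. Its main purpose is to solve large, complex problems quickly.
  Characteristic: the computation is dispatched to multiple processing units inside one system, for example the multiple CPUs and memories of a mainframe.
  Properties of a problem suited to parallel computing:
    The work can be broken into discrete parts that are solved at the same time.
    Multiple program instructions can execute at any moment (several threads run at once).
    Solving it with multiple computing resources takes less time than with a single one.

Distributed Computing
  Distributed computing is a field of computer science that studies how to split a problem needing enormous computing power into many small parts, assign those parts to many computers, and combine the partial results into the final result.
  Characteristic: the computation is dispatched to multiple independent machines on a network.
  Advantages:
    Scarce resources can be shared.
    The computing load can be balanced across many computers.
    A program can be placed on the computer best suited to run it.

Some popular distributed computing projects
  SETI@home: searching for extraterrestrial civilizations
  RC5-72: cryptanalysis and code breaking, in search of the most secure cryptosystems
  Folding@home: studying protein folding, misfolding, aggregation, and the diseases they cause
  Rosetta@home: a protein folding project that predicts and designs protein structures
  United Devices: searching for effective anti-cancer drugs
  GIMPS: searching for the largest Mersenne prime (a hard mathematical problem)

Grid Computing
  A grid uses the Internet to connect geographically distributed resources (computing, storage, bandwidth, software, data, information, knowledge, and so on) into one logical whole that behaves like a single supercomputer and offers users integrated information and application services (computing, storage, access, and so on).
  Grid computing is a form of distributed computing, an encapsulation of it.

What is cloud computing?
  Cloud computing is:
    No software to install
    Access everywhere over the Internet
    Power: large-scale data processing
    Appeal for startups
    Cost efficiency
    Software as a Service
  Cons:
    Security
    Data lock-in
  Forms: SaaS, PaaS, Utility Computing

What cloud computing "is not"
  It is not network computing
    Applications and data are not confined to any specific company's server
    No VPN access
    Encompasses multiple companies, multiple servers, and multiple networks
  It is not traditional outsourcing
    Not a contract to host data with a third-party hosting business
    No subcontracting of computing services to a specific outside firm

So what exactly is cloud computing?
  A style of computing where massively scalable IT-enabled capabilities are provided as a service over the network.
    Acquisition model: service based
    Business model: usage based
    Access model: network
    Technical model: dynamic

Cloud computing: definitions
  Cloud computing is a commercial service that offers users unlimited computing resources, a system platform that manages its own computing resources, and a software architecture whose application services are customized on demand and easy to extend.
  Cloud computing is an Internet-based style of computing in which shared hardware, software, and information are provided on demand to computers and other devices.

Related concept: the cloud
  A cloud is a pool of virtual computing resources that maintains and manages itself, usually a set of large server clusters including compute servers, storage servers, and bandwidth. Cloud computing pools all computing resources and lets software manage them automatically, with no human intervention. Application providers no longer have to deal with tedious details and can focus on their own business, which helps innovation and lowers cost.

The evolution of computing resources: from centralized to distributed and back to centralized
  "The whole world needs only five computers." (Thomas Watson)
  "640K of memory is enough for an individual user." (Bill Gates)
  "The network is the computer." (John Gage)
  The computing era, the network era, the cloud era

Attributes of Cloud Computing
  Data stored on the cloud
  Software and services on the cloud: access via web browser
  Based on standards and protocols: Linux, AJAX, LAMP, etc.
  Accessible from any device
  Hardware centric, software centric, service centric

Characteristics of cloud computing
  Very large scale: server farms
  Virtualization: the resources can be seen as one cloud used for computing
  High reliability: redundant replicas, load balancing
  Generality: supports an endless variety of real applications
  High scalability: flexible, dynamic scaling
  On-demand service: buy what you need
  Extremely cheap: no need for a one-time purchase of a supercomputer
  Security: freedom from data loss and virus intrusion
  Convenience: multi-device support and data sharing

Obstacles to cloud computing (current difficulties: when will the cloud come down to earth?)
  No unified standards
    The platforms of Google, Amazon, IBM, Microsoft, and others are mutually incompatible: the "cloud computing" battle.
  Is the data really safe?
    The reputation of the cloud provider
    Back doors?!
    Exposure to hackers all over the world
    A strong security system is required
  Network bandwidth
    3G is not yet widespread and is very expensive
  Huge power consumption
    Energy saving and emission reduction are the order of the day
    The battery capacity of end-user devices is limited

Main forms of cloud computing (service types)
  Infrastructure as a Service (IaaS)
  Software as a Service (SaaS)
  Web services
  Platform as a Service (PaaS)
  Managed Service Providers (MSP)
  Business service platforms
  Cloud security

Cloud Computing Framework

Infrastructure as a Service (utility computing, virtualization)
  IaaS (Infrastructure as a Service) creates virtual computing and data centers for the IT industry, so that computing units, storage, I/O devices, bandwidth, and other computer infrastructure can be pooled into one virtual resource pool that serves the whole network.
  Pay only for what you use.
  Amazon Web Services (AWS)
    Elastic Compute Cloud (EC2): computing
    Simple Storage Service (S3): storage
  Rackspace
  Eucalyptus
  Google

What are the benefits and challenges of IaaS?
  Benefits
    Systems managed by SLA should equate to fewer breaches
    Higher return on assets through higher utilization
    Reduced cost driven by:
      Less hardware
      Less floor space from a smaller hardware footprint
      Higher level of automation from fewer administrators
      Lower power consumption
    Able to match consumption to demand
  Challenges
    Portability of applications
    Maturity of systems management tools
    Integration across the cloud boundary
    Extension of internal security models

Software as a Service
  SaaS (Software as a Service) is an application model that delivers software services over the Internet.
    Software rental: users pay by usage time and usage scale
    Zero-install deployment: nothing to install, it runs as soon as the browser is opened
    No additional server hardware required
    Software (application services) customized on demand
  SaaS products: Salesforce CRM, 阿里软件, Google Apps
  Alexa ranking:
    1. Salesforce
    2. 阿里软件
    3. 铭万
    4. 金算盘
    5. 中企动力
    6. 神码在线
    7. 商务领航
    8. 友商网
    9. 八百客
    10.

What are the benefits and challenges of SaaS?
  Benefits
    Speed
    Reduced up-front cost, potential for reduced lifetime cost
    Transfer of some or all support obligations
    Elimination of licensing risk
    Elimination of version compatibility issues
    Reduced hardware footprint
  Challenges
    Extension of the security model to the provider (data privacy and ownership)
    Governance and billing management
    Synchronization of client and vendor migrations
    Integrated end-user support
    Scalability
  Strong governance is required to prevent lines of business from purchasing application services externally without IT involvement.

Platform as a Service
  PaaS (Platform as a Service) is a business model that offers a server platform or a development environment as a service.
  From custom systems to PaaS: with 800app, any enterprise management software, including CRM, OA, HR, SCM, and purchase-sales-inventory management, can be developed without any programming.

What are the benefits and challenges of PaaS?
  Benefits
    Pay-as-you-go for development, test, and production environments
    Enables developers to focus on application code
    Instant global platform
    Elimination of hardware dependencies and capacity concerns
    Inherent scalability
    Simplified deployment model
  Challenges
    Governance
    Tie-in to the vendor
    Extension of the security model to the provider
    Connectivity
    Reliance on third-party SLAs
  Strong governance is required to prevent lines of business from building applications without IT involvement.

Solutions and vendors are emerging daily
  [Vendor landscape diagram; the mapping of vendors to categories is not recoverable from the extracted text.]
  Categories: External IaaS; Internal IaaS; Utility Systems Management Tools+; Utility Application Development; Software as a Service (SaaS); Platform as a Service
  Vendor groups as extracted:
    Data Synapse, Univa UD, Elastra Cloud Server, 3tera App Logic
    VMware, IBM Tivoli, Cassatt, Parallels
    HP/EDS (TBD), IBM Blue Cloud, Sun Grid, Joyent
    Google Apps, Zoho Office, Workday, Microsoft Office Live
    Amazon E2C, S F, Google App Engine, Coghead
    HP Adaptive Infrastructure as a Service
    Oracle On Demand Apps, NetSuite ERP, S SFA
    Etelos, LongJump, Boomi, Microsoft Azure*
    Xen, Zuora, Aria Systems, eVapt
    IBM WebSphere XD, BEA WebLogic Server VE, Mule
    Rackspace, Jamcracker

Role of OS

Traditional servers
  Web server: Windows, IIS
  App server: Linux, GlassFish
  DB server: Linux, MySQL
  Email: Windows, Exchange

Virtual servers
  A Virtual Machine Monitor (VMM) layer sits between the guest OS and the hardware.

Virtualization

History of virtualization
  1965  IBM M44/44X: paging system
  1965  IBM System/360-67: virtual memory hardware
  1967  IBM CP-40 (January) and CP-67 (April): time-sharing
  1972  IBM VM/370: run VM under VM
  1997  Connectix: first version of Virtual PC
  1998  VMware: U.S. Patent 6,397,242
  1999  VMware Virtual Platform for the Intel IA-32 architecture
  2000  IBM z/VM
  2001  Connectix Virtual PC for Windows
  2003  Microsoft acquired Connectix
  2003  EMC acquired VMware
  2003  VERITAS acquired Ejascent
  2005  HP Integrity Virtual Machines
  2005  Intel VT
  2006  AMD-V
  2005  Xen
  2006  VMware Server
  2006  Virtual PC 2006
  2006  HP IVM Version 2.0
  2006  Virtual Iron 3.1
  2007  InnoTek VirtualBox
  2007  KVM in Linux kernel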

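  2007  Xen in Linux kernel

Virtualization
  Virtualization: the ability to run multiple operating systems on a single physical system and share the underlying hardware resources.
  Problems it addresses:
    Low utilization metrics in servers across the organization
    Too many servers for too little work
    High costs and infrastructure needs: maintenance, leases, networking, floor space, cooling, power, disaster recovery
    Heterogeneous environments

Virtualization technology: four key properties

Live migration
  Reduces planned downtime by dynamically moving applications from one server to another.
  Copes with changing workloads and business needs by letting you move workloads from heavily loaded servers to servers with spare capacity.
  Reduces energy consumption by letting you consolidate workloads and switch off unused servers.

Definitions
  A hypervisor (or VMM, Virtual Machine Monitor) is a software layer that allows several virtual machines to run on one physical machine.
  The physical OS and hardware are called the host.
  The virtual machine OS and applications are called the guest.

Virtualization architectures
  Bare-metal architecture
  Hosted architecture

Types of virtualization
  Full virtualization
  Paravirtualization
  Operating-system-level virtualization

Full virtualization
  The guest OS kernel does not need to be modified. The guest domain does not know it is running on a hypervisor.
  Advantage: very good compatibility, since an operating system can be installed on the virtual server unchanged.
  Main drawback: the hypervisor adds processor overhead.

Paravirtualization
  The guest OS kernel must be modified so that it can cooperate with the hypervisor. When a guest domain is a paravirtualized virtual machine, its kernel has been modified and it knows it is not running on real hardware.
  Advantage: its speed is almost as good as that of a server that has not been virtualized.
  Drawback: poor compatibility. It only works with certain open-source operating systems such as BSD, Linux, and Solaris, not with proprietary operating systems such as Windows.

Operating-system-level virtualization
  There is no separate hypervisor layer. The host operating system itself allocates hardware resources among the virtual servers and keeps them isolated from one another.
  Drawback: all virtual servers must run the same operating system (although each instance has its own applications and user accounts), so flexibility is limited.
  Advantage: near-native performance. Because the architecture uses a single, standard operating system across all virtual servers, it is easier to manage than a heterogeneous environment.

KVM

Xen 3.0
  Available from Xen Source (http:/)
  In association with the University of Cambridge (http://www.cl.cam.ac.uk/Research/SRG/netos/xen/)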

  Support for 64-bit and 32-bit machines
  Supports Intel VT
  Linux support only; Windows expected later this year
  Open-source product, one of the most actively maintained projects in the open-source community
  Cost: free
  Target: 100 virtual OSes per machine

Xen Architecture
  Domain 0
  Domain U
  Hypervisor

Massive information processing
  How big a hard disk do you need?

How much data?
  The Internet Archive has 2 PB of data and grows by 20 TB/month
  Google processes 20 PB a day (2008)
  "All words ever spoken by human beings": 5 EB
  CERN's LHC will generate 10-15 PB a year
  Sanger anticipates 6 PB of data in 2009
  "640K ought to be enough for anybody."

What is MapReduce?
  A data-parallel programming model for clusters of commodity machines
  Pioneered by Google
    Processes 20 PB of data per day
  Popularized by the open-source Hadoop project
    Used by Yahoo!, Facebook, Amazon, and others

What is MapReduce used for?
  At Google:
    Index building for Google Search
    Article clustering for Google News
    Statistical machine translation
  At Yahoo!:
    Index building for Yahoo! Search
    Spam detection for Yahoo! Mail
  At Facebook:
    Data mining
    Ad optimization
    Spam detection
  Example: Facebook Lexicon
  In research:
    Analyzing Wikipedia conflicts (PARC)
    Natural language processing (CMU)
    Bioinformatics (Maryland)
    Particle physics (Nebraska)
    Ocean climate simulation (Washington)

MapReduce Goals
  Scalability to large data volumes:
    Scanning 100 TB on 1 node at 50 MB/s takes 24 days
    Scanning on a 1000-node cluster takes 35 minutes
  Cost-efficiency:
    Commodity nodes (cheap, but unreliable)
    Commodity network
    Automatic fault tolerance (fewer admins)
    Easy to use (fewer programmers)

Typical Hadoop Cluster
  40 nodes per rack, 1000-4000 nodes per cluster
  1 Gbps bandwidth within a rack, 8 Gbps out of the rack
  Node specs (Facebook): 8 cores, 16 GB RAM, 8 x 1.5 TB disks, no RAID

Google Server

Challenges
  Cheap nodes fail, especially if you have many
    Mean time between failures (MTBF) for 1 node = 3 years
    MTBF for 1000 nodes = 1 day
    Solution: build fault tolerance into the system
  Commodity network = low bandwidth
    Solution: push computation to the data
  Programming distributed systems is hard
    Solution: users write data-parallel "map" and "reduce" functions; the system handles work distribution and failures

Hadoop Components
  Distributed file system (HDFS)
    Single namespace for the entire cluster
    Replicates data 3x for fault tolerance
  MapReduce framework
    Runs jobs submitted by users
    Manages work distribution and fault tolerance
    Colocated with the file system

Hadoop Distributed File System
  Files are split into 64 MB blocks
  Blocks are replicated across several datanodes (usually 3)
  The namenode stores metadata (file names, locations, etc.)
  Optimized for large files and sequential reads
  Files are append-only
  [Figure: File1 is split into blocks 1, 2, 3, 4, tracked by the Namenode. Four Datanodes hold the replicas: (1, 2, 4), (2, 1, 3), (1, 4, 3), (3, 2, 4).]

MapReduce
  [Figure: "Work" is partitioned into w1, w2, w3; three workers produce r1, r2, r3; the partial results are combined into "Result".]

MapReduce Programming Model
  Data type: key-value records
  Map function: (Kin, Vin) -> list(Kinter, Vinter)
  Reduce function: (Kinter, list(Vinter)) -> list(Kout, Vout)

Parallel/Distributed Computing Programming Model
  Input split, shuffle, output

MapReduce Programming Model
  Read input: data in key/value record format
  Map: extract something from each record
    map(in_key, in_value) -> list(out_key, intermediate_value)
    Processes an input key/value pair
    Emits intermediate key/value pairs
  Shuffle: exchange the data
    Gathers the intermediate results that share a key onto the same node
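The map and shuffle steps described so far can be sketched in a few lines of Python. This is a single-process illustration under stated assumptions, not Hadoop's API: `map_phase`, `partition_for`, `shuffle`, and `emit_words` are hypothetical names, and CRC32-based partitioning is chosen here only because it sends equal keys to the same reducer deterministically.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, int]


def map_phase(records: Iterable[Tuple[str, str]],
              map_fn: Callable[[str, str], Iterable[Pair]]) -> List[Pair]:
    """Apply the user's map function to every (in_key, in_value) record."""
    intermediate: List[Pair] = []
    for in_key, in_value in records:
        intermediate.extend(map_fn(in_key, in_value))
    return intermediate


def partition_for(key: str, num_reducers: int) -> int:
    """Choose a reducer index for a key (assumption: CRC32 modulo reducer count)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs: Iterable[Pair], num_reducers: int) -> List[Dict[str, List[int]]]:
    """Group intermediate values by key inside each reducer's partition."""
    partitions: List[Dict[str, List[int]]] = [
        defaultdict(list) for _ in range(num_reducers)
    ]
    for key, value in pairs:
        partitions[partition_for(key, num_reducers)][key].append(value)
    return [dict(p) for p in partitions]


def emit_words(url: str, contents: str) -> Iterable[Pair]:
    """Example map function: one (word, 1) pair per word occurrence."""
    for word in contents.split():
        yield (word, 1)


if __name__ == "__main__":
    docs = [("doc1", "the cloud the grid"), ("doc2", "the cloud")]
    for index, part in enumerate(shuffle(map_phase(docs, emit_words), 2)):
        print(index, part)
```

Each key ends up in exactly one partition, so a reducer that later processes that partition sees all values for its keys. A real cluster would move these pairs across the network instead of building them in memory.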

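MapReduce Programming Model (continued)
  Reduce: aggregate, summarize, filter, etc.
    reduce(out_key, list(intermediate_value)) -> list(out_value)
    Merges all values for one key and computes over them
    Emits the combined result (usually just one)
  Write output

Word Frequencies in Web Pages
  Input: one document per record
  The user implements the map function, whose input is
    key = document URL
    value = document contents
  map emits (potentially many) key/value pairs: one record for every word occurrence in the document

Example continued:
  The MapReduce runtime system (library) gathers all records with the same key (shuffle/sort)
  The user implements the reduce function, which sums the values for one key
  Reduce emits the output

MapReduce Runtime System

Example: Word Count
    def mapper(line):
        for word in line.split():
            output(word, 1)

    def reducer(key, values):
        output(key, sum(values))

Word Count Execution
  Input, Map, Shuffle & Sort, Reduce, Output

Fault Tolerance in MapReduce
  1. If a task crashes:
    Retry it on another node
      OK for a map because it has no dependencies
      OK for a reduce because map outputs are on disk
    If the same task repeatedly fails, fail the job or ignore that input block
  2. If a node crashes:
    Relaunch its current tasks on other nodes
    Relaunch any maps the node previously ran
      Necessary because their output files were lost along with the crashed node
  3. If a task is going slowly (straggler):
    Launch a second copy of the task on another node
    Take the output of whichever copy finishes first, and kill the other one
    Critical for performance in large clusters ("everything that can go wrong will")

Elastic MapReduce UI

Outline
  Cloud computing overview
  Cloud computing architecture
  Massive information processing and MapReduce
  Cloud computing at BIT

Platform description
  Goals: make full use of high-performance physical resources and raise resource utilization; give scientific research a good platform; give complex, large-scale applications a high-performance infrastructure; build a cloud computing platform aimed at universities.
  For teaching: virtual host rental, cloud storage, cloud-hosted software, and similar services
  For research: a high-performance computing platform, elastic computing resources, virtual storage, and similar services

Platform characteristics and technical features
  Characteristics: large scale, virtualization, high reliability, generality, high scalability, on-demand service, extremely low cost
  Technical features:
    Supports VMware, KVM, and Xen
    Provides a powerful Web 2.0-based web management interface
    Provides programming interfaces for applications at different layers
    Resource and system monitoring
    Dynamic resource scheduling
    End-to-end quality of service and cloud management

Platform logical architecture

BIT Cloud

Platform framework design
  Physical resource virtualization
  Infrastructure-as-a-Service design
  Platform-as-a-Service design: a Hadoop-based high-performance distributed parallel computing platform
  Software-as-a-Service design

Platform application scenarios
  IaaS-layer applications
    Elastic computing virtual hosts
    Virtual host service for teaching environments
    Cloud storage
    High-performance streaming media on demand
  PaaS-layer applications
    High-performance distributed parallel computing
    Massive information processing
    Pattern extraction and analysis
    Large-scale scene rendering
    Compute-intensive applications from other disciplines
  Science-and-education cloud, SaaS-layer applications
    Email, course selection, digital library, information search, academic affairs management, financial management, distance education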
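The PaaS layer above is described as a Hadoop-based parallel computing platform. As a rough illustration of the kind of job such a platform runs, the word-count pseudocode from the MapReduce section can be turned into a runnable single-process sketch. The driver `run_job` is a hypothetical stand-in for the framework: it sorts and groups in memory where a real cluster would shuffle data between nodes, and it is not Hadoop's API.

```python
from itertools import groupby
from operator import itemgetter
from typing import Dict, Iterable, Iterator, Tuple


def mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every word in one input line."""
    for word in line.split():
        yield (word, 1)


def reducer(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Sum all counts collected for one word."""
    return (key, sum(values))


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Hypothetical in-memory driver: map, then shuffle and sort, then reduce."""
    # Map: apply the mapper to every input record.
    intermediate = [pair for line in lines for pair in mapper(line)]
    # Shuffle and sort: bring pairs with the same key next to each other.
    intermediate.sort(key=itemgetter(0))
    # Reduce: call the reducer once per distinct key.
    result: Dict[str, int] = {}
    for key, group in groupby(intermediate, key=itemgetter(0)):
        word, total = reducer(key, (count for _, count in group))
        result[word] = total
    return result


if __name__ == "__main__":
    print(run_job(["the quick brown fox", "the lazy dog", "the fox"]))
```

The mapper and reducer carry all the application logic. Everything in `run_job` is what the framework would take over, including distribution and failure handling on a real cluster.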
