Total thesis length: 23,263 characters
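Abstract (translated from the Chinese original)
Batch processing is common across many kinds of applications, such as mobile applications, e-commerce, and scientific computing. These applications typically carry out complex business logic through a series of related operations, each of which must process massive amounts of data or perform large-scale computation. To speed up this data processing and computation, many distributed computing platforms have been developed, for example Hadoop 1.0. To shorten computation time on these platforms, the data for each operation is split into many small blocks that are processed in parallel, with each task responsible for one block. An operation therefore consists of many parallel tasks, which together form one batch task. Because batch tasks are related to one another through complex business logic, more flexible distributed resource management platforms have been proposed, such as YARN and Mesos. YARN lets users choose among more computing models (for example Dryad) and design more flexible resource scheduling methods. Dryad uses directed acyclic graphs (DAGs) to describe complex business logic, such as how data flows between batch tasks. To gain elasticity, many enterprises and institutions deploy their applications on cloud computing resources. This thesis considers a YARN system built on cloud computing resources, called C-YARN, which adds to YARN the ability to rent resources from a cloud computing system. C-YARN therefore needs suitable resource provisioning and scheduling algorithms that help users rent the right number and types of resources from a public cloud, and YARN itself must be modified so that it can dynamically request and release public cloud resources. This work is limited to modifying the YARN platform and building a reasonable simulated public cloud platform, so that the resources users need are dynamically allocated from and returned to the public cloud, giving resource allocation its elasticity.
Keywords: batch processing, YARN, cloud computing, dynamic resource allocation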
Design and implementation of the elastic resource management component of C-YARN
Abstract
Batch processing is widely used in many kinds of applications, such as mobile applications, e-commerce, and scientific computing. These applications usually complete complex business logic through a series of related operations, and each operation has to handle huge amounts of data or large-scale computation. To speed up this data handling and computation, many distributed computing platforms have been developed, such as Hadoop 1.0. On these platforms, to shorten computation time, the data of each operation is divided into many small blocks that are processed in parallel, and each task is responsible for one block. An operation thus contains a number of parallel tasks, and these tasks together form one batch task. Because batch tasks are connected by complex business logic, more flexible distributed resource management platforms have been proposed, such as YARN and Mesos. The YARN platform allows users to select from more computing models (for example Dryad) and to design more flexible resource scheduling methods. Dryad uses directed acyclic graphs (DAGs) to describe complex business logic, such as the way data flows between batch tasks. To achieve elasticity, many enterprises and institutions deploy applications on cloud computing resources. This thesis considers a YARN system based on cloud computing resources, known as C-YARN, which extends YARN with the ability to rent resources from a cloud computing system. Suitable resource provisioning and scheduling algorithms must therefore be designed for C-YARN to help users rent an appropriate number and type of resources from a public cloud, and the YARN system must be modified to request and release public cloud resources dynamically. This work only considers modifying the YARN platform and building a reasonable simulated public cloud platform, so that the resources users need are dynamically allocated from and returned to the public cloud, which provides elasticity in the resource allocation process.
KEY WORDS: batch processing, YARN, cloud computing, dynamic resource allocation
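Contents
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
1.1 Foreword
1.2 State of Research
1.2.1 Evolution of the Hadoop Platform
1.2.2 Analysis of the Cloud Module
1.2.3 Current State of Integrating YARN with the Cloud Module
1.3 Research Content
1.4 Thesis Organization
Chapter 2 System Design
2.1 System Architecture
2.2 Core System Modules
2.2.1 YARN Platform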
2.2.2 Virtual Cloud Module
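2.3 Chapter Summary
Chapter 3 System Implementation
3.1 Setting Up the YARN Platform
3.1.1 Compiling Hadoop 2.2.0
3.1.2 Deploying Hadoop 2.2.0
3.2 Hadoop 2.2.0 Code Analysis
3.2.1 Building an Eclipse Project from hadoop2.2.0-src
3.2.2 Hadoop 2.2.0 Source Code and the YARN Workflow
3.3 Building the Simulated Cloud Platform
3.3.1 Web Service Based on the XFire Framework
3.3.2 SWT Graphical Interface
3.4 Results
3.4.1 YARN System Run Results
3.4.2 Cloud Module Run Results
3.5 Chapter Summary
Chapter 4 Conclusion and Outlook
4.1 Thesis Summary
4.2 Future Work
Acknowledgments
References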
Chapter 1 Introduction
1.1 Foreword