Virtualized GPU computing platform in clustered system environment

YANG Jingwei; MA Kai; LONG Xiang

doi:10.13700/j.bh.1001-5965.2015.0731

Volume 42 Issue 11

Nov. 2016


Journal of Beijing University of Aeronautics and Astronautics > 2016 > 42(11): 2340-2348.

YANG Jingwei, MA Kai, LONG Xiang, et al. Virtualized GPU computing platform in clustered system environment[J]. Journal of Beijing University of Aeronautics and Astronautics, 2016, 42(11): 2340-2348. doi: 10.13700/j.bh.1001-5965.2015.0731 (in Chinese)


School of Computer Science and Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100083, China

Received Date: 09 Nov 2015
Rev Recd Date: 15 Jan 2016
Publish Date: 20 Nov 2016

Abstract


A virtualized GPU computing platform is proposed for clustered systems, in which often only some of the nodes are equipped with GPUs. All GPUs in the system are uniformly abstracted as virtualized GPUs in a shared resource pool. Legacy GPU programs run on the platform without any modification and can use any free virtualized GPU in the pool, which relieves programmers of the burden of MPI programming. The platform frees programs from being limited to the GPUs in the local node and lets them access any available GPU on other nodes, leading to higher system utilization and throughput. With pipelined communication, the run-time overhead and inter-node transmission latency of the platform are hidden by intra-node memory copying and GPU computing. Compared with non-pipelined communication, the total transmission latency is reduced by approximately 50%-70%, which gives performance comparable to intra-node local data transmission.
Keywords: GPU, MPI, CUDA, clustered systems, hardware acceleration, parallel computing, high performance computing
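
The latency-hiding idea in the abstract can be illustrated with a simple timing model. The sketch below is not the authors' implementation: it assumes the data is split into equal chunks and that each chunk passes through fixed-cost stages (for example network receive, host-to-device copy, and kernel execution). The function names and the per-chunk costs are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def sequential_latency(stage_times, num_chunks):
    """Total time when a chunk finishes every stage before the next chunk starts."""
    return num_chunks * sum(stage_times)


def pipelined_latency(stage_times, num_chunks):
    """Total time when the stages overlap across chunks (ideal linear pipeline).

    The first chunk pays the full sum of the stage times; each later chunk
    adds only the time of the slowest stage, because the other stages run
    concurrently with it.
    """
    return sum(stage_times) + (num_chunks - 1) * max(stage_times)


if __name__ == "__main__":
    # Hypothetical per-chunk costs in milliseconds (assumed, not measured):
    # inter-node transfer, intra-node memory copy, GPU computation.
    stages = [4.0, 3.0, 3.0]
    chunks = 16
    seq = sequential_latency(stages, chunks)
    pipe = pipelined_latency(stages, chunks)
    print(f"non-pipelined: {seq:.1f} ms, pipelined: {pipe:.1f} ms")
    print(f"reduction: {100 * (1 - pipe / seq):.1f}%")
```

In this idealized model the saving grows with the number of chunks and is largest when the stage costs are balanced, since only the slowest stage remains visible once the pipeline is full. Real systems add per-chunk overhead, so the chunk size is a trade-off.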


