Automatic Parallelization of GPU Applications Using OpenCL

Solano Quinde, Lizandro Damián

dc.contributor.author	Solano Quinde, Lizandro Damián
dc.date.accessioned	2018-01-11T16:47:50Z
dc.date.available	2018-01-11T16:47:50Z
dc.date.issued	2015-07-14
dc.description.abstract	Graphics Processing Units (GPUs) have been successfully used to accelerate scientific applications, thanks to their computational power and to programming languages that make writing scientific applications for GPUs more approachable. However, because the GPU programming model requires offloading all data to GPU memory, the memory footprint of an application is limited by the size of that memory. Multi-GPU systems can make memory-limited problems tractable by distributing the computation and data among the available GPUs. Applications written for single-GPU systems can be parallelized (i) at runtime, through an environment that captures memory operations and kernel calls and distributes them among the available GPUs, or (ii) at compile time, through a pre-compiler that transforms the application to decompose the data and computation among the available GPUs. In this paper we propose a framework and implement a tool that transforms an OpenCL application written for single-GPU systems into one that runs on multi-GPU systems. Based on data dependency and data usage analysis, the application is transformed to decompose data and computation among the available GPUs. To reduce data transfer overhead, computation-communication overlapping techniques are used. We tested our tool on two applications with different data transfer requirements: the application with no data transfer requirements achieves linear speedup, while for the application with data transfers, computation-communication overlapping reduces the communication overhead by 40%.
dc.description.city	Quito
dc.identifier.doi	10.1109/APCASE.2015.56
dc.identifier.isbn	9781479975884
dc.identifier.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-84959361463&doi=10.1109%2fAPCASE.2015.56&partnerID=40&md5=d7c419381a7c08bab3a2f634f29bc02c
dc.identifier.uri	http://dspace.ucuenca.edu.ec/handle/123456789/29244
dc.language.iso	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.source	Proceedings - 2015 Asia-Pacific Conference on Computer-Aided System Engineering, APCASE 2015
dc.subject	GPU
dc.subject	OpenCL
dc.subject	Program Transformation
dc.title	Automatic Parallelization of GPU Applications Using OpenCL
dc.type	Article
dc.ucuenca.afiliacion	Solano-Quinde, L.D., Department of Electrical, Electronic and Telecommunications Engineering, University of Cuenca, Ecuador; Ames Laboratory, U.S. Department of Energy, United States
dc.ucuenca.embargoend	2022-01-01 0:00
dc.ucuenca.idautor	0102428893
dc.ucuenca.indicebibliografico	SCOPUS
dc.ucuenca.nombrerevista	Asia-Pacific Conference on Computer-Aided System Engineering APCASE 2015
dc.ucuenca.numerocitaciones	1
dspace.entity.type	Publication
relation.isAuthorOfPublication	db82deb3-5465-4f62-b097-bf4beba1623a
relation.isAuthorOfPublication.latestForDiscovery	db82deb3-5465-4f62-b097-bf4beba1623a