Designing Numerical Solvers for Next Generation High Performance Computing

Researcher: Mark Mawson

Supervisor(s): Dr Alistair Revell
Sponsor: EPSRC
Start Date: October 2009 End Date: July 2012
Keywords: HPC, CUDA, CELL, Linear Solver, Multigrid

Overall Research Aim

High Performance Computing is moving towards massive scales of parallelism. The shift in hardware towards large-scale on-chip parallelism requires the re-writing of solvers for various CFD problems. The aim of this project is to write solvers for a range of CFD models that take advantage of this increased parallelism. Hardware platforms likely to be used include GPUs from NVIDIA and ATI, and Intel's MIC hardware.

Research Progress

GPU Technology

GPUs differ from traditional CPUs in having far more 'processing cores' at the expense of cache memory. Cache is very fast (and therefore expensive) memory located on the CPU itself (known as being 'on chip'), as opposed to RAM, which sits on the motherboard. Access to cache memory can be one or two orders of magnitude faster than access to RAM, depending on the computer. In typical desktop computing, cache is important because a user may have several applications open at any one time. Large caches store data for several programs, allowing the CPU to switch between them rapidly enough to give the illusion that the programs are running simultaneously. The resulting memory access patterns are highly randomised, since user input affects them.

In High Performance Computing, access to memory can be better defined: since we are not switching between various programs, we can predict how data will move through the computer. In this case large, fast caches are no longer as important. GPU computing takes advantage of this; the chip space and cost that would be used for cache are instead used for more processing cores.
Fig 1: Overview of CPU and GPU architectures

Previous Work

V-Cycle Multigrid

The Multigrid method solves linear systems of the type Ax=b for the unknown vector x, given the vector b and the matrix A, where x can represent a discretised multi-dimensional space. Iterative smoothing methods damp high-frequency errors in x well, but resolve lower-frequency errors much more slowly. The Multigrid method addresses this by repeatedly coarsening the discretised space (the grid/mesh) of x, which increases the frequency of the errors relative to the grid spacing. The errors at each level of coarseness are smoothed and then summed together (with additional smoothing to remove errors introduced when interpolating back to finer grids) to produce an error correction encapsulating all the frequencies in the grid space.
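The coarsen/smooth/correct cycle described above can be sketched for the 1-D Poisson problem -u'' = f. This is an illustrative sketch only, not the project's solver: a weighted-Jacobi smoother stands in for the actual smoother, and all function names are hypothetical:

```python
import numpy as np

def smooth(u, f, h, iters=3):
    # Weighted Jacobi (omega = 2/3) for -u'' = f on a uniform grid.
    for _ in range(iters):
        u[1:-1] += (2.0 / 3.0) * (0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]) - u[1:-1])
    return u

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def restrict(r):
    # Full-weighting restriction: coarse point j sits on fine point 2j.
    rc = np.zeros((len(r) - 1) // 2 + 1)
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    return rc

def prolong(ec, n_fine):
    # Linear interpolation of the coarse-grid correction to the fine grid.
    e = np.zeros(n_fine + 1)
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return e

def v_cycle(u, f, h):
    n = len(u) - 1
    u = smooth(u, f, h)                                   # pre-smooth
    if n > 2:
        r = residual(u, f, h)                             # fine-grid residual
        ec = v_cycle(np.zeros(n // 2 + 1), restrict(r), 2.0 * h)
        u += prolong(ec, n)                               # coarse-grid correction
    u = smooth(u, f, h)                                   # post-smooth
    return u

# Manufactured problem: u = sin(pi x) solves -u'' = pi^2 sin(pi x).
n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi ** 2 * np.sin(np.pi * x)
u = np.zeros(n + 1)
for _ in range(10):
    u = v_cycle(u, f, h)
err = np.max(np.abs(u - np.sin(np.pi * x)))
```

After a few cycles the remaining error is dominated by the discretisation error of the finite-difference scheme, not the iteration, which is the behaviour that makes multigrid attractive as a solver.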

Red-Black Gauss Seidel

The Red-Black Gauss-Seidel Relaxation method will be used as the smoother for the multigrid solver. Preliminary results show a speedup of approximately 8-10x when running the smoother on a GPU compared with a single traditional CPU.
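The red-black colouring is what makes Gauss-Seidel suitable for the GPU: each point depends only on points of the opposite colour, so an entire colour can be updated simultaneously (one GPU thread per node). A serial NumPy stand-in for that parallel update, applied to the 2-D Poisson problem, is sketched below; the names and problem setup are illustrative, not the author's code:

```python
import numpy as np

def rb_gauss_seidel(u, f, h, sweeps):
    # Red-black Gauss-Seidel for the 2-D Poisson problem -lap(u) = f.
    # Interior nodes are coloured like a chessboard by the parity of i + j;
    # the boolean mask mimics updating a whole colour in parallel.
    interior = np.zeros(u.shape, dtype=bool)
    interior[1:-1, 1:-1] = True
    I, J = np.indices(u.shape)
    for _ in range(sweeps):
        for colour in (0, 1):
            m = interior & ((I + J) % 2 == colour)
            north = np.roll(u, 1, axis=0)    # u[i-1, j]
            south = np.roll(u, -1, axis=0)   # u[i+1, j]
            west = np.roll(u, 1, axis=1)     # u[i, j-1]
            east = np.roll(u, -1, axis=1)    # u[i, j+1]
            u[m] = 0.25 * (north[m] + south[m] + west[m] + east[m]
                           + h * h * f[m])
    return u

# Manufactured problem: u = sin(pi x) sin(pi y) solves -lap(u) = 2 pi^2 u.
n = 16
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
X, Y = np.meshgrid(x, x, indexing="ij")
exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
f = 2.0 * np.pi ** 2 * exact
u = rb_gauss_seidel(np.zeros_like(f), f, h, sweeps=300)
err = np.max(np.abs(u - exact))
```

The second colour's sweep re-reads the freshly updated first colour, preserving the Gauss-Seidel character while keeping every update within a colour independent, which is exactly the property a GPU kernel exploits.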

Fig 2: Preliminary Results for Red Black Smoother on GPU

Full V-Cycle Performance

The V-Cycle Multigrid implementation on C1060 'Tesla' and C2050 'Fermi' GPUs was compared against an implementation running on an AMD Opteron 2360 for the Laplace problem with grid sizes up to 4097x4097 nodes. The C1060 GPU out-performed the Opteron CPU by up to 12x, and the C2050 out-performed it by up to 24x.

Fig 3: Execution times for Fermi and Tesla GPUs vs AMD Opteron 2360

Last Modification: r10 - 2011-01-01 - 13:28:38 - MarkMawson


Computational Fluid Dynamics and Turbulence Mechanics
@ the University of Manchester
Copyright © by the contributing authors. Unless noted otherwise, all material on this web site is the property of the contributing authors.