Below are the Summer Scholarship projects that were on offer in previous years.
Integer factorisation is a classic problem in number theory, and its difficulty forms the basis for encryption schemes that facilitate secure e-commerce. While factorisation is trivial for small integers, the problem becomes much harder for extremely large numbers (e.g. over 100 digits). The multiple polynomial quadratic sieve factoring algorithm, implemented on a powerful computer, makes it feasible to factorise such large numbers. The algorithm is highly parallelisable, which makes it an ideal candidate for implementation on HPC architectures. The challenge for this project is therefore to implement this algorithm on the GPU and to evaluate its performance. Working C code for CPUs will be provided as a starting point.
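As a rough illustration of the idea that quadratic sieve methods build on, the sketch below searches for a congruence of squares, x^2 ≡ y^2 (mod n), and then takes gcd(x - y, n) to obtain a factor. It is a toy brute-force search for small n only, not the multiple polynomial quadratic sieve and not the C code supplied with the project; the function name and the `span` parameter are hypothetical.

```python
from itertools import combinations
from math import gcd, isqrt


def toy_congruence_factor(n, span=50):
    """Return a non-trivial factor of n, or None if this toy search finds none."""
    start = isqrt(n) + 1
    # Each candidate x gives a residue q = x^2 - n, so x^2 ≡ q (mod n).
    candidates = [(x, x * x - n) for x in range(start, start + span)]
    for (x1, q1), (x2, q2) in combinations(candidates, 2):
        product = q1 * q2
        root = isqrt(product)
        if root * root != product:
            continue  # the product of the two residues is not a perfect square
        # Now (x1 * x2)^2 ≡ root^2 (mod n): a congruence of squares.
        x = (x1 * x2) % n
        factor = gcd(x - root, n)
        if 1 < factor < n:
            return factor
    return None


# 41^2 ≡ 32 and 43^2 ≡ 200 (mod 1649); 32 * 200 = 80^2, which yields the factor 17.
print(toy_congruence_factor(1649))  # prints 17
```

A real sieve replaces the pairwise search with sieving for smooth residues and a linear algebra step over many relations; the sieving over independent candidates is the part that lends itself to parallel hardware.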
General Matrix Multiply (GEMM) is a matrix multiplication subroutine in the Basic Linear Algebra Subprograms (BLAS) that is often optimised for HPC architectures. It has a significant impact on performance as it is the foundation of many other subroutines, scientific codes and the LINPACK benchmark. As hybrid CPU-GPU architectures become more prevalent in HPC, Massimiliano Fatica¹ developed a host library that intercepts double-precision (DGEMM) calls and distributes the workload to be executed on both CPUs and GPUs simultaneously. The results showed that improved LINPACK performance is readily achievable on hybrid CPU-GPU architectures.
Following on from that work, the goal of this project is to develop a more general-purpose version of the host library that will handle single-precision (SGEMM) and complex double-precision (ZGEMM) operations as well as DGEMM. It should also ensure that the best performance is obtained regardless of matrix dimension and form. As the [SDZ]GEMM subroutines are widely used, this project has the potential to benefit a large community.
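One plausible way for a host library to share a single GEMM call between two devices is to split the columns of B, compute each partial product separately, and join the results. The sketch below assumes that scheme purely for illustration; it is not taken from the cited library, it runs both parts sequentially in plain Python, and `split_gemm` and `gpu_fraction` are hypothetical names.

```python
def matmul(a, b):
    """Plain triple-loop product of two row-major list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split_gemm(a, b, gpu_fraction=0.75):
    """Compute A*B by splitting the columns of B into two shares."""
    cols = len(b[0])
    cut = round(cols * gpu_fraction)  # columns [0, cut) form the "GPU" share
    b_gpu = [row[:cut] for row in b]
    b_cpu = [row[cut:] for row in b]
    # In a real library these two products would run concurrently on
    # different devices; here they simply run one after the other.
    c_gpu = matmul(a, b_gpu)
    c_cpu = matmul(a, b_cpu)
    # Stitch the column blocks of C back together, row by row.
    return [left + right for left, right in zip(c_gpu, c_cpu)]


a = [[1, 2], [3, 4]]
b = [[5, 6, 7, 8], [9, 10, 11, 12]]
print(split_gemm(a, b) == matmul(a, b))  # prints True
```

Choosing the split ratio so that both devices finish at about the same time is where matrix dimension and shape start to matter.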
¹ Fatica, M. (2009) Accelerating Linpack with CUDA on heterogeneous clusters. ACM ICPS 383: 46-51.
EXCITON is an electronic structure code under development in the School of Physics, Trinity College Dublin and ICHEC. Its purpose is the computation of excitations in solids using state-of-the-art methods such as the GW approximation and the Bethe-Salpeter equation. It is written in C and is parallelised with MPI. We are investigating how the numerically intensive parts of EXCITON can be accelerated on Graphics Processing Unit (GPU) architectures. This project will involve porting EXCITON to GPUs using CUDA.
Many bioinformatics software tools follow the single instruction, multiple data (SIMD) paradigm. Hence GPGPU has the potential to provide relatively inexpensive, scalable solutions to increasingly data-intensive problems in biology. A number of algorithms have already been implemented on the GPU (e.g. Smith-Waterman, BLASTP), most of which report significant speed-ups over CPU implementations.
The goal of this project is to deploy some of the existing bioinformatics GPGPU codes on ICHEC hardware and to assess their behaviour, applicability and usability. In collaboration with the Molecular Evolution and Bioinformatics Unit at NUI Maynooth, there are opportunities to carry out real scientific analyses and to enable previously infeasible computations. The codes will cover a range of areas including biological sequence comparisons, molecular phylogenetics and high-throughput DNA sequencing.
GIPAW is a module within the Quantum ESPRESSO distribution that models NMR experiments (e.g. chemical shifts) from first principles. Until now, NMR modelling has been applied to relatively simple systems of at most tens of atoms in the unit cell, and GIPAW currently scales to a few hundred cores at most. In order to make GIPAW useful in biomedical and industrial applications, we need to extend its range to systems of hundreds to thousands of atoms, and to extend its scalability to thousands of cores. The aim of this project is to improve the parallelism inside GIPAW by further distributing large arrays across processors and by adding a further parallelisation level over electronic states.
Methods for Computational Fluid Dynamics (CFD) have become increasingly important in many aspects of engineering. Traditionally, methods for CFD have been mesh-based, i.e. the computational nodes are interconnected and fixed in space. Mesh-free methods for CFD are a relatively recent development. These methods offer greater flexibility than traditional mesh-based approaches because the computational nodes can move with the fluid and have no pre-defined connectivity.
Code for mesh-free CFD has been developed in Mechanical and Biomedical Engineering at NUI Galway. This project will involve the parallelisation of certain elements of this mesh-free CFD code and will be conducted in collaboration with the research group of Dr. Nathan Quinlan at NUI Galway.
Nested models are increasingly used in climate and weather, with a regional model running within a larger global model. Incorrectly nested models can have issues due to artificial noise at the model boundaries, or incorrectly filtered dynamics that remove the desired signals in the model.
The aim of the project is to investigate and implement techniques for displaying the results of two nested models within VTK or ParaView, enabling users to distinguish the different datasets and to investigate potential issues due to resolution, artificial noise or filtering within the nested system.
The input file syntax required for running HPC jobs, while reasonably straightforward and deterministic, can be confusing for new users. Similarly, debugging and scaling work can require modifications of normal production jobs, which can be error-prone if it is not a day-to-day activity. The aim of this project is to develop a series of 'Wizard' based interfaces, probably web-based, which can be used to generate job submission and related files in a user-friendly manner. The generated files could then be used directly or as a basis for further customisation.
One method commonly used for large-scale parameter studies and other ideally-parallel workloads is so-called taskfarming. ICHEC's current taskfarm utility is light-weight and adaptable, but it assumes that users have a good understanding of the system load and run-times of their jobs. A more sophisticated taskfarming approach could automatically harvest and log performance data, producing a summary, both textual and graphical, of the properties of a given run. This could help users to readily spot problematic tasks or inefficiencies.
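A minimal sketch of the kind of harvesting described above: each task is timed as it runs, and tasks that take far longer than the median are flagged. It assumes tasks are plain Python callables run in a thread pool, which is not how ICHEC's taskfarm utility works; `run_farm`, `summarise` and the `slow_factor` threshold are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def timed(task_id, func):
    """Run one task and return its id together with its wall-clock time."""
    start = time.perf_counter()
    func()
    return task_id, time.perf_counter() - start


def run_farm(tasks, workers=4):
    """Run {task_id: callable} and return {task_id: elapsed_seconds}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(timed, tid, fn) for tid, fn in tasks.items()]
        return dict(f.result() for f in futures)


def summarise(durations, slow_factor=3.0):
    """Return the ids of tasks slower than slow_factor times the median."""
    typical = median(durations.values())
    return sorted(tid for tid, secs in durations.items()
                  if secs > slow_factor * typical)


durations = run_farm({"quick": lambda: time.sleep(0.01),
                      "also_quick": lambda: time.sleep(0.01)})
for tid, secs in sorted(durations.items()):
    print(f"{tid}: {secs:.3f} s")
print("slow tasks:", summarise(durations))
```

The same per-task records could feed a graphical summary, for example a histogram of run-times per node.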
There are a large number of tools for error checking and profiling code, such as Marmot, Vampir, Scalasca, gprof, Valgrind and Lint. They each have their strengths and are often complementary, and we could better exploit these and similar tools. The aim of this project would be to explore the notion of creating a "wizard" which, given some basic information, could help users to configure some or all of these tools for use with their code in one step. In some cases the full testing process might be amenable to automation; in others it may only be practical to generate the required job scripts etc. Even this would eliminate an error-prone process. In tandem with this, technical-report-style documentation tailored to the ICHEC environment would be produced for the tools concerned.
Chapel is a new and very interesting language designed by Cray. The language is not yet ready for production use; however, porting existing applications should be possible if they can be made to work without third-party libraries. It is proposed that a code be identified from amongst those used by the ICHEC userbase, ideally open source, which would be practical to port to Chapel. This may prove useful as a benchmark application for the Chapel community in the future. In the short term it would highlight the differences in the language and act as a measure of how tractable porting existing code would be, were Chapel to be fully developed as a first-tier HPC language.
Chapel is a new and very interesting language designed by Cray. The language is not yet ready for production use; however, it has been designed from the ground up with parallel programming in mind. It incorporates a notion called a "distribution", which describes how data is mapped from an index space to an endpoint (locale) without specifying exact details of data indices. The aim of this project is to select and implement one or more non-trivial distributions that are relevant to the HPC community.
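To make the idea of a distribution concrete, the sketch below expresses three common index-to-locale mappings (block, cyclic and block-cyclic) as plain functions. It is written in Python rather than Chapel, does not use Chapel's distribution interface, and does not necessarily partition indices exactly as Chapel's standard distributions do; all function names are hypothetical.

```python
def block_locale(index, n, num_locales):
    """Block: each locale owns one contiguous chunk of the n indices."""
    chunk = -(-n // num_locales)  # ceiling division
    return index // chunk


def cyclic_locale(index, num_locales):
    """Cyclic: indices are dealt out to locales round-robin."""
    return index % num_locales


def block_cyclic_locale(index, block_size, num_locales):
    """Block-cyclic: fixed-size blocks are dealt out round-robin."""
    return (index // block_size) % num_locales


print([block_locale(i, 10, 4) for i in range(10)])       # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3]
print([cyclic_locale(i, 4) for i in range(10)])          # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
print([block_cyclic_locale(i, 2, 3) for i in range(8)])  # [0, 0, 1, 1, 2, 2, 0, 0]
```

A non-trivial distribution would replace these one-line rules with a mapping suited to a particular data structure, while code that iterates over the index space stays unchanged.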
The aim of the project is to develop a light-weight MPI profiling library that differentiates between local communication (intra-node or nearby nodes) and distant communication (> 1 hop) within the 3D torus of BlueGene systems. The library should use the standard PMPI profiling interface and be as portable as possible. It involves instrumenting, in C, all the point-to-point MPI functions (the send family MPI_Send, MPI_Isend, MPI_Ssend and MPI_Bsend, plus MPI_Recv and MPI_Irecv) and communicator creation (to keep track of the actual MPI ranks). It is also necessary to identify a suitable trace file format and to develop a tool to parse and analyse the trace files in a high-level programming language (e.g. Python). The impact of process placement on the torus will be assessed on systems such as BG/L, BG/P and Cray XT4 (HECToR, if the library is portable).
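The analysis side of such a tool might classify each message by its hop count on the torus. The sketch below assumes the coordinates of every rank are already known and that a trace has been reduced to (source rank, destination rank, bytes) records; that record layout and all the function names are hypothetical, since choosing the trace format is part of the project.

```python
def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b on a torus with wrap-around links."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def classify(src, dst, dims):
    """Label a message 'local' (at most 1 hop) or 'distant' (more than 1 hop)."""
    return "local" if torus_hops(src, dst, dims) <= 1 else "distant"


def summarise_trace(records, coords, dims):
    """Total bytes sent per class for (src_rank, dst_rank, nbytes) records."""
    totals = {"local": 0, "distant": 0}
    for src, dst, nbytes in records:
        totals[classify(coords[src], coords[dst], dims)] += nbytes
    return totals


dims = (4, 4, 4)
coords = {0: (0, 0, 0), 1: (1, 0, 0), 2: (3, 0, 0), 3: (2, 2, 0)}
records = [(0, 1, 100), (0, 2, 50), (0, 3, 8)]
# Rank 2 is one hop from rank 0 through the wrap-around link; rank 3 is four hops away.
print(summarise_trace(records, coords, dims))  # {'local': 150, 'distant': 8}
```

Comparing these totals across different process placements gives a simple measure of how much traffic a placement keeps local.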
Data assimilation involves accurate re-analysis, estimation and prediction of an unknown, true state by merging observed information into a model. This issue arises in all scientific areas that enjoy a profusion of data. The problem is fundamental yet challenging, as it does not naturally afford a clean solution. Data assimilation is particularly important in weather forecasting and hydrology. The aim of the project is to review the literature for existing data assimilation techniques applicable to weather forecasting. The simplest forms of the existing algorithms will be implemented and their computational behaviour will be analysed.
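As one candidate for a "simplest form", the sketch below merges a single model value (the background) with a single observation, weighting each by the inverse of its error variance. This is the scalar case of the standard Kalman or optimal interpolation update; the project description does not name a specific algorithm, so the choice and the function name are assumptions.

```python
def assimilate(background, background_var, observation, observation_var):
    """Return the analysis value and its error variance for one scalar state."""
    # The gain is the weight given to the observation: close to 1 when the
    # model is much less certain than the observation, close to 0 otherwise.
    gain = background_var / (background_var + observation_var)
    analysis = background + gain * (observation - background)
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Equal uncertainties: the analysis lies halfway and the variance is halved.
print(assimilate(10.0, 4.0, 12.0, 4.0))  # (11.0, 2.0)
```

Operational schemes apply the same update to state vectors with millions of components, where the gain becomes a large matrix and the computational cost is dominated by linear algebra.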
The Climate Data Operators (CDO) toolset is used extensively in the climate community for manipulating climate datasets, e.g. converting from one format to another and computing averages, minima, maxima, etc. across fields. This project will involve the optimisation and parallelisation of the CDO code (e.g. it should be possible to load large fields in parallel and to parallelise summary tasks).
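A summary task of this kind parallelises naturally as a reduction: each worker computes partial statistics for its chunk of the field, and the partial results are merged. The sketch below shows the pattern on a flat, non-empty list of values using a thread pool; it is not CDO code, and `field_summary` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partial_stats(chunk):
    """Per-chunk (min, max, sum, count); a mean is recovered after merging."""
    return min(chunk), max(chunk), sum(chunk), len(chunk)


def merge(a, b):
    """Combine the partial statistics of two chunks."""
    return min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3]


def field_summary(values, chunk_size=4, workers=4):
    """Min, max and mean of a non-empty list, computed chunk by chunk."""
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, chunks))
    lo, hi, total, count = reduce(merge, partials)
    return {"min": lo, "max": hi, "mean": total / count}


print(field_summary([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0]))
# {'min': 1.0, 'max': 9.0, 'mean': 4.0}
```

Carrying the sum and count rather than per-chunk means keeps the merged mean correct when chunks differ in size.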
Biological data, relationships and networks can often be represented as graphs, which can be manipulated and analysed using various algorithms. However, large graphs (e.g. those with hundreds of millions of nodes or more) pose a significant challenge for conventional algorithms. The project will involve the implementation of existing parallel algorithms, or the parallelisation of sequential algorithms, to solve relevant graph-based problems in bioinformatics.
The Lotka-Volterra equations are of practical interest as they are frequently used to describe the dynamics of biological systems. In this project the student will:
For further information regarding the Summer Scholarships, please contact our Education & Training Coordinator.