# Computational Tools for Rapid and Reliable Development of High Performance Applications

### Location: ENSEEIHT

- Morning and afternoon: course in room A005

### Organisers

- Dr. L. A. Drummond and Dr. Osni A. Marques

  Computational Research Division

  Lawrence Berkeley National Laboratory

  Berkeley, California, USA

  {LADrummond, OAMarques}@lbl.gov

- Dr. Jose E. Roman

  Institute for the Applications of Advanced Information and Communication Technologies (ITACA)

  Technical University of Valencia

  Valencia, Spain

- Dr. Sameer Shende

  Department of Computer and Information Science

  The University of Oregon

  Eugene, Oregon, USA

### Motivation

This short course introduces a set of advanced computational software tools that support the development of high performance applications. The lectures will focus on the selection, installation and use of scalable and robust software tools. Functionalities implemented in these tools include numerical algorithms for the solution of large computational problems, performance monitoring and profiling, and automatic tuning. Participants should expect to learn about techniques used to solve common computational problems and to monitor application performance. Participants are encouraged to bring laptop computers and follow the live demonstrations hands-on. The software presented here is freely available and widely used by the international computational sciences community.

### Intended Audience

- Application developers, striving for best application performance on HPC systems
- HPC support staff who work with customers on a daily basis
- System managers and administrators, responsible for operational aspects of HPC systems
- HPC decision makers interested in tool innovations and usability
- Anyone interested in programming tool environments and application tuning

### Duration

Full day

### Level of Presentations

The presentations assume a general understanding of High Performance Computing (HPC) applications and parallel programming. The level of difficulty of the presented topics can be characterized as follows: Introductory: 60%, Intermediate: 25%, Advanced: 15%.

### Course Description

The main goal of this tutorial is to introduce users to a variety of software solutions that assist computational scientists in developing high-performance programs using robust and scalable libraries, while understanding their performance. Thus, the tutorial covers scalable implementations of numerical algorithms, automatic optimization, and performance monitoring and profiling tools. All tools to be presented in this tutorial are part of the United States Department of Energy Advanced Computational Software Collection, which offers a wide range of robust and high-performance computational services.

For numerical algorithms, we look at direct and iterative solutions of linear and non-linear systems of equations, and at general eigenvalue problems for systems leading to dense and sparse matrices. Attendees will not only learn about the use of the tools, but will also be provided with interactive examples that will help them with the selection of tools and interfaces, installation of tools, and automatic optimization of libraries. Tools to be presented here include ATLAS, Hypre, PETSc, ScaLAPACK, SLEPc, and SuperLU.

The tutorial will also cover performance data collection, analysis, and optimization. To help attendees evaluate the performance of their parallel scientific applications, we will demonstrate TAU, PAPI, KOJAK, and Vampir. After describing and demonstrating how performance data (both profile and trace data) can be collected in a straightforward manner using the automated instrumentation of TAU (Tuning and Analysis Utilities), the tutorial will cover how to analyze the collected performance data, drill down to find performance bottlenecks, and determine their causes. The tutorial will include sample codes that illustrate the different instrumentation and measurement choices available to users. Topics will include generating performance profiles and traces with hardware performance counter data using PAPI. Trace-based visualization will be covered using the Vampir and VampirServer tools for scalable visualization of event traces.

### Outline of the Course

- General Introduction and Motivation to The ACTS Collection
- Solving Linear Systems of Equations
  - Using Direct Techniques
    - Dense Cases: Introduction to ScaLAPACK
    - Sparse Cases: SuperLU
    - Demonstration on How to Use ScaLAPACK and SuperLU
  - Using Iterative Techniques
    - Introduction to PETSc
    - Introduction to Hypre
    - Demonstration on How to Use PETSc and Hypre
- Solving Eigenvalue Problems
  - Dense Case: Introduction to ScaLAPACK
  - Sparse Case: Introduction to SLEPc
- Performance Monitoring and Tuning
  - Automatic Library Tuning: Using ATLAS
  - Performance Monitoring and Profiling: Using TAU

### References

- An Overview of the Advanced CompuTational Software (ACTS) Collection, L. A. Drummond and O. A. Marques, ACM TOMS, 31:282-301, 2005.
- The TAU Parallel Performance System, S. Shende and A. Malony, IJHPCA, 20(2), 2006.
- SLEPc: A Scalable and Flexible Toolkit for the Solution of Eigenvalue Problems, V. Hernandez, J. Roman, and V. Vidal, ACM TOMS, 31(3):351-362, Sep. 2005.
- http://acts.nersc.gov/
- http://acts.nersc.gov/MatApps
- http://acts.nersc.gov/events/Workshop2007
- http://www.grycap.upv.es/slepc/

### Related Courses Presented in International Conferences and World Leading Institutions

- Institute of Mathematics and its Applications (2002)
- Eight yearly workshops on the ACTS Collection organized at the Lawrence Berkeley National Laboratory (2000 through 2007)
- Tutorials on the ACTS Collection at the San Diego Supercomputer Center (2000)
- LACSI - Los Alamos - Course on the ACTS Collection (2001)
- Copper Mountain Meeting on Iterative Methods (2002)
- SIAM PP04 - San Francisco
- SIAM CSE05 - Orlando
- Pan-American Studies Institute, Honduras (2004)
- Pan-American Studies Institute, Oaxaca, Mexico (2006)
- VECPAR 2006, Rio de Janeiro (2006)
- EPSA, Porto, Portugal (2007)

NOTE: The tutorial proposed here is an improved version of the workshop in Rio de Janeiro in 2006, as we have added performance monitoring and tuning tools based on feedback from users.

### Biographical Sketches of the Presenters

#### Dr. Leroy Anthony Drummond

L. A. Drummond (Tony) is currently a researcher in the Scientific Computing Group in the Computational Research Division of the Lawrence Berkeley National Laboratory (LBNL). He is the co-Principal Investigator of the DOE's Advanced CompuTational Software (ACTS) Collection project and continues to work on iterative methods. Prior to LBNL, Tony was a post-doctoral fellow in the Atmospheric Sciences Department at the University of California, Los Angeles (UCLA), where his work included the optimization of a global atmospheric circulation numerical model, the development of a coupler tool to support data exchanges between the different model components of the UCLA Earth System Model, and the development of the Virtual World Data Server for visualizing large datasets that arise in Earth System computer simulations.

Tony's research interests include distributed computing for large scientific applications, computational aspects of numerical modeling, and linear algebra. Today he continues to work on block iterative methods, numerical linear algebra, and the coupling of multi-physics codes. He is the lead developer of PyACTS, a Python interface to the ACTS Collection, and of the Distributed Coupling Toolkit (DCT).

#### Dr. Osni A. Marques

Osni Marques is a member of the Scientific Computing Group of the High Performance Computing Research Department at the Lawrence Berkeley National Laboratory (LBNL). Currently, he is the PI of the Advanced CompuTational Software (ACTS) Collection project, funded by the Mathematical, Information, and Computational Sciences (MICS) Division of the US Department of Energy (DOE). The ACTS Collection (http://acts.nersc.gov) comprises a set of software tools, developed mostly at DOE laboratories and universities, that can simplify the solution of common and important computational problems. The project aims to improve the usability, accessibility and acceptance of ACTS tools, and to make them more widely used and more effective in solving DOE's and the nation's scientific problems.

Osni's research interests include the study, implementation and testing of algorithms for the solution of problems in numerical linear algebra, in particular eigenvalue problems, and high-end scientific computing. The eigensolvers he has implemented have been used in applications related to protein motions, acoustics problems in automobile design, and structural analyses. He has collaborated with LBNL's Earth Sciences Division on applications that require the computation of singular values and singular vectors of large, sparse matrices for the solution of inverse problems in geophysics. Currently, he is a collaborator in a DOE-funded project for the study of electronic properties of 3D, million-atom semiconductor nanostructures. Osni also holds a research position at the UC Berkeley Computer Sciences Dept., where he works in the framework of the LAPACK and ScaLAPACK projects.