Book description
Programming Massively Parallel Processors: A Hands-on Approach, Third Edition shows students and professionals alike the basic concepts of parallel programming and GPU architecture, exploring in detail various techniques for constructing parallel programs.
Case studies demonstrate the development process, detailing computational thinking and ending with effective and efficient parallel programs. Performance, floating-point format, parallel patterns, and dynamic parallelism are covered in depth.
For this new edition, the authors have updated their coverage of CUDA, including newer libraries such as cuDNN; moved content that has become less important to appendices; added two new chapters on parallel patterns; and updated case studies to reflect current industry practices.
- Teaches computational thinking and problem-solving techniques that facilitate high-performance parallel computing
- Utilizes CUDA version 7.5, NVIDIA's software development tool created specifically for massively parallel environments
- Contains new and updated case studies
- Includes coverage of newer libraries, such as cuDNN for deep learning
Table of contents
- Cover image
- Title page
- Table of Contents
- Copyright
- Dedication
- Preface
- Acknowledgements
- Chapter 1. Introduction
- Chapter 2. Data parallel computing
- Chapter 3. Scalable parallel execution
- Chapter 4. Memory and data locality
- Chapter 5. Performance considerations
- Chapter 6. Numerical considerations
- Chapter 7. Parallel patterns: convolution: An introduction to stencil computation
- Chapter 8. Parallel patterns: prefix sum: An introduction to work efficiency in parallel algorithms
  - Abstract
  - 8.1 Background
  - 8.2 A Simple Parallel Scan
  - 8.3 Speed and Work Efficiency
  - 8.4 A More Work-Efficient Parallel Scan
  - 8.5 An Even More Work-Efficient Parallel Scan
  - 8.6 Hierarchical Parallel Scan for Arbitrary-Length Inputs
  - 8.7 Single-Pass Scan for Memory Access Efficiency
  - 8.8 Summary
  - 8.9 Exercises
  - References
- Chapter 9. Parallel patterns—parallel histogram computation: An introduction to atomic operations and privatization
- Chapter 10. Parallel patterns: sparse matrix computation: An introduction to data compression and regularization
- Chapter 11. Parallel patterns: merge sort: An introduction to tiling with dynamic input data identification
- Chapter 12. Parallel patterns: graph search
- Chapter 13. CUDA dynamic parallelism
- Chapter 14. Application case study—non-Cartesian magnetic resonance imaging: An introduction to statistical estimation methods
- Chapter 15. Application case study—molecular visualization and analysis
- Chapter 16. Application case study—machine learning
- Chapter 17. Parallel programming and computational thinking
  - Abstract
  - 17.1 Goals of Parallel Computing
  - 17.2 Problem Decomposition
  - 17.3 Algorithm Selection
  - 17.4 Computational Thinking
  - 17.5 Single Program, Multiple Data, Shared Memory and Locality
  - 17.6 Strategies for Computational Thinking
  - 17.7 A Hypothetical Example: Sodium Map of the Brain
  - 17.8 Summary
  - References
- Chapter 18. Programming a heterogeneous computing cluster
  - Abstract
  - 18.1 Background
  - 18.2 A Running Example
  - 18.3 Message Passing Interface Basics
  - 18.4 Message Passing Interface Point-to-Point Communication
  - 18.5 Overlapping Computation and Communication
  - 18.6 Message Passing Interface Collective Communication
  - 18.7 CUDA-Aware Message Passing Interface
  - 18.8 Summary
  - Reference
- Chapter 19. Parallel programming with OpenACC
- Chapter 20. More on CUDA and graphics processing unit computing
- Chapter 21. Conclusion and outlook
- Appendix A. An introduction to OpenCL
- Appendix B. THRUST: a productivity-oriented library for CUDA
- Appendix C. CUDA Fortran
  - C.1 CUDA Fortran and CUDA C Differences
  - C.2 A First CUDA Fortran Program
  - C.3 Multidimensional Array in CUDA Fortran
  - C.4 Overloading Host/Device Routines with Generic Interfaces
  - C.5 Calling CUDA C via ISO_C_Binding
  - C.6 Kernel Loop Directives and Reduction Operations
  - C.7 Dynamic Shared Memory
  - C.8 Asynchronous Data Transfers
  - C.9 Compilation and Profiling
  - C.10 Calling Thrust from CUDA Fortran
- Appendix D. An introduction to C++ AMP
- Index
Product information
- Title: Programming Massively Parallel Processors, 3rd Edition
- Author(s): David B. Kirk, Wen-mei W. Hwu
- Release date: November 2016
- Publisher(s): Morgan Kaufmann
- ISBN: 9780128119877