Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. The new edition includes updated coverage of CUDA, including newer libraries such as cuDNN. New chapters on frequently used parallel patterns have been added, and case studies have been updated to reflect current industry practices.
Parallel Patterns: Introduces new chapters on frequently used parallel patterns (stencil, reduction, sorting) and major improvements to previous chapters (convolution, histogram, sparse matrices, graph traversal, deep learning)
Ampere: Includes a new chapter focused on GPU architecture and draws examples from recent architecture generations, including Ampere
Systematic Approach: Incorporates major improvements to abstract discussions of problem decomposition strategies and performance considerations, with a new optimization checklist
This is a very approachable textbook, albeit a tad too verbose. The book's structure is almost perfect: it starts at the hardware level of GPUs with an explanation of their drawbacks and benefits, then dives into the main processing patterns driven by this kind of architecture, and finally lays out all the popular processing algorithms and applications. As an intro to GPUs and parallel processing, this book is pretty good.
4 stars. Being a textbook, it's very verbose; if you read it, you can't walk away without understanding what is being said, but the layout does not encourage you to skip parts. If you have a basic understanding of how parallel sorting works but just want to see the GPU-specific part - tough luck. On the other hand, if you just want a basic understanding of how parallel sort works, you still have to read the whole chapter with all the intricate details. The book expects you to consume chapters as a whole and read them sequentially.
Another thing that bothered me is that the principles of GPU parallel processing are pretty much the same as in distributed systems, but the book doesn't care about such generalizations, and everything is presented as GPU-specific problems and solutions. This felt off.
Written in C - a good language, but only for academics. There's no setup for practicing and creating a feedback cycle. If I want to learn something new by myself, this is not the book to use. Most data science jobs do not ask for C. GPU knowledge is useful, but one needs to make money, too.
A much better resource than the first edition! It is still rather NVIDIA-centric, but it does spend some time on higher-level and platform-independent APIs as well. I do wish the editors had handled all the color references to black-and-white illustrations better, though.
A somewhat light overview of NVIDIA's GPU architecture and the CUDA programming model.
The explanation of GPU architecture was clear, if somewhat vague. Given that the whole book was focused on the GT200 series, and they referenced the GTX 295 specifically, I wish they had given exact latency numbers for the memory hierarchy and other hardware features as motivation for why you would do all the backflips necessary to fit your problem effectively into the CUDA model. There are a few log-log graphs in the book to show you just how much speed a naive algorithm will give up, but they could have done better.
The chapter on floating point was unnecessary, I felt. There was a brief discussion at the end of it about how sorting can affect accuracy due to fixed precision, but it hardly gave enough detail for the programmer to use effectively.
It has an appendix mapping the CUDA primitives and organization you've learned to their OpenCL equivalents.
This book provides enough information to start developing programs using CUDA technology. Besides CUDA-specific information, it explains how to design algorithms for massively parallel hardware and how to optimize performance, and it provides a detailed description of two tasks implemented on the GPU. There is also a chapter that gives a short overview of OpenCL technology.
The book is good and thorough. It is mainly focused on CUDA, but its approach to the algorithms is generic. Still, it doesn't suit my learning style: it defines things, but (in my case) doesn't do a proper job of setting up the concepts (both hardware and algorithms) in a step-by-step way. Probably a book for somebody with more background knowledge, not an introductory book.
Useful explanations of how CUDA programming works. Nice to see more examples and depth than the software documentation offers. I would like to have seen some mention of texture memory and its associated functions, which were entirely absent.
Taught me some about the different memory model issues that can arise with CUDA. This book wasn't put together well. Most of the chapters seemed to spend a couple of pages setting the stage, as if they all originally stood alone.
Reading the book alongside the accompanying course on Coursera will be a very good start. It is written in a very simple way and is good for beginners. A little bit general, but good enough.
Read all of the fundamental chapters and select chapters on tuning for particular algorithms. Very well written, with clear exposition. Would recommend it for learning GPU programming.