Although most developers have heard that GPGPUs are good at speeding up machine-learning programs, hardly anyone can say why. This talk provides details about the implementation of modern techniques such as convolutional and recurrent neural networks and how they relate to the advanced design features of today's processors: CPUs with short-vector units, GPGPUs, and custom FPGAs/ASICs.
The characteristics of current and future versions of these processors are described against the background of different machine-learning workloads. This will hopefully help the audience better understand which hardware to use for which problem, and how to adjust an implementation to work best with the given hardware.
Nothing is really needed as a prerequisite except a basic understanding of how processors work (especially concurrent execution) and perhaps a bit of linear algebra to follow the machine-learning examples. Even the latter isn't strictly necessary, though, since one can regard ML as just one specific HPC problem and take away the general information instead.
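To make the "ML is just one specific HPC problem" view concrete, here is a small sketch (an illustration of my own, not material from the talk): a 2-D convolution lowered to a dense matrix product via the "im2col" transformation that many ML frameworks use internally. This is why hardware that excels at dense linear algebra (short-vector CPUs, GPGPUs, matrix units on ASICs) automatically excels at convolutional networks.

```python
import numpy as np

def conv2d_direct(image, kernel):
    """Naive sliding-window convolution (valid padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def conv2d_as_gemm(image, kernel):
    """The same convolution, expressed as one dense matrix product."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    # im2col: each output position contributes one row holding the
    # flattened input patch it reads.
    cols = np.empty((oh * ow, kh * kw))
    for i in range(oh):
        for j in range(ow):
            cols[i * ow + j] = image[i:i+kh, j:j+kw].ravel()
    # The whole convolution is now a single GEMM-style product,
    # which maps directly onto vector units and GPU matrix engines.
    return (cols @ kernel.ravel()).reshape(oh, ow)

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))
assert np.allclose(conv2d_direct(img, k), conv2d_as_gemm(img, k))
```

The two functions compute identical results; the second form simply trades a little extra memory for the ability to hand the entire computation to a highly optimized GEMM kernel.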
I have found a lot of half-knowledge in the areas of CPU design and of machine learning (the currently fashionable technology). Just saying "GPGPUs are good for that" isn't useful, since ML, like all problem domains, is diverse, and in many cases the implementation was shaped and limited by the available hardware rather than by the needs and capabilities of the algorithm. People have to know a lot about ML, CPUs, OSes, and development tools to make correct decisions, and maybe even to understand and judge the rest of the conference program.
// Ulrich Drepper
is back at Red Hat, working in the Office of Technology on next-generation machine learning and high-performance computing projects after a seven-year stint at a Wall Street firm. Throughout his career he has been interested in understanding how to make the most of the available hardware, and he wants others to know and care, too.