Tutorial #2
Tutorial type: half-day

Title: Introduction to Programming High Performance Applications on the CELL Broadband Engine.

Authors:
Jakub Kurzak, Alfredo Buttari, University of Tennessee at Knoxville

Description:
Programming the STI CELL processor is about successfully exploiting its potential for delivering very high performance. The purpose of this tutorial is to give the programmer practical guidelines for achieving this goal. We begin with a brief overview of the main CELL architectural features and its software development environment. Then we discuss three basic aspects of CELL programming: SPE SIMD kernel development (vectorization), SPE parallelization, and intra-chip communication. We show how high performance SPE kernels are created by replacing scalar operations with vector ones, heavily unrolling loops, and exploiting the dual-issue nature of the SPE architecture. We explain coding with the SIMD C language extensions (intrinsics) as well as in assembly language, and discuss aspects specific to code development in assembly. We present static performance analysis using the spu-timing tool. The presentation of intra-chip communication follows, with emphasis on DMA communication both for bulk data transfers and for synchronization. We discuss message size and alignment restrictions, enforcing message ordering with barrier and fence mechanisms, and creating complex data transfers with DMA lists. We conclude the topic with guidelines on implementing pipelined processing with direct local store to local store communication. We discuss basic profiling techniques using the SPE decrementer. We conclude with a set of practical tips and tricks and a list of "gotchas", or common rookie mistakes. A brief overview of academic and commercial CELL programming packages follows, along with a discussion of a real-life example: scanning network traffic using DFA-based string matching. The tutorial ends with a presentation of techniques for programming multi-CELL systems using message passing with MPI.
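To give a flavor of the message size and alignment restrictions mentioned above, here is a minimal sketch in portable C. It is not taken from the tutorial materials and is not part of any CELL SDK. The helper name `dma_transfer_ok` and the constant `DMA_MAX_BYTES` are hypothetical. The rules encoded are the commonly documented ones for a single DMA transfer, stated here as assumptions: the size is 1, 2, 4, or 8 bytes, or a multiple of 16 bytes up to 16 KB; small transfers are naturally aligned with matching low four address bits on both sides; larger transfers are 16-byte aligned on both sides. Zero-length transfers are simply rejected in this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: assumed 16 KB limit for one DMA transfer. */
#define DMA_MAX_BYTES 16384u

/*
 * Hypothetical helper (not an SDK function): returns true if a single
 * transfer of `size` bytes between effective address `ea` and local
 * store address `ls` satisfies the assumed size and alignment rules.
 */
static bool dma_transfer_ok(uint64_t ea, uint32_t ls, size_t size)
{
    if (size == 1 || size == 2 || size == 4 || size == 8) {
        /* Small transfer: naturally aligned on both sides, and the
         * low four bits of the two addresses must be identical. */
        return (ea % size == 0) && (ls % size == 0) &&
               ((ea & 15u) == (ls & 15u));
    }

    /* Larger transfer: a nonzero multiple of 16 bytes, at most 16 KB. */
    if (size == 0 || size % 16 != 0 || size > DMA_MAX_BYTES) {
        return false;
    }

    /* Both addresses must be 16-byte aligned. */
    return (ea % 16 == 0) && (ls % 16 == 0);
}
```

A check like this could be called from a debug-only assertion before a transfer is issued. For example, `dma_transfer_ok(0x1000, 0x100, 128)` accepts an aligned 128-byte transfer, while a 24-byte request is rejected because 24 is neither a small power-of-two size nor a multiple of 16.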

Bios:
Jakub Kurzak received the MSc degree in electrical and computer engineering from Wroclaw University of Technology, Poland, and the PhD degree in computer science from the University of Houston. He is a research associate in the Innovative Computing Laboratory in the Computer Science Department at the University of Tennessee, Knoxville. His research interests include parallel algorithms, specifically in the area of numerical linear algebra, and also parallel programming models and performance optimization for parallel architectures spanning distributed and shared memory systems, as well as next generation multi-core and many-core processors.

Alfredo Buttari received the MSc degree in computer science and the PhD degree in computer science and control engineering from the University of Rome, Italy. He is a research associate in the Innovative Computing Laboratory in the Computer Science Department at the University of Tennessee, Knoxville. His research interests include numerical linear algebra, dense and sparse methods, direct and iterative solvers, parallel algorithms and performance optimization for parallel architectures including next generation multi-core and many-core processors.


COPYRIGHT ©2007 HOT INTERCONNECTS