2013 Illinois Symposium on Parallelism: Current State of the Field and the Future - Speaker Abstracts

September 10th and 11th, 2013
Siebel Center for Computer Science
University of Illinois at Urbana-Champaign



James Larus, Microsoft Research

The World Did Not End

Eight or nine years ago, processor clock speed stopped its decades-long, continuous improvement and the multicore era began. Many astute people claimed that unless the "parallel programming problem" was solved, this transition marked the end of computing as we know it. That was then and now is now: the problem is still unsolved (though many new ideas have been proposed), and the world has not ended. Were these people exaggerating, were they wrong, or is disaster still awaiting us? What can we learn from this technology transition that might help us navigate the next one?


Kunle Olukotun, Stanford University

High Performance Domain Specific Languages with Delite

Today, all high-performance computer architectures are parallel and heterogeneous: a combination of multiple CPUs, GPUs, and specialized processors. This creates a complex programming problem for application developers. Domain-specific languages (DSLs) are a promising solution to this problem because they provide an avenue for high-level, application-specific abstractions to be mapped directly to low-level, architecture-specific programming models, providing both high programmer productivity and high execution performance.

In this talk I will describe our approach to building high-performance DSLs, which is based on embedding in Scala, lightweight modular staging, and a DSL infrastructure called Delite. I will describe how we transform DSL programs into efficient first-order low-level code using domain-specific optimization, parallelism optimization and locality optimization with parallel patterns, and architecture-specific code generation. All optimizations and transformations are implemented in an extensible DSL compiler architecture that minimizes the programmer effort required to develop new DSLs.
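To make the embedding idea concrete, here is a minimal, self-contained sketch (illustrative Scala only; the IR, names, and interpreter are invented and are not Delite or LMS code). The DSL "program" is represented as data whose parallel pattern is then executed; a system like Delite would instead optimize such an IR and generate architecture-specific code rather than interpret it:

```scala
import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

// A tiny expression IR: the program is plain data before it runs.
sealed trait Exp
case class Const(v: Double) extends Exp
case class ParMap(in: Vector[Double], f: Double => Double) extends Exp // data-parallel pattern

object TinyDsl extends App {
  def run(e: Exp): Vector[Double] = e match {
    case Const(v) => Vector(v)
    case ParMap(in, f) =>
      // Chunk the input and run each chunk as a separate task.
      val chunks = in.grouped(math.max(1, in.size / 4)).toVector
      val parts  = chunks.map(c => Future(c.map(f)))
      Await.result(Future.sequence(parts), 1.minute).flatten
  }

  // An optimizer could rewrite the IR (e.g., fuse maps) before this call.
  println(run(ParMap(Vector.tabulate(8)(_.toDouble), x => x * x)))
}
```

Because the program exists as data before execution, domain-specific rewrites can be applied ahead of time; that separation between building and running the program is what the staging machinery provides.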


Paul Petersen, Intel

How to "Sketch" a Parallel Program

When an artist approaches a new project, they may first create a sketch to quickly illustrate what they are trying to create and to evaluate whether it will achieve the desired effect. Often, this sketch is then incrementally refined to eventually produce the final product. Can programming, or in our case parallel programming, be like this? The key property of a sketch is that it purposely omits details that, while relevant in the final product, are not needed to get the process started. We can leverage this property of sketching to introduce the essential aspects of parallel programming without getting overwhelmed by the details that will ultimately be necessary in the final program.
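As a rough illustration of sketching in code (hypothetical Scala, not from the talk): the first version states the intent serially; the second marks where the parallelism lives while deliberately deferring details such as grain size, scheduling, and floating-point summation order.

```scala
import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

object SumSketch extends App {
  // The intent, stated serially: a plain reduction.
  def sumSerial(xs: Array[Double]): Double = xs.sum

  // The "sketch": the two halves may run concurrently; everything
  // else about how is left for later refinement.
  def sumSketch(xs: Array[Double]): Double = {
    val (left, right) = xs.splitAt(xs.length / 2)
    val l = Future(left.sum)
    val r = Future(right.sum)
    Await.result(l, 1.minute) + Await.result(r, 1.minute)
  }

  println(sumSketch(Array.tabulate(1000)(_.toDouble))) // 499500.0
}
```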


Tilak Agerwala, IBM

The Future of Parallel Computing: A Principled Approach

Following the historical path to increased parallelism (more nodes, more cores, more SIMD) is unlikely to be sufficient for future high-performance computing needs. Growing data volumes and emerging graph-based analytics workloads are changing the demands placed on system performance. These changes lead us to a data-centric system view with new requirements at every level of the system. To effectively manage this transition, we propose system design principles to guide the development of next-generation systems and to lead us to new, more balanced systems.


Josep Torrellas, UIUC

What We Need to Accomplish Next

Since we started this Center's effort on client parallelism, many things have changed. There is much more emphasis on handheld computing and cloud computing. Moreover, the industry is more fragmented. We will discuss some of the problems that we need to address next.


Arvind, MIT

Building Highly Parallel and Concurrent Systems

Systems-on-a-chip in client devices like cell phones have a lot of specialized hardware to reduce power and energy consumption. In this brave new world, the whole software stack must interface with ever-changing hardware at the bottom. Such systems are by definition highly concurrent and reactive. One method of building such systems is to compose functionally specialized modules, which may be implemented in either hardware or software. Furthermore, the composition methodology should be such that the functionality and performance of the system are predictable from the parts. We will discuss some characteristics of the languages, tools, and architectures needed to satisfy these new requirements.
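One way to picture this composition style (a hedged, software-only sketch in Scala; the modules and names are invented): make each module a function behind a bounded queue, so that modules interact only through explicit interfaces and either side could later be swapped for a hardware block honoring the same queue discipline.

```scala
import java.util.concurrent.ArrayBlockingQueue

object Pipeline extends App {
  // A "module" consumes one queue and feeds another; its internals are
  // hidden behind the queue interface, so a hardware implementation
  // could replace it without changing the composition.
  def stage[A, B](in: ArrayBlockingQueue[A], out: ArrayBlockingQueue[B])(f: A => B): Unit = {
    val t = new Thread(() => while (true) out.put(f(in.take())))
    t.setDaemon(true)
    t.start()
  }

  val q1 = new ArrayBlockingQueue[Int](4) // bounded queues make back-pressure explicit
  val q2 = new ArrayBlockingQueue[Int](4)
  val q3 = new ArrayBlockingQueue[Int](4)
  stage(q1, q2)(_ * 2) // first specialized module (illustrative)
  stage(q2, q3)(_ + 1) // second specialized module (illustrative)

  (1 to 5).foreach(q1.put)
  (1 to 5).foreach(_ => println(q3.take())) // 3, 5, 7, 9, 11
}
```

The bounded queues are the point: with capacities fixed at the interfaces, throughput and buffering behavior of the whole pipeline can be reasoned about from the parts.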


Keshav Pingali, University of Texas, Austin

Ieri, Oggi, Domani

The Italian title of this talk is borrowed from an Alberto Moravia play, and it means "Yesterday, today, tomorrow." In that spirit, this talk will give a personal perspective on what we have learnt about "universal parallel computing" between the time the UPCRC was started and today, and what challenges remain to make parallel programming truly universal in the future.


Gilles Pokam, Intel

Bridging the Programmability Gap between Single-Core and Multi-Core Processors

Parallel computing has become ubiquitous. Servers, laptops, and now even phones and tablets are powered by multicore processors that can execute more than one stream of instructions at a time. For developers, this creates new opportunities to harness the hardware, but it is coupled with new challenges such as identifying and expressing the parallelism available in programs, and improving program performance while still guaranteeing correctness. These goals can sometimes be difficult to achieve without the proper hardware support and tools. For instance, IBM and Intel have recently announced support for Transactional Memory (TM), which aims at reducing the challenge of writing parallel programs. Yet, to improve productivity, developers also need better visibility into the system and the proper tools for root-causing performance and correctness issues. Unfortunately, this support is still falling short of what parallel programmers desire. As a consequence, multicore developers fall back on techniques meant for debugging programs on single-core processors, such as breakpoints; yet these techniques are significantly underpowered, and generally inadequate, for debugging multicore systems.
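A small example of why that is (illustrative Scala, not from the talk): the counter below races, so it typically loses updates; yet pausing one thread at a conventional breakpoint serializes the schedule and tends to make the bug vanish.

```scala
object RacyCounter extends App {
  var count = 0 // shared, unsynchronized state
  val threads = (1 to 4).map { _ =>
    new Thread(() => for (_ <- 1 to 100000) count += 1) // racy read-modify-write
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
  // Typically prints less than 400000. Single-stepping one thread in a
  // debugger changes the interleaving and usually hides the lost updates.
  println(s"count = $count (expected 400000)")
}
```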

In this talk, I will outline several techniques to bridge the programmability gap between single-core and multi-core processors. First, I will discuss a mechanism to provide more visibility into the interactions between threads or processes running on a multicore processor and I will show that this building block is fundamental to debugging multicore programs. I will illustrate this using TM debugging and record-and-replay, which was developed in collaboration with UIUC. Second, I will discuss an orthogonal technique by which current breakpoint support on single-core processors could evolve into concurrent breakpoints for multicore processors, providing developers with a powerful tool for root-causing issues faster.


Arch Robison, Intel

The Evolution of Cilk Plus: Recent and Future Directions

Intel Cilk Plus is a framework of C/C++ language extensions for expressing thread and vector parallelism in a composable way. I'll start with recent developments in the framework, such as "pedigrees", which enable deterministic parallel random number generation and deterministic replay. Then I'll touch on potential future directions. These include support for parallel patterns beyond fork-join and more consistent syntax for vector vs. thread parallelism.
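To make the pedigree idea concrete, here is a hedged Scala analogue (the names and the hash are invented, and this is not the Cilk Plus API): each strand is identified by its spawn path from the root of the fork-join tree, so any value derived from that path is the same on every run, regardless of how the scheduler interleaves tasks.

```scala
import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

object PedigreeSketch extends App {
  // Mix the spawn path into a 64-bit value (illustrative hash only).
  def strandValue(pedigree: List[Int]): Long =
    pedigree.foldLeft(0x9e3779b97f4a7c15L)((h, i) => (h ^ i) * 0xff51afd7ed558ccdL)

  // Each recursive task extends its pedigree, so a strand's identity
  // depends only on its position in the spawn tree, not on scheduling.
  def work(depth: Int, pedigree: List[Int]): Future[Long] =
    if (depth == 0) Future(strandValue(pedigree))
    else {
      val left  = work(depth - 1, 0 :: pedigree) // "spawned" child strand
      val right = work(depth - 1, 1 :: pedigree) // continuation strand
      for (a <- left; b <- right) yield a ^ b
    }

  // Prints the same value on every run, whatever the interleaving:
  // the basis for deterministic parallel random numbers and replay.
  println(Await.result(work(10, Nil), 1.minute))
}
```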


Burton Smith, Microsoft

A Brief History of Parallel Computing: 1975-2050

Parallel computing offers an interesting example of technological evolution. The forces that set its initial course were repeatedly supplanted by new forces reflecting evolution in computing itself: microelectronics, the internet, mobile computing, robotics, machine learning, immersive telepresence, and cyberaugmentation. Besides the escalating needs for efficient resource management and locality optimization, fine-grain synchronization and system responsiveness eventually became important for quite fundamental reasons. Ironically, today's parallel computing landscape is closer to the vision that prevailed in the 1980s than it is to the common view prevalent in 2013, say.