Systems Research Projects

Why multicore processors?

Multicore processors consist of multiple processing cores on the same chip, so multiple threads of execution can run simultaneously on a single processor. Multicore is becoming the dominant processor architecture due to its superior performance per watt and its potential to offer improved performance with each technology generation. According to Intel, 100% of its new server and high-performance desktop processors were multicore by the end of 2007. Other major hardware vendors, such as AMD, IBM, and Sun Microsystems, also focus largely on multicore hardware.


Multicore processors pose new challenges to system designers. First, operating systems and virtual machine hypervisors must be built to enable applications to make the most of multicore hardware. For example, on multicore systems with shared resources, application threads must be scheduled so as to minimize contention for shared resources and maximize their cooperative sharing. On heterogeneous multicore processors, operating systems and hypervisors must allocate heterogeneous cores to user software in the most efficient manner. We envision a change in the traditional roles of applications and system software: applications will be more responsible for managing hardware resources, while operating systems and hypervisors will "get out of the way", performing only minimal resource-sharing duties.


Second, applications must be parallelized to take advantage of multicore hardware. Writing parallel applications is a challenging task, so it is necessary to build new programming environments that make this task easier. Since these new programming environments often take on the role of an operating system, operating systems must be redesigned to let those runtime environments perform their own resource management and interact well with other OS-like runtimes. To solve these problems, we work on the following projects:

Ongoing Projects

OS Scheduler for Heterogeneous Multicore Processors Based on Architectural Signatures

Heterogeneous multicore architectures promise greater energy/area efficiency than their homogeneous counterparts. This efficiency can only be realized, however, if the operating system assigns applications to appropriate cores based on their architectural properties. While several such heterogeneity-aware algorithms have been proposed in the past, they were not meant to scale to a large number of cores, and they assumed long-lived threads because they rely on continuous performance monitoring of threads to make core assignments. We propose a scheme that does not rely on dynamic performance monitoring. Instead, the information needed to make an appropriate core assignment decision is provided with the job itself. This information is presented as an architectural signature of the application, and is composed of certain microarchitecture-independent characteristics. An architectural signature is generated offline and can be embedded in the application binary.
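As a rough illustration of how a scheduler might use a signature supplied with the job, the Python sketch below ranks jobs by the speedup they are estimated to gain from a fast core and gives the fast cores to the jobs that benefit most. Everything in it is an assumption made for illustration, not the project's actual design: the signature is reduced to a single hypothetical field (memory_boundness), the speedup model is a simple frequency-scaling estimate, and the names CoreType, Signature, estimated_speedup and assign are invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreType:
    name: str
    freq_ghz: float
    count: int  # number of cores of this type


@dataclass(frozen=True)
class Signature:
    # Hypothetical signature field, assumed to be computed offline:
    # fraction of run time spent waiting on memory
    # (0.0 = purely compute-bound, 1.0 = purely memory-bound).
    memory_boundness: float


def estimated_speedup(sig: Signature, fast: CoreType, slow: CoreType) -> float:
    """Estimate the speedup of moving a job from a slow to a fast core.

    Assumption: only the compute portion of run time scales with clock
    frequency; the memory-bound portion is unchanged.
    """
    ratio = fast.freq_ghz / slow.freq_ghz
    compute = 1.0 - sig.memory_boundness
    return 1.0 / (sig.memory_boundness + compute / ratio)


def assign(jobs: dict, fast: CoreType, slow: CoreType) -> dict:
    """Give fast cores to the jobs with the highest estimated speedup.

    No run-time performance monitoring is used; the decision depends only
    on the signatures. Remaining jobs go to the slow core type (the slow
    core count is not enforced in this sketch).
    """
    ranked = sorted(
        jobs,
        key=lambda name: estimated_speedup(jobs[name], fast, slow),
        reverse=True,
    )
    return {
        name: (fast.name if i < fast.count else slow.name)
        for i, name in enumerate(ranked)
    }


fast = CoreType("fast", freq_ghz=3.0, count=1)
slow = CoreType("slow", freq_ghz=1.5, count=2)
jobs = {
    "compute_heavy": Signature(memory_boundness=0.1),
    "memory_heavy": Signature(memory_boundness=0.9),
    "mixed": Signature(memory_boundness=0.5),
}
placement = assign(jobs, fast, slow)
print(placement)
```

In this toy setup the compute-heavy job receives the single fast core, since a memory-bound job gains little from a higher clock frequency. A real signature would carry richer microarchitecture-independent characteristics than the single number assumed here.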


Cypress, a Hypervisor for Heterogeneous Many-Core Processors

Existing hypervisors do not support heterogeneous multicore hardware. As such, they prevent guest OSs from using heterogeneous hardware in the most efficient manner. Our goal is to build a hypervisor that provides support for heterogeneous cores. The challenges in this research are: (a) allocation of heterogeneous cores to legacy hetero-unaware guests, (b) presentation and allocation of heterogeneous cores to hetero-aware guests, and (c) design of policies for the allocation of heterogeneous cores to multiple VMs.


Cascade: A Parallel Programming Framework for Video Game Engines

Cascade is a parallel programming framework (PPF) whose design is driven by the application domain of video game engines. Video game engines are large, complex applications whose structure does not easily map to primitive parallel constructs such as parallel-for or map/reduce; therefore, expressing parallelism and achieving good parallel speedup is not trivial. A good PPF for a video game engine must allow the programmer to express parallelism in a way that naturally maps to the programmer's perception and understanding of the engine; otherwise the parallel implementation may become overly complex. In addition, video game engines have unique requirements, such as real-time constraints and the need for tight control over resource allocation. Our goal in designing Cascade is to cater to the needs of this important and fast-growing application domain. The contribution of our work is two-fold: (1) understanding the requirements of our target application domain and, through this understanding, determining which of the features offered by existing PPFs are the most appropriate for this domain; (2) innovation in the PPF itself: design of new PPF features that are needed by this application domain, but are not offered in existing systems.


Abacus: A Reconfigurable Hardware Profiler

The advent of multicore processors has spurred intricate operating system algorithms that model interactions between concurrent threads and estimate optimal ways to schedule them. Many of these algorithms are more complex and computationally expensive than the algorithms traditionally used in operating systems. One reason is that they need to perform intensive observation of threads' performance; the other is that they use complex modeling algorithms to predict thread interactions. Perhaps because of this complexity, these new algorithms have not been implemented in commercial operating systems, despite having demonstrated the ability to improve performance. Our goal is to close this gap by designing a reconfigurable hardware profiler, Abacus. Abacus is a profiling core that is connected to the main computation core and that allows gathering a rich set of hardware events for profiling purposes, eliminating the need for complex modeling and analysis in the OS. Abacus is highly configurable, so it allows gathering a much wider array of events than traditional hardware counters. We are currently building a prototype of Abacus using FPGAs. This research is done in collaboration with Dr. Lesley Shannon from the Department of Engineering at SFU.


OSTM: Operating System Support for Transactional Memory

In collaboration with Danny Hendler (Ben-Gurion University), Bill Scherer (Rice University) and Yossi Lev (Brown University/Sun Microsystems), we are building an operating system scheduler that reduces conflicts in transactional applications. The scheduler is integrated with the transactional memory system to obtain information about transactional conflicts, which it applies to scheduling decisions.
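One way to picture the feedback loop between a transactional memory system and a scheduler is the Python sketch below. It is a toy model under stated assumptions, not OSTM's actual algorithm: the transactional memory runtime is assumed to report each conflict as a pair of thread names, and the scheduler avoids running two threads at the same time once they have conflicted a threshold number of times. The class ConflictAwareScheduler, its methods, and the threshold policy are all hypothetical.

```python
from collections import Counter


class ConflictAwareScheduler:
    """Toy scheduler that avoids co-running threads that often conflict."""

    def __init__(self, threshold: int):
        # Hypothetical policy knob: number of reported conflicts after
        # which two threads are no longer scheduled together.
        self.threshold = threshold
        self.conflicts = Counter()

    def report_conflict(self, a: str, b: str) -> None:
        """Assumed to be called by the TM runtime when a and b conflict."""
        self.conflicts[frozenset((a, b))] += 1

    def conflicting(self, a: str, b: str) -> bool:
        return self.conflicts[frozenset((a, b))] >= self.threshold

    def pick(self, runnable: list, ncores: int) -> list:
        """Greedily choose up to ncores threads with no conflicting pair.

        Threads are considered in run-queue order; a thread is skipped if
        it conflicts with one that was already chosen for this time slice.
        """
        chosen = []
        for thread in runnable:
            if len(chosen) == ncores:
                break
            if not any(self.conflicting(thread, c) for c in chosen):
                chosen.append(thread)
        return chosen


sched = ConflictAwareScheduler(threshold=2)
before = sched.pick(["t1", "t2", "t3"], ncores=2)

# The TM system reports that t1 and t2 keep aborting each other.
sched.report_conflict("t1", "t2")
sched.report_conflict("t2", "t1")
after = sched.pick(["t1", "t2", "t3"], ncores=2)

print(before, after)
```

Before any conflicts are reported, the first two runnable threads are co-scheduled; once the pair reaches the threshold, the scheduler runs t3 alongside t1 instead, effectively serializing the two conflicting threads.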