Verification and Validation
Participants: Madhusudan Parthasarathy, Darko Marinov, and Francesco Sorrentino
Our primary goal is to build a testing tool framework for concurrent shared-memory programs. We will extend this work to other programming paradigms (non-deterministic extensions of DPJ, actors, etc.) after we build more expertise on shared-memory programs.
Goals - Extended Description
The techniques and tools we build are divided into the following tasks:
Task 1: Build a benchmark suite
- We will collect initial benchmark suites from code already used by our group and related groups. We will generate relevant tests for these programs (very few are typically provided).
- We will then build benchmark suites with evolving parallel code (ideally with evolving tests). Current programs examined by the tools have only one version.
- We'll collect more versions by (1) trying to find software repositories for those code bases, (2) working with application groups to get access to their repositories, and/or (3) using mutation testing to simulate evolution.
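To make option (3) concrete, the sketch below illustrates mutation testing for multithreaded code under our own simplifying assumptions (class and operator names are hypothetical, not from an existing tool): a concurrency mutation operator deletes a `synchronized` keyword, producing a mutant that a good multithreaded test suite should "kill" by exposing a lost update.

```java
// Original (correctly synchronized) code under test.
class Account {
    private int balance = 0;
    synchronized void deposit(int amount) { balance += amount; }
    int balance() { return balance; }
}

// Hypothetical mutant: identical, except the mutation operator
// removed the `synchronized` keyword, so increments can race.
class AccountMutant {
    private int balance = 0;
    void deposit(int amount) { balance += amount; } // lock deleted by the mutation
    int balance() { return balance; }
}

public class MutationDemo {
    public static void main(String[] args) throws InterruptedException {
        Account original = new Account();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) original.deposit(1);
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        // The synchronized original always totals 400000; the mutant
        // may lose updates, but only under unlucky schedules -- which
        // is exactly why schedule-exploring tools (Task 2) are needed
        // to kill such mutants reliably.
        System.out.println("original balance: " + original.balance());
    }
}
```

Such mutants simulate the kind of concurrency regression that real code evolution introduces, giving us many "versions" even when only one release of a program exists.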
Task 2: Improve tools for one test/version
We will continue working on the problem of effectively searching the interleaving space for alternate schedules that exercise particular patterns and hence are more likely to expose bugs. We expect to build a fairly full-fledged tool that will effectively examine alternate schedules for Java and C programs. The main challenges we will overcome here are to:
- automatically transform programs so that their executions can be monitored
- efficiently predict, online, alternate schedules that violate atomicity or exhibit data races
- automatically transform programs to exercise these alternate schedules and check for errors
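The kind of bug these alternate schedules target can be shown with a minimal lost-update example (our own illustration; the class is hypothetical). A non-atomic read-modify-write is correct under a serial schedule but loses an update when two logical threads interleave their reads before either writes; below, the interleaving is replayed deterministically by issuing the read and write steps by hand, which is essentially what a schedule-replaying tool automates.

```java
// Hypothetical non-atomic counter: increment = read, then write.
class Counter {
    private int value = 0;
    int read() { return value; }
    void write(int v) { value = v; }
}

public class ScheduleDemo {
    public static void main(String[] args) {
        Counter c = new Counter();

        // Serial schedule: thread 1 completes before thread 2 starts.
        int t1 = c.read(); c.write(t1 + 1);
        int t2 = c.read(); c.write(t2 + 1);
        System.out.println("serial: " + c.read());       // prints "serial: 2"

        // Alternate schedule: both threads read before either writes.
        c.write(0);
        int a = c.read();   // thread 1 reads 0
        int b = c.read();   // thread 2 reads 0
        c.write(a + 1);     // thread 1 writes 1
        c.write(b + 1);     // thread 2 writes 1 -- the first update is lost
        System.out.println("interleaved: " + c.read());  // prints "interleaved: 1"
    }
}
```

A monitoring transformation records the read/write events of one observed execution; the prediction step then searches for a reordering like the second schedule above; and the replay transformation forces the program down that schedule to confirm the error.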
Task 3: Evaluate tools across many tests/versions
Perform a study that evaluates existing techniques/tools across many tests. The research questions to answer are how sensitive these tools' bug-finding ability is to (1) the choice of tests and (2) the choice of schedules within tests. Some bugs can be found with some schedules/tests but not with others. The results of this study should enable us to come up with new techniques for schedule/test prioritization and selection.
We will also perform a study that evaluates existing techniques/tools across several software versions. The research question we will ask is: how does the sensitivity in finding bugs (to the choice of tests and schedules) change as code evolves? The results of this study should enable us to come up with new techniques for incremental analysis.
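One simple form test prioritization could take is sketched below; this is our own illustration under stated assumptions (the metric and all names are hypothetical, not a published technique). If profiling records, per test, the set of distinct shared-memory access pairs it covers, tests covering more pairs can be explored first, directing the limited schedule-exploration budget toward the tests most likely to expose interleaving bugs.

```java
import java.util.*;

// Minimal sketch: rank tests by the number of distinct shared-memory
// access pairs each one covers (more pairs = explored first).
public class TestPrioritizer {
    public static List<String> prioritize(Map<String, Set<String>> coveredPairs) {
        List<String> tests = new ArrayList<>(coveredPairs.keySet());
        // Sort in descending order of covered-pair count.
        tests.sort((x, y) -> coveredPairs.get(y).size() - coveredPairs.get(x).size());
        return tests;
    }
}
```

For incremental analysis across versions, the same idea extends naturally: re-prioritize only the tests whose covered pairs touch code changed in the new version, rather than re-exploring every test from scratch.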
- A paper on efficient mutation testing for multithreaded code is under submission.