ViVid - Vision Video Library

Participants: Sanjay Patel, Tom Huang, Mert Dikmen

Executive Summary

Create an easy-to-use, modular video analysis library that takes advantage of the multi-core processing capabilities of GPUs.

Goals - Extended Description

Our strategy with UPCRC applications work is to find compelling applications that have a clear need for speed on the client, and to use these as drivers for the language, tools, and correctness efforts going on within the Center. One such effort involves the creation of parallel libraries. Our students have already created a library of core functionality (ViVid) for parallel video analysis. It supports rapid prototyping and testing of systems for recognition or detection of space-time entities (e.g. events, actions). Although much of video analysis remains an open research area, most tools being researched share common computational primitives such as kernel evaluations, histogramming, and arithmetic operations over a locality in video. ViVid's goal is to provide faster versions of such operations by taking advantage of parallel computation. By reducing the impact of computation time on these systems, our vision is to allow the community to concentrate on the precision of the toolsets being developed, and to let research progress at a faster pace.

ViVid is currently implemented in CUDA/C++ with a Python glue layer. Now that the functionality has been identified, we can create benchmarks, libraries, and applications that bring the same functionality to other targets using, for example, DPJ, HTAs, refactoring tools, or other concepts being developed by our Center at large.
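
As a rough illustration of one primitive named above, histogramming over a space-time locality, the following pure-Python sketch counts quantized pixel values inside each block of a tiny synthetic clip. It is not ViVid's API: the function name `block_histograms`, its parameters, and the nested-list video layout are all assumptions made for this example, and a real implementation would run the per-block work in parallel on the GPU.

```python
from itertools import product


def block_histograms(video, block, n_bins, max_value=256):
    """Histogram quantized pixel values inside each space-time block.

    Hypothetical helper for illustration only (not part of ViVid).
    video: nested list indexed as video[t][y][x] with ints in [0, max_value).
    block: (bt, by, bx) block size in frames, rows, and columns.
    Returns a dict mapping block index (it, iy, ix) to a list of n_bins counts.
    """
    bt, by, bx = block
    n_t, n_y, n_x = len(video), len(video[0]), len(video[0][0])
    hists = {}
    # Each block is independent of the others, which is what makes this
    # primitive a natural fit for data-parallel hardware such as a GPU.
    for it, iy, ix in product(range(n_t // bt), range(n_y // by), range(n_x // bx)):
        counts = [0] * n_bins
        for t, y, x in product(range(bt), range(by), range(bx)):
            value = video[it * bt + t][iy * by + y][ix * bx + x]
            counts[value * n_bins // max_value] += 1
        hists[(it, iy, ix)] = counts
    return hists


# Tiny synthetic clip: 2 frames of 2x4 pixels, dark on the left, bright on the right.
frame = [[0, 10, 200, 255],
         [5, 20, 210, 250]]
video = [frame, frame]
hists = block_histograms(video, block=(2, 2, 2), n_bins=2)
print(hists)
```

Because no block reads another block's pixels or writes another block's counts, the outer loop can be distributed across threads without synchronization, which is the property a GPU version of this operation would exploit.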

Results

GPU-accelerated video analysis functionality

Salient region detection
Feature transformations
Clustering, classifier evaluation
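
To make "classifier evaluation" concrete, here is a minimal sketch of scoring a feature vector with a kernel classifier, assuming a Gaussian (RBF) kernel. The kernel choice, the function names `rbf_kernel` and `decision_value`, and the toy model are assumptions for illustration; the source does not state which kernels or classifiers ViVid accelerates.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, weights, bias, gamma):
    """Score x as sum_i w_i * k(s_i, x) + bias; the sign gives the class."""
    return sum(w * rbf_kernel(s, x, gamma)
               for s, w in zip(support_vectors, weights)) + bias


# Hypothetical toy model: one positive and one negative support vector.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, -1.0]
score_near_first = decision_value([0.1, 0.0], support_vectors, weights, 0.0, gamma=1.0)
score_near_second = decision_value([1.0, 0.9], support_vectors, weights, 0.0, gamma=1.0)
print(score_near_first > 0, score_near_second < 0)
```

Every kernel evaluation in the sum is independent, so scoring many feature vectors against many support vectors amounts to a large batch of identical small computations.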

Successfully built and used several event detection architectures
Demonstrated significantly improved speed in several tasks

Additional Resources

http://libvivid.sourceforge.net
