The Exploratory Stream Processing Systems group at the Watson Labs engages in advanced research in stream processing platforms and applications.
Overview
The Exploratory Stream Processing Systems team at the T.J. Watson Research Center
conducts research on advanced topics in highly scalable stream processing applications
and systems. Most of the research efforts fall under the umbrella of the System S project,
which spans several teams at Watson.
As the amount of data available to enterprises and other organizations dramatically
increases, more and more companies are looking to turn this data into actionable
information and knowledge. Addressing these requirements requires systems and
applications that enable efficient extraction of knowledge and information from
potentially enormous volumes and varieties of continuous data streams.
System S provides an execution platform and
services for user-developed applications that ingest, filter, analyze, and correlate
potentially massive volumes of continuous data streams. It supports the composition
of new applications in the form of stream processing graphs that can be created
on the fly, mapped to a variety of hardware configurations, and adapted as requests
come and go and relative priorities shift. System S is designed to scale from
systems that acquire, analyze, interpret, and organize
continuous streams on a single processing node to high-performance clusters
of hundreds of processing nodes. System S was designed to address
the following data management platform objectives:
- A parallel, high-performance stream processing software platform capable
of scaling over a range of hardware capabilities
- Agile and automated reconfiguration in response to changing user objectives,
available data, and the intrinsic variability of system resource availability
- Incremental tasking in the face of rapidly changing data forms and types
- Secure, privacy-compliant, and auditable execution environment
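As a rough illustration of the ingest, filter, analyze, and correlate steps described above, the following Python sketch chains generators on a single node. It does not use System S or any of its programming interfaces; every function name, threshold, and sample reading here is a hypothetical stand-in.

```python
from collections import defaultdict


def ingest(readings):
    """Source operator: emit (sensor, value) tuples one at a time."""
    for reading in readings:
        yield reading


def keep_valid(stream, low, high):
    """Filter operator: drop tuples whose value lies outside [low, high]."""
    for sensor, value in stream:
        if low <= value <= high:
            yield sensor, value


def running_mean(stream):
    """Analyze operator: emit each sensor's running mean after every tuple."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sensor, value in stream:
        totals[sensor] += value
        counts[sensor] += 1
        yield sensor, totals[sensor] / counts[sensor]


def correlate(stream, threshold):
    """Correlate operator: emit the sensors whose latest means exceed the
    threshold, whenever at least two sensors do so at the same time."""
    latest = {}
    for sensor, mean in stream:
        latest[sensor] = mean
        hot = sorted(s for s, m in latest.items() if m > threshold)
        if len(hot) >= 2:
            yield tuple(hot)


def run_pipeline(readings, low=0.0, high=100.0, threshold=30.0):
    """Wire the operators into a linear graph and collect the alerts."""
    stream = ingest(readings)
    stream = keep_valid(stream, low, high)
    stream = running_mean(stream)
    return list(correlate(stream, threshold))


if __name__ == "__main__":
    # Hypothetical sample data; the -999.0 reading is removed by the filter.
    sample = [("a", 10.0), ("b", 50.0), ("a", -999.0), ("a", 70.0), ("b", 60.0)]
    print(run_pipeline(sample))
```

Because each operator is a generator, tuples flow through the chain one at a time rather than being buffered as a batch, which is the property that distinguishes stream processing from store-then-query designs.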
System S provides abstractions to allow users to pose inquiries to the system
to express their information needs and interests. These inquiries are translated
into dataflow graphs specifying how available raw data and existing information
can be transformed to satisfy user objectives. The runtime environment accepts
these specifications, determines how it might reorganize itself in order to
best meet the requirements of newly submitted and already executing specifications,
and automatically effects the changes required. The runtime continually monitors
and adapts to the state and utilization of its computing resources, as well
as to the information needs expressed by the users and the availability of data to
meet those needs.
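One way to picture the dataflow graphs mentioned above is as a mapping from each operator to the operators that feed it, which a runtime can order before execution. The sketch below is a simplified assumption built on Python's standard `graphlib` module (Python 3.9 or later), not the System S inquiry or planning mechanism; the operator names and both helper functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dataflow graph: each operator maps to the set of operators
# (or raw sources) whose output it consumes.
graph = {
    "parse": {"raw_feed"},
    "filter": {"parse"},
    "aggregate": {"filter"},
    "join": {"aggregate", "reference_data"},
    "report": {"join"},
}


def execution_order(dataflow):
    """Return the operators ordered so each one follows its upstream operators."""
    return list(TopologicalSorter(dataflow).static_order())


def upstream_closure(dataflow, target):
    """Return every operator or source needed to produce the target's output."""
    needed = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in dataflow.get(node, ()):
            if parent not in needed:
                needed.add(parent)
                stack.append(parent)
    return needed


if __name__ == "__main__":
    print(execution_order(graph))
    print(sorted(upstream_closure(graph, "aggregate")))
```

A helper such as `upstream_closure` hints at how a newly submitted specification might reuse operators that are already running: only the part of the graph not yet deployed would need to be started.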
Projects in the Exploratory Stream Processing Systems (ESPS) team can be described
under three broad categories:
- System Infrastructure for Streaming: Advanced research in the areas of stream
programming languages and compilers, massive scalability, high-performance stream
data transport, adaptive resource allocation, and failure resiliency.
- Reference Applications: Applications built by the ESPS team to demonstrate unique
features of the system that enable advanced stream mining applications.
- Pilots: Real-world stream processing applications that the ESPS team is actively
engaged in building and deploying.
Group Members: Lisa Amini, Henrique Andrade, Andy Frenkiel, Srinivas Kashyap,
Richard King, Ching-Yung Lin, Yoonho Park, Srinivasan Parthasarathy, Philippe Selo,
Deepak Turaga, Chitra Venkatramani, Olivier Verscheure, and summer interns.
Last updated 4 Jan 2008