More than 156 papers published in conference proceedings and journals in 2003 were submitted by IBM Research authors worldwide for consideration for the 2003 Pat Goldberg Memorial CS/EE/Math Best Paper awards. The 2004 Best Paper Award winners will be announced in August 2005.
The following five 2003 papers were selected (in alphabetical order):
- ARC: A Self-Tuning, Low Overhead Replacement Cache
Nimrod Megiddo and Dharmendra Modha
FAST Conference on File and Storage Technologies
While many page-replacement algorithms have been proposed over the years, Least Recently Used (LRU), one of the oldest, has remained popular because of its simplicity, ease of implementation, and low overhead. The Adaptive Replacement Cache (ARC) algorithm proposed in this paper is an elegant scheme that dynamically adapts to the changing characteristics of a workload and, unlike many other attempts to improve on LRU, requires no user-defined tuning parameters. The paper surveys prior work on cache replacement algorithms, builds the motivation for ARC, and clearly describes its self-tuning nature. The results presented in the paper convincingly show ARC consistently outperforming LRU. Like LRU, the algorithm has constant-time complexity with respect to cache size and is simple to implement. ARC is expected to be used in storage systems as well as in other applications that employ caches, such as virtualizers, databases, and file systems. The ARC paper was presented at FAST '03, a key conference for storage systems, and subsequently appeared in Computer; a follow-on to ARC appeared in FAST '04.
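To make the self-tuning idea concrete, here is a simplified Python sketch of ARC as published: two LRU lists T1 and T2 hold pages seen once and pages seen repeatedly, ghost lists B1 and B2 remember recently evicted keys, and a single adaptation target p shifts capacity between recency and frequency on ghost hits. This is an illustrative condensation, not a production cache.

```python
from collections import OrderedDict

class ARC:
    """Sketch of the ARC replacement policy (Megiddo & Modha, FAST '03).

    T1/T2 hold cached keys seen once/repeatedly; B1/B2 are "ghost" lists of
    recently evicted keys. The target size p of T1 adapts on ghost hits.
    """
    def __init__(self, c):
        self.c, self.p = c, 0
        self.t1, self.t2 = OrderedDict(), OrderedDict()  # LRU at the left end
        self.b1, self.b2 = OrderedDict(), OrderedDict()

    def _replace(self, in_b2):
        # Evict from T1 or T2 depending on the adaptive target p.
        if self.t1 and (len(self.t1) > self.p or (in_b2 and len(self.t1) == self.p)):
            k, _ = self.t1.popitem(last=False)   # evict LRU of T1 ...
            self.b1[k] = None                    # ... remembering it in ghost B1
        else:
            k, _ = self.t2.popitem(last=False)
            self.b2[k] = None

    def request(self, x):
        """Return True on a cache hit."""
        if x in self.t1 or x in self.t2:          # Case I: real hit -> MRU of T2
            (self.t1 if x in self.t1 else self.t2).pop(x)
            self.t2[x] = None
            return True
        if x in self.b1:                          # Case II: favour recency, grow p
            self.p = min(self.c, self.p + max(len(self.b2) // len(self.b1), 1))
            self._replace(False)
            self.b1.pop(x); self.t2[x] = None
            return False
        if x in self.b2:                          # Case III: favour frequency, shrink p
            self.p = max(0, self.p - max(len(self.b1) // len(self.b2), 1))
            self._replace(True)
            self.b2.pop(x); self.t2[x] = None
            return False
        total = len(self.t1) + len(self.t2) + len(self.b1) + len(self.b2)
        if len(self.t1) + len(self.b1) == self.c:  # Case IV: complete miss
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(False)
            else:
                self.t1.popitem(last=False)
        elif total >= self.c:
            if total == 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(False)
        self.t1[x] = None                          # new page enters T1
        return False
```

Note that all operations are a constant number of dictionary lookups and LRU pops, which is the constant-time-per-request, low-overhead property the paper emphasizes.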
- BHUNT: Automatic Discovery of Fuzzy Algebraic Constraints in Relational Data
Peter Haas and Paul Brown
International Conference on Very Large Databases (VLDB)
This paper develops a novel solution to the problem of automatically discovering hidden algebraic or fuzzy constraints between pairs of attributes in relational data. Such constraints are of interest in the context of data mining and query optimization. The paper shows reductions in table access time of up to two orders of magnitude, obtained by supplying this information to the optimizer in the form of constraint predicates along with selectivity estimates. The proposed method improves over previous data-driven efforts for learning about data relationships by requiring only probabilistic, rather than absolute, guarantees. The paper also explores practicality, investigating techniques that use system catalog information and data samples to control the processing cost of the discovery process. The technique proposed in the paper helped IBM achieve the #1 position in TPC-H benchmark performance for 10TB databases and earned the authors an OTAA. By establishing that automatically discovered constraints in relational data can be exploited for such drastic improvements in query performance, this paper should strongly influence the database community to pursue further research and development in this direction.
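To give the flavor of the idea, here is a loose sketch (not the paper's actual algorithm) of "bump hunting" over a pair of numeric columns: compute the induced values a - b, and split the sorted values wherever a gap is unusually large, yielding a fuzzy constraint of the form "a - b lies in one of a few narrow intervals" that an optimizer could turn into a predicate. The function name and the gap heuristic are invented for illustration.

```python
def find_fuzzy_constraint(a, b, gap_factor=3.0):
    """Sketch: find intervals that cover the values of a[i] - b[i].

    A real system (like BHUNT) would work from a sample, attach a
    probabilistic guarantee to the coverage, and try several algebraic
    forms (difference, ratio, ...); this toy version only splits the
    sorted differences at gaps much larger than the mean gap.
    """
    diffs = sorted(x - y for x, y in zip(a, b))
    gaps = [diffs[i + 1] - diffs[i] for i in range(len(diffs) - 1)]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    intervals, lo = [], diffs[0]
    for i, g in enumerate(gaps):
        if g > gap_factor * mean_gap and mean_gap > 0:
            intervals.append((lo, diffs[i]))   # close the current "bump"
            lo = diffs[i + 1]
    intervals.append((lo, diffs[-1]))
    return intervals
```

For example, if b holds order dates and a holds ship dates, most differences might fall in a tight interval like [1, 30], with a few outliers forming separate bumps; the resulting disjunction of range predicates is exactly the kind of constraint the optimizer can exploit.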
- LeakBot: An Automated and Lightweight Tool for Diagnosing Memory Leaks in Java Applications
Nick Mitchell and Gary Sevitsky
European Conference on Object Oriented Programming (ECOOP)
A memory leak is an error in a program's dynamic memory allocation logic that causes it to fail to reclaim discarded memory, possibly leading to abnormal program termination due to memory exhaustion. Memory leaks are a cause of major problems in languages that support dynamic memory allocation, which includes most modern programming languages such as Java and C#. They are also difficult to discover and debug because of the noise, complexity, and scale of heaps. This paper introduces a conceptual framework for categorizing and presenting memory behavior and describes a novel method, called heap evolution analysis, for automatically detecting and locating memory leaks. It presents details of the method's implementation in LeakBot, a lightweight, fully automated tool for memory leak diagnosis that is applicable to industrial-strength Java applications, and gives empirical results from using the tool on customer applications, demonstrating the method's effectiveness and scalability. The method incorporates three new and largely independent techniques: ranking data structures by their characteristics, grouping objects by evolution equivalence, and lightweight monitoring of evolution; patent applications have been filed on these techniques. Furthermore, the techniques have potential applications in domains other than memory problems (e.g., compiler optimizations) and may be applicable to studying graph evolution in other fields.
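The core intuition behind heap evolution analysis can be sketched in a few lines: take heap snapshots over time and rank as leak suspects the populations that grow monotonically, since transient noise fluctuates while a true leak only accumulates. This toy version ranks raw object types by count; LeakBot itself works at the level of whole data structures with richer characteristics, so the helper below is purely illustrative.

```python
from collections import Counter

def rank_leak_suspects(snapshots):
    """Sketch of the evolution idea: given time-ordered Counters of
    live-object counts per type, rank types whose population grows
    monotonically (noise fluctuates; leaks only accumulate)."""
    suspects = {}
    all_types = set().union(*snapshots)
    for t in all_types:
        counts = [s.get(t, 0) for s in snapshots]
        growth = counts[-1] - counts[0]
        monotone = all(b >= a for a, b in zip(counts, counts[1:]))
        if monotone and growth > 0:
            suspects[t] = growth
    # Largest sustained growth first.
    return sorted(suspects.items(), key=lambda kv: -kv[1])
```

In this framing, a `String` population that oscillates with normal allocation churn is filtered out automatically, while a steadily growing `HashMap$Entry` population floats to the top of the ranking.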
- Statistical Timing for Parametric Yield Prediction of Digital Integrated Circuits
J. A. G. Jess (U. of Eindhoven), K. Kalafala (IBM Microelectronics), S. R. Naidu (U. of Eindhoven), C. Visweswariah, R. H. J. M. Otten (U. of Eindhoven)
Design Automation Conference
Uncertainty in circuit performance due to process and environmental variations increases with each new generation of technology. Traditional corner-based analysis requires an exponential number of timing-analysis runs and produces pessimistic results. This paper presents a statistical timing approach that overcomes these problems. The delays of individual gates and wires are treated probabilistically to predict the distribution of the chip's circuit performance, and these distributions are expressed as parametric yield curves. For the first time in the literature, a practical method is proposed that can conduct statistical timing of large and complex digital chips while taking into account the myriad correlations necessary to preserve accuracy. Statistical timing not only leads to less pessimistic design, but also enables a quantitative measure for a robust design methodology that targets circuits whose performance is, to first order, insensitive to process and environmental variations. The paper was featured in a June 9 front-page EE Times article (http://www.eedesign.com/story/showArticle.jhtml?articleID=17408446), which begins with the lead-in: "The big technology splash at this year's Design Automation Conference came not from a commercial EDA company but from IBM Corp. ..."
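The idea of a parametric yield curve can be illustrated with a small Monte Carlo sketch (the paper's method is analytical and far more scalable; this is only a conceptual model). Gate delays share a chip-wide "global" variation, which induces exactly the kind of correlation the paragraph mentions, plus independent per-gate "local" variation; the yield at a target clock period is the probability that the slowest path still meets it. All parameter names and values below are invented for illustration.

```python
import bisect
import random

def yield_curve(paths, n_trials=5000, sigma_global=0.05, sigma_local=0.03):
    """Sketch: estimate a parametric yield curve by Monte Carlo.

    paths: list of paths, each a list of nominal gate/wire delays.
    A shared chip-wide factor correlates all delays in a sample;
    per-gate local variation is independent.
    """
    samples = []
    for _ in range(n_trials):
        g = random.gauss(0.0, sigma_global)            # chip-wide process shift
        chip_delay = max(
            sum(d * (1.0 + g + random.gauss(0.0, sigma_local)) for d in path)
            for path in paths)                         # slowest path wins
        samples.append(chip_delay)
    samples.sort()

    def p_meet(target):
        """Fraction of chips whose delay <= target clock period."""
        return bisect.bisect_right(samples, target) / n_trials

    return p_meet
```

Sweeping `p_meet` over candidate clock periods traces the yield curve: it rises from 0 far below the nominal delay to 1 far above it, and its steepness reflects how sensitive the design is to variation.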
- Steerable Interfaces For Pervasive Computing Spaces
Gopal Pingali, Claudio Pinhanez, Anthony Levas, Rick Kjeldsen, Mark Podlaseck, Han Chen, Noi Sukaviriya
IEEE International Conference on Pervasive Computing
This paper introduces and describes a novel model of pervasive computing in which interactive user interfaces are realized without dedicated computer interface devices, turning tables, walls, and other surfaces into natural interaction interfaces. This is achieved through steerable video projection and human gesture/activity recognition. As the survey in the paper shows, there has been a large body of work on individual facets of this problem, such as configurable projection systems and image-analysis-based human activity detection, and only recently have these individual pieces of work begun to mature. This paper shows how steerable interfaces can now be built (at least at a prototype level) by combining algorithmic and hardware improvements in each of these areas, many of them made by the paper's authors. The paper also presents an overview of a hierarchical system architecture and APIs for writing adaptable applications for such steerable interfaces. Different portions of the display environment can be associated with different widgets, and each widget has specified modes of human input. Moreover, the overall system framework demonstrates how the various components of the display (including the widgets) can be dynamically adjusted to reflect activity in the physical world (e.g., the user moving, or part of the wall becoming shadowed). This paper received the Best Paper Award from IEEE PerCom, which is regarded as one of the top two conferences in the area of pervasive computing.
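The widget-region idea above can be sketched as a tiny registry: each widget owns a region of the projected surface and a set of accepted input modes, recognized gestures are routed to the widget containing them, and widgets can be relocated when the physical environment changes (for example, steering away from a shadowed patch of wall). The class and method names here are invented to illustrate the architecture, not the paper's actual API.

```python
from dataclasses import dataclass

@dataclass
class Widget:
    name: str
    region: tuple        # (x, y, w, h) on the projected surface
    input_modes: set     # accepted inputs, e.g. {"touch", "point"}

class SteerableDisplay:
    """Toy registry mapping projected regions to widgets."""
    def __init__(self):
        self.widgets = []

    def add(self, widget):
        self.widgets.append(widget)

    def relocate(self, name, region):
        # E.g. steer a widget away from a shadowed part of the wall.
        for w in self.widgets:
            if w.name == name:
                w.region = region

    def dispatch(self, x, y, mode):
        # Route a recognized gesture to the widget whose region contains
        # it and which accepts this input mode; return the widget's name.
        for w in self.widgets:
            rx, ry, rw, rh = w.region
            if rx <= x < rx + rw and ry <= y < ry + rh and mode in w.input_modes:
                return w.name
        return None
```

Dynamic adjustment then amounts to calling `relocate` from whatever component tracks the physical environment, while application code keeps interacting with widgets by name.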