Self-managing heterogeneous storage systems

Note: This project is dormant.

The growth in the amount of data stored and manipulated for commercial, scientific, and intelligence applications is making data storage systems harder to manage and less reliable. As these large-scale systems expand to petabyte capacities, cost pressure leads to designs built from many cheap but relatively unreliable commodity storage servers. Such systems are expensive and difficult to manage (current figures show that management and operation costs are often several times the purchase cost), partly because of the number of components to configure and monitor, and partly because system management actions often have unexpected, system-wide side effects. They are also vulnerable to attack, because they have many entry points and no mechanisms to contain the effects of either attacks or subsystem failures.

Kybos is a distributed storage system that addresses these issues. It is designed to provide manageable, available, reliable, and secure storage for large data collections, including data distributed over multiple geographical sites. Kybos is self-managing, which reduces the cost of administration by eliminating complex management operations and simplifying the model by which administrators configure and monitor the system. Kybos stores data redundantly across multiple commodity storage servers, so that the failure of any one server does not compromise data. Finally, Kybos is built as a loosely coupled federation of servers, so that the compromise or failure of some servers does not prevent the remaining servers from continuing to take collective action toward system goals.

Our primary application is the self-management of federated (but potentially unreliable) clusters of storage servers, but we anticipate that the algorithms we have developed (and will implement) will apply broadly to the general class of problems in which independent autonomous agents must coordinate toward a collective set of mission goals.

Publications include:

- Richard A. Golding and Theodore M. Wong. Walking toward moving goalposts: agile management for evolving systems. In Proceedings of the First Workshop on Hot Topics in Autonomic Computing (HotAC I), June 2006

- W. W. Wilcke, R. B. Garner, C. Fleiner, R. F. Freitas, R. A. Golding, J. S. Glider, D. R. Kenchammana-Hosekote, J. L. Hafner, K. M. Mohiuddin, KK Rao, R. A. Becker-Szendy, T. M. Wong, O. A. Zaki, M. Hernandez, K. R. Fernandez, H. Huels, H. Lenk, K. Smolin, M. Ries, C. Goettert, T. Picunko, B. J. Rubin, H. Kahn, and T. Loo. IBM Intelligent Bricks project—Petabytes and beyond. IBM Journal of Research and Development, 50(2/3), pp. 181–198, March–May 2006

- Theodore M. Wong, Richard A. Golding, Joseph S. Glider, Elizabeth Borowsky, Ralph A. Becker-Szendy, Claudio Fleiner, Deepak R. Kenchammana-Hosekote, and Omer A. Zaki. Kybos: Self-management for distributed brick-based storage. IBM Technical Paper RJ10356, August 2005