ECML PKDD 2006 Tutorial/Workshop

Autonomic Computing:

A New Challenge for Machine Learning

The Tutorial/Workshop on "Autonomic Computing: A New Challenge for Machine Learning" will be held on September 22, 2006 in conjunction with the 17th European Conference on Machine Learning and the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases in Berlin, Germany.

Workshop Description

In this combined tutorial and workshop, we will explore the many new challenges for machine learning that arise in the context of the emerging field of Autonomic Computing. Large-scale distributed computing systems are evolving into multi-vendor tangles of heterogeneous components, each of which may have hundreds of configuration and tuning parameters, and they are rapidly becoming too difficult for humans to configure, tune and maintain. The goal of Autonomic Computing is to avert this looming complexity crisis by giving computing systems and their components the ability to manage themselves in accordance with high-level objectives from administrators.

Machine learning appears to be a promising approach to these challenges. Adaptive online learning and decision-making methods, such as reinforcement learning, could replace the laborious, time-consuming, and often suboptimal "hand-crafted" rules and models that are typically used for management tasks in state-of-the-art systems. Another hope is that adaptive online decision-making via ML approaches can cope with complex, non-stationary, dynamic system behavior better than the explicit control and queuing models used in the most advanced systems management today. Moreover, the manual selection of relevant metrics and measurements, which is essential for effective management, could be replaced by dimensionality reduction and active learning approaches that would help to "squeeze out" the most relevant information at minimal cost. Although large amounts of data are collected in systems applications, it does not necessarily mean that all the data being collected are informative with respect to the question of interest or relevant to the decisions that need to be made. Indeed, "we are drowning in data but starving for knowledge".

The good news about the systems management domain is that it is naturally suited for active learning and for exploration vs exploitation in reinforcement learning, since various measurements and tests can be constructed and performed on demand, a feature not always present in more "natural" domains such as biology or medicine, where tests are more costly and constrained. In the tutorial, we plan to cover in depth existing machine-learning approaches to various management tasks, such as:

  • reinforcement learning (RL), and particularly hybrid RL that combines expert knowledge with learning from data
  • active learning, cost-sensitive learning and exploration vs exploitation trade-offs in decision-making
  • dimensionality reduction and feature selection
  • complex network properties and algorithms
  • examples of applying the above approaches to various systems management tasks, including resource allocation and diagnosis.

    In the workshop following the tutorial, we hope to bring together academic and industrial researchers to share their experience in applying machine learning to complex distributed systems management, identify new challenges for the machine-learning community, match mature technologies to current problems, and chart the trajectory of interdisciplinary research techniques that can be applied in Autonomic Computing. We also hope to build this workshop upon the success of our two recent NIPS workshops: the NIPS-05 workshop on Value of Information (http://domino.research.ibm.com/comm/research_projects.nsf/pages/nips05workshop.index.html), which relates to active learning, feature selection and exploration vs exploitation in RL, and our NIPS-03 workshop on Complex Networks (http://www.research.ibm.com/nips03workshop/), which relates to various issues in today's actively researched area of artificial (internet, web) and natural (biological, social, etc.) complex networks and distributed systems. We also hope to build upon the SysML-06 workshop (http://research.microsoft.com/workshops/sysml/), organized in conjunction with the ACM SIGMETRICS conference this year (with one of our organizers, Gerry Tesauro, on the program committee).

    Topics

    Topics of interest include, but are not limited to:
    • Applications of machine learning to distributed systems and networks of various kinds, including Internet applications, peer-to-peer systems, Grid computing, and sensor networks
    • New challenging problems for machine learning that arise in systems management areas such as security, reliability, and performance.
    • Online learning methods for handling large volumes of real-time data, e.g. event streams
    • Scalability issues
    • Feature selection, dimensionality reduction, clustering and other techniques for filtering large amounts of systems data
    • Adaptive decision-making approaches that implement self-management capabilities (e.g., in resource allocation)
    • Any novel machine learning techniques that are successfully applied to systems management
    • Examples of real-world systems that benefit from machine-learning approaches

    Paper Submission

    We invite submissions of both technical papers (12 pages maximum) and short position papers and/or work in progress (6 pages maximum). Papers will be selected for presentation and/or publication in the workshop proceedings based on their originality, technical merit, topical relevance and likelihood of generating discussion at the workshop. Submitted papers will be reviewed by at least two members of the Program Committee. All submissions should be made electronically, as an email attachment, preferably in PostScript or PDF format. All submissions must be sent to rish@us.ibm.com. Although not required for the initial submission, we recommend following the format guidelines of ECML/PKDD (Springer LNCS -- LaTeX Style File available at http://www.springer.de/comp/lncs/authors.html), as this will be the required format for accepted papers. The workshop proceedings will be available online, and hard copies will be distributed during the workshop.

    Important Dates

    • Submission deadline: July 17, 2006. Submit papers to rish@us.ibm.com.
    • Notification of acceptance: July 31, 2006
    • Camera-ready copies due: August 21, 2006. Submit papers to rish@us.ibm.com.
    • Workshop: September 22, 2006

    Workshop Program

    Morning Session: 10:40 - 12:10

    10:40 TUTORIAL: Part 1
    Irina Rish and Gerald Tesauro

    Afternoon Session: 13:30 - 18:00

    13:30 TUTORIAL: Part 2
    Irina Rish and Gerald Tesauro

    14:30 Reliable, Adaptive Distributed Systems: RADical New Challenges For Machine Learning
    Armando Fox

    15:10 coffee break

    15:30 Inferring Network Structure from Co-Occurrences
    Michael Rabbat

    16:10 Statistical Software Debugging
    Alice Zheng

    16:50 break

    17:00 Resource Access Pattern Mining for Dynamic Energy Management
    Dinesh Rajan

    17:20 A New Distributed Data Mining System on Grid
    Huaiguo Fu

    17:40 discussion

    Organization

  • Irina Rish, Gerald Tesauro, Jeff Kephart, Rajarshi Das
    IBM T.J. Watson Research Center
    19 Skyline Drive, Hawthorne, NY 10532
    email: {rish,gtesauro,kephart,rajarshi}@us.ibm.com

  • Photo by Land Berlin/Thie