In this paper, we describe an efficient algorithm for lazily decomposing aggregates such as records and arrays into simpler components based on the access patterns specific to a given program. This process allows us both to identify implicit aggregate structure not evident from declarative information in the program, and to simplify the representation of declared aggregates when references are made only to a subset of their components. We show that the structure identification process can be exploited to yield the following principal results:
-
A fast type analysis algorithm applicable to program maintenance
applications such as date usage inference for the "Year 2000"
problem.
An efficient algorithm for atomization of aggregates. Given
a program, an aggregate atomization decomposes all of the data that
can be manipulated by the program into a set of disjoint
atoms such that each data reference can be modeled
as one or more references to atoms without loss of semantic
information. Aggregate atomization can be used to adapt
program analyses and representations designed for scalar
data to aggregate data. In particular, atomization
can be used to build more precise versions of program
representations such as SSA form or PDGs. Such representations
can in turn yield more accurate results for problems such as
program slicing.

A fast type analysis algorithm applicable to program maintenance
applications such as date usage inference for the "Year 2000"
problem.