Integrity Measurement Architecture (IMA)

Prologue: This informational page answers the What, Why, and How regarding the Integrity Measurement Architecture (IMA) in a generally understandable manner. You can find technical and scientific papers by following the links in the 'more information' section.

Contents:

What is IMA?
Why IMA?
How does IMA work?
Trade-offs & Challenges
Download
Short History
Last and Least
More Information




What IMA is

IMA is a software architecture and implementation for Linux that provides verifiable evidence about the current run-time of a measured system. Another system can use this evidence to derive run-time properties of the measured system. With IMA evidence, the verifying system does not rely on the trustworthiness of the measured system's software environment to establish such guarantees; it builds instead on a Trusted Platform Module (TPM), a hardware extension of the measured system that is protected against the system software. TPMs exist today on most client systems, and equivalent functions are planned for many server systems as well. IMA leverages the TPM as a hardware root of trust on which trust in system properties can be built. The TPM is protected against the system software of the measured system because it can be addressed only through a slim and well-designed interface.

IMA maintains a list of hash values covering all executable content loaded into a Linux system's run-time since the start (boot) of the system. It integrates the measurements taken by the BIOS, the bootloader, and the operating system, and offers a single interface through which a remote system can retrieve these hash values (measurements). The list is integrity-protected by the TPM chip at all times. IMA offers interfaces to retrieve the measurements from the kernel as well as an integrity value over these measurements from the TPM chip. Given the measurements and the TPM integrity value over them, a verifier can establish trust in the measurements and, through the measurements, trust in the run-time properties of the measured system.

What IMA is not

IMA does not control your system. IMA is non-intrusive and is best described as an independent observer that collects integrity information about loaded code or sensitive application files on demand. Consequently, IMA does not prevent a system from illegal behavior that might compromise the system, including the integrity measurement architecture itself. When IMA recognizes the danger of being bypassed, it simply invalidates its own measurement list by invalidating the TPM integrity aggregate, thereby rendering the evidence useless (non-verifiable) until the aggregate is reset during the next system reboot. For example, if applications write directly to a device (/dev/hda, /dev/sda) or kernel memory (/dev/kmem), then IMA invalidates the TPM aggregate.

IMA is not a Digital Rights Management tool either. IMA collects evidence on the local system; this evidence can be used for many purposes, but its release is fully controlled by the local system. There is no doubt that it gets harder to lie about what you are running when you use IMA. However, if system security is to become reality, systems that can lie at will are not a convincing alternative in a distributed environment where the weakest link determines the security of the distributed service. The price to pay is that properties must be established securely. The balance between use and abuse of knowledge about such properties, as well as the validity of requiring such evidence (e.g., before connecting a system to a video-on-demand service), must be controlled by rules that are enforced the same way they are in other areas of our daily life (most likely by laws).

Guarantees

IMA guarantees that all executable content and all application-requested file content is accounted for in the measurement list before it is loaded, and that compromised system software cannot afterwards change the list of accumulated measurements without being noticed.

Assumptions (Trusted Computing Base)

To operate correctly, IMA must be enabled and the kernel must be working as expected, since IMA relies on (i) being called before executable code is loaded, (ii) kernel functions for reading the file from the file system and computing a SHA1 hash over the file, and (iii) access to the Trusted Platform Module to maintain and protect an integrity value over the history of hashes (measurements). IMA resides in the operating system kernel and automatically intercepts the loading of executable content and kernel modules. IMA currently uses Linux Security Module (LSM) hooks, which are spread throughout the kernel and to which security modules (such as IMA) can register. IMA adds one hook of its own to measure kernel modules before they are relocated in memory, because no LSM hook is available for this purpose; if such a hook becomes available, IMA will use it. IMA is called by the kernel before any file is mapped into the virtual memory space (mmap LSM hook) and therefore before the file contents affect the system behavior.

IMA does not prevent a system from becoming compromised. Therefore, we must assume that the system is compromised and use the measurements themselves and the TPM integrity value to reason about the validity of the measurement list. Since a compromised system can append measurements to the end of the measurement list at will or omit new measurements, the main challenge is to determine up to which point the measurement list is valid and complete. Because IMA measures programs before they are run, a verifier can decide whether a loaded program is malicious or exhibits fundamental vulnerabilities. The first measurement corresponding to such a distrusted program is the last measurement in the list that can be trusted to have been recorded correctly.

The TPM itself is (so far) not designed to withstand physical attacks. If attackers have physical access to the TPM chip on a system, they can manipulate the way the TPM works, potentially extract the signature keys and forge TPM PCR signatures, or simply control which measurements from the operating system actually reach the TPM and are protected by its PCR registers. As a result, to establish trust in a remote system where physical attacks must not be able to break the security, additional security perimeter protection must be applied. See for example our virtualized TPM project that enables us to run a virtual TPM in a tamper-responding secure coprocessor.
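The cut-off rule described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not part of IMA: the function name `trusted_prefix`, the `known_good` set, and the short fingerprint strings are assumptions made for the example.

```python
def trusted_prefix(measurements, known_good):
    """Return the leading part of the list that was recorded correctly.

    The prefix ends with the first measurement that is not known to be good:
    that entry was still recorded before the program ran, but every entry
    after it may have been forged or omitted by the distrusted program.
    """
    prefix = []
    for digest in measurements:
        prefix.append(digest)
        if digest not in known_good:
            break  # first distrusted program: stop trusting the list here
    return prefix


mlist = ["aa11", "bb22", "ee99", "cc33"]   # hypothetical fingerprints, in load order
good = {"aa11", "bb22", "cc33"}            # hypothetical set of trusted fingerprints
print(trusted_prefix(mlist, good))         # ['aa11', 'bb22', 'ee99']
```

The entry "cc33" is known to be good, yet it is excluded, because it was recorded after a distrusted program had already been loaded.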

What about IMA's future?

Recently, IMA has also been used simply to protect measurements, leaving the way these measurements are created to other kernel modules. One application of this form of IMA is the Trusted Linux Client (TLC, slides, paper draft), which implements a low-watermark mandatory integrity policy based on the Linux Security Module. The TLC component protects and measures system files when they are loaded and then calls IMA to maintain and protect the measurement list and to support remote attestation. In this case, IMA guarantees that the collected measurements are protected and can be securely attested to remote parties. It is the responsibility of the module calling IMA (here: the TLC component) to ensure that the collected measurements are sufficient to enable remote parties to establish the desired properties of the system. This approach proved to be very flexible and moves IMA into the position of a generic kernel module that manages the TPM PCRs and offers interfaces for managing, protecting, and retrieving kernel measurements.



Why IMA?


Distributed applications rule today's digital world. However, establishing trust in distributed applications, specifically in distributed run-time environments, is very difficult. IMA improves this situation by keeping a secure log of all executable code, kernel modules, and important configuration files that have been loaded into a system's run-time since system startup. Since the sum of loaded files makes up the run-time environment of a system and determines its behavioral properties, such information can be very helpful in establishing (security) properties of a remote system. Whether and to whom a system exposes its load history is fully under the control of the system; however, a system cannot cheat about its logs without being noticed (freshness and integrity of the log are guaranteed).

Another good reason for IMA is that some systems deploy well-designed, updated, and evaluated code and some don't. Using IMA, a system can prove to a customer or even its administrator (directly or indirectly through trusted third parties) that it is running good programs. This can be a decisive factor for getting a return on investment for better code. This reminds me of organic food: everybody I know has some doubt whether it is really organic and worth the additional money. For food, there are controls that allow producers to label organic food with protected labels. To be allowed to use such a label, the growing facilities and handling of the food must conform to defined standards. Organic growth facilities are inspected and screened from time to time to ensure (and encourage) that they conform to the rules. IMA does the same thing for system software: it creates evidence of how a system grows and develops. Systems that care to show that they are doing good work can go through additional efforts to receive the "label" for which a customer is willing to pay. IMA evidence can either be used directly by delivering measurement logs of loaded software in real-time to a customer, or indirectly through certified trust of third parties.

Why this name?

Integrity
relates to the integrity of data and programs loaded into the run-time of a system

Measurement
relates to measuring properties of loaded data and code. IMA creates a unique fingerprint of selected data files and all executable files loaded into the run-time of a system by applying a hash function, such as SHA1, over the file contents. For all practical purposes, different files yield different fingerprints, and files with exactly the same content yield the same fingerprint. Comparing such a fingerprint of a loaded file to fingerprints of known data and executable files enables us to transfer the properties of the known files to the files loaded on the remote system. It allows a verifier to measure the properties of the loaded file by matching its fingerprint against properties of known files.
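The fingerprinting idea can be tried out with Python's standard `hashlib` module. This is a minimal sketch: the helper name `fingerprint` and the sample file contents are made up for illustration, and the hash is taken over an in-memory byte string rather than over a real file.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """SHA-1 fingerprint of a file's contents, as a hex string."""
    return hashlib.sha1(content).hexdigest()


original = b"#!/bin/sh\necho hello\n"
copy = bytes(original)                 # same contents, e.g. stored under another path
patched = original + b"echo extra\n"   # contents differ by one line

print(fingerprint(original) == fingerprint(copy))      # True
print(fingerprint(original) == fingerprint(patched))   # False
```

The fingerprint depends only on the contents, so a verifier who knows the fingerprint of a trusted file can recognize that file wherever and under whatever name it was loaded.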

Architecture
emphasizes the potential of IMA to be a generic solution. While we showed (by implementation) that the IMA approach works for Linux and while it is likely to work for Unix-based systems in general, it is not clear that it works for MS Windows-based systems.



How does IMA work?

Operating principle

"measure what you load before you load it"

Figure 1 depicts the components of the Integrity Measurement Architecture and their interactions. As part of the kernel initialization, IMA integrates the measurements taken during the system boot as a seed into its own measurement list. Then, IMA continuously measures (a) executable content that is loaded into the run-time and (b) application-determined sensitive files. Every measured file is treated the same way: a SHA1 hash (MD5 can be configured instead) is computed over the file, added to the measurement list, and aggregated into the TPM integrity aggregate so that the aggregate matches the new list. The measurement list is reset at reboot, when the TPM registers are reset automatically as well.

[Figure 1: Components of the Integrity Measurement Architecture and their interactions]
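The measure-append-aggregate cycle can be modelled in user space as follows. This is a toy sketch, not the kernel implementation: the class `MeasurementLog` is hypothetical, a plain Python attribute stands in for the TPM register, and the aggregation rule (hash of the old aggregate concatenated with the new measurement) is the one used in the verification pseudocode further down this page.

```python
import hashlib


class MeasurementLog:
    """Toy model of the measurement list and its integrity aggregate."""

    def __init__(self, boot_measurements=()):
        self.entries = []            # ordered list of 20-byte SHA-1 digests
        self.aggregate = bytes(20)   # stand-in for the TPM register after reset
        for digest in boot_measurements:
            self._record(digest)     # boot-time measurements seed the list

    def _record(self, digest: bytes) -> None:
        self.entries.append(digest)
        # Aggregate the new entry so the aggregate always matches the list.
        self.aggregate = hashlib.sha1(self.aggregate + digest).digest()

    def measure(self, content: bytes) -> bytes:
        """Measure file contents before they are loaded."""
        digest = hashlib.sha1(content).digest()
        self._record(digest)
        return digest


log = MeasurementLog()
log.measure(b"contents of /sbin/init")   # hypothetical file contents
log.measure(b"contents of /bin/bash")
print(len(log.entries), log.aggregate.hex())
```

Because each aggregate value depends on the previous one, an entry cannot be removed or reordered later without changing the final aggregate.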

a) Automatic measurements of executable files

IMA automatically measures executable code and stores and protects the measurement before executing the code. This way, even if the loaded executable compromises the system, it cannot hide the fact that it was loaded without, at the same time, forfeiting the ability to successfully attest to the run-time environment. IMA extends the Trusted Computing Group's 'measure-before-load' principle, which is followed throughout the boot process, into the run-time of a system. IMA represents a "one-stop-shop" for measurements collected during the boot time and during the run-time of a system up to the current time. It exposes the measurement list through the security file system /sys/kernel/security/ima. The integrity of this list can be verified by retrieving the signed integrity value from the TPM through a TPM library. Both the measurement list and the signed integrity value can be released to authorized remote parties in a controlled way by implementing an attestation service. The receiver validates the measurement list using the signed integrity value. Afterwards, the receiver uses the list of measurements to derive system properties. The challenging party sends a nonce in the request, which is included in the quote to protect against replay attacks.

b) Manual (application-induced) measurements of important input files

IMA offers applications a convenient and simple interface to induce the measurement of an important input file, such as a configuration file or an interpreted script (e.g., a bash startup script). An application can request IMA to measure a file simply by opening this file read-only and then writing its file descriptor into /sys/kernel/security/ima/request. As a proof of concept, we have equipped bash with such measure calls at the points where script or source files are read to be executed or sourced by the shell. We added only 2 such measure calls to the bash source code to include all executed bash command and source files in the measurement list before they affect the system. Although this increases the length of the measurement list, it certainly also increases its value, since the system run-time depends heavily on the correct startup of the privileged system services. IMA usually benefits from existing file protection for memory-mapped executable files, through which the kernel ensures that a measured file does not change between measuring and loading. For user space measurement requests, this does not hold, and any measured file could change after it is measured and before or while it is read by the application. For this reason, IMA keeps track of which measured files are open; if such a file changes before the measuring processes have closed it, this is a violation and IMA will invalidate the measurement list and TPM aggregate. It is the responsibility of the application to request the measurement of a sensitive file before reading from it. Again, IMA does not prevent illegal write access to files being measured, but it prevents such a system from successfully attesting to its potentially invalid measurement list.

A real example of a full measurement list after booting a Linux system and starting the major services can be found here. The depicted list includes application-induced measurements of the Bash shell. Figure 1 also shows how these measurements and the TPM signature are retrieved (1) by a challenging party and validated (2) before they are used to establish properties of the attested system, using a database that maps measurements onto files and their properties (3). Additionally, known vulnerabilities of loaded code or known malicious input files and their criticality can be determined using external sources such as the CERT or SANS vulnerability databases.
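The open-file tracking rule for application-induced measurements can be pictured with a small bookkeeping model. This is a sketch of the rule as stated above, not of the kernel code: the class and method names are hypothetical, and the calls that IMA would receive from kernel hooks are made by hand.

```python
class MeasuredFileTracker:
    """Toy model of the 'measured file changed while still open' check."""

    def __init__(self):
        self.open_count = {}   # path -> number of measuring processes holding it open
        self.valid = True      # False once list and aggregate have been invalidated

    def measured_open(self, path):
        """A process has opened the file and requested its measurement."""
        self.open_count[path] = self.open_count.get(path, 0) + 1

    def closed(self, path):
        """A measuring process has closed the file again."""
        remaining = self.open_count.get(path, 0) - 1
        if remaining > 0:
            self.open_count[path] = remaining
        else:
            self.open_count.pop(path, None)

    def written(self, path):
        """The file was modified; a violation if a measuring process still has it open."""
        if path in self.open_count:
            self.valid = False


tracker = MeasuredFileTracker()
tracker.measured_open("/etc/example.conf")   # hypothetical configuration file
tracker.written("/etc/example.conf")         # changed while still being read
print(tracker.valid)                         # False
```

A write after the last measuring process has closed the file is not a violation in this model; the file would simply have to be measured again before its next use.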

Remote Attestation using IMA

The functioning of the Linux Integrity Measurement Architecture and the related remote attestation is best explained by following the messages that are exchanged between the attested and the challenging system and by considering the processing steps of the messages. Figure 2 shows the message-time-diagram of a typical attestation based on IMA:

[Figure 2: Message-time diagram of a typical attestation based on IMA]

First, the challenging party creates an unpredictable nonce and sends it in step 2, together with the attestation request, to the attesting party. An example of an attestation request is captured here. The attesting party instructs the TPM to load a so-called Attestation Identity Key (AIK), more specifically its private key part. Then it submits the nonce to the TPM in a "quote" request. The TPM signs the nonce together with the current Platform Configuration Register (PCR) contents using the loaded AIK private key and returns the signature to the attesting party. Finally, the attesting party retrieves the complete ordered list of measurements (taken since reboot) from the IMA extension of the kernel through the IMA file system interface. In step 3, the signed PCRs and the list of measurements are returned as a response to the attestation request. An example of an attestation response including measurements and TPM quote can be found here (not including the BIOS measurements).

The challenging party, on receiving the response in step 3, first verifies the signature on the nonce and PCR contents in step 4, using the certified public part of the Attestation Identity Key. If the signature verifies and the signed nonce equals the nonce of the attestation request in step 2, the challenging party recalculates the PCR over the measurements in step 5 by simulating the PCR extensions, consuming all delivered measurements in order, one after the other. Recalculating the PCR content can be done as follows:

Verifying the PCR aggregate

in PCR: TPM-signed aggregate from step 3
in MList: Measurement list from step 3

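 {
   /* virtual PCR; starts as 20 zero bytes, like a freshly reset TPM register */
   uchar PCR_tmp[20] = {0...0}

   /* replay every extend operation in list order; "|" denotes concatenation */
   for (i=0; i<MList.len; i++)
        PCR_tmp = SHA1(PCR_tmp|MList[i])

   /* accept the list only if the replay reproduces the signed value */
   if (PCR == PCR_tmp)
        return OK
   else
        return INVALID

 }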

If, after consuming all measurements, the result in the virtual PCR equals the signed PCR (integrity value) from step 3, then the measurement list is untampered and the challenging party moves on to step 6. If the recalculated PCR does not match the signed PCR, then the measurement list has been tampered with (by the attesting party, during transmission, or by the challenging party) and cannot be used.

In step 6, the challenging party starts identifying the source of each measurement in the measurement list and determines whether this source (a specific data file or executable) is trusted not to manipulate future measurements. If all measurements are trusted not to manipulate the measurement environment, then the measurement list reflects all files loaded into the run-time of the attesting party. To identify the source of a measurement, we used a database of known fingerprints that determines for each measurement (SHA1 or MD5 value) the respective program or file and its version as well as additional property information, e.g., trusted program or kernel, vulnerable program, or malicious rootkit component. Using these properties, the run-time of the attested system can be approximated. Our databases for Red Hat Fedora Core 3 or 4 include around 25000 measurements; typical measurement lists accumulate about 700-1000 measurements.
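For readers who want to experiment, the recalculation can be written as runnable Python. This is a sketch under the same assumptions as the pseudocode above (a 20-byte register that starts at zero and is extended with SHA1 over the old value concatenated with each measurement); the function names and sample inputs are chosen for illustration, and the "signed" value is computed locally instead of coming from a TPM quote.

```python
import hashlib

PCR_SIZE = 20  # length of a SHA-1 digest in bytes


def recompute_pcr(mlist):
    """Replay all extend operations over the ordered measurement list."""
    pcr = bytes(PCR_SIZE)                       # register value after reset
    for measurement in mlist:
        pcr = hashlib.sha1(pcr + measurement).digest()
    return pcr


def verify_aggregate(signed_pcr, mlist):
    """True if the list reproduces the signed aggregate, False otherwise."""
    return recompute_pcr(mlist) == signed_pcr


# Hypothetical measurements; a real list would hold hashes of file contents.
mlist = [hashlib.sha1(name).digest() for name in (b"kernel", b"init", b"bash")]
signed_pcr = recompute_pcr(mlist)               # stands in for the quoted PCR value

print(verify_aggregate(signed_pcr, mlist))        # True
print(verify_aggregate(signed_pcr, mlist[:-1]))   # False: last entry dropped
print(verify_aggregate(signed_pcr, mlist[::-1]))  # False: order changed
```

Dropping, adding, or reordering entries changes the recomputed value, which is why the list must be delivered complete and in its original order.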
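The database lookup in step 6 amounts to a dictionary lookup per measurement. The following sketch uses a made-up three-entry database with shortened fingerprints; the entry names, the property labels, and the policy of accepting only "trusted" entries are assumptions for the example, not the contents or policy of our actual databases.

```python
# Hypothetical fingerprint database: digest -> (file and version, property).
DATABASE = {
    "aa11": ("example-kernel 1.0", "trusted"),
    "bb22": ("example-daemon 2.3", "vulnerable"),
    "cc33": ("example-rootkit", "malicious"),
}


def evaluate(mlist, database):
    """Map every measurement to its source and property; unknown digests are flagged."""
    return [(digest,) + database.get(digest, ("unidentified", "unknown"))
            for digest in mlist]


def acceptable(findings, allowed=("trusted",)):
    """Example policy: every loaded file must carry an allowed property."""
    return all(prop in allowed for _digest, _name, prop in findings)


findings = evaluate(["aa11", "bb22", "ff00"], DATABASE)
for digest, name, prop in findings:
    print(digest, name, prop)
print(acceptable(findings))   # False: one vulnerable and one unknown entry
```

Which properties count as acceptable is a decision of the application that uses the attestation, so the `allowed` parameter is deliberately left open.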

It is then up to the application of our Integrity Measurement Architecture to decide which properties of these files and the resulting run-time on the attesting system are important. Examples are to determine if known malicious code was executed on the attesting party, if the attesting party is running a known configuration of programs, or if the loaded applications have known vulnerabilities.

Two security mechanisms in the above protocol deserve mentioning. First, an unpredictable nonce provided by the remote party (challenging party) is included in the TPM signature and ensures freshness of the quote and thus of the measurement list. The attesting party cannot replay an old measurement list and an old quote to hide some recently loaded programs. Second, the TPM signature is verified to ensure that the signature key is actually a TPM key. A so-called platform certificate, connected with the TPM public key certificate, validates the system the TPM is embedded into and the correct embedding of the TPM chip.

It should be obvious from Figure 2 that the attesting party is active and controls whether a signature and the measurement list are provided to the challenging party. The attesting party, however, cannot cheat with the measurement list without being noticed.
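The effect of the nonce can be demonstrated with a small model. Note the substitution: a real quote is an asymmetric signature created inside the TPM with the AIK private key and checked with the certified AIK public key, whereas this sketch uses an HMAC with a shared demo key purely to keep the example self-contained. The key and the function names are hypothetical.

```python
import hashlib
import hmac
import os

STAND_IN_KEY = b"not-a-real-aik"   # assumption: replaces the AIK key pair


def tpm_quote(nonce: bytes, pcr: bytes) -> bytes:
    """Stand-in for the TPM quote: binds the nonce to the current PCR value."""
    return hmac.new(STAND_IN_KEY, nonce + pcr, hashlib.sha256).digest()


def quote_is_fresh(signature: bytes, nonce: bytes, pcr: bytes) -> bool:
    """Challenger's check: the quote must cover the nonce of this request."""
    return hmac.compare_digest(signature, tpm_quote(nonce, pcr))


pcr = bytes(20)                         # hypothetical PCR value
old_nonce = os.urandom(20)
old_quote = tpm_quote(old_nonce, pcr)   # answer to an earlier challenge

new_nonce = os.urandom(20)              # nonce of the current challenge
print(quote_is_fresh(old_quote, old_nonce, pcr))   # True
print(quote_is_fresh(old_quote, new_nonce, pcr))   # False: replayed quote
```

Since the challenger picks a new unpredictable nonce for every request, a quote recorded earlier never matches the current challenge.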

Application Example:

Assume you would like to use a server application on a remote system to compute some partial result for you. Using IMA, you can evaluate the remote system and its application run-time (i) before sending the computation request as well as (ii) after receiving the result of the computation. If the system properties are acceptable both times and the system did not reboot in-between (hardware counters in the TPM can help to detect reboots between attestations), then the system was acceptable throughout the computation as well and you would assign a high trustworthiness to the result. For privacy reasons regarding the measured system, you might need to choose a trusted third party that retrieves and evaluates the measurements for you and reports the results to you while keeping the specific measurements undisclosed.

Generalizing the above example, to supervise properties of remote systems, periodic inspection using IMA helps to provide the necessary trustworthy evidence. We name the periodic property attestations the "integrity heart-beat" of the system. At each "beat", we verify that the expected system properties still persist.
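The before-and-after check and the heart-beat can be expressed as two small functions. This is a sketch only: the `Beat` record and its fields are hypothetical, `acceptable` is assumed to be the outcome of a complete attestation (quote verified and measurements evaluated), and `boot_counter` is an assumed way of representing the reboot information derived from the TPM hardware counters.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    acceptable: bool    # outcome of one complete attestation
    boot_counter: int   # hypothetical reboot indicator derived from TPM counters


def result_trustworthy(before: Beat, after: Beat) -> bool:
    """Trust a result only if both attestations pass and no reboot lies in between."""
    return (before.acceptable and after.acceptable
            and before.boot_counter == after.boot_counter)


def first_failed_beat(beats):
    """Index of the first beat at which the expected properties no longer hold."""
    for index, beat in enumerate(beats):
        if not beat.acceptable:
            return index
    return None


print(result_trustworthy(Beat(True, 7), Beat(True, 7)))   # True
print(result_trustworthy(Beat(True, 7), Beat(True, 8)))   # False: rebooted in between
print(first_failed_beat([Beat(True, 7), Beat(True, 7), Beat(False, 7)]))   # 2
```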

How does IMA relate to remote attestation?

Remote attestation refers to the act of remotely verifying that software or hardware is genuine (as expected) or correct. IMA helps to securely maintain and later retrieve evidence describing what was loaded into a system's run-time. Remote parties can relate system properties to this evidence. In this case, IMA supports attesting to such properties from a remote location (e.g., a cluster management console).



Trade-offs & Challenges

Public Key Infrastructure: Attestation in general requires the challenging party to establish and maintain TPM attestation key certificates in order to relate measurements and related properties to the correct system and to ensure that the quote and PCR were signed by a TPM and not by a masquerading corrupt system. Therefore, TPM-based attestation can inherit the scalability problems inherent in today's public key infrastructures.

Measurement Scope: IMA only keeps a single measurement for each file loaded into the run-time. It measures a file the first time it is loaded and stores the measurement. The measurements do not reflect the order of later loads or the number of times a file is loaded into the run-time. Keeping a measurement for each load would severely lower the scalability for systems with prolonged up-time.

Furthermore, IMA adds dirty flagging so that only files that could have changed since their last measurement are re-measured. It also adds bypass protections to detect actions that try to change files without invoking the dirty flagging. Such detections result in violations that invalidate the TPM PCR and prevent the system from successfully attesting to its measurements (step 5 in Figure 2 will fail until reboot and reset).

Additionally, measuring everything loaded into the run-time (any file, input, etc.) would capture the run-time better, since the run-time depends on all loaded code and handled input on today's COTS systems with no strong confinement properties. However, most such measurements would not be predictable and the challenging party would not know what they mean. Therefore, we focus on measuring executables and important configuration files or interpreted files. Current enhancements, such as the Policy Reduced Integrity Measurement Architecture (PRIMA), measure confinement and information flow properties of the attesting system with regard to certain target applications and then focus on measuring those files that potentially affect the target applications and their trusted computing base (TCB), while avoiding measurement of all the other, potentially proprietary, parts of the system.
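The interplay of "measure once" and dirty flagging can be modelled as a small cache. This is a sketch, not the kernel data structure: the class and method names are hypothetical, and the choice not to log a second entry when a re-measured file turns out to be unchanged is an assumption of this example.

```python
import hashlib


class MeasurementCache:
    """Toy model: measure a file on first load, re-measure only if flagged dirty."""

    def __init__(self):
        self.measured = {}   # path -> digest of files currently considered clean
        self.log = []        # ordered measurement list (hex digests)

    def load(self, path: str, content: bytes) -> bool:
        """Called before a file is loaded; returns True if a new entry was logged."""
        if path in self.measured:
            return False                        # clean and already measured: skip
        digest = hashlib.sha1(content).hexdigest()
        self.measured[path] = digest
        if digest in self.log:
            return False                        # identical content is already listed
        self.log.append(digest)
        return True

    def mark_dirty(self, path: str) -> None:
        """Called when the file may have changed, e.g. after it was opened for writing."""
        self.measured.pop(path, None)


cache = MeasurementCache()
cache.load("/bin/example", b"version 1")   # hypothetical path and contents
cache.load("/bin/example", b"version 1")   # second load: no new measurement
cache.mark_dirty("/bin/example")
cache.load("/bin/example", b"version 2")   # changed file: measured again
print(len(cache.log))                      # 2
```

The list therefore grows with the number of distinct file contents loaded, not with the number of load operations.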

Runtime Corruption: IMA vouches for what was loaded into a system but does not give guarantees after the loading. If malicious input can render a good program evil then IMA will not produce evidence of such a corruption. However, the property of a program to be susceptible to malicious input is part of the property of the loaded code. If this vulnerability of a program is known when the measurements are evaluated, then this vulnerability becomes part of the properties that are preserved by IMA. Here, the system environment and risk management can provide further decision points to conclude if this program is assumed to work correctly.

Known insufficient boundary checks and validations are measurable properties of the software, but what about 'sufficient' boundary checks? Formal verification of software might remain more dream than reality for most software for the time being. Unless there is a formal specification of what an application is supposed to handle as input to produce correct output (known as partial correctness), the 'sufficient' attribute for input checking remains opinion rather than science.

Scalability: Evaluating the measurements requires the challenging party to assign programs and then properties to the protected hash values in the measurement list. The database that stores measurements and related programs and properties can become large and its maintenance laborious; maintenance should therefore be automated (integrated into the system software management). This becomes a scalability issue if proprietary or locally compiled software is used on systems that run IMA; in this case, the translation of measurements into programs and their properties cannot be done in advance but requires inspecting the proprietary object code related to such a measurement at evaluation time.



Where can I download IMA?


This technology, named linux-ima, can be downloaded from sourceforge.net and resides in the projects/linux-ima folder. It is available under the GNU General Public License, version 2. It will also be available for virtualized environments (Xen), using virtual TPMs to attest to the run-time environments of virtual machines.

To use IMA, you need
  • a Trusted Platform Module (we can experiment without TPM by configuring IMA in the Linux kernel with the TPM-bypass option; in this case, the measurement list is not protected against a corrupted system and you won't have boot stage measurements)
  • a Trusted Boot Loader that makes use of the TPM to measure and log boot stages before they are loaded and kernel configurations that IMA builds upon (we can experiment without the TPM Grub extension but we won't be able to establish a measurement chain from the TPM core root of trust into the IMA kernel measurements since we will miss the boot measurements)
  • the IMA kernel patch
  • a library implementing the Quote function to retrieve the signed PCR from the TPM (e.g., TrouSerS, an open-source TPM software stack, or any other open-source TPM library)
  • a service that responds to remote trusted parties' attestation requests with the measurement list and the signed PCR aggregate value
The IMA documentation (Documentation/ima/INSTALL and Documentation/ima/integrity_measurements.txt) describes how to retrieve measurement lists from the kernel. The description of the TSS or TPM library shows how to retrieve signed PCRs from the TPM.



Short history

IMA started out as a project aimed at leveraging the TPM hardware and finding ways to improve system security with TPMs. IMA was open-sourced under the GNU General Public License, version 2, through its submission to the Linux kernel mailing list (lkml) in 2005. It was officially rejected as a mainstream Linux Security Module candidate because it does not enforce security on the local system. This is a very limited interpretation of access control but nevertheless a valid one. IMA was largely rewritten to address the useful comments of the kernel maintainers and is currently available (still as a Linux Security Module) on the sourceforge project page. IMA was and will remain a group effort, and many people have contributed to the current implementation. IMA will likely change its form and become a kernel module or library that can be used universally throughout the kernel to manage TPM PCRs and to protect measurements and other evidence (e.g., audit events) using the TPM security chip.



Last and least

There have been many negative remarks from well-known but not well-informed people about the alleged evils of hardware Trusted Platform Modules and the Trusted Computing Group (formerly known as TCPA) in general. Please refer to the knowledgeable response (tcpa-rebuttal) of my valued colleague if you are open-minded and interested in facts about TPM and TCG.


More Information

Publications

David Safford: TCPA Misinformation Rebuttal. http://www.research.ibm.com/gsal/tcpa. October 2002.
(This white paper responds point by point to several papers and web pages which have criticized the TCPA chip --today called TPM-- based on misunderstandings and incorrect analysis.)

David Safford, Jeff Kravitz, Leendert van Doorn: "Take Control of TCPA", Linux Journal, August 2003 issue.
(This article describes basic functions of the TPM using a small TPM library that the authors made available).

Reiner Sailer, Xiaolan Zhang, Trent Jaeger, Leendert van Doorn. Design and Implementation of a TCG-based Integrity Measurement Architecture. 13th Usenix Security Symposium, San Diego, California, August, 2004.
(This paper describes the architecture and implementation of IMA for Linux).

Reiner Sailer, Leendert van Doorn, James P. Ward: The Role of TPM in Enterprise Security. Datenschutz und Datensicherheit (DuD), September, 2004.
(This paper gives a high-level overview of IMA and describes a prototype that was built for e-business environments).

Reiner Sailer, Trent Jaeger, Xiaolan Zhang, Leendert van Doorn: Attestation-based Policy Enforcement for Remote Access. 11th ACM Conference on Computer and Communications Security (CCS) 2004, Washington, October, 2004.
(This paper describes an application of IMA to securely determine properties of a remote access client at the corporate VPN access point.)

Examples

Some examples of real measurements and remote attestation messages are available on our secure systems department project web page http://www.research.ibm.com/ssd_ima.

Software Downloads

Grub bootloader extension issuing measurements to TPM during boot.
Code: http://sourceforge.net/projects/trousers
Installation instructions: http://trousers.sourceforge.net/grub.html

Integrity Measurement Architecture LSM patch for newer kernel versions
Code: http://sourceforge.net/projects/linux-ima

Comprehensive TPM software stack, e.g., to retrieve signatures over PCRs (Quotes)
Code: http://sourceforge.net/projects/trousers

TPM device drivers are part of the newer Linux kernels
Code: ftp://ftp.kernel.org/pub/linux/kernel