IBM Researchers have found a better way to ensure the integrity and
authenticity of data sent over the Internet
In Brief
Data transmitted in the ethereal world of the Internet is ripe for attacks
by thrill seekers or those with more malicious intentions. Without suitable
safeguards, data is subject to eavesdropping, tampering and outright
destruction. Imposters can also "spoof" other users, pretending to be
someone they are not and sending messages under an assumed name. A means of
preventing such attacks developed by IBM researchers has been incorporated into
a number of standard Internet security protocols, as well as Netscape's popular
Web browser and other commercial software packages.
With the burgeoning of the Internet, the World Wide Web and electronic
commerce, the problem of verifying the integrity of data and the identity of
both sender and recipient has become all the more critical. Stockbrokers receive
electronic information on stock prices and make instantaneous decisions
involving millions of dollars based on that information. Banks and traders offer
online services. And, more generally, companies send critical business data to
suppliers, partners and customers.
While secrecy can be important in these kinds of transactions, the integrity
and authenticity of the data are even more crucial. Indeed, most of the data on
the World Wide Web is publicly available, and confidentiality (or secrecy) is
not an issue. But when one accesses online financial information about
companies, stock quotations, airline flight times and similar information, one
wants to be certain it is the genuine article. In the language of security, the
process of verifying that a message really comes from the stated sender and that
none of the transmitted data has been changed in transit is called
authentication and integrity verification, and is broadly referred to as message
authentication.
In 1994, researchers at IBM's Thomas J. Watson Research Center became
involved in the effort to define an Internet standard to ensure data
authenticity and integrity. Their brainchild, the Hash-based Message
Authentication Code (HMAC), has quickly become the Internet standard for
cryptographic message authentication. It is already offered in several popular
commercial software packages such as Netscape's® Web browser, and will be
included in the forthcoming IP (Internet Protocol) version 6, the industry
standard for Internet communication, as well as in many other proposed Internet
and banking standards.
Finding a common way
The path leading to the creation of HMAC began in discussions about Internet
security protocols. Defining Internet protocols - the formal sets of rules
for transmitting data in the heterogeneous world of the Internet - has been
largely done through a process of Netizen self-government. Starting in 1969,
programmers and scientists have voluntarily circulated drafts of documents and
standards proposals, called Requests for Comments (RFCs), for review by experts
and the Internet community at large. The process is coordinated by the Internet
Engineering Task Force (IETF), an open international community of network
researchers, designers, operators and vendors. IETF coordinates both the
evolution and operation of the Internet. Once there is agreement on an RFC, it
becomes the standard for the entire Internet community.
IETF's Internet Protocol Security (IPSEC) working group has for some years
been grappling with standards for authentication and confidentiality
(encryption). In mid-1994, at the time that Hugo Krawczyk, then a research staff
member in the cryptography group at Watson, joined IPSEC, the group's mission
was to add security features to the existing IP protocol. "If you have security
at the basic IP level," explains Krawczyk (who currently holds a joint
appointment as a researcher at IBM and at the Technion in Israel), "then
applications running on top of that level inherit its security."
Krawczyk, however, felt that the techniques the IPSEC group was
considering adopting for message authentication were not demonstrably secure. In
effect, the confidence placed in the security features of the techniques was
based on immunity to specific kinds of attacks rather than on detailed
cryptographic analysis. So, along with colleagues Mihir Bellare and Ran Canetti,
he set out to design a new function that would authenticate both the
participants and the data.
In addition to security design goals, the IBM research team kept in mind the
engineering requirement of reasonable performance. If the authentication process
took too long to run, no one would use it. The result was an innovative
combination of basic cryptographic tools that achieved both simplicity of
use and reliable authentication. Plus, it satisfied cross-platform
requirements, and the technology involved was not regulated by the export
controls the United States government imposes on certain kinds of cryptographic
tools.
Authenticating a message
Message integrity and authentication require that the receiver has some way
of verifying that the information contained in the message is precisely what the
sender sent (data integrity) and that it was sent by the named sender and not an
imposter (authentication). That can be done by means of a Message Authentication
Code (MAC), based on a secret key shared by the sender and receiver.
The idea is that the sender computes a number derived from a combination of
the key and the message and appends it to the message itself. The receiver, on
receipt of the message, uses the same key and computation procedure to recompute
the special number, called an authentication tag. If it matches the number sent
together with the message, one concludes that the data has not been tampered
with. Also, since the shared key is presumed secret and only the holders of the
key can correctly compute the tag, the identity of the sender is verified. This
checking process runs automatically in the background at both the sender's and
the receiver's ends.
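The tag-and-check procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code from any product named in this article: it uses Python's standard hmac module as the MAC, SHA-256 as the hash function (an assumption made for the example), and the function names make_tag and verify are hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: derive the authentication tag from the key and the message."""
    # SHA-256 is an assumption for this sketch; any agreed hash function works.
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare it with the one received."""
    expected = make_tag(key, message)
    # compare_digest takes the same time whether or not the tags match,
    # so the comparison itself does not leak information about the tag.
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    shared_key = b"key known only to sender and receiver"
    message = b"Buy 100 shares"
    tag = make_tag(shared_key, message)                  # appended to the message
    print(verify(shared_key, message, tag))              # genuine message
    print(verify(shared_key, b"Buy 900 shares", tag))    # altered in transit
```

A message changed in transit, or a tag computed by someone who does not hold the shared key, fails the check.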
One way of constructing a MAC is to use standard encryption technology, such
as the Data Encryption Standard (DES), invented by IBM researchers in the
1970s. DES and other encryption techniques rely on a shared secret key. However,
encryption has certain drawbacks for use in an Internet protocol. These range
from the cost of hardware and software and their exposure to export controls
to licensing fees (in some cases) and limits on the speed at which encrypted
data can be sent.
These shortcomings led to the consideration of alternate ways of creating
MACs. One of the most promising relies on so-called cryptographic hash
functions. A hash function is a procedure that takes a string of bits of any
length and reduces it to a short, fixed-length number. Hash functions were
originally introduced for various purposes in computer science and were later
adapted for cryptographic applications.
The Watson team decided that any new authentication technique they developed
stood a better chance of wide adoption if it started with an already widely used
hash function, such as the Message-Digest 5 algorithm. The MD5 algorithm was
developed in the early 1990s by Professor Ron Rivest at MIT and RSA Data
Security, a well-known security vendor. MD5 provides a way to create a 128-bit "fingerprint,"
or "message digest," from a message of any length in such a way that
it is hard to produce two messages with the same fingerprint or to create a
message that would yield a given fingerprint.
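To make the idea of a fingerprint concrete, here is a small sketch using Python's standard hashlib module. The helper name fingerprint is hypothetical. MD5 is used only because it is the function discussed here; collisions in MD5 have since been found, and some locked-down Python builds disable it.

```python
import hashlib


def fingerprint(message: bytes) -> str:
    """Return the MD5 message digest of `message` as 32 hexadecimal digits."""
    return hashlib.md5(message).hexdigest()


if __name__ == "__main__":
    short_message = b"abc"
    long_message = b"abc" * 100_000
    # Both digests are 128 bits (32 hex digits), whatever the input length.
    print(fingerprint(short_message))
    print(fingerprint(long_message))
    # No key is involved: anyone can compute these values.
```

Because no secret enters the computation, an attacker who alters a message can simply recompute its fingerprint.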
Because MD5 and other cryptographic hash functions do not involve the use of
a secret key, they are not, by themselves, suitable for authentication.
Finding an appropriate way to integrate such a key into a hash function was
one of the team's major accomplishments. Indeed, the essence of HMAC lies in the
specific manner in which the key, the data and the hash function are combined.
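The article does not spell out the combination, but the published definition of HMAC (RFC 2104) does: the key is padded to the hash function's block size, and the hash is applied twice, once over an "inner" padded key followed by the message and once over an "outer" padded key followed by the inner result. The sketch below follows that definition. The function name hmac_sketch and its parameters are hypothetical, and production code would normally call a vetted library rather than a hand-written version like this one.

```python
import hashlib
import hmac


def hmac_sketch(key: bytes, message: bytes, hash_name: str = "md5") -> bytes:
    """Compute H((K xor opad) || H((K xor ipad) || message)) per RFC 2104."""
    block_size = hashlib.new(hash_name).block_size   # 64 bytes for MD5 and SHA-1
    if len(key) > block_size:
        # Keys longer than one block are first hashed down.
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")              # pad with zeros to one block

    inner_key = bytes(b ^ 0x36 for b in key)          # K xor ipad
    outer_key = bytes(b ^ 0x5C for b in key)          # K xor opad

    inner = hashlib.new(hash_name, inner_key + message).digest()
    return hashlib.new(hash_name, outer_key + inner).digest()


if __name__ == "__main__":
    key, message = b"shared secret", b"authentic data"
    ours = hmac_sketch(key, message)
    reference = hmac.new(key, message, "md5").digest()
    print(ours == reference)   # agrees with the standard library's HMAC
```

The hash function itself is used as a black box; only the way the key is mixed in is specific to HMAC.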
The team was able to show that HMAC was secure as long as some basic
properties of the underlying hash function were preserved, and they designed
HMAC so that the hash function could easily be replaced by another, improved
hash function if MD5 were shown to be insecure. An example of such an improved
(although slower) hash function is the Secure Hash Algorithm (SHA), which has
been adopted as the U.S. standard for cryptographic hash functions. Today, HMAC
is widely used with both MD5 and SHA. In the spring of 1996, Pau-Chen Cheng, of
Watson, developed the first implementation of HMAC (the actual code) as part of
his work on IPSEC protocols.
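The replaceable-hash design shows up directly in modern libraries. In the sketch below, which uses Python's standard hmac module, the calling code stays the same and only the name of the hash function changes. The helper tag_with and the sample key and message are hypothetical; "sha1" is assumed to correspond to the SHA mentioned above, and "sha256" is a later function included only to illustrate a further swap.

```python
import hmac


def tag_with(hash_name: str, key: bytes, message: bytes) -> str:
    """Compute an HMAC tag, as hex, using the named underlying hash function."""
    return hmac.new(key, message, hash_name).hexdigest()


if __name__ == "__main__":
    key = b"shared-secret-key"
    message = b"stock quote: IBM 101.25"
    # Same key, same message, same calling code; only the hash is swapped.
    for name in ("md5", "sha1", "sha256"):
        print(name, tag_with(name, key, message))
```

The only visible difference is the tag length, which follows the output size of the chosen hash function.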
Readying for the future
While HMAC is already widely employed, it is expected to have an even wider
impact, says Canetti: every computer connected to the Internet will be using it.
In addition to many commercial products incorporating HMAC, it has also been
incorporated (for a slightly different purpose) in Visa's and MasterCard's SET
standard for credit card transactions over the Internet.
By design, HMAC can be improved. Indeed, the advent of video-on-demand,
telepresence conferencing and other applications requiring broader bandwidth
and massive data pumping will undoubtedly require higher-speed authentication.
Already, Krawczyk and Shai Halevi, of Watson, have demonstrated an experimental
authentication system that runs much faster than HMAC on advanced platforms.
This work may well help Internet security keep pace with the lightning-quick
evolution of the Net itself.
Sara Reese Hedberg is a freelance technology writer based in Seattle,
Washington.