One-Way Web
The New Face of Linux®
Keeping Web Sites in Sight
DNA Crooks a Finger
In Remembrance: Emanuel Piore
One-Way Web
Apparently, it's not such a small world after all. IBM researchers, in collaboration with scientists from Compaq Corporate Research Laboratories and AltaVista Company, have completed the most extensive map of the Web's structure to date. In the process, they have overturned conclusions drawn from previous studies, which claimed that the Web is so highly interconnected that any two randomly selected pages are separated by as few as 19 clicks. The last study surveyed only about 40 million pages — an area one-fifth the size of that covered by the most recent research effort.
In the latest study, which surveyed some 200 million Web pages, researchers found that earlier experiments failed to capture the Web's complexity. "The prevailing view was that the Web was a big ball of connected pages, and if you knew the right path, you could surf your way from anywhere to anywhere," says Andrew Tomkins, a researcher at IBM's Almaden Research Center. "That doesn't appear to be the case. Instead, we've found that you can't reach significant portions of the Web from other significant portions. It's much tougher to navigate than previously thought."
According to the study, approximately 90 percent of the Web is divided into four regions resembling a bow tie. The central core, or the knot of the bow tie, represents about 27 percent of all Web pages. The core is so well connected that Web surfers can easily navigate between any two pages in this region via hyperlinks. Most corporate Web sites, including IBM's, can be found in this group of core pages.
Another 21.5 percent — the left side of the bow tie — consists of "origination pages." These pages have links that allow users to reach the central core but cannot be reached from it. According to Tomkins, "this region often consists of pages that only recently came into being and have not yet been linked to." Personal Web pages often fall into this category. Tomkins explains, "I might have a special interest — lawn grass, for example — that I might write a Web page about. On my page, I might have links to various seed companies, or to products like fertilizer. Many of these would be in the core, but no one in the core would have heard of my grass page. There would be no way to get to it unless you knew the URL."
Yet another 21.5 percent — the right side of the bow tie — contains "termination pages" that can be accessed from the connected core but do not link back to it. An example would be a technical report on a corporate Web site. "You could follow some of the links on the IBM Web site and eventually get to some technical papers, but that would be the end of the road," says Tomkins. "You couldn't get back to the core from there."
The fourth piece of the bow tie, consisting of 20 percent of the Web, is completely disconnected from the central core. Researchers envision these remaining pages as "tendrils" linking away from origination pages or to termination pages. Also possible are "tubes," linking origination and termination pages without going through the core. Tomkins admits, however, that tubes presently exist only as theoretical possibilities. Finally, the remaining 10 percent of the Web consists of isolated "island" pages.
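The four regions described above can be illustrated on a toy link graph. The sketch below is not the researchers' method, only a minimal illustration. It assumes a hypothetical six-page web (the page names and the `bow_tie` helper are invented for the example) and a page already known to sit in the core. It then sorts every page by whether it can reach the core, be reached from it, both, or neither.

```python
from collections import deque

def reachable(start, edges):
    """Return every page reachable from start by following links (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in edges.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(edges):
    """Flip every link so that reachability can be computed backward."""
    rev = {}
    for src, dsts in edges.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev

def bow_tie(edges, core_seed):
    """Sort pages into bow-tie regions relative to one known core page."""
    pages = set(edges) | {d for dsts in edges.values() for d in dsts}
    forward = reachable(core_seed, edges)            # pages the core can reach
    backward = reachable(core_seed, reverse(edges))  # pages that can reach the core
    core = forward & backward
    return {
        "core": core,
        "origination": backward - core,
        "termination": forward - core,
        "other": pages - forward - backward,  # tendrils, tubes and islands
    }

# Hypothetical toy web; every page name here is invented for illustration.
links = {
    "portal": ["news"],
    "news": ["portal", "tech_report"],
    "grass_page": ["portal", "seed_ad"],
    "tech_report": [],
    "seed_ad": [],
    "island": [],
}

regions = bow_tie(links, "portal")
for name, members in regions.items():
    print(name, sorted(members))
```

In this toy graph, `grass_page` links into the core but nothing links to it, `tech_report` is a dead end reached from the core, `seed_ad` hangs off an origination page like a tendril, and `island` has no links at all.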
Researchers hope that the study will enable search engines to develop Web crawling strategies that allow them to capture more of the Web. "AltaVista® can't reach tendrils or origination pages simply by crawling from core pages, so at the end of a crawl, they've covered only part of the Web," says Tomkins. "This study shows that you can't double the size of a crawl simply by increasing bandwidth or buying more machines. If search engines really want to crawl the entire Web, they will have to develop alternative strategies. This might mean that they start encouraging people with interesting pages to come directly to them."
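Tomkins's point about crawl coverage can be sketched in a few lines. The example below is a simplified assumption, not a description of how AltaVista crawls. It uses a hypothetical six-page web and an invented `crawl` helper that follows links outward from a set of seed pages. It compares a crawl seeded only from a core page with one that also receives a directly submitted origination page.

```python
def crawl(seeds, links):
    """Follow links outward from the seed pages and return every page visited."""
    seen = set()
    frontier = list(seeds)
    while frontier:
        page = frontier.pop()
        if page in seen:
            continue
        seen.add(page)
        frontier.extend(links.get(page, ()))
    return seen

# Hypothetical toy web; every page name here is invented for illustration.
links = {
    "portal": ["news"],                   # core
    "news": ["portal", "tech_report"],    # core
    "grass_page": ["portal", "seed_ad"],  # origination page
    "tech_report": [],                    # termination page
    "seed_ad": [],                        # tendril off the origination page
    "island": [],                         # isolated page
}

core_only = crawl({"portal"}, links)
with_submission = crawl({"portal", "grass_page"}, links)

print(len(core_only), "of", len(links), "pages reached from the core alone")
print(len(with_submission), "of", len(links), "pages reached after one direct submission")
```

More hardware would not change the first result, because the missing pages are unreachable by links from the core. Only a new starting point brings them into the crawl, and the isolated page stays out of reach either way.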
www.research.ibm.com/resources/news/20000511_bowtie.html
The New Face of Linux®
IBM Research has demonstrated a wristwatch computer that is the first such device to use the Linux operating system. Among the benefits of Linux is the ease with which new applications can be developed and tested by IBM and its industry partners.
Users interact with the watch through a combination of a touch-sensitive screen and a mouselike roller wheel. The watch itself contains a powerful processor along with eight megabytes of flash memory and another eight megabytes of dynamic random-access memory.
Designed to communicate wirelessly with PCs, cell phones and other wireless-enabled devices, the watch lets users view condensed email messages and directly receive short text messages and audio alerts. In addition, the watch will provide users with calendar, address book and to-do list functions. Future enhancements include a high-resolution screen and applications that will allow the watch to be used as an access device for various Internet-based services such as up-to-the-minute information about weather, traffic conditions, sports results and the stock market.
www.research.ibm.com/WearableComputing/
Web Technology
Keeping Web Sites in Sight
For people and businesses that need to keep tabs on certain Web sites, the job of constantly checking for new information is time-consuming and frustrating. But a group of researchers led by Hiroshi Nomiyama at IBM's Tokyo Research Laboratory (TRL) has come up with a solution: a server-based program called Site Outlining that automatically monitors the Web pages of your choice. "Any changes are captured as HTML files, 'snapshots' of specific content at a specific time, which can be viewed on a client browser," says project manager Koichi Takeda.
Site Outlining consists of three major elements: a Web crawler that can be programmed to check sites hourly, daily or weekly; an XML-based tool that extracts metadata — title, URL and keywords and phrases — from each time-stamped HTML snapshot; and a browser manager for maintaining your list of sites to be monitored.
By comparing the metadata of two consecutive snapshots of a particular Web page, Site Outlining can show any changes that have been made. In this way, metadata can indicate a site's current activity, as well as changes made over the long term. This information is conveyed in the form of visualizations that include directories of recent activities, timelined titles of articles and frequency lists of keywords and phrases.
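The comparison of consecutive snapshots can be sketched as follows. This is an illustration under stated assumptions, not Site Outlining's actual data model or code. The `Snapshot` record, the `diff_snapshots` helper, the example URL and all field values are hypothetical, chosen only to mirror the metadata the article lists (title, URL, and keywords and phrases).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of the metadata extracted from one time-stamped snapshot."""
    url: str
    taken_at: str
    title: str
    keywords: frozenset

def diff_snapshots(old, new):
    """Report what changed between two consecutive snapshots of the same page."""
    if old.url != new.url:
        raise ValueError("snapshots must describe the same page")
    return {
        "title_changed": old.title != new.title,
        "added_keywords": sorted(new.keywords - old.keywords),
        "removed_keywords": sorted(old.keywords - new.keywords),
    }

# Invented example data for a single monitored page.
monday = Snapshot(
    url="http://example.com/news",
    taken_at="2000-05-08T09:00",
    title="Company News",
    keywords=frozenset({"earnings", "servers"}),
)
tuesday = Snapshot(
    url="http://example.com/news",
    taken_at="2000-05-09T09:00",
    title="Company News: New Product",
    keywords=frozenset({"servers", "wristwatch", "linux"}),
)

changes = diff_snapshots(monday, tuesday)
print(changes)
```

Running the same comparison over a long series of snapshots would yield the kind of per-page activity history that the visualizations summarize.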
Takeda says executives can use Site Outlining to create a personalized portal of "hot news" composed of Web sites they want to follow closely. The program can also help in the management of large intranets by indicating which pages are active and which are no longer being updated.
The Site Outlining software was first publicly demonstrated at an IBM Japan technology fair in 1998. In 1999 it was incorporated into IBM Japan's Gold Service, which provides personalized Web pages for certain large account customers. These customers are using it to monitor business news in their industries. TRL is now working with the Personal Systems Group in Japan to turn the software into a consumer product preloaded into Aptiva® PCs.
DNA Crooks a Finger
A team of researchers from IBM's Zurich Research Laboratory (ZRL) and the University of Basel has discovered that DNA can be used to bend tiny silicon "fingers" that are less than one-fiftieth as thick as a human hair.
Using a process called "molecular recognition," in which molecules bind according to a lock-and-key mechanism, an array of fingers or cantilevers arranged like the teeth of a comb was made attractive to specific DNA sequences and proteins. By observing the way different cantilevers bent as the DNA adhered to them, the researchers were able to detect the tiniest possible defect in a DNA sequence, a so-called single-base mismatch.
The discovery could also make it possible to build nanorobotic machinery, which would operate as valves using the molecules' specific code. According to ZRL's James Gimzewski, one possible application of the work involves the development of a system to attack cancerous growth.
"The release of just the proper doses of chemicals in the appropriate locations of the body could be achieved using tiny microcapsules equipped with nanovalves," he says. "They could be programmed chemically to open only when they get biochemical signals from a targeted tumor type."
Emanuel Piore
1908-2000
Emanuel Ruben ("Mannie") Piore, IBM's first director of research, died on May 9 at age 92 after a long illness.
As director of IBM Research from 1956 — when it was established as an independent division — to 1961, Piore oversaw a shift that aligned research less closely with development, and strengthened its ties to the academic community. Piore, who was named an IBM vice president in 1960, also served as a member of the IBM Board of Directors from 1962 to 1973.
Born in Wilno, Russia, in 1908, Piore moved to the United States at age nine. He earned his Ph.D. in physics at the University of Wisconsin in 1935 and served during World War II on the staff of the deputy chief of Naval Operations for air. As chief scientist of the Office of Naval Research from 1946 to 1955, he supported university research in a way that provided a model for other federal organizations, such as the National Science Foundation and the Atomic Energy Commission.
Among the significant changes during his tenure as director of Research was the transformation of the Research population itself, which gained newly minted Ph.D.s in unprecedented numbers. As part of his effort to establish closer links with universities, he moved the main Research laboratory from Poughkeepsie, New York, to Westchester County, where the Thomas J. Watson Research Center was opened in 1960.
Piore saw his mission at IBM Research as threefold: to identify future trends, to ensure that exploratory research was carried out in support of current product technologies and to establish Research's reputation as a technical leader. He accomplished this by encouraging the publication of high-quality work and by stressing the importance of a strong patent position.
Much of the research under Piore struck out boldly into new territory, including efforts to build a computer using superconductors and another using microwaves. Neither proved successful, but Piore recognized the risks when he began them, stating that there was no way to discover their potential without experimentation. Piore set Research on a course that valued scientific achievement at the highest level. While that led to a greater separation from product development, it also contributed to the eminence that Research achieved in the following decades.