A Seeing Eye for the Web
While most of us take the World Wide Web for granted, this invaluable information resource has been closed to the visually impaired. But new software developed by a team at IBM's Tokyo Research Laboratory makes the online world accessible at last. "When I discovered the Web and found how useful it could be," says team leader Chieko Asakawa, "it occurred to me that a Web-reading environment would be a great help to blind people." The resulting product, Homepage Reader, is available in a Japanese-language version. An English version is in preparation.
Asakawa, who is blind, says she was inspired by IBM's Screen Reader 2, which enabled her to access the Netscape Navigator Web browser. The problem was that the software read text-based information only and was therefore poorly suited to the Internet's multimedia environment, which contains embedded images and hyperlinks.
Asakawa and her team set out to design a system capable of interpreting the Web's special coding. One of the biggest challenges, Asakawa notes, was devising a simple way to navigate. "Remember, blind people can't use the mouse," she says.
The team's solution was to use the computer's numeric keypad. Once online, Homepage Reader announces the default home page, and the surfer can use number keys to back up and move forward through pages, lines and individual characters. "If I want to read something, I just double-click key 2 and the system will read the page from the beginning," says Asakawa. Other keys let the user fast-forward, rewind or jump to the next link on the page. When the reader arrives at a hyperlink, the voice switches from male to female -- a cue that Asakawa deems "intuitive and natural."
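The voice-switching cue can be sketched in a few lines of Python. This is only an illustration, not Homepage Reader's actual code: the class and function names are hypothetical, and the sketch assumes that ordinary text is queued for the male voice and link text for the female voice.

```python
from html.parser import HTMLParser


class SpeechQueueBuilder(HTMLParser):
    """Collect (voice, text) pairs from HTML; link text gets its own voice."""

    def __init__(self):
        super().__init__()
        self.queue = []       # (voice, text) pairs in reading order
        self.in_link = False  # True while inside an <a href="..."> element

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self.queue.append(("female" if self.in_link else "male", text))


def build_speech_queue(page_html):
    """Hypothetical helper: parse a page into a list of spoken segments."""
    builder = SpeechQueueBuilder()
    builder.feed(page_html)
    builder.close()
    return builder.queue


def next_link(queue, position):
    """Index of the first link after `position`, or None if there is none."""
    for index in range(position + 1, len(queue)):
        if queue[index][0] == "female":
            return index
    return None
```

A keypad handler could then step through the queue one segment at a time, or call `next_link` to jump ahead to the next hyperlink. Which key triggers which action is left out here, since the article names only key 2.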
As part of an outreach program, IBM has already trained some 400 people in the
use of Homepage Reader, and plans to train another 1,500 over the next year.
Asakawa says users can master the basic functions in less than an hour. The biggest hurdle,
she says, is explaining the concept of a hyperlink -- or, indeed, the Internet itself --
to some visually impaired users. But the rewards are enormous. "One trainee," Asakawa reports,
"said Homepage Reader made him feel empowered. He had never imagined he would ever be able to
access the Web, let alone so easily."
-- Paul Kallender
FYI:
http://www.research.ibm.com/topics/innovate/hci/
The Nanoworld on a Shoestring
The atomic force microscope (AFM), which can image nanometer-scale surfaces
and structures, has become a basic tool of science and industry. But the
instrument has been something of a luxury. With a typical price tag of $100,000,
the AFM has been too costly for many corporate laboratories, let alone for schools.
In May, however, Seiko Instruments introduced a novel AFM -- initially for the Japanese market
-- that lowers the price barrier.
The system, called Nanopics, grew out of a joint effort with IBM's Zurich
Research Laboratory, where Gerd Binnig and Walter Häberle began developing
a low-cost AFM in 1994. Seiko Instruments now manufactures and markets
Nanopics at less than a third of the price of previous AFMs.
One cost-saving innovation is the use of integrated sensors, instead of lasers,
to measure the minute deflections of the microscope's cantilever caused by the
forces acting between its tip and the surface atoms. (Those tip movements are translated
into an image of the contours of the object under study.) Nanopics also replaces costly
piezoelectric elements with inexpensive voice coils -- similar to the drivers in stereo
speakers -- to control the tip movement. To cut costs further, only the head with
the cantilever, not the sample itself, moves during scanning, and images are captured
through simple video-quality recording.
Another innovation is a compact design that reduces the AFM's sensitivity to vibration.
This places less stringent requirements on the operating environment.
"Our collaboration with Seiko Instruments," says Binnig, "has produced a microscope
that we believe will become standard equipment in many laboratories." Binnig, who invented
the original AFM and developed it with his Zurich colleague Christoph Gerber and with
Calvin Quate of Stanford University, predicts that Nanopics will allow even high-school
students to explore nature on the nanometer scale.
-- Katherine Silberger
Automated Web Publishing
Converting printed materials such as books and catalogs into Web pages is
about to become a lot simpler, thanks to a tool being developed at IBM's
China Research Laboratory, in Beijing. With current Web publishing products,
such as Microsoft's FrontPage 98, text and images must be input separately,
and a new page layout must be created using HTML tags. Generating the hyperlinks for
navigating the document is yet another time sink. IBM's prototype Web publishing tool
speeds up the process by automatically recognizing a document's component parts --
table of contents, headings, page numbers, text and images -- and structuring the Web version
accordingly. Although the work is aimed at developing a product for the Chinese market,
the technology could be adapted to any language, notes Hui Su, who leads the team working
on the prototype.
To create a Web page, a printed document is first scanned into a computer.
An optical character recognition program developed at IBM then maps the document
into blocks of text, images and titles. (The program can recognize more than 4,000
Chinese characters in six fonts, or about 99 percent of the characters used in
general text.) Finally, the publishing tool reassembles these blocks in their original layout,
expressed in a chosen Web format such as HTML.
No Web page is complete without hyperlinks, and the publishing tool
simplifies their addition to a page. Users can create a list of URLs,
together with related strings of text. The tool then automatically
detects those strings on the page and inserts links to the desired URLs.
Hyperlinks can also be added manually.
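The automatic linking step might look something like the following Python sketch. It is an assumption about how such a feature could work, not the China Research Laboratory's implementation. The function name `insert_links` and the longest-match-first rule are illustrative choices.

```python
import html
import re


def insert_links(text, link_table):
    """Wrap each listed string found in `text` in a link to its URL.

    `link_table` maps a string of text to the URL it should point to.
    """
    if not link_table:
        return html.escape(text)
    # Try longer strings first so "IBM Research" wins over "IBM".
    phrases = sorted(link_table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in phrases))
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(link_table[match.group(0)], quote=True)
        parts.append(f'<a href="{url}">{html.escape(match.group(0))}</a>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Given a table that lists both "IBM" and "IBM Research", the sketch links the longer string where it occurs and the shorter one elsewhere, so the user's list never has to be ordered by hand.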
The tool is especially useful for publishing book-length documents.
"We have used our single-page technique to automatically build a 'Web book,'"
says Su. Because the tool recognizes tables of contents and page numbers, it can create
links that take the reader directly to a chapter or page of interest.
As the technology matures, large virtual libraries will become easier and easier to build.
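To make the table-of-contents idea concrete, here is a minimal Python sketch. It assumes that each recognized contents line ends in a page number and that every page of the Web book carries an anchor named `page-N`. Both conventions, and the function name, are hypothetical.

```python
import html
import re

# A contents line: a title, then dots or spaces, then a page number.
TOC_LINE = re.compile(r"^(?P<title>.+?)[\s.]+(?P<page>\d+)$")


def toc_to_links(toc_lines):
    """Turn recognized contents lines into links to page anchors."""
    links = []
    for line in toc_lines:
        match = TOC_LINE.match(line.strip())
        if match:  # lines without a trailing page number are skipped
            title = html.escape(match["title"])
            links.append(f'<a href="#page-{match["page"]}">{title}</a>')
    return links
```

Lines that do not fit the pattern are simply left out, which is where the manual linking mentioned earlier would come in.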
-- Xu Fang
Is It Live or Is It Text to Speech?
In their quest for customer-service perfection, three large Japanese banks have raised
their automated telephone banking systems to a new level of user friendliness.
When customers of Fuji Bank, Sanwa Bank and the Bank of Tokyo-Mitsubishi call
in to check their account balances or transfer funds, they hear not the usual
disjointed "robot speak" but a natural-sounding human voice. The secret is a
new text-to-speech (TTS) system developed at IBM's Tokyo Research Laboratory.
The basic technology was originally used in IBM's award-winning Japanese program called ProTALKER,
which reads PC text aloud. But as researcher Takashi Saito explains, adapting the technology
to server-based telephony required considerable tweaking.
At the outset, the research team recognized that even state-of-the-art text-to-speech
technology falls short of a recorded voice. So the system
uses prerecorded words and numbers, combining them with personal and company
names generated by TTS. To smooth transitions, the same speaker who records
the set phrases also records the phonetic dictionary -- called the voice font --
that is used to generate the TTS names.
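The mix of recorded and synthesized speech can be pictured as a simple playback plan. The Python sketch below is illustrative only. The clip file names and the `plan_prompt` function are assumptions, and a real system would hand each step to an audio player or a TTS engine.

```python
def plan_prompt(segments, recorded_clips):
    """Decide, segment by segment, whether to play a recording or use TTS.

    `recorded_clips` maps a set phrase to its prerecorded audio file.
    Anything not in the table (such as a personal or company name)
    falls through to text-to-speech.
    """
    plan = []
    for text in segments:
        clip = recorded_clips.get(text)
        if clip is not None:
            plan.append(("play", clip))
        else:
            plan.append(("synthesize", text))
    return plan
```

For example, with recordings for "Transfer to" and "is complete", the prompt "Transfer to / Yamada Trading / is complete" would play two clips and synthesize only the company name in between.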
One key innovation is a simple method for automating the creation of a voice font.
The TTS engine uses a database of 1,600 words recorded for the voice font and segments
them into phonetic units. Each unit, made up of a consonant-vowel pair, is classified
according to context, which can affect the way the sound is articulated. A syllable
may be subject to 100 or so contextual variations. Automating the creation of a voice
font reduces the time and effort involved in configuring or modifying a telephone banking
system.
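A toy version of the classification step, written in Python, gives a feel for the bookkeeping. It works on romanized spellings rather than recorded audio and defines "context" as the neighboring units on either side. Both are simplifying assumptions, since the article does not say how the Tokyo team defines context.

```python
import re
from collections import defaultdict

# One unit: any leading consonants followed by a single vowel.
# Trailing consonants (such as a final "n") are ignored in this sketch.
CV_UNIT = re.compile(r"[^aeiou]*[aeiou]")


def split_units(word):
    """Split a romanized word into consonant-vowel units."""
    return CV_UNIT.findall(word.lower())


def build_voice_font(words):
    """Map each unit to the set of (previous, next) contexts it occurs in."""
    font = defaultdict(set)
    for word in words:
        units = split_units(word)
        padded = ["#"] + units + ["#"]  # "#" marks a word boundary
        for i, unit in enumerate(units, start=1):
            font[unit].add((padded[i - 1], padded[i + 1]))
    return dict(font)
```

In a full voice font each context would point to a stored audio segment, so a unit with many contexts has many recorded variants to choose from.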
In developing the new system, Saito's team overcame a common TTS problem: the distortion
that arises when phonetic units are strung together. The researchers patented a method of
connecting units so that pitch shaking and rumbling sounds are minimized. Saito says the
technique could be applied to any language in the world.
Another modification was optimizing the voice for the narrow 4-kilohertz bandwidth
of telephones. The researchers also improved processing by configuring a server that
is dedicated to TTS and connected by a high-speed network to a DirectTalk/6000
call transaction handler.
The Tokyo group is working to improve the product even further. Saito says the next
step will be to refine the prosody -- the rhythm and intonation -- of TTS speech.
Within three to five years, he hopes to raise TTS quality to the point where the
recorded speech can be eliminated.
-- Dennis Normile
IBM Software + Web = Science Online
Since the first days of the Web, researchers and students have been frustrated by a flaw in
Web browsers: their inability to display the elegant symbols of math and science as easily as
they do regular text. Now, IBM is providing a solution in the form of the techexplorer
Hypermedia Browser -- software that displays mathematical expressions and scientific
documents in a variety of electronic media, including Web pages and CD-ROM-based textbooks.
The techexplorer software builds on the TeX and LaTeX formatting
languages used in scientific and technical publishing, and adds features such as
hypertext and multimedia. The software is also starting to support
the Mathematical Markup Language specification from the World Wide Web Consortium.
This new standard makes the creation and display of equations easier and more interactive than
is possible using plain HTML and embedded images.
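As a small illustration of the markup involved, here is the quadratic formula written in LaTeX. The choice of formula is ours and is not taken from techexplorer's documentation.

```latex
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```

A browser without math support would have to show this as an embedded image, which readers can neither search nor interact with.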
The new browser is available in two versions, each offering a different level of functionality.
The Introductory Edition is a no-charge viewer that lets users read interactive documents containing
text and mathematical expressions. The commercial Professional Edition offers other features
as well, including an interface that allows users to write Java applets that interact
with techexplorer. That edition contains a search engine, supports printing and has an
extension mechanism for communicating with other applications.
IBM expects the Professional Edition to be the foundation of a new generation of interactive
textbooks and online journals. "Our goal is to introduce novel forms of interactivity into
technical and scientific documents, so that researchers and students can actually solve problems
within their Web browsers," says Bob Sutor, a research staff member and manager of the interactive
scientific publishing group at IBM's Thomas J. Watson Research Center.
The company is working with several partners to explore the possible uses
of techexplorer for education and Web-based publishing. One partner is the
NSF-funded Project Links group at Rensselaer
Polytechnic Institute in Troy, New York.
The project is developing online science courses in which undergraduates can conduct
experiments and simulations using Java applets. "Our techexplorer software provides
new ways to really explore the topics discussed in scientific documents," Sutor says.