Arnon Amir

About me

Research staff member: computer vision, speech, image and video analysis, indexing and retrieval.


Research lab: Almaden Research Center


Dr. Arnon Amir is a research staff member with the Interaction Sciences group at the IBM Almaden Research Center. His work covers multiple aspects of computer vision and multimedia information retrieval, from data analysis, indexing and search, to browsing and visualization.

Dr. Amir sees the task of information retrieval as part of a perceptual, "high" layer in Human-Computer Interaction (HCI), where, in order to accomplish a task, the user has to convey an information need to the computer. Hence improvements can arise from making the computer better at interpreting the human information need and matching it against indexed data. His work in this area includes speech, image and video analysis, computer vision, image and video segmentation, shot boundary detection, speech indexing (both phonetic and speech-recognition based), multimodal query retrieval, and efficient video browsing, visualization and summarization. Within this scope he was a core member of the CueVideo project, the Multimedia Mining Adventurous Research project and the Multimedia Understanding and Semantic Extraction (MUSE) project. He has been a member of the IBM team for the NIST TRECVID video retrieval benchmark since its inception in 2001.

Fascinated by the human vision system, Dr. Amir explores new computer vision algorithms for detecting and tracking human eyes and determining the point of regard, and their applications in HCI. His work includes eye detection, eye contact sensors and eye gaze tracking. As part of the BlueEyes project he developed calibration-free eye gaze tracking with free head motion and a single-chip eye detection prototype. He seeks to make these technologies robust, easy to use, widely available and affordable to the masses.



Short biography

Publications and patents


Last updated 6 Apr 2006