IBM Systems Journal - 2002 Copyright

IBM Skip to main content
  Home     Products & services     Support & downloads     My account  

  Select a country  
Journals Home  
  Systems Journal  
    Current Issue  
    Recent Issues  
    Papers in Progress  
    Search/Index  
    Orders  
    Description  
    Author's Guide  
Journal of Research
and Development
  Staff  
  Contact Us  
  Related links:  
     IBM Research  

IBM Journal of Research and Development  
Volume 40, Number 2, Page 442 (2001)
Deep computing for the life sciences
  Full article: arrowHTML arrowPDF arrowASCII   arrowCopyright info





   

Fastfinger: A study into the use of compressed residue pair separation matrices for protein sequence comparison

by B. Robson
Protein sequences are diverse in size and in content meaningful to researchers. They are rich in what seems to be “noise,” or aspects of lesser interest that obscure clearer core features required to establish true relatedness and function. This paper represents part of a larger study that explores the possible efficient use and storage of “fingers” for protein sequence analysis, i.e., matrices of uniform size and shape that can “stand for” protein sequences by making more explicit the essential aspects of protein sequence pattern information. The essence of the study relates to data compression. Compression invokes an interesting alternative idea of pattern—the concept of “primeness” as in number theory is used to create the notion of an irreducible and potentially recurrent pattern element, and then this philosophy is mapped onto number theory by the unique factorization theorem, in order to define a novel measure of pattern difference. Other possible approaches are also discussed. Because compression and other approximations involve information loss, this is also a study of performance in the face of such loss. Because of the effects of this loss, no claims are made that encourage replacement of established sequence comparison methods, but the concept may have value in a number of applications within, and outside, molecular biology.
Related Subjects: