Publications

PUBLICATIONS - Journal Articles

  1. G. Saon, "Cursive Word Recognition Using a Random Field Based Hidden Markov Model", International Journal of Document Analysis and Recognition IJDAR, 1(4):199--208, 1999.
  2. G. Saon and A. Belaid, "High Performance Unconstrained Word Recognition System Combining HMMs and Markov Random Fields", International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) Special Issue on Automatic Bankcheck Processing, 11(5):771--788, 1997.
  3. A. Belaid and G. Saon, "Utilisation des processus markoviens en reconnaissance de l'ecriture", Traitement du signal, 14(2):161--178, 1997.
  4. B. Singer and G. Saon, "An Efficient Algorithm for Parallel Integer Multiplication", Journal of Network and Computer Applications, 19(4):415--419, 1996.
  5. G. Saon and M. Padmanabhan, "Data-driven Approach to Designing Compound Words for Continuous Speech Recognition", IEEE Transactions on Speech and Audio Processing, 2000.
  6. M. Padmanabhan, G. Saon, J. Huang, B. Kingsbury and L. Mangu, "Automatic speech recognition performance on a voicemail transcription task", IEEE Transactions on Speech and Audio Processing, 2002.
  7. F. Yvon, G. Zweig and G. Saon, "Arc Minimization in Finite-state Decoding Graphs with Cross-word Acoustic Context", Journal on Computer, Speech and Language Processing, Elsevier Publishing, 2004.
  8. S. Chen, B. Kingsbury, L. Mangu, D. Povey, G. Saon, H. Soltau and G. Zweig. "Advances in Speech Transcription at IBM under the DARPA EARS Program". IEEE Transactions on Speech and Audio Processing, 2006.
PUBLICATIONS - Conference Papers

  1. G. Saon, A. Belaid and Y. Gong, "Off-line Handwriting Recognition by Statistical Correlation", MVA'94 IAPR Workshop on Machine Vision Applications, pages 371--374, Japan, December 1994.
  2. A. Belaid and G. Saon, "Use of Stochastic Models in Text Recognition", KOSEF-CNRS French-South Korean Workshop on Text Recognition, pages 79--98, South-Korea, September 1994.
  3. G. Saon and A. Belaid, "Recognition of Unconstrained Handwritten Words Using Markov Random Fields and HMMs", Fifth International Workshop on Frontiers in Handwriting Recognition (IWFHR5), pages 429--432, University of Essex, England, September 1996.
  4. G. Saon, A. Belaid and Y. Gong, "Stochastic Trajectory Modeling for Recognition of Unconstrained Handwritten Words", Third International Conference on Document Analysis and Recognition (ICDAR'95), pages 508--511, Montreal, Canada, 1995.
  5. G. Saon and A. Belaid, "Binary Pattern Recognition Using Markov Random Fields and HMMs", International Conference on Acoustics, Speech and Signal Processing (ICASSP'97), volume 5, pages 3725--3728, Munich, Germany, April 1997.
  6. G. Saon and A. Belaid, "Off-line Handwritten Word Recognition Using A Mixed HMM-MRF Approach", Fourth International Conference on Document Analysis and Recognition (ICDAR'97), volume 1, pages 118--122, Ulm, Germany, August 1997.
  7. G. Saon and A. Belaid, "Reconnaissance de l'ecriture manuscrite hors-ligne par une approche markovienne planaire à base de champs aleatoires", Conference Internationale Francophone sur l'Ecrit et les Documents (CIFED'98), pages 41--49, Montreal, Canada, May 1998.
  8. M. Padmanabhan, G. Saon, S. Basu, J. Huang and G. Zweig, "Recent improvements in voicemail transcription", 6th European Conference on Speech Communication and Technology, Budapest, Hungary, 1999.
  9. G. Saon and M. Padmanabhan, "Data-driven approach to designing compound words for continuous speech recognition", Automatic Speech Recognition and Understanding Workshop ASRU'99, Keystone, Colorado, 1999.
  10. G. Zweig, G. Saon, M. Padmanabhan, J. Huang and S. Basu, "Recent Improvements in Voicemail Transcription", DARPA Workshop on Broadcast News Transcription, 1999.
  11. G. Saon, M. Padmanabhan, R. Gopinath, S. Chen, "Maximum Likelihood Discriminant Feature Spaces", International Conference on Acoustics, Speech and Signal Processing, Istanbul, Turkey, 2000.
  12. G. Saon, M. Padmanabhan, "Minimum Bayes Error Feature Selection for Continuous Speech Recognition", Advances in Neural Information Processing Systems 13, Denver, Colorado, 2000.
  13. G. Saon, M. Padmanabhan and G. Zweig, "Linear feature space projections for speaker adaptation", International Conference on Acoustics, Speech and Signal Processing, Salt Lake City, Utah, 2001.
  14. A. Aaron, S. Chen, P. Cohen, S. Dharanipragada, E. Eide, M. Franz, J. Leroux, X. Luo, B. Maison, L. Mangu, T. Mathes, M. Novak, P. Olsen, M. Picheny, H. Printz, B. Ramabhadran, A. Sakrajda, G. Saon, B. Tydlitat, K. Visweswariah, and D. Yuk, "Speech recognition for DARPA Communicator", International Conference on Acoustics, Speech and Signal Processing, Salt Lake City, Utah, 2001.
  15. G. Saon, J. Huerta and E. E. Jan, "Robust digit recognition in noisy environments: the IBM Aurora 2 system", 7th European Conference on Speech Communication and Technology, Aalborg, Denmark, 2001.
  16. G. Saon, M. Padmanabhan and R. Gopinath, "Eliminating inter-speaker variability prior to discriminant transforms", Automatic Speech Recognition and Understanding Workshop ASRU'01, Madonna di Campiglio, Italy, 2001.
  17. G. Zweig, G. Saon, M. Padmanabhan, L. Mangu, B. Kingsbury, J. Huang and S. Chen, "The IBM 2001 conversational speech recognition system", NIST HUB5 workshop presentation, 2001.
  18. M. Padmanabhan, G. Saon, G. Zweig, J. Huang, B. Kingsbury and L. Mangu, "Evolution of the performance of speech recognition systems in transcribing conversational telephone speech", Instrumentation and Measurement Technology Conference, 2001.
  19. J. Huang, B. Kingsbury, L. Mangu, G. Saon, R. Sarikaya and G. Zweig, "Improvements to the IBM Hub5e system", NIST RT'02 workshop presentation, 2002.
  20. B. Kingsbury, G. Saon, L. Mangu, M. Padmanabhan, R. Sarikaya, "Robust speech recognition in noisy environments: the IBM 2001 SPINE evaluation system", International Conference on Acoustics, Speech and Signal Processing, Orlando, Florida, 2002.
  21. S. Fine, G. Saon and R. Gopinath, "Digit recognition in noise via a sequential GMM/SVM system", International Conference on Acoustics, Speech and Signal Processing, Orlando, Florida, 2002.
  22. G. Saon and J. Huerta, "Improvements to the IBM Aurora 2 multi-condition system", 5th International Conference on Spoken Language Processing, 2002.
  23. G. Zweig, G. Saon and F. Yvon, "Arc minimization in finite state decoding graphs with cross-word acoustic context", 5th International Conference on Spoken Language Processing, 2002.
  24. G. Saon, G. Zweig, B. Kingsbury, L. Mangu and U. Chaudhari, "An architecture for rapid decoding of large vocabulary conversational speech", 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, 2003.
  25. B. Kingsbury, L. Mangu, G. Saon, G. Zweig, S. Axelrod, V. Goel and M. Picheny, "Toward domain-independent conversational speech recognition", 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, 2003.
  26. G. Saon, S. Dharanipragada and D. Povey, "Feature Space Gaussianization", International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, 2004.
  27. R. Sarikaya, Y. Gao and G. Saon, "Fractional Fourier Transform Features for Speech Recognition", International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, 2004.
  28. D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau and G. Zweig, "fMPE: Discriminatively Trained Features for Continuous Speech Recognition", International Conference on Acoustics, Speech and Signal Processing, Philadelphia, PA, 2005.
  29. H. Soltau, B. Kingsbury, L. Mangu, D. Povey, G. Saon, and G. Zweig, "The IBM 2004 Conversational Telephony System for Rich Transcription", International Conference on Acoustics, Speech and Signal Processing, Philadelphia, PA, 2005.
  30. G. Saon. "A Non-linear Speaker Adaptation Technique Using Kernel Ridge Regression", International Conference on Acoustics, Speech and Signal Processing, France, 2006.
  31. D. Povey and G. Saon. "Feature and Model Space Adaptation with Full Covariance Gaussians", 7th International Conference on Spoken Language Processing, Pittsburgh, 2006.
  32. G. Saon, B. Ramabhadran and G. Zweig. "On the Effect of Word Error Rate on Automatic Quality Monitoring", Spoken Language Technology, Aruba, 2006.
  33. H. Soltau, G. Saon, B. Kingsbury, J. Kuo, L. Mangu, D. Povey and G. Zweig. "The IBM 2006 GALE Arabic ASR System", International Conference on Acoustics, Speech and Signal Processing, Hawaii, 2007.
  34. G. Saon and M. Picheny. "Lattice-based Viterbi Decoding Techniques for Speech Translation", Automatic Speech Recognition and Understanding Workshop, Kyoto, Japan, 2007.
  35. D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon and K. Visweswariah. "Boosted MMI for Model and Feature Space Discriminative Training", International Conference on Acoustics, Speech and Signal Processing, Las Vegas, 2008.
  36. G. Saon and D. Povey. "Penalty Function Maximization for Large Margin HMM Training", Interspeech, Brisbane, Australia, 2008.
PATENTS

  1. G. Saon and M. Padmanabhan, "Methods and apparatus for forming compound words for use in a continuous speech recognition system", 6,385,579 issued 5/7/2002.
  2. G. Saon, M. Padmanabhan and R. Gopinath, "Methods and apparatus for performing heteroscedastic discriminant analysis in pattern recognition systems", 6,609,093 issued 8/19/2003.
  3. M. Padmanabhan, G. Saon and G. Zweig, "Lattice-based unsupervised maximum likelihood linear regression for speaker adaptation", Current filing.
  4. H. Kuo, S. Balakrishnan, Y. Gao, S. Axelrod, M. Picheny, B. Maison, S. Chen, R. Gopinath, D. Nahamoo, G. Saon and G. Zweig, "Speech recognition utilizing multitude of speech features", Current filing.
AWARDS

  1. IBM Outstanding Innovation Award in appreciation for: Outperformance in 2004 DARPA EARS Evaluation, November 2005.
  2. IBM Research Division Technical Group Award for: best 2004 EARS telephony system, December 2004.
INVITED TALKS

  1. "Linear Discriminant Feature Space Transformations", AT&T, August, 2000.
  2. "Towards SuperHuman Speech Recognition", Eurecom Institute, 2002.
  3. "Progress and Challenges in Automatic Speech Recognition", Pace University, April, 2005.
  4. "Progress and Challenges in Acoustic Modeling and Speaker Adaptation", Johns Hopkins University, October, 2005.

