Shourya Roy

About me

Technical Staff Member


Research lab: India Research Lab


Shourya Roy is a Technical Staff Member at the IBM India Research Lab (IRL) in New Delhi, India. He completed his postgraduate studies in the Dept. of Computer Science and Engineering, IIT Bombay, where he did his thesis work with Prof. Soumen Chakrabarti. He earned his undergraduate degree from the Computer Science and Engineering Department, Jadavpur University, Kolkata. He was with Strand Genomics Pvt. Ltd. for a short period of 4 months before joining IBM IRL in July 2002. He was then associated with the Information Management group in IRL for nearly 5 years.
Shourya is currently involved with the Business Development and Solutions team. His responsibilities include taking mature research technologies to market and engaging with customers. He also looks at how research technologies can help IBM gain competitive advantage and win deals with customers.
Prior to this, Shourya worked on a project to build an analytics infrastructure for contact centers. Contact centers produce huge amounts of unstructured data in the form of emails, chat logs, customer feedback, call recordings, call transcriptions, SMS and so on. Analysis of such data is important for many reasons, such as improving customer satisfaction and improving the complaint-handling skills of agents. To date, such analysis has been performed primarily by hand, which prevents complete and in-depth analysis of the data. Typically, analysts sample a very small fraction of the data and produce reports based on that sample. In this project, Shourya helped build a scalable infrastructure for analyzing such large volumes of data and for producing business intelligence in the form of reports at different levels of abstraction.

Shourya's research interests include Information Retrieval, Machine Learning, Text Categorization, Data Mining, Speech Analytics and Databases. As part of IBM Research, Shourya has published a number of research papers and patents based on his work. If you would like a copy of any of the following papers, please send him an email.

As part of his research, Shourya is very interested in the area of noisy text analytics. Every day, all of us produce huge amounts of noisy text in the form of emails, chats, SMS, blogs and postings. Add to that other traditional sources of noisy data, such as Automatic Speech Recognition (ASR) and Optical Character Recognition (OCR) systems. How do these kinds of noise affect traditional text analytics? Does classification or clustering quality decline as the amount of noise increases? Is Information Extraction from text affected by the presence of noise? Is it better to preprocess the text to clean it, or is postprocessing enough? Contact centers are major sources of such data, and many important applications are needed in that domain.
Shourya, along with others, organized a workshop on Analytics for Noisy Unstructured Text Data (AND 07) in conjunction with the 20th International Joint Conference on Artificial Intelligence, held in Hyderabad, India, in 2007. He strongly feels that this is an emerging research area in text analytics that requires serious attention from researchers in fields such as Machine Learning, Text Analytics and Natural Language Processing.
Publications:

  • Computer Science
    • “An Integrated System for Automatic Customer Satisfaction Analysis in the Services Industry”, Demonstration paper, KDD 2008, Shantanu Godbole and Shourya Roy
    • “Unsupervised Learning of Multilingual Short Message Service (SMS) Dialect From Noisy Examples”, 2nd Workshop on Analytics for Noisy Unstructured Text Data (AND 2008), Sreangsu Acharyya, Sumit Negi, L Venkata Subramaniam and Shourya Roy
    • “Integrating Text Classification, Business Intelligence, and Interactive Labeling for Services Industry Deployments”, Industry/Govt Track, KDD 2008, Shantanu Godbole and Shourya Roy
    • “Text to Intelligence: Building and Deploying a Text Mining Solution in the Services Industry for Customer Satisfaction Analysis”, IEEE International Conference on Services Computing (SCC) 2008, Shantanu Godbole and Shourya Roy
    • “Adding Sentence Boundaries to Conversational Speech Transcriptions using Noisily Labelled Examples”, International Journal of Document Analysis and Recognition (IJDAR), Tetsuya Nasukawa, Diwakar Punjani, Shourya Roy, L V Subramaniam and Hironori Takeuchi
    • “Analytical Techniques for Noisy and Unstructured Text Data - I”, Encyclopedia of Artificial Intelligence, published by Springer, Shourya Roy and L V Subramaniam
    • “Analytical Techniques for Noisy and Unstructured Text Data - II”, Encyclopedia of Artificial Intelligence, published by Springer, L V Subramaniam and Shourya Roy
    • "Unsupervised Segmentation of Conversational Transcripts"; Krishna Kummamuru, Deepak P, Shourya Roy, L Venkata Subramaniam; SIAM Data Mining 2008 Conference, Atlanta, GA, April 2008
    • “How Much Noise in Text is too Much: A Study in Automatic Document Classification”, ICDM 2007, Sumeet Agarwal, Shantanu Godbole, Diwakar Punjani and Shourya Roy
    • "Automatic Identification of Valuable Segments and Expressions for Mining of Business-Oriented Conversations at Contact Centers", EMNLP 2007, Hironori Takeuchi, Tetsuya Nasukawa, L V Subramaniam, Shourya Roy and Sreeram Balakrishnan
    • “ProACT: A solution for Automatic Customer Satisfaction Analysis and Business Intelligence in Contact Centers”, 16th Annual Frontiers in Service Conference, Sumeet Agarwal, Shantanu Godbole, Raghu Krishnapuram, Diwakar Punjani and Shourya Roy
    • “A Conversation-Mining System for Gathering Insights to Improve Agent Productivity”, 9th IEEE Conference on E-Commerce Technology (CEC '07) and the 4th IEEE Conference on Enterprise Computing, E-Commerce and E-Services (EEE '07), Hironori Takeuchi, L Venkata Subramaniam, Tetsuya Nasukawa, Shourya Roy and Sreeram Balakrishnan
    • “Identity Delegation In Policy Based Systems”, IEEE Workshop on Policies for Distributed Systems and Networks (POLICY 2007), Rajeev Gupta, Shourya Roy, and Manish Bhide
    • “Adding Sentence Boundaries to Conversational Speech Transcriptions using Noisily Labelled Examples”, IJCAI-2007 Workshop on Analytics for Noisy Unstructured Text Data, Tetsuya Nasukawa, Diwakar Punjani, Shourya Roy, L V Subramaniam and Hironori Takeuchi
    • “A Middleware for Storage, Federation, Security and Access Control of Policies”, Special Issue of Journal of Autonomic and Trusted Computing on Autonomic and Trusted Computing Systems and Applications, Anuradha Bhamidipaty, Manish Bhide, Rajeev Gupta, Mukesh Mohania and Shourya Roy
    • “Analysis of Agents from Call Transcriptions of a Car Rental Process”, LAICS-NLP (Language, Artificial Intelligence and Computer Science for Natural Language Processing applications), Swati Challa, Shourya Roy and L V Subramaniam
    • "Automatic Generation of Domain Models for Call-Centers from Noisy Transcriptions"; Shourya Roy, L V Subramaniam; the joint conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics (ACL-COLING), 2006, Sydney, Australia
    • "Identity Delegation in Policy Based Systems"; Rajeev Gupta, Shourya Roy, Manish Bhide; poster in the 3rd IEEE International Conference on Autonomic Computing, 2006, Dublin, Ireland
    • "OPTICS On Text Data: Experiments and Test Results"; Deepak P, Shourya Roy; Text Mining Workshop (TM-2006) held in conjunction with the SIAM International Conference on Data Mining (SIAM DM-2006), Maryland, USA
    • "Scaled Entropy and DF-SE: Different and Improved Unsupervised Feature Selection Techniques for Text Clustering"; Deepak P, Shourya Roy; International Workshop on Feature Selection for Data Mining (FSDM 2006) held in conjunction with the SIAM International Conference on Data Mining (SIAM DM-2006), Maryland, USA
    • "Automatic categorization of web sites based on source types"; Shourya Roy, Sachindra Joshi, Raghu Krishnapuram; August 2004; Proceedings of the fifteenth ACM conference on Hypertext and hypermedia HYPERTEXT '04
    • "A hierarchical monothetic document clustering algorithm for summarization and browsing search results"; Krishna Kummamuru, Rohit Lotlikar, Shourya Roy, Karan Singal, Raghu Krishnapuram; May 2004; Proceedings of the 13th international conference on World Wide Web
    • "Fast and accurate text classification via multiple linear discriminant projections"; Soumen Chakrabarti, Shourya Roy, Mahesh V. Soundalgekar; August 2003; The VLDB Journal - The International Journal on Very Large Data Bases, Volume 12 Issue 2
    • "Fast and accurate text classification via multiple linear discriminant projections"; Soumen Chakrabarti, Shourya Roy, Mahesh Soundalgekar; 28th International Conference on Very Large Databases (VLDB), Hong Kong, August 2002.
  • Economics
    • "Economic Freedom and Economic Growth: An Analysis of BRIC Countries"; Simrit Kaur, Shourya Roy; Business Horizon: A Journal of Commerce and Economics; 2007


Shourya is also pursuing a part-time Master of Business Administration at the Faculty of Management Studies (FMS), University of Delhi, and is expected to complete it in April 2009.
On the personal side, Shourya is from Kolkata, the City of Joy, in the state of West Bengal, India. He attended Hindu School, one of the renowned schools in the city. He now lives in New Delhi and is happily married to Sonali, a teacher at G. D. Goenka Public School, Vasant Kunj, New Delhi.
Last updated 3 Jul 2008
