Publications 2001-Present
2007
- Data Acquisition and Cost-Effective Predictive Modeling: Targeting Offers for Electronic Commerce by F. Provost, P. Melville, and M. Saar-Tsechansky, Proceedings of the Ninth International Conference on Electronic Commerce, 2007.
- Predictive Modeling for Collections of Accounts Receivable by S. Zeng, P. Melville, C. Lang, I. Boier-Martin, and C. Murphy, Workshop on Mining Multiple Information Sources, The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, August, 2007.
- Finding New Customers Using Structured and Unstructured Data by P. Melville, Y. Liu, R. Lawrence, I. Khabibrakhmanov, C. Pendus, and T. Bowden, Workshop on Mining Multiple Information Sources, The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, August, 2007.
- Looking for Great Ideas: Analyzing the Innovation Jam by M. Helander, R. Lawrence, Y. Liu, C. Reddy, and S. Rosset, Workshop on Web Mining and Social Network Analysis, The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, August, 2007.
- A Data Mining Case Study: Analytics-driven Solutions for Customer Targeting and Sales Force Allocation by R. Lawrence, C. Perlich, S. Rosset, I. Khabibrakhmanov, S. Mahatma, and S. Weiss, Workshop on Data Mining Case Studies and Data Mining Practice Prize, The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, August, 2007. (Award: 2nd Place, KDD Data Mining Practice Prize). Earlier version in IBM Systems Journal, November, 2007.
- KDD Cup 2007 Task 2 Winner's Report by S. Rosset, C. Perlich, and Y. Liu, KDD Cup and Workshop, The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, August, 2007. (Award: 1st Place in KDD Cup, Task 2)
- Predicting Who Rated What in Large-Scale Datasets by Y. Liu and Z. Kou, KDD Cup and Workshop, The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, August, 2007. (Award: 3rd Place in KDD Cup, Task 1)
- Temporal Causal Modeling with Graphical Granger Methods by A. Arnold, Y. Liu, and N. Abe, Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 12-15, 2007, San Jose, CA.
- High-Quantile Modeling for Customer Wallet Estimation and Other Applications by C. Perlich, S. Rosset, R. Lawrence, and B. Zadrozny, Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 12-15, 2007, San Jose, CA.
- Identifying Bundles of Product Options using Mutual Information by C. Perlich and S. Rosset, Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis.
- Harmonium Models for Semantic Video Representation and Classification by J. Yang, Y. Liu, E. P. Xing, and A. Hauptmann, Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis.
2006
- A Parameterized Probabilistic Model of Network Evolution for Supervised Link Prediction by H. Kashima and N. Abe, Proceedings of the 2006 IEEE International Conference on Data Mining (ICDM 2006), December 2006.
- Data Analytics for Marketing Decision Support by S. Rosset and N. Abe, Tutorial Notes from The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Tutorial on Data Analytics for Marketing Decision Support, August 2006, Philadelphia, USA.
- A New Multi-View Regression Method with an Application to Customer Wallet Estimation by S. Merugu, S. Rosset, and C. Perlich, to appear in The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2006, Philadelphia, USA.
- Outlier Detection by Active Learning by N. Abe, B. Zadrozny, and J. Langford, to appear in The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2006, Philadelphia, USA.
- Using Secure Coprocessors for Privacy Preserving Collaborative Data Mining and Analytics by B. Bhattacharjee, N. Abe, K. Goldman, B. Zadrozny, V. Reddy, M. del Carpio, C. Apte, in Second International Workshop on Data Management on New Hardware (DaMoN 2006) Chicago, Illinois, June 25, 2006.
- Embedded Predictive Modeling in a Parallel Relational Database by A. Dorneich, R. Natarajan, E. Pednault and F. Tipu, in Proceedings of the 21st ACM Symposium on Applied Computing, Special Track on Data Mining, April 2006, Dijon, France.
- Data-Enhanced Predictive Modeling for Sales Targeting by S. Rosset and R. Lawrence, in Proceedings of the Sixth SIAM International Conference on Data Mining, April 20-22, 2006, Bethesda, Maryland.
- Transform Regression and the Kolmogorov Superposition Theorem by E. Pednault, in Proceedings of the Sixth SIAM International Conference on Data Mining, April 20-22, 2006, Bethesda, Maryland.
2005
- Wallet Estimation Models by S. Rosset, C. Perlich, B. Zadrozny, S. Merugu, S. Weiss and R. Lawrence, In Proceedings of the International Workshop on Customer Relationship Management: Data Mining Meets Marketing (CRM Workshop), 2005.
- Data Mining and Clinical Data Repositories: Insights from a 667,000 Patient Data Set by B. Robson, C. Apte, S. Weiss et al., to appear in Computers in Biology and Medicine, 2005.
- Ranking-Based Evaluation of Regression Models by S. Rosset, C. Perlich and B. Zadrozny, in Proceedings of the Fifth IEEE International Conference on Data Mining, November 2005.
- An Improved Categorization of Classifier's Sensitivity on Sample Selection Bias by W. Fan, I. Davidson, B. Zadrozny, and P. S. Yu, in Proceedings of the Fifth IEEE International Conference on Data Mining, November 2005.
- Business Performance Management System for CRM and Sales Execution by M. Ettl, B. Zadrozny, P. Chowdhary and N. Abe, in Proceedings of the Sixteenth International Conference on Database and Expert Systems Applications, pp. 908-913, August 2005.
- Robust Boosting and Its Relation to Bagging by S. Rosset, in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2005.
- Gene Classification: Issues and Challenges for Relational Learning by C. Perlich and S. Merugu, in Proceedings of the Workshop on Multi-Relational Data Mining (MRDM), at the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2005.
- One-Benefit Learning: Cost-Sensitive Learning with Restricted Cost Information by B. Zadrozny, in Proceedings of the Workshop on Utility-Based Data Mining (UBDM), at the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2005.
- ROC Confidence Bands: An Empirical Evaluation by S. Macskassy, F. Provost and S. Rosset, in Proceedings of the Twenty-Second International Conference on Machine Learning, August 2005.
- Error Limiting Reductions Between Classification Tasks by A. Beygelzimer, V. Dani, T. Hayes, J. Langford and B. Zadrozny, in Proceedings of the Twenty-Second International Conference on Machine Learning, August 2005.
- Relating Reinforcement Learning Performance to Classification Performance by J. Langford and B. Zadrozny, in Proceedings of the Twenty-Second International Conference on Machine Learning, August 2005.
- Approaching the ILP Challenge 2005: Class-Conditional Bayesian Propositionalization for Genetic Classification by C. Perlich, in Proceedings of the Fifteenth International Conference on Inductive Logic Programming, August 2005.
- Weighted One-Against-All by A. Beygelzimer, J. Langford and B. Zadrozny, in Proceedings of the Twentieth National Conference on Artificial Intelligence, July 2005.
- Learning from Identifier Attributes: Distribution-Based Aggregation for Relational Learning by C. Perlich and F. Provost, in Proceedings of the Dagstuhl Seminar 05051 (Probabilistic, Logical and Relational Learning - Towards a Synthesis), February 2005.
- Sparsity and Smoothness via the Fused Lasso by R. Tibshirani, M. Saunders, S. Rosset, J. Zhu and K. Knight, in Journal of the Royal Statistical Society Series B, Vol. 67 No. 1, 2005.
- Estimating Class Membership Probabilities using Classifier Learners by J. Langford and B. Zadrozny, in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, January 2005.
- Improvements to the Linear Programming based Scheduling of Web Advertisements by A. Nakamura and N. Abe, in Journal of Electronic Commerce Research, 5(1), 75-98, 2005.
- Sequential Risk Management in E-Business by Reinforcement Learning, by N. Abe, E. Pednault, B. Zadrozny, H. Wang, W. Fan, and C. Apte, in Handbook of Integrated Risk Management for E-Business: Measuring, Modeling and Managing Risk, A. Labbi, ed., J. Ross Publishing, 2005.
2004
- A Grid-based Approach for Enterprise-Scale Data Mining by R. Natarajan, R. Sion, C. Apte and I. Narang, in Proceedings of the Workshop on Data Mining and the Grid at the Fourth IEEE International Conference on Data Mining, November 2004.
- Boosting as a Regularized Path to A Maximum Margin Classifier by S. Rosset, J. Zhu and T. Hastie. Journal of Machine Learning Research, 5(Aug):941-973. August 2004.
- Tracking Curved Regularized Optimization Solution Paths by S. Rosset, in Neural Information Processing Systems, December 2004.
- A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning by S. Rosset, J. Zhu, H. Zou and T. Hastie, in Neural Information Processing Systems, December 2004.
- The Entire Regularization Path for the Support Vector Machine by T. Hastie, S. Rosset, R. Tibshirani and J. Zhu. Journal of Machine Learning Research 5(Oct): 1391-1415, October 2004. R package (short version to appear in NIPS 2004).
- An Iterative Method for Multi-Class Cost-Sensitive Learning by N. Abe, B. Zadrozny and J. Langford, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Seattle, August 2004.
- Cross Channel Optimized Marketing by Reinforcement Learning by N. Abe, N. Verma, C. Apte and R. Schroko, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Seattle, August 2004.
- Empirical Evaluation of Feature Subset Selection Based on a Real-world Data Set by P. Perner and C. Apte. Engineering Applications of Artificial Intelligence, Volume 17, Issue 3, Pages 285-288, April 2004.
- Model Selection via the AUC by S. Rosset, in Proceedings of the Twenty-First International Conference on Machine Learning, July 2004.
- Learning and Evaluating Classifiers under Sample Selection Bias by B. Zadrozny, in Proceedings of the Twenty-First International Conference on Machine Learning, July 2004.
- Discussion of "Least Angle Regression" by Efron et al. by S. Rosset and J. Zhu, in Annals of Statistics, April 2004 (preliminary version).
- Sampling Approach to Resource Light Mining by N. Abe, C. Apte, B. Bhattacharjee, K. Goldman, J. Langford and B. Zadrozny, in Proceedings of the Data Mining in Resource Constrained Environments Workshop at the Fourth SIAM International Conference on Data Mining, April 2004.
2003
- 1-norm Support Vector Machines by J. Zhu, S. Rosset, T. Hastie, and R. Tibshirani, in Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), 2003.
- Margin Maximizing Loss Functions by S. Rosset, J. Zhu, and T. Hastie, in Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), 2003.
- Integrating Customer Value Considerations into Predictive Modeling by S. Rosset and E. Neumann, in IEEE International Conference on Data Mining (ICDM), 2003.
- Cost-Sensitive Learning by Cost-Proportionate Example Weighting by B. Zadrozny, J. Langford, and N. Abe, in IEEE International Conference on Data Mining (ICDM), 2003.
- Knowledge-Based Data Mining by S.M. Weiss, S.J. Buckley, S. Kapoor, and S. Damgaard, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Washington DC, August 24-27, 2003.
- Passenger-Based Predictive Modeling of Airline No-show Rates by R. D. Lawrence, S.J. Hong, and J. Cherrier, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Washington DC, August 24-27, 2003.
- Data Mining Analytics for Business Intelligence and Decision Support by C. Apte, in OR/MS Today, February 2003.
- Data Intensive Analytics for Predictive Modeling by C. Apte, S.J. Hong, R. Natarajan, E.P.D. Pednault, F. Tipu, and S. Weiss, in IBM Journal of R&D, Vol. 47, No. 1, Pages 17-23, January 2003.
- A Machine-Learning Approach to Optimal Bid Pricing by R. D. Lawrence, in Proceedings of the Eighth INFORMS Computing Society Conference on Optimization and Computation in the Network Era, Chandler, Arizona, January 2003.
- Reinforcement Learning with Immediate Rewards and Linear Hypotheses by Naoki Abe, Alan Biermann, and Philip Long, in Algorithmica, 37, 263-293, 2003.
2002
- Empirical Comparison of Various Reinforcement Learning Strategies in Sequential Targeted Marketing by N. Abe, E.P.D. Pednault, H. Wang, B. Zadrozny, W. Fan, and C. Apte, in Proceedings of the 2002 IEEE International Conference on Data Mining, December 2002.
- Prediction of MHC Class I Binding Peptides by Dynamic Experiment Design based on Query Learning with Hidden Markov Models by K. Udaka, H. Mamitsuka, Y. Nakaseko and N. Abe, in Journal of Immunology, 169(10), 5744-5753, 2002.
- Business Applications of Data Mining, by C. Apte, B. Liu, E.P.D. Pednault, and P. Smyth, in Communications of the ACM, Vol. 45, No. 8, August 2002.
- A Probabilistic Estimation Framework for Predictive Modeling Analytics, by C. Apte, R. Natarajan, E.P.D. Pednault, and F. Tipu, in IBM Systems Journal, Vol. 41, No. 3, August 2002.
- Predictive Algorithms in the Management of Computer Systems, by R. Vilalta, C. Apte, J. Hellerstein, S. Ma, and S.M. Weiss, in IBM Systems Journal, Vol. 41, No. 3, August 2002.
- Automated Generation of Model Cases for Help-Desk Applications, by S.M. Weiss and C. Apte, in IBM Systems Journal, Vol. 41, No. 3, August 2002.
- Experiments in High-Dimensional Text Categorization, by F. Damerau, T. Zhang, and S.M. Weiss, in Proceedings of ACM SIGIR International Conference on Information Retrieval, August 2002.
- Sequential Cost-Sensitive Decision Making with Reinforcement Learning, by E.P.D. Pednault, N. Abe, and B. Zadrozny, in Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Edmonton, Canada, July 2002.
- A System for Real-time Competitive Market Intelligence, by S.M. Weiss and N.K. Verma, in Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Edmonton, Canada, July 2002.
- Segmented Regression Estimators for Massive Data Sets by R. Natarajan and E.P.D. Pednault, in Proceedings of the SIAM Second International Conference on Data Mining, Crystal City, Virginia, April 2002.
- Multiplicative Adjustment of Class Probability: Educating Naive Bayes by S.J. Hong, J. Hosking, and R. Natarajan, IBM Research Report RC-22393, April 2002. Condensed version in IEEE ICDM 2002.
2001
- Personalization of Supermarket Product Recommendations by R. D. Lawrence, G. Almasi, V. Kotlyar, M. Viveros, and S. Duri, in Data Mining and Knowledge Discovery, 5, 11-32, 2001.
- Segmentation-Based Modeling for Advanced Targeted Marketing, by C. Apte, E. Bibelnieks, R. Natarajan, E.P.D. Pednault, F. Tipu, D. Campbell, and B. Nelson, IBM Research Report RC-21982. In Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Francisco, August 2001.
- Using Simulated Pseudo Data to Speed Up Statistical Predictive Modeling from Massive Data Sets, by R. Natarajan and E.P.D. Pednault, in SIAM First International Conference on Data Mining, Chicago, IL, April 2001.
- Solving Regression Problems with Rule-Based Ensemble Classifiers, by N. Indurkhya and S.M. Weiss, in Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Francisco, August 2001.
- Lightweight Collaborative Filtering Method for Binary-Encoded Data, by S.M. Weiss and N. Indurkhya, in Fifth European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Freiburg, Germany, September 2001.
- A New Approach for Item Choice Recommendations, by S.J. Hong, R. Natarajan, and I. Belitskaya, IBM Research Report RC-21962, in Third International Conference on Data Warehousing and Knowledge Discovery (DaWaK'01), September 2001, Munich, Germany.