Research Areas - Cornell University

Rich Caruana
4157 Upson Hall
Cornell University
Ithaca, NY 14853
caruana@cs.cornell.edu
phone: 607-255-1164
fax: 607-255-4428
Research Areas
Machine learning and data mining, medical informatics and decision making,
bio-informatics, feature selection, missing values, inductive transfer (e.g., multitask learning),
rank learning, artificial neural networks, ensemble learning, memory-based learning.
Education
Ph.D., Computer Science, 1997, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Ph.D. Thesis: "Multitask Learning."
Committee: Tom Mitchell, Herb Simon, Tom Dietterich, Dean Pomerleau.
M.S., Computer Science, 1984, Villanova University, Villanova, Pennsylvania.
M.S. Thesis: "BANDIT: A Fast Algorithm for the Resolution of Spectra."
B.S., Mathematics, 1982. Minors in Physics, Chemistry, and Honors Liberal Arts, Villanova
University, Villanova, Pennsylvania.
Research Positions
July 2001–Present
Assistant Professor, Department of Computer Science, Cornell University, Ithaca, New York
July 2000–July 2001
Research Faculty, Center for Automated Learning and Discovery, Carnegie Mellon University,
Pittsburgh, Pennsylvania
September 1998–July 2000
Assistant Professor, Radiology and Computer Science, University of California, Los Angeles,
Los Angeles, California
Visiting Professor, Center for Automated Learning and Discovery, Carnegie Mellon University,
Pittsburgh, Pennsylvania
July 1996–March 2000
Research Scientist (machine learning), Justsystem Pittsburgh Research Center (JPRC), Pittsburgh,
Pennsylvania
October 1986–August 1989
Research Scientist (machine learning), Philips Labs, Briarcliff Manor, New York
September 1984–October 1986
Knowledge Engineer (expert systems), GTE, Mountain View, California
August 1980–August 1983
Research Scientist (biomedical pattern recognition), Geometric Data Company, King of Prussia,
Pennsylvania
Professional Activities
Area Chair for Neural Information Processing Systems (NIPS), 2003
Area Chair for International Conference on Machine Learning (ICML), 2003
Program Committee for International Conference on Data Mining (ICDM), 2003
Program Committee for ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (KDD), 2003
Program Committee American Medical Informatics Association Conference (AMIA), 2003
Thesis Committee Member, Sylvain le Tourneau, University of Ottawa, August 2003
Co-Chair of Workshop on Rank Learning, Neural Information Processing Systems
(NIPS), 2002
Instructor for advanced course in artificial neural nets (35 lecture hours) for NeuralWare
Corporation, 2000, 2001
Program Committee for the International Conference on Artificial Neural Networks in
Medicine and Biology (ANNIMAB), 2000
General Chair of Workshops, Neural Information Processing Systems (NIPS), 2000
General Chair of Workshops, Neural Information Processing Systems (NIPS), 1999
Co-Chair of Workshop on Integrating Supervised and Unsupervised Learning, Neural
Information Processing Systems (NIPS), 1998
Co-Chair of Workshop on Tricks of the Trade, Neural Information Processing Systems
(NIPS), 1996
Co-Chair of Workshop on Inductive Transfer, Neural Information Processing Systems
(NIPS), 1995
Reviewing
American Association for Artificial Intelligence (AAAI)
IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Neural Networks
IEEE Transactions on Systems, Man, and Cybernetics
International Conference on Artificial Neural Nets (ICANN)
International Conference on Artificial Neural Networks in Medicine and Biology
(ANNIMAB)
International Conference on Data Mining (ICDM)
International Conference on Machine Learning (ICML)
International Joint Conference on Artificial Intelligence (IJCAI)
International Joint Conference on Artificial Neural Nets (IJCNN)
Journal of Analytical Chemistry
Journal of Artificial Intelligence Research (JAIR)
Journal of Machine Learning Research (JMLR)
Knowledge Discovery and Data Mining (KDD)
Machine Learning Journal (MLJ)
Neural Information Processing Systems (NIPS)
Portuguese Conference on Artificial Intelligence
Honors
Nomination for Best Paper: Caruana, Rich, Niculescu, Stefan, Rao, Bharat, and Simms,
Cynthia, "Evaluating the C-section Rate of Different Physician Practices: Using Machine
Learning to Model Standard Practice" to appear at the American Medical Informatics
Conference (AMIA), November 2003.
Invited speaker, AAAI Workshop on Learning From Imbalanced Data, July 2000: "Methods
for Learning From Imbalanced Data"
Invited speaker, Research Jamboree, University of Edinburgh, Edinburgh, Scotland, May
2000 (3 talks): "Unabridged and Multitask Learning", "Machine Learning, Medical Decision
Making, and Explanation", "Semi-Supervised Clustering"
Invited speaker, University of Skövde, Sweden, December 1994: "Multitask Learning"
Participant, Generalization in Neural Nets and Graphical Models Summer School, Newton
Institute, Cambridge, England, 1997
Participant, Learning in Graphical Models Summer School, Ettore Majorana, Erice, Italy,
1996
Participant, Connectionist Models Summer School, Boulder, Colorado, 1993
Award for Outstanding Contribution, Philips Labs, 1988
Journal Publications
Caruana, Rich, and de Sa, Virginia R., "Benefitting from the Variables that Variable Selection
Discards," Journal of Machine Learning Research, Vol. 3, March 2003, pp. 1245-1264.
Goldenberg, A., Shmueli, G., Caruana, R., Fienberg, S., "Early Statistical Detection of Anthrax
Outbreaks by Tracking Over-the-counter Medication Sales," Proceedings of the National
Academy of Sciences, 99, 5237-5240, 2002.
Simms, Cynthia J., Meyn, Leslie, Caruana, Rich, Rao, R. Bharat, Mitchell, Tom, Krohn,
Marijane, "Predicting Cesarean Delivery with Decision Tree Models," The American Journal of
Obstetrics and Gynecology, Vol. 183, No. 5, November 2000, pp. 1198-1206.
Cooper, G. F., Aliferis, C. F., Ambrosino, R., Aronis, J., Buchanan, B. G., Caruana, R., Fine, M.
J., Glymour, C., Gordon, G., Hanusa, B. H., Janosky, J. E., Meek, C., Mitchell, T., Richardson,
T., Spirtes, P., "An Evaluation of Machine Learning Methods for Predicting Pneumonia
Mortality." Artificial Intelligence in Medicine 9, 1997, pp. 107-138.
Caruana, Rich, "Multitask Learning." Machine Learning, Vol. 28, pp. 41-75, Kluwer Academic
Publishers, 1997.
Mitchell, Tom, Caruana, Rich, Freitag, Dayne, McDermott, John, Zabowski, David, "Experience
with a Learning Personal Assistant." Communications of the ACM, 1994.
Caruana, Rich, Searle, Roger B., Shupack, Saul I., "Additional Capabilities for a Fast Algorithm
for the Resolution of Spectra." Journal of Analytical Chemistry, 1988.
Caruana, Rich, Searle, Roger B., Heller, Thomas, Shupack, Saul I., "Fast Algorithm for the
Resolution of Spectra." Journal of Analytical Chemistry, 1986.
Book Chapters
Caruana, Rich, "15 Useful Tricks with Extra Outputs." Neural Networks: Tricks of the Trade, G.
B. Orr and K.-R. Müller (Eds.), Springer-Verlag, 1998.
Caruana, Rich, "Multitask Learning." Learning to Learn, S. Thrun and L. Pratt (Eds.), Kluwer
Academic Publishers, 1997.
Caruana, Rich, Freitag, Dayne, "How Useful Is Relevance?" Intelligent Relevance: Papers from
the 1994 Fall Symposium, AAAI Report FS-94-02, ISBN 0-929280-76-8, 1994.
Caruana, Rich, "The Automatic Training of Rule Bases that Use Numerical Uncertainty
Representations." Uncertainty in Artificial Intelligence, Vol. 3, North-Holland, 1988.
Refereed Conference Papers
Caruana, Rich, Niculescu, Stefan, Rao, Bharat, and Simms, Cynthia, "Evaluating the C-section
Rate of Different Physician Practices: Using Machine Learning to Model Standard Practice" to
appear at the American Medical Informatics Conference (AMIA), November 2003.
Langford, John, and Caruana, Rich, "(Not) Bounding the True Error," Neural Information
Processing Systems, Vol. 14 (Proceedings of NIPS*2001), MIT Press, 2002.
Caruana, Rich, "A Non-Parametric EM-Style Algorithm for Imputing Missing Values," Artificial
Intelligence and Statistics, January 2001.
Caruana, Rich, Lawrence, Steve, and Giles, Lee, "Overfitting in Artificial Neural Nets Trained
with Backpropagation, Conjugate Gradient, and Early Stopping," Neural Information
Processing Systems, Vol. 13 (Proceedings of NIPS*2000), MIT Press, 2001.
Berger, Adam, Caruana, Rich, Cohn, David, Freitag, Dayne, Mittal, Vibhu, "Bridging the Lexical
Chasm: Automatic FAQ Answer Finding." Special Interest Group on Information Retrieval
(SIGIR), Athens, Greece, July 2000.
O'Sullivan, Joseph, Langford, John, Caruana, Rich, Blum, Avrim, "Unabridged Learning."
International Conference on Machine Learning (ICML), Stanford, California, July 2000.
Caruana, Rich, Cohn, David, McCallum, Andrew, "Semi-Supervised Clustering with User
Feedback." Machines that Learn, Snowbird, Utah, April 2000.
Simms, Cynthia, M.D., Caruana, Rich, Krohn, M. J., Meyn, Leslie, Mitchell, Tom, Rao, R.
Bharat, and Schmeuking, Ingo, "Predicting Caesarean Section with Decision Trees." Annual
Meeting of the Society of Fetal and Maternal Medicine, February 2000.
Caruana, Rich, "Case-Based Explanation of Artificial Neural Nets." Artificial Neural Nets in
Medicine and Biology (ANNIMAB), Göteborg, Sweden, May 2000.
Caruana, R., Kangarloo, H., David, J., Dionisio, N., Sinha, U., Johnson, D. "Case-Based
Explanation of Non-Case-Based Learning Methods." Proceedings of the 1999 American Medical
Informatics Association (AMIA) Symposium, 1999, pp. 212-215.
Caruana, Rich, Dionisio, John, Johnson, David, Taira, Ricky, Kangarloo, Hooshang, "Automatic
Imaging Protocol Selection." American Radiology Conference (IRAS), 1999.
Caruana, Rich, "Applying Case-Based Explanation to Non-Case-Based Methods such as
Artificial Neural Nets or Decision Trees." American Radiology Conference (IRAS), 1999.
Caruana, Rich, O'Sullivan, Joseph, "Multitask Pattern Recognition for Autonomous Robots."
IEEE International Conference on Intelligent Robotic Systems (IROS), Victoria, B.C., Canada,
October 1998.
Caruana, Rich, de Sa, Virginia, "Using Feature Selection to Find Inputs that Work Better as Extra
Outputs." The International Conference on Neural Nets (ICANN), Skövde, Sweden, September
1998.
Caruana, Rich, O'Sullivan, Joseph, "Multitask Pattern Recognition for Vision-Based Autonomous
Robots." The International Conference on Neural Nets (ICANN), Skövde, Sweden, September
1998.
Caruana, Rich, de Sa, Virginia, "Promoting Poor Features to Supervisors: Some Inputs Work
Better as Outputs." Neural Information Processing Systems, Vol. 9 (Proceedings of
NIPS*96), MIT Press, 1997, pp. 389-395.
Caruana, Rich, "Algorithms and Applications for Multitask Learning." Machine Learning,
Proceedings of the 13th International Conference on Machine Learning (ICML 1996, Bari,
Italy), Morgan Kaufmann, 1996, pp. 87-95.
Caruana, Rich, Baluja, Shumeet, Mitchell, Tom, "Using the Future to 'Sort Out' the Present:
Rankprop and Multitask Learning for Medical Risk Evaluation." Advances in Neural Information
Processing Systems, Vol. 8 (Proceedings of NIPS*95), MIT Press, 1996, pp. 959-965.
Baluja, Shumeet, Caruana, Rich, "Removing the Genetics from the Standard Genetic Algorithm."
Proceedings of the 12th Annual Conference on Machine Learning, 1995, pp. 38-46.
Caruana, Rich, "Learning Many Related Tasks at the Same Time with Backpropagation."
Advances in Neural Information Processing Systems 7 (Proceedings of NIPS*94), MIT Press,
1995, pp. 657-664.
Caruana, Rich, Freitag, Dayne, "Greedy Attribute Selection." Machine Learning, Proceedings of
the Eleventh International Conference on Machine Learning, (ICML 1994, New Brunswick, New
Jersey), Morgan Kaufmann, 1994, pp. 28-36.
Caruana, Rich, "Multitask Connectionist Learning." Proceedings of the 1993 Connectionist
Models Summer School, 1993, pp. 372-379.
Caruana, Rich, "Multitask Learning: A Knowledge-Based Source of Inductive Bias." Proceedings
of the 10th International Conference on Machine Learning, 1993, pp. 41-48.
Caruana, Rich, Schaffer, J. David, "Using Multiple Representations to Control Inductive Bias:
Gray and Binary Codes for Genetic Algorithms." Proceedings of the Sixth International
Workshop on Machine Learning (ML 1989), Morgan Kaufmann, 1989, pp. 375-378.
Schaffer, J. David, Caruana, Rich, Eshelman, Larry J., "Designing Neural Nets that Generalize
Optimally with Genetic Algorithms." Los Alamos Conference on Emergent Computation, May
1989.
Eshelman, Larry J., Caruana, Rich, Schaffer, J. David, "Biases in the Crossover Landscape." The
1989 International Conference on Genetic Algorithms, June 1989.
Schaffer, J. David, Caruana, Rich, Eshelman, Larry J., Das, Raj, "A Study of Control Parameters
Affecting Online Performance of Genetic Algorithms for Function Optimization." The 1989
International Conference on Genetic Algorithms, June 1989.
Caruana, Rich, Eshelman, Larry J., Schaffer, J. David, "Representation and Hidden Bias II:
Eliminating Defining Length Bias in Genetic Search with Shuffle Crossover." International Joint
Conference on Artificial Intelligence (IJCAI), August 1989.
Caruana, Rich, Schaffer, J. David, "Representation and Hidden Bias: Gray vs. Binary Coding for
Genetic Algorithms." Fifth International Conference on Machine Learning, June 1988.
Peer-Reviewed Workshops
Caruana, Rich, Hodor, Paul, "A High-Precision Workbench for Extracting Information from the
Protein Data Bank (PDB)." Knowledge Discovery and Data Mining (KDD) Workshop on Text and
Information Extraction, Boston, Massachusetts, 2000.
Caruana, Rich, Mullin, Matt, "Estimating the Number of Local Minima in Complex Search
Spaces." International Joint Conference on Artificial Intelligence (IJCAI) Workshop on
Optimization, Stockholm, Sweden, 1999.
Caruana, Rich, "15 Useful Tricks with Extra Outputs." Neural Information Processing
Systems (NIPS) Workshop on Tricks of the Trade, 1996.
Caruana, Rich, Freitag, Dayne, "How Useful Is Relevance?" AAAI Fall Symposium on
Relevance, New Orleans, Louisiana, 1994.
Caruana, Rich, "Generalization vs. Network Size." Neural Information Processing Systems
(NIPS) Workshop on Generalization, 1993.
Chrisman, Lonnie, Caruana, Rich, Carriker, Wayne, "Intelligent Agent Design Issues: Internal
Agent State and Incomplete Perception." AAAI Fall Symposium on Sensory Aspects of Robotic
Intelligence, Asilomar, California, 1991.
Caruana, Rich, "The Automatic Training of Rule Bases that Use Numerical Uncertainty
Representations." AAAI-87 Workshop on Uncertainty in Artificial Intelligence, Seattle,
Washington, 1987.
Technical Reports
Cohn, David, Caruana, Rich, and McCallum, Andrew, "Semi-Supervised Clustering with User
Feedback." Cornell University Technical Report, TR2003-1892, 2003.
Caruana, Rich, Artigas, Pedro, Goldenberg, Anna, and Likhodedov, Anton, "Meta Clustering."
Cornell University Technical Report, TR2002-1884, 2002.
Caruana, Rich, "Multitask Learning." Ph.D. Dissertation, School of Computer Science, Carnegie
Mellon University, CMU-CS-97-203, 1997.
Buntine, Wray, Caruana, Rich, "Introduction to IND and Recursive Partitioning." NASA Ames
Research Center, TR# FIA-91-28, 1991.
Caruana, Rich, "BANDIT: A Fast Algorithm for the Resolution of Spectra." Master's Thesis,
Departments of Computer Science and Chemistry, Villanova University, 1988.
Caruana, Rich, "Estimating the Number of Minima in Complex Search Spaces." Philips Labs,
TR-88-159, 1988.
Caruana, Rich, Schaffer, J. David, "Optimizing Digital Filters with Simulated Annealing and
Genetic Algorithms." Philips Labs, TR-88-123, 1988.
Caruana, Rich, Schaffer, J. David, "An Investigation of Parameter Sets for Genetic Algorithms."
Philips Labs, TR-88-045, 1988.
Caruana, Rich, Coffey, Brian J., "Searching for Optimal FIR Multiplierless Digital Filters with
Simulated Annealing." Philips Labs, TR-88-031, 1988.
Pelavin, Rich N., Coffey, Brian J., Caruana, Rich, "Research in Diagnosis Using Design
Knowledge." Philips Labs, TR-88-001, 1988.
Caruana, Rich, Schaffer, J. David, "Gray vs. Binary Coding for Genetic Algorithm Function
Optimizers." Philips Labs, TR-87-080, 1987.
Benjamin, D. Paul, Caruana, Rich, "Partial-Matching as Search." Philips Labs, TR-87-019, 1987.
Caruana, Rich, "Experiments in Rule-Based Learning in Systems Using Numerical Uncertainty
Representations." GTE Western Division Technical Report, 1986.
Patents and Invention Disclosures
Sukthankar, Rahul, Caruana, Rich, Hasegawa, Keiko, Mullin, Matt, "Using Active Monitor
Illumination for 3-D Active Imaging." Patent disclosure filed April 2000.
Caruana, Rich, "Iterated K-Nearest Neighbor Method and Article of Manufacture for Filling in
Missing Values." United States Patent 6,047,287. Assignee: Justsystem Pittsburgh Research
Center, Pittsburgh, Pennsylvania. Filed May 5, 1998, granted April 4, 2000.
Work in Progress or Recently Submitted
"Selecting Ensembles from Libraries of Models," with Alex Niculescu, Geoff Crew, and Alex
Ksikes.
"An Automatic Method for Finding Interpretations for Dimensions in Multidimensional Scaling,"
with Marc Fasnacht.
"Temporal Clustering of Microarray Protein Activities," with Paul Hodor.
"Finding High-Quality Protein Structures in the Protein Data Bank (PDB)," with Paul Hodor,
Bruce Buchanan, and John Rosenberg.
"Using Machine Learning to Discover Sequence Patterns for Protein Folding Motifs," with Paul
Hodor, Bruce Buchanan, and John Rosenberg.
"Backprop Nets Do Unsupervised Clustering of Outputs."
"Improving Kernel Regression with Multitask Learning."
"Soft Ranks: Making Rank Metrics Differentiable for Easier Optimization."
"A Machine Learning Approach to Data Cleansing," with Adam Kalai.
"Combining Multiple Databases in Order to Improve Prediction," with P. Spirtes, V. Abraham, J.
Aronis, B. Buchanan, G. Cooper, M. Fine, G. Livingston, and S. Monti.
"Predicting Dire Outcomes of Patients with Community Acquired Pneumonia (CAP) Using All
Available and Relevant Data," with G. Cooper, C. Aliferis, J. Aronis, B. Buchanan, M. Fine, J.
Janosky, T. Mitchell, and P. Spirtes.
"Predicting Etiology of Patients with Community Acquired Pneumonia (CAP)," with G. Cooper,
C. Aliferis, J. Aronis, B. Buchanan, M. Fine, J. Janosky, T. Mitchell, and P. Spirtes.
Teaching (at Cornell)
COM-578: Empirical Methods in Machine Learning and Data Mining, Fall '01, Fall '02, Fall '03.
COM-678: Special Topics in Machine Learning, Spring '02 (with Thorsten Joachims).
COM-778: Advanced Topics in Machine Learning, Spring '03.