Ido Dagan - Curriculum Vitae

Personal Data
Address:
Computer Science Department
Bar Ilan University
Ramat Gan, 52900
Israel
Phone: +972-(0)54-4395336
Email: dagan@cs.biu.ac.il
Recent CV Highlights
Initiated the research area of textual entailment; organized the first three PASCAL
Recognizing Textual Entailment (RTE) Challenges (2005-2007) and transferred the
challenge organization to the U.S. National Institute of Standards and Technology
(NIST), starting with the 4th RTE challenge in 2008.
Nominated as candidate and elected as the President of the Association for
Computational Linguistics (ACL) for 2010 (Vice-President in 2009).
Recipient of the 2008 Wolf Foundation Krill Prize.
Education
1986: B.Sc. Computer Science, Summa Cum Laude, Technion – Israel Institute of
Technology. On the Technion President’s List of Excellence 1984, 1985, 1986.
1992: Ph.D. Computer Science, Technion – Israel Institute of Technology.
Thesis topic: Multilingual statistical methods for natural language disambiguation.
Supervisor: Prof. Alon Itai.
Employment
1978 - 1983: Israel Defense Forces. Software development and system analysis.
1984 - 1986: CAD Software Engineer, INTEL Israel Ltd.
1990 - 1991: Research Fellow, IBM Israel Scientific Center, Haifa.
1992 - 1994: Member of Technical Staff, AT&T Bell Laboratories, Research Division.
1994 - 1996: Post Doctoral Research Fellow, Dept. of Mathematics and Computer
Science, Bar Ilan University.
1996 - 1998: Visiting Lecturer, Dept. of Mathematics and Computer Science, Bar Ilan
University.
1998 - 2001: Founder, Chief Technology Officer and Director, FocusEngine.
2002 - 2003: Vice President of Technology, LingoMotors (following FocusEngine
acquisition).
2003 - 2008: Senior Lecturer, Dept. of Mathematics and Computer Science, Bar Ilan
University.
2008 - : Associate Professor, Dept. of Mathematics and Computer Science, Bar Ilan
University.
Teaching
Algorithms 1
Automata and Formal Languages
Natural Language Processing (NLP)
Empirical Methods for NLP
Text Understanding
Information Retrieval
Applied Probabilistic Models in Computer Science
NLP seminars
Statistical Methods for Computer Science
Research Interests
Natural Language Processing (NLP, Computational Linguistics) and textual information
access:
Applied semantic inference; particularly textual entailment, a field which was
initiated by Dr. Dagan and his colleagues as a generic framework for applied
semantics.
Empirical Natural Language Processing and Machine learning methods
Applications for textual information access and extraction (such as information
retrieval, information extraction, question answering, text categorization)
Publications
Book
1. Quinonero-Candela, J.; Dagan, I.; Magnini, B.; d'Alché-Buc, F. (Eds.) Machine
Learning Challenges. Lecture Notes in Computer Science, Vol. 3944, 462 p.
Springer, 2006.
Special Issue Editing
1. Co-editor of the Journal of Natural Language Engineering Special Issue on Textual
Entailment, Volume 15, Special Issue 04, October 2009.
Journal Articles
1. Dagan, Ido, Martin C. Golumbic and Ron Y. Pinter. Trapezoid graphs and their
coloring, Discrete Applied Mathematics, 1988, Vol. 21, pp. 35-46.
2. Dagan, Ido and Alon Itai. Set expression based inheritance system, Annals of
Mathematics and Artificial Intelligence, 1991, Vol. 4(3-4), pp. 269-280.
3. Dagan, Ido and Alon Itai. Word sense disambiguation using a second language
monolingual corpus, Computational Linguistics, 1994, Vol. 20(4), pp. 563-596.
4. Dagan, Ido, John Justeson, Shalom Lappin, Herbert Leass and Amnon Ribak. Syntax
and lexical statistics in anaphora resolution, Applied Artificial Intelligence, 1995, Vol.
9, pp. 633-644.
5. Dagan, Ido, Shaul Marcus and Shaul Markovitch. Contextual word similarity and
estimation from sparse data, Computer Speech and Language, 1995, Vol. 9, pp. 123-152.
6. Dagan, Ido and Kenneth Church. Termight: Coordinating man and machine in
bilingual terminology acquisition, Machine Translation, 1997, Vol. 12(1-2), pp. 89-107.
7. Feldman, Ronen, Ido Dagan and Haym Hirsh. Mining text using keyword
distributions, Journal of Intelligent Information Systems, 1998, Vol. 10(3), pp. 281-300.
8. Dagan, Ido, Lillian Lee and Fernando Pereira. Similarity-based models of
cooccurrence probabilities, Machine Learning, 1999, Vol. 34(1-3) special issue on
Natural Language Learning, pp. 43-69.
9. Argamon, Shlomo, Ido Dagan and Yuval Krymolowski. A memory based approach
to learning shallow natural language patterns, Journal of Experimental and
Theoretical AI (JETAI), 1999, Vol. 11, pp. 369-390.
10. Argamon-Engelson, Shlomo and Ido Dagan. Committee-Based Sample Selection for
Probabilistic Classifiers, Journal of Artificial Intelligence Research (JAIR), 1999,
Vol. 11, pp. 335-360.
11. Marx, Zvika and Ido Dagan. Conceptual mapping through keyword coupled
clustering. Mind and Society: a Special Issue on Commonsense and Scientific
Reasoning, 4(2), pp. 59-85, 2001.
12. Marx, Zvika, Ido Dagan, Joachim M. Buhmann and Eli Shamir. Coupled clustering:
a method for detecting structural correspondence, Journal of Machine Learning
Research, 2002, Vol. 3(Dec), pp. 747-780.
13. A Gliozzo, C Strapparava, I Dagan, Unsupervised and Supervised Exploitation of
Semantic Domains in Lexical Disambiguation, Computer Speech and Language, Vol.
18, Issue 3, July 2004, pp. 275-299.
14. M. Koppel, N. Akiva and I. Dagan, Feature Instability as a Criterion for Selecting
Potential Style Markers, Journal of the American Society for Information Science and
Technology (JASIST), Volume 57, Number 11, September 2006, pp. 1519-1525.
15. Zhitomirsky-Geffet, Maayan and Ido Dagan. Bootstrapping Distributional Feature
Vector Quality, Computational Linguistics, September 2009, Vol. 35, No. 3: 435–461.
16. Alfio Gliozzo, Carlo Strapparava and Ido Dagan. Improving text categorization
bootstrapping via unsupervised learning. ACM Transactions on Speech and Language
Processing (TSLP), Volume 6, Issue 1, October 2009.
17. Ido Dagan, Bill Dolan, Bernardo Magnini and Dan Roth. Recognizing textual
entailment: Rationale, evaluation and approaches. Natural Language Engineering,
Volume 15, No. 4, pp. i-xvii, October 2009.
Refereed Articles in Books
1. Engelson, Sean and Ido Dagan. Sample selection in natural language learning, in S.
Wermter, E. Riloff and G. Scheler (Eds.), Connectionist, Statistical and Symbolic
Approaches to Learning for Natural Language Processing, Springer, 1996, pp. 230-245.
2. Dagan, Ido, Kenneth Church and William Gale. Robust bilingual word alignment for
machine aided translation, in S. Armstrong, K. Church, P. Isabelle, S. Manzi, E.
Tzoukermann and D. Yarowsky (Eds.), Natural Language Processing Using Very
Large Corpora, Kluwer Academic Publishers, 1999, pp. 209-224.
3. Dagan, Ido. Contextual Word Similarity, in Rob Dale, Hermann Moisl and Harold
Somers (Eds.), Handbook of Natural Language Processing, Marcel Dekker Inc, 2000,
Chapter 19, pp. 459-476.
4. Choueka, Yaacov, Ehud S. Conley and Ido Dagan. A comprehensive bilingual word
alignment system: application to disparate languages - Hebrew and English, in J.
Veronis (Ed.), Parallel Text Processing, Kluwer Academic Publishers, 2000, pp. 69–96.
5. Dagan, Ido and Yuval Krymolowski. Compositional memory-based partial parsing, in
R. Bod, R. Scha and K. Sima'an (Eds.), Data-Oriented Parsing, CSLI Publications,
2002.
6. Glickman Oren, Ido Dagan. Acquiring lexical paraphrases from a single corpus, in N.
Nicolov, K. Bontcheva, G. Angelova and R. Mitkov (editors). Recent Advances in
Natural Language Processing III, John Benjamins Publ. Co., Amsterdam, 2004, pp.
81-90.
7. Ido Dagan, Oren Glickman and Bernardo Magnini. The PASCAL Recognising
Textual Entailment Challenge. In Quinonero-Candela, J.; Dagan, I.; Magnini, B.;
d'Alché-Buc, F. (Eds.) Machine Learning Challenges. Lecture Notes in Computer
Science, Vol. 3944, pp. 177-190, Springer, 2006.
8. Oren Glickman, Ido Dagan and Moshe Koppel. A lexical alignment model for
probabilistic textual entailment. In Quinonero-Candela, J.; Dagan, I.; Magnini, B.;
d'Alché-Buc, F. (Eds.) Machine Learning Challenges. Lecture Notes in Computer
Science, Vol. 3944, pp. 287-298, Springer, 2006.
Papers at Refereed Conferences and Workshops
1. Dagan, Ido and Alon Itai. Automatic Acquisition of Constraints for the Resolution of
Anaphora References and Syntactic Ambiguities, in Proceedings of COLING, 1990,
pp. 330-332.
2. Dagan, Ido and Alon Itai. A Statistical Filter for Resolving Pronoun References, in Y.
A. Feldman and A. Bruckstein (Eds.), Artificial Intelligence and Computer Vision,
Elsevier Science Publishers B.V., 1991, pp. 125-135 (Proceedings of the 7th Israeli
Symposium on Artificial Intelligence and Computer Vision, 1990).
3. Dagan, Ido, Alon Itai and Ulrike Schwall. Two languages are more informative than
one, in Proceedings of the Annual Meeting of the Association for Computational
Linguistics (ACL), 1991, pp. 130-137.
(Extended version appears in journal article 3)
4. Dagan, Ido. Lexical disambiguation: Information sources and their statistical
realization, in Proceedings of the Annual Meeting of the Association for
Computational Linguistics (ACL) (Student Session), 1991, pp. 341-342.
5. Rackow, Ulrike, Ido Dagan and Ulrike Schwall. Automatic translation of noun
compounds, in Proceedings of COLING, 1992, pp. 1249-1253.
6. Dagan, Ido, Shaul Marcus and Shaul Markovitch. Contextual word similarity and
estimation from sparse data, in Proceedings of the Annual Meeting of the Association
for Computational Linguistics (ACL), 1993, pp. 164-171.
(Extended version appears in journal article 5)
7. Dagan, Ido, Kenneth Church and William Gale. Robust bilingual word alignment for
machine aided translation, in Proceedings of the Workshop on Very Large Corpora
(WVLC), 1993, pp. 1-8.
(Extended version appears in book article 2)
8. Dagan, Ido, John Justeson, Shalom Lappin, Herbert Leass and Amnon Ribak. Syntax
and lexical statistics in anaphora resolution, Bar-Ilan Symposium on Foundations of
AI, 1993.
(Extended version included in journal article 4)
9. Dagan, Ido, Fernando Pereira and Lillian Lee. Similarity-based estimation of word
cooccurrence probabilities, in Proceedings of the Annual Meeting of the Association
for Computational Linguistics (ACL), 1994, pp. 272-278.
(Extended version included in journal article 8)
10. Dagan, Ido and Kenneth Church. Termight: Identifying and translating technical
terminology, in Proceedings of the 4th Conference on Applied Natural Language
Processing (ANLP), 1994, pp. 34-40.
(Extended version appears in journal article 6)
11. Dagan, Ido and Sean Engelson. Committee-based sampling for training probabilistic
classifiers, in Proceedings of the Twelfth International Conference on Machine
Learning (ICML), 1995.
(Extended version included in journal article 10)
12. Dagan, Ido and Sean Engelson. Selective sampling in natural language learning, in
Proceedings of the IJCAI Workshop on New Approaches to Learning for Natural
Language Processing, 1995, pp. 41-48.
(Extended version appears in book article 1)
13. Feldman, Ronen and Ido Dagan. KDT - Knowledge Discovery in Texts, in
Proceedings of the First International Conference on Knowledge Discovery (KDD),
1995, pp. 112-117.
(Extended version included in journal article 7)
14. Feldman, Ronen and Ido Dagan. Knowledge Discovery in Textual Databases, in
Proceedings of the ECML Workshop in Knowledge Discovery, 1995.
15. Engelson, Sean and Ido Dagan. Minimizing Manual Annotation Cost in Supervised
Training from Corpora, in Proceedings of the Annual Meeting of the Association for
Computational Linguistics (ACL), 1996, pp. 319-326.
(Extended version included in journal article 10)
16. Dagan, Ido, Ronen Feldman and Haym Hirsh. Keyword-Based Browsing and
Analysis of Large Document Sets, in Proceedings of The Fifth Annual Symposium on
Document Analysis and Information Retrieval (SDAIR), 1996, pp. 191-208.
(Extended version included in journal article 7)
17. Feldman, Ronen, Ido Dagan and Willi Kloesgen. Efficient algorithms for mining and
manipulating associations in texts, in Proceedings of the Thirteenth European
Meeting on Cybernetics and Systems Research (EMCSR), 1996.
18. Dagan, Ido, Lillian Lee and Fernando Pereira. Similarity-based methods for word
sense disambiguation, in Proceedings of the Annual Meeting of the Association for
Computational Linguistics (ACL), 1997, pp. 56-63.
(Extended version included in journal article 8)
19. Dagan, Ido, Yael Karov and Dan Roth. Mistake-driven learning in text categorization,
in Proceedings of Second Conference on Empirical Methods in Natural Language
Processing (EMNLP-2), 1997.
20. Yamazaki, Takefumi and Ido Dagan. Mistake-driven learning with thesaurus for text
categorization, in Proceedings of the Natural Language Pacific Rim Symposium
(NLPRS-97), 1997.
21. Argamon, Shlomo, Ido Dagan and Yuval Krymolowski. Memory-based learning of
shallow natural language patterns, in Proceedings of the Annual Meeting of the
Association for Computational Linguistics (ACL), 1998.
(Extended version appears in journal article 9)
22. Marx, Zvi, Ido Dagan and Eli Shamir. Detecting Sub-Topic Correspondence through
Bipartite Term Clustering, in Proceedings of the ACL-1999 Workshop on
Unsupervised Learning in Natural Language Processing, 1999, pp. 45-51.
(Extended version included in journal article 11)
23. Krymolowski, Yuval and Ido Dagan. Compositional Memory-Based Partial Parsing,
in Proceedings of the Annual Meeting of the Association for Computational
Linguistics (ACL), 2000, pp. 45-52.
(Extended version appears in book article 5)
24. Marx, Zvika, Ido Dagan and Joachim M. Buhmann. Coupled Clustering: a method for
detecting structural correspondence, in Proceedings of the Eighteenth International
Conference on Machine Learning (ICML), 2001, pp. 353–360.
(Extended version appears in journal article 12)
25. Marx, Zvika, Ido Dagan and Eli Shamir. Cross-component clustering for template
induction, in Proceedings of the ICML Workshop on Text Learning (TextML), 2002,
pp. 66-75.
26. Dagan, Ido, Zvika Marx and Eli Shamir. Cross-dataset clustering: revealing
corresponding Themes Across Multiple Corpora, in Proceedings of the Sixth
Conference on Natural Language Learning (CoNLL), 2002, pp. 15-21.
27. Koppel, Moshe, Navot Akiva and Ido Dagan. A Corpus-Independent Feature Set for
Style Based Text Categorization, in Proceedings of IJCAI'03 Workshop on
Computational Approaches to Style Analysis and Synthesis, Acapulco, Mexico, 2003.
28. Glickman, Oren and Ido Dagan. Identifying Lexical Paraphrases from a Single
Corpus: A Case Study for Verbs, in Proceedings of Recent Advances in Natural
Language Processing (RANLP '03), 2003.
29. Marx Z., Dagan I. and Shamir E. (2004). Identifying structure across pre-partitioned
data. In Thrun S., Saul L., and Schölkopf B. (eds.), Advances in Neural Information
Processing Systems 16 (NIPS 2003), December 8-13, Vancouver, Canada.
30. Ido Dagan and Oren Glickman. 2004. Probabilistic textual entailment: Generic
applied modeling of language variability. In PASCAL Workshop on Learning
Methods for Text Understanding and Mining, Grenoble.
31. Idan Szpektor, Hristo Tanev, Ido Dagan and Bonaventura Coppola. Scaling Web-based
Acquisition of Entailment Relations. Proceedings of Empirical Methods in
Natural Language Processing (EMNLP), 2004.
32. Maayan Geffet and Ido Dagan. Feature Vector Quality and Distributional Similarity.
Proceedings of The 20th International Conference on Computational Linguistics
(COLING), 2004.
33. Maayan Geffet and Ido Dagan. The Distributional Inclusion Hypotheses and Lexical
Entailment. Proceedings of the 43rd Annual Meeting of the Association for
Computational Linguistics (ACL), 2005.
34. Oren Glickman and Ido Dagan. A Probabilistic Setting and Lexical Cooccurrence
Model for Textual Entailment. Proceedings of ACL Workshop on Empirical Modeling
of Semantic Equivalence and Entailment, 2005.
35. Oren Glickman, Ido Dagan and Moshe Koppel. A Probabilistic Classification
Approach for Lexical Textual Entailment. Proceedings of the 20th National
Conference on Artificial Intelligence (AAAI), 2005.
36. Oren Glickman, Ido Dagan and Moshe Koppel. A Probabilistic Lexical Approach to
Textual Entailment. Proceedings of the International Joint Conferences on Artificial
Intelligence (IJCAI), 2005.
37. Alfio Gliozzo, Carlo Strapparava, and Ido Dagan. Investigating Unsupervised
Learning for Text Categorization Bootstrapping. Proceedings of the joint Human
Language Technology Conference and Conference on Empirical Methods in Natural
Language Processing (HLT/EMNLP), 2005.
38. Zvika Marx, Ido Dagan and Eli Shamir. A Generalized Framework for Revealing
Analogous Themes across Related Topics. Proceedings of the joint Human Language
Technology Conference and Conference on Empirical Methods in Natural Language
Processing (HLT/EMNLP), 2005.
39. Ido Dagan, Oren Glickman, Alfio Gliozzo, Efrat Marmorshtein and Carlo
Strapparava. Direct Word Sense Matching for Lexical Substitution. Proceedings of
COLING-ACL 2006, 17-21 Jul 2006, Sydney, Australia.
40. Shachar Mirkin, Ido Dagan and Maayan Geffet. Integrating Pattern-based and
Distributional Similarity Methods for Lexical Entailment Acquisition. Proceedings of
COLING-ACL 2006, 17-21 Jul 2006, Sydney, Australia.
41. Lorenza Romano, Milen Kouylekov, Idan Szpektor, Ido Dagan and Alberto Lavelli.
Investigating a Generic Paraphrase-based Approach for Relation Extraction.
Proceedings of EACL 2006, 5-7 April 2006, Trento, Italy.
42. Oren Glickman, Ido Dagan, Mikaela Keller, Samy Bengio and Walter Daelemans.
Investigating Lexical Substitution Scoring for Subtitle Generation. Proceedings of
CoNLL-X, 8-9 Jun 2006, New York City, USA.
43. Oren Glickman, Ido Dagan and Eyal Shnarch. Lexical Reference: a Semantic
Matching Subtask. Proceedings of EMNLP 2006, 22-23 Jul 2006, Sydney, Australia.
44. Roy Bar-Haim, Ido Dagan, Iddo Greental and Eyal Shnarch. Semantic Inference at
the Lexical-Syntactic Level. Proceedings of the 22nd National Conference on
Artificial Intelligence (AAAI), July 2007, Vancouver, Canada.
45. Idan Szpektor, Ido Dagan, Alon Lavie, Danny Shacham and Shuly Wintner.
Cross Lingual and Semantic Retrieval for Cultural Heritage Appreciation.
Proceedings of the ACL Workshop on Language Technology for Cultural Heritage
Data (LaTeCH), June 2007, Prague, Czech Republic.
46. Idan Szpektor, Eyal Shnarch and Ido Dagan. Instance-based Evaluation of Entailment
Rule Acquisition. Proceedings of the 45th Annual Meeting of the Association for
Computational Linguistics (ACL), June 2007, Prague, Czech Republic.
47. Roy Bar-Haim, Ido Dagan, Iddo Greental, Idan Szpektor and Moshe Friedman.
Semantic Inference at the Lexical-Syntactic Level for Textual Entailment
Recognition. Proceedings of the ACL-PASCAL Workshop on Textual Entailment and
Paraphrasing, June 2007, Prague, Czech Republic.
48. Idan Szpektor and Ido Dagan. Learning Canonical Forms of Entailment Rules.
Proceedings of the International Conference Recent Advances in Natural Language
Processing (RANLP), September 2007, Bulgaria.
49. Idan Szpektor, Ido Dagan, Roy Bar-Haim and Jacob Goldberger. Contextual
Preferences. Proceedings of the 46th Annual Meeting of the Association for
Computational Linguistics (ACL), June 2008, Columbus, Ohio.
50. Idan Szpektor and Ido Dagan. Learning Entailment Rules for Unary Templates.
Proceedings of COLING 2008, August 2008, Manchester.
51. Shachar Mirkin, Ido Dagan, Eyal Shnarch. Evaluating the Inferential Utility of
Lexical-Semantic Resources. Proceedings of EACL, 2009.
52. Libby Barak, Ido Dagan and Eyal Shnarch. Text Categorization from Category Name
via Lexical Reference. Proceedings of NAACL, 2009.
53. Eyal Shnarch, Libby Barak and Ido Dagan. Extracting Lexical Reference Rules from
Wikipedia. Proceedings of the 47th Annual Meeting of the Association for
Computational Linguistics (ACL), August 2009, Singapore.
54. Shachar Mirkin, Lucia Specia, Nicola Cancedda, Ido Dagan, Marc Dymetman and
Idan Szpektor. Source-Language Entailment Modeling for Translating Unknown
Terms. Proceedings of the 47th Annual Meeting of the Association for Computational
Linguistics (ACL), August 2009, Singapore.
55. Lili Kotlerman, Ido Dagan, Idan Szpektor and Maayan Zhitomirsky-Geffet.
Directional distributional similarity for lexical expansion. Proceedings of the 47th
Annual Meeting of the Association for Computational Linguistics (ACL), August
2009, Singapore.
56. Roy Bar-Haim, Jonathan Berant and Ido Dagan. A compact forest for scalable
inference over entailment and paraphrase rules. Proceedings of the EMNLP
conference, 2009.
57. Idan Szpektor and Ido Dagan. Augmenting Wordnet-based inference with argument
mapping. Proceedings of the ACL-2009 Workshop on Applied Textual Inference
(TextInfer), 2009.
58. Luisa Bentivogli, Ido Dagan, Hoa Trang Dang, Danilo Giampiccolo, Medea Lo
Leggio and Bernardo Magnini. Considering Discourse References in Textual
Entailment Annotation. Proceedings of the 5th International Conference on
Generative Approaches to the Lexicon (GL2009), 2009.
Other (non-refereed) Publications
1. Ido Dagan, Oren Glickman and Bernardo Magnini. 2005. The PASCAL Recognising
Textual Entailment Challenge. Proceedings of the PASCAL Recognising Textual
Entailment Challenge, April 2005.
2. Oren Glickman, Ido Dagan and Moshe Koppel. Web Based Probabilistic Textual
Entailment. Proceedings of the PASCAL Recognising Textual Entailment Challenge,
April 2005.
3. Roy Bar Haim, Ido Dagan, Bill Dolan, Lisa Ferro, Danilo Giampiccolo, Bernardo
Magnini and Idan Szpektor. The Second PASCAL Recognising Textual Entailment
Challenge. Proceedings of The Second PASCAL Recognising Textual Entailment
Challenge, 10 April 2006, Venice, Italy.
4. Danilo Giampiccolo; Bernardo Magnini; Ido Dagan; Bill Dolan. The Third PASCAL
Recognizing Textual Entailment Challenge. Proceedings of the ACL-PASCAL
Workshop on Textual Entailment and Paraphrasing, June 2007, Prague, Czech
Republic.
5. Ido Dagan, Roy Bar-Haim, Idan Szpektor, Iddo Greental and Eyal Shnarch. Natural
Language as the Basis for Meaning Representation and Inference. In: A. Gelbukh
(Ed.) Computational Linguistics and Intelligent Text Processing, Lecture Notes in
Computer Science 4919: 151-170, Springer, 2008.
6. Roy Bar-Haim, Jonathan Berant, Ido Dagan, Iddo Greental, Shachar Mirkin, Eyal
Shnarch and Idan Szpektor. Efficient Semantic Deduction and Approximate
Matching over Compact Parse Forests. In Proceedings of Text Analysis Conference
(TAC), 2008.
Patents
1. Glossary construction tool, with co-inventor Kenneth Church, at AT&T. U.S. Patent
No. 5,850,561, filed September 23, 1994, issued December 15, 1998.
Co-inventor of the following three patent applications at FocusEngine, in the areas of text
categorization and its combination with other information access methods:
2. U.S. Patent Application No. 09/512,252, filed February 24, 2000
3. U.S. Patent Application No. 09/690,307, filed October 17, 2000
4. U.S. Patent Application No. 60/275,839, filed March 14, 2001
Awards
• IBM Faculty Award, 2007 (with $20,000 donation).
• Wolf Foundation Krill Prize, 2008.
International Professional Activities
Boards and Committees
1. Nominated as candidate and elected as Vice-President Elect of the Association for
Computational Linguistics (ACL), starting 2008; will become the ACL President for
2010 (VP in 2009).
2. Advisory Board of the European Chapter of the Association for Computational
Linguistics (EACL), 2004-2007.
3. Board of SIGNLL - the Association for Computational Linguistics Special Interest
Group for Natural Language Learning.
4. Board of SIGDAT – the Association for Computational Linguistics Special Interest
Group for Linguistic Data and Corpus-Based Approaches to NLP.
5. Advisory Committee for the Text Analysis Conference (TAC) evaluation campaign
of the U.S. National Institute of Standards and Technology (NIST), starting 2008.
6. Chairman of the steering committee of the Knowledge Center for Processing Hebrew,
2004-2006.
Journal Editorial Boards
1. Computational Linguistics (CL) journal, 1995 – 1997.
2. Machine Translation (MT) journal, 1999 – present.
3. Natural Language Engineering (NLE) journal, 2005 – present.
Program Committee Chairing and Event Organization
1. Program co-chair of the Fourth ACL SIGDAT International Workshop on Very Large
Corpora (WVLC), 1996.
2. Area chair for the Conference on Empirical Methods in Natural Language Processing
(EMNLP), 2004.
3. Organizing the Israeli Seminar on Computational Linguistics (ISCOL), 2004.
4. Program co-chair of the Ninth Conference on Computational Natural Language
Learning (CONLL), 2005.
5. Organizer and program co-chair of the 1st, 2nd and 3rd PASCAL Recognizing Textual
Entailment Challenges and Workshops, 2004-2007 (3rd workshop as the PASCAL-2007
ACL Workshop on Textual Entailment and Paraphrasing).
6. Program co-chair and co-organizer of the ACL Workshop on Empirical Modeling of
Semantic Equivalence and Entailment, 2005.
7. Scientific committee of the Text Analysis Conference (TAC) 2008, of the U.S.
National Institute of Standards and Technology (NIST).
Program Committees and Conference Reviewing
1. ANLP 1994, ACL Conference on Applied Natural Language Processing.
2. ACL 1995, Annual Meeting of the Association for Computational Linguistics.
3. BISFAI 1995, Fourth Bar-Ilan Symposium on Foundations of Artificial Intelligence.
4. TMI 1997, International Conference on Theoretical and Methodological Issues in
Machine Translation.
5. WVLC 1997, Fifth ACL SIGDAT International Workshop on Very Large Corpora.
6. BISFAI 1997, Fifth Bar-Ilan Symposium on Foundations of Artificial Intelligence.
7. AAAI 1998, The Fifteenth National Conference on Artificial Intelligence (NLP track).
8. COLING/ACL 1998, Joint conference for COLING and the Annual Meeting of the
Association for Computational Linguistics.
9. COLING/ACL 1998 Student Session.
10. Computerm 1998, International Workshop on Computational Terminology.
11. ANLP 1999, ACL Conference on Applied Natural Language Processing.
12. ACL 2000, Annual Meeting of the Association for Computational Linguistics.
13. ANLP 2000, ACL Conference on Applied Natural Language Processing.
14. NAACL 2000, North-American Chapter of the ACL.
15. ACL 2001, Annual Meeting of the Association for Computational Linguistics.
16. NAACL 2001, North-American Chapter of the ACL.
17. ACL 2002, Annual Meeting of the Association for Computational Linguistics.
18. EMNLP 2002, Empirical Methods in Natural Language Processing.
19. COLING 2002.
20. Computerm 2002, International Workshop on Computational Terminology.
21. ACL 2003, Annual Meeting of the Association for Computational Linguistics (Area:
Machine Learning for Natural Language).
22. Machine Translation (MT) Summit conference, 2003.
23. ACL 2003 Workshop on Multi-Word Expressions.
24. RANLP 2003, Recent Advances in Natural Language Processing.
25. ACL 2004, Annual Meeting of the Association for Computational Linguistics.
26. CONLL 2004, Computational Natural Language Learning.
27. ECAI-2004 workshop on Ontology Learning and Population from Text.
28. IJCAI 2005, International Joint Conferences on Artificial Intelligence.
29. BISFAI 2005, Eighth Biennial Israeli Symposium on the Foundations of Artificial
Intelligence.
30. RANLP 2005, Recent Advances in Natural Language Processing.
31. AAAI 2006, National Conference on Artificial Intelligence.
32. EACL 2006 (European Chapter of the ACL)
33. ICOS 2006 (Conference on Computational Semantics)
34. HLT-NAACL 2006 (Human Language Technology and North American Chapter of the
ACL)
35. ACL 2006
36. ACL 2006 Workshop on Linguistic Distances
37. OLP 2006 (ACL Workshop on Ontology Learning and Population)
38. IJCAI 2007 (International Joint Conferences on Artificial Intelligence)
39. AAAI 2007 (National Conference on Artificial Intelligence)
40. ACL 2007
41. ACL 2008
42. AAAI 2008
43. COLING 2008
44. EACL 2009
Additional Reviewing
1. Reviewing NLP article submissions for the following journals:
a. Computational Linguistics
b. Machine Translation
c. Artificial Intelligence
d. Journal of Artificial Intelligence Research
e. Machine Learning
f. Natural Language Engineering
g. Annals of Mathematics and Artificial Intelligence
h. Information Processing and Management.
i. Computer Speech and Language
2. Reviewing research grant proposals in the NLP area for:
a. Israel Science Foundation (ISF)
b. US-Israel Binational Science Foundation (BSF)
c. German-Israeli Foundation for Scientific Research and Development (GIF)
d. Israel Ministry of Science.
Invited Talks and Panels at International Events
1. Invited panelist at the meeting of the European Expert Advisory Group on Language
Engineering Standards (EAGLES), Madrid, 1997. Topic: Bilingual alignment and lexicon
acquisition.
2. Invited talk at the SPARKLE (Shallow PARsing and Knowledge Extraction for Language
Engineering) European project review, Pisa, 1998. Topic: Automatic thesaurus
construction.
3. Invited talk at the Bolzano (Italy) Workshop on Corpus-based Terminology, 1998. Topic:
Automated corpus-based acquisition of bilingual terminology.
4. Invited talk at TALN, Annual Meeting of the French Natural Language Processing
Association, France, 1999. Topic: Vector models in language processing.
5. Invited talk at the TELRI European Seminar (Trans-European Language Resources
Infrastructure), 1999. Topic: Automatic acquisition of multi-lingual resources.
6. Invited panelist at the ACL SensEval-3 Workshop, Barcelona, 2004. Topic: Recognizing
Lexical Entailment: A Sense Matching Task.
7. Invited panelist at the MEANING 2005 workshop, Trento, 2005. Topic: What do
applications need?
Implicit Sense Matching (Lexical Entailment).
8. Invited Keynote Speaker at the International Conference on Recent Advances in Natural
Language Processing (RANLP), Bulgaria, 2005. Topic: Textual entailment and applied
semantic inference.
9. Invited Lecture at the ACL 2006 Workshop on Linguistic Distances, Sydney. Topic:
Semantic Similarity: What for?
10. Invited Keynote Speaker at the 2007 AAAI Spring Symposium on Machine Reading,
Stanford University, Palo Alto. Topic: Textual Entailment
11. Invited panelist at the ACL SemEval workshop, Prague, 2007.
12. Invited talk at the CICLING 2008 conference.
Topic: Natural Language as the Basis for Meaning Representation and Inference.
13. Invited speaker at the joint STEVIN-IMIX conference, Utrecht, June 2008.
Topic: What should we do next? Applied semantic engine.
14. Invited speaker at the Symposium on Semantic Knowledge: Discovery, Organization and
Use. New York University, New York, November 2008. Topic: It’s time for a semantic
engine.
15. Invited speaker at the 8th International Conference on Computational Semantics (IWCS-8),
Tilburg, January 2009.
Invited Summer School Courses
1. Statistical Methods for Natural Language Processing.
ESSLLI – European Summer School on Language, Logic and Information, Lisbon,
Portugal, 1993.
2. Statistical Machine Translation.
ELSNET European Summer School on Language and Speech Communication: Corpus-Based
Methods, Utrecht, Holland, 1994.
3. Multilingual Corpus Processing.
ELSNET European Summer School on Language and Speech Communication:
Multilinguality in Speech and Language Processing, Edinburgh, Scotland, 1995.
4. Lexical Statistical Methods for Natural Language Processing.
ESSLLI – European Summer School on Language, Logic and Information, Saarbrucken,
Germany, 1998.
5. Text Mining.
ELSNET European Summer School on Language and Speech Communication: Text and
Speech Triggered Information Access, Xios, Greece, 2000
Conference Tutorials
1. Bilingual Word Alignment and Lexicon Construction.
At the Annual Meeting of the Association for Computational Linguistics (ACL), 1996.
2. Bilingual Word Alignment and Lexicon Construction.
At the International Conference on Computational Linguistics (COLING), 1996.
3. Lexical Statistical Methods for Natural Language Processing.
At the joint COLING-ACL conference, 1998.
4. Learning in NLP: When can we reduce or avoid annotation cost?
At the International Conference on Recent Advances in Natural Language Processing
(RANLP), 2003.
5. Textual Entailment.
At ACL 2007.
Research Grants
1. Ido Dagan and Alon Itai. Statistical Methods for Disambiguation in Natural
Languages. Grant number 120-741 of the Israel Council for Research and
Development, 1988-1992.
2. Ido Dagan, Yaacov Choueka and Sean Engleson. Alignment of parallel bilingual
texts: Handling disparate languages and rich morphology and applying local syntactic
constraints. Grant number 488/95-1 of the Israel Science Foundation (ISF),
1995-1998.
3. Yaacov Choueka, Ido Dagan, Tomi Klein, Ariel Frank and Michael Elhadad. Taming
the information highway: Infrastructures and prototypes for intelligent textual
information handling, with special attention to Hebrew. Grant of the Israeli Ministry
of Science and the Arts for Scientific and Technological Infrastructure, 1995-1998.
4. Ronen Feldman, Ido Dagan, Willy Kloewsgen and Stefan Wrobel. Generic
environment and high-level language for knowledge discovery in texts. Grant of the
German-Israeli Foundation for Scientific Research and Development (GIF),
1996-1999.
5. Ido Dagan, Ronen Feldman, Beatrice Daille, Yves Kodratof. Term level text mining:
representations and algorithms. Grant of AFIRST – French-Israeli Scientific
Cooperation, 1996-1998.
6. Ido Dagan. Similarity and analogy in structured textual information processing. Grant
of the Israel Science Foundation (ISF), 1998-2001.
7. Ido Dagan. PASCAL European Network of Excellence on Pattern Analysis,
Statistical Modelling and Computational Learning, funded by the European
Commission. 2003-2007. 130,000 Euro (multiple activities).
8. Shuly Wintner and Ido Dagan. Cross-lingual Information Retrieval. Funded by the
Israeli Chapter of the Internet Society. 2004-2007. $27,500 for Bar Ilan.
9. Ido Dagan. Propositional Retrieval: Bridging Applied Semantic Language Processing
and Information Retrieval. Funded by Israel Science Foundation (ISF). 2004-2007.
420,000 NIS.
10. Ido Dagan. Automatic identification of meaning equivalence and entailment for
semantic information retrieval.
Funded by the Israel Ministry of Defense. 2006-2008. 630,000 NIS.
11. Ido Dagan. Ontology Learning and Meta-Tagging.
Funded by the Israel Ministry of Industry and Trade. 2006-2009. 990,000 NIS.
12. Ido Dagan. Natural Language Processing for Cultural Heritage Information.
Funded by the Bruno-Kessler Foundation (Italy), 2007-2010, 80,000 Euro.
13. Ido Dagan. PASCAL-2: 2nd European Network of Excellence on Pattern Analysis,
Statistical Modelling and Computational Learning, funded by the European
Commission. 2008-2013. 20,000 Euro for first year.
14. Ido Dagan. Validating intelligence information and unsupervised information
extraction. Funded by the Israel Ministry of Defense. 2009. 200,000 NIS.
15. Ido Dagan. Textual Entailment Inference for On-Demand Information Extraction.
Funded by Israel Science Foundation (ISF). 2008-2012. 740,000 NIS.
Supervising Graduate Students
M.Sc. Thesis
1. Shlomit Hazan (with Dr. Ronen Feldman): Discovery and clustering of association
rules in large data bases. 1997.
2. Erez Lotan: Automatic construction of a statistical thesaurus. 1998.
3. Alex Avramovitch: An Internet Crawler for automatic corpus and thesaurus
construction. 1998.
4. Shelly Katz (with Dr. Ariel Frank): Intelligent information filtering within
information harvesting in the Internet. 1998.
5. Roman Mitnitsky: A personal search agent for Internet users. 1998.
6. Michal Finkelstein-Landau: Term-based summarization and knowledge discovery in
texts. 1999.
7. Marina Risher: Automatic query generation. 2001.
8. Ehud Conley: Seq_align: A parsing-independent bilingual sequence alignment
algorithm. 2002.
9. Idan Szpektor (with Prof. Yossi Matias). Scaling Web Based Acquisition of
Entailment Relations. 2005.
10. Shachar Mirkin (with Dr. Ari Rappoport). Pattern-based acquisition of lexical
entailment relations. 2006.
11. Moshe Friedman (with Prof. Moshe Koppel). A Machine Learning Approach for
Coreference Resolution in Probabilistic Textual Entailment. 2006.
12. Tal Itzhak Ron. Lexical Acquisition of Nominalization Entailment Rules Using
Online Lexical Resources. 2006.
13. Efrat Hershkovitz. Implicit word sense disambiguation via context-sensitive lexical
entailment. 2006.
14. Eyal Shnarch. Lexical entailment and its extraction from Wikipedia. 2008.
15. Libby Berkovitch. Keyword based text categorization. 2008.
16. Ephi Sachs. Semantic aspects of information retrieval. 2008.
17. Lili Kotlerman. Directional distributional similarity for lexical expansion. 2009.
18. Chen Erez. Using semantic knowledge for co-reference resolution. 2009.
19. Chaya Liebeskind. Text categorization for large multi-class taxonomy. 2009.
20. Hadas Zohar.
21. Roni Ben Aharon.
22. Naomi Frankel.
Ph.D. Thesis
1. Zvika Marx (with Prof. Eli Shamir): Structure Based Computational Aspects of
Similarity and Analogy in Natural Language. 2005.
2. Yuval Krymolowski (with Prof. Amihood Amir): Partial Parsing using
Memory-Based Sequence Learning. 2006.
3. Oren Glickman (with Prof. Moshe Koppel): Generic Shallow Semantic Inference
based on Automatic Knowledge Acquisition. 2006.
4. Maayan Gefet (with Dr. Dror Feitelson): Automatic construction of ontology from
text. (2006).
5. Roy Bar Haim. Probabilistic Lexical-Syntactic Inference for Textual Entailment
6. Idan Szpektor. Automatic Acquisition of Lexical-Syntactic Entailment Rules.
7. Shachar Mirkin. Lexical Entailment Inference and its application to Semantic
Information Retrieval
8. Eyal Shnarch. Learning lexical entailments for semantic inference. (PhD proposal
2009).
9. Jonathan Berant.
10. Lili Kotlerman.
11. Asher Stern.