Dr James Curran
BScAdv(Hons) Sydney, PhD Edinburgh
Associate Professor and ARC Australian Research Fellow
Schwa Lab
SIT Building J12, Room 449
(Cnr Cleveland St and City Rd)
P: +61 2 9036 6037
F: +61 2 9351 3838
E: james DOT r DOT curran AT sydney DOT edu DOT au
URL: http://sydney.edu.au/it/~james
Research interests
My research is in computational linguistics, focusing on robust statistical approaches to
broad-coverage large-scale natural language processing (NLP). My interests range from the
design of fundamental NLP components, including text processing and tagging tools,
through to statistical parsers and high-level systems for financial modelling using text,
question answering and information extraction.
My background is in computer science. Computational linguistics poses challenges that
require both algorithmic and implementation techniques of interest to computer scientists.
Meanwhile, computational linguists are developing complex formalisms and statistical
models that enable increasingly detailed linguistic analyses. Unfortunately, greater fidelity
usually brings significant efficiency penalties. Providing even a superficial analysis of the
rapidly growing volume of text now available is a prodigious task.
I am excited by the challenges of developing large-scale and robust deep-linguistic
processing techniques that are feasible for tera-scale datasets. I believe that statistical
parsing with lexicalised grammar formalisms, e.g. Combinatory Categorial Grammar (CCG),
and supertagging, provides the best trade-off between linguistic fidelity and efficiency.
Efficient, accurate parsing will enable us to create and exploit unprecedented quantities of
automatically analysed text using semi-supervised knowledge acquisition. This will be crucial
to overcoming the knowledge bottleneck that hampers real-world applications of NLP.
Selected publications
The following is a selection of publications (approximately 10).
J Nothman, N Ringland, W Radford, T Murphy, and J R Curran (2012). Learning multilingual
named entity recognition from Wikipedia. Artificial Intelligence, Elsevier.
B Hachey, W Radford, J Nothman, M Honnibal, and J R Curran (2012). Evaluating Entity
Linking with Wikipedia. Artificial Intelligence, Elsevier.
D Vadas and J R Curran (2011). Parsing noun phrases in the Penn Treebank.
Computational Linguistics, 37(4). MIT Press.
S Clark and J R Curran (2007). Wide-Coverage Efficient Statistical Parsing with CCG and
Log-Linear Models. Computational Linguistics, 33(4):493–552.
J R Curran, T Murphy, and B Scholz (2007). Minimising semantic drift with Mutual Exclusion
Bootstrapping. In Proc. of the Conference of the Pacific Association for Computational
Linguistics, pp 172–180. Melbourne, Australia. Best Paper Award.
J Gorman and J R Curran (2006). Scaling Distributional Similarity to Large Corpora. In Proc.
of the 21st International Conference on Computational Linguistics and 44th Annual Meeting
of the Association for Computational Linguistics, pp 361–368, Sydney, Australia.
S Clark and J R Curran (2004). Parsing the WSJ using CCG and Log-Linear Models. In
Proc. of the 42nd Annual Meeting of the Association for Computational Linguistics
(ACL), pp 104–111, Barcelona, Spain.
S Clark and J R Curran (2004). The Importance of Supertagging for Wide-Coverage CCG
Parsing. In Proc. of the 20th International Conference on Computational Linguistics
(COLING), pp 282–288, Geneva, Switzerland.
J R Curran and S Clark (2003). Investigating GIS and Smoothing for Maximum Entropy
Taggers. In Proc. of the 11th Conference of the European Chapter of the Association
for Computational Linguistics (EACL), pp 91–98, Budapest, Hungary.
J R Curran and S Clark (2003). Language Independent NER using a Maximum Entropy
Tagger. In Proc. of the 7th Conference on Natural Language Learning (CoNLL), pp
164–167, Edmonton, Canada.
Teaching interests
Introductory and advanced programming, data structures, algorithms, software
engineering, artificial intelligence, machine learning, computational linguistics.
Courses taught:
 INFO1903: Informatics (Advanced)
 ENGG1801: Engineering Computing
 COMP5046: Statistical Natural Language Processing