Turbo Annotate - North

Core Technologies
Name
TurboAnnotate
Developer(s)
GB van Huyssteen, MJ Puttkammer, M Schlemmer
Affiliation(s)
Centre for Text Technology (CTexT), North-West University, Potchefstroom, South
Africa
Description
TurboAnnotate is a user-friendly annotating environment (i.e. tool) for bootstrapping
linguistic data for machine-learning purposes, or for manually creating gold standards
or other annotated lists.
This first version of TurboAnnotate was developed with the specific task of
hyphenation for South African languages in mind.
In the annotation GUI, the annotator simply drags the mouse over the part of the word
to be annotated, and on release of the mouse button, the selection changes colour.
The machine-learning system used in TurboAnnotate is the well-known Tilburg
Memory-Based Learner (TiMBL; Daelemans et al., 2004).
Van Huyssteen & Puttkammer (2007) report that TurboAnnotate could not only
ensure higher accuracy in human annotations, but could also reduce the human effort
required (at least in the case of Afrikaans).
Work on TurboAnnotate continues.
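As an illustration of how hyphenation data of this kind is commonly prepared for a memory-based learner, the following Python sketch turns a hyphen-annotated word into one instance per letter boundary. It is not taken from TurboAnnotate itself (which is written in Perl); the function name hyphenation_instances, the window size, the padding symbol and the 0/1 class labels are all assumptions made for this example. Each instance is written as a comma-separated line with the class label last.

```python
def hyphenation_instances(annotated, context=3, pad="_"):
    """Turn a word annotated with '-' at its hyphenation points into one
    instance per boundary between two letters.

    Hypothetical helper for illustration only; the feature layout is an
    assumption, not TurboAnnotate's documented format.
    """
    letters = []
    breaks = set()
    for ch in annotated:
        if ch == "-":
            # A hyphenation point sits just before the next letter.
            breaks.add(len(letters))
        else:
            letters.append(ch)

    # Pad both ends so every boundary has a full window of context.
    padded = [pad] * context + letters + [pad] * context

    instances = []
    for i in range(1, len(letters)):
        # Boundary i lies between letters[i - 1] and letters[i].
        left = padded[i : i + context]
        right = padded[i + context : i + 2 * context]
        label = "1" if i in breaks else "0"
        instances.append(",".join(left + right + [label]))
    return instances


if __name__ == "__main__":
    for line in hyphenation_instances("ta-fel", context=2):
        print(line)
```

In a bootstrapping setting, instances like these would be produced from the words an annotator has already corrected, and a classifier trained on them would propose hyphenation points for the next batch of words.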
Category(ies)
Morphological Analysis to Annotation
Language(s): In
Languages using the character set of the Latin alphabet
Language(s): Out
Languages using the character set of the Latin alphabet
Distribution
Online
Documentation
Van Huyssteen, GB & Puttkammer, MJ. Accelerating the Annotation of Lexical
Data for Less-Resourced Languages. Proceedings of Interspeech 2007 (Eurospeech),
10th European Conference on Speech Communication and Technology. Antwerp,
Belgium, August 27-31, 2007.
Daelemans, W, Van den Bosch, A, Zavrel, J & Van der Sloot, A. "TiMBL:
Tilburg Memory Based Learner, Version 5.1, Reference Guide", ILK Technical
Report, February 4, 2004.
Operating System(s)
Linux
Programming Language
Perl
Execution Location
Local
Required Software
TiMBL 5.02, Perl
Pricing: Academic
n/a
Pricing: Multiple Users
n/a
Pricing: Commercial
n/a
Licence
Open Source (GPL)
Contact Person
MJ Puttkammer: Martin.Puttkammer@nwu.ac.za
Other Information
To acquire the source code, please send an e-mail to Martin Puttkammer
(Martin.Puttkammer@nwu.ac.za)