Spelling Checkers for Africa - North

Private Bag x6001 Potchefstroom

Suid-Afrika 2520

Tel (018) 299 1541

Fax (018) 299 1571

Web: www.ctext.co.za

CTexT …1 of 5

PRESS RELEASE

MEDIAVERKLARING

Spelling Checkers for Africa

FOR IMMEDIATE RELEASE/ VIR ONMIDDELIKE VRYSTELLING

2008-03-17

UITGEREIK DEUR/ ISSUED BY:

Moira Müller

Kommunikasiebeampte

Communications Officer

CTexT

SPELLING CHECKERS FOR AFRICA

(Afrikaans follows English)

Potchefstroom – The Centre for Text Technology (CTexT™) on the Potchefstroom Campus of the North-West University is completing the development of further lexical data for African languages as part of Microsoft’s Local Language Program. Lexical data consists of words from a specific language enriched with added information, such as the part-of-speech category each word belongs to.

The aim is to produce lexical data (which can then be used for a variety of developments, such as spelling checkers) for five regional languages in Africa, namely Hausa, Igbo, Kinyarwanda, Wolof and Yoruba. These languages are used in countries such as Nigeria, Senegal, Equatorial Guinea, Burkina Faso, Togo, Niger, Ghana, Rwanda and Mauritania.

The initiative aims to give individuals in Africa access to desktop computer software in their native language, offering an entry point to technology that is familiar and that respects their linguistic and cultural uniqueness.

The project has seen close collaboration between computational linguists from CTexT, linguists from Nigeria and the USA, and the Proofing Tools Team from Microsoft in Ireland. CTexT expects the development of the data to be completed within the next month.

“Microsoft has an ongoing commitment to making computer software relevant for the people and countries of Africa,” said Dr. Cheick Modibo Diarra, the chairman for Africa at Microsoft.

“Our Local Language Programme is one of the ways we broaden our reach and empower our customers and partners. It brings the development, growth and proliferation of regional languages together with advances in information technology in a way that is complementary and vital for economic development.”

CTexT has facilitated this empowerment of languages before through the development of spelling checkers for five South African languages: Afrikaans, isiXhosa, isiZulu, Sesotho sa Leboa and Setswana.

It is liberating for an individual and a nation to express themselves freely. With the development of language software programs and spelling checkers in native languages, CTexT is not only sustaining Human Language Technology in Africa, but also fighting the battle to ensure that language rights are upheld.

END/

Contact details for more information:

Moira Müller | (018) 299 1541 | Moira.Muller@nwu.ac.za | www.ctext.co.za

Fact sheet

Wolof is a language spoken in Senegal, the Gambia, and Mauritania, and it is the native language of the ethnic group of the Wolof people. It belongs to the Atlantic branch of the Niger-Congo language family.

Unlike many other African languages, Wolof is not a tonal language.

Reference: http://en.wikipedia.org/wiki/Wolof_language

Igbo is spoken by around 18 million people (1999 WA), the Igbo, mainly in the south-eastern region of Nigeria once identified as Biafra. The language was used by John Goldsmith as an example to justify deviating from the classical linear model of phonology as laid out in The Sound Pattern of English. It is written in the Roman script. Igbo is a tonal language, like Yoruba and Chinese.

Reference: http://en.wikipedia.org/wiki/Igbo_language

Hausa is the Chadic language with the largest number of speakers, spoken as a first language by about 24 million people, and as a second language by about 15 million more.

Native speakers of Hausa, the Hausa people, are found mostly in Niger and in the north of Nigeria, but the language is widely used as a lingua franca (similar to Swahili in East Africa) across a much larger swathe of West Africa, particularly amongst Muslims.

Reference: http://en.wikipedia.org/wiki/Hausa_language

Yoruba (native name ede Yorùbá, 'the Yoruba language') is a dialect continuum of West Africa with over 22 million speakers. The native tongue of the Yoruba people, it is spoken, among other languages, in Nigeria, Benin, and Togo, and traces of it are found among communities in Brazil, Sierra Leone (where it is called Oku), and Cuba (where it is called Nago).

Reference: http://en.wikipedia.org/wiki/Yoruba_language

Kinyarwanda is the chief spoken language in Rwanda. It is also spoken in the east of D.R. Congo and in the south of Uganda (Bufumbira-area). Kinyarwanda is a tonal language of the Bantu language family (Guthrie D61). Kinyarwanda is closely related to Kirundi, spoken in the neighboring country, Burundi, and to Giha of western Tanzania.

Reference: http://en.wikipedia.org/wiki/Kinyarwanda

SPELTOETSERS VIR AFRIKA

Potchefstroom – Die Sentrum vir Tekstegnologie (CTexT™) op die Potchefstroomkampus van die Noordwes-Universiteit ontwikkel tans leksikale data vir Afrikatale as deel van Microsoft se “Local Language Programme”.

Hierdie leksikale data bestaan uit woorde van ’n spesifieke taal wat met addisionele inligting verryk is. So byvoorbeeld maak die woordsoortkategorie waaraan elke woord behoort sowel as woordvormingsinformasie deel uit van dié leksikale data.

Die doel van die projek is om data in te samel vir vyf streekstale in Afrika, naamlik Hausa, Igbo, Kinyarwanda, Wolof en Yoruba. Die data kan dan gebruik word in die ontwikkeling van ’n verskeidenheid taaltegnologiese toepassings, soos speltoetsers en morfologiese analiseerders. Die tale wat hier ter sprake is, word gepraat in onder andere Nigerië, Senegal, Ekwatoriaal-Guinee, Burkina Faso, Togo, Ghana, Rwanda en Mauritanië.

Hierdie inisiatief het ten doel om aan individue in Afrika toegang te gee tot rekenaarsagteware in hulle moedertaal.

Hierdeur kry mense blootstelling aan tegnologie op ʼn manier wat hulle taalkundige en kulturele uniekheid respekteer.

Die projek het noue samewerking tussen rekenaarlinguiste en taalkundiges van CTexT, Nigerië en die VSA asook die Skryfhulpmiddelspan van Microsoft in Ierland genoodsaak. Die ontwikkeling van die data sal na verwagting binne die volgende maand voltooi word.

“Microsoft het ʼn voortdurende verbintenis tot die lokalisering van sagteware om dit sodoende ook relevant te maak vir die mense en die lande in Afrika,” het Dr. Cheick Modibo Diarra, die voorsitter van Afrika by Microsoft, gesê.

“Ons ‘Local Language Programme’ is een van die maniere waarop ons ons reikwydte kan verbreed en ons kliënte en vennote kan bemagtig. Dit bring die ontwikkeling en groei van plaaslike tale en vooruitgang in tegnologie byeen op ’n manier wat komplementêr en noodsaaklik is vir ekonomiese ontwikkeling.”

CTexT het al voorheen meegehelp aan die bemagtiging van plaaslike tale deur speltoetsers vir vyf Suid-Afrikaanse tale, Afrikaans, isiXhosa, isiZulu, Sesotho sa Leboa en Setswana, te ontwikkel.

Dit is bevrydend vir individue en nasies om hulself vryelik te kan uitspreek. Met die ontwikkeling van taalsagtewareprogramme en speltoetsers in moedertale, onderhou CTexT nie net Mensetaaltegnologie in Afrika nie, maar bevorder hulle ook taalregte.

EINDE/

Kontakbesonderhede vir meer inligting:

Moira Müller | (018) 299 1541 | Moira.Muller@nwu.ac.za | www.ctext.co.za

Feiteblad

Wolof word gepraat in Senegal, Gambië en Mauritanië. Dit is die moedertaal van die etniese Wolof-groep. Dit behoort aan die Atlantiese tak van die Niger-Kongo-taalfamilie. Anders as baie ander Afrikatale is Wolof nie ’n tonale taal nie.

Bron: http://en.wikipedia.org/wiki/Wolof_language

Igbo is ’n taal wat deur ongeveer 18 miljoen mense in Nigerië gepraat word. Dit is veral die moedertaal van die Igbo in die suidoostelike omgewing, voorheen bekend as Biafra. Igbo is ’n tonale taal, soos Yoruba en Sjinees.

Bron: http://en.wikipedia.org/wiki/Igbo_language

Hausa is die Tsjadiese taal met die grootste aantal sprekers en word as eerstetaal gebruik deur ongeveer 24 miljoen mense en deur nog ongeveer 15 miljoen mense as tweedetaal.

Moedertaalsprekers van Hausa word meestal aangetref in die Afrikaland Niger en in die noorde van Nigerië, hoewel die taal veral onder Moslems wyd gebruik word as lingua franca (soortgelyk aan Swahili in Oos-Afrika) in ’n baie wyer gedeelte van Wes-Afrika.

Bron: http://en.wikipedia.org/wiki/Hausa_language

Yoruba is ’n dialekkontinuum van Wes-Afrika met meer as 22 miljoen sprekers. As die moedertaal van die Yorubamense word dit, saam met ander tale, gepraat in Nigerië, Benin en Togo, en spore daarvan kan gevind word onder gemeenskappe in Brasilië, Sierra Leone (waar dit Oku genoem word) en Kuba (waar dit Nago genoem word).

Bron: http://en.wikipedia.org/wiki/Yoruba_language

Kinyarwanda is die mees gesproke taal in Rwanda. Dit word ook gepraat in die ooste van die D.R. Kongo en in die suide van Uganda (Bufumbira-area). Kinyarwanda is ’n tonale taal en is deel van die Bantutaalfamilie. Kinyarwanda is nou verwant aan Kirundi, wat in die buurland, Burundi, gepraat word, en ook aan Giha van westelike Tanzanië.

Bron: http://en.wikipedia.org/wiki/Kinyarwanda
