Tuhinga Māhorahora: a corpus of Māori writing by children

Jeanette King, University of Canterbury
Christine Brown, Resource Teacher of Māori
Mary Boyce, University of Canterbury

Aim
• to support kaiako in MME (Rau 2005)
• previous pilot by Christine Brown
• analysing words used by children in their in-class writing
• gave valuable feedback to kaiako which improved teaching practice
• to develop practices and protocols to analyse written texts

Tuhinga Māhorahora
• Self-directed writing by tamariki in Māori medium education (MME) classrooms
• In 2013 the writing of 70 children in years 1-8 of MME was photographed
• Over 1,200 pieces of writing
• ≈ 60,000 word tokens
• ≈ 2,100 word types

Method
• With funding from NZILBB, these pieces of writing have been transcribed, tagged and entered into a database
• Transcription: initially using text editors, but now with the oXygen XML editor (TEI)
• Text anonymised: any name which could identify tamariki or school replaced
• Text tagged: mostly to mark spelling corrections, but also to tag names
• Participant and transcript information also added to the TEI header
• Uploaded to NZILBB's LaBB-CAT online corpus analysis tool

Analysis
• Search and export options in LaBB-CAT

WordSmith
• To calculate frequency lists

Range
• To analyse use of words in relation to Brown's wordlists
• 9 lists of content types
• 1 list of function words
• 1 list of names the children are commonly using
• Wordlists compiled from a number of Māori language corpora

Data
• Year 3
• 10 tamariki
• Aged 6;8 to 9 years
• 346 pieces of writing
• 13,000 words

Results Year 3
[Figure: tokens as % of text coverage. Categories: names, words not on lists, lists one to nine, function words]
[Figure: types used as a percentage of Brown's lists. Vertical axis: % (0 to 100); horizontal axis: list 1 to list 9, function words]

Frequency
• Top 15 words
• Top 15 content words

Muri
• Whai muri i …
• i muri i …

Alternatives
One option would be to encourage use of alternatives
• Kātahi: already being used
• 20 instances from 4 tamariki
• but 16 from one child
• Ā, nā
• not being used as a connective particle

List one
Words not used by tamariki:
• āta
• hau
• karanga
• kitea
• marama
• momo
• ora
• rau
• rite
• take
• tīmata
• tinana
• tohu
• whakarongo
• whakautu

Kāore

Function words
Word      Gloss                          Number
Engari    But                            6
Ahakoa    Even though                    1
Ehara     Negative particle              0
Reira     Anaphoric location particle    0
Taua      Anaphoric determiner           0

English words
• 338 English words or phrases

Topics
• Earthquake

Where to from here?
• Apply for funding from MOE to enable the project to collect and code data and deliver information to kaiako during the school year.
• Build up analysis tools in LaBB-CAT. Work towards an online tool which teachers themselves can use.
• Some classrooms are now using tablets for writing, which will make collection and tagging of data even easier.

Acknowledgements
• Kura, kaiako and tamariki
• NZILBB: Robert Fromont, Scott Lloyd
• RAs: Roberta Tainui, Caitlin Swan and Niwa Wehi
• Paul Nation's website: http://www.victoria.ac.nz/lals/about/staff/paul-nation

References
Brown, C. (2009). Assessing the readability of Māori language texts for classroom use. Masters thesis. University of Canterbury. http://hdl.handle.net/10092/4015
King, J. (2015). Metaphors we die by: change and vitality in Māori. In E. Piirainen and A. Sherris (Eds.), Language Endangerment: disappearing metaphors and shifting conceptualizations: 15-36. Amsterdam: John Benjamins.
Rau, C. (2005). Literacy acquisition, assessment and achievement of year two students in total immersion in Māori programmes. International Journal of Bilingual Education and Bilingualism 8(5): 404-32.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.
DOI: 10.1017/CBO9780511519772

CLIP
• Comparative Language Input Project
• Teacher speech
• Māori, Mayan language (Guatemala), Jeju Island (Korea), Alaska & Philippines
• Number of words, number of sentences, noun/verb count

Neologisms: borrowings from English
• karaima = climb
• mōro = mall
• nawhe = enough
• poita = points
• Timanaki = Chipmunks
• wikini/wikeni = weekend

Neologisms: compounds
• panunukakau = scooter
• piritau = badge
• hoari rama = light saber
• papa piro = score board
• papa āwhina = kickboard
• pekenui / papa tūpeke = trampoline
• ngongototo = vampire

Neologisms: blends
• wairero = rap, from waiata (to sing) and kōrero (to speak)

Combination borrowing and calque
• warewhare = warehouse
• ipapa = iPad
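
Appendix sketch: the frequency and wordlist analysis described under Analysis (frequency lists, token coverage per list, and the share of each list's types that tamariki actually used) could be approximated along the following lines. This is a minimal illustration only, not the WordSmith or Range software used in the project. The function names, the tokenisation rule, the toy sentence and the two small wordlists are all assumptions made for the example; they are not Brown's lists or corpus data.

```python
import re
from collections import Counter


def tokenise(text):
    """Lower-case the text and return runs of letters, keeping macronised vowels."""
    return re.findall(r"[a-zāēīōū]+", text.lower())


def top_words(tokens, n=15):
    """Return the n most frequent word types with their token counts."""
    return Counter(tokens).most_common(n)


def range_profile(tokens, wordlists):
    """Profile tokens against named wordlists (assumed here not to overlap).

    For each list, report the share of all tokens it covers, the share of
    the list's own types that occur in the text, and the unused list words.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens to profile")
    profile = {}
    on_lists = set()
    for name, words in wordlists.items():
        words = set(words)
        used = {w for w in words if w in counts}
        list_tokens = sum(counts[w] for w in used)
        profile[name] = {
            "token_coverage_pct": 100 * list_tokens / total,
            "list_types_used_pct": 100 * len(used) / len(words) if words else 0.0,
            "unused": sorted(words - used),
        }
        on_lists |= used
    # Everything that matched no list at all
    off_tokens = sum(c for w, c in counts.items() if w not in on_lists)
    profile["not on lists"] = {"token_coverage_pct": 100 * off_tokens / total}
    return profile


# Toy input (hypothetical, for illustration only)
demo_lists = {
    "list one": {"ora", "rite", "kai", "haere"},
    "function words": {"i", "te", "ka", "engari", "ahakoa"},
}
demo_tokens = tokenise("I muri i te kai ka haere au. Engari ka ora au.")
demo_profile = range_profile(demo_tokens, demo_lists)
for list_name, stats in demo_profile.items():
    print(list_name, stats)
```

The "unused" entry corresponds to the kind of output shown on the List one slide (list words that never appear in the children's writing), while the two percentages correspond to the two Results Year 3 charts. A real run would need the project's own wordlists and its decisions about spelling corrections, names and English words.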