Testing Speaking Skills - TESOL International Research Foundation

The International Research Foundation
for English Language Education
TESTING SECOND LANGUAGE SPEAKING SKILLS:
SELECTED REFERENCES
(last updated 30 December 2014)
Akiyama, T. (2003). Assessing speaking: Issues in school-based assessment and the introduction
of speaking tests into the Japanese senior high school entrance examinations. JALT
Journal, 25(2), 117-141.
American Council on the Teaching of Foreign Languages (1986). ACTFL Proficiency
Guidelines. New York: American Council on the Teaching of Foreign Languages.
Andrews, S., & Fullilove, J. (1994). Assessing spoken English in public examination-why and
how? In J. Boyle & P. Falvey (Eds.), English Language Testing in Hong Kong (pp. 57-86).
Hong Kong: Chinese University.
Arter, J. A. (1989). Assessing communicative competence in speaking and listening: A
consumer’s guide. Portland, OR: Northwest Regional Educational Laboratory.
Bachman, L. F. (1988). Problems in examining the validity of the ACTFL oral proficiency
interview. Studies in Second Language Acquisition, 10(2), 149-164.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford
University Press.
Bachman, L. F., Lynch, B. K., & Mason, M. (1995). Investigating variability in tasks and rater
judgments in a performance test of foreign language speaking. Language Testing, 12(2),
238-257.
Bachman, L. F., & Palmer, A. S. (1981a). A multitrait-multimethod investigation into the
construct validity of six tests of speaking and reading. In A. S. Palmer, P. Groot, & G.
Trosper (Eds.), The construct validation of tests of communicative competence (pp. 149-165).
Washington, DC: TESOL.
Bachman, L. F., & Palmer, A. S. (1981b). The construct validation of the FSI Oral Interview.
Language Learning, 31, 67-86.
Bachman, L. F., & Palmer, A. S. (1982). The construct validation of some components of
communicative proficiency. TESOL Quarterly, 16(4), 449-465.
Bachman, L. F., & Savignon, S. J. (1986). The evaluation of communicative language
proficiency: A critique of the ACTFL Oral Interview. Modern Language Journal, 70(4),
380-390.
177 Webster St., #220, Monterey, CA 93940 USA
Web: www.tirfonline.org / Email: info@tirfonline.org
Bailey, K. M. (1985). If I had known then what I know now: Performance testing of foreign
teaching assistants. In P. C. Hauptman, R. LeBlanc, & M. Wesche (Eds.), Second
language performance testing (pp. 153-180). Ottawa: University of Ottawa Press.
Bardovi-Harlig, K. (2011). Assessing familiarity with pragmatic formulas: Planning oral/aural
assessment. In N. R. Houck & D. H. Tatsuki (Eds.), Pragmatics: Teaching natural
conversation (pp. 7-22). New York: TESOL.
Barrett, S. (2001). The impact of training on rater variability. International Education Journal,
2(1), 49-58.
Bartz, W. H. (1979). Testing oral communication in the foreign language classroom. Arlington,
VA: Center for Applied Linguistics.
Bernstein, J., De Jong, J. H. A. L., Pisoni, D., & Townshend, B. (2000). Two experiments on
automatic scoring of spoken language proficiency. In P. Delcloque (Ed.), Proceedings of
InSTIL2000: Integrating speech technology in learning (pp. 57-61). Dundee, Scotland,
UK: University of Abertay Dundee.
Berwick, R., & Ross, S. (1996). Cross-cultural pragmatics in oral proficiency interview
strategies. In M. Milanovic & N. Saville (Eds.), Studies in language testing 3:
Performance testing, cognition, and assessment: Selected papers from the 15th language
testing research colloquium (pp. 34-54). Cambridge, UK: Cambridge University Press.
Blanche, P. (1990). Using standardized achievement and oral proficiency tests for self-assessment
purposes: The DLIFLC study. Language Testing, 7(2), 202-229.
Bridgeman, B., Powers, D., Stone, E., & Mollaun, P. (2011). TOEFL iBT speaking test scores as
indicators of oral communicative language proficiency. Language Testing, 29(1), 91-108.
Briggs, S., & MacDonald, C. (1978). A practical approach to testing speaking and listening
skills. English Teaching Forum, 16(3), 8-15.
Brindley, G. (1989). Assessing achievement in the learner-centered curriculum. Sydney:
National Centre for English Language Teaching and Research, Macquarie University.
Brooks, L. (2009). Interacting in pairs in a test of oral proficiency: Co-constructing a better
performance. Language Testing, 26(3), 341-366.
Brown, A. (1993). The role of test-taker feedback in the test development process: Test-takers’
reactions to a tape-mediated test of proficiency of spoken Japanese. Language Testing,
10(3), 277-303.
Brown, A. (1995). The effect of rater variables in the development of an occupation-specific
language performance test. Language Testing, 12(1), 1-15.
Brown, A. (2003). Interviewer variation and the co-construction of speaking proficiency.
Language Testing, 20(1), 1-25.
Brown, A. (2004). Discourse analysis and the oral interview: Competence or performance. In D.
Boxer & A. D. Cohen (Eds.), Studying speaking to inform second language learning (pp. 263-282).
Clevedon, UK: Multilingual Matters.
Brown, A., & Hill, K. (1998). Interviewer style and candidate performance in the IELTS oral
interview. IELTS Research Reports, 1, 1-19.
Brown, D. (1983). Conversational cloze tests and conversational ability. ELT Journal, 37(2),
158-161.
Brown, H. D. (2004). Language assessment: Principles and classroom practices. London:
Longman.
Button, G. (1992). Answers as interactional products: Two sequential practices used in
interviews. In P. Drew & J. Heritage (Eds.), Talk at work: Interaction in institutional
settings (pp. 212-231). Cambridge, UK: Cambridge University Press.
Byrnes, H. (1987). Features of pragmatic and sociolinguistic competence in the oral proficiency
interview. In A. Valdman (Ed.), Proceedings of the symposium on the evaluation of
foreign language proficiency (pp. 167-177). Bloomington, IN: Indiana University.
Callaway, D. R. (1980). Accent and the evaluation of ESL oral proficiency. In J. W. Oller & K.
Perkins (Eds.), Research in language testing (pp. 102-115). Rowley, MA: Newbury
House.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second
language teaching and testing. Applied Linguistics, 1, 1-47.
Carpenter, K., Fujii, N., & Kataoka, H. (1995). An oral interview procedure for assessing second
language abilities in children. Language Testing, 12(2), 157-181.
Center for Applied Linguistics (2000). BEST evolves to meet new needs. CAL Reporter, 14(1),
1, 5.
Chalhoub-Deville, M. (1995). A contextualized approach to describing oral language proficiency.
Language Learning, 45(2), 251-281.
Chalhoub-Deville, M. (1995). Deriving oral assessment scales across different tests and rater
groups. Language Testing, 12(1), 16-33.
Chalhoub-Deville, M., & Fulcher, G. (2003). The oral proficiency interview and the ACTFL
guidelines: A research agenda. Foreign Language Annals, 36(4), 498-506.
Chalhoub-Deville, M., & Wigglesworth, G. (2005). Rater judgment and English language
speaking proficiency. World Englishes, 24(3), 383-391.
Chambers, L., & Ingham, K. (2011). The BULATS Online Speaking Test. Cambridge ESOL
Research Notes, 43, 21-25.
Chaudhary, S. (1997). Testing spoken English as a second language [Electronic version]. Forum,
35(2), 2.
Clankie, S. (1995). The SPEAK test of oral proficiency: A case study of incoming freshmen. In J.
D. Brown & S. Yamashita (Eds.), Language testing in Japan (pp. 119-125). Tokyo: JALT
Applied Materials.
Clark, J. L. D. (Ed.). (1978). Direct testing of speaking proficiency. Princeton, NJ: Educational
Testing Service.
Clark, J. L. D. (1979). Direct and semi-direct tests of speaking ability. In E. J. Briere & F. B.
Hinofotis (Eds.), Concepts in language testing (pp. 35-49). Washington, DC: TESOL.
Clark, J. L. D. (Ed.). (1983). Language testing: Past and current status -- directions for the future.
Modern Language Journal, 67(4), 431-443.
Clark, J. L. D. (1986). Development of a tape-mediated ACTFL/ILR scale-based test of Chinese
speaking proficiency. In C. W. Stansfield (Ed.), Technology and language testing (pp.
129-146). Washington, DC: TESOL.
Clark, J. L. D. (1987). A study of the comparability of speaking proficiency interview ratings
across three government language training agencies. In K. M. Bailey, T. L. Dale, & R. T.
Clifford (Eds.), Language testing research: Selected papers from the 1986 Colloquium
(pp. 132-179). Monterey, CA: Defense Language Institute.
Clark, J. L. D. (1988). Validation of a tape-mediated ACTFL/ILR-scale based test of Chinese
speaking proficiency. Language Testing, 5(2), 187-205.
Clark, J. L. D., & Clifford, R. T. (1988). The FSI/ILR/ACTFL proficiency scales and testing
techniques: Development, current status, and needed research. Studies in Second
Language Acquisition, 10(2), 121-147.
Clark, J. L. D., Clifford, R. T., & Hooshmand, D. (1992). “Screen-to-screen” testing: An
exploratory study of oral proficiency interviewing using video teleconferencing. System,
20(3), 293-304.
Cogan, D. (1998). Oral English testing in a Japanese university. In J. C. Richards (Ed.), Teaching
in action: Case studies from second language classrooms (pp. 334-339). Alexandria, VA:
TESOL.
Congdon, P., & McQueen, J. (2000). The stability of rater severity in large-scale assessment
programs. Journal of Educational Measurement, 37(2), 163-178.
Coniam, D. (1995). Towards a common ability scale for Hong Kong English secondary-school
forms. Language Testing, 12(2), 182-193.
Courtney, M. (1996). Talking to learn: Selecting and using peer group oral tasks. ELT Journal,
50(4), 318-326.
Dandonoli, P., & Henning, G. (1990). An investigation of the construct validity of the ACTFL
oral proficiency guidelines and oral interview procedure. Foreign Language Annals,
23(1), 11-22.
Davies, A. (1985). Communicative language testing. Applied Linguistics, 1, 22-33.
Day, E. M., & Shapson, S. (1987). Assessment of oral communicative skills in early French
immersion programmes. Journal of Multilingual and Multicultural Development, 8(3),
237-260.
De Saint Leger, D. (2009). Self-assessment of speaking skills and participation in a foreign
language class. Foreign Language Annals, 42(1), 158-178.
Douglas, D. (1994). Quantity and quality in speaking test performance. Language Testing, 11(2),
125-143.
Douglas, D. (2000). Assessing languages for specific purposes. Cambridge: Cambridge
University Press.
Douglas, D. (2004). Discourse domains: The cognitive context of speaking. In D. Boxer, & A.
Cohen (Eds.), Studying speaking to inform second language learning (pp. 25-47).
Clevedon, UK: Multilingual Matters.
Douglas, D., & Selinker, L. (1992). Analyzing oral proficiency test performance in general and
specific purpose contexts. System, 20(3), 317-328.
Duffy, C. (2007). An examination of test task characteristics and their effect on oral language test
performance. In C. Irvine-Niakaris & A. Nebel (Eds.), 2nd language testing & evaluation
forum, Teaching and testing: Opportunities for learning (pp. 20-43). Athens, Greece:
Hellenic American Union.
Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance
assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2, 197–221.
Edwards, A. L. (1996). Reading proficiency assessment and the ILR/ACTFL text typology: A
reevaluation. Modern Language Journal, 80(3), 350-361.
Egbert, M. M. (1998). Miscommunication in language proficiency interviews of first-year
German students: A comparison with natural conversation. In R. Young & A. W. He
(Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency
(pp. 149-172). Philadelphia, PA: John Benjamins.
Elder, C., Iwashita, N., & McNamara, T. (2002). Estimating the difficulty of oral proficiency
tasks: What does the test-taker have to offer? Language Testing, 19(4), 347-368.
Fall, T., Adair-Hauck, B., & Glisan, E. (2007). Assessing students’ oral proficiency: A case for
online testing. Foreign Language Annals, 40(3), 377-406.
Frith, J. R. (Ed.). (1980). Measuring spoken language proficiency. Washington, DC: Georgetown
University Press.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all
reasons. Applied Linguistics, 21(3), 354-375.
Fulcher, G. (1987). Tests of oral performance: The need for data-based criteria. ELT Journal,
41(4), 287-291.
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating
scale construction. Language Testing, 13(2), 208-238.
Fulcher, G. (1996). Testing tasks: Issues in task design and the group oral. Language Testing,
13(1), 23-51.
Fulcher, G. (1996). Invalidating validity claims for the ACTFL oral rating scale. System, 24(2),
163-172.
Fulcher, G. (1998). Testing speaking. In C. Clapham (Ed.), Language testing and assessment (pp.
75-86). Encyclopedia of Language and Education. Amsterdam, The Netherlands: Kluwer
Academic Publishers.
Fulcher, G. (2003). Testing second language speaking. Harlow, UK: Longman.
Fulcher, G., Davidson, F., & Kemp, J. (2011). Effective rating scale development for speaking
tests: Performance decision trees. Language Testing, 28(1), 5-29.
Fulcher, G., & Marquez Reiter, R. (2003). Task difficulty in speaking tests. Language Testing,
20(3), 321-344.
Galaczi, E. D. (2010). Peer-peer interaction in a paired speaking test: The case of FCE.
Cambridge ESOL Research Notes, 42, 22.
Galaczi, E., & French, A. (2011). Context validity. In L. Taylor (Ed.), Examining speaking:
Research and practice in assessing second language speaking. Studies in Language
Testing, 30 (pp. 112-170). Cambridge: UCLES/Cambridge University Press.
Gan, Z. (2010). Interaction in group oral assessment: A case study of higher and lower scoring
students. Language Testing, 27(4), 585-602.
Grove, E., & Brown, A. (2001). Tasks and criteria in a test of oral communication skills for first-year
health science students. Melbourne Papers in Language Testing, 10, 37-47.
Hadden, B. (1991). Teacher and non-teacher perceptions of second-language communication.
Language Learning, 41(1), 1-24.
Haggstrom, M. (1994). Using a video camera and task-based activities to make classroom oral
testing a more realistic communicative experience. Foreign Language Annals, 27(2),
161-175.
Halleck, G. (1992). The oral proficiency interview: Discrete point test or measure of
communicative language ability? Foreign Language Annals, 25(3), 227-231.
Harlow, L. L., & Caminero, R. (1990). Oral testing of beginning language students at large
universities: Is it worth the trouble? Foreign Language Annals, 23(6), 489-501.
He, A. W. (1998). Answering questions in LPIs: A case study. In R. Young & A. W. He (Eds.),
Talking and testing: Discourse approaches to the assessment of oral proficiency (pp.
101-116). Philadelphia, PA: John Benjamins.
He, A. W., & Young, R. (1998). Language proficiency interviews: A discourse approach. In R.
Young, & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment
of oral proficiency (pp. 1-24). Amsterdam, The Netherlands: John Benjamins.
Hendricks, D., Scholz, G., Spurling, R., Johnson, M., & Vandenburg, L. (1980). Oral proficiency
testing in an intensive English program. In J. W. Oller & K. Perkins (Eds.), Research in
language testing (pp. 77-90). Rowley, MA: Newbury House.
Henning, G. (1983). Oral proficiency testing: Comparative validities of interview, imitation, and
completion methods. Language Learning, 33(3), 315-331.
Henning, G. (1992). The ACTFL Oral Proficiency Interview: Validity evidence. System, 20(3),
365-372.
Hill, K. (1998). The effect of test-taker characteristics on reactions to and performance on an oral
English proficiency test. In A. J. Kunnan (Ed.), Validation in language assessment (pp.
209-229). Mahwah, NJ: Lawrence Erlbaum.
Hingle, I., & Linington, V. (1997). English proficiency test: The oral component of a primary
school [Electronic version]. Forum, 35(2), 26.
Hinofotis, F. B., Bailey, K. M., & Stern, S. L. (1981). Assessing the oral proficiency of
prospective foreign teaching assistants: Instrument development. In A. S. Palmer, P. J.
M. Groot, & G. A. Trosper (Eds.), The construct validation of tests of communicative
competence (pp. 106-126). Washington, DC: TESOL.
Hoekje, B., & Linnell, K. (1994). "Authenticity" in language testing: Evaluating spoken language
tests for international teaching assistants. TESOL Quarterly, 28(1), 103-126.
Huang, H. T., & Hung, S. T. (2013). Comparing the effects of test anxiety on independent and
integrated speaking test performance. TESOL Quarterly, 47(2), 244-269.
Hughes, A. (1981). Conversational cloze as a measure of oral ability. ELT Journal, 35(2), 161-168.
Hughes, A., Cooper, R. L., Nevo, D., Stevenson, D. K., & Wesche, M. B. (1986). Panel
discussion: The next 25 years? Language Testing, 3(2), 237-245.
Inoi, S. (1995). The validity of written pronunciation questions: Focus on phoneme
discrimination. In J. D. Brown & S. Yamashita (Eds.), Language testing in Japan (pp.
179-186). Tokyo: JALT Applied Materials.
Isaacs, T. (2008). Towards defining a valid assessment criterion of pronunciation proficiency in
non-native English-speaking graduate students. The Canadian Modern Language Review,
64(4), 555-580.
Iwashita, N., Brown, A., McNamara, T., & O’Hagan, S. (2008). Assessed levels of second
language speaking proficiency: How distinct? Applied Linguistics, 29(1), 24-49.
Iwashita, N., & Grove, E. (2003). A comparison of analytic and holistic scales in the context of a
specific purpose speaking test. Prospect, 18(3), 25-35.
James, R. (1996). CALL and the speaking skill. System, 24(1), 15-21.
Jendi, A. (2005). Approaches to assessing English oral communication in UAE high schools. In
P. Davidson, C. Coombe, & W. Jones (Eds.), Assessment in the Arab world (pp. 173-190).
Dubai: TESOL Arabia.
Jieke, G., & Yan, J. (2005). Development and preliminary validation of the CET semi-direct oral
proficiency test. In A. McNeill & J. Lai (Eds.), Crosslinks in English language teaching
(pp. 19-43). Hong Kong: English Language Teaching Unit, Chinese University of Hong
Kong.
Johnson, M. (2001). The art of non-conversation: A reexamination of the validity of the oral
proficiency interview. New Haven, CT: Yale University Press.
Johnson, M., & Tyler, A. (1998). Re-analyzing the OPI: How much does it look like natural
conversation? In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches
to the assessment of oral proficiency (pp. 27-51). Philadelphia: John Benjamins
Publishing Company.
Jones, R. L. (1978). Interview techniques and scoring criteria at the higher proficiency levels. In
J. L. D. Clark (Ed.), Direct tests of speaking proficiency: Theory and application (pp. 89-102).
Princeton, NJ: Educational Testing Service.
Jonz, J. (1990). Another turn in the conversation: What does cloze measure? TESOL Quarterly,
24(1), 61-83.
Kang, O. (2013). Linguistic analysis of speaking features distinguishing general English exams at
CEFR levels. Research Notes, 52, 40-48.
Ke, C., & Reed, D. J. (1995). An analysis of results from the ACTFL Oral Proficiency Interview
and the Chinese Proficiency Test before and after intensive instruction in Chinese as a
foreign language. Foreign Language Annals, 28(2), 208-222.
Kim, M. (2001). Detecting DIF across the different language groups in a speaking test. Language
Testing, 18(1), 89-114.
Kim, Y. H. (2009). A G-theory analysis of rater effect in ESL speaking assessment. Applied
Linguistics, 30, 435–440.
Kim, Y-H. (2009). An investigation into native and non-native teachers’ judgment on oral
English performance: A mixed methods approach. Language Testing, 26, 187-217.
Kim, Y-H. (2009). Exploring rater and task variability in second language oral performance
assessment. In A. Brown, & K. Hill (Eds.), Language testing and evaluation, Volume 13:
Tasks and criteria in performance assessment (pp. 91-109). Frankfurt, Germany: Peter
Lang.
Kormos, J. (1999). Simulating conversations in oral-proficiency assessment: A conversation
analysis of role plays and non-scripted interviews in language exams. Language Testing,
16(2), 163-188.
Kormos, J., & Denes, M. (2004). Exploring measures and perceptions of fluency in the speech of
second language learners. System, 32(2), 145-164.
Lantolf, J. P., & Frawley, W. (1985). Oral-proficiency testing: A critical analysis. Modern
Language Journal, 69(4), 337-345.
Lantolf, J. P., & Frawley, W. (1988). Proficiency: Understanding the construct. Studies in Second
Language Acquisition, 10(2), 181-195.
Lazaraton, A. (1992). The structural organization of a language interview: A conversation
analytic approach. System, 20(3), 373-386.
Lazaraton, A. (1996). Interlocutor support in oral proficiency interviews: The case of CASE.
Language Testing, 13(2), 151-172.
Lazaraton, A. (1997). Preference organization in oral proficiency interviews: The case of
language ability assessments. Research on Language and Social Interaction, 30, 53-72.
Lazaraton, A., & Riggenbach, H. (1990). Oral skills testing: A rhetorical task approach. Issues in
Applied Linguistics, 1(2), 196-217.
Lazaraton, A., & Wagner, S. (1996). The revised test of spoken English (TSE): Analysis of
native speaker and nonnative speaker data. TOEFL Monograph Series MS-7. Princeton,
NJ: Educational Testing Service.
Lee, Y. W. (2005). Dependability of scores for a new ESL speaking test: Evaluating prototype
tasks (TOEFL Monograph Series No. 28). Princeton, NJ: Educational Testing Service.
Lee, Y. (2006). Dependability of scores for a new ESL speaking assessment consisting of
integrated and independent tasks. Language Testing, 23(2), 131-166.
Leung, C., & Mohan, B. (2004). Teacher formative assessment and talk in classroom contexts –
assessment as discourse and assessment of discourse. Language Testing, 21(3), 335-359.
Lim, G., & Galaczi, E. (2010). Lexis in the assessment of speaking and writing: An illustration
from Cambridge ESOL's General English tests. Cambridge ESOL Research Notes, 41,
14-19.
Lindblad, T. (1992). Oral tests in Swedish schools: A five-year experiment. System, 20(3), 279-292.
Linder, C. (1977). Oral communication testing: A handbook for the foreign language teacher.
Skokie, IL: National Textbook Company.
Liski, E., & Puntanen, S. (1983). A study of the statistical foundations of group conversation
tests in spoken English. Language Learning, 33(2), 225-246.
Lombardo, L. (1984). Oral testing: Getting a sample of real language. English Teaching Forum,
January, 2-6.
Lumley, T. (1998). Perceptions of language-trained raters and occupational experts in a test of
occupational English language proficiency. English for Specific Purposes, 17(4), 347-367.
Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for
training. Language Testing, 12(1), 54-71.
Lumley, T., & O’Sullivan, B. (2005). The effect of test-taker gender, audience and topic on task
performance in tape-mediated assessment of speaking. Language Testing, 22(4), 415-437.
Luoma, S. (2004). Assessing speaking. Cambridge, UK: Cambridge University Press.
Lynch, B. K., & McNamara, T. F. (1998). Using G-theory and Many-facet Rasch measurement in
the development of performance assessments of the ESL speaking skills of immigrants.
Language Testing, 15(2), 158-180.
Madsen, H. S., & Jones, R. L. (1981). Classification of oral proficiency tests. In A. S. Palmer, P.
J. M. Groot & G. A. Trosper (Eds.), The construct validation of tests of communicative
competence (pp. 15-30). Washington, DC: TESOL.
Magnan, S. S. (1988). Grammar and the ACTFL oral proficiency interview: Discussion and data.
Modern Language Journal, 72, 266-276.
Major, R. C. (1987). Measuring pronunciation accuracy using computerized techniques.
Language Testing, 4(2), 155-169.
Malabonga, V., Kenyon, D. M., & Carpenter, H. (2005). Self-assessment, preparation and
response time on a computerized oral proficiency test. Language Testing, 22, 59-92.
Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency
interviews using a new measure of lexical diversity. Language Testing, 19(1), 85-104.
Manley, J. H. (1995). Assessing oral language: One school district’s response. Foreign Language
Annals, 28(1), 93-102.
Matthews, M. (1990). The measurement of productive skills: Doubts concerning the assessment
criteria of certain public examinations. ELT Journal, 44(2), 117-121.
May, L. (2009). Co-constructed interaction in a paired speaking test: The rater's perspective.
Language Testing, 26, 397–421.
May, L. (2011). Interaction in a paired speaking test. New York, NY: Peter Lang.
McClean, J. (1995). Cooperative assessment: Negotiating a spoken-English grading scheme with
Japanese university students. In J. D. Brown & S. Yamashita (Eds.), Language testing in
Japan (pp. 136-148). Tokyo: JALT Applied Materials.
McNamara, T. (1996). Measuring second language performance. London: Longman.
McNamara, T. F. (1997). ‘Interaction’ in second language performance assessment: Whose
performance? Applied Linguistics, 18, 446-466.
McNamara, T., Hill, K., & May, L. (2002). Discourse approaches to oral assessment. Annual
Review of Applied Linguistics, 22, 243-262.
McNamara, T. F., & Lumley, T. (1997). The effect of interlocutor and assessment mode variables
in overseas assessments of speaking skills in occupational settings. Language Testing, 14,
140-156.
Mead, N., & Rubin, D. L. (1985). Assessing listening and speaking skills. Retrieved from
http://www.ericdigests.org/pre-923/speaking.htm.
Meredith, R. A. (1978). Improved oral test scores through delayed response. Modern Language
Journal, 62, 321-327.
Merrylees, B., & McDowell, C. (1998). An investigation of speaking test reliability with
particular reference to examiner attitude to the speaking test format and
candidate/examiner discourse produced. IELTS Research Reports, 2, 1-35.
Moholt, G., & Presler, A. (1986). Correlation between human and machine ratings of Test of
Spoken English reading passages. In C. W. Stansfield (Ed.), Technology and language
testing (pp. 111-128). Washington, DC: TESOL.
Morrow, K. (1984). Testing performance in oral interaction. In S. J. Savignon & M. S. Berns
(Eds.), Initiatives in communicative language teaching (pp. 203-210). Reading, MA:
Addison-Wesley.
Morton, J. (1998). A cross-cultural study of second language narrative discourse on an oral
proficiency test. Prospect, 13(2), 20-35.
Morton, J., Wigglesworth, G., & Williams, D. (1997). Approaches to the evaluation of
interviewer behaviour in oral tests. In G. Brindley & G. Wigglesworth (Eds.), access:
Issues in language test design and delivery (pp. 175-195). Sydney, Australia: NCELTR.
Moss, P. A. (1994). Can there be validity without reliability? Educational Researcher, 23(2), 4-12.
Mullen, K. A. (1980). Rater reliability and oral proficiency evaluations. In J. W. Oller & K.
Perkins (Eds.), Research in language testing (pp. 91-101). Rowley, MA: Newbury House.
Muñoz, A., & Álvarez, M. (2010). Washback of an oral assessment system in the EFL
classroom. Language Testing, 27(1), 33-49.
Myford, C. M., & Wolfe, E. W. (2000). Monitoring sources of variability within the Test of
Spoken English assessment system. Princeton, NJ: Educational Testing Service.
Nagata, H. (1995). Testing oral ability: ILR and ACTFL oral proficiency interviews. In J. D.
Brown & S. Yamashita (Eds.), Language testing in Japan (pp. 108-118). Tokyo: JALT
Applied Materials.
Nakamura, Y. (1995). Making speaking tests valid: Practical considerations in a classroom setting.
In J. D. Brown & S. Yamashita (Eds.), Language testing in Japan (pp. 126-133). Tokyo:
JALT Applied Materials.
Nibungco, J. T., & Williams, M. D. (1996). Designing oral assessment for nontraditional ESL
students in a community college. College ESL, 6(1), 85-94.
Noijons, J. (1985). The standardized testing of oral proficiency in the Netherlands. In P. C.
Hauptman, R. LeBlanc, & M. Wesche (Eds.), Second language performance testing (pp.
193-213). Ottawa: University of Ottawa Press.
North, B. (1995). The development of a common framework scale of descriptors of language
proficiency based on a theory of measurement. System, 23(4), 445-465.
North, B. (2000). The development of a common framework scale of language proficiency. New
York: Peter Lang.
North, B. (2007). The CEFR Illustrative Descriptor Scales. The Modern Language Journal,
91(4), 656-659.
North, B., & Schneider, G. (1998). Scaling descriptors for language proficiency scales. Language
Testing, 15, 217-262.
O’Loughlin, K. (1995). Lexical density in candidate output on direct and semi-direct versions of
an oral proficiency test. Language Testing, 12(2), 217-237.
O’Loughlin, K. (2002). The impact of gender in oral proficiency testing. Language Testing,
19(2), 169-192.
O’Malley, J. M., & Pierce, L. V. (1996). Authentic assessment for English language learners:
Practical approaches for teachers. Reading, MA: Addison-Wesley.
Orr, M. (2002). The FCE Speaking test: Using rater reports to help interpret test scores.
System, 30, 143–154.
O’Sullivan, B. (2002). Learner acquaintanceship and oral proficiency test pair-task performance.
Language Testing, 19(3), 277-295.
O’Sullivan, B., & Davidson, P. (2005). A quantitative investigation of interlocutor-related
variability in speaking test performance. In P. Davidson, C. Coombe, & W. Jones (Eds.),
Assessment in the Arab world (pp. 203-223). Dubai: TESOL Arabia.
O’Sullivan, B., Weir, C. J., & Saville, N. (2002). Using observation checklists to validate
speaking-test tasks. Language Testing, 19(1), 33-56.
Palmer, A. S., Groot, P. J. M., & Trosper, G. A. (Eds.). (1981). The construct validation of tests
of communicative competence. Washington, DC: TESOL.
Plakans, B. S., & Abraham, R. G. (1990). The testing and evaluation of international teaching
assistants. In D. Douglas (Ed.), English language testing in U.S. colleges and universities
(pp. 68-81). Washington, DC: NAFSA.
Plough, I. C., Briggs, S. L., & Van Bonn, S. (2010). A multi-method analysis of evaluation
criteria used to assess the speaking proficiency of graduate student instructors. Language
Testing, 27, 235-260.
Powers, D. E., Albertson, W., Florek, T., Johnson, K., Malak, J., Nemceff, B., Porzuc, M.,
Silvester, D., Wang, R., Winner, E., & Zelazny, A. (2002). Influence of irrelevant speech
on standardized test performance. TOEFL Research Report 68. Princeton, NJ:
Educational Testing Service.
Powers, D. E., Schedl, M. A., Wilson Leung, S., & Butler, F. A. (1999). Validating the revised
Test of Spoken English against a criterion of communicative success. Language Testing,
16(4), 399-425.
Raffaldini, T. (1988). The use of situation tests as measures of communicative ability. Studies in
Second Language Acquisition, 10(2), 197-216.
Rampton, M. B. H. (1990). Displacing the ‘native speaker’: Expertise, affiliation, and
inheritance. ELT Journal, 44(2), 97-101.
Reed, D. J. (1992). The relationship between criterion-based levels of oral proficiency and
norm-referenced scores of general proficiency in English as a second language. System, 20(3),
329-345.
Reed, D. J., & Cohen, A. D. (2001). Revisiting raters and ratings in oral language assessment. In
C. Elder (Ed.), Experimenting with uncertainty: Essays in honour of Alan Davies (pp.
82-96). Cambridge: Cambridge University Press.
Ross, S. (1987). An experiment with a narrative discourse test. In K. M. Bailey, T. L. Dale, & R.
T. Clifford (Eds.), Language testing research: Selected papers from the 1986 Colloquium
(pp. 60-69). Monterey, CA: Defense Language Institute.
Ross, S. (1992). Accommodative questions in oral proficiency interviews. Language Testing,
9(2), 173-186.
Ross, S. (1998). Divergent frame interpretations in language proficiency interview interaction. In
R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches to the
assessment of oral proficiency (pp. 333-353). Amsterdam, The Netherlands: John
Benjamins.
Ross, J. A. (2006). The reliability, validity, and utility of self-assessment. Practical Assessment,
Research and Evaluation, 11(10). Retrieved from http://pareonline.net/pdf/v11n10.pdf
Ross, S., & Berwick, R. (1992). The discourse of accommodation in oral proficiency interviews.
Studies in Second Language Acquisition, 14(2), 159-176.
Saif, S. (2002). A needs-based approach to the evaluation of the spoken language ability of
international teaching assistants. The Canadian Journal of Applied Linguistics, 5, 145-167.
Salaberry, R. (2000). Revising the revised format of the ACTFL oral proficiency interview.
Language Testing, 17(3), 289-310.
Salvisberg, J. (2010). Diagnostic oral skills assessment. New York, NY: Peter Lang.
Sanchez, R. (1976). Critique of oral language assessment instruments. Journal of the National
Association for Bilingual Education, 1(2), 120-127.
Savignon, S. (1985). Evaluation of communicative competence: The ACTFL provisional
proficiency guidelines. Modern Language Journal, 69, 129-134.
Sawaki, Y. (2007). Construct validation of analytic rating scales in a speaking assessment:
Reporting a score profile and a composite. Language Testing, 24(3), 355-390.
Scanlan, T. (1987). Improving fluency through a study of native pause behavior. Foreign
Language Annals, 20, 345-352.
Scott, M. L. (1986). Student affective reactions to oral language tests. Language Testing, 3(2),
99-118.
Shohamy, E. (1982). Predicting speaking proficiency from cloze tests: Theoretical and practical
considerations for test substitutions. Applied Linguistics, 3(2), 161-171.
Shohamy, E. (1983). Inter-rater and intra-rater reliability of oral interview and concurrent
validity with cloze procedure. In J. Oller (Ed.), Issues in language testing research (pp.
229-236). Rowley, MA: Newbury House.
Shohamy, E. (1983). The stability of oral proficiency assessment on the oral interview testing
procedure. Language Learning, 33, 527-540.
Shohamy, E. (1988). A proposed framework for testing the oral language of second/foreign
language learners. Studies in Second Language Acquisition, 10(2), 165-180.
Shohamy, E. (1994). The validity of direct versus semi-direct oral tests. Language Testing, 11(2),
99-123.
Shohamy, E., Gordon, C. M., & Kramer, R. (1992). The effect of raters’ background and training
on the reliability of direct writing tests. Modern Language Journal, 76(1), 27-33.
Shohamy, E., Reves, T., & Bejarano, Y. (1986). Introducing a new comprehensive test of oral
proficiency. ELT Journal, 40(3), 212-222.
Spolsky, B. (1990). Oral examinations: An historical note. Language Testing, 7(2), 158-173.
St. John, J. (1992). The Ontario test of ESL oral interaction test. System, 20(3), 305-316.
Stansfield, C. W. (Ed.). (1986). Technology and language testing. Washington, DC: TESOL.
Stansfield, C. W. (1991). A comparative analysis of simulated and direct oral proficiency
interviews. In S. Anivan (Ed.), Current developments in language testing (pp. 199-209).
Singapore: SEAMEO RELC.
Stansfield, C. W., & Kenyon, D. M. (1992). Research on the comparability of the oral
proficiency interview and the simulated oral proficiency interview. System, 20(3), 347-366.
Stansfield, C. W., & Kenyon, D. M. (1996). Comparing the scaling of speaking tasks by language
teachers and by the ACTFL guidelines. In A. Cumming & R. Berwick (Eds.), Validation
in language testing (pp. 124-153). Clevedon: Multilingual Matters Ltd.
Sundrarajun, C., & Kiely, R. (2010). The oral presentation as a context for learning and
assessment. Innovation in Language Learning and Teaching, 2(2), 101-117.
Surface, E. A., & Dierdorff, E. L. (2003). Reliability and the ACTFL oral proficiency interview:
Reporting indices of inter-rater consistency and agreement for 19 languages. Foreign
Language Annals, 36(4), 507-519.
Swain, M. (1984). Large-scale communicative language testing: A case study. In S. Savignon &
M. Berns (Eds.), Initiatives in communicative language teaching (pp. 185-201). Reading,
MA: Addison-Wesley.
Swain, M. (2001). Examining dialogue: Another approach to content specification and to
drawing inferences from test scores. Language Testing, 18(3), 275-302.
Taylor, L. (2000). Investigating the paired speaking test format. Cambridge ESOL Research
Notes, 2, 14-15.
Taylor, L. (2001). The paired speaking test format: Recent studies. Cambridge ESOL Research
Notes, 6, 15-17.
Taylor, L. (2003). The Cambridge approach to speaking assessment. Cambridge ESOL Research
Notes, 13, 2-4.
Taylor, L. (Ed.). (2011). Assessing second language speaking. Studies in Language Testing, 30,
Cambridge: UCLES/Cambridge University Press.
Taylor, R. E. (1995). Assessing oral communication skills-reflections of an oral examiner. World
Englishes, 15(1), 131-136.
Terry, R. M. (1986). Testing the productive skills: A creative focus for hybrid achievement tests.
Foreign Language Annals, 19(6), 521-528.
Thompson, I. (1995). A study of inter-rater reliability of the ACTFL oral proficiency interview in
five European languages: Data from ESL, French, German, Russian, and Spanish.
Foreign Language Annals, 28(3), 407-422.
Tung, P. (1985). Designing oral proficiency tests in EFL for Hong Kong secondary schools. In
P. C. Hauptman, R. LeBlanc, & M. Wesche (Eds.), Second language performance testing
(pp. 229-242). Ottawa: University of Ottawa Press.
Turner, J. (1998). Assessing speaking. Annual Review of Applied Linguistics, 18, 192-207.
Underhill, N. (1987). Testing spoken language: A handbook of oral testing techniques.
Cambridge, UK: Cambridge University Press.
Upshur, J. A. (1971). Objective evaluation of oral proficiency. TESOL Quarterly, 5(1), 47-59.
Upshur, J. A., & Turner, C. E. (1995). Constructing rating scales for second language tests. ELT
Journal, 49(1), 3-12.
Upshur, J. A., & Turner, C. E. (1999). Systematic effects in the rating of second-language
speaking ability: Test method and learner discourse. Language Testing, 16(1), 82-116.
van Lier, L. (1989). Reeling, writhing, drawling, stretching and fainting in coils: Oral proficiency
interviews as conversation. TESOL Quarterly, 23(3), 489-508.
Van Moere, A. (2006). Validity evidence in a group oral test. Language Testing, 23(4), 411-440.
Van Weeren, J., & Theunissen, T. J. J. M. (1987). Testing pronunciation: An application of
generalizability theory. Language Learning, 37(1), 109-122.
Vermeer, A. (2000). Coming to grips with lexical richness in spontaneous speech data.
Language Testing, 17(1), 65-83.
Vidaković, I., & Galaczi, E. (2013). The measurement of speaking ability 1913-2012. In C. J. Weir, I.
Vidaković, & E. D. Galaczi (Eds.), Measured constructs: A history of English language
examinations 1913-2012, Studies in Language Testing 37 (pp. 257-346). Cambridge,
UK: Cambridge University Press.
Weaver, C., & Romanko, R. (2005). Assessing oral communication competence in a university
entrance examination. The Language Teacher, 29(1), 3-9.
Wigglesworth, G. (1995). Exploring bias analysis as a tool for improving rater consistency in
assessing oral interaction. Language Testing, 12(1), 305-335.
Wigglesworth, G. (1997). An investigation of planning time and proficiency level on oral test
discourse. Language Testing, 14(1), 85-106.
Winke, P., Gass, S., & Myford, C. (2011). The relationship between raters’ prior language study
and the evaluation of foreign language speech samples. TOEFL iBT Research Report No.
iBT-16. Princeton, NJ: ETS.
Xi, X. (2007a). Evaluating analytic scoring for the TOEFL® Academic Speaking Test (TAST) for
operational use. Language Testing, 24(2), 251-286.
Xi, X. (2007b). Validating TOEFL® iBT speaking and setting score requirements for ITA
screening. Language Assessment Quarterly, 4(4), 318-351.
Xi, X., & Mollaun, P. (2011). Using raters from India to score a large-scale speaking test.
Language Learning, 61(4), 1222-1255.
Ying, B. (2009). The impact of familiarity on group oral proficiency testing. CELEA Journal,
38(2), 114-125.
Lee, Y.-W. (2006). Dependability of scores for a new ESL speaking assessment consisting of
integrated and independent tasks. Language Testing, 23(2), 131-166.
Yoshida-Morise, Y. (1998). The use of communication strategies in language proficiency
interviews. In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches
to the assessment of oral proficiency (pp. 207-242). Philadelphia, PA: John Benjamins.
Young, R. (1995). Discontinuous language development and its implications for oral proficiency
rating scales. Applied Language Learning, 6, 13-26.
Young, R. (1995). Conversational styles in language proficiency interviews. Language Learning,
45(1), 3-42.
Young, R., & Milanovic, M. (1992). Discourse variation in oral proficiency interviews. Studies
in Second Language Acquisition, 14(4), 403-424.