
Testing Oral Proficiency: Difficulties and Methods
Although testing language has traditionally taken the form of testing knowledge
about language, the idea of testing communicative competence is becoming recognized as
being of great importance in second language learning. In testing communicative
competence, speaking and listening tasks are commonly used; these involve tasks such as
the completion of an information gap and role play (Kitao & Kitao, 1996).
As language teachers, it is important for us to enhance the students’ delivery
skills, increase their confidence, and develop their methods of organization and critical
thinking skills. On the other hand, as language testers, it is necessary to establish a careful
research design and conduct precise measurement to determine if these goals have been
met. The oral communication field needs a clear-cut method of evaluation as can be
found in discrete language skill classes such as listening comprehension (Nakamura &
Valens, 2001). Language teachers and language testers need a method which takes
subjective qualitative observations and then transforms them into objective quantitative
measurements.
In testing oral proficiency, or the oral skills of second language learners, four
components are emphasized: vocabulary, grammar, semantics, and phonology. Accurate
assessment of limited-English-speaking learners requires a total
description of the communication skills, linguistic structures, and functional usage of the
learner’s language within all social domains (Silverman, Noa, & Russel, 1977).
A critical issue in the assessment is the selection of criteria for evaluating
performance. Stiggins (as cited in Butler & Stevens, 1997) points out that the selection of
these criteria should be one of the first steps in designing performance assessments.
Students should understand ahead of time what is expected of them and, whenever
possible, should actually help determine the basis on which their performance will be judged.
When students are actively involved in establishing assessment criteria for tasks, they not
only have a better understanding of what is expected of them when they perform the
tasks, but they can also more fully appreciate why the criteria are important (Butler
& Stevens, 1997).
This paper is divided into two sections. The first provides a brief description of
the difficulties that testers of speaking skills encounter. The second presents different
methods and approaches to testing speaking skills and oral proficiency in second
language learning.
Difficulties in testing speaking skills:
Speaking is probably the most difficult skill to test. It involves a combination of
skills that may have no correlation with each other, and which do not lend themselves
well to objective testing. Kitao and Kitao (1996) note that there are not yet good
answers to questions about the criteria for testing these skills and the weighting of these
factors.
It is possible to find people who can produce the different sounds of a foreign
language appropriately yet lack the ability to communicate their ideas effectively.
This is one of the difficulties that testers encounter when testing learners' oral
production. However, the opposite situation occurs as well: some people can express
their ideas clearly but cannot pronounce all the sounds correctly.
Another difficulty is the administration of speaking tests: it is hard to test large
numbers of learners in a relatively short time, so the examiner of oral production is put
under great pressure (Heaton, 1988).
The next difficulty discussed here is that speaking and listening skills are very
much related to each other; it is difficult to separate them. In most cases, there is an
interchange between listening and speaking, and speaking appropriately depends on
comprehending spoken input. Therefore, this has an impact on testing speaking because
the testers will not know whether they are testing purely speaking or speaking and
listening together.
Finally, the assessment and scoring of speaking skills is one of the biggest
problems. Where possible, it is better to record the examinees' performances so that
scoring can be done later by listening to the tape.
The aspects of speaking that are considered part of its assessment include
grammar, pronunciation, fluency, content, organization, and vocabulary (Kitao & Kitao,
1996).
Depending on the situation and the purpose of the test, testers need to choose the
appropriate methods and techniques of testing. The next section will discuss some of
these methods.
Methods of testing oral proficiency:
The assessment of performance-based tests of oral proficiency on the basis of
ACTFL levels:
Kenyon (1998) claims that performance tasks are the foundation of any
performance-based assessment. The task he describes is the open-ended stimulus that
serves to elicit the examinee's performance to be evaluated: for example, an oral
response to an interviewer's questions, a role-play, or a physical response to
instructions given to the examinee in the target language. His study
was based on the Speaking Proficiency Guidelines of the American Council on the
Teaching of Foreign Languages (ACTFL). The framework of that study is that an
examinee's proficiency level is determined by his or her ability to accomplish speaking
tasks associated with the different levels of proficiency defined by the Guidelines.
The Guidelines define four main levels of proficiency: Novice, characterized by the
ability to communicate minimally in highly predictable situations with previously
learned words; Intermediate, the ability to initiate, sustain, and close basic
communicative tasks; Advanced, the ability to converse fluently in a participatory
fashion; and Superior, the ability to participate effectively in most formal and informal
conversations on practical, social, professional, and abstract topics. Between these four
main levels there are sublevels as well.
In conducting that study, a number of students were tested by asking them to
perform tasks designed in accordance with the ACTFL Guidelines. The examinees were
allowed to demonstrate the characteristics of speech at the main levels from Intermediate
to Superior. After analyzing the results, the author concludes that it is important to
construct clear, unambiguous tasks on performance-based assessments so that all salient
features of the task are apparent to the examinee.
Kenyon states that “When students are asked to give a linguistic performance in
response to a task in a testing situation, it is paramount that the directions to the
performance-based task be absolutely clear.”
Monologue, Dialogue, and Multilogue speaking tests:
In their report, Nakamura and Valens (2001) describe a study of Japanese
graduate students at Keio University. They used three different types of speaking tests.
The first is the Monologue Speaking Test, also called the presentation, in which students
were asked to perform tasks such as "show and tell," talking about anything they chose;
this gives students an opportunity to make a small presentation. The second type is the
Dialogue Speaking Test, also known as the interview. It is an open-ended test in which
the students lead a discussion with the teacher, and students are required to use
conversation skills they have learned throughout the course. The third type is the
Multilogue Speaking Test, also called discussion and debating. Here, the discussions are
student-generated: students are put into groups and, as a group, decide on a topic they
feel would be of interest to the rest of the class.
The evaluation criteria used in that study were as follows:
Evaluation Items:
Eye contact
Ability to explain an idea
Discussing and debating:
Able to be part of the conversation to help it flow naturally
Uses fillers/ additional questions to include others in conversation
Transfers skills used in dialogues to group discussions
The rating scale ranged from poor (1) to good (4).
The findings of their study reveal that, of the three test types, the discussion test was
the most difficult, followed by the interview test and then the presentation test.
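A rubric of this kind lends itself to simple aggregation into the kind of objective quantitative score discussed above. As a minimal sketch (assuming a plain unweighted average, which the study does not specify, and using illustrative item names rather than the authors' exact rubric), the 1-4 ratings might be combined as follows:

```python
# Minimal sketch of turning qualitative rubric ratings into a quantitative
# score, assuming a 1 (poor) to 4 (good) scale as in Nakamura & Valens (2001).
# Item names and the unweighted average are illustrative assumptions.

def score_performance(ratings):
    """Average a dict of rubric ratings, each an integer from 1 to 4."""
    for item, value in ratings.items():
        if not 1 <= value <= 4:
            raise ValueError(f"rating for {item!r} must be between 1 and 4")
    return sum(ratings.values()) / len(ratings)

ratings = {
    "eye contact": 3,
    "ability to explain an idea": 4,
    "helps the conversation flow naturally": 2,
}
print(score_performance(ratings))  # 3.0
```

A weighted average could be substituted where a tester judges some criteria (say, intelligibility) to matter more than others.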
In the context of Saudi Arabia, I believe that the types of tests discussed above
could be successfully used in the assessment of students who are learning English for
specific purposes, such as studying at a university abroad where the first language is
English, or working in an environment where English is widely used, such as banks or
hospitals. Learners with these intentions need to master the skills that were tested in the
study mentioned above.
Testing speaking using visual material:
Without the need to comprehend spoken or written material, it is possible to test
speaking using pictures, diagrams, and maps. Through careful selection of material,
testers can control the vocabulary and grammatical structures required. Visual materials
of different types and difficulty levels are available to suit all levels of learners. One
common stimulus is a series of pictures showing a story, which the testee must describe
by putting together a coherent narrative. Another way to use such material is to give the
pictures of the story, in random order, to a group of testees. The students decide on the
sequence of the pictures without showing them to each other, and then put them down in
the order that they have decided on. They then have the opportunity to reorder the
pictures if they feel it is necessary.
Another way of using a visual stimulus is to give two testees similar pictures
with slight differences between them; without seeing each other's pictures, they
describe their own in order to figure out the differences. However, there is a problem
with using visual stimuli in testing speaking: the materials chosen must be something
that all testees can interpret equally well, since if one testee has difficulty understanding
the visual information, it will influence the way he or she is evaluated (Kitao & Kitao,
1996).
The portfolio approach:
Butler and Stevens (1997) state that “O’Malley and Pierce (1996) suggest that the
portfolio approach in the case of an expanded oral profile, widely used for assessing
reading and writing can also be used effectively to assess oral language.” Profile or
portfolio information, reviewed periodically, can be used to enhance teaching and
learning for students and to communicate to students, parents, and other teachers what
students can already do and what they need to improve. A teacher would systematically
collect and record a variety of oral language samples for students that would help
capture the range of their communicative abilities.
Samples of students’ oral language tasks may come from individual tasks or from
group or interactive tasks. Individual tasks are those that students perform alone, such as
giving a prepared report in front of the class or expressing an opinion about a current
event. Group tasks require students to interact with other students in the accomplishment
of a variety of goals, such as debates, group discussions, role plays, or improvised drama.
Both categories of tasks are important in providing students with a range of activities that
stretch their speaking abilities and in helping them to focus on adjusting their speech to
the audience. In selecting oral samples for a profile, teachers would also consider the
continuum of formal and informal language that is represented in the classroom.
The Taped Oral Proficiency Test:
In this approach, the learners' performances are recorded on tape and then assessed
later by the examiner. This method has some advantages and some disadvantages.
According to Cartier (1980), one disadvantage of the taped test is that it is less personal;
the examinee is talking to a machine and not to a person. Another disadvantage is its
low validity. Moreover, the taped test is inflexible; if something goes wrong during
the recording, it is virtually impossible to adjust for it. On the other hand, there are some
advantages of that type of test. It can be given to a group of learners in a language lab, it
is more standardized and more objective since each student receives identical stimuli, and
scoring can be performed at the most convenient or economical time and location.
I believe that the taped test method is very practical when it comes to testing large
numbers of students, where the examiner would not have enough time to assess each one
individually. However, some institutions do not have enough language labs, which in
turn creates a real difficulty for testers.
Previous literature on classroom testing of second language speech skills provides
several models of both task types and rubrics for rating, and suggestions regarding
procedures for testing speaking with large numbers of learners. However, there is no
clear, widely disseminated consensus in the profession on the appropriate paradigm to
guide the testing and rating of learner performance in a new language, either from second
language acquisition research or from the best practices of successful teachers. While
there is similarity of descriptors from one rubric to another in professional publications,
these statements are at best subjective. Thus, the rating of learners' performance rests
heavily on individual instructors' interpretations of those descriptors (Pino, 1998).
In spite of the difficulties inherent in testing speaking, a speaking test can be a
source of beneficial backwash. Unless speaking is tested at a very low level, such as
reading aloud, testing it encourages the teaching of speaking in classes.
In my opinion, testing speaking skills can be a very interesting experience, and
it gives teachers an opportunity to be creative in selecting test items and materials.
Moreover, it has a great impact on students: they enjoy taking the test and feel
comfortable doing so if the teacher chooses materials that interest them and that are
suitable to their age and level of knowledge.
References:
Butler, F. A., & Stevens, R. (1997). Oral language assessment in the classroom.
Theory Into Practice, 36(4), 214-219.
Cartier, F. A. (1980). Alternative methods of oral proficiency assessment. In J. R.
Firth (Ed.), Measuring spoken language proficiency (pp. 7-14). GA:
Georgetown University.
Heaton, J. B. (1988). Writing English language tests. Longman.
Kenyon, D. (1998). An investigation of the validity of task demands of
performance-based tests. In A. J. Kunnan (Ed.), Validation in language
assessment: Selected papers from the 17th language testing research
colloquium (pp. 19-40). Mahwah, NJ: Lawrence Erlbaum Associates.
Kitao, S. K., & Kitao, K. (1996). Testing speaking (Report No.TM025215). (ERIC
Document Reproduction Service No. ED398261)
Kitao, S. K., & Kitao, K. (1996). Testing communicative competence. (ERIC
Document Reproduction Service No. ED398260)
Nakamura, Y., & Valens, M. (2001). Teaching and testing oral communication
skills. Journal of Humanities and Natural Sciences, 3, 43-53.
Pino, B. G. (1998). Prochievement testing of speaking: Matching instructor
expectations, learner proficiency level, and task types. Texas Papers in
Foreign Language Education, 3(3), 119-133.