Improving Speech Applications With Usability Surveys

PG 1
Improving Speech Applications with Usability Surveys
How does Nortel measure the ‘Usability Pulse’ of Self Service?
Judith Sherwood
Sales Engineer
Nortel Self Service Solutions
PG 2
[Chart: many usability test methods plotted by Number of Participants vs. Data from Each Participant: Focus Groups, Employee Test Calls, Follow-Up (call-back) Surveys, Usability Survey, Live Pilot]
PG 3
What is a Usability Survey?
• Usability testing is an evaluation of a customer touch-point from a user perspective
• Typically conducted using small focus groups (12-20 subjects) in a controlled studio environment
PG 4
What is a Usability Survey?
• Methodology: a Usability Survey using 200 to 500+ panelists, enough to reveal problems that affect only a small share of callers
o Panelists are chosen from a large pool of over 80,000 with known demographics
PG 5
Size Matters!
• Traditional usability testing methods have sample-size limitations (see the sketch below)
• If a problem affects only 5% of the users:
o A 10-call sample has a 40% chance of finding it
o A 100-call sample has a 99% chance of finding it
• If a problem affects only 1% of the users:
o A 100-call sample has a 63% chance of finding it
o A 500-call sample has a 99% chance of finding it
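These chances follow from a simple binomial argument: if a problem independently shows up in each call with probability p, an n-call sample surfaces it at least once with probability 1 − (1 − p)^n. A minimal sketch reproducing the slide's numbers (Python used for illustration; not part of the original deck):

```python
def detection_probability(p: float, n: int) -> float:
    """Chance that an n-call sample hits, at least once, a problem
    that affects a fraction p of all calls."""
    return 1.0 - (1.0 - p) ** n

for p, n in [(0.05, 10), (0.05, 100), (0.01, 100), (0.01, 500)]:
    print(f"p={p:.0%}, n={n:3d}: {detection_probability(p, n):.0%}")
# p=5%, n= 10: 40%
# p=5%, n=100: 99%
# p=1%, n=100: 63%
# p=1%, n=500: 99%
```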
PG 6
Methodology
 Panelist places a call, completes a task, and then fills out a questionnaire on an internet website
 Entire call conversation is recorded for analysis
 Survey ties the individual caller experience and questionnaire response to the call recording
[Process diagram: Design Tasks & Questionnaire, Panelist Pool, Call-in Platform, Target Service, Post-Questionnaire, Analysis & Recommendations Report]
PG 7
Methodology
 More efficient, less expensive, faster execution, broader feedback
 Panelist recruitment and the call-in campaign are outsourced
 Analysis provides recommendations for service improvement
 Listen to problem calls to suggest ways to fix them
[Process diagram as on the previous slide: Design Tasks & Questionnaire, Panelist Pool, Call-in Platform, Target Service, Post-Questionnaire, Analysis & Recommendations Report]
PG 8
Usability Survey Grading Process
• Percentile-based letter grading system to compare against other speech applications
• Raw Scores are based on:
o Caller Satisfaction (% very satisfied − % dissatisfied − % very dissatisfied)
o Task Completion (% who finish the task in one call)
o Consistency (variability in call length)
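A minimal sketch of how these raw scores might be computed from survey tallies (Python for illustration; the deck gives the formulas but no code, and using the sample standard deviation as the call-length variability measure is an assumption):

```python
import statistics

def satisfaction_score(pct_very_sat: float, pct_dissat: float,
                       pct_very_dissat: float) -> float:
    """Raw satisfaction: % very satisfied - % dissatisfied - % very dissatisfied."""
    return pct_very_sat - pct_dissat - pct_very_dissat

def completion_score(pct_accomplished: float, one_call_fraction: float) -> float:
    """% who accomplished the objective, scaled by the one-call fraction."""
    return pct_accomplished * one_call_fraction

def consistency(call_lengths_sec: list[float]) -> float:
    """Variability in call length; standard deviation assumed here."""
    return statistics.stdev(call_lengths_sec)

# Numbers from the case study's initial survey (following slides):
print(satisfaction_score(12, 29, 12))           # -29 -> grade D
print(round(completion_score(100 - 17, 0.83)))  # 69  -> grade D
```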
PG 9
Lessons Learned: A Case Study
• “Acme”: a regional Managed-Care Health Insurance company. Customer Service is available for Members and Healthcare Providers
• Available Tasks in Self-Service:
o Members:
• Check Co-Pay amount and Physician Name
• Order a replacement ID card
o Providers:
• Check Claim Status
• Verify Member Status and Co-Pay
• Initial Usability Survey, then Tuning, followed by a 2nd Survey
PG 10
Initial Results
1.0 How easy was it for you to accomplish your objective in this call?
Very easy: 16% (85 Panelists)
Easy: 27% (146 Panelists)
Neither easy nor difficult: 15% (79 Panelists)
Difficult: 21% (110 Panelists)
Very difficult: 4% (22 Panelists)
I did not accomplish my objective in this call: 17% (93 Panelists)
• Call Completion Grade = D
• Call Completion Score = % who accomplished the objective in one call only
• Call Completion Score = (100 − 17) × 0.83 = 69% (83% accomplished the objective; of those, 83% did so in a single call)

2.0 How satisfied were you with your overall experience?
Very satisfied: 12% (66 Panelists)
Satisfied: 33% (174 Panelists)
Neither satisfied nor dissatisfied: 14% (73 Panelists)
Dissatisfied: 29% (155 Panelists)
Very dissatisfied: 12% (66 Panelists)
No Response: 0.2% (1 Panelist)
• Satisfaction Grade = D
• Satisfaction Score = %VS − %D − %VD
• Satisfaction Score = +12 − 29 − 12 = −29%
PG 11
After Tuning
1. How easy was it for you to accomplish your objective in this call?
Very easy: 52% (285 Panelists)
Easy: 25% (136 Panelists)
Neither easy nor difficult: 8% (44 Panelists)
Difficult: 6% (35 Panelists)
Very difficult: 1% (6 Panelists)
I did not accomplish my objective in this call: 7% (37 Panelists)
No Response: 1% (3 Panelists)
• Call Completion Grade = C
• Call Completion Score = (100 − 7) × 0.86 = 80%

2. How satisfied were you with your overall experience?
Very satisfied: 50% (275 Panelists)
Satisfied: 30% (165 Panelists)
Neither satisfied nor dissatisfied: 8% (46 Panelists)
Dissatisfied: 8% (46 Panelists)
Very dissatisfied: 2% (13 Panelists)
No Response: 0.2% (1 Panelist)
• Satisfaction Grade = A
• Satisfaction Score = +50 − 8 − 2 = +40%
PG 12
UI Overall Improvement After Tuning
• How easy was it for you to accomplish your objective in this call?

Initial Results (69% Call Completion):
Very easy: 16% (85 Panelists)
Easy: 27% (146 Panelists)
Neither easy nor difficult: 15% (79 Panelists)
Difficult: 21% (110 Panelists)
Very difficult: 4% (22 Panelists)
I did not accomplish my objective in this call: 17% (93 Panelists)

After Tuning (80% Call Completion):
Very easy: 52% (285 Panelists)
Easy: 25% (136 Panelists)
Neither easy nor difficult: 8% (44 Panelists)
Difficult: 6% (35 Panelists)
Very difficult: 1% (6 Panelists)
I did not accomplish my objective in this call: 7% (37 Panelists)
No Response: 1% (3 Panelists)

• How satisfied were you with your overall experience?

Initial Results (−29% Satisfaction Score):
Very satisfied: 12% (66 Panelists)
Satisfied: 33% (174 Panelists)
Neither satisfied nor dissatisfied: 14% (73 Panelists)
Dissatisfied: 29% (155 Panelists)
Very dissatisfied: 12% (66 Panelists)
No Response: 0.2% (1 Panelist)

After Tuning (+40% Satisfaction Score):
Very satisfied: 50% (275 Panelists)
Satisfied: 30% (165 Panelists)
Neither satisfied nor dissatisfied: 8% (46 Panelists)
Dissatisfied: 8% (46 Panelists)
Very dissatisfied: 2% (13 Panelists)
No Response: 0.2% (1 Panelist)
PG 13
Best Practice 1:
Catch Recognition Problems Quickly
• Look for the Red Flags of voice recognition problems:
o Low satisfaction and call-completion scores
o Low voice-recognition rating scores
o Check for complaints in the free responses
o Listen for “can’t hear” problems among low-score panelists
• Work with the developer and IT support (a triage sketch follows below):
o Is the speech level strong enough in the IVR? Check switch gain levels
o Then check speech detector parameters
o Then check recognition confidence thresholds and grammars
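Because the survey ties each questionnaire response to its call recording, this red-flag triage can be scripted. A minimal sketch, assuming survey results are exported as records with hypothetical field names (nothing here is from the deck itself):

```python
# Hypothetical export: one dict per survey call, ratings on a 1-5 scale
# (1 = worst), each tied to its call recording.
surveys = [
    {"panelist": 101, "satisfaction": 1, "recognition": 2,
     "free_text": "It could not hear me at all", "recording": "call_101.wav"},
    {"panelist": 102, "satisfaction": 5, "recognition": 5,
     "free_text": "", "recording": "call_102.wav"},
]

def recognition_red_flags(records, threshold=2):
    """Yield calls worth listening to: low satisfaction plus low
    recognition rating, or a "can't hear" complaint in free text."""
    for r in records:
        low_scores = (r["satisfaction"] <= threshold
                      and r["recognition"] <= threshold)
        cant_hear = "hear" in r["free_text"].lower()
        if low_scores or cant_hear:
            yield r["panelist"], r["recording"]

for panelist, recording in recognition_red_flags(surveys):
    print(f"Review {recording} (panelist {panelist})")  # flags call_101.wav
```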
PG 14
Voice Recognition Issues
• How well or poorly did the system recognize your responses when you spoke the answer to questions?

Initial Results:
Very well: 11% (59 Panelists)
Well: 21% (110 Panelists)
Neither well nor poorly: 12% (66 Panelists)
Poorly: 34% (183 Panelists)
Very poorly: 22% (116 Panelists)
No Response: 0.2% (1 Panelist)

After Tuning:
Very well: 47% (257 Panelists)
Well: 31% (167 Panelists)
Neither well nor poorly: 8% (46 Panelists)
Poorly: 10% (57 Panelists)
Very poorly: 3% (16 Panelists)
No Response: 1% (3 Panelists)

• How quickly or slowly did the system respond to your spoken answers?

Initial Results:
Very quickly: 16% (88 Panelists)
Quickly: 44% (238 Panelists)
Neither quickly nor slowly: 21% (113 Panelists)
Slowly: 13% (72 Panelists)
Very slowly: 4% (23 Panelists)
No Response: 0.4% (2 Panelists)

After Tuning:
Very quickly: 46% (249 Panelists)
Quickly: 41% (225 Panelists)
Neither quickly nor slowly: 8% (46 Panelists)
Slowly: 3% (16 Panelists)
Very slowly: 1% (8 Panelists)
No Response: 0.2% (1 Panelist)

• Fix: Increased digital gain from the host switch
PG 15
Best Practice 2:
Spot Prompt Clarity Confusions
• Look for the Red Flags of prompt confusion:
o Low satisfaction and call-completion scores
o Low ‘What-to-Speak’ scores
o Listen for caller hesitations among low-score panelists
• Work with the Dialog Designer:
o Let callers know ahead of time that they can speak
o Reword prompts; callers appreciate clear choices
o Give touch-tone options when reprompting
o Coach your voice actor
PG 16
Voice-Prompt Issues
• How appropriate or inappropriate was the speaking style and voice of this service?

Initial Results:
Very appropriate: 32% (172 Panelists)
Appropriate: 52% (276 Panelists)
Neither appropriate nor inappropriate: 11% (61 Panelists)
Inappropriate: 3% (17 Panelists)
Very inappropriate: 1% (7 Panelists)
No Response: 0.4% (2 Panelists)

After Tuning:
Very appropriate: 58% (319 Panelists)
Appropriate: 36% (198 Panelists)
Neither appropriate nor inappropriate: 4% (20 Panelists)
Inappropriate: 1% (5 Panelists)
Very inappropriate: 0.2% (1 Panelist)
No Response: 1% (3 Panelists)

• Was it clear what you needed to select or say at each step of the call?

Initial Results:
Clear for all steps: 27% (142 Panelists)
Clear for almost all steps: 27% (147 Panelists)
Clear for some steps: 24% (129 Panelists)
Clear for only a few steps: 16% (88 Panelists)
Clear for no steps: 5% (28 Panelists)
No Response: 0.2% (1 Panelist)

After Tuning:
Clear for all steps: 58% (315 Panelists)
Clear for almost all steps: 29% (160 Panelists)
Clear for some steps: 8% (45 Panelists)
Clear for only a few steps: 4% (21 Panelists)
Clear for no steps: 1% (4 Panelists)
No Response: 0.2% (1 Panelist)
PG 17
Voice-Prompt Issues
• Fix: Clarify prompt choice wording
o Initial: “Are you a member, provider, or a non-member looking for information?”
o After Tuning: “First, tell me who you are: a member, a provider, or a non-member.”
• Fix: It’s not just what you say, but how you say it.
o Coach your voice talent for proper inflections.
PG 18
Best Practice 3:
Spot Call-Flow Frustrations
• Look for the Red Flags of call-flow frustration:
o Low satisfaction and call-completion scores
o Low ‘Enough-Choices’ scores
o Check for complaints about inflexible systems
o Listen for caller frustration among low-score panelists
• Work with the Dialog Designer:
o Look for easy ways to complete repetitive tasks
o Provide an easy exit strategy
o Leverage the spoken-language instinct
PG 19
Call-Flow Issues
• When you were given menu choices, were you given too many, just enough, or too few?

Initial Results:
Too many choices: 11% (61 Panelists)
Just enough choices: 74% (395 Panelists)
Too few choices: 15% (78 Panelists)
No Response: 0.2% (1 Panelist)

After Tuning:
Too many choices: 10% (55 Panelists)
Just enough choices: 85% (466 Panelists)
Too few choices: 4% (22 Panelists)
No Response: 1% (3 Panelists)
PG 20
Call-Flow Issues
• Fix: To clarify options, provide an anchor, split Main Menu choices, and add grammar synonyms (see the sketch after this slide)
o Initial: “Alright, I’m going to tell you the things I can help you with. When you hear the right one, just say it… Verify member status and office co-pays. Get status of a claim. Get member ID cards. Change PCP. Or, order forms and literature.”
o After Tuning: “Alright, Main Menu. Please say one of the following options at any time… Verify member status. Check PCP co-pay. Get claim status. Order ID cards. Change PCP. Order forms. Or, order literature…”
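One way to picture the grammar-synonyms fix: accept several phrasings for each menu option and map them all to one canonical action. A minimal sketch, assuming the recognizer hands back a text transcription; the synonym lists are illustrative, not Acme's actual grammar:

```python
# Illustrative synonym grammar: caller phrasings -> canonical menu action.
SYNONYMS = {
    "get claim status": ["claim status", "check a claim", "status of a claim"],
    "order id cards": ["order id cards", "member id cards", "replacement id card"],
    "check pcp co-pay": ["pcp co-pay", "office co-pay", "co-pay"],
}

def canonical_action(utterance: str) -> str | None:
    """Map a recognized utterance to a canonical menu action, else None."""
    text = utterance.lower()
    for action, phrasings in SYNONYMS.items():
        if any(p in text for p in phrasings):
            return action
    return None  # no match: fall through to a reprompt

print(canonical_action("I want the status of a claim"))  # get claim status
```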
PG 21
Best Practice 4: Spot Task Differences
• Look for the Red Flags of task differences:
o Low satisfaction and call-completion scores for specific tasks
o Check for complaints about complicated tasks
o Listen for caller confusion in low-scoring tasks
• Work with the Dialog Designer (a per-task scoring sketch follows below):
o Look for easy ways to complete repetitive tasks
o Make sure task instructions are clear
o Provide an easy operator exit strategy
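Task differences surface when the satisfaction score is computed per task rather than over all calls, as on the next two slides. A minimal sketch, assuming each survey record carries the task the panelist was assigned (field layout and numbers hypothetical):

```python
from collections import defaultdict

# Hypothetical records: (task, rating) on the deck's 5-point scale
# (5 = very satisfied ... 1 = very dissatisfied).
responses = [
    ("Claim Status", 2), ("Claim Status", 1), ("Claim Status", 4),
    ("Verify Status", 5), ("Verify Status", 4), ("ID Cards", 5),
]

def per_task_satisfaction(records):
    """Per-task satisfaction score: %very sat - %dissat - %very dissat."""
    buckets = defaultdict(list)
    for task, rating in records:
        buckets[task].append(rating)
    scores = {}
    for task, ratings in buckets.items():
        n = len(ratings)
        share = {r: 100 * ratings.count(r) / n for r in (5, 2, 1)}
        scores[task] = share[5] - share[2] - share[1]
    return scores

print(per_task_satisfaction(responses))
# {'Claim Status': -66.7, 'Verify Status': 50.0, 'ID Cards': 100.0}
```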
PG 22
Task Differences
• Relative task satisfaction can change after tuning
o A “can’t hear” problem can swamp other UI task issues
o Claim Status satisfaction was low for business reasons (no shortcut for multiple claims)
PG 23
Task Differences
• Fixes
o Improve recognition to shift all scores up and reveal other UI issues
o Future: give clearer exit menu options
o Future: for Claim Status, offer a streamlined repeat-claim function
[Chart: Satisfaction Score (−75 to +75) for each task (Claim Status, Verify Status, ID Cards, Office Co-Pay), before vs. after tuning]
PG 24
Conclusions
• The Usability Survey method is very effective for tuning applications prior to pilot production
o Collect hundreds of calls and analyze the results efficiently
• Recognition, prompt, and call-flow issues are revealed quickly in a Usability Survey
o Listen to the calls for which the various ratings are low
• Task differences show the effect of task consistency, complexity, and business rules on usability quality
o Longer tasks require more call-flow efficiency and friendly hand-holding
o Transfer to an agent for business reasons can sometimes lower satisfaction if callers must first spend a long time in self-service, or if the transfer takes place with no explanation
PG 25
Thank You!