Lies About Learning Research

Introduction
This chapter’s title is “lies about research.” Let’s start by being clear about what we mean
by “lies,” because a central argument I’m going to make is that our profession has a
propensity for inaccuracy. In the simplest terms, a lie is a falsehood. In that sense,
anything that ultimately turns out to be untrue could be interpreted as a lie. However, in
the messy world of social science and corporate learning, such a standard isn’t practical.
Indeed, even within less messy fields such as medicine, some 50% of peer-reviewed
studies end up being refuted by subsequent studies. Perhaps a more precise definition of a
“lie” implies intent to deceive. The falsehood has what Immanuel Kant would have called
a deontological component to it. It isn’t simply a falsehood; it is presenting a falsehood in
order to deceive that is at the heart of lying. Sadly, because of its messiness and our own
lack of conviction about the efficacy of what we do, our space is fraught with
well-intentioned falsehoods.
My focus in this chapter is on this notion of falsehoods designed to veneer over the
messiness of our world with respect to research. So I will follow the thought I started
above and be clear about what I mean by research. At its core, research is simply a
systematic investigation. This matters to “practitioners” because there is a false
dichotomy that is often presented and that, if believed, becomes a lie – that research and
practice are somehow at odds. A corollary that is also a falsehood is the notion of a
tension between theory and practice. This notion, if you really tease at it, quickly becomes
obviously false. Everything you do is grounded in a theory, and when you do what you
do, you research it. You simply can’t get around it.
To finish the thought, nothing is as “practical” as a theory, because a theory is simply a
way of explaining something. Take physics. There are three theories that bear on gravity –
quantum mechanics, Newtonian mechanics, and Einstein’s relativity. None of them is a
“truth.” Each is backed by “research” (i.e., a systematic investigation that led to some
empirical evidence that supports the theory); each also has research that refutes it. It turns
out quantum mechanics does a great job of describing how very small things behave but a
poor job of explaining how gravity works for very large objects like galaxies, and
Einstein’s work does the exact opposite. It turns out Newton did a great job for the likes
of you and me. It turns out all three of these theories are very practical, and we use
them – or products based on them – every day.
My point is that this chapter will explore how lies affect how we interpret our systematic
investigation of our work as learning professionals. Research is far too important to be
pejoratively relegated to the Ivory Tower. Each of us is theoretical, and each of us
conducts and consumes research daily. And if we buy Mark Twain’s notion of “lies,
damned lies, and statistics,” our consumption and production of research, if we aren’t
careful, can get in the way of our success. We need to be informed consumers and
producers of research related to corporate learning. We need to know when we are being
deceived by vendors, and we need to be careful when we make proclamations about
our own success, because in a knowledge economy, how we develop people is important
to our companies, to our employees, and to society.
That may seem like a bold statement, but I firmly believe that what you do matters and,
consequently, the stakes are quite high. I hope by the end of this chapter I’ve walked you
through the reasoning that leads me to urge you to be thoughtful when it comes to how
you make claims about what you do and to be careful when you are evaluating what
others say they do. Treat this as a hypothesis for which I have presented some evidence
for you to evaluate. I will end the chapter with some practical things you can do as a
learning professional and a call to arms for us as a community. But let me start by urging
you to be a doubting Thomas even about this chapter itself. Do not take anything I state
prima facie, but investigate it more thoroughly yourself!
Research, Theory & You
You are already a researcher and, perhaps even more uncomfortably, you are a learning
theorist; the question is how aware you are that you do these two things and how good
you are at theorizing about and researching corporate learning. In general, putting aside
the content of what you are developing as an intervention, you operate at the intersection
of two theoretical concepts. The first is human capital theory, which basically says that
you can make people more productive, ceteris paribus, if you invest in them through
some sort of development. Everything you do is predicated on this one theory. It has
ample evidence supporting it, along with some evidence suggesting that a competing
theory, most commonly known as “signaling,” may also play a role in obfuscating the
impact of your development in terms of increasing human capital.
Given that, it is worth taking a moment to discuss this alternative theory. It basically says
that development is more a signal to the market than something that actually increases
performance. A classic example is whether a place like Harvard makes people better by
developing them, or whether the students are so talented and hard-working to begin with
that all Harvard is doing is serving as a signal of their innate abilities. As an aside, each
theory has a Nobel laureate in economics behind it. It may be troubling to you to keep
both in your head at the same time, but F. Scott Fitzgerald said the sign of a first-rate
intelligence is the ability to hold competing ideas in your head simultaneously!
Good examples of this obfuscation are things like high-potential (“Hi-Po”) programs.
Being selected by a company to be part of a high-potential program may be doing two
things that, from a design perspective, compete with each other. It may be developing
people, but it also sends a signal – to the company, to other employees, and to the
participants themselves – about that person’s talent, separate from anything they gain in
the program.
In terms of learning theories, it is important to own your assumptions because they
impact your design and your evaluation. There are three competing theories of learning,
each (like the physics example above) with significant empirical evidence to support it –
behaviorist, cognitivist, and socio-cultural.
Every time you design or deliver a program, you are making theoretical assumptions;
they are embedded in how you build your program. And if you are evaluating the
learning, even if it is with the proverbial ‘smiley sheet,’ you are conducting research. But
if you are not clear and explicit, chances are you’re not maximizing the impact of what
you do. I’m going to end this section with a couple of definitions. First, “evidence,” taken
from the OED: “Grounds for belief; testimony or facts tending to prove or disprove any
conclusion. And/or Information, whether in the form of personal testimony, the language
of documents, or the production of material objects, that is given in a legal investigation,
to establish the fact or point in question.” And second, “empiricism”: “Practice founded
upon experiment and observation.” These are concepts critical to your success.
The Bar for Truth in Our World
Now that we are aware that we are all theorists and researchers, let’s talk about the
paradigms we use to empirically evaluate our evidence. Two models prevail in our field.
It is interesting to note that neither gets any significant traction among top researchers as
a particularly compelling empirical approach, but both get significant traction among
corporate learning professionals.
The most common way of evaluating the efficacy of what we do is the Kirkpatrick
model. Conceptually this is quite elegant, though it might be more useful if we thought
of the levels not as ordinal (i.e., 4 is not “better” than 1) but rather as categorical (i.e., 4 is
“different” from 1). My largest concern with how I see this model adopted is that
everyone aspires to reach the top level and no one pays attention to how they measure
and analyze their data; that is the key to research, a “systematic” investigation. As an
example, a case study of ROI is not a way that a financial analyst would gather empirical
evidence to measure ROI. I’ll spend more time on this later in the chapter.
The other quite popular approach is Brinkerhoff’s Success Case Method, which really is
about telling success stories. Conceptually, one is not looking for typical outcomes but,
rather, extraordinary outcomes. So if one can find one instance where something amazing
happened, that is the story to tell in this methodology. This is akin to telling the story of
someone who survived in the wilderness for weeks on nothing and then planning your
vacation accordingly. I find this approach compelling in terms of advocacy, but less
compelling in terms of research, since it presents as a typical outcome something that is
atypical – something that isn’t the norm.
Some Current Whoppers in the World of Corporate Research
Putting aside what should be obvious examples – advertisements and testimonials about
the efficacy of some software or training – let me give some examples I’ve seen that are
pervasive in our profession. Broadly speaking, we can think of these as “lies” when they
are used as a basis for our own work or when people use them to sell to us.
Three of the most popular business books of all time are fraught with lies of various
sorts, and yet they remain popular because we continue to buy them and believe them.
Perhaps the most nefarious is In Search of Excellence, where there was at least some
evidence that the data were faked. Another, The Seven Habits of Highly Effective People,
is largely aspirational and based on Covey’s strong religious grounding and ethics. He
basically argues that if people behaved more in this way, companies would be better off,
but there wasn’t any thorough empirical testing of the hypothesis. Let me be clear: I think
his book is inspirational in its aspirations, but it has been interpreted as something that it
is not.
The final example is Collins’s Good to Great. This book is simply guilty of poor research
design. The way Collins got his evidence was to look for patterns in the behavioral traits
of leaders of successful companies, and he then found patterns. But he then argues that
these patterns are generalizable – in other words, that any leader who adopted these traits
would make her company great. The best way to investigate the question, once the
behaviors were identified, would have been to run a randomized trial. That is obviously
impractical, but one could have at least run a broader sample to see whether the same
behaviors existed in companies that weren’t so great, or whether endogeneity was present
(this is a kind of methodological chicken-and-egg question – do leaders make companies
great, or vice versa?). This example may seem trite, so let me give two others that may
illustrate its import; one is borderline racist and the other has led to many needless
deaths. One could look for patterns in the traits of the best basketball teams and then
generalize that one simply needs tall African-Americans in order to put together a great
team. Or one could find that drug use is quite prevalent among musicians and conclude
that all one needs to do to become a rock star is consume illicit drugs.
Now let’s move on to examples grounded purely in practice. Let’s start with an example
that, if you have ever been presented with it, means you’ve been lied to – the notion of
learning styles. What is so interesting about this example is that it is so pervasive and
popular even though it has been thoroughly researched and there is overwhelming
evidence against it. Indeed, there have been over 100 peer-reviewed studies investigating
this hypothesis, and they have found no evidence for learning styles; and yet if one types
“corporate learning” and “learning styles” into a search engine, Google returns about 4.5
million matches.
I am shocked when I hear companies talking about building or buying programs to
accommodate a particular learner’s style. Why? Because this hypothesis – that each of us
has a “better” way of learning (e.g., visual, aural, etc.) – is gobbledygook. Unlike some of
the other examples I give, where the problem could be a simple misinterpretation of the
facts or poor analysis, this one is a matter of accepting on faith what has been
demonstrated empirically not to be true. A horde of studies cutting across disciplines,
sectors, and demographics has demonstrated that individuals do not have differentiated
“styles.” This isn’t to say that using multi-modal approaches to increase engagement isn’t
a sound strategy, but, rather, that someone’s self-report of being a “visual learner” does
not mean that they learn “better” visually or that they cannot learn in other modalities.
Another example is a classic misinference from observational data. Think of this as a
rendition of an old show tune from Bye Bye Birdie – “what’s the matter with kids
today?” The prevailing argument uses the following logic: a manager notices that the
younger people in the office work differently than the older people and then assumes that
there is something fundamentally different about younger workers today (the popular
pieces started appearing when Generation X was entering the workforce, and now the
argument is applied to Millennials as well). This presumption is a classic example of
mistaking an age effect for a cohort effect. There is no evidence to speak of that humans
have somehow evolved into a new species since the birth of the Millennials. And, as the
old musical attests, folks have been observing that young people seem different from
older people since time immemorial. This is what is called in research circles a
“specification error.”
Next let me give an example that is akin to embellishment: a good theory, but one that
has never been researched effectively (and probably never will be) – informal learning.
When I search for this term on the web, I get over 14 million hits on Google. There have
been a host of articles written about the 70/20/10 rule. Now let me be clear – I am not
arguing that informal learning does not exist, and there have been plenty of peer-reviewed
articles investigating it, most recently a Stanford study used by the White House for
public policy purposes. But what has not been vetted is this ratio of 70/20/10. From what
I was able to ascertain trying to follow the literature, this was akin to that old game of
“telephone.” A researcher in the 1960s posited that one could think of informal learning
vis-à-vis formal learning as an iceberg, where most of the berg is underwater; indeed, if
one remembers one’s Archimedes, one ends up with about 90% of a berg under water. It
seems that over time, someone, or many, many someones, took that metaphor, attached
numbers to it, and all of a sudden it became an accepted notion that 90% of learning is
informal, and people budgeted accordingly. Think about it. How would one measure
that – amount of time spent, cost? This mistruth is now accepted as a truth, and
meaningful (and often wrong) decisions are made on that basis.
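As an aside, the 90% figure for icebergs is easy to verify. Here is a minimal back-of-the-envelope sketch using Archimedes’ principle, assuming typical (approximate) densities for glacial ice and seawater; the exact percentage depends on the densities assumed, and the point is simply that the figure describes floating ice, not learning:

    # Archimedes: a floating body displaces its own weight of water, so the
    # submerged fraction of an iceberg equals the ratio of the densities.
    # Densities are approximate, typical values.
    RHO_ICE = 917.0        # kg/m^3, glacial ice
    RHO_SEAWATER = 1025.0  # kg/m^3, seawater

    submerged_fraction = RHO_ICE / RHO_SEAWATER
    print(f"Fraction of an iceberg under water: {submerged_fraction:.0%}")  # about 89-90%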
My final example relates to the Holy Grail among corporate learning professionals –
“ROI.” I first heard this term applied to the world of corporate learning when the CLO of
a Fortune 100 company publicly proclaimed an ROI of 1328%. I was shocked at such a
statement. There is a rich history in the labor economics literature around estimating the
returns to education. Theodore Schultz won the Nobel Prize for work in this vein, and
another Nobel laureate, Jim Heckman, has been writing of late about the remarkable
returns to early childhood education. To put this in context, most estimates of the return
to a high-quality college education hover between 10% and 20%. So if this company had
really cracked this nut, it would mean it was performing roughly 100 times as well as,
say, Harvard Business School. One need only look at the balance sheet and income
statement to realize that what was happening instead was very poor research.
Now, to be clear, there is ample evidence on the returns to education and they have been
quantified, so it can be done. And I understand the compelling arguments that learning
leaders need more business acumen and need to talk the language of business. But if one
went to the CFO with such a claim, it would not be laudable but laughable and,
potentially, career limiting. The CEO and the CFO understand conceptually the notion of
returns to training. But they also probably understand, in the same way they do with legal
counsel, that those returns are not best measured as ROI. Think about it. ROI implies that
one can look at a company’s financial performance over time, take into account all the
extraneous variables that would impact that company (i.e., just about everything),
quantify all the costs associated with delivering the program, isolate what are called
endowment effects (such as motivation, prior education, and intelligence), and come up
with a simple ratio. In theory, one could use stock price, since it is supposed to take into
account both market effects and future earnings, but even so it is a fool’s errand. This is a
classic example of poor research design: a plausible question supported by theory, but not
sufficiently practical for the average CLO to execute.
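To make the problem concrete, here is a minimal sketch, with entirely hypothetical numbers, of how a headline training ROI is typically produced and why it inflates: the naive calculation credits the program with the whole observed gain, whereas a credible estimate would have to isolate the portion of the gain the program actually caused.

    # All figures below are hypothetical and for illustration only.
    program_cost = 100_000.0     # total cost of designing and delivering the program
    observed_gain = 1_500_000.0  # performance gain observed among participants

    # Naive calculation: attribute the entire observed gain to the program.
    naive_roi = (observed_gain - program_cost) / program_cost
    print(f"Naive ROI: {naive_roi:.0%}")  # 1400% -- the kind of figure that gets announced

    # A more honest estimate asks how much of the gain the program caused, after
    # market conditions, selection of strong performers into the program, and
    # normal growth are accounted for. Suppose (hypothetically) that share is 10%.
    causal_share = 0.10
    adjusted_roi = (observed_gain * causal_share - program_cost) / program_cost
    print(f"Adjusted ROI: {adjusted_roi:.0%}")  # 50% -- a very different story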
What These Things Have in Common
What lessons can we draw from these examples? First, I hope you see that we are in good
company when we make these mistakes. Being diligent is difficult when you are under
financial and time pressures. Perhaps trickier is that, as professionals, we have our own
experiences to draw from, and we too often cross the line because we have a stake in the
success of our programs. We should put our hearts and souls into the design and delivery
of programs, but, at the end of the day, the programs need to stand or fall on their merits,
and we should always put the best interests of the organization and our colleagues ahead
of the creativity or ‘coolness’ of our programs.
We need to stop being lemmings. We are so enamored of the next great trend, and we
jump on these bandwagons partly because the success of our programs matters so much.
But we need to be much more thoughtful in how we interpret the research we consume to
inform our program designs.
Finally, we need to recognize that how we ask a question matters, and that the ways in
which we gather and analyze evidence matter as well. But perhaps what matters most is,
in fact, how we ask the question, because we cannot fix through clever analysis what we
messed up in the design of our interventions. Let me explain.
How To Prevent Getting Hoodwinked and From Lying to Yourself
The key to ensuring that you aren’t deceived, and don’t inadvertently deceive, is – within
the realm of possibility – to be an effective theoretician and researcher. What does that
mean?
First, pose questions (and, by definition, answers) that can be tested – investigated
empirically. Second, know your theory and, at least in your own mind, link your questions
to the underlying theory, and use methods that permit direct investigation of the question.
Third, provide a coherent and explicit chain of reasoning; walk yourself through how you
got to the question. Either replicate the findings yourself, or see whether others have
replicated them. Whenever possible, disclose your research to encourage professional
scrutiny and debate.
It probably is worth it to review research paradigms. Keep in mind that all the social
sciences basically ask the same sorts of questions and consequently, all those paradigms
are relevant to you. But each paradigm is like a lens, and it brings certain things into
focus and makes other things out of focus. In general, there are two types of research
approaches:
1. Collecting numbers – quantitative;
2. Collecting observations – qualitative.
And generally, there are three purposes to research:
1. Implementation and replicability – quantitative or qualitative;
2. Theoretical base – quantitative or qualitative;
3. Evidence of effectiveness – quantitative.
In our world, then, we probably ought to be conducting research that mixes qualitative
and quantitative approaches, with a heavy emphasis on quantitative methods. Now, when
it comes to developing people, it is important to recognize that you are never going to be
able to capture the full impact of any training. The problem is that learning is complex:
there are many interaction effects, and learning doesn’t happen as neatly as the matrix
above suggests. The links to learning are loosely coupled; they sometimes take time to
stick and sometimes last quite a long time. This means that there are always both
consumption and investment benefits to training, and both individual and social benefits
to learning.
Going beyond these conceptual ideas, how we systematically gather evidence, even if we
perceive it to be quantitative, means different things and tells us different things. For
empirical investigations, we could conduct interviews, run statistical analyses of data –
either observational or self-reported – or conduct controlled experiments. With regard to
measurement techniques, even with the same data and the same paradigm, we can analyze
things differently: we could use simple correlation, a residual approach that controls for
other factors, or a measure of direct returns.
Finally, if we are interested in things like the value of training, we need to be aware of the
measurement challenges. These include, to name just a few: the interaction among
training, prior education, and ability; selectivity bias (high potentials aren’t the same as
typical employees, and if you infer that they are performing better because of the
intervention, it could be that they are simply rock stars); the quality versus the quantity of
training; and discounting for time.
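To illustrate both the selectivity-bias problem and the “residual approach” mentioned above, here is a minimal simulation sketch (all numbers hypothetical): when stronger performers are the ones selected into training, a naive comparison of trained versus untrained employees badly overstates the training effect, while a regression that controls for prior ability gets much closer to the truth.

    # Minimal simulation: selection of high-ability employees into training
    # inflates the naive estimate; controlling for ability recovers the truth.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 5_000
    ability = rng.normal(0.0, 1.0, n)                    # prior ability (often unmeasured)
    trained = (ability + rng.normal(0.0, 1.0, n)) > 0.8  # stronger people more likely selected
    true_effect = 2.0                                    # the effect we want to recover
    performance = 10 + 5 * ability + true_effect * trained + rng.normal(0.0, 2.0, n)

    # Naive estimate: difference in mean performance, trained vs. untrained.
    naive = performance[trained].mean() - performance[~trained].mean()

    # "Residual" approach: regress performance on training AND prior ability.
    X = np.column_stack([np.ones(n), trained.astype(float), ability])
    beta, *_ = np.linalg.lstsq(X, performance, rcond=None)

    print(f"Naive estimate of the training effect:  {naive:.2f}")    # well above 2.0
    print(f"Estimate controlling for prior ability: {beta[1]:.2f}")  # close to 2.0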
The Hierarchy of Evidence for Impact
Generally, the more times something has been tested, the better. So one study finding that
a particular tool, product, or intervention works is less reliable than 100 studies with the
same finding (with the huge caveat that the quality of each study matters).
If you are attempting to demonstrate impact, the gold standard among researchers is the
randomized controlled trial – the classic “experiment,” with control groups, that scientists
use. Almost as good are natural experiments and quasi-experiments. Less desirable are
mixed-method studies, and the weakest forms – and the ones we see most often in our
space – are the survey and the case study.
Why Research Matters
In some fundamental ways, all social science is interested in answering the same sorts of
questions about the world. At the same time, there are very specific rules about the
fidelity of research within each discipline, and, historically, there has been some concern
about poorly designed and executed mixed-method research. With respect to education
evaluation, there have been attempts by OEM, the Campbell Collaboration, and the
National Academy of Sciences to describe a hierarchy of evidence in which, roughly
speaking:
- more studies are better than fewer;
- anecdotal case studies or testimonials are the weakest form;
- randomized controlled experiments are the gold standard.
And there is recognition that certain paradigms are better at surfacing hypotheses and
exploring “why,” while others are better suited to evaluating whether something “worked”
as planned.
To fund and evaluate social science research, governments and others have developed
hierarchies of evidence with RCTs at the top; and, obviously, the more times a study is
replicated, the more reliable the finding. I’ve talked about this above, and that, bluntly, is
the gold standard for evidence.
The problem with that approach is that it is untenable for learning professionals. It may
win points with an editorial board or a tenure committee, but among CLOs, heads of
talent, and, most importantly, business leaders with P&L responsibility, this approach is
as good as useless.
But, as I hope I’ve illustrated with some of my “lies,” in its absence what has emerged is
a system where evidence is largely word of mouth or personal testimonial – sometimes
from vendors who have an agenda, and sometimes from learning professionals who also
have a point of view; they are advocates rather than scientists. If one takes the boldest
perspective, one could argue that, as a consequence, much of what we design and deliver
is rooted in little evidence, we have little evidence that any of it works, and we have a
large stake in saying it works, regardless of its actual efficacy.
I can share in a very personal way why this approach troubles me so much. I was recently
accused of fabricating things myself. Rather than thoroughly investigating the assertion,
which was made by a group of unhappy faculty, the reporter ran with the story, and then
everyone assumed it must be so – even though all one needed to do was a simple search
to realize that the story itself was false.
But we can’t assume that folks will always rise to the challenge. What is needed instead
is a pragmatic solution. You need to decide what level of sophistication you need in order
to understand evidence of impact. What tools do you, and can you, use to gather evidence
of impact? What tools do you use to analyze performance and learning? Do you use any
methods to evaluate the implementation in addition to the intervention? If so, how?
I think there is actually a pragmatic way to get at this. In our country’s courts, there are
rules of evidence that dictate the way one gathers and analyzes evidence. In the same
way, each of our disciplines has a research paradigm that has us approach evidence
differently, just as forensic scientists approach the systematic gathering and analysis of
evidence in their own way; each form of evidence has its own fidelity. So how one
evaluates DNA evidence is different from how one evaluates testimony. If someone
provides testimony about events they witnessed while ‘under the influence,’ or whose
eyesight is failing, one views it differently. If one notes that the DNA evidence might
have been tainted in some way, one evaluates it differently. So in this way, our systems
align.
But the difference is how the courts treat the evidence. There are different standards for
the evaluation of evidence depending on whether the case is civil or criminal. In a
criminal case, the standard is “beyond a reasonable doubt,” whereas in a civil case it is a
“preponderance of the evidence.” One could present the same evidence, asking the same
question, and, depending on the standard, arrive at two different findings (one need only
think of the O.J. Simpson cases as an example).
What we need as a community is what I will call POE: a way to evaluate evidence
systematically against a different standard than the ones used for tenure review and
peer-reviewed journals. We need a “preponderance of evidence” (POE) standard that is
quicker, more flexible, and easier to interpret than the research we do for our own
edification, and it needs to be driven by the needs of the market. It would use the same
research paradigms, but the evidence bar would be lower, and it would bring multiple
forms of evidence to bear on the question of efficacy. The big data movement might make
this fairly simple to execute in the near future.
We also need our profession to raise the bar. Some years ago, ASTD was a major sponsor
of a global initiative to create an ISO standard for corporate learning. Forty-two countries
adopted it, but to my knowledge not a single US employer adopted the standard. Our
professional associations, and in particular ASTD, need to continue to lead the charge and
call out the “lies” as they become clear.
What You Can Do in the Meantime
We continue to evolve as a profession. Hopefully, programs like the one I created at the
University of Pennsylvania (PennCLO) and initiatives taken by organizations like ASTD
to develop professionals can help. In the meantime, there are some pretty basic things
you can do to catch lies and prevent lying.
First, question everything. Be critical when you purchase and when you deliver. Be less
of an advocate and more of a critic. Don’t assume that because someone has a white
paper, or even one peer-reviewed article, about his or her approach or product, it “proves”
anything.
Second, do what research you can. Not being able to run large-scale, longitudinal,
randomized controlled trials with matched pairs is not an excuse to throw up your hands
and become a lemming following the latest trend. Try to systematically question what you
are planning and gather what evidence you can to support or refute what you actually do.
And don’t be so vested in a strategy that you put on rose-tinted lenses and ignore
mounting evidence that what you are doing is having little impact.
There are concrete things you can do to improve the probability of your success. The first
is, whenever possible, to use empirically vetted content. In other words, if you have to
pick between two marketing courses, one based on the research of a professor and the
other based on the whims of a prophet, go with the professor. Another thing you can do,
in a similar vein, is, also whenever possible, to use pedagogy that is aligned with your
business problem. If you need to get your folks working better as virtual teams, a
behaviorist approach may not be the most prudent design strategy.
Finally, be as skeptical about your own claims as you are about everyone else’s. What we
do matters – to our companies, our employees, and society – and the stakes are too high
for our evidence to rest on testimonials and wishful thinking. Question everything, do
what research you can, and help push our profession toward a standard of evidence we
can all defend.