Statistics: An Introduction

LIR 832

Class #1

January 8, 2007

Topics of the Day

• 1. Some really nice pictures (graphical display of quantitative data and more)

• 2. Why do we teach statistics other than to make your life miserable for a semester?

• 3. What is in it for me as a future HR/LR practitioner?

• 4. Fundamental issues in statistics (what really matters)

• 5. The structure of the course

The Use of Statistics:

Classic Examples

• Tufte: The Visual Display of Quantitative Information

– “For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled” (Richard Feynman’s conclusion on the explosion of the space shuttle).

The Use of Statistics:

Classic Examples

• Data map: John Snow (1854) – deaths from cholera in Central London:

– Before knowledge of bacterial sources of illness (Holmes's father in 1847; Louis Pasteur later on).

– Deaths are dots; x’s are water pumps;

– Deaths are clustered around the Broad St. Pump

– Removal of the pump handle ended an epidemic that killed more than 500 people

The Use of Statistics:

Classic Examples

• Data map: National cancer rates by county

– From darkest (in highest decile of cancer rates) to lightest (lower than US as a whole).

– High death rates from cancer in the northeast part of the country and around the Great Lakes (High levels of air pollution and dense concentration of industry).

– Low rates in an east-west band across the middle of the country.

– Higher rates for men than for women in the south, particularly Louisiana (cancers likely caused by occupational exposure from working with asbestos in shipyards).

– Can you find the counties which are downwind from the Nevada test range?

– Can you find the central locations for the chemical industry in the US?

The Use of Statistics:

Classic Examples

• Data Map: Space and Time - Charles Minard (1861) - Napoleon’s March

– Width of line varies continuously with size of the army.

– The line establishes the longitude and latitude of the army.

– The lines show the direction of movement of the army.

– The location of the army with respect to certain dates is marked.

– The temperature along the path of march is marked.

The Use of Statistics:

Classic Examples

• Computer Graphics: Space and Time

– Concentration of Pollutants over L.A. July 22, 1979

– Two-dimensional surface: six southern California counties

– Nitrogen oxides: power plants, refineries & vehicles

• Refineries and Kaiser Steel produce post-midnight peaks.

• Traffic and power plants produce daytime peaks

– Carbon monoxide:

– Reactive hydrocarbons:

The Use of Statistics:

Classic Examples

• Election Maps: Difficulty in portraying information accurately and making your point (http://www-personal.umich.edu/~mejn/election/)

The Use of Statistics:

Classic Examples

• Tufte’s Principles of Graphic Excellence

– “The efficient communication of complex quantitative ideas”

– Show the data

– Avoid distorting what the data have to say

– Encourage the eye to compare different pieces of data

– Make large data sets coherent

– Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else

The Use of Statistics:

Truck Driver Retention

• Factors Affecting Over-the-Road Truck Driver Retention: A More Traditional Application of Statistics to a Complex Relationship

The Use of Statistics:

Truck Driver Retention

• Background:

– Ongoing shortage of truck drivers makes trucking firms very concerned, at least rhetorically, about driver retention

– We have excellent data on drivers from a survey of truck drivers and would like to sort out the factors affecting driver retention so firm policy can focus on those factors

– Problem: retention is multi-causal. Many factors are likely to affect the retention of truck drivers, and we need an approach that allows for all of these effects.

The Use of Statistics:

Truck Driver Retention

• It is always good to start an inquiry with a little theory. This sets a question or questions that we structure our inquiry around.

• The following is from Freeman and Medoff, Two Faces of Unionism…:

The Use of Statistics:

Truck Driver Retention

• Monopoly face: unions raise wages and improve benefits

• Exit-voice face.

– Typical means of employees registering dissatisfaction with a job is to quit and find a new job.

– Unions provide employees with an alternative route: voice

• Improve communications because employees are protected against bad consequences of communicating their views to management

• Allow employees a means to communicate and decide on issues among themselves rather than being mediated by management. Employees rather than management decide on hard issues such as the allocation of benefits

• Solves public goods problem at work

The Use of Statistics:

Truck Driver Retention

• Lower quit rates and longer tenure are a potential source of advantage to organized firms as they:

– Save hiring and training costs

– Have greater depth of human capital

– Research on quits and employee tenure shows a strong positive association between tenure (years of service with employer) and unionism and a strong negative association between unionism and quits.

The Use of Statistics:

Truck Driver Retention

• This might be explained by the union “voice” effect but it might also be an example of the monopoly face (syllogism):

– Unions raise wages and increase benefits

– All else constant, employees tend to stay with firms which provide better wages and benefits

– To be better assured of union voice effects we need to distinguish the monopoly face of unions on compensation from that of voice

– Consistent with the monopoly argument, Delery finds only a compensation effect, no distinct union effect

The Use of Statistics:

Truck Driver Retention

• Current Research draws on a UMTIP survey of truck drivers.

– Interviewed 1,000 drivers in truck stops between 1997 and 1999

– Includes data on tenure and quits along with union membership and compensation.

• Consider the descriptive statistics (abbreviated):

The Use of Statistics:

Truck Driver Retention

• Build a series of models that look at the months spent with their current employer (tenure):

– (We don’t have pre-quit information on those who quit their jobs in the last year.)

– Models (working from simplest to most complete)

• Model 1: Tenure models with extensive controls:

– Controls serve to eliminate the effects of factors which would otherwise confuse our estimates, such as: personal characteristics which might affect tenure (age, race, gender ...), segment of the industry, size of the firm, characteristics of the work.

• Model 2: Add Union Measure

– Coefficient on union membership is 38.78: interpreted as indicating union members stay with firm an additional 39 months.

The Use of Statistics:

Truck Driver Retention

• Model 3:

– Add a measure of weekly pay. The coefficient on union membership no longer reflects the earnings (monopoly) effect, as we have removed effects related to weekly earnings.

– Members remain an additional 36 months, so controlling for pay changes the estimate little

• Model 4:

– Allow for the effect of benefits including paid days off, employer provided health insurance, pensions and deferred compensation

– Union effect declines to an additional 22 months
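
The way the union coefficient shrinks from Model 2 to Model 4 can be read with simple arithmetic. The sketch below is only an illustration of that reading, not a formal decomposition: it uses the three coefficients quoted on the slides (36 and 22 are rounded there), and the dictionary labels are made up for the example.

```python
# Union coefficients (additional months of tenure) as quoted on the slides.
# The labels are illustrative; 36 and 22 are rounded values from the slides.
union_effect = {
    "Model 2: controls + union": 38.78,
    "Model 3: + weekly pay": 36.0,
    "Model 4: + benefits": 22.0,
}

baseline = union_effect["Model 2: controls + union"]

for model, months in union_effect.items():
    share_remaining = months / baseline
    print(f"{model:28s} {months:6.2f} months ({share_remaining:.0%} of the Model 2 estimate)")

# Share of the Model 2 union effect that disappears once pay and benefits
# are held constant (a rough reading of the monopoly face).
share_absorbed = 1 - union_effect["Model 4: + benefits"] / baseline
print(f"Absorbed by pay and benefits: {share_absorbed:.0%}")
```

On this reading, somewhat under half of the original union estimate is absorbed by compensation, and the rest remains associated with union membership itself.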

The Use of Statistics:

Truck Driver Retention

• Conclusions:

– Union membership has a strong effect on employee retention. While part of this effect is due to unions improving wages and benefits, even with controlling for such effects, unions continue to be associated with longer employee retention.

– Time off work is also very important to driver retention, as are earnings.

Why Quantitative Methods?

• What will we learn?

– Master fundamental knowledge about construction and application of statistical models.

– Develop, operationalize and interpret models of social interaction using modern statistical software.

– Learn to evaluate and critique others’ research.

– In essence, become knowledgeable users of statistical analyses.

What Issues Does Statistics Address?

• Human beings are very good storytellers.

– Human beings have always been very good at developing stories which ‘explain’ the world out of small amounts of information. This behavior may have been necessary for survival when early man competed for food with large predators, but it often leads us to misunderstandings about causal relations.

What Issues Does Statistics Address?

• Tom Peters: In Search of Excellence: Lessons from America's Best-Run Companies

– How 10 firms became top performers. Exciting reading with many important insights into successful management. AT&T, IBM, Digital Equipment, 3M, Allen-Bradley, Delta Airlines

– Fortune magazine returns three years later: half of the firms are no longer top performers

What Issues Does Statistics Address?

• Beardstown Ladies:

– Successful investment club in Illinois. Produced a book of investment tips with recipes. Some problems later on with how they figured their profits, but let's put that aside. Their claim to fame was that they outguessed the stock market ten years in a row. Did this reflect brilliant thinking on their part, or might it simply be luck (chance or random event)?

– Suppose in 1980 there were 1000 women’s investment clubs in Illinois. Each year we would expect one half of those clubs to do better than the stock market and one half to do worse. How many clubs would have a record of straight wins in the 1980s?

What Issues Does Statistics Address?

Number of Clubs Which Perform Better than the Stock Market for All Years in the 1980s

Year    Clubs
1980    1000.00
1981     500.00
1982     250.00
1983     125.00
1984      62.50
1985      31.25
1986      15.63
1987       7.81
1988       3.91
1989       1.95
1990       0.98
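
The club counts come from repeated halving: if each club has a one-half chance of beating the market in any year, the expected number still on an unbroken streak after k years is 1000 × (1/2)^k. A minimal sketch of that arithmetic follows. It assumes clubs win or lose independently each year; the function and variable names are made up for the illustration.

```python
# Expected number of clubs with an unbroken winning record, assuming each
# club independently has a 1/2 chance of beating the market every year.
START_CLUBS = 1000      # from the slide: clubs in 1980
P_BEAT_MARKET = 0.5     # from the slide: half do better each year


def expected_survivors(start, p, years):
    """Return the expected count still on a winning streak after 0..years years."""
    return [start * p ** k for k in range(years + 1)]


counts = expected_survivors(START_CLUBS, P_BEAT_MARKET, 10)
for year, n in zip(range(1980, 1991), counts):
    print(f"{year}: {n:8.2f}")

# Probability that one particular club wins ten years in a row by luck alone.
p_ten_straight = P_BEAT_MARKET ** 10
print(f"Chance of ten straight wins for a single club: {p_ten_straight:.5f}")
```

With 1,000 clubs, roughly one is expected to show ten straight wins by chance alone, so a single club with such a record is not by itself evidence of skill.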

What Issues Does Statistics Address?

• Stories are essentially anecdotes, interesting and potentially insightful, but it's difficult to separate what is useful from what is bullshit. Much of what we consider theory in social and behavioral science, whether it be economic theory, psychological theory, sociological theory, management theory, physics, etc., is built from such stories.

What Issues Does Statistics Address?

• Statistics provide a method of examining these stories to determine if they are consistent with the facts (data) generated from a large number of cases.

– For example, the questions we might be interested in might be…:

• Does a particular absence policy actually reduce absences?

• What is the response to a pay for performance system?

• How does greater workforce diversity influence plant level performance?

What Can We Do With Statistics?

• Compactly summarize large bodies of data

– Using measures of central tendency, dispersion and probability distributions, we can compactly describe and understand these large bodies of data.

• CPS file with 150,000 observations on earners

• Compustat file with annual data on 1,000s of firms at the divisional level

• Personnel files from medium or large size corporations.
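
As a small illustration of compact summary, the sketch below reduces a list of values to a handful of measures of central tendency and dispersion using Python's standard `statistics` module. The ten weekly earnings figures are hypothetical and stand in for a real file such as the CPS.

```python
import statistics

# Hypothetical weekly earnings (dollars) for ten workers; illustrative only.
weekly_earnings = [520, 610, 480, 750, 690, 1200, 560, 640, 700, 590]

summary = {
    "n": len(weekly_earnings),
    "mean": statistics.mean(weekly_earnings),        # central tendency
    "median": statistics.median(weekly_earnings),    # central tendency, robust to outliers
    "std_dev": statistics.stdev(weekly_earnings),    # dispersion (sample standard deviation)
    "min": min(weekly_earnings),
    "max": max(weekly_earnings),
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.1f}")
```

The mean sits above the median here because one high earner pulls it up, which is why it helps to report more than one measure.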

What Can We Do With Statistics?

• Determine if there are meaningful relationships in the data:

– Test theories or ideas about social or other interrelationships:

• An attendance award program will reduce absenteeism

• Piece rate systems will increase output but quality will suffer (Dodge Brothers machining plant in Detroit in 1904 for example)

• Increases in the minimum wage reduce employment

• Training programs improve output

– What is a theory? (evolution v. intelligent design)

What Can We Do With Statistics?

• Determine the magnitude of the relationship: answer the essential question “How Big”

– If you are going to calculate the ROI on a training program you need to know the magnitude of the effect of that program. So you will want to be able to answer questions such as:

• Following training program X, productivity rose by Y%

• If a firm is going to invest in a program, it needs to know the rate of return, and this will, in turn, be determined by the improvement in productivity.

• An A% increase in the minimum wage is associated with a B% decline in teenage employment.

• Piece rate workers produce H% more output than hourly workers.
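
To see why "How Big" matters for the ROI question, here is a minimal sketch using one common definition, ROI = (benefit − cost) / cost. All figures (the value of annual output, the program cost, the productivity gains) are hypothetical, and the function name is made up for the example.

```python
def training_roi(productivity_gain_pct, annual_output_value, program_cost):
    """Return first-year ROI as a fraction: (benefit - cost) / cost."""
    benefit = annual_output_value * productivity_gain_pct / 100
    return (benefit - program_cost) / program_cost


# Hypothetical inputs: $2,000,000 of annual output, a $50,000 program.
ANNUAL_OUTPUT_VALUE = 2_000_000
PROGRAM_COST = 50_000

# The same program pays off or loses money depending on the size of the
# estimated productivity effect Y.
roi_large_effect = training_roi(4.0, ANNUAL_OUTPUT_VALUE, PROGRAM_COST)
roi_small_effect = training_roi(2.0, ANNUAL_OUTPUT_VALUE, PROGRAM_COST)

print(f"ROI if productivity rises 4%: {roi_large_effect:.0%}")
print(f"ROI if productivity rises 2%: {roi_small_effect:.0%}")
```

Knowing only that training "improves output" cannot separate these two cases; the estimated magnitude decides whether the investment is worthwhile.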

What Can We Do With Statistics?

• This is all very nice, but what is in it for you as an IR/HR professional?

– IR/HR students do not, typically, believe that numbers are their friends.

– Alas, HR Managers are expected to use numeric and statistical information to understand and guide their decisions.

• As HR moves from a transactional to strategic position within the firm, HR managers are more and more expected to use numeric and statistical methods to evaluate operations and guide their decisions.

• Organizational HR performance is monitored using HR metrics. If correctly chosen, these metrics can provide a compact summary of a unit's performance.

What Can We Do With Statistics?

• A Few HR Metrics:

– Number of interviews to hires

– Total recruiting cost per hire

– Hiring manager satisfaction

– Turnover Rate

– Turnover Cost

– Absence Rate

– Health Care Cost per Employee

– HR expense factor

– Human Capital Value Added

– Worker’s Compensation Cost per Employee

– HR ROI

What Can We Do With Statistics?

• Some issues in using these metrics:

– What do these measure?

– What is a good performance?

– Are deviations from ‘good performance’ due to problems or chance?

– What are the sources of ‘good’ or ‘bad’ performance?
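
The "problems or chance" question can be made concrete with a binomial calculation. The sketch below assumes a hypothetical unit of 50 employees, a hypothetical company-wide annual turnover rate of 10%, and independent quit decisions; it asks how often a unit would lose nine or more people in a year through chance alone.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly from the formula."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical figures: a 50-person unit, a 10% baseline turnover rate,
# and 9 quits observed this year (an 18% unit turnover rate).
UNIT_SIZE = 50
BASELINE_RATE = 0.10
OBSERVED_QUITS = 9

p_chance = prob_at_least(OBSERVED_QUITS, UNIT_SIZE, BASELINE_RATE)
print(f"P(at least {OBSERVED_QUITS} quits by chance) = {p_chance:.3f}")
```

A probability that is not very small suggests the unit's higher turnover could plausibly be chance; a very small one points toward a real problem worth investigating.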

What Can We Do With Statistics?

• With the availability of HR metrics, it becomes possible to use descriptive and analytic statistics to evaluate programs.

– Consider a program to control health care costs. You are going to be interested in some relatively simple measures such as whether there was a reduction in direct health care costs. You will also be interested in determining whether there are indirect costs such as increased absenteeism, lower employee satisfaction, increased turnover and whether there is a change in employee behavior or simply cost shifting.

What Can We Do With Statistics?

• You will regularly be presented with reports and memos incorporating numeric and statistical materials. You need to understand and evaluate the work of others…:

– You hire a consultant to suggest or evaluate a program. You need to be able to understand and interpret what they have done in order to judge the quality of the work, ask good questions, and reach your own conclusions about the report.

– Example: EEOC and OFCCP’s standards for establishing a “pattern and practice of discrimination”

What Can We Do With Statistics?

• You should be facile with statistical measures and data in order to work with professions for which this is required knowledge. You can also shine relative to your peers if you are the one who does the statistical work and drafts those reports.

Fundamental Issues in Statistics

• The world is multi-causal; meaningful models need to reflect multiple sources of causation.

– There are many random elements in the outcomes we are concerned with. Simple observation is not enough to reveal underlying relationships. We need multiple observations to be able to establish the presence of a relationship.

– Why anecdotes are suspect.

Fundamental Issues in Statistics

• We use samples to learn about populations:

– We seldom observe the populations we want to know about. Because we have to use samples, we engage in inference from samples to populations. However, because of sampling variability, samples are not little mirror images of the population of interest. Given that samples are imperfect replications of populations, we have to use techniques such as hypothesis testing to determine if statements about populations are reasonable given our observed sample.
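
Sampling variability is easy to see in a simulation. The sketch below builds a hypothetical population of 100,000 values, draws 200 random samples of 50, and compares the sample means with the population mean. The population parameters, sample size and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(832)  # arbitrary seed so the run is repeatable

# Hypothetical population: 100,000 values centered near 60 with spread 40.
population = [random.gauss(60, 40) for _ in range(100_000)]
pop_mean = statistics.mean(population)

# Draw many samples of 50 and record each sample's mean.
SAMPLE_SIZE = 50
N_SAMPLES = 200
sample_means = [
    statistics.mean(random.sample(population, SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

spread = statistics.stdev(sample_means)
print(f"Population mean:          {pop_mean:.2f}")
print(f"Smallest sample mean:     {min(sample_means):.2f}")
print(f"Largest sample mean:      {max(sample_means):.2f}")
print(f"Std. dev. of sample means: {spread:.2f}")
```

No single sample reproduces the population mean exactly, and the spread of the sample means is what hypothesis tests take into account when judging a statement about the population.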

Fundamental Issues in Statistics

• Few events have only one or two causes. As we want to avoid reductionist approaches, our methods must allow for multiple causation.

– The foundation of model building is not statistical but theoretical and practical knowledge of an issue.

– Evaluation of the usefulness of models then rests on both statistical knowledge and broader understanding of an issue. Good statistical technique is a necessary but not sufficient condition for building a useful model.

• Toward the end of the course we will evaluate literature using statistics so that we can bring all of the diverse elements together.

– Successfully modeling this multi-causal world requires careful application of statistical technique.

– A traditional course ends with a brief smattering of multivariate statistics, but we need more.

The Structure of the Course…
