Systemed Studies on
Opinion Polling
By
Stuart C Dodd
Compiled and Edited
by
Burt Webb
Produced
by
Barbara Whitt
Systemed Studies on Opinion Polling
Published by
Dodd Memorial Library
Stuart C Dodd Institute for Social Innovation
First Edition
Draft 1
March 2011
Copied with permission from originals in Manuscripts,
Special Collections,
University of Washington Archives Division
University of Washington Libraries
Seattle, WA
Not to be reproduced without the permission of Special
Collections
Table of Contents
The Life and Work of Stuart C Dodd .................................................................................. 14
Sample of Stuart C Dodd’s ideas: ...................................................................................... 16
A Preview Introducing and Evaluating the "Pan-Acts Matrices" (excerpt) ...................... 16
Things Categories of Cosmists’ Actions or Scientists’ Four Aims .................................. 18
Things Liked Most .......................................................................................................... 19
Pan-Acts Cosmos Pictured as the Mass-Time Triangle ................................................. 21
General Systems: A Creative Search for Synthesis (excerpt) ........................................ 22
Stuart C Dodd Institute for Social Innovation ...................................................................... 24
Purposes ........................................................................................................................ 24
Our Mission .................................................................................................................... 25
Our Methods .................................................................................................................. 25
SCDI Founder: Richard Spady ...................................................................................... 26
The Leadership of Civilization Building .................................................................... 26
The Forum Foundation ............................................................................................ 27
Presidential Address project .................................................................................... 27
DVDs available from the Forum Foundation ............................................................ 27
Founding SCDI Executive Director: Rev. Dr. Richard S. Kirby (1949-2009) .................. 28
World Network of Religious Futurists ....................................................................... 28
SCDI Catalyst: August T. Jaccaci .................................................................................. 30
2008 Thomas Jefferson Returns ............................................................................. 30
Jefferson 2040 ......................................................................................................... 30
Unity Scholars ......................................................................................................... 31
Futurum Grid, 46th Annual Creative Problem Solving Institute Reference Sheet .... 32
The Nexilist Notebook ............................................................................................. 34
Dodd Memorial Library................................................................................................... 35
Dodd Memorial Library Editor Note ......................................................................... 35
Volumes in the Dodd Memorial Library .................................................................... 36
Overview of Book ............................................................................................................... 38
Notes on Articles in Systemed Studies on Opinion Polling ...................................................... 39
Section 1 on World Polling ................................................................................................. 40
Section 2 on Techniques of Polling .................................................................................... 42
Section 3 on Semantics in Polling ...................................................................................... 45
Section 4 on Values in Polling ............................................................................................ 47
Section 5 on Systemizing Polling ....................................................................................... 52
Section 1 on World Polling....................................................................................................... 55
#1. A Barometer of International Security ........................................................................... 57
I. The Instrument at Hand .............................................................................................. 58
II. International Postwar Uses ........................................................................................ 59
III. Nature of the Barometer ........................................................................................... 60
IV. Difficulties Anticipated .............................................................................................. 60
V. Current Development of the Barometer ..................................................................... 62
#2. Toward World Surveying .............................................................................................. 63
I. Possible Functions of a World Association ................................................................. 64
II. Current Development ................................................................................................. 66
III. Possible Organization ............................................................................................... 70
IV. Issues to Be Settled .................................................................................................. 72
Notes ............................................................................................................................. 73
#3. Steps toward a Barometer of International Security ..................................................... 75
I. Historical Sketch. ........................................................................................................ 76
II. Two further proposed steps. ...................................................................................... 77
A. The Index of Peace Expectation. ........................................................................ 77
B. Toward Interesting UNESCO. ............................................................................. 78
III. Proposed Resolution for UNESCO ........................................................................... 79
IV. Discussion ................................................................................................................ 80
A. Form of the Barometer ........................................................................................ 80
B. Some Background Facts ..................................................................................... 80
Notes ............................................................................................................................. 81
#4. Standards for Surveying Agencies ............................................................................... 83
I. Standards for Surveying Agencies .............................................................................. 86
II. Summary of Proposed Minimum Standards .............................................................. 98
Notes ............................................................................................................................. 99
#5. Techniques for World Polling ..................................................................................... 101
I. Planning for International Polling .............................................................................. 102
II. A Matrix Model for Standardizing Polling ................................................................. 103
III. Review of General Publications .............................................................................. 107
IV. The Administering Stage ........................................................................................ 108
V. The Designing Stage ............................................................................................... 110
VI. The Questioning Stage ........................................................................................... 111
VII. The Sampling Stage .............................................................................................. 112
VIII. The Interviewing Stage ......................................................................................... 113
IX. The Analyzing Stage .............................................................................................. 113
X. The Reporting Stage ............................................................................................... 114
XI. The Interrelating Stage ........................................................................................... 115
XII. Recommendations for Developing World Polls...................................................... 116
Notes ........................................................................................................................... 117
#6. The World Association for Public Opinion Research .................................................. 119
I. The Developing Theory and Methods of International Polling ................................... 122
II. The Goal of World Opinion Research ...................................................................... 123
Notes ........................................................................................................................... 125
#7. Developing Demoscopes for Social Research .................................................. 127
I. Pan-Sampling ........................................................................................................... 128
II. Organizational Sampling .......................................................................................... 132
III. World Sampling....................................................................................................... 134
IV. Time Sampling........................................................................................................ 135
Notes ........................................................................................................................... 137
Section 2 on Studies on Techniques of Polling ...................................................................... 141
#8. The "Steps-And-Parts" Model for Polling .................................................................... 143
I. Introduction ............................................................................................................... 144
II. Instrument Variables at Issue................................................................................... 144
A. The 24 Steps in Polling ..................................................................................... 144
B. The 8 Type-Parts of Polling ............................................................................... 147
C. The Step-Parts .................................................................................................. 148
III. Opinion Variables Taken as Criteria ....................................................................... 148
IV. Relations among the Variables ............................................................................... 149
V. Formulas Deduced .................................................................................................. 151
VI. Experimental Testing .............................................................................................. 152
VII. Statistical Fitting .................................................................................................... 152
VIII. Practical Uses of the Model .................................................................................. 153
IX. Summary ................................................................................................................ 153
Author’s Bibliography Cited .......................................................................................... 155
#9. Dimensions of a Poll ................................................................................................... 157
Technical Note: ........................................................................................................... 166
#10. Sociomatrices and Levels of Interaction for Dealing with Plurals, Groups and
Organizations ................................................................................................................... 167
I. Persons as Units, P0 ................................................................................................. 168
II. Plurals as Sums, P1 ................................................................................................. 168
III. Groups as products, P2 ........................................................................................... 168
IV. Organizations as Powers of Persons, P3 ................................................................ 169
V. Matrices for Predicting Interaction ........................................................................... 171
Notes ........................................................................................................................... 174
References .................................................................................................................. 177
#11. Predictive Principles for Polls ................................................................................... 179
I. Levels of Analysis ..................................................................................................... 181
II. Steps of Analysis ..................................................................................................... 183
III. Some Rules for the Steps ....................................................................................... 184
IV. Possible Predictors of a Specified Mass Behavior.................................................. 190
Notes ........................................................................................................................... 194
#12. Scientific Methods in Human Relations .................................................................... 197
I. Scientific Method of Solving Human Problems ......................................................... 198
II. Scientific Methods Described ................................................................................... 199
III. An Example of Scientific Method in Dealing with England's Wartime Food Problem .. 201
IV. An Example of Scientific Method in the G. I. Bill of Rights ...................................... 202
V. An Example of Scientific Method in Basic Social Research—the Hypothesis of Group
Gravity, or Interactance ................................................................................................ 203
Notes ........................................................................................................................... 208
#13. On Reliability in Polling............................................................................................. 209
Abstract........................................................................................................................ 209
I. The Dimensions of Reliability .................................................................................... 210
II. Control of Reliability ................................................................................................. 213
III. Measurement of Reliability in the Syrian Poll .......................................................... 215
A. Sampling errors ................................................................................................. 216
B. Informant and interviewer errors ....................................................................... 218
C. Interrelation errors ............................................................................................. 219
D. Schedule and media errors ............................................................................... 220
E. Temporal errors................................................................................................. 221
F. Residual errors .................................................................................................. 221
IV. Conclusion .............................................................................................................. 222
Notes ........................................................................................................................... 223
#14. The Standard Error of a "Social Force" .................................................................... 225
I. Definitions ................................................................................................................. 226
II. The Sampling error of one case (momentum) ......................................................... 226
III. The generalized standard error ............................................................................... 228
IV. Some special cases ................................................................................................ 230
Notes ........................................................................................................................... 233
#15. The Applications and Mechanical Calculation of Correlation Coefficients ................ 235
#16. A Correlation Machine .............................................................................................. 247
I. The Use of Correlation Coefficients .......................................................................... 248
II. Errors of Estimate .................................................................................................... 248
III. Multiple Correlation ................................................................................................. 249
IV. Regression Weights ................................................................................................. 249
V. Sampling ................................................................................................................. 250
VI. Reliability ................................................................................................................ 250
V. Partial Correlation .................................................................................................... 251
VI. Range “σ” ............................................................................................................... 251
VII. The Mathematics of Correlation Coefficients ......................................................... 252
VIII. The Mechanism and Operating Features of a Correlation Machine ..................... 253
IX. Cross-product Mechanisms .................................................................................... 256
X. Various Correlation Machines.................................................................................. 257
XI. Graphic Correlation ................................................................................................ 259
XII. Possible Developments ......................................................................................... 260
Notes ........................................................................................................................... 262
#17. On Predicting Elections or Other Public Behavior .................................................... 263
I. A Diagnosis ............................................................................................................... 264
II. Shift of Opinion ........................................................................................................ 265
III. Strength of Opinion (Intensity) ................................................................................ 266
IV. Sampling ................................................................................................................ 268
Notes ........................................................................................................................... 270
#18. A Call for Experimental Designs for Election Polling ................................................ 273
Notes: .......................................................................................................................... 277
#19. On Estimating Latent Versus Manifest Undecidedness: ........................................... 279
I. Problem .................................................................................................................... 280
II. Definitions ................................................................................................................ 280
III. Experiment .............................................................................................................. 281
IV. Findings .................................................................................................................. 281
Notes ........................................................................................................................... 283
#20. Research note on the “Law of Forecast Feedback”.................................................. 285
#21. The Momental Models for Diffusing Attributes ......................................................... 289
I. The Need for Systematizing ...................................................................................... 290
A. The diffusion data in Project Revere ................................................................. 290
B. The diffusion models indicated .......................................................................... 290
C. An example of a tested model........................................................................... 291
II. Algebraic Development of the Momental Growth Model ......................................... 294
A. Definitions and assumptions ............................................................................. 294
B. Statistical moments of an attribute—the elementary forms of probability .......... 294
C. Powers of the moments — specifying laws of probability growth ...................... 297
D. Operational definitions of the models ................................................................ 298
III. Behavioral Development of the Momental Growth Model ....................................... 299
A. Definitions and assumptions in behavior terms ................................................. 299
B. Normal Diffusing, p_t = A_0^{-t} ................................................................... 301
C. Exponential diffusing, p_t = A_1^{-t} ............................................................. 302
D. Logistic diffusing, p_t = A_2^{-t} ................................................................... 303
E. Summary of behavioral implications .................................................................. 303
Notes ........................................................................................................................... 305
Section 3: Studies on Semantics in Polling............................................................................ 307
#22. Public Opinion Definitions ........................................................................................ 309
#23. The Interrelation Matrix ............................................................................................ 315
I. The Matrix ................................................................................................................. 316
A. Inclusiveness..................................................................................................... 316
B. Isolation ............................................................................................................. 317
C. Joint relationship ............................................................................................... 317
D. Aggregation....................................................................................................... 318
II. The Units ................................................................................................................. 318
A. Population Units ................................................................................................ 318
B. Indicator Units ................................................................................................... 319
C. Time Units ......................................................................................................... 320
III. Some Applications .................................................................................................. 321
IV. Further Analysis...................................................................................................... 322
Notes ........................................................................................................................... 323
#24. Comparison of Scales for Degrees of Opinion ......................................................... 325
Notes ........................................................................................................................... 330
#25. Simple Test for Predicting Opinions from Their Subclasses ..................................... 331
I. What the Test Is ...................................................................................................... 332
II. What the Test Will Do .............................................................................................. 334
A. In Qualitative Terms—"classes" in symbolic logic and "attributes" in statistics .. 334
B. In Quantitative Terms—scaling and probability ................................................. 336
C. In Relative Terms — Correlation ....................................................................... 338
D. In Geometric Terms—points, lines, planes ....................................................... 339
E. In Dimensional Terms ....................................................................................... 340
III. What the Tester Did ................................................................................................ 341
Notes ........................................................................................................................... 345
#26. The Coefficient of Equiproportion as a Criterion of Hierarchy .................................... 349
Notes ........................................................................................................................... 362
#27. Note on an Index of Conformity ................................................................................ 365
Section 4: Studies on Value in Polling ................................................................................... 369
#28. The Likability Theory ................................................................................................ 371
I. The "Likes" Models ................................................................................................... 372
A. The Problem...................................................................................................... 372
B. The Observing ................................................................................................... 372
C. The Likes Hypotheses ...................................................................................... 373
D. The testing (Ref. 26) ......................................................................................... 374
II. The "Likables" Models ............................................................................................. 376
A. The Problem...................................................................................................... 376
B. The Observing ................................................................................................... 376
C. The Hypotheses ................................................................................................ 378
D. The Testing ....................................................................................................... 378
E. The Applying ..................................................................................................... 380
F. The Systematizing ............................................................................................. 381
III. The Full “Likability” Models ..................................................................................... 381
A. The Problem...................................................................................................... 381
B. The Observing ................................................................................................... 381
C. The Hypotheses ................................................................................................ 382
D. The Testing ....................................................................................................... 382
E. The Applying ..................................................................................................... 383
F. The Systematizing ............................................................................................. 384
Notes ........................................................................................................................... 392
#29. A Tension Theory of Societal Action ........................................................................ 393
I. The General Equation Measuring Societal Tensions .............................................. 394
A. Definitions ......................................................................................................... 394
B. Numerical Illustrations ....................................................................................... 395
1. A political example ....................................................................................... 395
2. An educational example ............................................................................... 395
3. A biological example .................................................................................... 396
4. An economic example .................................................................................. 397
5. A philosophic example. ................................................................................ 397
C. The Cases of Qualitative, Negative, and Unlimited Desiderata......................... 397
D. Systems of Values ............................................................................................ 399
II. The Effective Societal Processes Derived From the Tension Theory ...................... 400
A. First Order Processes ....................................................................................... 400
B. Second Order Processes .................................................................................. 403
C. Zero Order Processes ....................................................................................... 406
Notes ........................................................................................................................... 410
#30. On Criteria for Factorizing Correlated Variables
Notes
#31. The Concord Index for Social Influence (short published version)
I. Introduction – From “Social Control” to “Social Influence”
II. The Basic Categories
III. The Concordance Index – A Coefficient of Social Influence
Notes
#31. The Concord Index for Social Influence (Full version of paper)
I. Introduction – from “social control” to “social influence”
II. A Unit of Social Influence and Its Coefficient
A. The basic categories
B. The concordance index - a coefficient of social influence
III. Application
Notes
#32. The Concord Model for Social Control
#33. Racial Attitude Survey as a Basis for Community Planning: The Broadview (Seattle) Study
I. Relationship of Research to Planning
II. The Incident Pointing up Broadview
III. Move to Promote Good Feeling
IV. Survey of Interracial Attitudes Is Requested
V. Background Factors Are Explored
VI. The Controversial Question of Property Values Is Appraised
VII. Petitions Pro and Con Are Investigated
VIII. Tolerant Attitudes Are Shown
IX. Summary and Conclusion
#34. Can We Be Scientific about Humanism?
Notes
#35. Use Scientific Methods in Planning
I. Introducing the Message
II. Why Use Scientific Methods?
III. What Are Scientific Methods?
IV. How To Use Scientific Methods In Planning
A. Formulating the Problem called "Stating the Goals"
B. Observing the Facts in Goal-setting
C. Hypothesizing a Set of Goals
D. Designing Experiments in Goal-setting
1. Experimental Designing of Goals by the OAS and Arab League
2. Experimental Designing of Goals by Individual Student Theses
E. Adopting the Best Tested Sets of Goals
Epilogue
Appendix A: What Is Liked Most?
Appendix B: How To Start Ranking Human Values: A Questionnaire, Version A2
Notes
#36. A Measure of Man's Maturity
I. What Is a Major Measure of Human Maturing?
II. How Can Men Measure Up to Greater Maturity?
III. Why Should Man Mature Further?
IV. So "Whither Mankind?"
#37. A Proposed Campus Poll or "Opinion Gauge"
I. The Growing Need
II. The Campus Poll
A. Its Objectives
B. Its Program
C. Its Organization
D. Its Policies
E. The Polling Transaction
III. The Probable Consequences
Notes
#38. The Course Critique Corrector
I. The Problem
II. "Corrector Hypothesis"
III. The Method
IV. Findings
V. Follow-up Poll
VI. Author’s Conclusion on these Course Critiques
VII. Goals for Future Course Critiques and Correctors
VIII. The Underlying Democratic Philosophy of a Self-Governing Society
#39. Dimensions of Lundberg’s Sociology as Foundation of Dodd Sociology
Section 5: Studies on Systemizing in Polling
#40.
I. Civic Research
II. Basic Research
III. Technical Research
IV. Training Researchers
Notes
#41. Washington Public Opinion Laboratory: Seven-Year Report 1947-1954
I. Purposes of the Laboratory
II. Accomplishments
A. Civic Research
1. Community Polling and Consultation
2. Statewide Polling
3. National Research
4. International Research
B. Basic Research
1. Studies of Human Interaction
2. Studies of Human's Motivation
3. Studies of Speech Behavior
C. Technical Research
1. Techniques of Designing
2. Techniques of Questioning
3. Techniques of Sampling
4. Techniques of Interviewing
5. Techniques of Analyzing Data
6. Techniques of Reporting Findings
D. Training Researchers as Directors
1. Curriculum
2. Projects - in Library, Community, and Laboratory
III. Administration
IV. The Staff
V. Reports of Laboratory Activity
Bulletins
Articles
Theses
Exhibit A: Civic Research - Polling Interracial Tensions (37)
Exhibit B: Basic Research - Testing a Logarithmic Model of Reactions (54, 119)
Exhibit D: Conditions for motivating men
Exhibit E: Technical Research - Repolling in the National Elections of 1948 (26)
#42. "Scientizing the Probable Acts of Men - Copenhagen Lectures"
Seminar 1 Summary: Conspectus of a Scientizing Model for the Cosmos
Seminar 1 Exhibit A: Conspectus of 32 Hypotheses in Epicosm Models
Seminar 2 Summary: "Scient-Scales"
Seminar 2 Exhibit A: Exhibit of Scient-Scales-Rating Sheet for Stratum 1
Seminar 3 Summary: The Reiterant Matrix in 4 Cycles
Seminar 3 Exhibit A: How Reiterants Help General Mathematics
Seminar 4 Summary: Normal Cosmic Acts
Seminar 4 Exhibit A: Mass-Time Triangle
Seminar 5 Summary: Squaring Acts for Explaining the Cosmos
Seminar 5 Exhibit A: The MASS-TIME Triangle Extended Through the Radiation Spectrum
Seminar 6 Summary: Cycling Acts for predicting the cosmos
Seminar 6 Exhibit A: The Epicosm Models
Seminar 6 Exhibit B: The Key Periodic Table Generating Cosmic Constants
Seminar 7 Summary: Cosmic Acts and Human Value Systems
Seminar 7 Exhibit A:
Seminar 8 Summary: Fulfilling Acts to control cosmos increasingly
Seminar 8 Exhibit A: The Semiotic "Epicosm Hypotheses"
Lecture 1 Summary: A Brief Statement of the Transact Model
Lecture 1 Exhibit A: How Transact Modeling Forms Subforms
Lecture 2 Summary: Speech Acts
Lecture 2 Exhibit A: The Lord’s Prayer in TILP
Lecture 2 Exhibit B: Ten Semantic Tangles in Speech Transactions
Lecture 3 Summary: How "Momental Laws" Can Be Developed In Sociology
Lecture 3 Exhibit A: A Conspectus Of The 'Moments Clan' Of Actance Models
Lecture 4 Summary: Valuing Acts - the Likability Model and 4 sub-models
Lecture 4 Exhibit A: Things Liked Most
Lecture 4 Exhibit A: Conditions for Motivating Men
Lecture 5 Summary: Agreed Acts
Lecture 5 Exhibit A: 24 Hypotheses Project Consensus
Lecture 6 Summary: Diffusing Acts
Lecture 6 Exhibit A: Logistic Diffusion when Clique Size Varies
Lecture 6 Exhibit B: Rules for Predicting Probable Acts of Men under equal opportunities
Lecture 7 Summary: Countering Acts
Lecture 7 Exhibit A: The Negative Powers or “Countering Clan” of Reactants Submodels
Lecture 8 Summary: Planned Acting in Crafts
The Life and Work of Stuart C Dodd
Stuart C Dodd was born in 1900 in Talas, Turkey, where his father was a medical
missionary. He received B.S. and M.A. degrees and, in 1926, a Ph.D. in psychology, all from
Princeton. Dodd developed and directed the Social Science Research Section at the American
University of Beirut from 1927 to 1947. He married Betty Dodd in 1928. They had two sons, Peter and
Brian. During World War II, he served as Director of Surveys with the U.S. Army in Sicily. In
1947, Dodd accepted an offer to direct the Washington Public Opinion Laboratory at the
University of Washington, a position he held for the next 14 years. He left the laboratory in
1961 to devote more time to his research. He retired from the University in 1971. He passed
away during the Christmas holidays of 1975 while on a trip to see his son Brian in California.
In 1951, the Air Force's Human Resources Research Institute (HRRI) awarded the
Washington Public Opinion Laboratory a contract to research the effects of leaflet drops on
U.S. communities under the name "Project Revere." Dodd served as principal researcher for
Project Revere, which ended in 1958. Dodd's influence on mass communications was
significant. He trained a generation of mass communications researchers and his writings
suggest a clear and profound ideological purpose that shaped his work.
Dodd's published writings between 1939 and 1974 were extensive and varied, and they
demonstrate the degree to which he sought to make social phenomena reducible to, and
controllable by, mathematical formulae. "It is possible with our present knowledge to begin
constructing a quantitative systematic science of sociology," he declared in his 1942 text
Dimensions of Society. In 1951, he attempted to provide operational definitions, through the
use of mathematical equations, for such concepts as "freedom," "equality," and "democracy."
In a 1951 article appearing in Educational Theory, Dodd sought "to translate the traditional
concepts of the Christian religion into the terms of modern social science." Dodd explained
how social scientists view good and evil, the soul, sin, prayer, and other such notions. A 1959
article with direct bearing on how he viewed communication was entitled An
Alphabet of Meanings for the Oncoming Revolution in Man's Thinking. Recognizing the rapid
development that had taken place in communication technology since the turn of the century,
and seeing a resultant inefficiency and imprecision in most communicative acts, Dodd
proposed the development of a single, international language based on mathematics. Dodd
noted that his new ten-letter alphabet, referred to as "TILP," would be a perfect symbolic
system, capable of expressing every human meaning, without indefiniteness or waste of
energy. Dodd asserted that adoption of this new alphabet of meaning would revolutionize and
streamline all thought, reduce conflicts and misunderstanding, and even shorten the extended
period of time that students spend in school. Finally, according to Dodd, adoption of this new
alphabet of meaning would "revolutionize human speech and thinking and thereby significantly
accelerate man's cultural evolving."
Dodd pursued his quest for the understanding of human society beyond the social
sciences. This work included his Mass-Time triangle, a diagram that placed the
physical, biological and social sciences in one hierarchical model arranged according to the
mass of the entities involved. He originally called his cosmic model the Epicosm. His basic
concept was that ceaseless random interaction of the entities at each level of his model
generated everything in the Universe. Later, he changed the name of his model to the Pan-Acts
Cosmos in order to better express his basic concept.
"My lifelong quest is for greater unity, pervading and tying together all diversity. Whether
divergent counsel in a group's formal discussion where I seek the synthesizing motion; or
seeking a life center for emotional satisfactions as I find in Betty; or searching for a simpler yet
ever more inclusive formula for all things knowable to man as I developed in the Pan-Acts
equation, a/ct=1, for God c (If seen as the Creator and Ruler of All in self-creating and over-all
ruling cosmos when defined as the Universal set (Uo= 1) of all things namable) - all these and
much more are manifestations of my mostly subconscious quest towards integrating - always
trying to systematize from chaos, forever wanting to see things more wholly and as a whole."
Stuart C. Dodd, 1971
Sample of Stuart C Dodd’s ideas:
A Preview Introducing and Evaluating the "Pan-Acts Matrices" (excerpt)
64 Exhibits of the Pan-Acts Model for the Dimensions of Cosmos
in 16 Conspectuses of 64 pages
I What prompted this Exhibits booklet?
At the age of 74, after two heart attacks in the last four months, I have decided to record
this sixteen-year inquiry into the cosmos in preprint form so that, should I leave it unfinished,
other cosmists can continue to explore and develop, to test and apply, this cosmic research. I
intend to flesh out these skeletal outlines as far as time and health and assistance may make
possible.
Such writing up of the Pan-Acts Modeling will use:
A. these 64 pages of exhibits as the gist of the modeling, supported by
B. 400 "EpiDocs" of some 2000± pages, which are mimeographed or Xeroxed studies
preparing for fuller publication (Individual copies or whole sets are available at
Xeroxing cost.);
C. 140 published research articles (and ten books) under contract with Gordon and
Breach for republication in four volumes in 1975-76, entitled Systemed Studies on
Human Transacting;
D. 30 notebooks, chronologically ordered, of my daily "Dawning Thots" (= "DT's"),
hunches, memos, early drafts, etc.;
E. My 5-hour video-taped autobiography (from a seminar at Brown Univ. on "Masters of
Sociology");
F. 6 systematizing monographs of some 4500 pages.
I plan a volume on Dimensions of Cosmos, comparing 4 versions in parallel columns on
each two-page spread. The four versions, mutually enriching each other, would try to
communicate to: (a) lay generalists; (b) scientists; (c) algebraists; (d) geometrists. All readers
could enlarge their understanding of this World View by reading as most congenial and
informative to them, while augmenting their knowledge by glimpses thru the other languages of
science.
II What recent advances have helped deal afresh with the ancient questions of Whence the
universe? and Whither Mankind?
I list in partial answer:
A. Set theory, viewing every word or symbol as a name for the set of instances of its
referent;
B. Systems theory, viewing complex wholes, with logarithms as the best algorithm;
C. Semiotics, studying systems of man x symbol x thing interactions;
D. The Zero exponent, $X^0 = 1$, for every set, word, or qualitative entity (identified by
sub-scripting), which enables man to deal with all things qualitative with
mathematical rigor equal to current dealing with all things quantitative;
E. Computers, greatly simplifying complex calculating and checking of quantifiable
hypotheses;
F. Stochastic processes, discovering continuous creation;
G. Combinatorics (or "Combics" for short), explaining increasingly large sections of
scientific laws, processes, forces, formulas, etc., in simple terms of combinings,
permutings, or repeatings (called "reiterings" here).
III What guiding principles helped most in this cosmic inquiry?
My central quest was for better symbolizing (concepts, units, scales, formulas,
hypotheses, laws, etc.) for analyzing the cosmos resynthesizably — to mirror its whole activity
ever more simply and clearly to man. Mostly I used intuition, setting my subconscious to work
while I slept to come up with my volumes of "Dawning Thots" on waking. But always I made
explicit use of scientific methods when formulated as: "To so describe whatever is studied as
to explain its past genesis, predict its future recurrence, and control its anytime changing better
than hitherto."
IV What use of this cosmic Pan-Acts Model may be expected?
For philosophers, I expect Pan-Acts Modeling will offer a more simple yet complete, a
more explicit and exact, cosmology and epistemology than any present alternative theory.
For theologians, I expect Pan-Act-Theism to help solve ancient problems of the nature
of God, God's Will, Good and Evil, the Creation of all things and the Destiny of man. (See
EpiDoc 303.)
For scientists, I expect Pan-Acts modeling to prove a more comprehensive yet precise,
more clear and operationally testable description and explanation of the whole cosmos than
any current rival theory.
For sociologists, I expect the Transact submodel for Society gradually to be used as a
societal supersystem or integrative framework within which more specific theories of human
society can be expressed and tested.
For people universally, I expect the Pan-Acts model to provide a World View tending to
help integrate nations and people, religions and ideologies, in One World Community.
Things Categories of Cosmists’ Actions or Scientists’ Four Aims
Things Liked Most
This “Things-Liked” theory of human behavior aims to so describe the probable acts of
men as to explain and predict them increasingly and thus help to augment human self-control.
The theory starts by building, thru polls of humanity, a list of human wants which
posterity now seems most likely to work and live for. This listing starts, in turn, on this page by
inviting every reader to rank and revise the list above so it will best express his own system of
values.
This statement of a human value-system, or "things-liked," is intended to be highly:
1. Comprehensive: sampling somewhat typically all ten institutions of any culture;
2. Universal: satisfying all in most religions, ideologies, statuses, and periods;
3. Exact: specifying, in 400 words, 200 items of desiderata;
4. Understandable: words-per-syllable ratio = 400/440 = 90% of maximum simplicity;
5. Important: organizing 200 most preferred items into a standardizing hierarchy.
This base-line statement of values can help test such axiological hypotheses as:
1. Personal Hypothesis: If each reader substitutes one’s own more preferred items
(keeping within 400 words) and rates them, then one can express one’s own value
system.
2. Group Hypothesis: If groups, in controlled experiments, rerate each item after
discussion-with-intent-to-agree, then closer consensus tends to result.
3. Human Hypothesis: If people everywhere experiment thus, persistently from childhood
on up, producing more consensus on more values, then a world value-system for
humanity tends to emerge, that is democratically desirable, definitive and durable.
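The "Understandable" figure in the list above is plain arithmetic: 400 words divided by 440 syllables. A minimal sketch of that calculation follows. The function name `words_per_syllable` is an illustrative assumption and does not come from Dodd's text; the two counts are the ones quoted in the statement.

```python
def words_per_syllable(word_count: int, syllable_count: int) -> float:
    """Return words per syllable; 1.0 would mean every word has one syllable."""
    if syllable_count <= 0:
        raise ValueError("syllable_count must be positive")
    return word_count / syllable_count


# Counts quoted in the text: 400 words written with 440 syllables.
ratio = words_per_syllable(400, 440)
print(f"{ratio:.1%}")  # prints 90.9%
```

The exact quotient is about 90.9%, which the statement rounds to 90%.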
Project Value-systems, S.C. Dodd, University of Washington, Seattle, WA, USA. EpiDoc 138
This EpiDoc 138 expands the “Dimensions of Societal Planning” matrix, EpiDoc 314:1
0. Our Personal Present
We all, the PEOPLE of our Earth,
want now a life of greater worth
for each in every place and time;
We seek ten means to climb:
and thus fulfill mankind

1. Hygienic
We like TO LIVE in health -
We want less sick of any kind;
We want more whole in heart and mind.
We strive to grow both safe and strong;
We yearn for life filled full and long

2. Economic
We like TO GET more wealth -
thru work and trade we're free to choose,
ourselves to feed, clothe and amuse,
to fill all needs from high to low
as we from child to adult grow.

3. Political
We like TO RULE by law -
thru rulers picked by vote of all
who let no rights nor freedoms fall;
with justice and security
in local or world community.

4. Recreational
We like TO ENJOY life -
each day, in work, or play, or rest
on hill or plain, with more of zest,
with memory; at cost that’s due,
with fun for us and playmates too.

5. Scientific
We like TO LEARN the truth -
to test out how - as science tries -
things move, or breathe, or symbolize;
to learn to curb our fears and war,
to progress fed by research more.

6. Domestic
We like TO LOVE and be loved -
as mate or parent, child or friend,
as neighbor, kin or fellow men,
as living, dead, or yet to be,
each in due ways and due degree.

7. Philanthropic
We like TO GIVE what’s used -
to help men climb, those with least health,
those least in other wants like wealth;
and those who long have lacked the most
of what their own groups prize the most.

8. Religious
We like TO WORSHIP well -
our God, our good, our goals in life.
We march in quest but with no strife,
for what as holy each may see,
while free to speak of what should be.

9. Artistic
We like TO BEAUTIFY -
ourselves, our homes, and all around -
thru music, pictures, gardened ground,
so what we touch or taste or smell,
so lovely feelings within us dwell.

10. Educational
We like TO TEACH wisdom -
to each child here or not yet born,
and all who hope from us to learn,
the best of ways to live and grow
thru what we feel and do and know.

11. Our Social Future
If we our FUTURE Earth
plan safe from bomb and dearth
then each in roles one plays
must help all win in ways
that best augment man’s days.
Pan-Acts Cosmos Pictured as the Mass-Time Triangle
General Systems: A Creative Search for Synthesis (excerpt)
JOHN CURTIS GOWAN
STUART C. DODD
During the past several years, a small number of persons*, including both authors of this
paper, have been at work on a particular aspect of general systems theory involving creative
transformations in the mapping or recognition of isomorphisms between one theory and
another. This has been accomplished by mail, individual visits, and by meetings held
particularly during the 1974 and 1975 Creative Problem-Solving Institutes at Buffalo. Other
meetings have also been held through the aegis of Jeanne Rindge of the Human Dimensions
Institute of Buffalo. This is the first and necessarily incomplete report of some of these
activities; it is completely unofficial, and should be understood only as the personal perceptions
(and possibly the errors) of two of the participants.
…
Table 3: Land-Dodd consolidation using 3 x 3 set.
(Dodd) \ (Land)               ACCRETIVE    REPLICATIVE    MUTUAL
subprocesses of interaction   combining    repeating      permuting
math operation                adding       multiplying    empowering
output                        sum          product        power
4. In place of two discrete, empirical models, we now have one categorical
model, which is a two dimensional set of three elements, capable of being
mapped into other theories, and of being extended into three dimensions, such
as when it may be possible to map it into the Genesa theory of Langham and
the eidetic concepts of Evering.
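One possible reading of the "math operation" and "output" rows of Table 3 is as a ladder in which each operation repeats the one before it: adding yields a sum, repeated adding yields a product, and repeated multiplying yields a power. The sketch below illustrates only that arithmetic reading. The function names are hypothetical, and nothing here is claimed to be how Land or Dodd formalized the table.

```python
from functools import reduce
from operator import add, mul


def sum_of(parts):
    """ACCRETIVE column: combine the parts by adding them."""
    return reduce(add, parts, 0)


def product_of(base, times):
    """REPLICATIVE column: repeat `base` `times` times, then add the copies."""
    return sum_of([base] * times)


def power_of(base, exponent):
    """MUTUAL column: multiply `exponent` copies of `base` together."""
    return reduce(mul, [base] * exponent, 1)


print(sum_of([2, 3]))    # 5
print(product_of(2, 3))  # 6
print(power_of(2, 3))    # 8
```

Each function is built from the one to its left in the table, which is the sense in which the three columns form a single graded set rather than three unrelated operations.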
Stuart Dodd's last written comments on this work follow in the next paragraph which
was constructed from his suggestions dated September 15-October 8, 1975.
The relationship between Land's theories and the Dodd Pan-Acts model is shown in
Table 3. Note that the second and third rows are the first two of four rows of the reiterating
matrix (Dodd, 1975). This matrix summarizes the methodology or scientists' behavior in
deriving the Pan-Acts model. Row one generates the symbols in the Pan-Acts model; row two
generates the syntax; rows three and four of the Pan-Acts model indicate the four key
processes and the four cosmic submodels by tense. The three tenses are transforms of
the three stages in the Land-Gowan model (by taking the middle stage as the present
tense).
…
*Among these were Jere Clark, Betty and David Cox, Barbara Hubbard, Gus Jaccaci, Robin
King, Lucy Krall, George Land, Derald Langham, Rendle Leatham, Jo Mock, Mort Rapp, and
Jeanne Rindge.
Stuart C Dodd Institute for Social Innovation
The Stuart C. Dodd Institute for Social Innovation (SCDI/SI) is a not-for-profit, tax-exempt
organization registered as such with the U.S. government. It was incorporated on May 9, 1997; its
501(c)(3) application was approved on September 9, 1997.
“As human systems and organizations grow ever larger, more complex, and more impersonal--in
our schools, in our communities, in our churches, in our governments, and in industries and
commerce--the individual shrinks toward facelessness, hopelessness, and frustration.”
Dr. Stuart C. Dodd,
"Citizen Counselor Proposal,"
The Seattle Times, November 10, 1974
Internet web address: www.stuartcdoddinstitute.org
Office Address: 4427 Thackeray Place NE; Seattle, WA 98105-6124
Phone: (206) 545-0547; Fax: (206) 632-1975;
Electronic mail general information: info@stuartcdoddinstitute.org
Web master: Webmaster@stuartcdoddinstitute.org
Other e-mails: DrRSKirby@aol.com (Director).
LUCSTEW@aol.com (Office Manager)
Purposes
SCDI/SI encourages scholarly, interdisciplinary research in the archives of Stuart C.
Dodd (1900-1975), Professor Emeritus of Social Science at the University of Washington.
Scholars and associates pursue the intellectual, moral and civic legacies of Dr. Dodd in the
fields of sociology, business administration, education, urban planning, sustainable
communities, cosmology, statistics, and mathematics. Topics of particular interest include
organization development, administrative theory, many-to-many communication, and value
reporting.
Our Mission
The Stuart C Dodd Institute for Social Innovation is dedicated to using the full range of
human knowledge to achieve a society that is democratic, equitable, just, compassionate,
spiritual, sustainable, diverse, and fulfilling.
Our Methods
“We advance social innovation on the successive planes of theory (particularly Social
Innovation Theory), organization (particularly The Stuart C. Dodd Institute for Social Innovation
with help from the Forum Foundation), event (particularly our annual conferences and regional
training activities), in social policy, program and experiment, and communication.
Our social experiments range from uses of the Fast Forum® technique to the establishment of
civic innovation programs in entire cities such as Slidell, Louisiana. We operate with a view to
doing ideally profitable social science and social philosophy research deriving from the legacy
of Stuart C. Dodd. Our national and international activities collectively act as a model social
science "think-tank." We explore new horizons of excellence in self-management and
organizational development, in social theory, philosophy of society and political
science/philosophy/art, and in the development of breakthrough social technologies. We aim to
advance the theory and the practice of social innovation, as part of the legacy of Stuart C.
Dodd. We conduct research in sub-fields of social innovation such as educational innovation.
We are developing an in-house five-year plan for the planes on which we operate, from the
pre-theoretical to the communication/publication plane.”
SCDI Founder: Richard Spady
Richard Spady, a student and practitioner of Administrative Theory, is president of the
Forum Foundation. A Seattle businessman, he co-founded Dick's Drive-in Restaurants in 1954
and currently is its president. Spady is active in community affairs, and is a lay speaker in The
United Methodist Church.
“I learned long ago that if something needs to get done, one has to get organized to do
it. So in 1997 I helped organize the Stuart C. Dodd Institute for Social Innovation. It is a 501 (c)
(3) non-profit, tax-exempt organization to encourage scholarly, interdisciplinary research in the
archives of Stuart C. Dodd. Here is the story behind the Institute.
An institute to continue the work of Dr. Dodd had been a dream of mine for many years.
But Dodd's work on mathematical cosmology is not for the faint of heart; few people can
understand it, much less move it forward. My acquaintance with the Rev. Dr. Richard S. Kirby
through the World Network of Religious Futurists told me he might be the person for the job.
I visited Dr. Kirby in London in 1994. He had finished his Master of Divinity studies at
General Theological Seminary of the Episcopal Church, New York (USA) in 1985 and his
Ph.D. in Theology studies at King's College, London from 1987 to 1992. Richard Kirby is an
outstanding scholar. I judged he had the capacity to pick up the traces left in the archives of
Stuart Dodd.
Dr. Kirby was open to my proposal, and he wanted to return to the United States. I
offered him a home and office in Seattle (owned by my business firm and dedicated to public
service) and invited him to be the Stuart C. Dodd Chair in Social Innovation at the Forum
Foundation. He accepted, came to Seattle in 1995, and is now an American citizen. When the
Stuart C. Dodd Institute for Social Innovation was formed in 1997, Richard Kirby became its
first Executive Director.”
The Leadership of Civilization Building
by Richard J. Spady and Richard S. Kirby
Administrative and Civilization Theory, Symbolic Dialogue, and Citizen Skills for the 21st
Century
“As a new century dawns, this book introduces new truths about the theory and practice of
civilization. It offers a fresh and imaginative analysis of the meaning of citizenship and
introduces practical tools for the empowerment of people in business, government and religion.
Indeed, the authors have broken new ground in the practical definition of civilization building. It
invites people everywhere to play a major role in the civilizing of their own civilization. In these
pages, you will find:
• An enlightened approach to leadership in organizations and society
• New theories on how to develop your powers as a citizen
• New tools to communicate more effectively through symbolic dialogue”
Available through the Forum Foundation.
The Forum Foundation
“The Forum Foundation ... conducts futures research in the field of Administrative Theory and
Many-to-Many Communication technology to discover those dynamics which tend to move
organizations and institutions, universally, toward solving their problems and anticipating or
adapting to changes in their environment.”
www.forumfoundation.org
Presidential Address project
Each year, the Forum Foundation conducts a survey of high school students on the subject of
that year's Presidential address in January, whether the State of the Union or the Inaugural
Address.
Each student is given the opportunity to agree or disagree with every sentence in the address.
DVDs available from the Forum Foundation
Visionary Voices Series with Richard Spady/Dr. Cecil H. Bell, Jr.
The Fast Forum Opinionnaire
A New Approach to “We the People” says Dick Spady
Founding SCDI Executive Director: Rev. Dr. Richard S. Kirby (1949-2009)
Rev. Dr. Richard S. Kirby, Executive Director, former adjunct faculty of the University of
Washington's School of Business Administration; former Director of Administration,
International Mensa; Chief Executive Officer of the World Network of Religious Futurists; and
co-author: The Temples of Tomorrow: World Religions and the Future (Grey Seal Publishers,
1993).
Richard Spady and Richard Kirby met at a conference and found that they shared interests
and goals. Spady and Kirby discussed the Stuart C. Dodd Institute for Social Innovation and it
was agreed that Kirby would become the Executive Director.
World Network of Religious Futurists
“Religious futures scholarship focuses on predictable occurrences in the future of religion,
based on present observable trends, and past trends in religion, compounded by expectations
of wild cards or quantum leaps, in the context of society's future as a whole, ranging from
science to technology. What kind of science and technology excites you, what kind worries
you? Get involved in bringing religious values to the future of these sciences and
technologies!”
http://www.wnrf.org
SCDI Catalyst: August T. Jaccaci
Jaccaci has worked as a futurist, artist, teacher, and international consultant to companies
both large and small helping them develop successful business strategies using his
METAMATRIX future mapping methodology. In 1995, Jaccaci and Susan B. Gault founded the
Social Architects Associates and began their discovery of a natural evolutionary process to
charitable and profit-making enterprises.
Gus Jaccaci was a personal friend of Stuart C. Dodd. They both attended annual workshops
at the Creative Problem Solving Institute, University of New York at Buffalo. Gus had the
opportunity to videotape an interview with Dr. Dodd before his death. In the mid-90s, Gus
was instrumental in bringing together the founder of the Institute and the editor of the DML.
2008 Thomas Jefferson Returns
Letters received and recorded by Gus Jaccaci
“Mr. Jefferson, the author of the Declaration of Independence and the social architect
of the Bill of Rights, is a primary guardian of the American Soul. Under the terrible pressure of
losing the American Experiment in self-governance to the forces of fear trending toward
fascism, Mr. Jefferson has returned to empower individual Americans to rise up to their full
responsibility to protect their souls and their personal and community potential for truth, beauty
and goodness which together express love, the meaning and destiny of America.
Mr. Jefferson’s writings received at various moments on and after July 4th within the
year 2007 are meant to be letters to all humanity. They are, however, focused on aspects of
the American soul lest it journey farther from planetary potential toward planetary problem.”
Jefferson 2040
"Awaking America" is an interactive and spontaneous conversation with a revered social
architect of our nation, Thomas Jefferson. Speaking from the year 2040, he addresses the
challenges of our time and reveals the creative steps Americans will take in the coming
decades to reinvent our politics, our economy and our relationship with all life. Mr. Jefferson
combines his view of America's past, present and ideal future into a provocative, entertaining
event we describe as a "transformance" since no one leaves unchanged.
http://www.jefferson2040.org/
Unity Scholars
Unity Scholars is a non-profit organization of social inventors who host visionary people,
projects and events dedicated to enhancing healthy human evolution. The purpose of the
organization is to address global social and environmental problems by applying the
knowledge gained from studying nature's universal patterns and by developing the creative
potential of individuals and their communities. Unity Scholars comprises a worldwide web of
people who share our dedication to this purpose and who support our endeavors through the
exchange of ideas, teaching and participation in Unity Scholars' projects.
http://www.jefferson2040.org/unityscholars.html
Futurum Grid, 46th Annual Creative Problem Solving Institute Reference Sheet
SCDI Dodd Memorial Library Editor: Burt Webb
Burt Webb is a computer consultant in the Seattle, Washington area. He has written,
contributed to or edited books and articles on alternative energy, robotics, artificial intelligence,
nanotechnology, SETI, ethno-botany, social impact of technology, current affairs, economics,
sociology and psychology as well as science fiction, fantasy and adventure novels and movie
scripts.
“I met Stuart C Dodd during the founding of the World Future Society Evergreen
Chapter in Seattle, Washington in the early 70s. We enjoyed discussing comprehensive
multileveled models of reality. In 1974, he asked for my assistance in writing up some of his
material for a general audience. He was a very warm, cordial, intelligent and witty man. I
enjoyed our association very much. Dr. Dodd passed away in 1975 but his work was ahead of
his time and is very relevant today. After his death, all his papers were given to the University
of Washington and still reside in the archive there. Access is restricted and the papers were
never cataloged. It is very difficult for anyone interested in his ideas to access his writings.
I also met Dick Spady at the WFS Evergreen Chapter meetings. We had many lively
conversations about futurist subjects. In the late 1990's, Dick approached me and asked for my
assistance in bringing Stuart's work to a new audience. I was glad to have the opportunity to
assist in this project. Due to the generous support and encouragement of his friend, Dick
Spady, the valuable work of Stuart C. Dodd will now be available to scholars everywhere who
wish to explore his legacy.”
Burt Webb
The Nexilist Notebook
I have been interested in interdisciplinary studies since I was a kid. So this blog will cover a
wide variety of subjects including how everything is connected. Politics, society, psychology,
technology, religion, myth, dreams, the arts, and winning at the game of life.
http://www.nexilist.com
Dodd Memorial Library
The Dodd Memorial Library is a project of the Stuart C Dodd Institute for Social
Innovation. The purpose of the project is to publish a series of volumes that contain the work of
Stuart C Dodd. Dodd’s intellectual legacy resides in the Special Collections of the University of
Washington in Seattle, Washington. Dodd willed his writings to the University upon his death.
They consist of a series of file boxes that occupy about 110 linear feet of shelving at the U of
W. They contain his published and unpublished books and articles. In some cases, he had
self-published collections of his papers. In other cases, he left detailed Tables of Contents for
books that were intended as collections of his papers. The archive also contains his
correspondence, notes, working documents from his years as a college professor, working
papers from the Washington Public Opinion Library and other personal papers. The DML will
consist of all his published work collected and edited into a series of volumes by subject. It will
also contain a collection of biographical material in one volume, a research guide to his
collected works at the U of W and an encyclopedia that contains representative papers that
cover the breadth of his life’s work. The volumes will be announced on the Institute website as
they are released and will be available for purchase.
Dodd Memorial Library Editor Note
Stuart produced 9 books, thousands of research papers and hundreds of journal articles
during his 50-year career. The material occupied over 110 feet of files. Over two years, I
managed to obtain copies of his books and over 700 articles and papers. During his life, he
created a number of collections of articles and tables of contents for a number of new books. I
found these tables of contents and proceeded to collect all the articles mentioned. I am now in
the process of creating a 25 volume set of books covering the life's work of Stuart C Dodd.
This series will be made available through the Stuart C. Dodd Institute for Social Innovation.
Inquiries about the series should be referred to the following contacts:
Burt Webb: Editor
(206) 729-7410
phoenix@eskimo.com
Volumes in the Dodd Memorial Library
Behavioral Sciences
Best Published Articles
Best Unpublished Articles
Dimensions of Cosmos
Dimensions of Society
Epicosm
Fitness for Self-government
General Organization
Horizons of Thought
Human Values
Interactive Symbolizing
Message Diffusion
On Language
Pan Acts Cosmos
Pan Acts Theism
Probable Acts of Man
Probable Acts of Men
Public Opinion Research
Dodd Research Guide
Revere Studies on Interaction
Social Relations in the Near East
Synthesizing Oneself, Society and Cosmos
Systematic Social Science
Transact Modeling
Techniques for World Polls
(Bolded volumes are currently available – email phoenix@eskimo.com for details)
Systemed Studies on
Opinion Polling
Overview of Book
This book was originally intended to be one of a set of four themed volumes covering the life’s
work of Dr. Dodd for publication by an academic publisher around 1970. Unfortunately, the
project was cancelled. This particular volume collected Dodd’s papers on opinion polling.
Notes on Articles in Systemed Studies on Opinion Polling
written by Dodd for Christopher's rewriting
The format of each of the 42 Prefaces to the 42 articles of this Volume 3 was agreed on
to comprise four sorts of paragraphs (though the last three may be flexibly merged, omitted, or
otherwise treated within and between sets of articles).
i.e., 1) an Abstract of the article, labeled as "Abstract," in 100-200 words;
2) Sentences placing the article (with hindsight) within the eight-factor transact
formula,
$B_{ir} = [A; P; T; V; W; L; M; C]$
noting the factors stressed in the five sections, and noting any relation to the
Likability submodel
$B_{LK} = {}^{s}_{s}[(A_F; A_K; A_{Di}); P_P; T^{-1}; (V); W; L; M; C]^{s}_{s}$;
3) any biographic comment, date of article, etc.;
4) any evaluative, or other, comment 8CC may want to write.
For procedure in writing up, I would advise preparing abstracts first at least for each
section. Then write up the other paragraphs for each article or section of articles. Finally, write
the general preface to Volume 3 so that it grows inductively out of the summarized articles as a
synthesis of your comments (and mine here) in general.
I would like to have the general preface to the volume include:
a) Its place among the four volumes of Systemed Studies on Human Transactions.
This can be briefly noted since my Preview to the four volumes will be repeated
in each volume.
b) Note that "polled opinion" is speech behavior (or symbolic interaction) measured
by polls and so is a transact or "recorded act-in-context." It's a subcase of my
Methodological Transact Theory which says: "Insofar as Transact A and later
Transact B match features, in just so far A predicts B," etc. The eight
transfactors, especially Wordings, W, and the four-corner scripts or facets;
especially the zero exponent (cite articles in other volumes), all apply here. I'd
like to develop the Transact Model as an explicitly predictive and controllative
theory and more than a classification scheme.
c) Note that polling is methodology, the observer's behavior, i.e., the scientist’s
transaction.
d) Note that the Likability Theory is an amplifying of transact theory and so should
cover or subsume both "opining" and "polling" behaviors. Does it? Invite the
reader to test this hypothesis.
The general preface might well run over the five sections (unless you prefer a
Section Preface). Show how the topic of each focuses on specified transfactors.
Note that the classifying of articles under these headings is ex post facto and since
all 150 articles are to be fitted in somewhere, some articles fit better than others. The first
and last articles in each of the five sections are arranged generally to introduce and
summarize the section heading.
My biographic periods have been (this is too detailed--as you requested)
1900 Oct. 3    Born in Talas, Turkey. (My father was a medical missionary.)
1905-6         First grade in Montclair, N.J. (Parents on furlough in U.S.)
1906-13        Finished 8th grade in Turkey (Father taught me Arithmetic; mother all
               other subjects)
1913-17        High school, Montclair
1917-18        Tutored a paralytic boy to start earning my way through college
1918-22        Princeton, BS, BK, MA, and Ph.D.
1922-23        Psychologist, State Home for Boys
1923-26        MA & Ph.D. Princeton
1926-27        London Karl Pearson and C. Spearman Rockefeller Fellow
1927-47        Prof. of Sociology, American University of Beirut (AUB)
1929-47        Director, Social Science Research Section at AUB
1943-44        Director of Surveys in Sicily AFHQ Lt. Col. and later in London
1947-71        Professor of Sociology, University of Washington, Seattle
1947-61        Director Washington Public Opinion Laboratory
1950-53        Director Project Revere (Diffusion Experiments for Air Force)
June 1971      Retired
Also see my vita, 12 Who's Who's etc., and Article #39.
Section 1 on World Polling
The first seven articles on World Polling developed out of World War II pressures on me,
and opportunities also. On Christmas day of 1942, I returned from the U.S. to Beirut via
Australia to find a request and a research opportunity awaiting my arrival. I had left Beirut
eighteen months earlier, when the Nazi takeover forced Americans of military age to leave the
country or go into concentration camps. The British Ministry of Information and the American
Office of War Information (O.W.I.) wanted me to try out the new instrument for gathering
intelligence, polls, from whole civilian populations. Just how far would the Arab populace,
largely opposed to British and French colonialism, help or hinder the Allied war effort in specific
ways and areas? How much would polled respondents, when suspicious and often hostile, lie
to the poll takers? Our success in measuring such lying and in demonstrating the value of
polling generally, in Lebanon, Syria, and Palestine in 1943 (reported in Polling S in Syria,
Government of Palestine Press, 1943), resulted in an urgent telegram from General
Eisenhower's Headquarters in Algiers for me to go there to organize surveys in Italy as the
Allies progressively took over. Our polls helped so dramatically in Sicily to resolve a rationing
breakdown and a Mafia crime wave that my next assignment was to plan the post-war uses of
polling in reconstructing war-ravaged societies. This aim grew into planning for measuring
international opinion in the future operations of organizations then existing only as early
blueprints but later known as the United Nations and UNESCO. In 1943 I developed the
dream of a Barometer of International Security to help world decision making become
progressively more integrated over the rest of the twentieth century. This was intended to help
develop "One World."
Articles 1, 2, 3 and 6 tell of initial hopes and early progress reports on the Barometer.
Article #4 proposed a set of measurable standards to help assure integrity, competence and
accuracy when polling across diverse cultures and language areas. Thus, for example,
international polling introduced new variables when translating the questions in common into
several languages. I invented an operational index of fidelity of translation. The questionnaire
of n words would be translated from Language A to Language B and then independently
retranslated back to A and the percent of identical words computed in the initial and terminal
versions of A.
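The index of fidelity described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name fidelity_index is hypothetical, words are compared without regard to case or position, and each word is matched at most as often as it occurs in both versions. The original notes do not say how repeated or reordered words were scored.

```python
from collections import Counter


def fidelity_index(original: str, back_translated: str) -> float:
    """Percent of the n words in the original version of A that
    reappear in the independently retranslated version of A.

    Assumption (not from the source): matching ignores case and
    word order, and counts each word at most as often as it occurs
    in both versions.
    """
    original_words = original.lower().split()
    returned_words = back_translated.lower().split()
    if not original_words:
        raise ValueError("the original questionnaire has no words")
    # Multiset intersection keeps the smaller count of each shared word.
    shared = Counter(original_words) & Counter(returned_words)
    return 100.0 * sum(shared.values()) / len(original_words)


if __name__ == "__main__":
    initial = "do you trust the radio news"
    terminal = "do you believe the radio news"
    print(round(fidelity_index(initial, terminal), 1))
```

In this made-up example five of the six original words survive the round trip, so the index is about 83 per cent. A stricter variant could require the words to match position by position.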
In 1944 in London, I negotiated with over a hundred relevant officials of Allied
governments up to cabinet level, military generals (Russian, British, French, Italian and
American), polling executives and mass media publicists to explore the Barometer project.
We tried to build it into UNESCO's charter but Russian opposition thwarted that. Of the
wartime polling agencies I helped set up, several continued, such as the German Institute for
Demoscopy (a word I coined for "observing people by sampling"). Back at AUB in 1945 after
my year's leave in the Army, I drew up the proposed charter for organizing the world's pollers
in what became the World Association for Public Opinion Research (WAPOR), now an official
non-government organization advisory to the UN, as one item in the Barometer. At the Central
City (Col.) conference of pollers in 1946 I became co-chairman with George Gallup of
WAPOR's organizing committee and served as its secretary for six years. In 1947, I came to
the University of Washington to launch and direct the Washington Public Opinion Laboratory
("POL"), the first polling agency dedicated to pioneering in basic research in the Behavioral
Sciences.
During this period, 1943-57, I published these articles:
#1.
“Barometer of International Security” was a call to mobilize interest in an eventual world
Barometer.
#2.
"Towards World Surveys" reported developments through 1946 when I visited Seattle
from Beirut as Walker-Ames Lecturer.
#3.
"Steps towards a Barometer" reported progress in 1950.
#4.
“Standards for Surveying Agencies” proposed a set of measurable standards to help
assure integrity, competence and accuracy when polling across diverse cultures and
language areas.
#5.
“Techniques for World Polling: A Review of the Methodological Literature for
Cross-Cultural Surveys” In 1953 a three-cornered contract with UNESCO, W.A.P.O.R., and
P.O.L. gave me resources to survey the field of intercultural polling. The resulting
volume developing the theory as well as the practices of polls (unpublished by
UNESCO for lack of a budget) was summarized here.
#6.
"The World Association for Public Opinion Research" reviewed the whole movement
towards international surveys up to 1957.
#7.
"Developing Demoscopes for Social Research" broadened the field of polling by
outlining "pan-sampling" of all pollable behaviors, "organization sampling," "world
sampling," and "time sampling." In these four directions I developed my dimensional
transact model's dimensions of the Acts of People in Location and Time as "Tr =
f(APLT)."
Section 2 on Techniques of Polling
These 14 articles on "Techniques of Polling" contributed hardware and software innovations to
the polling profession. Let me comment on each in a sentence or two.
#8
“The 'Steps-and-Parts' Model for Polling”, an ASA paper in Denver, has been published
only in pieces hitherto. I see it as a highly scientific theory of the polling process. For it
describes polling so fully as to explain its stages, and predict their recurrence, and so
control the whole process, whenever people want a poll enough to pay all costs
involved. The article analyses any polling operationally and extensionally such as to
resynthesize it and thereby restore that original polling and even enable improvement
upon it. It develops a theory of methodology for all science, summarized in the three
variable equation, A = a xoc. This implies the rule that: Only in so far as the observer's
actions, a, are standardized to become an invariant factor will the variance of the
speech acts to be observed, x, agree exactly with the variance of the observed data, A.
This paper grew out of a "Manual for Surveying" I wrote in Sicily to help in training new
staff and new agencies in occupied territories.
#9
“Dimensions of a Poll” Analyzes a poll into its standard dimensions in such detail as to
permit a poller to plan and estimate costs and make bids on a future poll — let alone
teaching the novice how to execute a polling operation. Note that both articles #8 and
#9 (as well as the next three) were written in the early days of the POL at UW where I
was both conducting polls and teaching students how to poll better in the future.
#10
"Sociomatrices and Levels of Interaction." I consider this to be my best contribution to
methodological theory in Sociology. If Sociology is to study groups and organizations in
human population, then the three-axis, or solid, matrix becomes the best operational
definition of these three degrees of interaction and the best way of ordering data for
exact or mathematical analysis and synthesis. I first introduced the sociomatrix into
Sociology as a systematic reordering of Moreno's sociograms in Articles #? This article
extends those two-matrices to three-matrices by including the third axis to provide for
ordered rescoring of all role-acting and consequent organizational relations, the heart
of Sociology. I predict that: "Insofar as sociologists in the future explicitly use the
three-matrix with its variants and implications, in just so far Sociology will get on with
becoming an exact science able increasingly to so describe Society as to explain,
predict, and control it even better."
#11
"Predictive Principles for Polls." Here is a comprehensive treatment of the central aim of
all science: to predict recurrence under recurring conditions. This is the acid test of all
empirical science applied in detail to Behavioral Science and its subfield of Speech
Behavior or Symbolic Interaction as sociologists prefer to call it. Incidentally this paper
with its analyses of concepts, followed by twelve rules for predicting better, and a check
list of 82 chief predictor dimensions in polling gives an excellent review of my
philosophy of scientific methods and its techniques of dimensional analysis up to 1951.
#12
"Scientific Methods in Human Relations" is a popularizing statement that I often use in
my courses. Lundberg in his “Can Science Save Us?” used to wish social scientists had
as convincing evidence of the effectiveness of scientific methods in their field as
physical scientists had in their field. This article offers several items of evidence in the
early 1950's. The present set of four volumes of Systemed Studies on Human
Transactions reviews just one author's contributions to such evidence among hundreds
of thousands in the last two decades.
#13
"On Reliability in Polling" was a comprehensive theoretical analysis of polling reliability
and a practical application with data under wartime conditions in Syria and Sicily. It
answered effectively the crucial question of the top military, civilian and polling leaders:
"How reliable are public opinion polls, or can they become under optimal administration,
in semi-hostile wartime populations?" The outstanding finding from many indices and
diverse populations was that while reobserving individuals gave only 63 per cent to 90
per cent agreement, reobserving plurals, or averaged responding in a population, gave
a trustworthy 99 per cent of agreement. The averages of properly conducted polls
seemed safe indicators of a public's attitudes and probable future behavior, in the
situations studied.
#14
"The Standard Error of a Social Force" antedated any wartime polling. It was carried out
while on furlough in the United States in 1935 with the help of a critical reading of my
ms by S. S. Wilks of Princeton. It was based entirely on my inter-village hygiene data
from A Controlled Experiment on Rural Hygiene in Syria. It innovated in developing
rigorous error formulas for social forces. It also computed instances of them from
censuses of whole communities in a primitive culture, but under our experimentally
controlled conditions with a before-and-after design.
#15
"The Applications and Mechanical Calculation of Correlation Coefficients" This article
and the next one report my inventing of a gear-wheel machine to graph and compute
correlation coefficients and the first three moments of any distribution. In Graduate
School on switching from Economics to Psychology, I found I needed to learn some
statistics, especially what a correlation was. So I immured myself in my room for two
weeks to master Truman Kelley's Statistics even to the extent of sending its author a list
of its misprints. From such intensive study, relaxed in sleep, the correlation machine
subconsciously crystallized and arose luminous and almost whole on waking one
morning. The Princeton Physics Department put their mechanic onto a working model,
pictured herein. The Franklin Institute for research in Physics invited me to a banquet
lecture which they published as my first publication. (1926)
An amusing item was an invitation to exhibit my correlation machine at the AAAS
Convention in Washington that Christmas. Two gentlemen passed by and asked
searching questions about its loss of accuracy through grouping the data. In reply, I
explained how its inaccuracy would be less than 1 per cent according to Kelley's
argument on page x, and after some further intricate discussion I found I was
expounding this textbook to its author!
#16
"A Correlation Machine" was an invited article addressed to my fellow professionals in
Psychology — also published in January 1926. This correlation machine next went into
an electrically driven model. The machine shop puttering on it in Trenton ran me $500
into debt before I panicked at the prospect of losing all my National Research
Fellowship and livelihood and stopped work on it. Later, after our wedding my wife
devoted her major wedding present check not to setting up our household but to bailing
me out of debt. Then the Cambridge Instrument Co. undertook its development and sold
models to Harvard, Berkeley, and Chicago labs before the stock market collapse of
1929 and ensuing depression stopped further manufacturing there. When I returned
from Beirut in 1934, the IBM electric computers were making gear wheel devices
obsolete and my correlation machine had become a technological dodo. In 1950, Sam
Stouffer told me Harvard's copy might as well rust on my storage cupboard as in a
Harvard attic and shipped it to my polling laboratory as an historic fossil.
Twenty years later, an IBM official phoned in "out of the blue sky" to inquire
about "the Dodd Correlator." They wanted to mount it in their permanent exhibit of "The
History of Computers" in their new headquarters skyscraper, then going up in central
Manhattan. Had I a model of it? Could he come to Seattle to see it? On negotiating
about their embalming of it in their Exhibit, I told him its history. Whereupon he wrote me
a check to reinstate our wedding present forty years deferred.
#17
"On Predicting Elections or Other Public Behavior" This paper discusses three
measurable and largely preventable types of error in any polling which accounted for
the polls mis-predicting the US 1948 presidential election. In that election all the polls
except the newly founded Washington Public Opinion Laboratory (P.O.L.) predicted a
sure victory for the Republican candidate, Dewey, whereas the Democratic candidate,
Truman, won it. It was my Laboratory's maiden exploit and dramatically supported our
Laboratory's use of more rigorous scientific methods than were then current.
#18
"A Call for Experimental Designs for Election Polling" Based on our 1948 success, we
organized a symposium in the International Journal of Opinion and Attitude Research to
plan improved designs for polling and testing the polls in the 1952 elections. This article
sketched the basic methodology, or polling design, that was proposed, and increasingly
followed, for future presidential and other political polls.
#19
"On Estimating Latent from Manifest Undecidedness" This article, jointly authored with
Kaare Svalastoga, now the Professor of Sociology at the University of Copenhagen,
grew out of a question I asked him in his oral examination towards the Ph.D. degree. It
sparked his thinking and the search for a fuller answer which was found in our data. A
high percent of "Don’t Know" or "Undecided" responses predicts (r > .9) instability or
probable change of opinion (if reobserved in a second poll) among the decided subset
of the polled population on that issue. An unusually large “don't know” percent indicates
an unreliable prediction among the decided. It indicates that subsequent events can
change the prediction. Herewith we have a useful item for pollers. A high percent of
undecided voters in a poll indicates an unstable or unformed opinion which the trend
with time can predict with greater refinement. Had this been known and applied in the
1948 U.S. elections more of the polls could have predicted the outcome more
accurately.
#20
"Research Note on the Law of Forecast Feedback" This research note suggests how
better theorizing, if pollably stated, may improve the predicting of human mass behavior.
A hypothesis (miscalled a law) of "forecast feedback" claimed that: In a situation of
evolving opinion the feeding back of a published forecast of its outcome would affect
that evolving "unpredictably." Our likability theory changed the hypothesis to expect
predictability and specified it in testable terms.
#21
"The Momental Models for Diffusing Attributes" This invited article in the Indian
quarterly, Darshana, describes the moments family of diffusion curves that explain and
predict the stochastic processes called the normal, exponential, and logistic laws.
These diffusion curves were the outcomes of our P.O.L.'s massive experimental
research called "Project Revere," executed for the U.S. Air Force, 1939-53. It combined
applied research to guide the Air Force in leaflet-dropping operations with pure research
to induce and test mathematical models for diffusing of items through a population. The
moment laws described here develop the elementary theory of diffusing an item through
a population or of one-way communicating. Communicating in turn is a factor process in
every social process or organization. It seems the elementary dynamic unit of society.
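To make two of the curve shapes named above concrete, the following is a minimal sketch only. It is not the moments models of the article itself, whose equations are not reproduced in this overview. It assumes two textbook growth rules: a constant outside source that reaches a fixed fraction of the remaining non-knowers at each step (a decaying-exponential approach to saturation), and person-to-person telling in which new knowers are proportional to current knowers times the fraction still unaware (a logistic S-curve). The function names, rates, and population size are hypothetical.

```python
def exponential_diffusion(n_people, rate, steps):
    """Knowers after each step when a constant source reaches a fixed
    fraction `rate` of the remaining non-knowers per step (assumed rule)."""
    knowers = 0.0
    history = [knowers]
    for _ in range(steps):
        knowers += rate * (n_people - knowers)
        history.append(knowers)
    return history


def logistic_diffusion(n_people, rate, steps, initial=1.0):
    """Knowers after each step when knowers tell others at random:
    new knowers = rate * knowers * fraction still unaware (assumed rule)."""
    knowers = initial
    history = [knowers]
    for _ in range(steps):
        knowers += rate * knowers * (n_people - knowers) / n_people
        history.append(knowers)
    return history


if __name__ == "__main__":
    # Hypothetical town of 1000 persons, observed for 30 time steps.
    exp_curve = exponential_diffusion(1000, 0.2, 30)
    log_curve = logistic_diffusion(1000, 0.5, 30)
    for step in range(0, 31, 5):
        print(step, round(exp_curve[step]), round(log_curve[step]))
```

Under these assumed rules the first curve rises fastest at the start and flattens, while the second starts slowly, accelerates, and then flattens as the unaware remainder shrinks.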
Section 3 on Semantics in Polling
The six articles, or studies, in Section III on Semantics in Polling focus on the Wordings
factor, W, in the overall 8-factor formula for any human transaction,
BLK = s s[s (AF; AK; ADi); PP; T-1; (V); W; L; M; C]s ;
Transactions that focus on interactions of people with words or symbols are usually
spoken of in Sociology as the field of "symbolic interaction." Though not in the current
mainstream of symbolic interaction, represented by writers like Skinner, Homans, etc., these
articles contributed to its dimensional current that develops formal standardizing factors and
universal facets of symbolic interaction. Thus the author would classify the six articles here by
their semiotic power facet as follows:
WO = Qualitative Wordings chiefly
e.g., Article #22 on "Public Opinion Definitions" helps to standardize concepts in
polling.
WI = Quantitative Wordings Chiefly
Here Article #27 "Note on an Index of Conformity" develops a quantitative core
index for conforming transactions;
and Article #24 "A Comparison of Scales for Degrees of Opinion" refines some
quantifying in opinion polling — the general theme of this volume.
WII= Relational Wordings Chiefly
Here Article #23, "The Interrelation Matrix," first proposed the now widely used
sociomatrix tool for ordering, measuring, and mathematizing increasingly the
pair relations between persons that form the basic elementary "interact units" of
all communities and of human society.
Article #26, "The Coefficient of Equiproportion" proposed a new and more exact
measure of hierarchic relations among intercorrelated variables.
WIII= Systemed Wordings Chiefly
Here Article #25, "A Simple Test for Predicting Opinions from their Subclasses"
innovated in charting a logico-statistical system of concepts, indices, and
harmonizing relations among them. These measurably furthered the central aim
of scientists to predict recurrences under recurring conditions.
These six articles, though written over a span of a quarter century, can yet be integrated
with hindsight by the transact model's facets. The post-superscript, 1s, or semiotic exponent,
classifies them as contributing to the polling of opinions of the qualitative, quantitative,
relational and systemed levels of complexity in symbolic interaction. These four levels of
increasing complexity or organization are Kant's "categories of the understanding" (though the
last shifts Kant's "modality" to the more operational "systemed" concept). These four semiotic
levels can be systemed in a series of rising dimensionality in semantic n-space. They can be
symbolized operationally by matrices of n axes as illustrated in Article #10 here which spells
out the four levels of increasing complexity for the population factor (see below).
These four semantic power levels are derived and defined by successive and
cumulative rounds of reitering as named in each row of the table below. This table reviews
these four power levels in their more common syntactic and transactional versions.
Table 1. The Power Facet, Xs, of any Transaction Factor, X.

Reiterant Derivation (Extensional language) | Lay terms or cornerscript (Intensional language) | Power Notation (Exponents, Post-superscripts) | Moments of any Distribution, μ | Space TransFactor, Ll | Population TransFactor, Pp | Time TransFactor, Tt
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Listing into SETS         | A Quality  | X0   | μ0 = ΣX^0/N = 1 | A Point, L0    | A Person, P0 (= 1)    | A Date, T0
Adding into SUMS          | A Quantity | XI   | μ1 = ΣX^1/N     | A Line, LI     | A Plural, PI          | A Period, TI
Multiplying into PRODUCTS | A Relation | XII  | μ2 = ΣX^2/N     | An Area, LII   | A Group, PII          | A Correlation of time series, TII
Multiplying into POWERS   | A System   | XIII | μ3 = ΣX^3/N     | A Volume, LIII | An Organization, PIII | A System of 3t time factors, TIII
The power facet has a versatile capacity to order and systematize transaction factors and
human speech or symbolic interaction generally. Its application to classifying these six articles
of Section III on Semantics in opinion polling is only a sample of its usefulness. (See Article 1
in Vol. 2 and Articles 20 and 39 in Vol. 4 for fuller explanation of transaction factors and
facets.)
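The moments column of Table 1 can be illustrated with a short computation. This is a sketch only, assuming the ordinary raw-moment reading of the table's entries, namely that the moment of power k is the sum of the k-th powers of the observations divided by their number N. The function name and the sample numbers are made up for illustration.

```python
def raw_moment(values, power):
    """Return the raw moment of the given power: sum(x ** power) / N."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(x ** power for x in values) / len(values)


if __name__ == "__main__":
    sample = [1, 2, 3, 4]  # made-up observations, not data from the articles
    for power in range(4):
        # power 0 -> 1.0, power 1 -> 2.5, power 2 -> 7.5, power 3 -> 25.0
        print(power, raw_moment(sample, power))
```

The zeroth moment is always 1, matching the table's first row (listing each item once into a set); the first moment is the arithmetic mean (adding into sums); higher powers follow from repeated multiplying.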
In later articles on the Reiteration Rule and the Reiteratings Matrix we shall see how
every symbol man uses and all mental life expressible in symbols can now be analyzed and
improvingly re-synthesized in terms of elemental reitering speech operations. A large causal
factor in the progress of modern inductive science seems due to the use of extensional
thinking — the language of set theory. I see increasing use of reitering sets of symbols or
opinion elements as the on-coming revolution in the behavioral sciences including all its
symbolic interaction and polling subfields.
Section 4 on Values in Polling
The 12 articles in this Section IV concern first opinion polling — the theme of Volume 3
— as studies on the methodology or observers' behavior when observing the opinions, i.e., the
verbalized attitudes or asserted preparedness of people to act. Secondly, the subtheme here
concerns values in polling: what is chosen as most valuable or worthwhile to poll. The focus
is on polling the things-liked most, the values factor, V, in the 8-factor transact model. Volume 1
collected together my articles that focus primarily on the substantive values that people live
and strive for. In this Section of Vol. 3 the articles deal primarily with how to poll such values,
how to measure them. These articles concern the concepts and indices, the matrices and
other scientific methods whereby men can observe values more exactly and fully.
#28
"The Likability Theory" Thus the first two articles, on the Likability Theory and its
subtheory on Tensions towards things-liked (called "likables"), occur in fuller versions
in other volumes, but here they may be re-studied with their methodology of uppermost
interest. In the Likability Theory we have analyzed and synthesized, more fully than in
any other articles of mine, the precise methods for isolating and measuring the four
parts of the definitive behavioral sentence: "Valuers are valuing things-valued in relevant
context." This sentence isolates the agent, the act, and object of value in its modifying
setting. Its grammatical structure of subject, verb, object, and prepositional phrase
analyzes cleanly (out of an often blurred composite situation) the four universal
transfactors, PAVC. Thus the product of factors of Population, Action, Values, and
Residual Circumstances provides a standardizing synthesis of highly universal factors.
They decompose an initial valuation situation, or gross unit of symbolic interaction, into
elemental units that can be re-composed to restore and even improve on the initial
situation. Here we have sketched the strategy of scientific method applied to the polling
of opinions, especially most valued opinions. Article #28 spells out the last
sentence in more explicit detail.
#29
"A Tension Theory of Societal Action," though written earlier, actually expands the
second and third parts of the Likability Theory. For a tension, E, is defined as a ratio
(E = AF/V) of the total strength of feeling or liking activity (= AF) to the amount of the
thing-liked (= V). (This A/V is the demand-supply ratio in Economics.) This exchange
ratio when expanded to include the matrix of its context (the other transfactors in any
transaction studied) becomes the full Likability Theory.
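As a purely illustrative reading of the ratio E = AF/V defined above, the sketch below divides a strength-of-feeling score by an amount of the thing liked. The units, the function name, and the sample numbers are hypothetical; the article itself should be consulted for how AF and V are actually measured.

```python
def tension(feeling_strength, amount_liked):
    """Tension E as the ratio AF / V (demand relative to supply).

    feeling_strength: total strength of liking, AF (hypothetical units)
    amount_liked:     amount of the thing liked, V (hypothetical units)
    """
    if amount_liked <= 0:
        raise ValueError("the amount of the thing liked must be positive")
    return feeling_strength / amount_liked


if __name__ == "__main__":
    # Same desire, shrinking supply: the tension ratio rises.
    print(tension(12.0, 4.0))  # 3.0
    print(tension(12.0, 2.0))  # 6.0
```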
I believe that the Likability Theory, including its three parts, namely:
1) the “likes hypotheses” of how feelings, knowings, and doings largely determine a person's
behavior, as to its internal determiners;
2) the likables, or tension, hypotheses of how this relation to the supply of things-liked
further conditions a person's behavior; and
3) the likability hypotheses as to all further conditioning by any relevant setting whether
experienced in the past, operative currently, or expected in the future —
all jointly explain and predict people's behavior more fully and testably than any other
comprehensive theory of the behavior of man-in-society. This working hypothesis can,
as it becomes more widely published and known, be tested by other behavioral
scientists throughout the range of human cultures.
#30
"On Criteria for Factorizing Correlated Variables" This was the earliest of my
quantitatively rigorous articles. It was only in part drafted by me — Karl Pearson wrote
most of the text. I had the idea and convinced Pearson of it when I turned up in his
Biometrika Laboratory in London in 1926 fresh with my Princeton Ph.D. in Psychology
but with an abysmal gap in my mathematical training. He took my first draft of the
argument (that if a multiple correlation were observed to be perfect at R = 1.00, its
standard error from sampling must always be zero) and rewrote it overnight. The recast
draft he handed me the next morning was so exactly phrased in technical terms that I couldn't
understand it! It took me several weeks' study to learn the implications of each sentence
and equation. He later offered to publish the article in Biometrika (and thereby gave me
a strong boost in professional reputation that flowered in time into a dozen Who's Who
listings of scientists). I claimed the article should be signed by him as its actual writer
and not by me who had drafted only a small fraction of its phrasing. But he insisted that
the idea and its general proof were mine and his contribution had been a part of his job
as professor to instruct me in the technical language and for this I had paid my
laboratory fee. A generous treatment of a young colleague which I gratefully record!
The classifying of this article "On Criteria for Factorizing. . ." under this section
dealing with "values in polling" needs justifying. It could have been placed elsewhere in
these four volumes, but I felt its crux was the choice of the criterion, the value judgment
in selecting the case of R = 1.00 to discuss. The ensuing article was deduced from the
implications of that initial choice and simply illustrated a piece of such statistical
methodology.
#31 & #32
"The Concord Index for Social Influence" and "The Concordance Models for Social
Control" These two articles should be treated together; they are Parts 1 and 2 of a
proposal to so measure cases of social control, or self-government, as to help society
get the results and by-products it wants increasingly and to prevent undesired
outcomes. Their joint or unified impact on readers and society is greatly reduced by
their publication in journals so distantly related in audiences and subject matter.
Bringing these two articles together is an example of the usefulness of this 4-volume
series — to multiply the unitary impact of its ideas, now fragmented and scattered
among over sixty journals and publishers.
If human society wants improved self-government in directing social evolution as
mankind may want most, then the social scientist should help develop and test better
means to such ends. Such improved means are proposed in these two papers
describing indices and models whereby controllers, their agents, and their controllees or
voters, officials, and citizens generally can get more effective and efficient integration of
their various goals and programs thereto and fulfillments thereof.
Science is often accused of supplying only cognitive means to man's ends and
neglecting affective factors as being in humanistic fields involving value judgments. Too
frequently scientists supply the know-how to solve some problem but not the motivation
to use it properly. They may even disclaim that function. The concord and likability
models can help here. For the index of concord can integrate indices of knowing,
feeling, and doing applied to society's goals, programs, and evaluations of them. The
concord indices, suitably specified, can both help men to diagnose their current
goal-deficits or problems and also help to remedy them with improved know-how, motivation,
and experience-based performance.
#33
"Racial Attitude Survey as a Basis for Community Planning: the Broadview (Seattle)
Study." This study illustrates how the policy sciences or "value sciences" — for which
polling supplies the chief methodology — can be built up with increasingly rigorous
application of scientific methods.
This study illustrates how scientific observing and measuring of the values and
feelings of a population with the testably valid logical and statistical inferences from
those attitudinal facts can help to convert so-called soft humanistic disciplines into hard
scientific disciplines, increasingly.
In evaluating this study of racist hostility and a community's efforts to reduce it,
the reader should note its date — in 1949. This antedated by five years the Supreme
Court's decision declaring that separated schooling was unequal schooling and so
accelerating the civil rights and anti-segregation movement and legislation in the United
States. This study was made before I had developed the indices for likability and
concord as in the preceding articles, which could have helped to sharpen and
summarize the Broadview survey.
#34
"Can We Be Scientific About Humanism?" The thesis of these four volumes of
"Systemed Studies on Human Transactions" is the thesis of this paper. It seems to me
essentially the thesis of scientists generally and might be called the value system of
scientists. Though variantly expressed, this thesis asserts that scientific methods make
a science; that the empirical testing of all scientific hypotheses and theories builds
man's knowledge of scientific laws; or that science is that which works, i.e., that which
testably recurs most invariantly under recurring conditions. This paper starts applying
scientific methods, such as inducing and testing hypotheses, to the humanists'
philosophy of life. It invites further empirical testing with increasingly rigorous techniques
of controlled experiments (see Article 26 in Vol. 1 for such extensions).
Biographically, this paper, published in The Humanist in 1958, was written during
the period of my most active connection with the American Humanist Association,
serving a term as Director and a term as Vice President. It contributed to strengthening
Scientific Humanism as distinct from Literary Humanism within that Association.
#35
"Use Scientific Methods in Planning" This paper again illustrates my interest in
combining pure and applied research. Here I advocate, with examples, applying purely
scientific methodology to practical problems of developing and testing national Five and
Ten-Year Plans in the Arab world. The paper was presented at an annual conference in
Boulder of the 13,000-member Organization of Arab Students. It received their Merit
Award of the year. It expressed, from twenty years' service at the American University of
Beirut, my advice to Arab students preparing in America for leadership in their
homelands. It expressed my answer to George Lundberg's question of a quarter century
ago "Can Science Save Us?" I answer, as he did, "Yes!" and add "Here's how science
can save us — by pursuing the value sciences to learn solutions to our social problems
in the future as fully as pursuing the physical sciences has given us solutions to our
physical problems in the past."
#36
"A Measure of Man's Maturity" This paper started out as a sermon I gave at a Unitarian
Church service in Bremerton. It developed into a scientific hypothesis, plausibly
expressed here, but not yet experimentally tested, that an index measuring social
maturity could be devised from the time span of one's goals. Maturity is hypothesized to
grow as that time span grows from a few minutes in a baby, to one's lifetime in youth,
and on to generations ahead among humanity's most far-sighted statesmen. I believe
that such an index can be standardized and its implications worked out and increasingly
tested by controlled experiments on individuals and groups. Here seems to me a case
in the value sciences of a hypothesis not yet fully formed, yet pregnant with promise.
#37
"A Proposed Campus Poll or Opinion Gauge" This proposal for a permanent campus
polling agency or demoscope could combine, to a very high degree, applied and pure
research in the value sciences or policy sciences. Monthly polling of students, faculty,
and relevant community could yield data on their behavior, attitudes, and values or
preferences that could guide their governing bodies or body as to decisions involving
their members' assent, conforming, and approval or their opposites. Close and equal
participation by all persons in decision making could be sensitively secured however
large and impersonal the system — as in a university with several tens of thousands of
persons. As human institutions whether of government or business, education or
religion, or any other organization grow ever larger and more complex and consequently
shrink the relative importance and power of every individual, a democratic society
urgently needs compensating or corrective devices like periodic polling to sensitize the
monster system to the needs and aspirations of its every member. Such polling could
reduce confrontations, demagogic pressures, minority overclaims, and let any "silent
majority" all speak with equal opportunity and due weight in the group's actions.
As an instrument for pure research a funded demoscope could gather,
intercorrelate and systematize cumulatively, any and all data that are reportable by
persons from their own behavior, attitudes, and situations. A demoscope could advance
the social sciences as the microscope has advanced biology, or the telescope has
advanced astronomy.
#38
"The Course Critique Corrector — A Feedback Subsystem for Evaluating Course
Critiques" This study reports a feedback upon a feedback on a social system. The
system of university education or interacting of students and teachers has long had a
feedback in its examination system whereby teachers get feedback on what students
have learned. In recent years, published studies, such as the one reported here, have probed
the students' reactions to their teachers and polled analytically the impact, in a dozen
scaled respects, of their teachers on the student respondents. Such two-way feedback
on and by each of the two interacting parties develops stabilizing and fuller control, or
cybernetic guidance, in any large and self-governing system.
Incidentally, this study coupled with our Epicosm models for a dynamic ever-cycling
cosmos (Article 28, Vol. 1; Articles 37-40, Vol. 4) have combined in my thinking
to suggest an hypothesis of staggered cycles of interacting that may explain and predict
all cybernetic feedback whether random or systematic in any system whatever (not yet
published).
#39
"Dimensions of Lundberg's Society as Foundations for Dodd's Sociology — A Case
Study of a Professional Partnership" This article is my review of twenty years in Seattle.
It reports to the profession my second half life as a research professor of sociology. It
includes a listing of my 150 research articles that contribute to some twenty or more
disciplines. These have been inaccessibly scattered over sixty journals and publishers
until Gordon & Breach are now republishing them for unifying impact and greater
availability in the present four-volume series of Systemed Studies on Human
Transactions. (Editor’s Note: This publishing deal was cancelled due to financial
problems of the publisher.)
This paper also marshals in one list the 53 scales and indices I have contributed
to the Measurement step of methodology as in the Behavioral Sciences.
The paper was invited for the Lundberg Memorial session of the Pacific
Sociological Association in 1967 and expected to appear in a PSA memorial volume for
which funding was voted but was never carried through after a change of editors. It was
then requested by Kaare Svalastoga for publication in the University of Copenhagen's
Sociological Microjournal.
It contains a story of biographic and human interest in the development and
strategy of George's and my thirty years of professional partnership.
Section 5 on Systemizing Polling
This final Section V of Volume 3 presents three articles which review our administrative
and academic efforts to systemize polling.
Note our distinctions among the three verbs formed from the noun "system."
a) Systeming = the act in progress of forming a system among empirical entities or
referents of the name, "system," e.g., building an arch out of a set of stones or a
football team from a set of men, etc.
b) Systematizing = the act in progress of forming a system among the words, naming
the interacting parts and the whole, a verbal system, an algebraic formula, a
constitution for a nation, etc.
c) Systemizing = the act in progress of forming (a) and/or (b), their product or their sum.
See the Overview of the 4-volume series repeated in each volume and articles. See Articles #8
and #9 and also #7 in this volume.
#40
“The Washington Public Opinion Laboratory” is the prospectus in 1947 initiating the new
Laboratory. It tells of our practical efforts to organize the Washington Public Opinion
Laboratory at the University of Washington in Seattle, WA.
#41
“Seven Year Report 1947-1954” reports on our progress in the operation of the
Washington Public Opinion Laboratory (P.O.L.) during its first seven years.
#42
"Scientizing the Probable Acts of Men — Copenhagen Lectures" tells more of my
attempts verbally to systematize the behavioral science and its chief operational
instrument, the demoscope or poll. This verbal system is represented by the transact
model for the probable acts of men, the step-parts model for polling, and the present
four volumes on "Systemed Studies." This paper was a fuller reporting to the profession
in 1968 on my P.O.L. and lifetime contributions towards systemizing behavioral science
increasingly.
A major problem underlying Section V and most of Volume 4 seems to me to be the
extensionalizing of behavioral science. I see the most promising future for behavioral science
as the chief means to man's end to consist in developing it extensionally. This means to try to
use all terms in behavioral science as names for a set of referents which are the instances of
occurrence (in time and space and the human population) of whatever that word stands for.
This means seeing every word extensionally as a name for a set of pointed-out instances
instead of perceiving it intensionally as a name for a property or compound of properties (as in
most dictionary definitions). The extensional view means closer and more testable one-to-one
correspondence between every word and its referent — as in mathematics. The exact or hard
sciences achieve this semantic ideal of one-to-one correspondence between name and
thing-named, word and deed, more fully and testably than the less exact or softer Behavioral
sciences.
This extensionalizing of behavioral science can be greatly furthered by our reiteration
rules. These tell how to analyze and re-synthesize every word into reiterant elements of
appropriate sorts and recompound their sums, products, etc., into extensionally formulated
concepts whose interaction as words in a sentence or symbols in a formula will predict the
behavioral or empirical interaction of their referents. Thus compare the fuzzy intensional
phrase “public opinion A on an issue resembles opinion B" with the definite extensional phrase
"polled opinion on A correlates at r = .71 (or r² = .5) with polled opinion on B." By the
extensional phrasing we learn that half the determining elements of opinion A are shared with
opinion B (r² = 1/2) and this population's consequent expected behavior can be more exactly
and reliably predicted to that extent.
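The arithmetic behind this example can be checked directly. The sketch below assumes the usual Pearson product-moment correlation and the standard reading of r² as the proportion of variance two measures share. The function names and the district percentages are hypothetical illustrations, not data from the P.O.L. polls.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


def shared_variance(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r * r


if __name__ == "__main__":
    # The text's figures: r = .71 gives r squared of about one half.
    print(round(shared_variance(0.71), 2))  # 0.5

    # Made-up percent favoring issues A and B in five hypothetical districts.
    favor_a = [40, 55, 62, 48, 70]
    favor_b = [35, 50, 66, 52, 61]
    r = pearson_r(favor_a, favor_b)
    print(round(r, 2), round(shared_variance(r), 2))
```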
The whole effort of P.O.L., dedicated to basic research in behavioral science, was
increasingly to scientize, that is, to systemize, the behaviors of the pollers and the people they
polled.
Section 1 on World Polling
#1. A Barometer of International Security
The United Nations may use a tool the League of Nations never had: public opinion
polling. Stuart C. Dodd describes the steps already taken in that direction, the argument for
further steps and the possible means of combating the forbidding, falsification, or frustration of
the polls by individual countries. Professor Dodd writes his appeal for the International
Barometer against a background of wide academic and practical experience in the field.
Professor of Sociology and Director of the Social Science Research section of the American
University of Beirut, he also directed opinion survey work for the Allied Expeditionary Forces in
Sicily and the Near East.
In organizing international security there must be machinery to signal any threat of war
as well as machinery to deal with such threats. The Council and Assembly of the United
Nations must have a dependable information service that is swift, worldwide, and accurate.
They must know of potential trouble before it becomes an emotionally charged issue, rocking
parties and governments and finally international order. The Council and Assembly should not
have to depend for their information solely on the diplomatic and other government offices of
member nations, or on the press, for these national agencies are apt to conceal a threat to the
peace in the very country that is brewing a war. Impartial and authoritative information must
be available to all the nations about sinister movements in any one country. The Council must
know beyond a doubt just how strong the irredentist movement is becoming in country X, or
how fast the popular misunderstanding between two major powers is moving towards hostility,
or to what extent the settlement of an arbitration commission has satisfied and reduced the
tension between countries Y and Z.
I. The Instrument at Hand
The need for quick, focused intelligence could be met by the use of the public opinion
poll. There is already a wide experience in the use of the polls for governmental purposes in
the domestic field. A list of surveys made runs into thousands for the federal government
alone. The Department of Agriculture, the Office of Price Administration, the Office of War
Information, the Bureau of the Census, and the State Department’s Division of Public Liaison
are federal agencies which have built up staffs of technically trained surveyors to conduct or
analyze public opinion polls. In England, although polls acquired a certain stigma from the time
that Duff Cooper sent out interviewers who unfortunately became known as "Cooper's
snoopers," the Ministry of Information has developed the "War Time Social Surveys" agency.
This agency has serviced so many ministries and has such a backlog of unfilled and annually
recurrent orders on hand, that its continuance in peace time is almost certain.
In the military world, several major polling agencies have been built up, including covert
surveys in enemy territory, the exact operation of which is still a military secret. In the
American army a unit, several hundred strong, has been created to study and report on the
morale of troops. A monthly magazine, What the Soldier Thinks, reports these polls for the
guidance of commanding officers. By similar methods, freshly captured German prisoners of
war have been polled and a monthly morale index reported, showing the state of mind among
this sample of the enemy. Another surveying agency has been tried out, among the civilian
population of occupied or liberated territories, providing information to guide the propaganda
services in radio, press and film, and also serving the civil affairs administrators. UNRRA is
among the potential clients of the survey agency, since surveys will be needed to guide the
complicated, many-sided program of relief in the wreck of postwar Europe.
As a preparation for the surveys in enemy populations, research was called for to invent
and perfect techniques for ensuring sincere answers from suspicious or hostile informants. How
much lying would be met in such surveys? How could unreliability be measured? Could it then
be controlled? To answer such questions, preliminary experiments were designed and carried
out in Palestine and Syria in 1943 among an Arab population, many of whom were pro-Axis.
This laboratory was next duplicated in Sicily, where a further test was made on a population
fresh from Fascism. Indices were developed to measure the unreliability coming from the
informants, from the interviewers, from the inter-relation between informants and interviewers,
from the schedule card, and from attending conditions such as the amount of preparatory
publicity. Only when these trials showed high reliability was the polling technique ready to be
extended from the democratic countries of its origin to the totalitarian and hostile countries.
II. International Postwar Uses
In general, surveying could be used by an international organization to predict social
trends and thereby achieve greater control over them. It can measure conditions in a
population and the behavior of people as well as their opinions.
A further possible significance of polling lies in its relation to the war aims. The Atlantic
Charter, officially endorsed by all the United Nations as their war aims, calls for "consent of the
governed" and "self-determination." Polling provides one mechanism for realizing these
principles. It is a mechanism far simpler than the holding of a plebiscite. It can be done in
countries which totally lack a democratic tradition or responsible political parties. It can be
done quietly and swiftly without stirring up the emotional heat that is excited by the controversy
of an election. It can implement democracy in making the voice of the people heard in
government. It was this feature that greatly impressed both Arabs and Sicilians, as shown by
remarks to the interviewers like: "This is the first time the Government ever asked my
opinion.... Is this what you mean by 'democracy' in the West?... So this is keeping the promises
in the Atlantic Charter."
But perhaps the most significant use of surveying in the postwar world would be to
serve the United Nations as a barometer of international security. As a psychological
barometer it would serve to measure political pressures that threaten another storm. It would
measure the force of public opinion and the area in which it operates, thus extending the
concept of “pressure” from physics to psychology. In this operational definition of pressure as
force times area, “force” in turn is defined as the arithmetic product of the acceleration of
change of an index of public opinion times the population-mass thus changed. Thus these
psychological forces and pressures become measurable.
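The definition above can be put into a small worked sketch. It follows the text's own wording (pressure as force times area, force as acceleration of change of an opinion index times the population changed); physics ordinarily takes pressure as force per unit area, so the product form here is the text's, not the physicist's. The function names, the use of a second difference over three equally spaced readings as the "acceleration," and all figures are assumptions made for illustration only.

```python
def acceleration(index_readings, interval=1.0):
    """Second difference of the last three equally spaced index readings."""
    if len(index_readings) < 3:
        raise ValueError("at least three readings are needed")
    earliest, middle, latest = index_readings[-3:]
    return (latest - 2 * middle + earliest) / interval ** 2


def opinion_force(index_readings, population, interval=1.0):
    """Force = acceleration of change of the index times the population changed."""
    return acceleration(index_readings, interval) * population


def opinion_pressure(index_readings, population, area, interval=1.0):
    """Pressure = force times area, following the definition given in the text."""
    return opinion_force(index_readings, population, interval) * area


# Hypothetical figures: percent favoring some policy in three successive months,
# a population of two million, and an "area" of three regions.
readings = [40, 45, 55]
force = opinion_force(readings, population=2_000_000)
pressure = opinion_pressure(readings, population=2_000_000, area=3)
```

With these invented readings the index rose by 5 points and then by 10, so the acceleration is 5 points per month per month; the units of "area" are left open here because the text does not fix them.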
Is friction setting two Balkan states on fire? Are the Germans feeling frustrated, bitter,
determining to try again to conquer the world? Or are they developing healthier attitudes,
seeking achievements in other than military fields, and becoming a sane member of the family
of nations? Is the American public slipping back to isolationism and neglecting their
international responsibilities? Are the public relations of the British and the Russians improving
or deteriorating? Is the action of the United Nations' Council last year in the Middle East
tending to solve the problem of the Arab-Jewish temperatures rising in war fever? That such a
political barometer would be useful is obvious. The questions become: Is it practicable? How
can it be set up so that it really will work?
III. Nature of the Barometer
Consider how such a barometer might be constructed. In each country there would be
teams of trained nationals of that country, employed some by a private non-profit agency and
some by a government agency. Each type of agency could make some kinds of surveys better
than the other; hence the barometer should comprise both private and intergovernmental
elements. All surveying would be done only with the consent of the government of each
country. No surveys would be imposed on any United Nation. But surveys would be conducted
by the occupying authorities in enemy territories until they were adjudged ready for United
Nations membership. All surveys would be fully and publicly reported, including standardized
specifications of the degree to which scientific standards had been fulfilled. Comment upon
and interpretation of findings would be clearly separated from the objective and numerical
findings.
Some surveys would deal with local issues, others with world issues; some would be
made once, others would be made periodically to reveal trends; some would be simple (a
"yes" or "no" to but one question), while others might involve many questions with conditional
and varying answers to analyze thoroughly some complex issue. The surveying staff would be
nationally employed full time and required (as were the officials of the League of Nations) to
give up any other remunerative employment and activities in political parties. They would be
professionally trained, and would be sifted for integrity, reliability and skill, both before and after
employment, by six quantitative tests that have been developed for this purpose. Any
surveying agency that wished to be accepted by the international accrediting bureau would
have to be inspected and have its records audited. Private surveying agencies might be
financed by any combination of the current ways: grants from foundations, usually for scientific
research; fees from clients, whether governments or public agencies, for surveys made for
them at cost; sale of findings as copyrighted news to press syndicates; endowments; gifts; or
government subsidies. There would have to be some central agency or clearing house for the
United Nations. This central agency would initiate world-wide surveys with comparable
techniques. It would accredit those surveys which had been inspected and fulfilled scientific
standards, thus guaranteeing against incompetence or manipulation of surveying. The central
world agency would publish a journal reporting all surveys bearing on international security,
whether conducted by the central agency or by more local agencies coordinated with the
central agencies. The journal would be the brain, or telephone switchboard, of the barometer,
communicating the opinions of any country to the public of that country, to governments, and
to the world.
IV. Difficulties Anticipated
In exploring this barometer proposal, three fundamental difficulties constantly recur,
arising from the extent to which a country permits freedom of speech, and the integrity of the
surveyors. In any given nation, surveys can be forbidden, or falsified, or simply frustrated. How
can each of these three hurdles be surmounted?
Surveys may be forbidden by an autocratic government within its borders. To meet this
difficulty, three counter measures are suggested. One follows Sumner Welles' suggestion that
membership in the United Nations or its security organization should require each nation to
guarantee reasonable freedom of expression to its people. (Sumner Welles, "Pillars of Human
Rights," Free World, September, 1944, page 33) Without this one of the "four freedoms" no
nation could be trusted to cooperate for peace, and such a guarantee would be the charter for
opinion surveys. In fact, permission to survey opinion might even be made a touchstone of
whether that freedom of speech guarantee was being observed. Violation of such a treaty
provision could be brought before the World Court as a justiciable issue.
But in the absence of such a guarantee exacted by the United Nations, two other
measures could be taken in the case of a nation opposed to making surveys of opinion as
against surveys of behavior. In a very backward country, or one in a highly disturbed condition,
temporary disinclination to hold opinion surveys might well be justified. In such cases it would
be more practicable to begin with surveys of social and economic issues and avoid political
questions. This was the technique used in Sicily to accustom the public to surveys. In postwar
Europe the impartial and beneficial surveys for UNRRA's relief program should provide an
acceptable demonstration of the techniques and serve as an entering wedge for later surveys.
A third measure of guarantee would be to publish in every world survey a list of
countries cooperating in or forbidding that survey. Eventually pressure of world opinion may
soften the uncooperative attitude of the forbidding nation. The United Nations' Council could
embarrassingly suspect the truth of information supplied by a government about itself when
that government refused to allow its own nationals to survey freely the question at issue and
provide a checking audit. This third method relies upon the eventual returning of free speech
as world conditions become more stable. It implies that the barometer might not have world
coverage at first, but would enlarge the number of participating nations as rapidly as
international confidence and freedom are restored.
A second major difficulty is that surveys might be falsified. This might range from a
completely faked survey by some government that wanted the appearance, but not the
substance, of democracy, to minor "adjustments" of figures. For instance, should a
government want a negative vote on a question, they could lump the "no opinion" vote with
the "no" replies (a very different matter from having no opinion on a question) and report only
two categories of replies. This is misleading without being false. But falsified surveys could
soon be eliminated by applying the central bureau’s scientific standards, carefully developed
in cooperation with the most authoritative scientific bodies, and by the application of the
accrediting mechanism worked by an international inspectorate.
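The "adjustment" described above is easy to see with a small tally. The following is only an illustrative sketch: the reply counts are invented, and the function names are hypothetical rather than part of any accrediting procedure described in the text.

```python
from collections import Counter


def percentages(replies):
    """Percent of all replies falling in each category."""
    counts = Counter(replies)
    total = len(replies)
    return {category: 100 * n / total for category, n in counts.items()}


def lumped_percentages(replies):
    """The misleading report: 'no opinion' is counted as if it were 'no'."""
    merged = ["no" if reply == "no opinion" else reply for reply in replies]
    return percentages(merged)


# Invented sample of 100 replies.
replies = ["yes"] * 45 + ["no"] * 30 + ["no opinion"] * 25

full_report = percentages(replies)           # three categories
two_category = lumped_percentages(replies)   # only "yes" and "no"
```

In the full report "yes" leads "no" by 45 to 30; in the two-category report the same replies appear as a 55 to 45 defeat. An inspectorate that requires the three categories to be reported separately would expose the difference at once.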
A third major difficulty in international surveys is that attempts might be made to
frustrate them in certain countries. Thus, by appointing timid officials to the staff of a
government surveying agency or by other pressure on directors of private services, the
pollsters might be induced to refrain from surveying "delicate" or "critical" questions. Obviously,
such questions could be the very ones for which a barometer of international security was
most needed. This difficulty might be countered either by the central world agency's taking
steps to set up an indigenous but bolder surveying agency in that country, or by initiating an
international survey and pillorying the demurring nation with publicity about its fear to survey
that issue in its own public. This policy would have to be used with discretion, however. There
may be questions on which a survey, or its public reporting in a crisis, would hinder more than
help international security. The Council of the United Nations organization might control the
policy in such cases.
V. Current Development of the Barometer
Towards getting such a barometer set up, what progress has actually been made?
These are the steps taken so far:
• A surveying agency is operating in liberated Europe. Now that Germany is occupied
it will find out what can be done there so plans for the barometer can be realistic
ones.
• The Social Science Research Council in the United States has started research on
scientific problems of polling. This includes development of scientific standards to
guarantee the integrity and accuracy of international surveys and the question of an
accrediting agency to enforce those standards.
• An organizing conference of all surveying interests and specialists (military and
commercial, scientific and government) has been proposed. The Social Science
Research Council has been asked to sponsor this conference.
• The private surveying agencies in England and America have been canvassed and
they have appointed a committee to work towards a world institute of public opinion
or private non-profit agency to implement the non-governmental part of the
barometer.
• Some fifty exploratory conversations with appropriate officials in the British
and American governments, and a few others, have revealed interest in the
barometer.
• The relevant United Nations organizations have been canvassed to explore the
extent to which each may be making international surveys and need a surveying
agency.
• Meantime, popular discussion of the project will lead to an intelligent public opinion
on which to base any barometer of international security.
#2.
Toward World Surveying
The Public Opinion Quarterly, Winter 1946-47 Issue
PLANS are now under way to establish a world association of public opinion reporting
agencies. A member of the world association Planning Committee discusses the possible
functions of the world organization, and indicates the current development of these functions.
Stuart C. Dodd is Professor of Sociology and Director of the Social Science Research Section
of the American University of Beirut. His experience includes directing opinion survey work for
the Allied Expeditionary Forces in Sicily and the Near East during World War II.
IN THE MOVEMENT towards world surveys an outstanding development during the
past summer was a decision of the conference of public opinion agencies of North America
which was held at Central City in July, 1946. A unanimous vote there called for the
establishment of a World Association of Public Opinion Reporting Agencies. Although plans for
this were to be started immediately, its launching would be after the projected National
Association should be under way.
A further vote called for an international conference of opinion agencies and the
profession generally in Canada, probably in Montreal, next summer (1947) and set up a
committee to organize this. Another committee was appointed to plan and organize the World
Association.1
This committee proposes to take the following steps in carrying out its instructions to
plan for the World Organization:
1. Publicity to the profession throughout the world about the proposed Association and
inviting suggestions and participation in the planning. This article is part of that
publicity.
2. A questionnaire circulated early in 1947 to the profession exploring the issues
involved in a World Association and development towards world surveys.
3. A draft constitution for a World Association drawn up from the results of the
questionnaire and circulated to all concerned before the summer of 1947.
4. The proposed constitution, revised after circulation, will be presented to the
conference next summer for further revision and adoption.
Meantime, the national Institutes of Public Opinion which are affiliated with Gallup's
American Institute of Public Opinion are holding a conference in London next May to tighten
their affiliation and improve their capacity to make surveys of an increasingly large proportion
of the countries of the world.
I. Possible Functions of a World Association
In planning a World Association of Opinion Agencies, the possible functions of such an
Association need to be thought out. Functions which have been proposed are listed below for
consideration.
International surveys, aiming at world coverage eventually, however impossible today,
should be progressively developed (from their present beginnings among the twenty or more
national Institutes of Public Opinion and other surveying agencies). Those surveys which are
relevant to international tensions and threats to the peace might be organized or trademarked
as "The Barometer of International Security."
Geographic extension may await spontaneous organizing in countries without polls at
present, or may be accelerated by an experienced organizer with a promotional budget
launching indigenous and self-supporting institutes of public opinion in new territories with the
assent of their governments.
Alternatively, appropriate private or government leaders in a country without surveys
may be invited to attend conferences, observe surveys, inspect agencies, or attend training
courses if they wish, in order to encourage them in initiating agencies in their own countries.
In countries where public opinion surveying agencies would be under strict
governmental control, the policy in developing them might be to develop the agency as a
government instrument, like its census bureau or other statistical office, serving internal
governmental purposes; and expect that its wider use on United Nations problems and that its
more public international reporting would grow slowly. The surveying agency might begin its
international participation on noncontroversial issues and social and economic issues before
political ones and on questions officially put by some United Nations organization of which the
nation was a member. International inspection and accrediting would follow adoption of the scientific
technique of sample surveying but might not follow till long afterwards in some cases. In short,
the policy would be to establish a sample surveying agency well within a nation before drawing
it into international functioning.
Standards in polling must be increasingly better specified and agreed upon.
Membership requirements in an international association might be one form of guaranteeing
standards.
Accrediting to enforce standards and guarantee the integrity and competence of any
survey or agency is required. There should be developed an international inspectorate whose
services are available on demand, like chartered accountants, or which performs routine
annual certifying.
Research on techniques of surveying. The scientific reliability and accuracy of polling
can be greatly improved with research. A professional association should lead in planning
research on all aspects of the surveying process and in getting such research financed,
executed, and utilized.
A Journal as publication center for all international surveys and interests of the
profession might be developed, perhaps out of the present Public Opinion Quarterly.
A Records office, or regional offices (like Princeton University's Office of Public Opinion
Research or the Scandinavian Institute for the four national institutes in Scandinavia), should
be further developed. This would cumulate and tend to standardize all records of surveys, from
duplicate punched cards up to published reports, making comparisons of regions, samples,
and periods, and further research, increasingly possible on a world scale.
Conferences of surveying personnel should develop world survey with specialized
sections on methodology, public policy, regional interests, etc.
A Training center to develop practitioners at all levels from interviewers and statistical
analysts to directors and organizers in the opinionist or "demoscopic" profession is desired by
many. A university with field work in a polling agency is most often suggested, such as
Princeton, Denver or Michigan.
Services of specialists are wanted in organizing a new survey, planning a survey,
questionnaire construction, public relations in different countries, etc. Such central personnel might
be consulted or borrowed for limited periods, to help agencies get started or tided over some
difficulty.
Fostering associations. The World Association should encourage development of, and
cooperate with, associations of more limited or specialized purpose, such as regional
associations (the Scandinavian Institute, the U.S. National Association), market research
associations (New York and others), government bureaus, single surveys (committees or any
organizations which investigate an issue by surveys), and agencies for subsidiary functions
(interviewing staffs, tabulating services, etc.).
Professional interests. The association should foster the interests of the polling or
surveying profession in other ways in addition to the list of functions here. Thus in the future it
may be desired to run an Information Service to answer any question about polls or pollers, or
an Employment Exchange within the profession, or a Training Exchange of personnel between
surveying agencies in different countries.
Developing Demoscopy. The association should promote the wider use of demoscopy
— i.e., the observing of people by sampling. This includes observing their opinions,
information, behaviors, or conditions — in short, sampling the facts for any of the social
sciences. Demoscopes should contribute to both pure and applied social science — to the
development of social theory and laws eventually, as well as to the guiding of public policy and
action immediately. New techniques and applications for demoscopes are to be expected in
the future and should be initiated, fostered, and used by the World Association. At the same
time the role of demoscopes in social science should not be exaggerated. Demoscopes are
one instrument among other instruments for observing facts in society, and observing the facts
is but one step in scientific methodology, though demoscopy may contribute to other steps
such as the inducing of social laws and the testing of them.
Public relations of surveying agencies should increase the appreciation and appropriate
use and reduce the misuse of demoscopes, by governments, public agencies, and the people
generally.
Servicing the U.N. should be a major function of international surveys. They can
implement democracy scientifically on the new world scale. World surveys can amplify the
voice of the people in the councils of the United Nations, which is an organization of governments
at one remove from the peoples of the world. Most U.N. agencies and commissions will need
international surveys as part of their job and should have experts in sampling, questionnaire
construction, interviewer organizing and other technical aspects of surveying, available to any
United Nations agency, such as the Public Information or Statistical subsections of the
Secretariat, or alternatively in UNESCO. The relations between the private World Association
and any United Nations agency need to be further worked out as both develop in the years
ahead. In planning the world surveys under the label of a "Barometer of International Security"
both an inter-governmental service to the United Nations and a non-governmental service to
the public directly (such as the national institutes of public opinion currently give) were
proposed.2 At the present writing (October, 1946) it looks as though both services would be
centralized under the private, but non-profit, World Association which, through its contractual
relation with the Secretariat, would be a semi-official agency of the United Nations for world
surveying.
II. Current Development
In weighing the above functions of a World Association, some information and
comments on the current development of these functions may be useful.
World Surveying. Towards world surveys, there is, at present, the network of the ten
Gallup-affiliated Institutes of Public Opinion which now unite ten countries and are in process
of enlarging their coverage. Their affiliation consists of the use of the copyrighted name in
common, first publication rights on one another's findings, monthly common polls and
increasing use of a common Records Office for depositing duplicate records at the Office of
Public Opinion Research. For polls in common, the affiliates first cable suggested questions to
the Canadian Institute. As coordinator, it cables back four candidate questions for a vote. The
cabled outcome of this vote determines the monthly question — which will be surveyed
simultaneously by all the affiliated institutes in comparable fashion.
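The last step of this cabled procedure, the vote among four candidate questions, can be sketched as a simple tally. The text does not say how ties are settled or how the coordinator picks the four candidates, so the tie rule below (the candidate listed first wins) and all names and sample votes are assumptions for illustration.

```python
from collections import Counter


def choose_monthly_question(candidates, votes):
    """Return the candidate question with the most votes.

    candidates: the four questions cabled back by the coordinator.
    votes: mapping of affiliate name to the candidate it voted for.
    Ties go to the candidate listed first (an assumption, not from the text).
    """
    if len(candidates) != 4:
        raise ValueError("the coordinator cables back four candidate questions")
    tally = Counter(v for v in votes.values() if v in candidates)
    if not tally:
        raise ValueError("no valid votes were received")
    # max() returns the first candidate among those with the highest count.
    return max(candidates, key=lambda question: tally[question])


# Hypothetical candidates and votes.
candidates = ["Q1", "Q2", "Q3", "Q4"]
votes = {"American": "Q2", "British": "Q2", "Canadian": "Q3", "French": "Q1"}
monthly_question = choose_monthly_question(candidates, votes)
```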
The Public Information Section of the Secretariat of the United Nations suggested a
subscription arrangement in the summer of 1946, but this scheme has not been implemented
by them as yet with a budget and a contract. The scheme is outlined here to indicate one way
in which the United Nations can assess world opinion by using a private agency which it may
guide but which it is not responsible for conducting. By this scheme the Secretariat would
consult each month with the representative of the affiliated institutes in deciding the questions
to be polled and would have the right to report first the results in its bulletins which at present
digest world opinion as gathered from newspapers and from the current polls with their patchy
coverage. For this coordinated polling service the Secretariat would subscribe with an annual
fee much as newspapers subscribe in supporting the institutes, and just as the Secretariat now
subscribes to the Associated Press and other news services. When the World Association is
organized it would take over and centralize such sustaining annual subscription arrangements
with governments or other bodies. One major government has indicated its intention to
subscribe thus and the interest of several other governments in receiving such an international
poll each month is being explored.
In executing such international polls the world association would contract out the
execution of surveys among all the existing agencies, which measured up to defined standards
of workmanship, in some equitable scheme.
Geographic Extension. Towards world coverage there are Gallup-affiliated Institutes of
Public Opinion in ten countries; namely, the American, British, Canadian, Australian, French,
Norwegian, Swedish, Danish, Finnish, and Brazilian institutes. Other institutes or agencies
operate or are in the process of being organized in a dozen other nations, i.e., Belgium,
Holland, Spain-in-exile, Italy, Switzerland, Czechoslovakia, Mexico, South Africa, New
Zealand, China, Germany, and Japan.
Roper's International Public Opinion Research is organized to do public opinion work as
well as market research in nine South and Central American countries (Cuba, Mexico,
Colombia, Venezuela, Peru, Uruguay, Brazil, Chile, and Argentina) with others in prospect.
The J. Walter Thompson Co. does opinion surveys for Government departments and
other clients in India and parts of Europe as well as England.
There are indications of other public opinion agencies being planned in Poland,
Hungary, Rumania, and Yugoslavia. Also there is more than one agency operating in the
United States, United Kingdom, France and Holland, and the first three of these countries have
surveying agencies within government departments also.
Standards. Towards the development of professional standards, the new National
Association in the United States, which is in process of formation, is expected to set some
minimum standards of what the profession currently demands of its reputable members. A
further step has been the specifying of a series of standards or quantitative ways of measuring
the degree of excellence of any survey from all points of view.3 For each of these standards is
suggested
(1) a minimum point which might serve as the requirement for membership in any
national or international association, and
(2) a maximum point which sets the upper limit of what has been achieved by the most
scientific and best conducted surveys appearing to date.
Between these two points any survey can be measured as to its excellence in each of these
dimensions and any agency can specify objectively and quantitatively to its client what
standard of service it can offer for a given cost. Also, any inspection agency, which may be
asked by a surveying agency to come in like a chartered accountant, will be able to certify the
standards of that surveying agency to the public.
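One way to picture measuring a survey "between these two points" is to express its value on each standard as a fraction of the distance from the minimum point to the maximum point. This is a sketch only: the two standards named below, their minimum and maximum points, and the sample survey are all invented for illustration and are not the standards the author specified.

```python
def position(value, minimum, maximum):
    """Fraction of the way from the minimum point (0.0) to the maximum point (1.0)."""
    if maximum <= minimum:
        raise ValueError("maximum point must exceed minimum point")
    return (value - minimum) / (maximum - minimum)


def meets_minimum(survey, standards):
    """True if the survey reaches the minimum point on every standard."""
    return all(survey[name] >= low for name, (low, _high) in standards.items())


# Hypothetical standards: (minimum point, maximum point) for each dimension.
standards = {
    "sample_size": (500, 5000),
    "completion_rate": (0.60, 0.95),
}

# Hypothetical survey measured on the same dimensions.
survey = {"sample_size": 2750, "completion_rate": 0.80}

sample_size_position = position(survey["sample_size"], *standards["sample_size"])
qualifies = meets_minimum(survey, standards)
```

Here the invented survey sits halfway between the minimum and maximum points on sample size and clears the minimum on both dimensions, which is the kind of statement an inspecting agency could certify to a client.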
Accrediting. An eventual step, after standards have become better defined than they are
at present, would be to set up an accrediting or certification agency that could inspect and
certify to the public as to the integrity and competence of any survey or surveying agency. This
will be especially necessary in international surveys in areas or on topics that may be
controversial. One approach towards such an agency at present is the committee on
membership which is projected in the American National Association and in the World
Association.
Research. Towards the development of fundamental research to improve all aspects of
the surveying process, the Social Science Research Council and National Research Council
have set up a joint committee of a score of leaders in the opinion field in the United States in a
"Committee on Attitudes, Opinion, and Consumer Wants." This committee is charged with
planning specific research projects, getting them funded and farmed out to the most competent
research men.
Two major projects involving nearly $80,000 have thus far been set up: one on sampling
problems and another on digesting (in three volumes including one on methodology) the
surveys on morale, etc. made in the American Army during World War II. Further research
projects are intended as fast as competent personnel and funds can be secured.
Journal. Towards providing an adequate publication center or clearing house for the
profession throughout the world, there is at present the Public Opinion Quarterly of Princeton
University which could be further developed. There is also the little bulletin, Opinion News, of
Denver's National Opinion Research Center. A number of institutes and market research
agencies get out their own bulletins for clients or the public.
Records Office. The chief development of a centralized records office to date is the
archives under Professor Hadley Cantril at Princeton University. The Gallup-affiliated Institutes
plan to deposit duplicate punched cards and other records here routinely, although as yet only
the American and British institutes have been able to work this out in practice. A recent grant
of some $7,000 from the Rockefeller Foundation is enabling this records center to construct an
index of national polls throughout the world for the past two years. This index is expected to be
kept currently up to date in the future.
Conferences. The conference called by NORC of Denver at Central City, Colorado, at the
end of last July represented almost all of the public opinion reporting agencies of North
America. It acted, as mentioned above, to call a world conference next summer, to form an
American National Association of the industry, and to plan a World Association.
A little conference in Great Britain called by Chatham House at the end of September,
1946, prepared the way for British participation in the World Association.
Ahead there is a projected conference in Paris in December, 1946, of European
agencies under the auspices of the French Institute of Public Opinion; the conference of
Gallup-affiliated National Institutes of Public Opinion next spring (1947); and the general
conference of all agencies and the profession throughout the world that is projected during the
summer of 1947, probably in Montreal.
Training. Towards training personnel from interviewers and statistical clerks up to
directors and organizers of surveys, a number of universities provide a few courses. Chief
among these are Princeton and Denver, each of which is well equipped with attendant facilities
for field experience, one being the headquarters of the American Institute of Public Opinion
and several other surveying agencies at Princeton and the other having the National Opinion
Research Center in Denver. Both centers are ambitious to develop as world centers for
training in the profession. A full training course at the University of Michigan, under Rensis
Likert, is expected to be operating by the fall of 1947. The possibility of a two-month training
institute for new personnel of all kinds in the surveying industry is under discussion, depending
on the demand expressed, for the summer of 1947 in connection with the world conference.
Specialists. To date specialists have been largely trained by in-service methods, as in
the case of the American Institute of Public Opinion, which has frequently loaned its personnel
to strengthen sister institutes. Many of the agencies report a dearth of capable and trained
personnel. They have funds and jobs in excess of the competent personnel for whom they are
clamoring.
Associations. For regional associations there is the Scandinavian Institute, federating
the national institutes of Denmark, Norway, Sweden, and Finland. There is the projected
American National Association in process of being organized. There is a committee in Great
Britain under PEP (Political and Economic Planning) which has spoken to some extent for the
industry in Britain. There is the affiliation of ten national institutes of public opinion (Gallup
affiliates) in different countries cited above. Finally, there are well organized and older market
research associations, notably that of New York with some 75 member agencies, some of
whom do public opinion work, as well as commercial work, for clients.
The Polling Profession. The development of professional interests and consciousness
among surveyors has greatly increased within the past few years. The calling of the Colorado
Conference, the projected National Association in the United States and the proposed World
Association, and the Broad Sheet issued this past summer by the committee from PEP
reviewing the whole industry and making recommendations for its future in England, are
examples of this professional consciousness.
Demoscopy. Demoscopes, which may be defined as instruments of all kinds for
observing people by sampling, have greatly increased in number, in geographic extension, and
in variety and thoroughness of content surveyed during the past decade. Surveying agencies,
beginning under journalistic auspices, have also been developed by government, military
authorities, the broadcasting and film and publishing industries, universities, and a great
variety of commercial firms. A bibliography of surveys for one year by the Federal Government
in the United States during the war ran to 1,500 titles, for example. Demoscopes are being
widely used and the profession expects them to be used much more intensively and
extensively in the future, not only for surveying opinion but also for surveying the information,
behaviors, and conditions of people.
Public Relations. The industry has had the support of the press to a large extent,
especially those syndicated newspapers that finance their servicing institutes. However, a few
unscientific surveys with low standards and sometimes questionable integrity have done much
to injure the reputation of the reputable surveys. Some items in building up public appreciation
of polling have been Gallup's A Guide to Public Opinion Polls (presented as questions
frequently asked by the public and their answers), NORC's new manual on interviewing, PEP's
broadsheet on Sampling Surveys and, of course, the large number of reports on polls with their
attendant publicity.
United Nations Relations. The arrangement between the Public Information Department
of the United Nations' Secretariat and the affiliated Institutes of Public Opinion noted above will
make the World Association the semi-official agent of the United Nations. It is expected that
this servicing will grow and that the other sections of the Secretariat and agencies of the
United Nations will avail themselves of this surveying network to get more of the information
they need about the world's opinion, or behavior, or conditions, as far as population sampling
can provide such. Thus, within the Economic and Social Council, the Sub-Commission on the
Status of Women and the Food and Agriculture Organization, who lack world data on women's
status and malnutrition, respectively, could well get it by periodic international sample surveys.
UNESCO (United Nations Educational, Scientific, and Cultural Organization) has
included on its docket the study of international surveys and the development of them in
carrying out UNESCO'S functions. Negotiations with their temporary secretariat (before their
permanent secretariat was set up by their first general conference in November) revealed a
twofold interest of UNESCO in the world association. UNESCO'S Social Science Division is
interested in demoscopes as an international scientific instrument. Their Mass Media Division
is interested in farming out contracts through the world association to make the international
surveys (for which it has budgeted) upon the flow and effects of international communication
by press, radio, and movie.
III. Possible Organization
In order to carry out functions such as the above, what kind of organization is desirable?
The answer will come from the questionnaire to be circulated to the profession around the
world this winter and from the world conference next summer in adopting the constitution of the
World Association. But since many forms of association are possible, the author has
repeatedly been asked to outline an international organization which seems favored by leaders
in the surveying field as indicated by his interviews with some 106 practitioners and relevant
government officials in Europe and America during the summers of 1944 and 1946. Although
details of organization were discussed with only a minority of them, the most favored form of
organization that emerged might be sketched as follows:
As to membership, the association would be primarily a professional one built around
the public opinion reporting agency as the chief unit of membership and chief unit of voting.
Any agency which reports public opinion representatively sampled by a staff of interviewers,
provided it had met the minimum standards of workmanship (which are to be set up), would be
eligible to an "agency membership" even though much of its work was not strictly limited to
opinion. At least part of its work should be published or available to the public on request. This
includes agencies which survey for government departments or public bodies or even for
private and commercial clients if the results are available to the public. It excludes any
agencies polling covertly or secretly and agencies doing exclusively market research which is
wholly confidential to a private and profit-making client.
"Individual full memberships" might be open to persons with high professional
qualifications such as top personnel in agencies and scientists conducting research on
surveying, teaching it in universities, or publishing creditably in the field. Individual full
members share with agency representatives the right to vote, the right to attend business as
well as "open" meetings, and other privileges of the association.
"Individual associate memberships" without vote and entitled only to attend "open"
meetings of the association and to receive only certain of its publications or other privileges
might be offered to any personnel in surveying agencies, students, and other specified but
more general categories of people.
As to government, the association might be directed by a Board of Governors. This Board
might represent the practitioners, the scientists and the public. Thus if there were twelve
Governors, six might be elected by the agency members, three elected by the individual full
members, and three to represent the interest of the world public might be appointed by
UNESCO. If each Governor had a three year term, two agency representatives, one "individual
full membership" representative, and one public representative would be chosen annually. The
Board might choose its own Chairman. The Board would govern the association, appointing its
salaried personnel and determining its policies. Votes of the members at meetings or polls of
them between meetings would be only advisory and be binding on the Board only in case of a
three-fourths majority. Agency and individual members might be differentially weighted in such
voting.
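The seat arithmetic and the binding-vote rule sketched above can be illustrated in a few lines of Python. This is only a sketch under assumptions that are not in the text: the function and constant names are hypothetical, "a three-fourths majority" is read as at least three-fourths of votes cast, and votes are treated as unweighted although the text allows differential weighting.

```python
from fractions import Fraction

# Seats per constituency on the twelve-member Board used as the example above.
SEATS = {"agency": 6, "individual_full": 3, "public": 3}
TERM_YEARS = 3


def seats_chosen_annually(seats=SEATS, term_years=TERM_YEARS):
    """Seats filled each year when three-year terms are staggered evenly."""
    return {group: count // term_years for group, count in seats.items()}


def vote_binds_board(votes_for, votes_cast, threshold=Fraction(3, 4)):
    """True when a membership vote reaches the three-fourths majority.

    Assumption: unweighted votes, and the threshold is inclusive.
    """
    if votes_cast <= 0:
        raise ValueError("no votes cast")
    return Fraction(votes_for, votes_cast) >= threshold


annual = seats_chosen_annually()
```

With twelve Governors this gives two agency seats, one individual full membership seat, and one public seat each year, matching the rotation described in the text.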
Salaried personnel might consist at first of a Director, an assistant and several
secretaries, and expand later as justified by the work and funds of the association. The
Director should be an able executive, experienced in international negotiation even more than
in surveying. He would direct relations with the public, governments, and member agencies in
general. The Assistant Director would need to be more technically competent in surveying, a
scientist able to head up international committees on research, standards, and inspection.
As to costs, the minimum budget on the initial scale sketched above would be about
$40,000 a year, assuming half of this for the salaried personnel and half for operating
expenses including travel of governors, personnel, and members of inspection committees.
This amount would increase steadily, in proportion as the non-profit World Association
developed more fully the fifteen functions described above.
As to revenues, the association might expect dues from members, part of the
subscription fees from governmental and other bodies for surveying services, and grants from
scientific foundations and others. The dues from individual members, both full and associate,
might be a flat annual sum equivalent to dues in most professional or scientific societies. The
dues from agencies might be proportioned to resources which may also approximate benefits
received from association membership. One suggestion is that the annual dues be scaled to
some percent, say one percent, of the cost of all interviews by that agency during the
preceding year. It could thus be set aside currently by the agency and also passed on in the
charge to the client. In return for these, the agency gets by membership a share in contracts
for international surveying proportionate to its capacity. It gets the publicity and prestige as
being "accredited" or inspected and certified to the public as to its standards and its right to
participate in the United Nations "Barometer of International Security" and other surveys. It
gets the benefits of inspection, research results, the world publication on surveys, the index
and archives, the conferences on surveying problems of all kinds, the specialist and
professional services as these may be developed, and centralized help in dealing with
surveying agencies, publics, and governments abroad. These benefits to agency members will
take time to develop to the full of course, and could only begin modestly the first year.
Subscription fees to international surveying service would probably be fixed on a
cost-plus-percentage-for-overhead basis.
Grants may be expected mostly for special functions, such as scientific research or
developing the archives and index. Some minimum percent, perhaps 20 percent, of the World
Association's total net receipts (i.e. from dues, from its overhead charge on survey contracts
farmed out, and from grants or gifts) should be reserved for research on the surveying process
itself, for this is the essential guarantor in the long run that demoscopes will grow in usefulness
and so in use by society.
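The financial rules suggested above reduce to simple arithmetic, sketched here in Python. The rates are the ones the text offers as suggestions (one percent of interview costs for agency dues, 20 percent of net receipts reserved for research, and an even split of the initial $40,000 budget). The function names and the sample amounts are hypothetical and serve only as illustration.

```python
def agency_dues(interview_costs, rate=0.01):
    """Annual dues as a percentage of the agency's interview costs last year."""
    return interview_costs * rate


def research_reserve(dues, overhead_charges, grants, share=0.20):
    """Portion of total net receipts set aside for research on surveying."""
    return share * (dues + overhead_charges + grants)


def initial_budget_split(total=40_000):
    """Half for salaried personnel, half for operating expenses."""
    return {"salaried_personnel": total / 2, "operating_expenses": total / 2}


# Hypothetical agency that spent $50,000 on interviews in the preceding year.
dues = agency_dues(50_000)
# Hypothetical receipts: $10,000 dues, $5,000 overhead charges, $25,000 grants.
reserve = research_reserve(10_000, 5_000, 25_000)
budget = initial_budget_split()
```

Because the dues are a fixed percentage of interview costs, an agency can set the amount aside as each survey is costed and pass it on in the charge to the client, as the text suggests.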
As to site, our canvass points to the headquarters of the United Nations. Aside from
many other considerations, the mechanical problems of translation into the world's languages
and the necessary contacts with official representatives of, and experts on, all countries can be
best secured there.
As to title, the term "poll," favored in America, is disfavored in England, while "surveys,"
though in general favored, is inexact since there are geologic and other kinds of surveys.
Although surveys are no longer limited to "opinions," yet the national institutes of public opinion
have entrenched prestige for "World Institute of Public Opinion" (WIPO). The new term
"demoscope," which can come to mean exactly whatever the World Association does, has met
with surprising approval for a neologism.
IV. Issues to Be Settled
From the above sketch of a World Association it should be evident that there are many
issues which should be settled by the profession rather than by the Planning Committee. A
tentative analysis of these issues is contained in a questionnaire dealing with matters of
membership, function, organization, financing, site, and name of the World Association. Copies
of this questionnaire will be mailed out shortly to all the profession around the world. It invites
such views as: Should the association be of agencies only or include individuals in some way?
Should it include publicly reporting agencies only or include market research agencies or
government agencies also under some conditions? Should it include national cross-section
agencies only or also agencies for parts of nations, or for groups of nations, or for special
purposes, or for special groups, etc.? Should its functions be as listed above? What should be
the form of its organization and the amount and manner of financing it?
Should its site be at the United Nations headquarters (New York) or UNESCO's
headquarters (Paris), or Princeton, Toronto, London, Copenhagen, Geneva, Jerusalem, or
elsewhere?
Should its name be (to list a few that have been suggested): World Association of
Public Opinion Reporting Agencies (W.A.P.O.R.A.); World Institute of Public Opinion
(W.I.P.O.); International Surveys Agency (I.S.A.); United Nations Sample Surveys Associate
(U.N.S.S.A.); Public Opinionists, Ltd. (P.O.L.); World Demoscope; World Samplers; World
Surveys; World Polls; Orbiscope; or Mundimeter?
It is hoped that full response to this questionnaire will guide the Planning Committee in
drafting a proposed constitution for the World Association.
Notes
1. This Planning Committee for a World Association of Surveying Agencies consists of Mr.
George Gallup, Chairman, American Institute of Public Opinion, 16 Chambers Street,
Princeton, New Jersey; Mr. Stuart C. Dodd, Co-Chairman, American University of Beirut,
Beirut, Lebanon; Mr. Wilfrid Sanders, Canadian Institute of Public Opinion, 38 King Street,
Toronto, Canada; Mr. Rensis Likert, University of Michigan, Ann Arbor, Michigan; and Mr.
Elmo C. Wilson, Director of Columbia Broadcasting System, Inc., New York City.
2. See Stuart C. Dodd, "A Barometer of International Security," Public Opinion Quarterly,
1945, 9, 194-200.
3. These specifications are contained in a paper by the author which will appear in the Public
Opinion Quarterly Spring issue, 1947.
#3. Steps toward a Barometer of International Security
Proceedings of WAPOR Conference
Public Opinion and International Security
Chairman: Theodore Lentz, Washington University.
Participants: Stuart C. Dodd, University of Washington;
Kaare Svalastoga, University of Denmark.
Wars begin in the minds of men, states UNESCO's charter. Gauging such pre-war
tensions is the task of the proposed Barometer of International Security, a periodic polling
service for the member nations of the United Nations.
The Barometer would have a twofold purpose. In dealing with issues of international
concern, it would supply information on the attitudes and feelings of the peoples of the world.
In addition to this civic purpose, it would serve as a scientific instrument, a demoscope, to
observe facts about humanity transcending any one culture or nationality. The social sciences
require such a demoscope particularly to accumulate knowledge of the exact similarities and
differences of attitudes among the various national groups.
I. Historical Sketch.
In 1945 the joint British-American surveying agencies had operated in Syria, Sicily,
France and Germany. Under the postwar regimes these were either continued or, in the case
of Germany and Japan, partly converted into major bombing surveys. Japan lists some forty-seven
surveying agencies that have operated there, for various periods, since the war.
A Barometer officially operated by any United Nations secretariat is unlikely, as a series
of negotiations have shown. Polling and accurate reporting of polls are not possible in the
U.S.S.R. and her satellites. The United States, also, does not seem to be interested, and has
discontinued its wartime information office. The Barometer must therefore depend on private
initiative, at least at first.
The proposed Barometer would be in the form of a subscription service, supported by
fees from individual or group subscribers. These subscribers would nominate questions and
receive prior releases of results, and for this service would pay a fixed annual fee. Responsible
officials of the United Nations and of the United States State Department have indicated their
interest in subscribing to the Barometer, when it should become a going concern, just as they
now subscribe to various private press services.
Two United Nations agencies and three private organizations have made significant
steps in laying the groundwork for a Barometer.
1) The United Nations Economic and Social Council (ECOSOC) has established a
sub-commission on sampling statistics. This sub-commission has been developing the
methodology necessary for scientific international surveys.
2) The United Nations Educational, Scientific, and Cultural Organization (UNESCO) set
up a continuing inquiry into tensions affecting international understanding. This
project included an eight-nation survey of tensions, which has not yet been followed
up.
3) The polling profession formed, in 1946, the World Association for Public Opinion
Research (WAPOR). Although prohibited by its charter from the carrying-out of
surveys, WAPOR's Committee on International Surveys is formed to promote
coordinated international surveying. (The society's meetings are held in alternate
years in Europe and the United States. In Europe, WAPOR meets in odd-numbered
years with the European Society for Opinion and Market Research (ESOMAR). In
even-numbered years, the meetings are held in the United States, jointly with the
American Association for Public Opinion Research (AAPOR). These joint meetings
contribute to the development of unifying techniques and personal acquaintances
across national boundaries.)
4) Meanwhile, the surveying agencies known as the Gallup Affiliates have formed the
International Association of Public Opinion Institutes. They meet biennially to plan
future surveys and publish the results in the bulletin "World Opinion." They have
found that a polling network, lacking centralized control, is seriously limited as to the
number of countries and topics that can be covered on the same date.
5) The Roper group has organized International Public Opinion Research, under the
directorship of Mr. Elmo Wilson. Operating in some twenty countries, this market
research agency undertakes contract research from private and commercial firms,
foundations, and governments.
These various efforts show clearly the possibility of a Barometer, surmounting problems
of translation, synchronization, and centralized control. At the same time, the content of polls
suitable for use in a Barometer has been developed. Examples are the two polls, on
International Tensions and National Security made recently by the Washington Public Opinion
Laboratory. They involve an hour's interviewing and over a hundred variables. The National
Security Poll alone contains forty small attitude scales in its fifty content questions.
II. Two further proposed steps.
These were taken by WAPOR at its annual conference this year:
1) The Research Committee was charged with exploring the possibility of an Index of
Peace Expectation;
2) The Public Relations Committee was charged with a campaign to develop UNESCO
interest and eventual action in sponsoring the Barometer.
A. The Index of Peace Expectation.
The proposed Index of Peace Expectation consists at present of one standardized
question, to be asked around January 1 of each year. Many pollers could add this single
question at little cost to themselves, in comparison with the prohibitive cost of a full-scale
multiple-question survey.
The question proposed is one frequently used in past polls to gauge the expectancy of
war or peace: "As things look now, do you expect we will be in an all-out world war within:
a) the next year;
b) the next 5 years;
c) the next 10 years;
d) the next 20 years;
e) or never?"
This wording is subject to refinement, especially in translation.
The question yields a simple percentage index: The percent of man-years of peace (or
war) expectation. The index is calculated by plotting the successive periods of years, as the
abscissa, against the cumulated percentage of respondents expecting peace to hold within
each period. In this plot, the area under the curve, as proportion in the total area, is the index
of war expectation in man-years.1 The complement, or the area above the curve, is the
percentage of peace expectation.
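The calculation described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the author's own procedure. The curve is read as the cumulated share of respondents who expect war to have begun by the end of each period. A respondent answering "within t years" is counted as expecting peace for t years and war for the rest of the horizon. Undecided respondents are excluded beforehand, and the horizon is twenty years as in Note 1. The function name and the sample percentages are hypothetical.

```python
def peace_expectation_index(war_within, horizon=20.0):
    """Share of man-years of peace expected over `horizon` years.

    war_within maps a period length in years to the fraction of decided
    respondents who expect war to begin within that period (and not within
    any shorter listed period). Everyone else expects peace throughout.
    """
    if sum(war_within.values()) > 1.0 + 1e-9:
        raise ValueError("fractions of respondents exceed 1")
    # Area under the cumulative step curve: each group adds its share of
    # respondents for every year between its expected date and the horizon.
    war_area = sum(
        share * (horizon - years)
        for years, share in war_within.items()
        if years <= horizon
    )
    war_index = war_area / horizon   # area under the curve / total area
    return 1.0 - war_index           # area above the curve


# Hypothetical answers: 10% within 1 year, 20% within 5, 30% within 10,
# 20% within 20, and the remaining 20% "never".
sample = {1: 0.10, 5: 0.20, 10: 0.30, 20: 0.20}
index = peace_expectation_index(sample)
```

Other readings are possible, for example placing the expected outbreak at the midpoint of each interval, and they would give a somewhat different figure from the same answers.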
The Index reflects the fear of war and its probability of occurrence in the minds of the
people. It takes into account both the number of people expecting war and the time within
which they expect that war to start, its imminence. These are the most obvious of the many
dimensions of the complex evaluation of war expectancy.2 The other dimensions must first be
explored by individual agencies.
As the use of the Index becomes an established affair, it will be easier to get grants-in-aid
and government support. Public interest and confidence can be built around this initial
question, so that in a few years other questions can be added, and it will be possible to
develop the Barometer more fully.
Surveying agencies asking this question around January 1 are requested to coordinate
first with the Research Committee of WAPOR (Chairman, Mr. J. Stevens Stock, Bureau of
Labor Statistics, Department of Labor, Washington, D. C.). They are further requested to notify
the Research Committee of their intention to poll on this question, and to report their results for
publication in this journal.
It is hoped that this single Index will achieve several initial purposes:
1. To check the reliability of polling, in a population polled by more than one agency;
2. To reveal comparative results between different countries;
3. To show, in time, the trend of opinion from year to year;
4. To provide a basis for the study of the causes and correlations of war and peace
expectations.
B. Toward Interesting UNESCO.
The sponsorship of the Barometer by UNESCO is another possible move. UNESCO's
action must await a recommendation from some of its national bodies. Such a resolution was
adopted by the UNESCO body in the State of Washington, and recommended to the national
organization. There it was felt that the move for an international polling agency had better
come from several national bodies outside the U. S.
Individuals in different countries who are interested in such a Barometer should bring
the attention of their local and national bodies to the matter, and urge its recommendation to
the Plenary session of UNESCO. The appended resolution states the situation in tentative
form, as the exact conditions might be left to the UNESCO Secretariat. Surveying agencies are
invited to introduce a similar resolution to their UNESCO bodies in as many different countries
as possible.
If UNESCO is to adopt the Barometer, there must be many nations active and
interested in it. The Barometer itself, however, may be operated either by private agencies or
with some degree of UNESCO control. In any case, it would be a way of applying scientific
methods to UNESCO’s central problem, the origin of wars in the minds of men.
III. Proposed Resolution for UNESCO
Concerning
A Barometer of International Tensions
to be initiated by any WAPOR member in local or national UNESCO bodies
WHEREAS UNESCO's charter states that wars begin in the minds of men;
WHEREAS UNESCO has a research project on "tensions affecting international
understanding";
WHEREAS exact knowledge of the current amounts, trends, and locations of war-breeding
tensions is necessary for effective efforts to reduce tensions and avert wars;
WHEREAS sample surveys using interviewers and standardized questionnaires can measure
tensions in the minds of men with progressively improving accuracy and with
measurable reliability;
WHEREAS sampling surveys are now internationally organized and in progress by private
agencies with coverage of some thirty countries among the various agencies;
WHEREAS a World Association for Public Opinion Research serving as a professional
association of individual surveyors is available for UNESCO to consult to assure
scientific integrity and competence in surveying;
WHEREAS the Economic and Social Council of UN has a sub-commission on sample
statistics which might cooperate in any Barometer project;
RESOLVED: the UNESCO Secretariat be instructed to study and report for next year's action
and budget upon the possible establishment of A Barometer of International
Tensions under UNESCO auspices.
IV. Discussion
A. Form of the Barometer
One possible form of the Barometer to be studied is a subscription service like the
United Press or Reuters. UNESCO, governments, or other UN bodies might subscribe an
annual fee for prior release of surveys on questions nominated by the subscribers. UNESCO
would thus coordinate the periodic sampling surveys of tensions without administrative
responsibility. Private polling agencies under contract could execute the interview surveys. The
World Association for Public Opinion Research (which is in process of becoming incorporated
to then seek consultative status with UNESCO) could advise, inspect and certify as to the
scientific integrity and accuracy of the surveys.
B. Some Background Facts
1. The Barometer proposal grew out of surveys for Allied Headquarters in Europe since
1943.
2. Officials in the U. S. State Department and UN Secretariat have promised support and
at various times have proposed starting to get such subscription on their budgets.
3. UNESCO’s tension project, under Mr. Cantril, organized one international survey (not
yet published).
4. WAPOR has set up a Committee to further International Surveys (headed by Sven O.
Blomquist, Swedish Gallup Institute, Stockholm).
5. International public opinion surveys are currently well organized. International Public
Opinion Research operates in some 20 countries; the Gallup affiliates operate in a
looser network in ten countries; and other agencies are available.
6. Sampling techniques are well developed. Their accuracy can depend largely on the
budget available to pay for rigorous probability sampling designs.
7. Questionnaires and scales have been constructed for measuring tensions in many ways
and have been applied in different countries. While improvements with research are
always to be sought, yet an instrument is now available for gauging tensions in any
population (see the International Tensions Poll 6 of the Washington Public Opinion
Laboratory). The proposal in early form may be found in Public Opinion Quarterly, "A
Barometer of International Security," Summer, 1945, and "Toward World Surveying,"
Winter, 1946-47, by Stuart C. Dodd.
8. Many UN organizations need data not now available which sample surveys could
supply by interviewing people about their opinions, information, behavior, conditions,
and needs.
9. For observing war-breeding opinions and attitudes "in the minds of men" in a whole
population, sample surveys or demoscopes provide the most scientific tool for
measurement that has yet been developed.
Notes
1. Adopting a twenty-year limit to make the expectation within a definite period — such as one
generation.
2. For an example, the results of the question in the State of Washington in January, 1949:
War within 5 years, 30%; within 10 years, 21%; within 20 years, 8%; not within 20 years,
15%; undecided, 21%; incomplete answers, 6%. This yields an Index of Peace Expectation
of 59%, meaning that, during the next 20 years, the population felt on the average that war
was unlikely, at least during the first half of the period. More exactly, of the total number of
man-years of expectation reported, 59% were expecting peace and 41% war.
#4. Standards for Surveying Agencies
Dr. Dodd, Professor of Sociology and Director of the Social Science Research Section
of the American University of Beirut, outlines a set of standards by which the quality of any
demoscope (poll or survey of people) may be evaluated. The list represents a forward step
in the professionalization of polling, and should provide guidance for any association of
agencies seeking to set up meaningful membership requirements. The present paper has
particular significance in the light of the author's previous discussion of the proposed World
Association of Surveying Agencies (Winter QUARTERLY).
As surveying by sampling a population grows in use, the need for standards to
distinguish between good and bad workmanship also increases. Objective indices are needed.
Each should measure the degree of excellence of a survey in some respect, or along some
dimension. In terms of these dimensions (which define the kinds of standards) and points on
each dimension, one can specify amounts of each standard, such as a minimum point, current
actual point for a particular survey, an average point, or a maximum point. Then it becomes
possible, by means of these standardized dimensions, for surveying agencies to specify the
quality of their service to a client or to the public, or for any professional association to set up
membership requirements.
In proportion as standards may become more exacting in the future, demoscopes will
become more exact instruments for observing people by sampling and the social sciences will
become more exact sciences. Exactness in observing facts and inducing generalizations from
them increases the power of any science to predict and control its phenomena and fulfill the
function of science. Toward these ends, then, the following standards for demoscopes are
proposed.
In the column below, headed "Kinds of Standards," some forty dimensions of excellence
in surveying are operationally defined by specifying some index. Wherever possible the index
is an ordinal or cardinal index, but for the present it will often have to remain simply an
all-or-none index asserting either the presence or absence of some attribute. This column tells how
to measure each kind of standard.
The second column below, headed "Amount of Standard: Minimum," suggests an
amount of each standard such that about 90 per cent of current surveying agencies would fulfill
it and about 10 per cent would not be up to that standard. This minimum, stated in these
relative or percentile terms for the most part, is suggested as a requirement for admission to
any national or international association of surveying agencies. These proposed minimum
points are collected together at the end of this paper to make a suggested set of Membership
requirements. The phrase "current practice" is used to allow latitude at first while professional
standards are forming. It should be interpreted to include the practices of some percentage,
such as 90 per cent, of the better agencies, during the preceding year.
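Stated in percentile terms, the proposed minimum is roughly the tenth percentile of current agency scores on a given dimension. The Python sketch below shows one way to locate such a point. It assumes that each agency already has a numeric score on the dimension and that higher is better. The function names and the sample scores are hypothetical.

```python
def minimum_standard(scores, shortfall_share=0.10):
    """Score that all but about `shortfall_share` of agencies reach."""
    if not scores:
        raise ValueError("no agency scores supplied")
    ordered = sorted(scores)
    # Number of agencies allowed to fall below the minimum.
    allowed_below = int(len(ordered) * shortfall_share)
    return ordered[allowed_below]


def share_meeting(scores, minimum):
    """Fraction of agencies whose score is at or above the minimum."""
    return sum(score >= minimum for score in scores) / len(scores)


# Hypothetical scores of ten agencies on one dimension of excellence.
agency_scores = [3, 5, 6, 7, 7, 8, 8, 9, 9, 10]
minimum = minimum_standard(agency_scores)
met = share_meeting(agency_scores, minimum)
```

Because the minimum is defined relative to current practice, recomputing it each year from the latest scores lets the requirement rise as agencies improve.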
The third column below, headed "Amount of Standard: Maximum," specifies a higher
point on each dimension and a point that is about the best yet achieved by any scientific
survey with ample resources. This point on each dimension may be expected to rise in the
future (as also the minimum points) due to rivalry among agencies for most accurate prediction
and due to research on surveying techniques.
Whenever any Membership Committee or International Inspectorate may want to use
these indices, they may have to be operationally defined in greater detail. This will include
constructing scales to measure those standards which are inadequately specified here. For
summarizing these indices into a single net index of surveying excellence, it will be useful to
devise weights for different purposes and also a standard "general purpose" set of weights.
Eventually a Manual of Standards, currently revised to keep pace with research and growth in
demoscopy, will become desirable, just as for any other major scientific instrument of increasing
precision. Differential weighting including zero weights can adjust these standards to any
purpose in hand including varying degrees of accuracy desired in a survey. Thus surveys with
no respondents, as when the "interviewer" observes people's behavior directly, will omit those
standards which are relevant only to the respondents.
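The weighting idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the standard names, the 0-to-1 scores, and the weights are hypothetical, and a weighted mean is one possible way to form a net index, not a formula given in this paper.

```python
from typing import Mapping


def net_index(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted mean of standard scores; a zero weight drops that standard."""
    total_weight = sum(weights.get(name, 0.0) for name in scores)
    if total_weight <= 0:
        raise ValueError("at least one standard needs a positive weight")
    weighted_sum = sum(scores[name] * weights.get(name, 0.0) for name in scores)
    return weighted_sum / total_weight


# Hypothetical scores on three standards, each scaled from 0 to 1.
scores = {"validity": 0.9, "clarity": 0.8, "respondent_rapport": 0.5}

# A hypothetical "general purpose" set of weights.
general_purpose = {"validity": 2.0, "clarity": 1.0, "respondent_rapport": 1.0}

# A survey with no respondents gives zero weight to respondent-only standards.
observational = {"validity": 2.0, "clarity": 1.0, "respondent_rapport": 0.0}

general_score = net_index(scores, general_purpose)
observational_score = net_index(scores, observational)
```

With a zero weight the respondent-only standard neither raises nor lowers the index, which is the adjustment described for surveys in which the "interviewer" observes behavior directly.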
Such periodic revision will help avoid "freezing" standards prematurely, a fear
sometimes expressed. But the chief insurance against freezing out new and better techniques
(which may be opposed by vested interests in the opinion industry) is research. Scientific
experiments should decide future controversies as to what are superior techniques, and not
a majority vote of surveyors based only on impressions from general experience.
It may be noted that these standards do not specify whether an agency is governmental
or not, or operated for profit or not. A scientific instrument should be specified independently of
any economic or political system as far as is possible.
I. Standards for Surveying Agencies
Kinds of Standards, with Amount of Standards (Minimum and Maximum)

I. Agency Credence Standards

1. Responsibility
There should be on public record:
a) The agency executing the survey
b) The agency (if different) controlling its policies and purposes
c) The body reported to
d) The sources of funds
Minimum: All four on public record
Maximum: All four should be named, with addresses, and they should be more accessibly published

2. Integrity
Honest fact finding and the reputation for it should be the chief purpose of every survey agency. Indications of integrity include:
a) Willingness to be inspected
b) Frequent validity studies
c) Interpartisan staff and backers in preference to mono-partisan ones, and nonpartisan ones wherever possible in preference to interpartisan ones
d) Sponsoring by public leaders or social scientists of high integrity
e) Absence of plausible charges of falsification
f) Pledges of honest fact finding by interviewers and by the agency
g) Public recording of all standards
h) Absence of government control of a non-government agency
i) Listing in Directory when ready
Minimum: Current practice
Maximum: Above 90 percentile on all these indications of integrity

3. Impartiality
Surveying agencies should not be public partisans or pressure groups on issues surveyed
Minimum: Partisan agencies ineligible (e.g., either labor unions or manufacturers' agencies)
Maximum: Possible partisan connections more rigorously eliminated
4. Certification
An international inspectorate should be available to investigate standards, integrity and performance of any survey or agency, when invited
Minimum: Inspected once for Association membership
Maximum: Inspected at least annually, and getting rated above 90 percentile
5. Validity Studies
Agencies should make studies of their survey results, correlating them against criteria (such as elections, censuses, government statistics, larger samples, friends' samples, and other independent measures of behavior or conditions which check on assertions, etc.) whenever possible, and publishing the correlations. Accurate predicting is the final test of validity
Minimum: Occasional correlations published
Maximum: Every possible correlation calculated and published. Correlation indices averaging above .9 are satisfactory
6. Comparability
Agencies should strive for comparability between different surveys, in different time periods, in different countries or languages by progressive research on, and standardization of, terms and techniques
Minimum: Routine statement of relevant differences in standards and conditions if comparisons are made
Maximum: Keeping all standards the same except the one respect in which a comparison is made
7. Duration
a) Age of the agency
b) Probability of continuance for at least a year ahead as rated by competent judges
Minimum: a) 1 year old (to get validity data); b) probability above 50 per cent
Maximum: 10 years old and probability above 95 per cent
II. QUESTIONNAIRE STANDARDS

8. Scope
a) Any opinions
b) Information
c) Behavior
d) Conditions or records observed by sampling of people ("Demoscopy")
Minimum: Current practice (mostly opinions)
Maximum: All five classes included

9. Pretesting
Per cent of year's questions pretested on:
a) Experts
b) A sample of the eventual sample
Minimum: Current practice
Maximum: 100 per cent pretested on at least 10 experts and on square root of eventual sample. Split ballots also used to test effect of variant wording
10. Exactness
Whenever standardization of response is desired:
a) Standardize the wording of every question
b) Begin exploring an issue with open-end questions
c) In proportion as responses are known, prefer closed-end questions
d) Prefer ordinal form of response (i.e., a series of degrees) to all-or-none form, where practicable
e) Prefer cardinal form of response (i.e., equal calibration units) to ordinal form, where practicable
Minimum: Current practice
Maximum: Using most exact forms currently possible
11. Clarity
The meaning of questions should be unambiguous, definite, and constant for all as tested by the percent of respondents who, when they restate it in their own words, agree essentially on its interpretation
Minimum: Current practice
Maximum: Formal testing of paraphrases
12. Fairness
Questions should be so phrased as to analyze the issue fairly to all factions as testable by per cent of agreement on phrasing by a competent panel of persons holding all the diverse opinions on that issue. (This "Fairness" dimension may merge into "Phrasing bias" below)
Minimum: Current practice
Maximum: Formal testing and 100 per cent agreement by an adequate panel representing all the relevant opinions
13. Isolation of issue
Questions should be so devised as to isolate the issue free from factors of prestige, taboo, or other extrinsic pressure. Until scales may be developed to analyze and to measure this degree of isolation, the rating of experts after pretests may serve
Minimum: Current practice
Maximum: Rigorous phrasing refined by pretests till maximum rating results
14. Intensity of opinion
Responses should have their intensity measured wherever appropriate
Minimum: Current practice
Maximum: Intensities routinely measured by a calibrated scale
15. Phrasing bias
To control phrasing bias:
a) Questions must be designed which scale together (Cornell technique)
b) Their intensities must be determined
c) The zero at the bottom of the U-curve of the intensity function must be found
d) The "debiassed" proportions of respondents on each side of the issue must be determined relative to that zero point
Minimum: Current practice
Maximum: Debiassed proportions determined from well-scaled questions
16. Thoroughness
The number of questions and sub-questions probing the issue, or included in a survey on one topic, measures "Thoroughness" (assuming all the questions to be valuable for the purpose in hand, where "valuable" means contributing to other standards, especially Nos. 5, 6, 10, 13 and 17)
Minimum: Current practice
Maximum: Largest number of valuable questions exploring and analyzing one subject of any survey recorded
17. Informedness
Wherever degree of information is a "biasser," i.e., correlates with opinion on that issue, the respondent's amount of information on the issue should be determined by appropriate questions and opinions should be classified by their informedness, as into "uninformed," "informed," or "expert" categories
Minimum: Current practice
Maximum: Information measured and opinion sub-classified by it.
III. SAMPLING STANDARDS

18. Universe
The universe or parent population sampled should be published especially as to such characteristics as: territory, age, sex, economic and educational levels, informedness, and other characteristics as may be relevant to a given survey
Minimum: Universe indicated
Maximum: Universe specified in exact detail with reasons
19. Adequacy of sample
Size of sample (p) should be stated in absolute numbers, or as x per cent of the square root of the population (P) sampled. Adequacy of size is measurable by y per cent of agreement between randomly split halves. (At the limits, a sample becomes a case, p = 1, or a census, p = P)
Minimum: Samples to average over 10 per cent of √P (e.g., this means 1200 for the U.S.A. sample). Ten per cent of the sample is suggested as a minimum for reporting any breakdown or subcategory.
Maximum: 100 per cent of √P, or p = √P (= 12,000 if P = U.S.A.); y = 95 per cent (as of probability from chi square test of distributions from two halves of the sample fitting each other).

20. Representativeness
Representativeness may be analyzed into its three aspects as follows:
a) Population bias. Every characteristic, B, (called hereafter a "biasser") which correlates higher than a critical amount, r, with the question surveyed should be determined. Common biassers are region, sex, age, information on question-at-issue, vested interest, income, educational level, occupation, nationality, party or sect.
b) Stratification. Each biasser must be stratified; i.e., proportionately sampled. The per cent of any biasser in the universe sampled must be matched in the sample
c) Randomness. Beyond the above, the sample must be randomized; i.e., selected by techniques which best eliminate further bias and assure randomness
Minimum: Current practice (inclusion of "check questions" routinely to assure representativeness)
Maximum: Possible biassers correlated and identified by r > .5. Biassers matched within r per cent, either by chi square test or the proportion (pB - PB) / (p + P). Rigorous randomizing techniques used
21. Method of sampling
The basis of selecting the sample should be published, whether by area or quota control, randomized listing, calculation, or other technique
Minimum: Method stated
Maximum: Most fully representative method, as determined by crucial experiments, used regardless of cost or convenience
22. Panels
Wherever a constant set of informants is needed to isolate the variation of the variable at issue through repeated surveys, a panel should be used. It may need to be checked as to its continued representativeness, and measured for biassers due to repetition (such as practice effects and annoyance from repetitions, tampering from a pressure group, etc.)
Minimum: Current practice
Maximum: Panels rigorously used where appropriate, and checked.
23. Reliability
a) Errors of observation. Measured as differences between reobservations at one time between:
1) Different informants,
2) Different interviewers
3) Different wording of same questions; as well as
4) Differences between reobservations at different times
b) Errors of sampling. Measured by the standard error of sampling
Minimum: Current practice. Publication in technical journal desirable
Maximum: Differences averaging less than 5 per cent, e.g., 95 per cent probability from chi square test that the two distributions are from one universe; or r > .97 where r applies; or (I - II) / (I + II) < .025, where I = 1st observing and II = 2nd observing. Sampling errors published in full
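Two of the numerical rules in the sampling standards above (the √P rule of No. 19 and the difference ratio of No. 23) can be worked through in a short sketch. The population figure of 144,000,000 is an assumption, chosen only because it is the value of P for which √P = 12,000 as quoted for the U.S.A.; taking the absolute value of the difference in the ratio is likewise an assumption, since the table does not say which observation is the larger. The function names are hypothetical.

```python
import math


def sample_size(population: int, percent_of_root: float) -> int:
    """Sample size p stated as x per cent of the square root of P."""
    return round(percent_of_root * math.sqrt(population) / 100)


def relative_difference(first: float, second: float) -> float:
    """The ratio (I - II) / (I + II), taken here as an absolute value."""
    return abs(first - second) / (first + second)


# Assumed universe: the P for which sqrt(P) equals 12,000.
P_USA = 144_000_000

minimum_sample = sample_size(P_USA, 10)    # 10 per cent of sqrt(P)
maximum_sample = sample_size(P_USA, 100)   # 100 per cent of sqrt(P)

# Hypothetical first and second observations of the same percentage.
close_pair = relative_difference(52.0, 50.0)
distant_pair = relative_difference(55.0, 50.0)

meets_maximum_standard = close_pair < 0.025
```

The first pair differs by 2 in a total of 102, a ratio just under .02, and so falls inside the .025 limit; the second pair, at 5 in 105, does not.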
IV. Interviewing Standards
24. Interview conditions
a) Overt or covert?
b) Response oral, written, or by further documents?
c) Singly, with witnesses, or in groups?
d) Respondent identified or anonymous?
e) By interview, telephone or mail?
f) Followed up or not; i.e., return call?
g) Interview invited by informant in specific appointment, or in general required or requested by higher authority, or requested by interviewer only?
h) In comfortable, unhurried situation or not?
i) In quiet, undistracted situation or not?
j) At home, work, street, or other place?
k) First, second or later interview (of same person on same issue)?
l) Number of minutes of interview?
m) Given prestige at first meeting?
Minimum: Current practice
Maximum: For most purposes, 99 per cent of interviews should be: overt, oral, single, identified, first, invited, unhurried, undistracted, and short.

25. Selection
In recruiting interviewers preference is usually given in practice to characteristics such as:
a) Maturity
b) Education
c) Interest
d) Local languages spoken
e) Normality in tests of:
1) Intelligence
2) General opinions
3) Personality
f) Experience
g) Recommendations
h) Local residence
Minimum: Current practice
Maximum: Definite standards required in such a list of characteristics
26. Compatibility of interviewers to respondents, or social distances.
Differences between them must be uncorrelated with survey results; e.g., differences in status as to:
a) Race
b) Economic level
c) Educational level
d) Sex (in Moslem areas)
e) Language, etc. (if biassing)
f) Degree of acquaintance
should be administratively avoided or experimentally measured
Minimum: Current practice
Maximum: Possible differences measured experimentally and significant biasses avoided by appropriate selection of interviewers
27. Interviewer bias
The characteristics of the interviewer must show a correlation of zero with the replies.
Minimum: Current practice
Maximum: Calculation of possible correlations and their reduction to zero by reselecting, retraining, or reassigning the interviewers
28. Competence of Interviewers
Interviewers should be employees trained till they pass such efficiency tests as:
a) Knowledge. x per cent score on objective standardized examination about polling
b) Honesty. x' per cent honest interviewers as checked by supervisors on y per cent of respondents
c) Rapport. x per cent of contacts carried through to completed interviews
d) Recording. x per cent of agreement between interviewers' records and dictaphone records of standardized interviews
e) Carefulness. x per cent of errors found in editing questionnaires as per instructions in a standardized set of situations
f) Productivity. More than x percentile in interviews per period of time
Minimum: Current practice
Maximum: Above 90 percentile on all indices of competence
29. Instructions to interviewers
A general Manual of Instructions in the appropriate language(s) should guide the interviewer about selecting informants, asking questions and recording answers, etc., supplemented by special instructions as needed for each particular survey
Minimum: Current practice
Maximum: Detailed and well-prepared Manuals always provided for every interviewer in ample time

30. Supervision
The amount of supervision is measurable as the ratio of the supervisory man-hours to the interviewer hours
Minimum: Current practice
Maximum: Supervising time / Interviewing time = 5 %

V. Reporting Standards

31. Public reporting
All surveys should be made available to the widest possible public
Minimum: Current practice
Maximum: In addition to any popular reporting, full technical details should be published routinely in professional publications

32. Essentials of report
Every report of a survey should include at least:
a) The name of the responsible agency
b) The date or period and region
c) The universe and size and nature of the sample
d) Verbatim quotation of the question and the numerical findings
Minimum: Reporting a) through d)
Maximum: Reporting on all standards, No. x through No. 40, as realized in surveys

33. Objectivity
Surveys should be honestly and objectively reported, including verbatim quotation of the questions and numerical findings. The "No opinion" responses should not be lumped in with the "No" responses, etc. Subjective interpretation or application should be distinguished from the objective findings
Minimum: Current practice
Maximum: Objectivity tested by 100 per cent agreement as to the statement of the findings by surveyors of differing opinions on those issues
34. Limits
The relevant kinds and amounts of the standards listed in paragraphs to No. 40 should be stated as among the limitations of the findings in interpreting them
Minimum: Current practice
Maximum: The limitations implied by all the standards listed in this column should be reported
35. Recording
All records, original and secondary, with guides to their use, should be preserved accessibly for further research or inspection
Minimum: Current practice
Maximum: Duplicate records filed and indexed in some centralized archives
36. Public interest
Market research and "public polling" for the public interest directly or through reporting to a public body, such as a government or scientific agency, should be kept administratively separate
Minimum: Current practice
Maximum: Rigorous separation of civic and commercial surveys

VI. Administrative and Other Standards

37. Preparation of respondents
a) Wherever unaccustomed to being surveyed the population should be prepared by appropriate publicity, educating them to cooperate
b) The greater the authority backing the survey the better
Minimum: Current practice
Maximum: Publicity having a measured and high impact preparing for surveying; a law, a government, civic, religious, journalistic or other influential sponsorship
38. Tabulating
Coding, tabulating and recording should use checks and mechanical devices as far as possible to reduce clerical errors
Minimum: Current practice
Maximum: Appropriate checks and machines used wherever they exist
39. Speed
The speed of surveying may be specified as: "interviewing period" from first to last interview, or as "overall speed" or number of days from inception to reporting
Minimum: Usually within a 90 day overall period
Maximum: Two days for emergencies is about the record
40. Cost
The average cost of an agency's surveys including all overhead may be reported as x dollars per interview. It will exceed the average with:
a) Open-ended or numerous questions
b) Samples difficult of access
c) Few interviews or surveys per year to share the overhead
d) Need for expert interviewers
e) Fullness of tabulating and reporting
f) Price and wage levels in other countries
g) Initial organizing of an agency
Minimum: Current practice
Maximum: x=82, norm in U.S.A. in 1946

41. Demoscopic development
Surveying agencies and their members should contribute to the development of their profession through participating in their trade association and/or in some of the fifteen functions below:
a) International surveying
b) Extending surveying to new regions
c) International inspection
d) Research on demoscopy
e) A world journal
f) Central Records Office(s)
g) International conferences on demoscopy
h) International training center(s)
i) Maintaining specialists on call
j) Encouraging regional or specialized associations of demoscopists
k) Developing professional interests
l) Fostering the public relations of demoscopy
m) Servicing the United Nations
n) Developing demoscopy in other respects
Minimum: Current practice
Maximum: Active member in the surveying trade organizations and promotes the 15 functions (eventually a scale may be built to appraise such professional participation more
II. Summary of Proposed Minimum Standards
Is the following set of standards too rigorous, or not rigorous enough, to adopt at the
outset?
An agency that is a candidate for membership in the world association should satisfy
the Membership Committee1 that it has the following indications of integrity and competence:
- It must have operated for at least the preceding year with expectation of continuing
at least another year. (No. 7)
- It must have kept on public record its sponsorship, financial backers, and
responsible officials with addresses and other bodies to whom its reports are made
or channels of publication. (No. 1)
- It must have abstained from pressure group activity, i.e., partisan activity such as to
prejudice impartial fact finding. (No. 3)
- It must state its willingness to be inspected and have preserved its records of at
least the preceding year so as to be available for future inspection. (No. 23, No. 35)
- It should be able to present endorsements by public leaders and social scientists of
high integrity and variety of political views. (No. 2d)
- It should have no plausible public charges of falsification. (No. 2e)
- It should publish a manifesto promising honest and reliable fact finding and
reporting. (No. 2f)
- It should have given verbatim quotation of questions and numerical findings in
reporting any survey during at least the preceding year. (No. 33)
- Its questioning should have analyzed its issues fairly to all factions on those issues.
(No. 12)
- It should have published the size and nature of the sample and its universe
sufficiently to enable another agency to repeat and check the survey. (No. 2g, No.
18-21)
- It should have published some studies showing good validity for its surveys. (No. 5)
- It should undertake to continue these desirable minimum practices in the future.
- It should promise to deposit with the Association in the future the documents
indicated above including a complete current file of its survey reports. (No. 3r)
Notes
1. Since the application of these standards to judge each candidate agency leaves much to
the Membership Committee (until the standards can be made fully objective), the appointment of this Committee will become as important as the adoption of these standards in
principle. The ways of assuring that such a Committee represents the best scientific and
impartial personnel (with due regard to the public's interest as well as the profession's
interest in raising standards) and avoids freezing the status quo through vested or other
interests resisting scientific progress are not discussed in this paper. They must be
carefully studied, however, and embodied in the constitution of any national or world
association.
#5. Techniques for World Polling
A Review of the Methodological Literature for Cross-Cultural Surveys
I. Planning for International Polling
In 1953 UNESCO contracted with its Non-Government Consultant Organization,
WAPOR (the World Association for Public Opinion Research) to survey the journal literature on
polling methodology. The resulting 600-page report is previewed here in some 6,500 words. To
keep this paper within limits of a journal article its many pages of bibliographic references are
omitted here. The reader is referred to their fuller citing and reporting in the chapters of the
volume corresponding to the sections of this paper.
The long-range objectives of this research contract were to further the development of a
Barometer of World Opinion in some form. Such a barometer would periodically poll a
representative sampling of an ever-increasing part of the world's population on as wide a range
of topics about man's chief institutional fields of interest as might be currently desired and
financed. It could measure more and more fully as time went on the pollable opinions of
mankind — their asserted aspirations and needs, their feelings and degrees of satisfaction,
their knowledge and expectations, their personal and public or organized actions. For
UNESCO an opinion barometer could mean early measurement of tensions in the minds of
men where wars begin and guidance for preventive steps. For the behavioral scientists a
barometer of world opinions could mean an instrument for observing mankind as a whole,
transcending all present limitations to fractions of humanity and fragments of human culture.
The short-range objectives were to survey cross-cultural methodology in order to
increase its comparability or standardization. The research contract was expected to help
standardize polling techniques internationally. It should tell how pollers, by standardizing their
instrument, are improving the measurement of opinion as to its five chief dimensions, namely:
its universality when done in any country (in the space dimension)
its reliability when repeated at a later date (in the time dimension)
its validity when applied to any kind of speech behavior (in the activity dimension)
its representivity when sampling all kinds of people (in the population dimension)
its utility when directed for any goals or purposes (in the values dimension)
Towards these objectives Project Demoscopes was expected to report to UNESCO, within two
years, upon a bibliographic search of the literature on polling techniques, for international use
especially. The search had to be limited chiefly to articles from two dozen journals during the
past thirty years — though some books and unpublished materials supplied by pollers were
also abstracted and covered in the review. An Advisory Committee of sixteen eminent pollers
from eight countries and a panel of fifteen assistant editors writing signed sections were
organized on a volunteer basis. With an assistant director, Professor Jiri Nehnevajsa, and a
staff of four graduate students the bibliography of 1,103 titles was selected and the larger part
of it abstracted onto form sheets. These were then written up following a detailed outline of
some 140 topics, each of which had been rated by the Advisory Committee. The twelve
chapters of the finished report, entitled Techniques for World Polls, are reviewed in the twelve
sections of this paper.
II. A Matrix Model for Standardizing Polling
In order to systematize both the survey and the field of polling methodology, a theory of
polling was developed in Chapter 2. This theory is presented in the form of a matrix model with
192 cells. Each of these 192 cells, or 'step-parts' as they are called, attempts to standardize
one source of instrument variation and so eliminate one kind of error. Every set of data from
any poll can be analyzed as a product (and not as a sum!) of two factors (not two addends).
Every set of polled data (or observations in any science, for that matter) is always some kind of
logical, mathematical and empirical product of
1) an instrument factor whereby we observe and
2) the phenomena observed.
The phenomena in polling mean the true opinions or speech behavior of respondents if
unbiassed by the standardizing interview situation. The basic hypothesis that is assumed for
testing in controlled experiments in this 'step-parts' model, or operationally defined theory of
polling, is expressed in the equation:
A = aX          (Eq. 1, the methodology equation for all science)
This says in effect “the observations (= A) are always a product of the observing (= a) and the
observed (= X)” or more simply here “Polls can be factored into polling and opinion or into
research acts of pollers and speech acts of respondents.” Each of these three variables is a
composite, a set of many elements. The kind of resulting product may be a logical product of
classes, or a conjunction of sentences in Symbolic Logic, or an intersect of sets in the algebra
of sets, or an arithmetic or algebraic product, or a product moment of statistical variates, or a
matrix product, or other kind of product appropriate to the form of the data — but not the
corresponding kind of sum (i.e., A ≠ a ± X).
This subhypothesis that products predict better than sums (under conditions specified
by a 'vanishing result' test) can be verified by controlled experiments at any time. Thus the
Washington Public Opinion Laboratory has shown that in spreading an opinion (or other item
or all-or-none state) from person to person with equal opportunity for everyone, the percent of
persons contacted or diffused will necessarily grow in an S-shaped logistic curve. If the
interactors (who are either knowers (= p) or non-knowers ( = q) of the item, so that p + q = 1)
are multiplied together to get their product, pq — which expresses the joint probability of their
interacting in meeting each other — these products cumulate in t successive periods to fit the
observed logistic curve excellently. The data and this logistic hypothesis (expressed by $\sum_{1}^{t} pq$)
showed a correlation of .99 between their increments; and the discrepancies were not
significant by the chi square test at the five percent level. Then the identical data were added
together to get their sum, p + q — which expresses the alternative probability of their separate
occurrence. This sum, being a constant at unity, correlated necessarily at zero with the
observed data of spreading the item by conversations in pairs. Thus merely multiplying versus
adding identical data results in either almost perfect prediction of the observed and reobservable facts or the worst possible prediction such as pure chance gives. This repeatable
classroom experiment is typical of the way in which our step-parts theory can be rigorously
tested. Each of its 192 step-parts is a hypothesis as to a source of error that can be tested in
controlled polling experiments. This Chapter 2 of the report maps out hundreds of research
projects for pollers and graduate students' theses for many years ahead. These should
systematically improve our polling tools, our demoscopes, until they match the accuracy in
specifiable respects of the best microscopes or telescopes in other sciences.
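The contrast between the product pq and the sum p + q can be illustrated numerically. What follows is a minimal sketch and not the Laboratory's data or procedure: it assumes a starting proportion of knowers of 0.01 and ten periods, and it assumes that in each period the knowers gain exactly the product pq. The function name and parameters are hypothetical.

```python
def diffuse(p0: float, periods: int) -> list[float]:
    """Proportion of knowers after each period, growing by the product p*q."""
    proportions = [p0]
    for _ in range(periods):
        p = proportions[-1]
        q = 1.0 - p                      # non-knowers, so that p + q = 1
        proportions.append(p + p * q)    # this period's increment is pq
    return proportions


curve = diffuse(p0=0.01, periods=10)

# Increments per period: small at first, largest near p = 0.5, small again,
# which is what gives the cumulated curve its S shape.
increments = [later - earlier for earlier, later in zip(curve, curve[1:])]
peak_period = increments.index(max(increments))

# The alternative, p + q, is unity in every period and so cannot vary
# with the observed growth at all.
sums = [p + (1.0 - p) for p in curve]
```

Under these assumptions the cumulated products trace a curve that rises slowly, then steeply, then levels off near unity, while the sums form a flat line; a flat line has no variance and therefore no correlation with anything observed.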
To develop this step-parts model, the polling process was first classified into eight
'stages' or standard subprocesses. These are:
1. Administering a polling agency, including organizing it;
2. Designing a poll, including all administrative planning for it;
3. Questioning, including development and pretesting of questionnaires;
4. Sampling, including preparation, execution, and evaluation;
5. Interviewing, including recruitment, training, supervision, and evaluation;
6. Analyzing, including all editing, tabulating, and statistical computing;
7. Reporting, including all recording and filing as well as publication;
8. Interrelating the parties, including pollers and sponsors, respondents and publics.
These eight stages are reviewed in eight chapters in the report. They form major rows in the
step-parts matrix shown in the Conspectus attached (Figure 1).
Figure 1

The Variables in the Model:
Every poll (A) is analyzed into 2 factors, i.e., the public's opinion (X) and the
poller's procedures (a): A = X × a.
The opinions (i.e., speech reactions in a poll) proposed as behavioral criteria (B)
for testing this model internationally are: assertions (as in a census) of age,
sex, schooling, income, occupation, nationality, religion, and languages
spoken.
The procedures of polling (i.e., role-actions of pollers) are analyzed here into 8
stages and 24 substages called "steps" in polling. Each step may have 8 type-parts
(which are the dimensions of any human behavior) yielding 192 "step-parts" in the cells.

Its Central Hypothesis:
Standardizing the step-parts improves the comparability and validity of polls.

Its Testing:
Reliability (r_AA') (primes indicate reobservings) and validity intraclass correlations
(r_AB,BA') from repolling the criteria opinions, with each step-part in turn varied
alone over poll and repoll values, can test how standardizing may improve
validity. r_AA',A'A = 1 if σ(a – a') = 0 and r_BA = r_BA',AB

Eight Type-Parts (columns of the matrix; units digit of the cell code):
Code  Type-part                                      Symbol  Question
jj1   Acts (behavior)                                A       What is done?
jj2   Actors (pollers, pollees, publics)             P       By whom?
jj3   Values (wants, goals, desiderata)              V       Why?
jj4   Time (dates, sequences, durations, etc.)       T       When?
jj5   Space (places, distances, areas, etc.)         L       Where?
jj6   Materials (equipment and funds)                M       With which things?
jj7   Symbols (words, numbers, records)              S       With which symbols?
jj8   Residual conditions (i.e., all else)           C       How else?

Stages or periods and Sub-Stages or Steps (rows of the matrix; hundreds and tens digits of the cell code):
Stage                        Preparing  Executing  Completing
Administering an agency      11j        12j        13j
Designing a poll             21j        22j        23j
Questioning                  31j        32j        33j
Sampling                     41j        42j        43j
Interviewing                 51j        52j        53j
Analyzing statistically      61j        62j        63j
Reporting                    71j        72j        73j
Interrelating all parties    81j        82j        83j

Corner cells of the matrix: 111 (top left), 118 (top right), 831 (bottom left), 838 (bottom right).

The 192 “step-parts” cells in this matrix are derived by cross
classifying the 24 steps with the 8 type-parts or factors. Each
cell entry or “step-part” is defined as a logical product or joint
occurrence of its row and column entries. The cells spell out
an operational definition in three tenses for each of the eight midstages by telling who does what, when and where, why and
how.
Each step-part represents a standardizable unit of operating
procedure in polling.
Each cell’s code is a three-digit number, jjj, where: j stands for
each digit, or subclass, in turn; j**, the hundreds digit, stands for
the stages; *j*, the tens digit, stands for the substages; **j, the
units digit, stands for the type-parts; “0” stands for the absence
of subclasses; and “9” stands for mixtures of subclasses.
Then each of the eight stages is subclassified into three substages or 'steps' by its
tense. These twenty-four steps in polling are the preparing, executing, and completing
(including any evaluating) of each stage. They specify the twenty-four rows of the step-parts
matrix (Figure 1).
Next, each step, since it is a human action, can be analyzed into the eight
standardizable 'type-parts' of any human behavior. These type-parts are:
1. Acts, A — which are here any step in polling, abstracted from its context which is the
product of the seven type-parts below.
2. Actors, P — which are here people and appear in any of their three forms or
dimensions, namely:
as actors, the public whose opinions or speech acts are wanted;
as reactors, the respondents sampled to represent that public;
as role-actors, the pollers who measure the sampled opinions.
3. Time, T — which appears as dates or frequencies of polling, as tenses or sequences,
as durations or indefinite existence, as speeds or celerations, all depending on the
origin points, units, and indices used in measuring any facet of time.
4. Space, L — which appears as locations, distances, and areas.
5. Values, V — which are operationally defined as 'desiderata' or 'whatever pollees say
they want' and appear under many labels as desires, wishes, purposes, ends, goals,
aspirations, intentions, motivations, etc.
6. Material, M — which appears usually as equipment and supplies, money or budget.
7. Symbols, S — which appear as words, statistical variables, or other symbols, and all
records. And —
8. Residual circumstances, C — which complete the context making these eight type-parts
a perfectly closed system for polling.
These eight type-parts of any human transaction (= action-in-fully-factored-context) can
be mathematically treated for any purposes of exact science in either a dimensional formula or
a statistical formula (see Note 1). When cross-classified with the twenty-four steps, as in the matrix, Figure
1, they define the 192 step-parts, each as one type-part or aspect of one standardizable step
in the polling process.
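The three-digit coding of the step-parts can be illustrated with a short Python sketch. It is not part of the original report. The function name decode_step_part is hypothetical, the stage and substage labels are copied from the matrix, and the order of the type-part digits (1 = Acts through 8 = Residual conditions) is an assumption taken from the column order of the matrix. The digits 0 and 9 follow the stated convention of "no subclass" and "mixture of subclasses".

```python
# Illustrative sketch only: labels are copied from the step-parts matrix,
# and the digit-to-type-part order is assumed from its column order.
STAGES = {
    1: "Administering an agency",
    2: "Designing a poll",
    3: "Questioning",
    4: "Sampling",
    5: "Interviewing",
    6: "Analyzing statistically",
    7: "Reporting",
    8: "Interrelating all parties",
}
SUBSTAGES = {1: "Preparing", 2: "Executing", 3: "Completing"}
TYPE_PARTS = {
    1: "Acts",
    2: "Actors",
    3: "Values",
    4: "Time",
    5: "Space",
    6: "Materials",
    7: "Symbols",
    8: "Residual conditions",
}


def _lookup(digit, table):
    """Translate one digit, honoring the 0 (none) and 9 (mixed) conventions."""
    if digit == 0:
        return "unclassified"
    if digit == 9:
        return "mixed"
    if digit not in table:
        raise ValueError(f"digit {digit} is not defined for this position")
    return table[digit]


def decode_step_part(code):
    """Split a three-digit code jjj into (stage, substage, type-part)."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("a step-part code must be exactly three digits")
    stage, substage, type_part = (int(ch) for ch in code)
    return (
        _lookup(stage, STAGES),
        _lookup(substage, SUBSTAGES),
        _lookup(type_part, TYPE_PARTS),
    )


# Every fully classified cell: 8 stages x 3 substages x 8 type-parts.
ALL_CELLS = [f"{s}{t}{p}" for s in STAGES for t in SUBSTAGES for p in TYPE_PARTS]
```

For example, decode_step_part("838") names the last cell of the matrix (interrelating all parties, completing, residual conditions), and ALL_CELLS holds the 192 codes of the full cross-classification.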
The step-parts model, or theory of polling, can now be summarized in the testable set of
192 hypotheses, namely: 'Standardizing each step-part improves the reliability of polling and
the consequent validity, representivity, utility, and universality of polling'. For standardizing
each step-part reduces its variance (= σ_a^2), or error in the equation, A = a x . This leaves the
observed variance of the data (= σ_A^2) more purely as the variance of the people's true opinions
(= σ^2). This is the first objective in any polling — to record data that corresponds closely to
the phenomena free of instrument error.
These 192 step-parts (or coarser breakdown of polling into just eight stages or into just
eight type-parts if so desired) should be practically useful in many ways. They provide a
coding, as broad or fine as needed, for putting anything about polling into IBM punch cards,
into office files, into reports or textbooks on polling, into library and bibliographic classifications,
into courses for systematic teaching of polling, into specifications of contracts for market
research, into administrative organizing of polling agencies, or into theoretical systematizing of
polling, etc.
III. Review of General Publications
Chapter 3 of the report on Project Demoscopes reviews the general literature on polling
methodology. It covers the historical development from the Literary Digest's fiasco of 1936 and
Gallup's offsettingly successful first presidential poll in that year, on to the great wartime
expansion of polling, through the mispredictions of the American elections of 1948, and down
to 1955. It deals largely with the 000 and 999 cells of our step-parts matrix (which molds the
outline of the volume and of this paper). For the 000 cell denotes general polling materials
unclassified by stage or type-part and the 999 cell denotes materials of mixed classification
involving several stages and type-parts.
This chapter reviews the developing and expanding fields of polling. These may be
most comprehensively listed as the dozen chief institutions of society whereby men in all
cultures and periods interact and communicate to get what they want — thus forming,
expressing, and acting on their opinions. Polls are made nowadays in all twelve institutional
fields of politics and economics, family life and education, health and recreation, religion and
welfare, art and science, mass communications and military affairs.
The types of polling have similarly proliferated along all dimensions as described in
more detail in the following chapters. These review in turn the journal literature on each stage
of a poll.
The uses (and some misuses) of polling have also grown far beyond their original
journalistic functions. Perhaps the chief trend in usage has been to develop from purely
fact-finding polls to include hypothesis-testing functions also. In the early days polls asked
questions designed to collect the facts of a public's distribution of opinions. Increasingly today's
polling also seeks to test substantive as well as sampling hypotheses. Modeling and controlled
experiments and other forms of basic research which seek firm generalizations and even laws
of opinion are being added to the applied research which collects opinion largely to guide
immediate decisions.
Polling is developing from an art into a science. It depends increasingly on the
behavioral sciences, especially those studying man's speech behavior. The best training of a
modern poller calls for courses in psychology and the social sciences and some of their
technologies, together with selections from the semiotic sciences of semantics, logic,
mathematics, and their most relevant branches such as statistics, matrix algebra, and set
theory. The newly emerging theory of information with 'bits' for units is enthusiastically
described by a few writers as a major breakthrough in the behavioral and speech sciences —
perhaps equivalent to relativity theory in physics.
Scientific methodology in polling is improving. A general evidence of this is the
increasing use of operational definitions to provide observable, quantifiable, and testable
variables. 'Polled opinion' is increasingly defined in practice as the distribution of responses in
a poll, i.e., as speech behavior observed under specified and repeatable conditions. 'Attitudes'
are increasingly treated as classes of observed speech acts, often measured in scaled
degrees. Concepts that cannot be defined at least in part by pointing to the questions and
answers in some poll are disappearing as of little interest for scientific purposes in the field of
polling.
Furthermore, substantive theories of opinion are being proposed which partly depend
on polling methodology for defining or testing them. Thus Guttman's theory of the first four
components (a basal opinion, its intensity, closure, and involvement) is defined in terms of
indices from a frequency distribution of responses of a population. Similarly Dodd's tension
theory of human values or motivations aims to predict mass behavior from a tension ratio
which compares the amount of some desideratum with a polled population's intensity of
desiring it. This 'give/get' ratio measuring the social price or motivating force of anything
wanted is best determined from scaled questions in polls finding out just how much one will
give or do for specified amounts of whatever he wants to get or to keep. This tension ratio or
exchange ratio involves the two dimensions of acts and values in the author's transact theory
cited above for predicting any human behavior including speech behavior in opinion research.
Then the author's mode-tense model extends the 'acts dimension' by subclassifying
polled opinions into the three modes of speech (in saying 'I feel —,' 'I know —,' or 'I do —') and
cross-classifying these by the three tenses of time (as past, present, or future, feelings,
knowings, or doings). The resulting matrix of nine mode and tense cells or subopinions is
expected to predict a public's behavior via multiple regression equations better than any other
set of the same number of opinions not stratified into these nine mode-tense categories. This
mode-tense theory of opinion is defined for any topic in any culture by a poll's questions and
procedures and is empirically tested by its degree of improved prediction of a later mass
behavior from an earlier poll. In this way the theory is operationally defined by operations of
polling. We expect this general trend to be strengthened in the public opinion research of the
future.
IV. The Administering Stage
Chapter 4 of WAPOR's Report to UNESCO reviews the scanty set of articles bearing
on the organizing and administering of polling agencies, especially the international agencies.
Many of polling administration's problems are similar to those of business and public
administration generally. Manuals of secretarial and office practice, of personnel management
and financial accounting, of public relations and publishing all become more useful as a polling
agency grows in size to employ hundreds of interviewers and office staff.
The factors in the transact theory provide a universal and completely cross-cultural
frame for planning and executing any kind of polling operation. For the polling director's whole
job is to deal with each dimension of the context, each type-part of any human transaction,
whatever its contents or complexity. Note how each dimension of a transaction contributes in
international polling:
The administrator must first choose his objectives — the values dimension. Is the goal
of his agency to be market research, journalism, governmental guidance, public relations
research, basic behavioral research, or any combination of these and other purposes —
perhaps under the heading of 'public opinion research' — 'P.O.R.'? (This general objective
has appeared in several agencies and associations such as WAPOR, the World Association
for Public Opinion Research; AAPOR, the American Association for Public Opinion Research;
IPOR, International Public Opinion Research; etc.)
Next the administrator must plan the agency's program of activities — the acts
dimension. Will face-to-face, telephonic, or mail canvassing be used? Machine or hand
tabulating? Short spot polls or exhaustive panel polls? Etc.
He must organize a polling staff and select a representative sample from some
population universe for each poll. These deal with the three dimensions of any population — its
role-actors, reactors, and (potential) actors.
He must schedule the polling with deadlines which coordinate the work of all the staff in
the time dimension. He must select the polling area, the respondents' addresses, the
interviewers' bases, the office quarters, and other aspects of the spatial dimensions. He must
possess the equipment for any demoscope from typewriters to printing press, from telephones
to transportation facilities, from pencils to tabulating machines, from central office to any
branch quarters, together with an expectation of income that enables him to budget expenses
and manage the materiel dimension. He must know how to adapt symbolic dimensions to his
objectives in their multiplex forms of phrasing the polls' questions and answers, reports and
records of all sorts including all statistical analyses. Finally, there will always be the eighth
class of dimension or type-part of any transaction. This includes the residual circumstances
with which the poll administrator must cope, whether it be a sudden revolution in one country
or a routine national holiday in another country which disrupts international uniformity in a poll.
This chapter on administering polls surveys the international polling agencies already
built up. These polling agencies and pollsters are fully listed in several directories and in the
membership lists of the three professional associations, WAPOR, AAPOR, and ESOMAR.
They may be classified in five types as follows:
1. International market research agencies like International Research Associates and J.
Walter Thompson Company. These do any sort of polling under contract in as many as
thirty countries through their national branches or affiliated agencies.
2. Associations of journalistic institutes of public opinion like the Gallup affiliates. These, in
addition to serving their newspaper chains, also engage in separate market researches
and other contractual polling in a dozen countries.
3. Academic agencies like the Organization for Cross-cultural Research. This organization,
centered in Oslo and supported by foundation grants, designs and helps execute
closely coordinated basic behavioral research in seven European countries.
4. U. S. governmental polling contracted out by the Defense Department or State
Department. These polls are made in countries outside the United States, whether
classified as confidential or not, in the interest of American foreign policy and aid
programs, the Voice of America, NATO, etc.
5. UNESCO contracts, through WAPOR, to national agencies for one-topic polls, such as
Cantril's 1948 study of eight nations' attitudes towards each other and the appraisal in
three old university towns in three countries of a campaign on the Universal Declaration
of Human Rights.
These international operations seem the practical beginnings in private and
governmental forms of what may develop eventually as a Barometer of World Opinions. This is
the project to establish periodic polling on any topic of public interest in as large a part of the
world as may be currently ready for such cooperative service. It would serve the interests of
UNESCO and the United Nations primarily but might develop to become the world's most
accurate and comprehensive fact finding agency for whatever data are observable through
representative sampling of people. Its prospectus claims that it could make representative
democracy work better on a world scale by amplifying the voice of the people in world affairs. It
could help to integrate public decisions and policies with popular desires and information levels
among the seventy-odd member nations of UNESCO, for example.
This Barometer project may develop in several differing forms under either private or
intergovernmental auspices. Under private auspices an International Business Climate
Barometer is already functioning. Under government auspices it was included in 1943 in an
early draft of what later became UNESCO's charter until Russian opposition deleted it there. It
was an explicit objective in founding WAPOR. WAPOR has several times by official resolutions
and by accepting contracts promoted the Barometer idea. The present Project Demoscopes
and volume on Techniques for World Polls, by reviewing the methodology of international
polling, was conceived as an important step towards such a Barometer. Several articles have
suggested that UNESCO's Social Science Division establish an office for coordinating polls in
a progressively larger set of countries, on a larger set of issues, at more frequent intervals,
with an enlarging budget and staff. Whether or not all this were packaged with a label of 'World
Barometer' is of secondary importance, of course.
V. The Designing Stage
The review, in Chapter 5, of the journal articles on the designing of polls shows that a
highly general or standardizable description of planning a poll can now be given. For
transaction theory provides universal terms for specifying the essential operations and their
context as briefly as in a formula (Eq. 2) or as fully as in a manual of polling. The following
paragraphs offer an outline for cross-cultural purposes in 600 words.
The interact of asking a question and answering it is the core content of polling. The rest
is its context of preparing to ask and of processing the answers. This elemental interact can
then expand into a set of questions and a set of responses with any desired differentiating into
subsets. Each question has its subset of answers whether in the form of open-ended free
prose or closed-end listing of alternative responses, or scaled degrees of one kind of
response. The rest of the interview is speech context to enlist the respondent's full
cooperation.
To prepare the questions every poll starts with some objectives in its design, whether
explicit in writing or implicit in habit. These state the reasons for polling in the current case.
What opinions of what public are wanted for what purposes? Then those objectives must be
analyzed into a set of askable and answerable questions. These are pretested with a
thoroughness proportionate to the accuracy that is desired in measuring the public's opinions.
They are then printed with any instructions in a questionnaire to standardize them every time
they are asked.
The askers and answerers must then be selected. Immediately, the askers are a set of
trained interviewers competent to ask and record answers in standardized ways and the
answerers are a set of respondents selected on some representative or other specified and
repeatable basis. Mediately, the interviewers mediate in questioning for the poll director and
any sponsors or clients for whom the director mediates in turn. Similarly the respondents
mediate through standardized sampling procedures for some public or specified population
whose answers are the objective of the poll. The askers are then the chain of role-actors from
sponsors to pollsters to their interviewers while the answerers are the sample of reactors
representing by inference the public or speech actors whose opinions are wanted.
The answers of the respondents are recorded verbatim or in precoded categories ready
for tabulating the frequency of each response to each question. This tabulating, whether by
hand or machine, records the frequency distributions of responses that operationally define the
public opinions at issue as 'polled opinions.' These distributions will then be reported and
compared, discussed and interpreted, as the poller thinks relevant to his objectives and to his
readers. Whether these readers are his sponsors, other pollers, the respondents, the general
public, or any special public will determine the fullness and style of reporting in lay or technical
language.
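The tabulating step described above, in which the frequency distributions of responses operationally define the 'polled opinions,' can be sketched in Python. This is an illustration under stated assumptions and not the report's procedure: the function names tabulate and percents are hypothetical, and each interview is assumed to be a mapping from a question label to one precoded response.

```python
from collections import Counter, defaultdict


def tabulate(interviews):
    """Count how often each precoded response was given to each question.

    `interviews` is assumed to be a list of dicts, one per respondent,
    mapping a question label to that respondent's precoded response.
    """
    table = defaultdict(Counter)
    for answers in interviews:
        for question, response in answers.items():
            table[question][response] += 1
    return table


def percents(counts):
    """Convert one question's frequency distribution to percents of its base."""
    base = sum(counts.values())
    if base == 0:
        return {}
    return {response: 100.0 * n / base for response, n in counts.items()}
```

A question left unanswered by a respondent simply does not enter the base for that question, so the base of each percent is the number of responses actually recorded.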
The above outline of the polling process covers its eight stages and can be developed
to their substages or steps in polling with any amount of local variation. These variations from
the base line sketched above will be due to the eight dimensions or type-parts of polling. To
reduce such variation one specifies a particular poll's dimensions as follows:
What are the objectives? — This tends to fix the values dimension and bounds the
remaining dimensions.
How many questions? — This tends to fix the activity dimension.
How many interviews? — This tends to fix its actor dimensions of respondents,
interviewers, and other staff
How many polls in a period? — This tends to fix its timing dimensions.
How large an area? — This tends to fix its spatial dimensions.
How much analysis and reporting? — This tends to fix its symbolic dimensions.
How much money is available? — This tends to fix its materiel dimensions and limits all
other dimensions.
What special features has this poll? — This tends to fix all the residual dimensions, if
any.
VI. The Questioning Stage
The sixth chapter on questionnaire construction, in the report on Project Demoscopes,
along with the chapters on the remaining six stages, will be reviewed here more briefly. This
chapter discusses techniques of questioning, for cross-cultural polling and for all polling, in the
usual three substages of preparing the stage in hand, executing it, and completing it.
Under preparation, the chapter discusses the choice of questions, the pretesting
procedures, and the many ways for fixing on their final phrasing including the phrasing of
closed-end and scaled responses.
Under content of the questionnaire, lists of types of questions are reviewed together
with notes on formats for presenting them. For an example of a closed-end question, a
standardizable 'rectangular' format which minimizes space and maximizes convenience for
interviewer and tabulator is shown. The internal and bottom boundary lines can be shifted to fit
the spaces needed by each particular question while keeping standard column width and
alignment of pre-coded responses.
Under completing the questioning stage, the evaluation of the questionnaire is
discussed with subsections on 'reliability,' 'validity,' 'acceptability' to the respondents in diverse
cultures, and 'fidelity' of translation for interlingual polls. An index of fidelity is developed, for
example. This is the percent of semantic units or phrases which are in agreement (either
exactly or within specified limits of synonyms) between the original phrasing of a questionnaire
in language A and its doubly translated phrasing. The doubly translated phrasing is obtained
by one or more translators translating it into language B and then having one or more new
translators independently retranslate those versions back to language A. This is a technique,
peculiar to cross-cultural polling, which can help to make such polls more fully comparable.
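The index of fidelity described above lends itself to a small Python sketch. It is offered only as an illustration: the function name fidelity_index is hypothetical, the semantic units of the original and of the doubly translated text are assumed to have been paired off in advance, position by position, and the 'specified limits of synonyms' are assumed to be supplied as a table of accepted equivalents.

```python
def fidelity_index(original_units, back_translated_units, synonyms=None):
    """Percent of paired semantic units that agree after double translation.

    Units agree when they match exactly (ignoring case and outer spaces)
    or when the back-translated unit is listed as an accepted synonym of
    the original unit. Pairing the units is assumed to be done beforehand.
    """
    if len(original_units) != len(back_translated_units):
        raise ValueError("the two phrasings must have the same number of units")
    if not original_units:
        raise ValueError("at least one semantic unit is required")
    synonyms = synonyms or {}
    agreements = 0
    for original, returned in zip(original_units, back_translated_units):
        o = original.strip().lower()
        r = returned.strip().lower()
        if o == r or r in synonyms.get(o, set()):
            agreements += 1
    return 100.0 * agreements / len(original_units)
```

With three units of which two return unchanged, the index is two-thirds of 100; listing the third unit's back-translation as an accepted synonym raises it to 100.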
Figure 2. Standardizable 'rectangular' format for a closed-end question

Question number | Text of question in quotes and bold type exactly as the interviewer is to say it
                | Any instructions to interviewer in light italic type
Response categories in words in bold or light type according as spoken or not to the respondent | Punched card column and row code, i.e., a digit (to be encircled for the responses)
VII. The Sampling Stage
The enormous literature on sampling could not be adequately reviewed within the
manpower resources and space limitations of the report in this survey of methodology for
cross-cultural polling. Our limited canvass, however, justifies, we believe, the following five
generalizations about sampling techniques for world polling:
1. Sampling consists in selecting and using a part to represent some whole.
2. The 'whole' in polling usually means a specified human population but may be
measured along any other dimension such as in time sampling, areal sampling, activity
sampling, and opinion or attitude sampling (as when testing a sample of statements for
their scalability by the Cornell technique).
3. Representativeness of a sample is mathematically best assured and measurable by
probability sampling, especially when both the sample and its 'universe' are large. The
representativeness of quota sampling is usually less or unknown. Availability sampling
is still worse.
4. Probability sampling's exactness is somewhat offset by its greater expense, its frequent
administrative difficulty, and possible biases if imperfectly executed.
5. Research is progressing, but much more is needed, towards combining the accuracy of
probability sampling with the ease and lower cost of quota or other sampling.
A few items of this research may be glimpsed. Kendall and the British pollers have worked on
combining the best features of each. Deming's new duplicate sampling technique (reviewed in
this chapter) greatly reduces computational and administrative costs in probability sampling. The
technique of weighting respondents in inverse proportion to the number of evenings they were at home
in the preceding seven days helps correct for the 'not-at-home' bias in probability sampling.
Stapel's 'birthday' technique of asking to interview the person with the nearest birthday
randomizes the choice of interviewee within the household more simply than by using Kish's
tables for this. Other techniques for better and easier sampling include development of: more
complete population lists, maps showing every dwelling unit, previous master samples
available for purchase, tables of random numbers and other mechanically randomizing
devices, etc.
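The 'not-at-home' correction mentioned above can be sketched in Python. People who are seldom at home are under-represented among completed interviews, so each respondent's weight must grow as his evenings at home shrink. The particular formula below is an assumption made for illustration and is not taken from the report: the evening of the interview is counted as one more evening at home, so that a respondent who reports none of the preceding seven evenings still gets a finite weight. The function names are hypothetical.

```python
def at_home_weight(evenings_at_home, window=7):
    """Weight inversely related to how often the respondent is found at home.

    Assumption: the interview evening counts as one extra at-home evening,
    giving the weight (window + 1) / (evenings_at_home + 1).
    """
    if not 0 <= evenings_at_home <= window:
        raise ValueError("evenings_at_home must lie between 0 and window")
    return (window + 1) / (evenings_at_home + 1)


def weighted_percent(respondents, answer):
    """Weighted percent giving `answer`.

    `respondents` is a list of (evenings_at_home, response) pairs.
    """
    total = sum(at_home_weight(k) for k, _ in respondents)
    hits = sum(at_home_weight(k) for k, r in respondents if r == answer)
    return 100.0 * hits / total
```

In a toy sample of two respondents who were at home all seven evenings and answered 'yes' and one who was at home a single evening and answered 'no', the unweighted 'yes' share is two in three, while the weighted share falls to one in three, because the seldom-home respondent stands for more of the public.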
VIII. The Interviewing Stage
The review in Chapter 8 of the journal articles (and a few books) on the interviewing
stage in a poll is broken down into the three substages, and each of these is further
subclassified by the eight type-parts. The fullest way to picture this chapter's review, when it
has to be further condensed in this paper to a paragraph, is to reproduce the table of contents
for the descriptive part of this chapter. (Its author, Peter Mazur, adds an evaluative part giving
his appraisal of the literature on each substage.)
Chapter Section                                            Type-part

I. Preparing the interviewers                              (Pre-stage)
   A. Setting qualifications for interviewers              Acts and values
   B. Recruiting suitable interviewers                     Population
   C. Preparing instructions for training interviewers     Symbols
   D. Using interviewer equipment and calculating costs    Material
   E. Choice of a suitable place for training              Space
   F. Training interviewers in time                        Time
   G. Other conditions                                     Circumstances

II. Conducting the interviews                              (Mid-stage)
   A. Setting standards for an interview                   Acts and values
   B. Matching interviewers and respondents                Population
   C. Keeping interviewer records                          Symbols
   D. Using material means in an interview                 Material
   E. Placing the interview                                Space
   F. Scheduling the interview hours and days              Time
   G. Other circumstances in an interview                  Circumstances

III. Evaluating the interview                              (Post-stage)
   A. Measuring fulfillment of standards                   Acts and values
   B. Measuring incompatibilities among interviewers       Population
   C. Researching on semantics of interviewing             Symbols
   D. Measuring costs of interviewing                      Material
   E. Measuring spatial factors                            Space
   F. Measuring duration of interviewing                   Time
   G. Appraising the context of the interview              Circumstances
IX. The Analyzing Stage
The two authors of the report, Techniques for World Polls, followed the ratings of
WAPOR's Advisory Committee as to the relative amount of reviewing space to be given to
each step in the polling process. The consequent abstracting of the vast literature on statistical
analysis was limited here to what might be most useful and distinctive to cross-cultural polls
leaving the international poller to read up on most of the statistical techniques in the many
available textbooks and manuals.
For cross-cultural purposes, the chief techniques are the editing of filled-out
questionnaires or schedules, the coding and preferably pre-coding of responses, their transfer
to punch cards (or edge-marked cards for hand tabulating), and the tabulating of the
frequencies of response in as many breakdowns or subclasses as desired. Once the data is on
a punch card, it is in an entirely international statistical language whatever the idiom of the
interview. The commoner statistical analyses such as getting any desired indices like percents,
other means, variances, raw and multiple correlations, chi squares, factor loadings, tests of
significance levels, of confidence bands, of excellence of fit, etc. are all completely
independent of the language of the questionnaire or the culture of the respondents.
A note may be made of the coming mechanical and electronic instruments which will
vastly increase the volume, speed, ease, and accuracy of international polling especially
(where a world sample may run up to 100,000 interviews eventually). Thus one unpublished
proposal blueprints a wire recording device the size of a cigarette case. It would be operated
by the interviewer's fingers of one hand punching in the responses as heard. It would flag
omitted questions before the interview was over, thus editing out such gaps. A thousand items
of response could be recorded as magnetized or non-magnetized points a millimeter apart on
each person's meter of wire — which would be equivalent to one punched card. One hundred
spools could unwind 100 parallel meters of wire on a scanning table every second. This would
yield tabulations of 100 cases per second in each of 1,000 frequency counts which would be
rung up in 1,000 counters. The readings of the bank of 1,000 counters would be photographed
every second recording all data cumulated to date by successive samples of 100, permitting
quick testing of reliability. With 1,000 'electric eyes' reading the 1,000 magnetized points
through 'scoring screens,' any selecting and combining of data becomes possible for item
analyses, personal scores, distributions, scattergrams, curve fits, factor analyses, variance
analysis, etc. A thousand items of response could conceivably be tabulated for a million
persons in one day by one machine. If a world census were taken on one day (as some
countries now do it), it could be all tabulated the next day and printed on the third day! A future
society that wants to predict and control its affairs most fully could organize and pay one
randomly sampled person in every thousand to work as a paneller for ten minutes a day. Each
paneller would spend his daily ten minutes recording into a telephone his 'statistical diary'
consisting of his responses to a questionnaire on some hundred items of diverse behavior
each of which he did or did not do that day. Central machines could then tabulate, for the world
or for any sampled fraction of it, each of as many thousand human actions or attitudes as
desired. Each of these or any combinations of them could then be mechanically translated into
daily forecasts for any periods ahead. All this is now physically possible whenever society
wants it enough to pay for it. The cost might be a small fraction of the billions of dollars now
paid for military armaments.
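The throughput claimed for the proposed machine can be checked with a few lines of Python arithmetic. The figures are those given in the paragraph above (100 cases scanned per second, 1,000 items per case); the function name tabulation_hours is hypothetical, and the sketch assumes the machine runs without interruption.

```python
CASES_PER_SECOND = 100   # one hundred spools unwound each second
ITEMS_PER_CASE = 1_000   # a thousand items of response per person


def tabulation_hours(persons, cases_per_second=CASES_PER_SECOND):
    """Hours of uninterrupted scanning needed to tabulate `persons` cases."""
    return persons / cases_per_second / 3600


# Item counts registered per second across the bank of counters.
item_counts_per_second = CASES_PER_SECOND * ITEMS_PER_CASE

hours = tabulation_hours(1_000_000)
print(f"{hours:.2f} hours for a million persons")  # prints 2.78 hours
```

A million persons at 100 cases per second take 10,000 seconds, or under three hours, so the estimate of one day for one machine leaves a wide margin.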
X. The Reporting Stage
The reporting of polls is subdivided in Chapter 10 of the report under review here into
its three substages, or steps, by tense, as planning the reporting, writing it, and publicizing it.
The reporting should be planned to inform its readers and so its language and length
will vary according as it is addressed in the appropriate national language(s) to the general
public, special publics, or persons technically competent in the content and/or methodology.
Often a three-level report is useful for mixed audiences — telling its story in an abstract first,
then more fully in the body of the report, with technical appendices of details for specialists.
Reports of polls are increasingly being written in four versions woven together. In order
to communicate most fully to most people, reporting in sentences is necessary; numbers and
tables supplement words with greater accuracy; graphs make these easier to see and
remember; cartoons or pictured meanings may make reports still better understood and
believed. The prose should be objective in style and operational in its definitions, permitting
repolling and verifying. It should tell what it wants to say in well-known and oft-used short
words — as proved by a words-per-syllable rate of .6 or more. (This ratio is .93 in the last
sentence and .5 in the previous sentence, for example.)
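The words-per-syllable rate can be approximated mechanically, as in the Python sketch below. It is a rough illustration only. The syllable counter is a crude heuristic (it counts groups of vowels and drops most silent final e's), hyphenated compounds are counted as separate words, and the function names are hypothetical, so its results will not necessarily reproduce the ratios quoted above.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: vowel groups, less a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    # Drop a silent final 'e' (as in 'rate') but keep '-le' and '-ee' endings.
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def words_per_syllable(sentence):
    """Words divided by estimated syllables; 1.0 means all one-syllable words."""
    words = re.findall(r"[A-Za-z]+", sentence)
    if not words:
        raise ValueError("the sentence contains no words")
    syllables = sum(count_syllables(w) for w in words)
    return len(words) / syllables
```

A sentence made wholly of one-syllable words scores 1.0, and the score falls toward zero as longer words are used, so the standard of .6 or more favors short words.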
A set of standards for reporting polls, as endorsed but not adopted by AAPOR in 1948,
is:
Poll reports should state:
1. The purpose of the survey
2. For whom and by whom the survey was conducted
3. The universe investigated
4. Size and nature of the sample
5. Time spent on field work
6. Nature of the interviews
7. Control methods used
8. The phrasing of the questions
9. The bases of percents
10. The distribution of responses
For publicizing reports on polls the chapter gives a list of ways. The media for
publication, such as news releases, mimeographs, printed articles, books, are all used, of
course, but these are not peculiar to polls.
XI. The Interrelating Stage
The last 'stage' of interrelating all the parties to any poll is not so much a final process
as one which pervades all the previous stages and is only discussed last in summary. For
interrelating the parties means much more than the usual concept of public relations. It
includes the two-way communicating and cultivating of attitudes and actions in all
combinations among the following six parties. The six parties, directly or indirectly related to a
poll, are:
1) the sponsors who start a poll,
2) the pollers at all levels in the polling agency which executes the poll,
3) other pollers as competitors or fellow professionals,
4) the respondents in the sample,
5) the public that is sampled, and
6) the publicists (press, broadcasters, etc.) who may mediate between any of these.
A seventh party in the future may be poll auditors who on request will investigate with
appropriate measuring rods the honesty and competence of any poll or pollster and certify their
findings to the public much as chartered accountants do for a firm's financial bookkeeping.
Such accrediting may become available and much utilized in international polling especially.
The chapter reviews some principles of public relations in polling, a list of 'effective
policies' for developing public confidence in a polling agency, and the rudiments of a proposed
code of professional ethics for pollers.
XII. Recommendations for Developing World Polls
The agreements between UNESCO, WAPOR, and the Washington Public Opinion
Laboratory as subcontractor for this survey of polling methodology specified that in the volume
reporting it the authors' evaluations should be clearly separated from their descriptions of the
literature reviewed. Accordingly, the last chapter in the report presents the authors' personal
evaluations in the form of recommendations for action. These recommendations are grouped
by the chapters of the report. They are variously addressed to UNESCO, to WAPOR, to
pollers, to sponsors of polls, to publicists, to graduate students, and to the public. They add up
to pointing out steps for relevant parties to take whereby polling and international polling
especially can be developed as a more exact science and as a more useful tool for society.
For the specific recommendations the reader is referred to the report, Techniques for
World Polls, which will be in UNESCO's files.
The present paper is a preview or appetizer for that review of polling techniques.
Notes
1. The dimensional formula [denoted by square brackets] specifies the exponents or level of
development by self-multiplication for each type-part of any transact (= B) or behavior-in-situation thus:
[B = A^a \cdot P^p \cdot T^t \cdot L^l \cdot V^v \cdot M^m \cdot S^s \cdot C^c]    (Eq. 2)
The statistical formula is expressed in our interdisciplinary and standardizing notation by
corner scripts. In addition to the exponents these four corner scripts also specify the origin
and other points, the units and indices by means of scripts on the other corners of any
variable, thus:
[Diagram: the variable X with a script s on each of its four corners, the corners being labeled units, points, exponents, and indices.]
Dimensional formulas are useful for generalizing as in classifying data, formulas,
and situations into families of dimensional models expressing hypotheses and laws; while
statistical formulas are useful for particularizing as in specifying local details of terminal
points, class-intervals, and indices of each variable as used in a particular study.
#6. The World Association for Public Opinion Research
Organized in 1947, the World Association for Public Opinion Research provides a
focus for persons in all parts of the world who are interested in this area of social science.
WAPOR has served not only as a stimulus to a developing profession, but has been instrumental
in advancing the ideals of the United Nations, especially through cooperation with UNESCO.
The author has played a major part in the development of techniques for international opinion
research and in the establishment and growth of WAPOR. He is Director of the Washington
Public Opinion Laboratory in Seattle.
In the summer of 1945 the Public Opinion Quarterly published an article by the present
writer entitled "A Barometer of International Security". The POQ editor prefaced it as follows:
The United Nations may use a tool the League of Nations never had — public opinion
polling. Stuart C. Dodd describes the steps already taken in that direction, the argument for
further steps and the possible means of combating the forbidding, falsification, or frustration of
the polls by individual countries.
The next summer, the first international conference on public opinion research at
Central City, Colorado, appointed a committee to plan a worldwide professional association. Its
future functions were then more fully proposed in a paper entitled "Towards World Surveying." 1
The fifteen functions detailed there were largely embodied in WAPOR's first constitution,
adopted the next summer (1947) at the Williamstown conference. In the revised constitution of
1954 the matured purposes and functions of WAPOR are set forth in Article II as follows:
The purposes of the Association shall be:
a) To establish and promote contacts between persons in the field of survey research
on opinions, attitudes, and behavior of people in the various countries of the
world, and
b) To further the use of objective, scientific survey research in national and
international affairs.
Functions and activities of the Association may include, but are not limited to, the
sponsorship of meetings and publications, development of improved research techniques,
encouragement of high professional standards, promotion of personal training, coordination of
international polls, and maintenance of close relations with other research agencies ... as well
as ... UNESCO and other UN agencies. ...
Since then WAPOR has held annual conferences, drawing from around 100 to 400
pollers for professional papers and discussions, transaction of organizational and private
business, recruiting, and professional acquaintance. WAPOR's conferences, varying between
spring and fall, have been held in alternate years jointly with AAPOR (the American Association
for Public Opinion Research) in America or ESOMAR (the European Society for Opinion and
Market Research) in Europe, as follows:
Year   Place
1948   Eaglesmere, Pennsylvania
1949   Paris, France
1950   Lake Forest, Illinois
1951   Tunbridge Wells, England
1952   Poughkeepsie (Vassar), New York
1953   Lausanne, Switzerland
1954   Asbury Park, New Jersey
1955   Konstanz, Germany
1956   Buck Hill Falls, Pennsylvania
WAPOR Proceedings were reported in the International Journal of Opinion and Attitude
Research up through 1951 and in the International Social Science Bulletin, published by
UNESCO, since 1952. In addition, by 1956 the organization was putting out several
newsletters a year to its membership. To date, the organization's membership has been drawn
from some twenty countries. By 1956, 158 persons were carried on its rolls. The following individuals have served as officers:
Year   President                             Secretary-Treasurer
1946   George Gallup (U. S.) ("Chairman")    Stuart C. Dodd (U. S.) ("Co-chairman")
1947   Jean Stoetzel (France)                Frederick Williams (U. S.)
1948   James R. White (U. K.)                Stuart C. Dodd (U. S.)
1950   John F. Maloney (U. S.)               Stuart C. Dodd (U. S.)
1952   Jan Stapel (Holland)                  Stuart C. Dodd (U. S.)
1954   Leo Crespi (U. S.)                    Helen M. Crossley (U. S.)
1956   Bjorn Balstad (Norway)                Helen M. Crossley (U. S.)
In addition to serving the professional needs of those interested in international opinion
research, WAPOR has played a part in advancing the aims of the United Nations.
UNESCO's Division of Applied Social Science organized several international polls by
contracts through WAPOR and in 1953 the Association was given the status of Non-Government Consultant Organization. This puts it on the register of the Economic and Social
Council of the UN, making it in effect the chief spokesman for the polling profession before
United Nations agencies.
Three pioneering examples of international polling under UNESCO and WAPOR
auspices may be noted. In 1948, under Hadley Cantril's direction, UNESCO organized a poll in
nine nations, with a sample of about a thousand persons in each. The nations were: Australia,
Britain, France, Germany, Italy, the Netherlands, Norway, Mexico, and the United States. For
fourteen carefully constructed questions the resulting volume, entitled How Nations See Each
Other,2 compares the responses by countries and by their subclasses of sex, age, income,
schooling, and occupation. The questions dealt with the class structure, personal security and
satisfaction with life in one's own country, friendliness of feelings and stereotypes about other
countries, and ideas about human nature, peace, world government, and national character.
In 1951, UNESCO arranged subcontracts through WAPOR for a more experimentally
designed polling operation in three countries. A test was attempted of the effectiveness in
changing attitudes or information levels about the Universal Declaration of Human Rights
through educational campaigns. Polls were made before and after such campaigning in three
small towns (Oxford, Geneva, and Uppsala) which were seats of ancient universities in their
countries. There were no control groups for comparison, however, nor calibrating of the
reliability or precision of the polling. Hence the findings were somewhat inconclusive as to
whether the differences observed, which were small for the most part, meant that the brief
campaign changed most attitudes very little or whether the measuring was too coarse and
unreliable to reflect small changes.
Currently, in 1956, Stein Rokkan has been working closely with the UNESCO
Secretariat on the problem of comparable international polls. A WAPOR newsletter (February
29, 1956) reports him "working out plans — on behalf of UNESCO — in a preliminary but
systematic way — to ask several questions simultaneously in various countries by adding
questions to ballots in existing surveys."
I. The Developing Theory and Methods of International Polling
In 1953, a contract between WAPOR and UNESCO called for a survey of the
developing methodology of cross-cultural polling. This survey was to be executed by the author
at the Washington Public Opinion Laboratory in Seattle. WAPOR appointed an Advisory
Committee of 16 eminent pollers from eight countries to advise and criticize the plan of the
study together with the two drafts of its report. A two-year budget of $4,380 was provided, which
met one third of the project's cost. The task was to abstract and review the literature on survey
methodology appearing in journal articles around the world during the previous quarter
century. This review, and any synthesis its authors might achieve, was intended to increase the
comparability and standardization of cross-cultural polling. It should thus contribute to the
"know-how" of conducting world polls for any eventual Barometer of World Opinion.
The resulting scrutiny of some thousand articles was written up in a volume now
ready for press entitled Techniques for World Polls. 3 In addition to a review of the articles
and a set of recommendations by the authors to the polling profession, the volume presents
a systematic theory of polling methodology. This theory is stated in the form of a matrix model
for opinion research (not for opinion itself). The rows of this matrix, or tabulation of items,
classify the polling process into eight "stages," as somewhat separate and standardizable
subprocesses. Total polling behavior is thus broken down for convenience of analytic and
administrative treatment into:
Stage 1: Administering an agency;
Stage 2: Designing a poll;
Stage 3: Questioning on its objectives;
Stage 4: Sampling its population;
Stage 5: Interviewing with interviewers;
Stage 6: Analyzing statistically;
Stage 7: Reporting in all forms;
Stage 8: Interrelating the parties (clients, pollers, respondents, and publics).
Each stage may be further subclassified by tense into three substages, the early,
middle, and late, giving the "24 steps in polling." These deal in general with planning,
executing, and evaluating each stage. The three substages thus comprise the past,
present, and future tenses of the mid-stage.
These standardizable steps in polling are then cross-classified with the eight "type-parts" of any transaction or complex human activity. 4 These eight type-parts, analyzing
any behavior-in-context, are the columns of the matrix. They are the transaction's:
1) acts and
2) actors, its
3) time and
4) place, its
5) ends and means including all
6) material,
7) symbolic, and
8) residual circumstances.
The resulting combinations of the eight type-parts of each of the 24 steps specify the 192
"step-parts" in the cells of the "step-parts matrix" or table.
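As an illustration only, the cross-classification can be enumerated in a few lines of Python. The short labels for the stages, substages, and type-parts are paraphrased from the lists above, and the variable names are assumptions of this sketch rather than part of the original model.

```python
from itertools import product

# Rows of the matrix: eight stages, each split into three substages.
STAGES = [
    "Administering", "Designing", "Questioning", "Sampling",
    "Interviewing", "Analyzing", "Reporting", "Interrelating",
]
SUBSTAGES = ["early (planning)", "middle (executing)", "late (evaluating)"]

# Columns of the matrix: the eight type-parts of any transaction.
TYPE_PARTS = [
    "acts", "actors", "time", "place",
    "ends", "material", "symbolic", "residual circumstances",
]

# 8 stages x 3 substages = 24 steps.
STEPS = list(product(STAGES, SUBSTAGES))

# 24 steps x 8 type-parts = 192 step-parts, one per cell of the matrix.
STEP_PARTS = [
    (stage, substage, part)
    for (stage, substage), part in product(STEPS, TYPE_PARTS)
]

print(len(STEPS), len(STEP_PARTS))
```

Each tuple in STEP_PARTS names one cell, for example the "time" aspect of the early substage of Sampling, which can then serve as a key for filing, budgeting, or recording a source of error.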
Each step-part is a standardizable source of variation or error which may reduce any
poll's reliability (agreement with a repoll), its validity (agreement with a criterion behavior),
and its comparability (agreement between regions or cultures). Controlled experiments can
isolate and measure the results of varying of each step-part. Reducing the variation of each
step-part helps to standardize the instrument and bring out the variation of the true
opinion more purely in the polled data. This is hypothesized to increase the accuracy of
measuring the public's opinion and of predicting that public's behavior — thus specifying a
testable theory of polling.
The step-parts theory of polling has many other uses. By standardizing the
classification into eight stages, or more finely if desired into 24 steps, or into eight type-parts, or most finely into 192 step-parts, it flexibly provides:
a) an organization scheme for any polling agency;
b) a job analysis for administering any polling operation;
c) a functional filing scheme for any polling office or library;
d) a systematic outline for any textbook, manual, course, training program,
conference, index or abstracting service;
e) a budgetary and accounting plan; and
f) a predictive and testable theory for improving polling methodology.
The Techniques for World Polls devotes a chapter to each stage and thus helps to
systematize the world's literature by means of this step-parts matrix or model for public
opinion research whether within or among any nations.
II. The Goal of World Opinion Research
Beyond a world association for the polling profession lies the larger goal of developing
world polls themselves. While this is expected to go forward along many paths, it has been
hoped that some of these ways would eventually merge in a Barometer of World Opinion. This
would be a public or private agency for periodic polling of an increasingly representative
sampling of the world's peoples. It would thus become a demoscope for observing and
measuring the speech behavior of humanity, including its actions and its wants, its conditions
and its needs, its current satisfactions and tensions, its attitudes and its aspirations.
WAPOR has twice moved officially towards developing such a Barometer of World
Opinion. At its Lake Forest convention in 1950 it recommended that UNESCO take steps
towards establishing a "Barometer of International Tensions". 5 Again at Pocono in 1956, in
response to UNESCO's invitation to its Non-Government Consultant Organization to propose
agenda, WAPOR voted for "Project Evaluation." The details of this project were summarized in
the following resolution: "Resolved: that WAPOR recommends, for UNESCO's adoption, a
project to evaluate UNESCO's problems by periodic sampling of the relevant peoples, with a
pilot subproject to establish techniques."
In scientific affairs, a world demoscope would give the social sciences a new
instrument, comparable in power to the telescope in astronomy or the microscope in microbiology. Research in the behavioral sciences could observe for the first time the exact
differences and similarities, the cultural and biological processes, the expressible feelings,
knowings, and doings of all living people. It could overcome current limitations of the human
sciences to observing parts of humanity. It could progress to develop a fuller and more exact
science of man.
In practical world affairs, an opinion barometer could help build one world by
democratically amplifying the voice of the people in United Nations councils. The distribution of
opinion on any world issue could be quickly, accurately, and cheaply determined to help guide
international decisions of statesmen. World government could more truly fulfill Lincoln's
definition of a democracy as government of, by, and for the people.
Notes
1. Dodd, Stuart C., "Towards World Surveying," Public Opinion Quarterly, Winter, 1946-47.
2. Buchanan, William and Cantril, Hadley, How Nations See Each Other, University of Illinois
Press, Urbana, 1953.
3. Dodd, Stuart C. and Jiri Nehnevajsa, Techniques for World Polls, 1957, 300 pp. (to be
published).
4. Dodd, Stuart C., "The Transact Model, a predictive and testable theory of social action,"
Sociometry, December, 1955.
5. Dodd, Stuart C., "A Barometer of International Security," Public Opinion Quarterly,
Summer, 1945.
#7. Developing Demoscopes for Social Research
Reprinted from American Sociological Review, Volume XIII, No. 3, June, 1948
The aim of this paper is to forecast four new or further applications of demoscopes 1 in
basic social research of the future. By a demoscope is meant any scientific instrument for
surveying or polling samples of people. By basic social research, pure scientific research is
meant which is not primarily concerned with solving immediate problems nor immediate
guiding of administrative decisions but which seeks social laws. Basic social research here
means the search to generalize ever more widely the currently limited uniformities that have
been measured in interhuman behavior and in their attendant conditions. Such generalizations
give man in the long run more basic ability to predict and control human relations.
The argument, in this paper, for developing demoscopes as promising instruments for
observing social facts, involves the usual assumptions of scientists. More exactly, it assumes
that:
a) Social progress depends in part on social research;
b) Social research depends in part on observing facts;
c) Observing facts depends in part on better instruments.
With these definitions and assumptions about demoscopes made explicit, their development will be forecast.
I. Pan-Sampling
In forecasting further applications of demoscopes, consider first extending the topics
surveyed to all the social sciences in what may be called "pan-sampling." A pan-sample is a
sample of case studies, or study of a sample of persons each of whom is observed in
hundreds or even thousands of ways instead of the dozen or so ways represented in the
questions of the usual poll or survey. In pan-sampling, a representative panel of persons must
be recruited to be interviewed and tested, each for several hundred hours (which may be
distributed in time as in one hour sessions weekly for several years). At each interview the
interviewees will be questioned and tested and observed in new ways — towards cumulating
eventually every measurable index known to any social science on one and the same set of
persons. A pan-sample thus aims to become a sample of all characteristics of the persons in
the sample. The essence of the pan-sample is the large number of variables or characteristics
measured as well as their being measured on the same persons. The label "pan" is chosen to
denote both all the measurements possible on people and the one panel on which they are made.
Pan-sampling aims to accumulate the most thorough and complete measuring of
human beings and their relations that is currently possible. Thus psychologists would
contribute intelligence and personality tests, tests of skills and information, personal history
questionnaires, as well as a vast array of opinionaires. Economists would explore buying,
selling, and working habits, desires, and values of the panel on all economic matters which can
be determined by individuals, whether key individuals or individuals representative of the
masses. Political scientists would explore all voting and other political behavior and attitudes of
every conceivable kind towards government in every department and area from municipality to
United Nations. Ministers would explore religious affiliations and aspirations of the panel.
Educators would measure attitudes, knowledge and skills resulting from schooling. Medical
specialists would explore physiological functioning from psychiatric aspects to glandular,
muscular and other organic functioning of human beings. Sociologists would inquire
extensively and intensively into family and sex matters, recreational attitudes and behavior,
and into the other institutions such as welfare work, art, science, communication.
Anthropologists, both physical and cultural, would determine the thousands of physical and
residual cultural traits not covered by the other social scientists. In short, all the human and
social sciences would have to cooperate in observing, with planned priorities, everything they
currently know how to observe on that one panel of persons.
Consider the advantages of pan-sampling which would result in proportion as the
number of variables observed on any one population grows larger from its present meager
number towards the eventual vast number needed to comprehend all human traits and
relations.
For the first time in the history of social science pan-sampling would make exactly
known all the conditions under which any generalization held good. Heretofore, from proverbs
to the best current empirical laws, the specific conditions for their holding and for their being
applied in the future are only vaguely and incompletely known. Pan-sampling, in measuring all
concurrent conditions, would determine the conditions under which every one of the myriad
variables varied. The conditions for each variable are simply all the other variables that show
correlation indices with it. Thus knowledge in the social sciences would accumulate
acceleratingly. For every new variable each week that is added to the growing set of n
previously observed variables adds 2n-1 new correlations between variables and so 2n-1
additional sources of prediction and control of every one of the n prior variables. Thus every
study on the pan-sample could build on, and also build up, all previous studies. This is a
desirable state of affairs which has been sadly lacking in the social sciences, hitherto. The
thousands of published social studies and the several social sciences have been like separate
bricks, rarely built together into a coherent structure. This cumulating of knowledge provided by
pan-sampling would make the pan-sampling research center the Mecca for social scientists
and develop these sciences in a long leap forward.
This would consequently tend to unify the social sciences. Their separate interests
would become best developed by close cooperation in collecting data and correlating them.
Social science would begin to see and interpret society as a whole instead of the present
myriad facets, usually of unknown relationship to each other. Thus the inter-correlations (n^2 - n in number) of every variable with every other one would become determinable. From these
the multiple correlation of each variable with the set of all the other variables, each optimally
weighted, becomes determinate. From this, the multiple regression equations can estimate (or
predict if applied to the future) the values of any of the variables observed. The accuracy of
these predictions may be expected to become increasingly high — well over 90% wherever
multiple correlations of .95 or higher are built up by pan-sampling. Studies such as Thorndike's
on the goodness of American cities intercorrelating 300 variables demonstrate that multiple
correlations around .95 are possible. With social prediction raised to such a high level the most
manipulatable variables can be manipulated and so increase social control — man's control of
human society.
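A small worked count may help fix the scale involved. Taking the number of inter-correlations among n variables as n^2 - n (every ordered pairing of two different variables), the sketch below counts them for a study of 300 variables such as the one cited. The function names are hypothetical, and the count of distinct coefficients assumes that the correlation of A with B equals that of B with A.

```python
def ordered_intercorrelations(n: int) -> int:
    """Each of n variables paired with each of the other n - 1 variables."""
    return n * n - n


def distinct_coefficients(n: int) -> int:
    """Unordered pairs, since r(A, B) is the same number as r(B, A)."""
    return n * (n - 1) // 2


n_variables = 300
print(ordered_intercorrelations(n_variables))  # 89700
print(distinct_coefficients(n_variables))      # 44850
```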
Nor is this increase of social prediction (and consequent control) a speculation or even
a probability. It is a certainty. For it has been mathematically proved that multiple correlation
increases with
a) the diversity, and
b) the relevance of the variables that are observed.
Thus when the United States set out to get 60,000 airmen and eventually sampled the
aptitude of more than a million men for this purpose, those aptitude tests which predicted
ability as pilot, bombardier, etc., were built up by multiple correlation techniques. Many diverse
test items were found, each correlating with an index of the airman's ability in actual flying. The
most relevant items were thus picked. But the more diverse these items were, the better. For
differing items have less overlap in that each predicts something about the airman's ability that
is not measured by other items. Hence their predictive contributions add up to a higher index of
prediction. This enabled the authorities to control the selection of candidates so as to build an
air force efficiently.
Thus as pan-sampling increases the diversity of variables (i.e. the number of variables
with low intercorrelations) and their relevance (i.e. finds some variables correlating high with
the criterion variable at issue) the prediction of that criterion variable must increase as
measured by the multiple correlation formula. The only unsettled question is as to the amount
of increase in predictability of social phenomena. How nearly will the multiple correlations from
pan-sampling approach unity? (Correlations of unity mean perfect predictability.) Only trial in
research and better research2 will answer this question.
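The argument can be illustrated in the simplest case of two predictors, using the standard textbook formula for the multiple correlation R of a criterion y with predictors 1 and 2: R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2). The sketch below is an illustration under assumed correlation values, not data from the airmen tests, and the function name is hypothetical.

```python
import math


def multiple_correlation(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion (relevance).
    r_12: correlation between the two predictors (low value = diversity).
    """
    if not -1.0 < r_12 < 1.0:
        raise ValueError("r_12 must lie strictly between -1 and 1")
    r_squared = (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12**2)
    if not 0.0 <= r_squared <= 1.0 + 1e-12:
        raise ValueError("the three correlations are not mutually consistent")
    return math.sqrt(min(r_squared, 1.0))


# Assumed values: same relevance, heavy overlap between the predictors.
overlapping = multiple_correlation(0.5, 0.5, 0.8)  # about 0.53
# Same relevance, but fully diverse (uncorrelated) predictors.
diverse = multiple_correlation(0.5, 0.5, 0.0)      # about 0.71
# Diverse predictors with higher relevance.
relevant = multiple_correlation(0.6, 0.6, 0.0)     # about 0.85

print(round(overlapping, 2), round(diverse, 2), round(relevant, 2))
```

With these positive, equal relevances, lowering the overlap between predictors and raising their correlation with the criterion each raise R, which is the pattern the text describes.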
Pan-sampling has many difficulties. But techniques for coping with these difficulties are
becoming known and lack of funds has chiefly prevented overcoming them. Thus a pan-sample will shrink in size through deaths, migration, disinterest among interviewers,
incomplete records, and other causes. These require
a) starting with a sample large enough to stand such losses;
b) preventive steps to reduce such wastage; and
c) constant renewal of the sample with new interviewees interviewed on the content of
all previous surveys with all computations involving these additions appropriately
corrected or duplicated.
Again, even though the panel is constant in composition, it will change in age; it may
become bored and uncooperative; or practiced and unduly skilled or articulate; or influenced
by pressure groups with vested interests; or economically unrepresentative under changed
conditions of a depression or war. All these dimensions are measurable with research and
hence such errors in pan-sampling can be corrected.
The one essential condition, not met at present, is that the demoscope which is
developing a pan-sample shall have ample financial support. This means annual funds of the
order of a million dollars or so, not the few thousand dollars of current subsidies for research on
polling. Pan-sampling will require funds comparable to building one battleship, or a dozen
bombers, or a one-tenth of one per cent share of the atom bomb. 2
Although pan-sampling is expensive, its cost can be reduced by various devices,
administrative or technical. One administrative device, for example, would be to charge for the
use of the pan-sample's panel. Every research project, since it intercorrelates its findings with
the findings of previous surveys, should pay a share of their cost. For another example, a
technical device reducing the cost of pan-sampling is an electronic scanner. This is a statistical
tabulating and computing machine which may displace the present punched cards and contact
brushes as completely as these displaced hand tabulating. The scanner uses 1,000
magnetizable points on a meter of wire (or alternatively 1,000 opaque millimeter spots on a
frame of movie film) to record the data of up to 1,000 items on each person. With 100 such
meters of wire representing 100 persons stretched on a scanning table simultaneously,
frictionless "electric eyes" could read and electrically record almost instantaneously the per
cent of persons having each of the 1,000 items of data. Similarly, correlation coefficients could
be instantly scanned and photographically recorded. This swift intercorrelating of hundreds or
thousands of human variables has been hitherto mechanically prohibitive in time and cost but
electronics should make it possible. By this, millions of items per minute can be handled by
frictionless harnessing of either electrons or radiant energy.
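The tabulation the scanner is meant to perform can be sketched in software. The following Python fragment is an illustration only, not the author's specification: it assumes each person's record is a list of 0/1 marks (one per item, standing in for the magnetized points or opaque spots), and the function name is hypothetical.

```python
def percent_having_each_item(records: list[list[int]]) -> list[float]:
    """Return, for each item, the per cent of persons whose record marks it.

    records: one list of 0/1 marks per person, all of equal length.
    """
    if not records:
        raise ValueError("at least one person's record is required")
    n_items = len(records[0])
    if any(len(record) != n_items for record in records):
        raise ValueError("every record must cover the same items")
    n_persons = len(records)
    return [
        100.0 * sum(record[i] for record in records) / n_persons
        for i in range(n_items)
    ]


# Four persons and three items (the text envisages 100 persons x 1,000 items).
sample_records = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
print(percent_having_each_item(sample_records))  # [75.0, 25.0, 25.0]
```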
This electronic scanner, which would facilitate a demoscope in developing pan-sampling, needs large funds for its development. At present, the author has developed a
statistician's specifications for the scanner, but electronic engineers have still to work out these
or better devices.
A practical beginning4 of pan-sampling would be to build up, through an ample ten-year
grant, an existing polling agency in cooperation with the social science faculty of a university in
some one State. Only a large university could supply the many specialists in measurement in
every human and social science. Only a polling agency limited to one State could combine a
sample that is representative of a whole population and reliably adequate in size (of the order
of 1,000 persons or more) with accessibility by auto within one day's working radius such as to
keep interviewing costs down.
Trucks equipped as mobile laboratories taking the diverse technician interviewers or
testers to the interviewees would be required and travel cost for these on a national scale
would make the budget ten times or so greater than required for a pan-sample in one State.
A representative State in the United States would require for thorough scientific
purposes a sample of some 2,000 persons, calculating the size of sample as the square root
of the population (of perhaps three million or so) and allowing for losses. To keep these 2,000
persons continually cooperating in weekly interviews for years will require elaborate techniques
to build up their interest. It may prove necessary to pay them and so command their services,
as laboratory material, by employment. Thus individual contracts could be made for 100 hours
of testing a year, payable only on completing it all. This may require $100 per interviewee, or
$200,000 a year for the panel of 2,000 interviewees. Such sums for human research are not
large compared to sums currently spent on research in agriculture, chemical plastics, or
electronics.
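The arithmetic behind these figures can be laid out explicitly. The sketch below uses only the numbers stated in the text (a population of about three million, a panel of 2,000, and $100 per interviewee per year); the function name and the dictionary keys are hypothetical.

```python
import math


def panel_budget(population: int, panel_size: int, fee_per_person: int) -> dict:
    """Square-root sample size, allowance for losses, and yearly panel fees."""
    base_sample = math.isqrt(population)  # whole-number square root
    return {
        "base_sample": base_sample,
        "loss_allowance": panel_size - base_sample,
        "annual_fees": panel_size * fee_per_person,
    }


plan = panel_budget(population=3_000_000, panel_size=2_000, fee_per_person=100)
print(plan)
# {'base_sample': 1732, 'loss_allowance': 268, 'annual_fees': 200000}
```

The square root of three million is about 1,732, so rounding up to 2,000 leaves a margin of some 268 persons for losses, and 2,000 interviewees at $100 each gives the $200,000 a year stated above.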
The large scope of pan-sampling would require sponsors with large resources. An
average of a million dollars a year for ten years might be required. This means enlisting the
support of large scale agencies such as the Social Science Research Council which could
coordinate or the new federally financed National Science Foundation, were it to include the
Social Sciences eventually. The Army in its post-war research proposals for a basic inventory
and study of "human factors in total war" might adopt the pan-sample technique. Or the Navy's
office of Scientific Research and Development might finance it. Perhaps the surest way of
adequate financing free of any bureaucratic limitations on free scientific enquiry would be for a
group of large industrial corporations to pool appropriations for social research matching their
past or present appropriations for physical research. Thus the Standard Oil of New Jersey with
its more than 100,000 employees, its 160,000 stockholders, and millions of customers is
finding its human relations of avoiding strikes and building up goodwill among consumers and
suppliers (especially in foreign lands where its oil wells are located) fully as important to its
profits as physical relations of the products of oil cracking plants. Basic research to improve
these human relations of big business is vital to their long run prosperity and independence.
Funds invested in a research Institute of Human Relations to pan-sample a population should
yield rich eventual returns in increased predictability and control of collective human behavior.
II. Organizational Sampling
Pan-sampling is one major line of development of demoscopes in the future.
Organization sampling is another major line of further development. Pan-sampling extends the
number of observables; organization sampling intensively observes the functioning of a human
organization. It measures the complex, interhuman relations within an organization such as a
government or a large corporation as far as these are determinable by questioning, or
observing, sample sets of persons. It goes beyond current surveys in that it explores the
intangible but all important interrelations of persons to each other such as interrelated workers
and foremen, executives and office workers, or the organization's representatives and the
public. This is the newly developed field of sociometry — the measurement of interperson
relations in any group, and the interrelations of groups in a larger "grouping." Surveys of
interperson relations have used tests such as Moreno's for the attitudes of attraction-repulsion
between persons in specific situations as companions for eating together, for working together,
for rooming together, etc. Another recent though unvalidated type of measurement of
interperson relations is Chapple's "Interaction Chronograph" which yields five indices which
correlate with, and so predict, different leadership abilities such as of executives, foremen and
salesmen. The chronograph plots the seconds of silence and of talking, or other response, for
each of two persons during a standardized half hour interview and adds up these time intervals
to get indices indicating dominant or submissive personality traits, degree of social initiative,
speed, persistence and adaptability in interacting. Other types of tests, and indices recording
them objectively and quantitatively, are being developed in sociometry towards measuring
inter-personal relations — which is the stuff of which every group and every human
organization is built.
But organic sampling goes beyond sociometric observing of interperson relations in that
it observes the further relations between a person and a group. These person-group relations
may become very numerous and complex, according as they vary:
(a) In time from one period to another
(b) In space from one region to another
(c) In population composition:
(1) as to overlap; from groups with no members in common up to groups with all
members in common.
(2) as to hierarchy; from one level of similarly interacting persons (as in a
conversation "grouplet") to many levels (as in any large industrial organization
subclassified into departments which are subclassified into sections and so on
into hierarchy, or "grouping" of subgroups).
(d) In the indices which record the functioning and relations in all the possible kinds of
human groups.
The person-group relations can be readily studied by known techniques when they vary
in time, space, or in population composition, but techniques are little developed as yet for
studying them through the varying indices which indicate and record the many differing kinds
of person-group relations. Some of these indices which symbolize and record person-group
relations are things, or activities, such as a building which symbolizes the group using it, or a
flag which symbolizes a national group, or a name which symbolizes any formal group, or
one's work which symbolizes the group of one's coworkers. Some other of these indices which
symbolize person-group relations are the observed relations between people and officials (who
may be defined as persons formally representing a group in some way). These relations to
officials will be discussed a little further here to give a clearer picture of this one sub-type of
little explored person-group relation.
An office (i.e. formal role) is the behavior expected of some class of interacting person
such as a policeman or any government official, a father in a family group, a specialist in some
internally differentiated group, or a formal leader in any group whatever. Every person in an
industrial organization has at least one formal role, or office, that is, some behavior that is
expected of him as a worker, a foreman, a clerk, an executive or whatever may be his type of
formally expected interacting with other persons.
Now it is an obvious and yet basic fact that persons may behave differently as an official
and in non-official capacities and again may have one attitude towards an office and another
towards its incumbent. Thus in every pair of persons, A and B, where B is also an official, who
in this role may be called C, there are three possible sets of interperson relations. First, there
are the usual A to B and B to A relations, between two persons as currently studied in
sociometry. Second, there are the further relations of A to C and C to A, the relations between
the official and another person. Third, there are the relations (studied in psychoanalysis
especially) of B to C and C to B, i.e., the relations within a person between himself as an
official and as a nonofficial. Thus he may have attitudes of pride, ambition, indifference,
timidity, etc., towards his role or office; and as an official he may consider himself an excellent,
or inferior, or peculiar, or other kind of incumbent of that office, etc. Now these three sets of
relations among A, B, and C — among two persons and one-of-them-as-an-official-representing-a-group — are the essence of person-group relations and so of the relations within
any human organization. Research is needed to measure these relations separately in order
that from their combination we can predict and control increasingly all the human relations
within an organization.
A recent theory for dealing with organized human groups in all their possible complexity
has been developed by V. Cervinka with the aid of matrices and matrix products. His
"dimensional theory of groups" assumes two types of elements, namely people and "criteria"
which are symbols representing the group in any way. His criteria may include other members
of the group, officials, activities, things, places, names, in short anything whatever that
symbolizes something about the group to its members and is therefore group-forming in some
respect.
Each given observed criterion, recorded as an index, defines a group as those persons
having some degree of attachment greater than zero to that criterion. The degree of
attachment is measured by a highly generalized attitude scale which Cervinka has invented.
All the persons when recorded as row headings and all the criteria when recorded as column
headings and their interrelating attachments when recorded as cell entries form a
comprehensive matrix. The indices which summarize this matrix can measure the solidarity of
the group, i.e. its ability to endure and resist shock, disruption or decay. By substituting
different indices, other than indices of attachments, in the cells which interrelate the people
and their group-forming criteria, any dimensions of a group other than its solidarity can be
measured.
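The person-by-criterion matrix described above can be sketched as follows. This is only an illustration under stated assumptions: the persons, criteria, and attachment scores are invented, and the mean shown is a simple stand-in summary, not Cervinka's own attitude scale or solidarity index, which the text does not specify.

```python
# Hypothetical data: rows are persons, columns are group-forming criteria.
# A score of 0 means no attachment to that criterion.
PERSONS = ["Ann", "Ben", "Cal", "Dee"]
CRITERIA = ["flag", "shop", "union"]
ATTACHMENT = [
    [3, 0, 2],  # Ann
    [1, 4, 0],  # Ben
    [0, 0, 0],  # Cal
    [2, 1, 5],  # Dee
]


def group_for(criterion):
    """Persons whose attachment to the criterion is greater than zero."""
    col = CRITERIA.index(criterion)
    return [p for p, row in zip(PERSONS, ATTACHMENT) if row[col] > 0]


def mean_attachment(criterion):
    """Illustrative summary index: mean attachment among the group's members."""
    col = CRITERIA.index(criterion)
    scores = [row[col] for row in ATTACHMENT if row[col] > 0]
    return sum(scores) / len(scores) if scores else 0.0


print(group_for("flag"), mean_attachment("flag"))
```

Each column thus defines its own group, so one matrix can record many overlapping groups at once.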
Organization sampling means, then, to sample not only the person-person relations that
hold in any grouplet but also the person-group relations, which include the triple set of
relations among officials and persons. Next, these person-official relations will pyramid
whenever person A is an official and person B is a different official, thus introducing into that
pair of persons two triple sets of relations. Then further, persons may have more than one
office as in being both secretary and treasurer of a society, or a foreman and trainer in a shop,
or a worker and sole expert in a certain operation. When multiple offices and less formal roles
in one person are combined with many persons in an organization the inter-human relations
become enormously many and complex. This is partly why organization research has
accomplished so little to date — but recently theory and techniques such as those of matrix
algebra have been developed for handling this complexity.
Thus organization sampling will use as one technique the matrices of matrix algebra, a
branch of mathematics much needed in statistical research. A matrix is a table arranging items
in rows and columns. A "first degree" matrix consists of one row, listing something for every
person in the sample. This is ordinary sample surveying, or polling, today. More penetrating
social surveying requires a second degree matrix which lists each interviewee as heading a
row and again as heading a column and enters in the cell where the row of person A crosses
the column of person B the observed interrelation of A to B. But organization surveying goes
still further in requiring a third degree matrix of rows, columns and vertical arrays. The vertical
arrays may represent the officials or person-in-formal-roles representing the group in some
specialized or partial way. Thus the entire second degree matrix is repeated once for each
official. If N is the number of officials in a group of persons, P in number, then the number of
interrelations is the product of P x P x N. Thus organization sampling must involve at least
three factors, arranged in a third degree or solid matrix, namely (1) a set of persons, (2)
interrelated with each other, and (3) with officials. The operational test of organization sampling
is that it requires such a third degree matrix to record all the relations that have been observed
by appropriately representative sampling in the organization. Obviously such organization
sampling has never yet been systematically done or even attempted. But it must be developed
by research if the intricacies of human organization are to be so scientifically observed as to
yield generalizations and laws of interhuman relations which in turn will enable us increasingly
to predict and control our highly organized society.
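The third degree, or solid, matrix can be pictured with a small sketch. It assumes hypothetical persons and officials and stores each observed relation as free text; the only thing taken from the text is the layout, one persons-by-persons table per official, giving P x P x N cells.

```python
def solid_matrix(persons, officials):
    """Nested table indexed [official][person_a][person_b]; every cell starts empty."""
    return {c: {a: {b: None for b in persons} for a in persons} for c in officials}


def cell_count(matrix):
    """Total number of cells, which should equal P x P x N."""
    return sum(len(row) for plane in matrix.values() for row in plane.values())


# Hypothetical organization: P = 4 persons, N = 2 officials.
persons = ["A", "B", "C", "D"]
officials = ["foreman", "treasurer"]
matrix = solid_matrix(persons, officials)

# Record one observed relation of A to B with respect to the foreman's office.
matrix["foreman"]["A"]["B"] = "defers to"

print(cell_count(matrix))  # 4 x 4 x 2 cells
```

The count grows with the square of the number of persons, which is one reason representative sampling of the cells, rather than complete enumeration, would be needed in a large organization.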
III. World Sampling
In addition to pan-sampling and organization sampling, a third line of further
development of demoscopes is world sampling. World sampling means simply the attempt to
sample representatively all the people currently living anywhere on the earth. It is sampling
humanity. Its findings could generalize about human beings unrestricted to one region and its
culture as all surveys have been restricted hitherto. World sampling would thus enable the
social sciences to transcend space and local culture. It would tell us what is currently true for
all people and show what further is different for peoples in different regions. Social
generalizations would outgrow any existing provincialism and tend to become universal social
laws.
World sampling is not a mere dream for the distant future but is being approached
today. Thus the number of people interviewed in the current surveys in one month has already
become much greater than needed for a world sample.5 But their distribution is geographically
unrepresentative at present since these surveys are all in some thirty of the more developed
nations which include less than half the people of the world.
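The arithmetic behind this claim, given in note 5, can be checked in a few lines. The sketch assumes a world adult population of 1,600,000,000, the figure whose square root is the 40,000 interviewees stated in the note, and uses the note's estimate of 100,000 interviews a month.

```python
import math

# Assumed figure: the adult population whose square root gives note 5's 40,000.
adult_population = 1_600_000_000

# Rule of thumb from note 5: sample size is about the square root of the population.
rule_of_thumb_sample = math.isqrt(adult_population)

# Note 5's estimate of interviews conducted each month by all agencies combined.
monthly_interviews = 100_000

print(rule_of_thumb_sample, monthly_interviews > rule_of_thumb_sample)
```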
Towards geographically representative world sampling a world association of sampling
surveyors is being formed. The international conference of this association in Williamstown in
September, 1947, adopted its provisional constitution as a "World Congress." Its functions may
include progressively organizing indigenous surveying agencies in countries now lacking them,
and coordinating national agencies in an international network which will cover all the countries
which permit or can afford such agencies. Most agencies are interested in participating in an
international network which could service the United Nations agencies, especially UNESCO.
The problem is mostly one of financing a network and of promoting agencies in unsurveyed
areas and setting up and enforcing minimum standards of scientific surveying.
World sampling has practical uses, as well as theoretical use in universalizing the data
of the Social Sciences. For the world today wants security from another world war. World
surveys could serve as a "barometer of international security."6 They could gauge each month
the rise or fall of various international tensions whose breaking point is war. They would warn
of increasing tension with events and reflect any lessening of tensions when remedial steps
were taken. Along this line, a study of "tensions affecting international understanding" has
been started by UNESCO in the form of a comprehensive survey of research projects needed.
Several hundred psychologists and social scientists of all countries have been invited to
cooperate on this wide ranging collection of research projects on international tensions. For all
these researches on tension the one essential tool is an international demoscope. And this
demoscope must be developed towards world sampling if it is to warn of world tensions.
There are many other practical ways in which demoscopes, by sampling the world's
population, can service the United Nations and make a scientific contribution towards the goal
of One World.7 Every U.N. agency and commission requires information in its field about the
people of the world and many of these facts can be gathered by the sampling of a world
demoscope. Thus the Public Information Department and the Statistical Department of U.N.'s
Secretariat will need a world demoscope either in their own organization or available for
contractual surveying. The world association of public opinion agencies noted above aims to fill
this need in contracting to survey as wanted by U.N.
Finally, world sampling has broader implications. In proportion as world polls are
achieved the voice of the people of the world will be accurately amplified and directly heard in
the United Nations. World polling is thus a contribution from the social sciences for
implementing democracy on the new world scale. It builds a bit more of world government
which is by and for the people of the world.
IV. Time Sampling
Much of the pan-sampling, organosampling, and geo-sampling described above must
be resampled in time. Resurveys at regular periods are essential to observe current change, to
analyze past causes, and to forecast future trends. Dynamic phenomena — the living of
people — require more than one survey on one date. A time series must be observed
thoroughly for predictions of the near future. Analysis of the correlations of these time series is
our best hope for learning more of their causation. Social laws of causation can only be
induced from careful checking and rechecking of sequences and correlated celerations under
varying conditions, as observed by periodic resampling. Therefore demoscopes with long term
programs are needed; and for this, financial support must be assured for a decade or more
ahead. Demoscopes of all kinds, from public opinion institutes to market research agencies,
from radio listener or magazine readership to social surveys of the people's diet, health or
recreation habits must increasingly organize themselves so as to routinely repeat their
inquiries and cumulate periodic statistics just as the census does. In fact, census bureaus and
sample-surveying agencies of governments may be expected to merge increasingly in the
future.
Time sampling means more, however, than simply repeating surveys in consecutive
periods. Time sampling may also mean surveying only a sample of the sub-periods within an
overall period. This means economizing effort by selecting representative periods such as the
activity of five minutes in each hour of a day, or of one day in each month, or of a typical month
in the year.
Time sampling applies sampling theory to a time universe. It seeks to estimate the amount of
some dynamic phenomenon in a whole period by observing it in only a part of the period.
Provided the part represents the whole well, the part may be very small and thus make the
observing cheaper and quicker. Sampling theory needs adapting for this use, however: for example, the usual formulas for standard errors
of sampling cannot be applied to time samples since observations in time periods are not
usually independent of each other as assumed in random sampling. The principles of random
sampling must be modified in time sampling due to the fact that the values of indices from
consecutive periods are apt to be correlated.
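Why correlated periods upset the usual standard error can be shown with a short calculation. The sketch rests on an assumption not made in the text: that observations k periods apart correlate as rho to the power k (a first-order autocorrelation pattern). Under that assumption the variance of a mean of n observations is the sum of all pairwise covariances divided by n squared.

```python
def variance_of_mean(n, sigma2, rho):
    """Variance of the mean of n equally spaced observations.

    Assumes each observation has variance sigma2 and that observations
    k periods apart have correlation rho ** k (an illustrative assumption).
    """
    total = sum(rho ** abs(i - j) for i in range(n) for j in range(n))
    return sigma2 * total / n ** 2


independent = variance_of_mean(10, 1.0, 0.0)  # reduces to the usual sigma2 / n
correlated = variance_of_mean(10, 1.0, 0.5)   # larger when periods are positively correlated

print(independent, correlated)
```

With positive correlation between consecutive periods the true variance exceeds the random-sampling figure, so the usual formula would understate the sampling error of a time sample.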
Resampling in time, if at regular intervals, measures social forces. For differences between dates can measure velocity and celeration of change -- acceleration if the rate of
change is speeding up, deceleration if the rate of change is slowing down. Then the product of
the amount of celeration and the population celerated can define an effective social force.8
Thus defined, social forces become measurable and demoscopes can measure political
forces, economic forces, educational forces, health forces, religious forces and any other type
of social force. Each force is expressed in units of the observed index of change. For
comparing them, forces must be converted into comparable units, such as percentages,
standard deviations, money, time, or simply all-or-none attributes. Social forces thus become
determinate with an accuracy proportional to the accuracy with which that type of social
change is observable. (See our theory of measurement above for all-or-none, ordinal, or
cardinal degrees of accuracy.) This theory of social forces then goes on to define operationally
a causative social force, or simply "a cause" as any antecedent index which under the
specified conditions of observation correlates with the effective force. The extent to which
antecedent events, A, are causes of consequent events, B, becomes determinable
increasingly with research.9
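The definitions of note 8 (V = DI/T, C = DV/T, F = CP) can be worked through on a small case. The index values, interval, and population below are invented for illustration; only the three formulas come from the text.

```python
def velocity(index_before, index_after, interval):
    """V = DI / T: change in the index per period."""
    return (index_after - index_before) / interval


def celeration(velocity_before, velocity_after, interval):
    """C = DV / T: change in velocity per period."""
    return (velocity_after - velocity_before) / interval


def social_force(celeration_value, population):
    """F = C * P: celeration multiplied by the population celerated."""
    return celeration_value * population


# Hypothetical index observed on three dates one period apart (say, percent approving).
i1, i2, i3 = 40.0, 44.0, 50.0
T = 1.0

v1 = velocity(i1, i2, T)       # first-interval velocity
v2 = velocity(i2, i3, T)       # second-interval velocity
c = celeration(v1, v2, T)      # positive, so the change is accelerating
force = social_force(c, 1000)  # for a hypothetical population of 1,000

print(v1, v2, c, force)
```

At least three dated observations are needed before any celeration, and hence any force, can be computed, which is why regular resampling matters.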
But here again more funds for social research are needed if the complex conditions for
a given social force to operate are to be disentangled — as by means of pan-sampling.
Analysis of social causes in time sampling thus depends on pan-sampling. And further, if
prediction of organized universal human behavior is desired, then social researchers must also
be financed for organization sampling and for world sampling.10
Notes
1. A "demoscope" is a social fact-finding instrument which denotes much more than the term
"survey" since a demoscope includes the organization of personnel who plan, execute, and
report a survey as well as the physical equipment of office, tabulating machinery, schedule
cards, communication and transportation equipment, and all other equipment. A demoscope means
the complete instrument for scientific observing of a human population including its
component functioning of survey planning, questionnaire construction and pretesting,
sample selecting, interviewing, recruiting, training and supervising of interviewers, editing,
tabulating, computing, reporting, and publishing reports. A demoscope may observe
people's opinions, or their information, or their behavior or their condition in any respect and
is thus more inclusive than a "poll" which usually observes opinions. It is less inclusive than
a "survey" as it means only "social survey" and not geologic surveys or other types of
surveys. "Surveys" further imply only the action of observing people and not all the other
parts of the instrument defined as a "demoscope." The four essentials of any demoscope
are that it be
(1) a scientific instrument;
(2) for observing facts;
(3) about a representative sample;
(4) of a specified human population.
2. Better research involves research which is guided by hypotheses to be tested. Such
purposeful research will keep pan-sampling from becoming a mere mechanical amassing
of thousands of frequency distributions and intercorrelation coefficients.
3. Lack of funds has only recently become the chief deterrent of pan-sampling. Until recently
adequate theory was also lacking. But in recent years the necessary undergirding of theory
has been built up, namely:
For comprehensive dealing with many variables, symbolic logicians have developed
the calculus of classes and of relations and statisticians have developed the statistics of
attributes and matrix algebra.
For precise and reliable dealing with each variable the theory of measurement and
the theory of sampling have been greatly developed: The theory of measurement begins
with calling any observable an "attribute" (i.e. an all-or-none variable) and assigning it
values of 1 or 0 for its presence or absence in any situation studied; then distinguishing
ordinal degrees of "some, more and most" of the attribute; then in standardizing a cardinal
unit of it and counting the number of such units in the situation. By this theory of
measurement anything observable by man can be quantitatively measured (with varying
degree of precision and reliability, of course).
The theory of sampling develops the laws of adequate size and of
representativeness and indices of sampling error, all of which have enabled inferences,
with specified degrees of reliability, to be made from samples small enough to be practical.
For relevant dealing with many variables, multiple correlation and factor theory have
developed powerful tools for analyzing and synthesizing systems of variables and for
measuring the approach of these systems to becoming closed systems within which
phenomena are perfectly predictable.
4. State polling agencies have been started in Iowa, Texas, Minnesota, California, and
Washington and centers for basic social research developing towards pan-sampling in
comprehensiveness have been started in Cambridge, Massachusetts, Ann Arbor,
Michigan, and Seattle, Washington, for example.
5. Using the rule-of-thumb that a sample should be about the square root of the population
sampled, gives 40,000 interviewees, as the square root of the world's adult population of
some 1,600,000,000. The surveying agencies of all the various countries are estimated to
interview more than 100,000 people every month.
6. For fuller statement of this barometer proposal see: Dodd, S. C., "A Barometer of
International Security," Public Opinion Quarterly, Summer, 1945. For discussion of world
surveys more generally see: Dodd, S. C., "Towards World Surveying," Public Opinion
Quarterly, Winter, 1947.
7. The current diverging of Russia and the Western nations is expected to limit joint surveys
and publicity about surveys, but not the actual use of demoscopes. Russia is reported to
be using sampling surveys for her internal purposes. Any scientific technique will be used in
the long run by any regime simply because the scientific method assures that the technique
works. International surveys in non-controversial realms may be expected to develop after
isolated national development and then become gradually extended as much as the
various governing regimes permit.
8. If DI denotes the difference in some index observed on one date and reobserved on another
date and T denotes the time interval, then DI/T = V = the velocity, or time rate, of change. A
difference between two velocities, DV, divided by the time interval defines the celeration,
DV/T = C (or C = DI T^-2). Multiplying this by the celerated population, P, defines an effective
social force as F = CP. This is expressed in units of the index of social change per period
per period for the whole population.
9. For greater accuracy such "causes" are better called "precorrelates." For the semi-philosophical
word "cause" has conflicting or vague meanings while the theory of statistical
correlation permits rigorous reasoning and prediction via the regression equation B = r_AB A
(A and B being in standard deviation units). This theory of causation and of social forces is
more fully and exactly developed in the author's "Dimensions of Society," (Macmillan, 1942,
p. 944) and "Systematic Social Science," a textbook offset for critical development and
obtainable from the Department of Sociology, University of Washington, Seattle.
10. Readers interested in systematic theory or methodology, may note that the four lines (or
"dimensions") of development of demoscopes described in this paper are rigorous
deductions from the author's dimensional theory in Sociology. (See "Dimensions of
Society," Macmillan, 1942, p. 944, or the more inclusive volume, offset for critical revision,
"Systematic Social Science," 1947, p. 785). This methodological theory, which has been
called the "S-system" of concepts, asserts that anything observable by human beings is
classifiable into four "sectors," namely time (T), space (L), people (P), and indices of all
residual characteristics (I). These are then subclassified in hierarchic levels as finely as
needed. Thus, a second level of subclasses is based upon the thoroughness of observation
which is measurable by the mathematical exponent, as follows:
S-symbol | Meaning in general | Meaning as applied to demoscope
S   | any set of data, i.e. recorded observations | demoscopic data, data from a sample survey
I^0 | qualitative characteristics | different questions = "comprehensiveness" of content
I^1 | quantitative characteristics | intervals, or degrees, of an answer
I^2 | correlated characteristics | intercorrelation indices = pan-sampling
P^0 | a person, a case, an individual | an individual interviewee
P^1 | a plural, a class of persons | samples of (unrelated) interviewees
P^2 | a grouplet, a set of interacting persons | samples of person-person relations of interviewees
P^3 | a groupage, a set of persons, grouplets and groupings | samples of person-group relations = organization sampling
L^0 | a spot, a point | the number of demoscopic centers
L^1 | a line, a length | the distance from centers to interviewees
L^2 | an area | the areas surveyed = world sampling at maximum
T^0 | an instant, a date | surveys at one date
T^1 | a speed, i.e., divided by a period | speed of change on resurveying
T^2 | a celeration, i.e., twice divided by time | celeration of change on resurveying = time sampling of social forces
(Exponent denotes degree of the matrix.)
The S-formula, of which every set of data from every survey whatever is a particular case, is
S = T^t; L^l; P^p; I^i (where the semicolon denotes any mathematical or logical operator as +, -, x,
÷, =, ⊃, etc.). This is the "quantic S-formula," where the S-formula subclassifies data only as far
as the second level and where the degree of the exponent (called a quantic in mathematics) is
the basis for subclassifying each of the four sectors. The four forms of sampling described in
this paper were deductively derived from this S-formula as its special case when:
(a) the data, S, are limited to sample surveys, and
(b) the exponents are higher than unity.
The maximal exponents specify extremes as yet unattained in sample surveying. The
dimensional symbols yielded the four types of sampling described in this paper as rigorous
deductions from the S-formula — even if the deducer knows nothing about polling. This
usefulness of dimensional analysis extends further in predicting future types of sampling
surveys which are described by still higher exponents. But these are not even imagined as yet
by most polling practitioners since our folk language has not developed words for
communicating such subtle and intangible meanings about human relations.
Section 2 on Studies on Techniques of Polling
Focus on two factors of pollers’ Acts, A and Materials, M.
U:56-141
#8. The "Steps-And-Parts" Model for Polling
a methodological theory for public opinion research
by
Stuart Carter Dodd
And
Chahin Turabian
I. Introduction
This paper presents a methodological theory of public opinion research which we call
the "steps-and-parts" model for polling. This theory is an analysis of polling, not an analysis of
opinion. It deals primarily with the researcher's role-actions in polling and not with the
respondent's reactions in the interview.
But any polling also involves the opinions or speech behavior of the respondents. The
findings of any poll are always a complex product of the two factors: the unbiassed opinion of
the respondents, and the polling instrument for observing it. The public's opinion as observed
in a poll always has the variation of the demoscope, or instrument error, combined in unknown
degree with the unbiassed variation of the opinions of the respondents.
This model for polling seeks to standardize the instrument so that its error or biassing
effects will be least and the variance of the public's opinion will be most of the total observed
variance. The existence of the instrument variance (which includes both random sampling
errors and systematic observational errors) is the reason for needing a methodological model.
An appraisal of the sources, kinds, and amounts of variation due to the instrument is
essential to the comparability of international polls. A methodological theory of polling, which is
a set of generalized and tested techniques for calibrating the instrument, is prerequisite to the
development of international polls. No Barometer of World Opinion can be established as a
scientific instrument until its degree of accuracy is at least known.
Trying to refine our model scientifically may involve the following procedures:
1. Specifying the variables;
2. Specifying the relations assumed among the variables;
3. Specifying the formula derived from these variables and assumptions;
4. Specifying the experimental testing of the model;
5. Specifying the statistical fitting of the model.
These five procedures will be developed in the next five sections.
II. Instrument Variables at Issue
A. The 24 Steps in Polling
Since an operational definition is essentially a statement in verbs which names the
operations performed (including all materials used and relationships of all orders), defining
polling operationally is analyzing it as an activity, as a set of behaviors in the behavioral
sciences. It means describing polling, always as some kind of acting, some sort of steps taken.
To emphasize this we use the present active participle ending in "-ing" wherever possible.
Thus we speak of the eight stages or major processes in polling as
1) "administering" for "administration";
2) "designing";
3) "questioning";
4) "sampling";
5) "interviewing";
6) "analyzing";
7) "reporting"; and
8) "interrelating" all the parties for public relations.
As a consequence, the major feature of our "steps-and-parts" model is a standardizing
list of the sequence of steps or actions the poller takes. Thus the eight stages in polling are
each split into three sub-stages: early, middle, and late. These sub-stages are three sub-periods
of time (past, present, and future tenses of the mid-stage). But they are usually spoken
of by what they specify, such as the "preparing, executing, and completing" of the stage, or
they are spoken of as dealing with the "causes, contents, and consequences" of the stage at
issue.
These eight stages, when subdivided each into three sub-stages, yield the twenty-four
basic "steps" in polling which our theory proposes to standardize. With these twenty-four
steps having standard names, any variations of merging steps together, or subdividing a step
into sub-steps, or modifying a step in any way can be easily determined and communicated.
As a further consequence, all the itemized actings in our model can be observed,
repeated, and tested for any of the desired qualities of scientific studies such as reliability, validity,
universality, accuracy, inclusivity, and predictivity.
The twenty-four steps are shown in the first column of the matrix of step-part cells,
shown as the General Table below.
The Variables in the Model: Every poll (A) is analyzed into 2 factors, i.e., the public’s opinion (α) and the
poller’s procedures (a): A = α x a.
The opinions (i.e., speech reactions in a poll) proposed as behavioral criteria (B)
for this model internationally are: assertions (as in a census) of age, sex,
schooling, income, occupation, nationality, religion, and languages spoken.
The procedures of polling (i.e., role-actions of pollers) are analyzed here into 8
stages and 24 substages called "steps" in polling. Each step may have 8 type-parts
(which are the dimensions of any human behavior) yielding 192 "step-parts" in the cells.
Its Central Hypothesis: Standardizing the step-parts improves the comparability and validity of polls.
Its Testing: Reliability (r_AA’) (primes indicate reobservings) and validity intraclass correlations
(r_AB,BA’) from repolling the criteria opinions, with each step-part in turn varied
alone over poll and repoll values, can test how standardizing may improve
validity. r_AA’,A’A = 1 if σ_a = 0 and r_BA = r_BA’,AB.

Eight Type-Parts (columns of the matrix)
Code | Symbol | Type-part | Includes | Question answered
jj1 | A | Acts | behavior | What is done?
jj2 | P | Actors | pollers, pollees, publics | By whom?
jj3 | V | Values | wants, goals, desiderata | Why?
jj4 | T | Time | dates, sequences, durations, etc. | When?
jj5 | L | Space | places, distances, areas, etc. | Where?
jj6 | M | Materials | equipment and funds | With which things?
jj7 | W | Symbols | words, numbers, records | With which symbols?
jj8 | C | Residual Conditions | i.e., all else | How else?

Stages and Steps (rows of the matrix)
Stages or periods | Preparing | Executing | Completing
Administering an agency | 11j | 12j | 13j
Designing a poll | 21j | 22j | 23j
Questioning | 31j | 32j | 33j
Sampling | 41j | 42j | 43j
Interviewing | 51j | 52j | 53j
Analyzing statistically | 61j | 62j | 63j
Reporting | 71j | 72j | 73j
Interrelating all parties | 81j | 82j | 83j
Cell codes run from 111 to 118 in the first row (11j) and from 831 to 838 in the last row (83j).

The 192 “step-parts” cells in this matrix are derived by cross
classifying the 24 steps with the 8 type-parts or factors. Each
cell entry or “step-part” is defined as a logical product or joint
occurrence of its row and column entries. The cells spell out
an operational definition in three tenses for each of the eight
mid-stages by telling who does what, when and where, why and
how.
Each step-part represents a standardizable unit of operating
procedure in polling.
Each cell’s code is a three-digit number, jjj, where: j stands for
each digit, or subclass, in turn; j**, the hundreds digit, stands for
the stages; *j*, the tens digit, stands for the substages; **j, the
units digit, stands for the type-parts; “0” stands for the absence
of subclasses; and “9” stands for mixtures of subclasses.
B. The 8 Type-Parts of Polling
As elsewhere in life, every act in polling is behavior-in-a-situation and always has a
context. It has actors. It has causes and consequences in time. It is done somewhere. It is
conditioned to varying degrees by behavioral, material, symbolic, or other circumstances. The
new contribution in our dimensional analysis has been a systematic attempt to factor all such
contexts into a few standard yet all-inclusive categories which we call here "type-parts."
We propose to use the following eight "type-parts":
Table 3 – The 8 "Type-Parts" or Sectors in Polling
1 Acts - - - i.e., behavior;
2 Actors - - - i.e., persons, who appear here in any of three forms:
a) the public sampled, whose opinion is wanted;
b) the respondents, as reactors who represent a public;
c) the pollers, as research role-actors who make the poll;
3 Values - - - which are "desiderata," defined as whatever people say
they want in the standardizing situation of a poll;
4 Time - - - which appears as dates, occurrences, tenses, sequences, durations,
speeds, or accelerations;
5 Space - - - which appears as locations, distances, areas, or densities;
6 Material - - - which appears as equipment and funds;
7 Symbols - - - which appear as words, statistical variables, or other symbols, and all
records; and
8 Residual conditions - - - (such as "wartime," "peacetime," oral or written interviewing,
anonymous or signed, etc.) which complete the context.
These eight type-parts spell out an operational definition of polling in specifying what is
done? by whom? why? when? where? with which materials and symbols? and how else?
The four prime type-parts (acts, actors, time, and space) are necessary factors of any
human behavior-in-its-situation, for every act requires an actor and a finite time and space in
which to occur. The four other type-parts always appear but in many alternative forms. In
extensive testing of their use in the social sciences by our dimensional analysis, as reported in
our Dimensions of Society and Systematic Social Science volumes (Refs. 1 and 2), we have
found them the most useful set of concepts for both practical and theoretical purposes.
The last category of residual conditions is the logical complement to all the others and
brings forth an entirely inclusive system. The main problem in any science is to push back the
frontier of the unknown by subclassifying this residual category. These eight type-parts seem
to date an adequate set of basic concepts for polling methodology.
However, research may modify them in the future to develop a more predictive set.
C. The Step-Parts
Let the twenty-four steps and the eight type-parts be cross-classified in a matrix of all
possible pairs of their row and column entries. This yields 192 cell entries which we call the
"step-parts" of polling, as each cell specifies one type-part of one step. Each step-part
represents a standardizable unit of operating procedure in polling.
The General Table exhibits this step-parts matrix. Each cell's code is a three-digit
number, jjj, where: j stands for each digit, or sub-class, in turn; j**, the hundreds digit,
stands for the stages; *j*, the tens digit, stands for the sub-stages or steps; **j, the units digit,
stands for the type-parts.
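The coding rule can be made concrete with a short sketch. The Python below is only an illustration and not part of the original model: the names STAGES, SUBSTAGES, TYPE_PARTS, all_codes, and decode are hypothetical, and the special digits 0 (absence of subclasses) and 9 (mixtures) are deliberately left out.

```python
# Illustrative sketch of the three-digit step-part code (names are hypothetical).
STAGES = [
    "Administering an agency", "Designing a poll", "Questioning", "Sampling",
    "Interviewing", "Analyzing statistically", "Reporting",
    "Interrelating all parties",
]
SUBSTAGES = ["Preparing", "Executing", "Completing"]
TYPE_PARTS = [
    "Acts", "Actors", "Values", "Time", "Space",
    "Materials", "Symbols", "Residual conditions",
]


def all_codes():
    """Return every step-part code: hundreds = stage, tens = substage, units = type-part."""
    return [
        100 * stage + 10 * substage + part
        for stage in range(1, len(STAGES) + 1)
        for substage in range(1, len(SUBSTAGES) + 1)
        for part in range(1, len(TYPE_PARTS) + 1)
    ]


def decode(code):
    """Split a step-part code into its (stage, substage, type-part) names."""
    stage, rest = divmod(code, 100)
    substage, part = divmod(rest, 10)
    if not (1 <= stage <= 8 and 1 <= substage <= 3 and 1 <= part <= 8):
        raise ValueError(f"{code} is not a single step-part cell")
    return STAGES[stage - 1], SUBSTAGES[substage - 1], TYPE_PARTS[part - 1]


codes = all_codes()
print(len(codes))    # 24 steps x 8 type-parts = 192
print(decode(522))   # ('Interviewing', 'Executing', 'Actors')
```

Cross-classifying by nested loops mirrors the definition of a step-part as one type-part of one step; a lookup table would serve equally well.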
First, the twenty-four steps are in a time sequence. There is, however, flexibility in
overlapping early and late steps but not in overlapping middle steps (except for possible
interchanging of questioning and sampling stages). Secondly, the type-parts may be complexly
intercorrelated with the steps. Thus the same actors do many of the steps, introducing
correlation between the "actors type-part" and those steps. Thirdly, each step forms with each
type-part what logicians call a "logical product" which, in terms of behavior, means a joint
occurrence.
The specifications for each cell in the Table are neither exhaustive nor final. Cell
contents can grow and change with better knowledge and greater specialization. The cells at
every order can be further subdivided, thus allowing for unlimited development of this model.
This methodology should be refined by trial among pollers in order to specify the categories
with least ambiguity.
III. Opinion Variables Taken as Criteria
To validate our model on a worldwide basis, a representative set of opinions should
serve as criteria. We propose the set of eight questions listed below which stratifies the
universe of opinions by the seven chief social institutions -- the most universal forms of culture
in any people, time, or place. Furthermore, these eight questions are the most utilized, some of
them appearing on almost every poll made anytime anywhere.
Code  Variable              Institution represented      Formula  Notes on logical or statistical form of each question
0     (Unclassified)        (Any)                        (A0)     (Any)
1     Sex                   Family                       A1       All-or-none
2     Age                   Health                       A2       Cardinal, potentially exact
3     Schooling (years of)  Education                    A3       Cardinal, inexact
4     Income                Economic                     A4       Cardinal, inexact and often biased in reporting
5     Occupation            Economic (and social class)  A5       Qualitative varieties; difficult to standardize
6     Nationality           Political                    A6       Less than 100 qualitative varieties; well standardized
7     Language(s) spoken    Media                        A7       Hundreds of qualitative varieties; well standardized; multiple in some persons
8     Religion              Religion                     A8       Many qualitative varieties; often difficult to ask
9     Mixtures of above     Combinations                 A9       Composite
The answers to these eight questions are among the most verifiable in polling since
many of them can be checked against census and other records. They are as truly opinion
variables as any questions probing attitude, for they record the speech behavior of the
respondent in response to a standardized question in a standardized interviewing situation ―
which is our definition of a "polled opinion."
To serve as criterion variables this or any other set of questions would be asked in a
poll and a repoll in which ideally just one step-part varied while the other 191 step-parts were
held constant. The responses to each criterion question in the poll and repoll would be
correlated with each other to measure the reliability of that step-part and with an accepted
authoritative index (such as census results) of each criterion variable to measure the relative
validity of the polling as it varied over the two values of the step-part at issue.
IV. Relations among the Variables
At present we see three kinds of relations among the instrument variables:
1) Temporal relations of sequence among the twenty-four steps (partly created by
definition of the sub-stages).
2) Product relations as logical pair products or joint occurrences between the steps and
type-parts (created by definition of the step-part cells). When these are quantified,
the logical products of classes become intercorrelations of variates. Some
correlations are known and controllable, others uncontrolled, and probably still
others unknown as yet. The most important is the zero correlation because
uncorrelated variables are equivalent to controlled variables.
3) Alternative relations as logical sums among sub-steps and sub-type parts wherever
alternative ways exist.
We have tried to carry the analysis in this model down to step-parts which are logical
products (i.e., combinations of necessary factors) and stop short of sub-steps and sub-parts
which are often logical sums (i.e., combinations of optional and alternative addends). Thus
"interviewing" is a necessary factor in any poll, while alternative forms of interviewing exist, as in
interviewing face to face, by phone, by mail, in groups, covertly, etc.
Each of the eight stages is a necessary factor, not an alternative addend, in polling. The
complete polling is their logical product, not their logical sum. Thus we can predict that every
poll will always have the eight stages in some form, however rudimentary, vague, or inexplicit;
but we cannot predict that any one of the alternative forms of each stage will always occur in
polling.
We think that our step-parts analysis comes nearer to separating cleanly necessary
factors from alternative addends of lower probability of occurrence than other existent theories
(with similar explicitness of detail).
The best test for this distinction is the "vanishing result," which is an important but ill-recognized
principle in social science. Just as in arithmetic 2 x 0 = 0, while 2 + 0 = 2, so in
logic any logical product vanishes if any factor of it vanishes. Can polling occur if there is no act
of asking a question? if there are no pollers? if the time period is zero? with no space? with no
values or stimuli to cause it? with no materials whatever? with no words or other symbols? with
no context of any other sort? Obviously the answer to these questions is "no," and therefore the
type-parts are factors, not addends, and polling is their product, not their sum.
The distinction between a logical product and a logical sum is vital for improving
prediction in social science. In a product every factor being always present in some non-zero
amount, predictive statements with 100 percent probability can be made; whereas in a sum
any addend being either present or absent, predictive statements have any degree of
probability from zero on up.
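The vanishing-result test can be illustrated numerically. The sketch below is an assumption-laden toy, not the authors' procedure: each type-part is given a hypothetical amount of 1 (present) or 0 (absent), and the helper names as_product and as_sum are invented for the illustration.

```python
from math import prod

# The eight type-parts, each with a hypothetical "amount present" of 1.
TYPE_PARTS = [
    "acts", "actors", "values", "time", "space",
    "materials", "symbols", "residual conditions",
]


def as_product(amounts):
    """Treat the type-parts as factors: the whole vanishes if any one vanishes."""
    return prod(amounts.values())


def as_sum(amounts):
    """Treat the type-parts as addends: the whole survives a missing part."""
    return sum(amounts.values())


complete_poll = {name: 1 for name in TYPE_PARTS}
no_actors = {**complete_poll, "actors": 0}   # remove one factor

print(as_product(complete_poll), as_sum(complete_poll))   # 1 8
print(as_product(no_actors), as_sum(no_actors))           # 0 7
```

Removing a single type-part drives the product to zero while the sum merely shrinks, which is the arithmetic behind calling the type-parts factors rather than addends.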
Turning from the instrument to the opinion variables, among the whole world's living
adults we may expect:
sex to be uncorrelated with age, family income, nationality, language, and religion; more
correlated with education; and highly correlated with occupation.
age to be positively correlated with education among youth, with some occupations, and
with income, especially among males.
education to be somewhat correlated with all the other seven variables, etc.
But most of these correlations are local or speculative until we get representative data
on all humanity. The problem of their undetermined intercorrelations can be avoided by taking
each opinion variable in turn as criterion for testing the instrument variables. An average or
suitable generalizing of the results of such tests is all that pollers can hope for until a world
demoscope provides better data to theorize upon.
Finally, the external relations expected between the opinion variables and the
instrument variables are the crux of this matrix model for polling. We hypothesize that the true
opinions and the instrument are two factors forming some kind of product (and not forming a
sum).
For that we have only a few experimental facts such as in our logistic diffusion (Ref. 3).
But we have a growing appreciation from experience with dimensional analysis that an
algebraic sum represents the separate acts of individuals in a plural ― as when the poller
counts the "yes" responses to a question and reports that sum ― while an algebraic product
represents the interaction of members of a group — as when people tell and retell some news
and so multiply the number of knowers of it. The principle can be epitomized as: "Sums
represent plurals; products represent groups."
V. Formulas Deduced
Then the joint occurrence of the unbiassed opinions and the polling instrument is
expressible as follows:
A = a x α        (Eq. 1)
(the hypothesis that observed data are products of instrument times phenomena)
Here A = the polled opinions as observed
a = the instrument variable, i.e., the simple or complex act or set of acts of the poller,
whether all 192 step-parts or just one of them
x = some form of product, the generalized multiplication sign
α = (alpha) the "true" or "unbiassed" opinion(s) of the respondents (i.e., the opinion if
observed with a perfectly reliable instrument)
In this general formula each of the three variables may be a logical class, or a sentence,
or a statistical variate, or a set of variates, or any combination of them. The form of logical or
mathematical product must then be appropriate to the form of the variable, i.e., a logical
product of classes, or conjunction of sentences, or statistical product (such as a product-moment)
of variates, or a product of vectors, or of matrices, or of lattices, or even forms of
mathematical or semiotic products not yet invented.
If the instrument were perfectly standardized, a would be constant. Let this constant be
taken as a unit. Then α = A and:
1) The observed opinion would be the public's true opinion. This is what pollers want.
Then rAα = 1.0, but this correlation is not directly observable in practice since α, the
true opinion, is unobservable without an instrument, a.
2) The observed opinion would correlate at 0 with the instrument, as in a poll and a
repoll, i.e., rAa = 0 since a constant has no correlation with a variable.
3) The observed opinion would correlate perfectly with itself on repolling at such a short
interval that the true opinion may be assumed not to have changed: rAA’ = 1.0 (where
a prime denotes a repoll).
The last two statements can be verified by repolling. In proportion as any criterion
opinion shows a correlation around zero with a step-part which is varied over at least two
values in a poll and repoll, the second statement is shown to hold, i.e., the measured
variation in the instrument does not affect the opinion. In proportion as polls and repolls
show intraclass correlations around 1.0, the last statement is shown to hold, i.e., the polling
is free of either sampling or observational error.
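The third deduction can be checked on toy numbers. The sketch below assumes the simplest reading of Eq. 1, ordinary multiplication of numeric variates, which is only one of the product forms the text allows. The respondent values, the instrument values in a_varied, and the helper pearson are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical "true" opinions (alpha) of six respondents, e.g. years of schooling.
alpha = [6, 8, 10, 12, 14, 16]

# Perfectly standardized instrument: a is the constant unit in poll and repoll,
# so the observed opinion A equals alpha both times.
poll = [1 * x for x in alpha]
repoll = [1 * x for x in alpha]
print(pearson(poll, repoll))          # 1.0, i.e. rAA' = 1.0

# Unstandardized instrument: a differs from respondent to respondent on the repoll.
a_varied = [1.0, 1.3, 0.7, 1.2, 0.6, 1.1]
repoll_varied = [a * x for a, x in zip(a_varied, alpha)]
print(pearson(poll, repoll_varied))   # falls below 1.0
```

The drop below 1.0 in the second case comes entirely from instrument variation, since the true opinions were held fixed.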
VI. Experimental Testing
When reliability is defined as agreement of reobservations and error as disagreement,
experiments may be systematically designed to measure them in a poll and repoll. To do this,
ask the eight opinion questions in a poll and then repoll with one step-part varied whilst the
other step-parts are held unchanged as rigorously as possible.
Then compute the fourfold correlation coefficient, rAa, between the two poll and repoll
values of the one step-part a at issue and each opinion (dichotomized) in turn as A. This
correlation will tend to be zero insofar as the varying of the instrument does not affect the
opinion. This measures the degree to which each step-part (or larger section of the instrument)
affects each of the eight polled opinions, and tells where the demoscope is most in need of
better standardizing.
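A minimal sketch of this computation follows. It takes the fourfold correlation to be the phi coefficient of a 2 x 2 table, which is an assumption (the tetrachoric coefficient is another fourfold index). The counts and the function name fourfold_r are hypothetical.

```python
from math import sqrt


def fourfold_r(table):
    """Phi coefficient of a 2 x 2 table [[n11, n12], [n21, n22]].

    Rows: the two values of the varied step-part (poll, repoll).
    Columns: the two categories of the dichotomized opinion.
    """
    (a, b), (c, d) = table
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("a row or column total is zero; the coefficient is undefined")
    return (a * d - b * c) / denominator


# Hypothetical counts. Varying the step-part leaves the opinion split unchanged:
unaffected = [[60, 40],
              [60, 40]]
print(fourfold_r(unaffected))   # 0.0

# Varying the step-part shifts the opinion split:
affected = [[60, 40],
            [40, 60]]
print(fourfold_r(affected))     # 0.2
```

Running the same function over each of the eight criterion opinions for one varied step-part gives a row of coefficients; the largest absolute values would point to where standardizing is most needed.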
The steps-and-parts matrix (Table 1) can also generate experimental designs for
improving the instrument. Prediction in its broadest senses being the main objective of
scientists, improved predicting of public behavior (i.e., a higher predictive correlation between
a poll's forecast from a sample to a whole population's later life behavior) is the criterion for
any proposed improvement in the instrument.
Many of these step-parts have been or are being tested in a growing body of studies
including several from our Public Opinion Laboratory. (Refs. 5-11)
VII. Statistical Fitting
The final procedure is to measure the closeness and the significance of the fit of the
model to the experimental data.
Both descriptive statistical indices and sampling statistical indices are needed. The first
measures the degree of agreement which may be a percent of error, or a correlation index
including both systematic and random errors. The second measures the probability of
recurrence of that degree of agreement which may be a significance level from a chi square or
other parametric or non-parametric test.
The poller has to decide on his standards which depend on his experience, the
resources and the state of technical research in his subfield. He may call a correlation above .9
between the model's expectations (hypotheses) and the observed data (the facts) and a chi
square significant at the 5 percent level as indicating satisfactory fit. With appropriate
descriptive and sampling indices in hand he can decide whether the model should be accepted
or rejected, tested further or modified.
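The two kinds of index can be computed side by side, as in the sketch below. The expected and observed counts are hypothetical, the helper names are invented, and the 5 percent critical values are the standard tabled ones for 1 to 4 degrees of freedom. The sketch only reports the indices; which thresholds count as a satisfactory fit remains the poller's decision.

```python
from math import sqrt

# Tabled 5 percent critical values of chi square for 1 to 4 degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def pearson(xs, ys):
    """Descriptive index: correlation between expectations and observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def chi_square(observed, expected):
    """Sampling index: chi-square discrepancy of observed from expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts in four answer categories.
expected = [120, 90, 60, 30]   # the model's expectations (hypotheses)
observed = [114, 96, 57, 33]   # the observed data (the facts)

r = pearson(expected, observed)
chi2 = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1

print(f"correlation = {r:.3f}")
print(f"chi square = {chi2:.2f}, 5 percent critical value = {CHI2_CRITICAL_05[degrees_of_freedom]}")
```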
VIII. Practical Uses of the Model
For office purposes, the classification in the matrix can constitute the basis for filing any
sort of materials for polls, from polls, or about polls, by code number telling its stage, substage, type-part, or combination of them.
Correspondence planning polls, filled-out questionnaires, sample maps and instructions,
interviewer correspondence and records, IBM cards and tabulation sheets, drafts of reports,
news clippings and fan mail, all have their orderly places under the hundreds digit for the
stages of polling.
Pamphlet files and library cataloguing can use this coding to as fine or as broad a level
of subclassifying as desired.
Textbooks and manuals on polling could follow this standard outline. Our canvass (Ref.
4) of the methodological literature which resulted in this step-parts model demonstrates how
well this code fits.
In market research, as in all public opinion research, assurance is constantly needed as
to the standards of the polling. Our steps-and-parts breakdown can provide standardizing
specifications, as detailed or as summary as wanted, for the accuracy and other aspects of the
polling. The polling profession, the client, or the polling agency can at any time specify what
they demand in respect to each stage, step, and type-part.
The administrator uses most, if not all, of our eight stages and eight type-parts in
organizing an agency and in directing each polling operation. He spends all his time weaving
together the activities and objectives, the personnel and equipment, the time schedules and
spatial factors, the records, and miscellaneous business ― which are the eight type-parts
whatever their labels. Our matrix model is an outline with standard headings of a job analysis
for any polling agency.
IX. Summary
A fuller statement of this methodological theory is contained in a report to be made to
UNESCO in 1957. The theory emerged in applying the senior author's dimensional analysis
(Refs. 1 and 2) in its recent "transact" form (Ref. 4) to the data of a cross-cultural survey of
polling techniques for which UNESCO had contracted with the Washington Public Opinion
Laboratory (WPOL) via the World Association for Public Opinion Research (WAPOR). This
contract research sought further standardizing of international polling. Such improved
reliability, in turn, is a step toward the eventual establishment of a Barometer of World Opinion, a
transcultural instrument whereby the social scientist may observe the speech behavior of
representative world samples under standardized conditions.
The essence of this steps-and-parts model for polling, or methodological theory for
demoscopes, is the comprehensive, operationally defined, and testable expectation hypothesis
that: Insofar as the twenty-four steps and eight type-parts of polling are standardized,
cross-cultural comparisons of polls and their prediction of public behavior will be improved.
The twenty-four steps in polling are defined as the preparing, executing, and
completing, or early, middle and late substages of the following eight standard stages in
polling, i.e., administering, designing, questioning, sampling, interviewing, analyzing, reporting,
and interrelating all parties. The eight type-parts are taken as the following necessary factors
(not addends!) in all polling: the acts of people for desiderata in time and space with material
and symbolic means and under residual circumstances. Cross-classifying the twenty-four
steps and eight type-parts in a matrix yields 192 "step-parts" as logical products or joint
occurrences in the cells. Each step-part can be isolated, manipulated and measured so that
the contribution of its variance, as between a poll and a repoll, to the total variance of any
polled opinion can be experimentally assessed. All the step-parts together constitute the
polling instrument through which any opinion or speech behavior of a public is observed. To
observe opinion most exactly in different populations and cultures, periods and places, for
whatever purposes or circumstances, requires appropriate constancy or standardization of the
instrument. Valid polls (where polled responses agree with living behavior elsewhere) depend
in part on reliable polling (where instrument variation or error is small).
Author’s Bibliography Cited
1. Dodd, S. C. Dimensions of Society, Macmillan, 1942. 944 pp.
2. Dodd, S. C. Systematic Social Science, (offset), University Bookstore, Seattle, 1947, 788 pp.
3. Dodd, S. C., Rainboth, E. D., and Nehnevajsa, J. Revere Studies on Interaction, (in preparation), 1956. 600 pp.
4. Dodd, S. C., and Nehnevajsa, J. Techniques for World Polls, (in preparation) 1956. 500 pp.
5. Dodd, S. C. "On Reliability in Polling," Sociometry, No. 3, August, 1944.
6. Dodd, S. C. "Standards for Surveying Agencies," Public Opinion Quarterly, Vol. II, Spring, 1947.
7. Dodd, S. C. "Developing Demoscopes for Social Research," American Sociological Review, Vol. XIII, No. 5, June, 1948.
8. Dodd, S. C. "On Predicting Elections or Other Public Behavior," International Journal of Opinion and Attitude Research, Vol. III, No. 3, Fall, 1949.
9. Dodd, S. C. "Predictive Principles for Polls," Public Opinion Quarterly, Vol. XV, No. 1, Spring, 1951.
10. Dodd, S. C. "Word Scales for Degrees of Opinion," Public Opinion Quarterly, 1957.
#9. Dimensions of a Poll
Reprinted from the International Journal of Opinion and Attitude Research, Vol. 3, No. 3,
Fall, 1949, Donato Guerra 1, desp. 207, Mexico, D. F.
A job analysis, or operational definition, of a poll is often called for. Practitioners want
to train employees for their jobs, social agencies need to coach all volunteers in a
community survey, and teachers want a complete analysis of polling for their students in
courses on public opinion. The analysis given here will serve as a rough guide to be modified
as needed for the various types of polling. The model here is for a "quickie" poll taking two
weeks from inception to report, or 1000 man hours of work. It is a volunteer type, involving
no budget, by using University students. It included 300 interviews, areally sampled, in a
section of a city. (It happened to poll interracial attitudes in a White district newly confronted
by an influx of Negro tenants.) The plan, outlined below to serve as a check list for a poll
director, was made up as presented here before the poll started and was carried through on
schedule (though the number of man-hours is only approximate). It is thus a case analysis
of an actual poll as well as a generalizable guide to the steps and materials needed in
polling.
The degree of itemization is arbitrary. It is itemized here into fifty steps under the six
standard processes of designing, sampling, question drafting, interviewing, tabulating, and
reporting. Each step is an action for which someone is responsible. Its man-hours, degree of
skill, and consequent budget can thereby be planned.
The "tensions" or "motivation" column lists the incentives for a volunteer survey. If the
personnel are paid, this column would list items of budgeted or expected expense and,
alongside, the actual expense account.
For each step, its deadline and location can be fixed, the persons assisting or interacting
with the responsible agent for that step can be set, the materials needed and documents to be
used or produced can be specified. These are the dimensions of a poll. They define a
demoscope. They can be written, as below, in algebraic formulas integrating polling practice with
systematic theory in Sociology. They can also be written in folk terms as answers to the questions:
Who? does what? with what? with whom? when? where? why? and how? in polling.
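As a check-list, each step can be held as one record whose fields answer these folk questions. The sketch below is purely illustrative: the class PollStep, its field names, and the three sample steps with their hours are hypothetical and are not taken from the poll described here.

```python
from dataclasses import dataclass, field


@dataclass
class PollStep:
    process: str        # one of the six standard processes
    action: str         # does what?
    agent: str          # who?
    deadline_day: int   # when? (day of the schedule)
    location: str       # where?
    man_hours: float    # planned effort
    materials: list = field(default_factory=list)   # with what?
    motivation: str = ""                             # why?


# Hypothetical entries; a full plan would list every itemized step.
steps = [
    PollStep("designing", "define the problem with sponsors", "director",
             1, "office", 8, ["agenda"], "civic interest"),
    PollStep("interviewing", "conduct household interviews", "student interviewers",
             9, "sampled blocks", 300, ["questionnaires", "maps"], "course credit"),
    PollStep("reporting", "draft the report", "director",
             14, "office", 20, ["tabulations"]),
]


def hours_by_process(plan):
    """Total the planned man-hours under each process."""
    totals = {}
    for step in plan:
        totals[step.process] = totals.get(step.process, 0) + step.man_hours
    return totals


print(hours_by_process(steps))
```

For a paid survey the motivation field would carry the budgeted expense instead of an incentive.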
(See "Dimensions of Demoscopes"
on the following pages)
Technical Note:
The job analysis above develops systematic sociological theory in building on:
1. The four basic dimensions of time (T); space (L); people (P); and all residual
characteristics (I) in any recorded situation or set of data (S); a poll is thus a particular
case of the dimensional formula (without scripts) for any human situation: S = T; L; P; I.
2. Compounding various social processes and forces as the accelerating (T^-2) of
administrative changes in people;
3. Compounding two characteristics, desiderata (V) and intensity of desire (D), to
define tension (E) measuring people's motivation or internal stimulation;
4. Compounding social control as the triple correlation (here expected to exceed .9) of
three indices (I^3), namely:
a. The intentions of the controllers (the Committee of local civic leaders) to poll their
community;
b. The instruments used by the controllers; i.e., the demoscope here dimensionally
analyzed;
c. The influence on the controlees (the interviewees) in getting them to assert their
opinions.
5. Compounding social organization as the specialized interacting, in a system of social
controls, by three parties (P^3), namely:
a. The agents (PA), the pollers
b. The clients (Pc), the interviewees
c. The public (Pp), the citizenry interested
6. All compounding up to the dimensional formula for any poll (or demoscope more fully),
namely: S = T^-2; L^2; P^3; I^3, yielding the classificatory quantic number of 8; 2; 3; 3.
For this dimensional analysis developed in full see: Dodd, S. C., Systematic Social
Science (Seattle: University Bookstore, 1947, 792 pp.).
#10. Sociomatrices and Levels of Interaction for Dealing with
Plurals, Groups and Organizations
Reprinted from Sociometry, Vol. XIV, Nos. 2-3, May-August, 1951
This paper is intended to show how concepts classifying people into levels as:
persons (social elements)
plurals (sets of persons)
groups (plurals of interacting persons)
organizations (groups of persons in roles)
can be operationally defined, quantitatively refined, and systematically interrelated by the use
of sociomatrices.
I. Persons as Units, P0
The fundamental entity of interest to social science is the individual human being
capable of interacting with other human beings. Thus human individuals are the basic units
from which we shall construct our interperson matrix or sociomatrix, and define the different
types of human aggregate.
If we symbolize the population of people by P, then our units can be denoted by P0
(people-to-the-zeroth power) since, algebraically, anything to the zeroth power equals unity.
II. Plurals as Sums, P1
People may have various characteristics in common with each other, or they may differ
from each other in various ways. Thus we may sub-classify the class "persons", P. A "plural"
can be defined as any set of people who have some characteristic in common which
differentiates them from other people. This distinguishing characteristic may be a common
political orientation as shared by the plural, Republicans, for example; or a common hair
pigmentation as shared by the plural, redheads; or mere spatial propinquity as shared by "the
people in this room".
Any plural, then, is a sum of individual people having some common characteristic.1
Such a sum can be regarded in matrix terms as an array, listing one person after
another, either in a row or in a column (but not in both). It is a population-to-the-first-power, P1.
III. Groups as Products, P2
When the persons comprising a plural interact among themselves, stimulating and
responding to each other, they become a group. Interaction, of some specified kind, is what
transforms a plural into a group. Just as a plural is the sum of the individuals comprising it, so
a group is the product of the interactions of its members. Mathematically, multiplying a set of
persons by a set of persons denotes social interaction so that a mathematical product of
factors denotes a group of interacting persons.
The persons of any plural, by interacting with each other, are psychologically
"multiplied" by each other. At the plural level, John Doe and Richard Roe are merely added
together as, say, two men who speak English. But when they interact by speaking to each
other, the resulting group (a dyad) is more than just John Doe plus Richard Roe; it is the
product of which these persons are the factors; it is a pair speaking English together.
A group may be analytically symbolized in a cross-tabulation or matrix by putting the
same array of people on two axes, where one axis denotes actors and the other denotes the
same persons as reactors and each cell of the resulting matrix is the product of the row and
column persons. This may be viewed as the logical product of logicians since it is the class of
things pertaining to the actor and the reactor jointly.2 Since it is not necessary that each person
interact with all of the others, some of the cells in the matrix could just as well be empty; or
they might represent any degree of interacting (of the one kind to which each matrix is limited);
it still would be a 2-matrix, people-to-the-second-power, a group.3
Whenever, then, cross-tabulation in a matrix, rather than simple enumeration in an
array, is necessary to depict the situation, we are confronted with a group instead of a simple
plural, or P2 instead of P1.
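The contrast between an array and a cross-tabulation can be sketched in a few lines. Everything below is a hypothetical illustration: the three names, the "spoke to" relation, and the variable names are invented, and a cell value of 1 simply marks that the row person acted toward the column person.

```python
people = ["Doe", "Roe", "Poe"]   # hypothetical persons

# P1: a plural is a simple array, one entry per person who shares the
# characteristic (here, speaking English). Its size is a sum of individuals.
speaks_english = {"Doe": 1, "Roe": 1, "Poe": 1}
plural_size = sum(speaks_english.values())
print(plural_size)   # 3

# P2: a group needs the same persons on two axes, actors (rows) and
# reactors (columns). Empty cells are allowed: not everyone interacts.
spoke_to = {("Doe", "Roe"), ("Roe", "Doe"), ("Poe", "Doe")}
sociomatrix = [
    [1 if (actor, reactor) in spoke_to else 0 for reactor in people]
    for actor in people
]
for actor, row in zip(people, sociomatrix):
    print(actor, row)
```

Each matrix is limited to one kind of interacting, so a second relation (say, "lent money to") would need a second matrix over the same persons.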
IV. Organizations as Powers of Persons, P3
The next level of matrices, involving three axes, can be used to define operationally an
organization of people. We begin from the concept of an organization as "a group having
integration of differentiated interaction and members." The differentiated behavior of a person
in an organization may be called broadly his role in that situation. Within the homogeneously
interacting group of friends who eat together, an organization develops when persons
differentiate in roles.
The roles may be formalized as offices with officials called a chairman, a secretary, etc.
The roles may be informal expectations based on habitual behavior such as the role of most
talkative, funniest, dullest person, etc. Roles may grow up almost unnoticed as in the role of
being the youngest child in a family or may be carefully defined as in a job analysis itemizing
the duties of a person in some occupational role. Differential behavior, whether of a dominating
type in leadership, or an exchanging type, or a specializing type, or any other type which
contributes to the whole organization, is included in this concept of role. Roles may be thought
of as behavior expected of a person in a group-defined situation largely because it is habitual
behavior for people in that situation. We believe that any kind of differentiated interhuman
relation within an organized group can be re-expressed as a "role" and entered in some
appropriate cell of a suitably defined matrix.
It is evident that roles thus represent specialized interrelations between members of a
group. (Interrelations include all interactions which are the dynamic subclass of interrelations.)
To be an organization the group with its roles must have some unity or perform some one
function such as a factory which makes automobiles.
The matrix is a technique for spelling out the roles by providing a cell for each person in
each role. These role cells can make a third axis perpendicular to the two axes of the group in
the matrices described above. If a simple group's two axes cross-tabulated on one "page" are
called the actors (heading the rows) and their reactors (heading the columns), then the third
axis might be called the role-actors (heading the pages). Then the cells on "page A" would
record the direct interrelations of one kind between the actors and the reactors (who may be
the same or a different set of persons). Then let "page B" represent the interrelations of the
group in terms of one member's role, let us say, as foreman of a work group. The cells in the
first row and column of "page B" interrelate the foreman as foreman with each actor and
reactor in the group. The corner cell in "page B" relates the foreman as foreman to himself as
non-foreman. Each other cell in "page B" provides for recording the interrelation, if any,
between an actor and reactor as influenced by the foreman. By comparing the two pages one
can tell how the actor and reactor behave differently on "page B" because of the foreman than
they did on "page A" when unaffected by the foreman.
This matrix scheme can provide a "page" (i.e., a cell on the role axis) for any and all
possible roles, including multiple roles of one person or the role of a subset of persons such as
a committee or a department, etc. The cells then provide a place to record (and thereby compel more
exact and reliable observing of) every possible interrelation of persons and persons-in-roles, and all their combinations. The third axis repeats the listing of the members (in part or
entirely) according as each member's interacting is differentiated by some role or formal office
or custom-expected behavior when interacting with others. This "role axis" can provide for all
possible differentiations of interaction within any group by assigning each differentiation a cell
on this axis (which becomes a "page" when expanded by the first two axes of the group).
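The three-axis layout described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the member names, the two page labels, and the 0/1 entries are invented here and do not come from the original paper.

```python
# Hypothetical illustration: names, roles and recorded values are invented.
members = ["Ann", "Bob", "Cal"]        # actors and reactors (the same set here)
pages = ["direct", "Ann as foreman"]   # cells on the role axis; each expands to a "page"

def empty_3matrix(actors, reactors, role_pages):
    """Return matrix[page][actor][reactor] with every cell empty (None)."""
    return {pg: {a: {r: None for r in reactors} for a in actors}
            for pg in role_pages}

m = empty_3matrix(members, members, pages)

# "Page A": a direct interrelation of one kind (1 = observed, 0 = absent).
m["direct"]["Bob"]["Cal"] = 1
# "Page B": the same pair as influenced by the foreman role.
m["Ann as foreman"]["Bob"]["Cal"] = 0

def changed_cells(matrix, page_a, page_b):
    """List (actor, reactor) pairs recorded on both pages whose entries differ."""
    out = []
    for actor, row in matrix[page_a].items():
        for reactor, value in row.items():
            other = matrix[page_b][actor][reactor]
            if value is not None and other is not None and value != other:
                out.append((actor, reactor))
    return out

# Total number of potential cells: pages x actors x reactors.
n_cells = len(pages) * len(members) * len(members)
```

Most cells stay empty (None), which mirrors the point that the matrix records which potential interrelations were actually observed.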
Syndromes of differentiations can form sections of such matrices and can be
summarized by bordering arrays which cross-tabulate any persons against persons-with-syndrome A, etc. The most inclusive syndrome is the whole organization. Thus the outermost
arrays of a matrix, whether of a group or of an organization, can always deal with the
interrelations of any parts with the whole. The outermost corner of the matrix would express a
relation of the whole to itself as when an organization decides to change its name.
Summarizing indices, appropriately defined, especially from these outermost arrays, can
measure degree of integration of the organization. The matrices thus arrange their
interrelations of people in orderly ways as basic data for computing indices of action or of static
relation at any level. Even though most of the cells may be empty, the matrix specifies just
which of the potential interrelations have been observed in a given set of data.
For one example of how the matrix can help to measure the degree of organization of a
group consider a community. Its members specialize on different occupations, exchanging
their products. The limit of this differentiation of labor is when every person is a specialist of a
unique kind. The matrix will then show one different kind of occupational service index in each
array. A summary index can measure the degree to which each array is pure or is mixed with
indices of other arrays — and so measure the degree of organization in this occupational
respect as a percent of its maximum.
These three axes are taken as factors in an algebraic product which expresses
mathematically their psychological interactions or products. The axes might be named the
"actor axis" as in plurals, the "reactor axis" which defines a group, and the "role axis" which
defines an organization. This asserts that a human organization is at least "people-to-the-third-power." This means it is a product of people repeated in three ways: once as actors, again as
reactors to each other, and again as role-actors.4
In Sociology an important special case of organization is the institutional organization—
familial, scholastic, economic, political, religious, etc. Here the three chief sets of interactors
are the public, the "agents" or specialists (such as parents, teachers, producers, officials,
clerics, etc.) and the "clients" (such as children, pupils, consumers, citizens, laymen, etc.).
Here the public can be written in the actor array of the matrix and often may need only its
summarizing cell for the whole people. It is observed in any poll which is a representative
sample of the public. The institutional agents are those persons from the public who are
repeated in their roles as specialists in carrying on the institutional interactivities. They may be
listed along the role axis as role actors. The clients are again a part of the public who are
explicitly repeated in the matrix as reactors with the agents. They may be listed along the
reactor axis. Then the 3-matrix of the public, agents, and clients as the actor, role-actor and
reactor factors, respectively, can spell out all the possible interhuman relations that constitute
the institutional organization.
All possible relations among the human parts of the organization can be recorded in
matrix form. The matrix is only one possible definition and arrangement out of many. And
within matrices, instead of three axes, either two axes or four or more could be used. For two
axes, the roles could all border the matrix of actors and reactors making a very large and
unwieldy 2-matrix. Or again the first role of each person could compose the third axis and
second roles could compose a fourth axis, etc. Both of these arrangements, we believe, are
more complex in practice than arraying organizational relations along the three axes of actors,
reactors, and role actors. In short, organizations require at least three axes as a minimum, but
may have any number more.
In summary, matrices seem a simple, comprehensive, standardizable and reliable way
of operationally defining the terminology of groups and organizations so as to increase their
observability and predictability.
The sociomatrix puts the chief possible relations among people in society into a form
which can be operated upon mathematically. The mathematical powers of people, or levels of
interacting, may now be summarized in a generalized exponent script, thus:
$$P^{P} = \begin{cases} P^{0}, & \text{a person} \\ P^{1}, & \text{a plural} \\ P^{2}, & \text{a group} \\ P^{3}, & \text{an organization} \end{cases}$$
(the dimensional powers of people)5
V. Matrices for Predicting Interaction
The question is often asked: Does dimensional analysis, such as the foregoing, only
describe and classify social phenomena or can it also help to predict them? Can it go beyond
classifying people by levels of the exponent corresponding to levels of interaction as plurals,
groups, or organizations? Can it develop new relations or insights? new hypotheses or tests
for them? new induction or deduction of laws? new principles for better prediction and control
of any social phenomena?
We believe dimensional analysis can be a creative tool. We offer two bits of evidence
here.6 One is the deduction of the logistic growth curve; the other is the deduction of the
formula for interactance or demographic gravitation. Both deductions are gathering a weight of
empirical confirmation which may class them in the future as social laws.
To deduce the logistic growth curve take the interaction matrix where each of a set of P
persons is cross-classified against each other. Let the cells record a 1 or a 0 according as the row
person acts on or does not act on the column person. The interact might be telling a rumor, for
example. Let p denote the proportion of persons or section of the matrix who know the rumor
and q denote the rest (so p + q = 1). Let all the people interact or talk to each other with equal
probability mathematically, or effectively equal opportunity socially. The social interacting is
denoted mathematically by multiplying the set of persons by themselves ($P \cdot P = P^2$
dimensionally). So we multiply together the two proportions of knowers of the rumor and
non-knowers, as follows:
$$(p + q)(p + q) = p^2 + 2pq + q^2$$
which represent the four quadrants of the matrix. $p^2$ represents the proportion who know the rumor
talking to each other, with no increase in the number of knowers. $q^2$ represents the
proportion of non-knowers talking with each other, with no further spreading of the rumor.
$2pq$ represents the proportion of knowers and non-knowers talking together, which is where
the rumor grows. $2pq$ then represents the most probable increment of knowers in a unit period.
On adding up (or integrating in calculus) these $pq$ increments for successive periods the
logistic S-shaped curve results, namely:
$$p_t = \frac{p_0}{p_0 + q_0\, e^{(k/4)\,t}}$$
(Editor’s note: May be incorrect due to quality of original paper)
where
$p_t$ = proportion of knowers at time $t$
$t$ = any time
$t_0$ = starting time
$e$ = 2.718, the base of natural logarithms
$k/4$ = a constant, the slope at the midpoint and mid-date, which shows the general
speed of growth
This curve predicts that, under conditions of equal opportunity, the all-or-none rumor will
spread slowly at first, then faster, then slowly again as only a few hard-to-reach people remain.
Its parameters, $p_0$ and $k$, tell the rate of growth at any moment relative to the total period and
total population eventually told. This mathematical deduction from the dimensional definition of
a group as $P^2$ leads to expecting or predicting a logistic growth for an all-or-none act whenever
relevant opportunities to interact are equal.
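As a rough check on this deduction, the sketch below adds up many small increments proportional to $pq$ and compares the result with the standard closed-form logistic, $p_t = p_0 / (p_0 + q_0 e^{-kt})$. Two assumptions are made here and are not taken from the source: the factor 2 and the time unit are absorbed into a single rate constant `k`, and the closed form is written with the negative exponent of the textbook logistic (for which the midpoint slope is $k/4$), since the printed formula is flagged by the editor as possibly garbled.

```python
import math

def spread_by_increments(p0, k, periods, steps_per_period=1000):
    """Accumulate increments dp = k * p * q * dt and return p at each whole period."""
    p = p0
    dt = 1.0 / steps_per_period
    history = [p]
    for _ in range(periods):
        for _ in range(steps_per_period):
            q = 1.0 - p              # proportion of non-knowers
            p += k * p * q * dt      # knowers meeting non-knowers spread the rumor
        history.append(p)
    return history

def logistic_closed_form(p0, k, t):
    """Textbook logistic solution of dp/dt = k * p * q (assumed form, see lead-in)."""
    q0 = 1.0 - p0
    return p0 / (p0 + q0 * math.exp(-k * t))

# Illustrative values only: 1% of people know the rumor at the start.
history = spread_by_increments(p0=0.01, k=1.0, periods=12)
gains = [later - earlier for earlier, later in zip(history, history[1:])]
peak_period = gains.index(max(gains))   # the period with the fastest growth
```

The per-period gains rise and then fall, which is the slow, fast, slow pattern the text describes.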
Hornell Hart's findings of logistic spurts of cultural growth of many kinds and the Pearl-Reed curves of population growth are the sort of empirical evidence that tends to confirm this
deduced expectation or logistic hypothesis (Ref's. 17, 19, 20). Such empirical evidence shows
when the mathematical logistic law "holds" or fits social data. Such evidence shows how
homogeneous the conditions for growth were in the observed social situation.
Another example of deduction from the interaction matrix is the interactance hypothesis
which is also called the principle of demographic gravitation. (Ref's. 3, 12, 14, 16, 30).
This principle (which seems to this author to be likely to become established as a basic
social law) states the amount of interaction expected between two groups. This might be the
amount of telephoning between cities, or migrating between states, or intermarrying between
occupational classes or communicating news between concentric zones outward from the
center, etc. The amount of interaction ($I_{PP}$) expected in a period ($T$) is proportional to the
product of the number ($IP$) of acts of each group divided by their intervening distance ($L$):
$$I_{PP} = k\, I_A P_A\, I_B P_B\, T\, L^{-1}$$
This is the interactance, where $k$ is a constant for each kind of
interaction, $P_A$ and $P_B$ are the number of people in groups A and B, and $I_A$ and $I_B$ are the per capita
activity of each group. Whenever this per capita activity cannot be observed, approximate
weights for each group may serve; if no weights are written, unit weights are implicitly used. $k$
is the reciprocal of the number of acts in all groups; it is the probability of one act, $k = 1/\sum IP$. In
practice it may also contain another factor adjusting for the size of the units used in the other
factors.
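The interactance formula above is simple enough to evaluate directly. The following is a minimal sketch; the three groups, their per capita activities, the period, and the distance are made-up numbers chosen only to show the arithmetic.

```python
def k_from_acts(groups):
    """k = 1 / (sum of I * P over all groups), the probability of one act."""
    return 1.0 / sum(activity * people for activity, people in groups)

def interactance(i_a, p_a, i_b, p_b, period, distance, k):
    """Expected interaction between groups A and B: k * I_A*P_A * I_B*P_B * T / L."""
    return k * (i_a * p_a) * (i_b * p_b) * period / distance

# Hypothetical data: (per capita activity I, population P) for groups A, B, C.
groups = [(2.0, 100), (1.0, 300), (1.0, 100)]   # acts: 200 + 300 + 100 = 600
k = k_from_acts(groups)                          # 1/600

# Groups A and B, one period, ten distance units apart.
expected_ab = interactance(2.0, 100, 1.0, 300, period=1.0, distance=10.0, k=k)
```

Here the acts of A and B multiply to 60,000; scaling by k = 1/600 and dividing by the distance of 10 gives an expected interaction of about 10 acts.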
This expected amount of interaction, called the interactance, is deduced from the
Interaction matrix whose rows and columns represent frequencies of acts among the groups.
The relative cell frequency is expected by the law of joint probability to be proportional to the
product of its row and column relative frequencies. This may be tested by expecting the
interaction matrix to be a contingency table with a contingency coefficient of zero. In short, the
interactance formula states the most probable amount of interacting (when the distance and
time factors are also allowed for). The interactance hypothesis then asserts that the expected
interactance will agree closely and reliably with the observed interacting within each pair of
groups. The mounting evidence (Ref's. 3, 12, 14, 30) of close and good fits is tending to
confirm this hypothesis as a scientific law of human interaction. Here again the interaction
matrix, when probability principles are applied to it, permits deducing and predicting an
important social principle.
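The test described above, treating the interaction matrix as a contingency table, can be sketched as follows. The observed counts are invented, and Pearson's contingency coefficient $C = \sqrt{\chi^2 / (\chi^2 + N)}$ is used here as one common choice of coefficient; the source does not say which coefficient the author had in mind. The sketch assumes every row and column total is non-zero.

```python
import math

def expected_table(observed):
    """Expected cell counts: row total * column total / grand total."""
    total = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    return [[r * c / total for c in col_totals] for r in row_totals]

def contingency_coefficient(observed):
    """Pearson's C = sqrt(chi2 / (chi2 + N)); zero when cells match expectation."""
    expected = expected_table(observed)
    total = sum(sum(row) for row in observed)
    chi2 = sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))
    return math.sqrt(chi2 / (chi2 + total))

# Hypothetical acts between two sending groups (rows) and two receiving groups (columns).
proportional = [[10, 20], [30, 60]]   # every cell equals row total * column total / N
skewed = [[25, 5], [15, 75]]          # same margins, cells depart from expectation
```

A table whose cells are exactly proportional to their margins gives a coefficient of zero; any departure from the joint-probability expectation raises it.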
These two examples of new relations deducible from dimensional analysis may be but a
foretaste of a richer banquet awaiting researchers who use such analytic tools.
Notes
1. If the number of persons is denoted by P, then the plural can be symbolized by any of the
following alternative expressions, which differ chiefly in explicitness, from detailed to summary
symbolizing.
Formulas:
$$P \equiv P^{1} \equiv \sum_{1}^{P} (1) \equiv \sum_{1}^{P} P^{0}$$
$$P \equiv {}^{A}P^{0} + {}^{B}P^{0} + {}^{C}P^{0} + \dots + {}^{P}P^{0}$$
≡ John plus Henry plus Tom … plus Dick
≡ a set of persons
≡ a sum of P persons of one kind
≡ a plural
Any particular plural may be named by a subscript, such as $P_A$ meaning the A kind of
persons, if this is needed to identify it. Let the pre-superscript, ${}^{A}P$, name a person, while the
post-subscript, $P_A$, names a plural.
2. A single product of two persons called a pair, may be symbolized by writing (Editor’s note:
Equation cut off in available original)
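3. In more mathematical language: If $P^{0}$ denotes a person, ${}^{A}P^{0}$, ${}^{B}P^{0}$, ${}^{C}P^{0}$, etc., being names for
persons A, B, C, etc., and if ${}^{A}P^{0} + {}^{B}P^{0} + {}^{C}P^{0} + \dots + {}^{P}P^{0} = \sum_{1}^{P} P^{0}$ denotes a plural, a sum of
persons, an array of a matrix, then:
$$\Big(\sum_{1}^{P} P^{0}\Big)\Big(\sum_{1}^{P} P^{0}\Big) = P^{1} \cdot P^{1} = P^{2}$$
denotes a group, a product of two plurals, a 2-matrix. In
extended algebraic form:
$${}^{A}P^{0} + {}^{B}P^{0} + {}^{C}P^{0} + \dots + {}^{P}P^{0} \quad (= 1 + 1 + 1 + \dots \text{ to } P \text{ terms}), \text{ times:}$$
$${}^{A}P^{0} + {}^{B}P^{0} + {}^{C}P^{0} + \dots + {}^{P}P^{0} \quad (= 1 + 1 + 1 + \dots \text{ to } P \text{ terms}), \text{ gives:}$$
$${}^{AA}P^{0} + {}^{AB}P^{0} + \dots + {}^{BB}P^{0} + {}^{BA}P^{0} + \dots + {}^{PA}P^{0} + {}^{PB}P^{0} + \dots + {}^{PP}P^{0}$$
$$= 1 + 1 + 1 + \dots + 1 \text{ to } P^{2} \text{ terms} = P^{2}.$$
Since every person is unity ($P^{0} = 1$) and every pair of persons or product of two persons
${}^{A}P^{0}\,{}^{B}P^{0} = {}^{AB}P^{0} = 1$ is also a unity, the terms, $P^{2}$ in number, in the product of a plural with itself
add up to $P^{2}$. $P^{2}$ then is a suitable mathematical symbol for a human group. It can
summarize the kinds and amounts of interacting of its members. The algebraic product of
two plurals above may be written in still more expanded form as a matrix of logical products
as follows:
$$\begin{array}{c|cccc}
 & {}^{A}P^{0} & {}^{B}P^{0} & \dots & {}^{P}P^{0} \\ \hline
{}^{A}P^{0} & {}^{AA}P^{0} & {}^{AB}P^{0} & \dots & {}^{AP}P^{0} \\
{}^{B}P^{0} & {}^{BA}P^{0} & {}^{BB}P^{0} & \dots & {}^{BP}P^{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
{}^{P}P^{0} & {}^{PA}P^{0} & {}^{PB}P^{0} & \dots & {}^{PP}P^{0}
\end{array}$$
$$\sum_{X=1}^{X=P} P_{X}^{0} \cdot \sum_{X=1}^{X=P} P_{X}^{0} = {}^{XX}P^{0}$$
in expanded notation, or
$$[P^{1} \cdot P^{1} = P^{2}]$$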
Each cell of the matrix contains ${}^{XX}P^{0}$ as a factor, stating that whatever entity or
quantity is written in that cell, it is an entity or quantity pertaining to the row and column
persons. The ${}^{XX}P^{0}$ is a logical class making a logical product with any entity written in the cell.
The ${}^{XX}P^{0}$ qualifies the cell entry or names it as the cell of its row and column. Any value of a
statistical index ($I$) of the amount of interaction of one kind may be multiplied by the cell
entry and the product denotes an amount ${}^{XX}I$ ($= I \cdot {}^{XX}P^{0}$) pertaining to the row and column
pair of persons.
Note the usefulness of the zero exponent in combining the qualifying that is stated
by logical classes and sentences with the quantitative expressions of algebra. Both
qualities and quantities are handled together by means of the zero exponent, when
combined with suitable subscripts, according to the rules of ordinary algebra.
The kind of entity ("person", "pair of persons", etc.), the numbers of them, and the amounts of
their characteristics can thus be handled with equal rigor in unified expressions by means
of the zero exponent.
4. Calling these three factors $P_P$, $P_Q$, $P_R$, respectively, we can write the formula as:
$$\Big(\sum P_P^{0}\Big)\Big(\sum P_Q^{0}\Big)\Big(\sum P_R^{0}\Big) = P_P^{1}\, P_Q^{1}\, P_R^{1} = P^{3}, \text{ an organization}$$
Dimensional formulas in Physics and in Sociology state the relations between basic factors
(i.e., asserting them to be a sum of products or powers of the factors) without regard to the
absolute size of the units. Thus the dimensional formula $P^2$ or $P^3$ means that a population is
taken twice or three times as a factor even though each factor may be numerically different.
$P^2$ means a product of two P's in dimensional formulas and not necessarily one P times
itself.
5. The dimensional powers of people here are but one sector out of five sectors in our
comprehensive dimensional formula for systematizing the social sciences (Refs. 6, 7, 11,
16). The five sectors (or classes of dimensions, or basic factors in human data) most used
in the system are time, space, people, desire, and the complement class. The dimensional
formula, defined essentially as a sum of products of powers of basic factors, is:
$$S = \sum T^{t} L^{l} P^{p} D^{d} I^{i}$$
These may be expanded by their chief powers to suggest their meaning as follows:

Dimensions subclassified by powers (columns) and by sectors (rows)

Sector | Nullity, $X^{-\infty}$ | Quality, $X^{0}$ | Quantity, $X^{1}$ | Relation, $X^{2}$ | System, $X^{3}$
Space, L | $L^{-\infty}$ = no space | $L^{0}$ = point | $L^{1}$ = line | $L^{2}$ = area | $L^{3}$ = volume
Time, T | $T^{-\infty}$ = no time | $T^{0}$ = date | $T^{1}$ = duration; $T^{-1}$ = speed | $T^{-2}$ = acceleration | $T^{-3}$ = evolution factor
*People, P | $P^{-\infty}$ = no people | $P^{0}$ = person | $P^{1}$ = plural | $P^{2}$ = group | $P^{3}$ = organizations
Desire, D | $D^{-\infty}$ = no desire | $D^{0}$ = kind of desire | $D^{1}$ = amount of desire | $D^{2}$ = correlation of desire | $D^{3}$ = system of desires
Other indices, I | $I^{-\infty}$ = no index | $I^{0}$ = qualitative index | $I^{1}$ = quantitative index | $I^{2}$ = correlated indices | $I^{3}$ = systemic indices
Any sector | Zero | A class | A variate | A correlation | A formula

* Array studied here.

Our dimensional analysis, which this paper describes a bit, tries to re-express the data, the
concepts, and the principles of the social sciences in dimensional formulas compounded
out of the sectors and their powers above together with their other scripts and with the
logical signs for the operations upon the other symbols.
6. See Refs. 6-17 for further evidence of the uses of dimensional analysis.
References
The articles listed here show some of the further operations on sociomatrices that are
beginning to develop in the literature. Most of these deal with groups (2-matrices) and not with
organizations (3-matrices) as yet. Most of these studies develop summarizing statistical
indices from data ordered in matrices and do not make use of matrix algebra as yet.
1. Beum, Corlin O., Jr. and Brundage, Everett G., "A Method for Analyzing the Sociomatrix",
Sociometry, Vol. XIII, No. 2, May, 1950.
2. Bock, R. Darrell and Husain, Suraya Zahid, "An Adaptation of Holzinger's B-Coefficients
for the Analysis of Sociometric Data", Sociometry, Vol. XIII, No. 2, May, 1950.
3. Cavanaugh, J. A., "Formulation, Analysis and Testing of the Interactance Hypothesis",
American Sociological Review, Vol. XV, No. 6, December, 1950.
4. Cervinka, V., "A Dimensional Theory of Groups", Sociometry, Vol. XI, Nos. 1-2, February-May, 1948.
5. Chabot, James, "A Simplified Example of the Use of Matrix Multiplication for the
Analysis of Sociometric Data", Sociometry, Vol. XIII, No. 2, May, 1950.
6. Dodd, S. C., "Dimensions of Society", Macmillan, 1942.
7. Dodd, S. C., "Systematic Social Science", American University of Beirut Social Science
Series, No. 6, 1947 (University Bookstore, Seattle).
8. Dodd, S. C., "A Tension Theory of Societal Action", American Sociological Review, Vol.
IV, No. 1, February, 1939.
9. Dodd, S. C., "The Interrelation Matrix", Sociometry, Vol. III, No. 1, 1940.
10. Dodd, S. C., "Analyses of the Interrelation Matrix by Its Surface and Structure",
Sociometry, Vol. III, No. 2, 1940.
11. Dodd, S. C., "A Systematics for Sociometry and for All Science", Sociometry, Vol. XI, Nos.
1-2, February-May, 1948.
12. Dodd, S. C., "Interactance Hypothesis: A gravity model fitting physical masses and human
groups", American Sociological Review, Vol. XV, No. 2, April, 1950.
13. Dodd, S. C., "Developing Demoscopes for Social Research", American Sociological
Review, Vol. XIII, No. 3, June, 1948.
14. Dodd, S. C., "A Measured Wave of Interracial Tension", Social Forces, Vol. 29, No. 3,
March, 1951.
15. Dodd, S. C., "Historic Ideals Operationally Defined", Public Opinion Quarterly, Fall, 1951.
16. Dodd, S. C., "Dimensional Analysis in Social Physics", (to appear). Read at AAAS
Conference, Sections IC & M, Cleveland, December, 1950.
17. Dodd, S. C., "All-or-None Elements and Mathematical Models for Sociologists", American
Sociological Review, April, 1952.
18. Forsyth, E. and Katz, L., "A Matrix Approach to the Analysis of Sociometric Data",
Sociometry, Vol. IX, pp. 340-7, November, 1946.
19. Hart, H., "Logistic Social Trends", American Journal of Sociology, March, 1945.
20. Hart, H., "Depression, War and Logistic Trends", American Journal of Sociology, Vol. LII,
No. 2, September, 1946.
21. Hemphill, J. K. and Westie, C. M., "The Measure of Group Dimension", Journal of
Psychology, 1950.
22. Hertz, D. B., and Livingston, R. T., "Contemporary Organizational Theory", Human
Relations, Vol. III, No. 4, 1950.
23. Institute for Social Research, "Human Relations Program of the Survey Research Center",
University of Michigan, September, 1950.
24. Katz, Leo, "Punched Card Technique for the Analysis of Multiple Level Sociometric Data",
Sociometry, Vol. XIII, No. 1, May, 1950.
25. Lienau, C. C., "Quantitative Aspects of Organization", Human Biology, 1947.
26. Longmore, T. W., "A Matrix Approach to the Analysis of Rank and Status in a Community in
Peru", Sociometry, Vol. XI, No. 3, August, 1948.
27. Rashevsky, N., and Householder, A. S., "On the Mutual Influence of Individuals in a Social
Group", Psychometrika, 1941.
28. Rashevsky, N., and Householder, A. S., "Contribution to the Mathematical Theory of
Human Relations", Psychometrika, 1942.
29. Rashevsky, N., and Householder, A. S., "Further Studies on the Mathematical Theory of
Interaction of Individuals in a Social Group", Psychometrika, 1942.
30. Stewart, J. Q., "Demographic Gravitation: Evidence and Applications", Sociometry, Feb.-May, 1948.
#11. Predictive Principles for Polls
Scientific Method in Public Opinion Research
THE prediction of human behavior from observed data is both the purpose and method
of public opinion research, as it is of all social science. In this article, the author restates the
basic methodology of prediction from polls and offers a set of twelve tentative rules by which
such prediction may be operationally improved.
This article is an expansion of a paper delivered at the Regional Papers Session of the
Fifth Annual Conference of the American Association for Public Opinion Research held at Lake
Forest, Illinois in June 1950. The author is Director of the Washington Public Opinion
Laboratory at the University of Washington.
Public Opinion Quarterly, Spring, 1951
In any science a summary of the current situation is useful. For experts, it systematizes
knowledge that was won piecemeal; for laymen and students, it provides the winnowed results
of many researchers. This paper aims to restate the chief principles and techniques for
improvement of prediction from polls to later mass behavior. Within the limits of a 5000-word
review, it will try to outline the steps of scientific method applied to demoscopes and to do this
in terms familiar to any science. It will emphasize not the specific techniques and tools (which
vary with each science) but the logical steps (which give sciences their unity).
Prediction is seen as the central function of science. The function of science is often
stated as man's quest to "understand, predict and control phenomena." But prediction is the
test which distinguishes those descriptions of phenomena that are useful or will work, as
distinct from those descriptions that merely give a warm glow of "understanding," by using
familiar terms with pleasant emotional associations.
Prediction further includes control as that subclass of predictions where the predictor
variables can be manipulated by man for his own ends. The choice of ends transcends
science. For science deals only with means. The typical scientific statement is: "If you want
end X, then Y is a means to it (under the specified conditions)." In graded terms (instead of all-or-none terms such as "if X, then Y") this typical statement becomes "X correlates with Y." If
the correlation is regularly unity, then the relation of X to Y is called a scientific law. Thus every
scientific law stated in an equation reflects a correlation that is unity, or near unity, between the
variables on the two sides of the equation. The nearer to unity the correlations are in any field,
the more that field is an exact science. In trying to predict better, scientific pollers are thus
working, whether consciously or not, towards the distant goal of developing laws in the social
sciences.
In trying to improve his predictions, the scientist seeks the highest correlation between
his predictor variables (or "predictors" for short) and the variable to be predicted (the
"predictee"). This multiple correlation, R, or equivalent index, relating predictors and predictee,
we shall call the "predictance." The prime objective of all prediction research reviewed here
should be to raise the predictance, which measures the excellence of a prediction. The
secondary objective of research on prediction should be to raise the reliability or statistical
significance of that predictance index when observed in many similar samples. The polling
profession, by and large, has overemphasized the specific prediction (the percentage of
people intending to vote for X, etc.) and neglected the predictance (the warrant for the
prediction). The theme of this paper is how to raise the predictance and so gradually achieve
predictions which become more than transient, local and approximate facts; which tend, rather,
to become scientific laws predicting classes of phenomena with high generality and exactness
under specified conditions.
One further bit of defining terms and aims is needed to avoid confusion. Prediction
means chiefly foretelling in time. But "prediction" is sometimes also used to mean "estimating"
from a sample to a universe, an inference involving the population dimension. "Prediction" is
also used to mean "validity" — an inference involving other behavior dimensions.
Psychologists define validity as the correlation between a trial index of some behavior and an
accepted index of that behavior. In polling, validity would mean the correlation between
"speech and action," between verbal response in the poll situation and molar response in other
situations in life. A poll is valid to the extent that its respondents do in life as they say they do
to the interviewer. This may not involve foretelling since the molar behavior may precede the
polled responses. "Prediction" is sometimes still more loosely used to mean "geographic
generalizing" as in saying "as goes Maine, so goes the nation." Such inferences involve spatial
dimensions (along with temporal dimensions) by inferring that what happens in one area
happens in other like areas.
Prediction from polls may, then, involve any or all of these acts, viz: foretelling in time,
estimating in a population, validating in other behavior, and generalizing in space, since all
these may be correlated together. All of these types of prediction occur, for example, when a
poll uses the verbal responses of a sample in one area and date as the predictors, and tries to
infer therefrom the predicted variable which may be the molar behavior of the parent
population at a later date and in an area larger than the sampling centers. The four major
dimensions of time, space, population and behavior are thus often jointly involved in predicting.
But they should be distinguished and handled by somewhat separate techniques. Thus
foretelling involves techniques of fitting trend and cycle curves and measuring residual
fluctuations. Estimating from a sample involves techniques of either random or systematic
sampling and estimating the probability of errors of specified amounts, at a specified
confidence level. Validating and geographic generalizing involve techniques of correlation,
whether simple, partial, multiple or other form of correlation, between poll responses and
different life responses and between these in different geographic areas.
We shall include any one or more of these four types of inference in the "predictance."
The predictance is any index of multiple correlation (or equivalent statistic) which measures on
a 0 to 1.0 scale the accuracy with which the predictors predict the predictee. We seek here the
principles and logical techniques for improving the predictance. By "improving" we mean both
raising the predictance or correlation index and raising its reliability on resampling.1
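To make the idea of a predictance concrete, here is a minimal sketch that computes a multiple correlation on a 0 to 1.0 scale for the special case of two predictors, using the standard formula $R^2 = (r_{y1}^2 + r_{y2}^2 - 2 r_{y1} r_{y2} r_{12}) / (1 - r_{12}^2)$. The variable names and the five data points are hypothetical, and the two-predictor restriction is a simplification made here; the article itself allows any number of predictors or an equivalent index. The sketch assumes the two predictors are not perfectly correlated with each other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def predictance_two_predictors(x1, x2, y):
    """Multiple correlation of predictee y on predictors x1 and x2."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    r_squared = (r1 * r1 + r2 * r2 - 2 * r1 * r2 * r12) / (1 - r12 * r12)
    # Clamp to [0, 1] to absorb floating-point rounding before the square root.
    return math.sqrt(min(1.0, max(0.0, r_squared)))

# Hypothetical poll data for five areas (percentages).
stated_intent = [52, 48, 61, 45, 57]   # predictor 1: poll response
past_turnout = [40, 35, 50, 38, 41]    # predictor 2: earlier behavior
later_vote = [50, 46, 60, 47, 52]      # predictee: later mass behavior

predictance = predictance_two_predictors(stated_intent, past_turnout, later_vote)
```

Adding a second predictor can never lower the index below the better of the two simple correlations, which is why the search for additional relevant predictors raises the predictance.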
I. Levels of Analysis
Research which uses a demoscope as its chief instrument of observing, in common with
all scientific research, may involve one or more of at least four levels of analysis. These levels
are:
1) the qualitative,
2) the quantitative,
3) the relative, and
4) the systemic.
At the qualitative level phenomena are analyzed into kinds. At the quantitative level
these kinds of phenomena are further analyzed into degrees or amounts of each kind.
Quantitative analysis thus always follows and builds on qualitative analysis. At the relative
level, two or more qualities or quantities (which may be called the "relatees") are related to
each other. This relational analysis builds on the previous two levels. It may assert a qualitative
relation, the existence of some kind of relation; or it may also assert a quantitative relation, the
correlation coefficient between two relations — and so on in further levels of compounding. At
the systemic level, these relations (and their relatees) are compounded. A system is a set of
many relations and their relatees, both of which may vary in kind, in amount, and in level of
compounding.
In polling, the statement of objectives, the breaking of these objectives down into
variables to be polled, and the phrasing of these variables into questions to be asked verbatim
is qualitative analysis. Measuring the percentages of response, or finer scaling of each
question, represents quantitative analysis. Most of statistical analysis, as in computing cross
tabulations, correlations, trends, curve fits, probabilities, etc. relates at least two variables and
is therefore relative analysis. A predictance is always at a relative level as it is a correlation of
two or more quantified qualities. A full poll involving fifty to a hundred questions on one topic
and its relevant conditions, written up as an integrated study and attempting to describe,
explain, and interpret would be systemic analysis. It involves sets of variables, always more
than two, with all their relations of many kinds, amounts and orders.
These four levels represent four degrees of increasing thoroughness of analysis from
the superficial naming of things at the qualitative level to the profound, complete interrelating of
them at the systemic level. To distinguish between these levels in practice and to work towards
completer analysis, a suitable notation or symbolism is helpful in the social sciences just as it
is in the more developed and exact sciences. The poller who wants to predict better will find
that an exact notation, or more precise language, will increase the precision of his predictions.
In qualitative analysis, the symbols used are pictures, or words and sentences. This is
the field of general Semantics and of Symbolic Logic for rigorous treatment of language
symbols. The poller who wants better predicting will have to master the techniques of
Semantics and Symbolic Logic, especially its subfields of the calculus of classes and the
calculus of propositions. In quantitative analysis, the symbols used are chiefly arithmetic and
algebraic quantities, with geometry and mathematical calculus less involved. Here the poller
needs to master the techniques of test and scale construction and of the statistics of single
variables. In relative analysis, the symbols used are largely those of statistics of two or more
variables. In systemic analysis, the statistics for sets of variables, especially matrix algebra,
are the best symbolic tools we have. These are still very inadequate to deal with the
complexities of the organic structure and functioning of the incipient group behavior which we
call the public's complete opinion on some topic.
These four levels, and further compounded levels, are integrated in our dimensional
notation. This notation is a complete system of symbols which can combine all the levels of
analysis, with equal logical and mathematical rigor, in unified expressions. To facilitate rigorous
scientific research at all four levels, a notation is needed which can express each with equal
objectivity and exactness. Our dimensional notation accomplishes this.
The qualitative level is denoted and specified by a zero exponent so that $X_A^0$ denotes
the A class, or sentence, or kind of thing.2 The quantitative level is denoted and specified by a
non-zero exponent, usually unity, so that $X_A^1$ means the A variable (or numerical constant also).
The (quantitative) relative level is denoted and specified by an exponent of 2 as a relation
always involves at least two "relatees" to be related. Thus $X_{AB}^2$ means some sort of product of
variables A and B, such as a mean product (which, when in standard units, is the correlation
coefficient). The systemic level is denoted by a dimensional exponent of 3 or more, depending
on the number of factors whose product of some kind constitutes the system.
These exponents of 0, 1, 2, 3+ denoting operations of qualifying, quantifying, relating, or
systemizing, respectively, are only one part of our dimensional notation. 3 But these
dimensional exponents have very large application and usefulness. They do much more than
provide operational definitions for Kant's metaphysical categories of quality, quantity, and
relation. They include as special cases the first four moments of any frequency distribution,
such as from the responses in a poll, i.e., the zeroth moment, a proportion or probability =
$\Sigma X^0 / N$; the first moment, or mean, = $\Sigma X^1 / N$; and the second moment either as a variance =
$\Sigma X^2 / N$ or as a correlation = $\Sigma xy / N \sigma_x \sigma_y$; and third moments such as the skewness =
$\Sigma X^3 / N$, and the little explored triple product moment = $\Sigma xyz / N$. The exponents provide the
operational defining of many sociological concepts such as the standard social processes of
"cooperation," "competition," "conflict," etc. They can standardize the major methods of solving
social problems. When applied to the population dimension, the exponents specify the basic
classifying of people (P) into individuals or persons ($P^0$), plurals or categories of persons ($P^1$),
groups or plurals of interacting members ($P^2$), and organizations or groups with differentiation
and integration of their members ($P^3$). When applied to spatial dimensions, the exponents specify
the familiar points ($L^0$), lines or distances ($L^1$), areas ($L^2$) and volumes ($L^3$). When applied to
the time dimension, the exponents specify momentary events or dates ($T^0$), durations and
ages ($T^{+1}$), speeds of change or events per period ($T^{-1}$), and accelerations and decelerations of
change ($T^{-2}$). These applications of the exponent are glimpses of what dimensional analysis
does in defining concepts and ordering our knowledge in the social sciences.
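The moments named above can be illustrated numerically. The following is a minimal sketch, assuming hypothetical poll responses; the function names are illustrative only, and the second and third moments are taken about the mean, which is one common reading of the formulas.

```python
from math import sqrt

def proportion(responses, category):
    """Zeroth-moment idea: each member of the class counts as X**0 = 1, divided by N."""
    return sum(1 for x in responses if x == category) / len(responses)

def mean(xs):
    """First moment: sum of X**1 over N."""
    return sum(xs) / len(xs)

def central_moment(xs, k):
    """k-th moment about the mean (k=2 gives the variance, k=3 the raw skewness)."""
    m = mean(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

def correlation(xs, ys):
    """Second product moment in standard units: sum(xy) / (N * sd_x * sd_y)."""
    mx, my = mean(xs), mean(ys)
    mean_product = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return mean_product / sqrt(central_moment(xs, 2) * central_moment(ys, 2))

# Hypothetical data: four yes/no answers and a five-point scale.
votes = ["yes", "no", "yes", "yes"]
scale = [1, 2, 3, 4, 5]
print(proportion(votes, "yes"), mean(scale), central_moment(scale, 2))
```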
II. Steps of Analysis
The four levels of analysis in research may be further analyzed into some twelve major
steps or somewhat standard and distinct bodies of research procedure. These steps may be
described as:
Mostly qualitative analysis:
1. Choosing a prediction problem.
2. Listing the variables, both predicters and predictees.
3. Defining each variable operationally.
4. Interviewing the respondents.
Mostly quantitative analysis:
5. Scaling each variable alone.
6. Sampling the population.
7. Re-observing at two or more dates, with two or more interviewers, in two or more
samples, with two or more phrasings, etc.
Mostly relative analysis:
8. Hypothesizing relations among the variables.
9. Correlating the variables in pairs.
Mostly systemic analysis:4
10. Factoring and compounding the variables to form better variables, both elementary
and complex.
11. Systematizing the variables in sets.
12. Verifying to apply and test the system.
For the purposes of prediction research, this set of 12 steps elaborates such standard
formulations as
a) John Dewey's five steps in problem solving;
b) the steps of scientific method which grow out of Dewey's steps; or
c) the six administrative steps in polling. These variant formulations may be
compared in parallel columns as follows:
Steps in predicting             Administrative steps in polling
1. The predictance problems     Designing
2. Listing the variables        Questionnaire constructing
3. Operational defining         Questionnaire constructing
4. Interviewing                 Interviewing
5. Scaling
6. Sampling                     Sampling
7. Re-observing
8. Hypothesizing relations      Further designing
9. Correlating                  Tabulation
10. Factoring
11. Systemizing
12. Verifying                   Reporting

Dewey's steps in reasoning      Scientific method steps, in general
Felt difficulty                 Problem formulation
Defining it                     Observation of relevant facts
Suggesting remedies             Hypothesizing
Implications thought out        Generalizing
Decision and trial              Experimental testing
In actual practice the sequence of the steps will vary from this pattern since researchers
in practice may weave from one step to others and back again in developing an integrated
attack on a problem. In general, the sequence within each level is apt to be as listed here and
the levels are apt to be dealt with mainly in their order as above.
The steps may be stated in several ways. The list above states them as topics.
They may also be described in sentences. They may be more operationally specified as in the
"rules" in the next section which tell the poller what to do in order to improve the predictance.
They may be stated as methodological hypotheses which expect a correlation between each
step and an improved predictance. Polls could then be designed to test each methodological
hypothesis experimentally by observing the size of the correlation under specified conditions.
This would increasingly make the practice of polling more of a science and less of an art. Thus,
for example, the topic "complete listing of predicters" could be recast into the hypothesis that
"in a set of predictions, n in number, under specified conditions, the correlation between the
number of variables in each set and the predictance from each set will exceed .5 (at the 5 per
cent confidence level)."5 By selecting n similar prediction situations such as forecasting
elections and by varying the number of variables from 1 to n in the successive situations, this
methodological hypothesis can be crucially tested.
III. Some Rules for the Steps
The steps listed above may be operationally expanded by offering some "rules" for
carrying them out. While these rules may be familiar to experienced pollers and statisticians,
they may nevertheless be helpful to the less experienced pollers. They are still a tentative set
which should be refined by methodological research. They vary from firmly established rules to
more controversial ones. Thus, the rule that multiple correlation tends to improve as the
predicters each correlate highly with the predictee and as they correlate little with each other is
mathematically demonstrable. But the rule in polls that a cardinal variable will predict better
than an ordinal variable is less sure. These "rules of thumb" along with the firmer rules are
recorded here partly in order to cumulate the experience of pollers and partly in order to
propose methodological hypotheses to be tested when resources for such research become
available.
Rule 1, Mass Behavior. To improve predictance, prefer those predicted variables which
measure mass behavior, such as voting, buying, radio listening, reading mass media, working,
eating, sleeping, playing, or any acts which many people do and do often. For if the predicted
variable measures the behavior of only a few individuals or groups, the number of observed
cases will tend to be small relative to the number of unknown determining influences. The
underlying mathematical rule that seems indirectly involved is that, for fully determinate
solutions, a set of n unknown variables must have at least n known relations connecting them.
Of course, one's problem may be to predict the behavior of a few leaders. In such a
case this Rule warns that lower predictance may be expected.
Rule 2, Complete Listing. To improve predictance, observe as complete a set of
predicters as resources permit. The more numerous the predicters (if Rule 9 is also fulfilled)
the higher the multiple correlation is likely to be. The upper limit of this rule, i.e., the point at
which diminishing returns may cancel it, is unknown in general. Usually two predicters are
better than one, three are better than two, and so on with smaller increments in predictance,
up to half a dozen or a dozen variables at most. A check list of 82 possible predicters helpful in
carrying out this rule will be found below.
In trying to make one's list of predicters for a given predictee more complete, the
principle of "the closest predicters" will help. The predicters are those variables which are
closest in time, closest in space, closest in the population, and closest in behavior to the
predictee. (These four dimensions of closeness somewhat parallel Aristotle's principles for the
association and recall of ideas.) Thus, as pollers well know, predicters close to the predictee in
time will predict better than where a longer time intervenes with its probability of more
uncorrelated events attenuating the predictance. So also the predicters closest to the predictee
in space, in the population and in the kind of behavior predicted are very likely to predict better
than predicters from another area or from another population, or from very dissimilar behavior.
Of course the chief resource in picking predictor variables (in advance of experience
with them) is thorough knowledge of semantics, psychology, anthropology, general sociology
and the particular social science most relevant to the predictee. Thus, to predict voting, the
more the poller knows about the political behavior of persons, when conditioned by groups in
his culture and expressed through words, the more likely he is to select predictor questions
that predict well.
Rule 3, Operational Defining. To improve predictance, define each variable
operationally. In polls this means specifying:
a) what the interviewer says to the respondent (by standardizing the question
phrasing);
b) how the interviewer says it (by standardizing his instructions and training in all the
techniques and conditions of interviewing);
c) who the respondents are (by carrying out the sampling design exactly);
d) when and where the interviews are made;
e) why the respondent answers (as far as his motivation or stimulation to respond can
be standardized as by a face-to-face questioner, with auspices, topic, personal
appearance and opening remark and manner, etc., all of which jointly induce 95 per
cent or more of respondents contacted to respond).
In short, this rule means that good polling practices constitute most of the operational
definition of public opinion. (For we define "public opinion" for technical purposes as the
distribution of responses in a poll. Opinion observed in other ways may be called "popular
opinion.")
Rule 4, Interviewing. To improve predictance, select, train, instruct and supervise
face-to-face interviewers. The manuals on interviewing specify this comprehensive rule in fuller
detail so that comment here is unnecessary. The need for it is evidenced by the cumulating
studies which show that some kinds of questions are biased by what the interviewer is or does.
Rule 5, Scaling. To improve predictance, scale each variable alone. In polls, scaling a
variable means expressing the responses to a question such that they represent a) more than
two, b) cardinal, c) unambiguous, d) discriminating units, e) of one kind.
a) The number of units must exceed the dichotomy in "yes or no" questions and may
run from 5 to 10 in most opinion scales. Twelve class-intervals are the most an IBM
card column can handle and will keep the error from coarse grouping to less than 1
per cent. On the other hand, people usually have trouble discriminating an attitude
more finely than in five degrees or so.
b) Units may range in precision from all-or-none (1 or 0) through ordinals (1st, 2nd,
3rd—nth) to cardinals (1, 2, 3, etc.). Research begins in observing a quality which
taken together with absence of it defines the primitive all-or-none quantity, 1 or 0.
Thus every quality, or kind of entity, can be quantified. The problem is to quantify
with higher precision as in the equal and interchangeable units represented by cardinal numbers. The Guttman scaling technique yields ordinal units; the Thurstone
scaling technique approaches cardinal units.
c) Ambiguity in a scale means that its points are "blurred." A case which reads as
though it were at an exact point may really deviate or wobble around it. The
interquartile dispersion measures the ambiguity in the Thurstone technique while the
reproducibility measures it in the Guttman technique.
d) Discrimination in a scale means that the scale points distribute or separate the
population well, preventing a "pile-up" of cases in a few class-intervals.
e) The units being of one kind means that the scale is "unidimensional" or
representable by one straight line. Thurstone's irrelevance criterion tests this
indirectly; Guttman's reproducibility index tests it better; and our subrelation index6
tests it still better (in the least squares sense.)7
Rule 6, Sampling. To improve predictance, sample the population of respondents
adequately and representatively. Wherever stratified or systematic sampling is not suitable,
use randomizing. This rule summarizes the large literature on sampling with its well developed
mathematical and field techniques. What is an adequate size can be specified when the
degree of accuracy, confidence level, and number of breakdowns desired or variances
observed are specified. A rough rule of thumb is to note that sample size in practice tends to
be about 30 per cent of the square root of the finite parent population. Most polls range
between 500 and 5000 respondents.
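The rough rule of thumb above can be written out as a small calculation. This is only a sketch of that rule, not a substitute for a sample size derived from the desired accuracy and confidence level; the function name and the 0.30 default are illustrative.

```python
from math import sqrt

def rule_of_thumb_sample_size(parent_population, fraction=0.30):
    """About 30 per cent of the square root of the finite parent population."""
    if parent_population <= 0:
        raise ValueError("parent population must be positive")
    return round(fraction * sqrt(parent_population))

# Hypothetical parent populations of one million and one hundred million persons.
for population in (1_000_000, 100_000_000):
    print(population, rule_of_thumb_sample_size(population))
```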
What the pollers need to know to assure representativeness in population sampling may
be summarized in four principles.
a) Representativeness occurs in any degree as measured by a correlation between the
frequencies of certain characteristics in the sample compared with their frequencies
in the parent population.
b) Representativeness is always with respect to specified characteristics.
c) Representativeness is unnecessary if these characteristics are uncorrelated with the
polled responses.
d) Representativeness is best approximated for all characteristics jointly by drawing a
large random sample.
Rule 7, Re-observing. To improve predictance, use the most reliable questions. Reliable
questions are those whose responses agree with themselves when re-observed (as measured
by a t test, or chi square, or correlation, or other index appropriate to the data). The
re-observing may be either:
a) in another sample of respondents ( = "resampling reliability")
b) at another date on the same sample ( = "retesting reliability")
c) by another set of interviewers ( = "interviewer reliability")
d) by another phrasing of the questions ( = "rephrasing reliability")
Unreliability that is due to random errors always attenuates (lessens) the predictance by
an amount which can be estimated through computing the "correction for attenuation" formula.
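The "correction for attenuation" mentioned above is usually given in Spearman's form, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below assumes that form; the function name and the sample figures are hypothetical.

```python
from math import sqrt

def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Estimate the correlation that would be observed with perfectly reliable measures.

    Assumes Spearman's formula: r_observed / sqrt(reliability_x * reliability_y).
    """
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / sqrt(reliability_x * reliability_y)

# Hypothetical case: an observed predictance of .40 with both reliabilities at .80.
print(round(correct_for_attenuation(0.40, 0.80, 0.80), 2))
```

Because the divisor never exceeds 1, the corrected value is never smaller than the observed one, which is the sense in which unreliability attenuates the predictance.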
Rule 8, Exactness of Hypotheses. To improve predictance, state exact hypotheses.
Each hypothesis should state:
a) which variables are expected to be related,
b) with what kind of relation,
c) with what amount of it,
e) with what probability on resampling, and
f) under what conditions.
This Rule 8 is a step at the relational level which should be taken along with the two
preceding levels. For hypotheses which assert certain kinds of relations between certain kinds
of variables are qualitative relations and need to be specified in the qualitative level of analysis;
and hypotheses which assert certain amounts of one kind of relation between certain amounts
of each kind of variable are quantitative relations and need to be specified along with the
quantitative level of analysis.
This Rule 8 can itself be restated as the methodological null hypothesis: "In a specified
set of predictions, n in number, the rectilinear correlation between X and Y is expected to
exceed 0 by .2 at the 5 per cent confidence level; where each value of X is the predictance
index between a response to question Q in a poll and the later relevant behavior, and where
each value of Y is a rank in exactness of stating each hypothesis (to test which the question Q
in the poll was constructed)." Obviously, such an involved hypothesis as this statement in
quotes has not been experimentally verified as yet. Rule 8 at present depends on the alleged
experience of some pollers and scientists that the more carefully they state their hypotheses
and then draw up questions on polls and indices of later predicted behaviors to test these
hypotheses, the more (they believe) their polled questions do predict such later behaviors.
For a simple example of hypothesizing quantitative relations, suppose a correlation of .4
($r_{ab} = .4$) had been observed in a sample between
a) the respondent's asserted intentions to act and
b) such later acts (whether turning out to vote, voting for X, buying X, listening to X
when broadcast, etc.).
One might then hypothesize that (in similar samples and at the same confidence level) the
adding of
c) the intensity of the intention-opinion, in a five point equal-interval scale,
to the predicters would raise the (multiple) correlation to .6 ($R_{b.ac} \geq .6$).
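How far a second predicter raises the multiple correlation in an example like the one above can be worked out from the three pairwise correlations. The sketch below uses the standard two-predicter formula; the values for the intensity-act correlation and the intention-intensity correlation are hypothetical, since the text gives only the .4.

```python
from math import sqrt

def multiple_r_two_predicters(r_ab, r_cb, r_ac):
    """Multiple correlation of predictee b on predicters a and c.

    r_ab, r_cb: correlation of each predicter with the predictee.
    r_ac: intercorrelation of the two predicters.
    """
    if abs(r_ac) >= 1:
        raise ValueError("the two predicters must not be perfectly correlated")
    r_squared = (r_ab ** 2 + r_cb ** 2 - 2 * r_ab * r_cb * r_ac) / (1 - r_ac ** 2)
    return sqrt(r_squared)

# r_ab = .4 is from the text; r_cb = .5 and r_ac = .1 are assumed for illustration.
print(round(multiple_r_two_predicters(0.4, 0.5, 0.1), 3))
```

With these assumed values the hypothesis of a multiple correlation of at least .6 would be met; with a weaker or more redundant intensity question it would not.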
Rule 9, Correlating. To improve predictance, choose the set of predictors with the
highest zero order correlations with the predicted variable and the lowest intercorrelations with
each other. These two conditions are most exactly measured in combination by the multiple
regression weight for each predicter.8 These multiple regression weights and their multiple
correlation (the "predictance") with the predicted variable are the important indices to report in
prediction studies, instead of the prevalent piecemeal reporting of percentage points of
discrepancy in a single all-or-none prediction.
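The effect described in Rule 9 can be seen by computing the multiple regression weights directly. The following is a sketch in plain Python, assuming all variables are in standard units so that the weights solve the normal equations formed from the correlations; the function names and the correlation values are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predicter intercorrelation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def regression_weights(intercorrelations, zero_order):
    """Return (weights, multiple R) for standardized predicters.

    intercorrelations: matrix of correlations among the predicters.
    zero_order: correlation of each predicter with the predicted variable.
    """
    weights = solve(intercorrelations, zero_order)
    r_squared = sum(w * r for w, r in zip(weights, zero_order))
    return weights, r_squared ** 0.5

# Two predicters, each correlating .5 with the predictee (hypothetical values).
print(regression_weights([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]))  # uncorrelated predicters
print(regression_weights([[1.0, 0.6], [0.6, 1.0]], [0.5, 0.5]))  # overlapping predicters
```

In the second call the same zero order correlations yield smaller weights and a lower multiple correlation, because the two predicters largely duplicate each other.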
Rule 10, Factoring and Compounding. To improve predictance, find the most
parsimonious set of revised predicters. This set is the smallest number of revised predicters
which give substantially the same multiple correlation as the largest set studied. A revised
predictor variable may be a factor (i.e., a subvariable) or a compound index (i.e., a supervariable). Whether such revised predicters will improve the predictance in further samples is
not sure. The conditions for revised variables predicting better in new situations are somewhat
obscure still. In general, since correlation measures the degree of functional similarity or
overlap of two variables, the revised predicters may be expected to predict better in proportion
as they can be made more similar to, or have more in common with, the predicted variable
than were the predicters which had not been revised.9
Rule 11, Systematizing. To improve predictance, systematize the predicters in a formula
or mathematical model. One such formula is the multiple regression equation in which the
predicters when in standard units are each multiplied by an optimal weighting and then added
to yield the prediction. But the influence of the predicters on each other may not be additive; it
may be multiplicative requiring the adding of log variables. Thus the diffusing of an opinion
has been found to be best predicted by an "interactance" formula that is a product of seven
factors.10 This formula or mathematical model for group gravity systematizes the basic
dimensions of time, space, people, and their behavior into a theory of demographic
gravitation, with predictances mostly above .8.
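The remark about "adding of log variables" can be made concrete. The sketch below is not the seven-factor interactance formula, which is not given here; it is a generic, hypothetical product model showing that a product of factors becomes a sum once logarithms are taken, so that additive methods can be applied to the logs.

```python
from math import exp, log

def product_model(constant, factors, exponents):
    """Multiplicative prediction: constant * x1**e1 * x2**e2 * ..."""
    value = constant
    for x, e in zip(factors, exponents):
        value *= x ** e
    return value

def log_additive_model(constant, factors, exponents):
    """Same prediction computed additively: log y = log k + sum(e_i * log x_i)."""
    log_value = log(constant) + sum(e * log(x) for x, e in zip(factors, exponents))
    return exp(log_value)

# Hypothetical positive factors and exponents.
factors, exponents = [3.0, 4.0], [1.0, -2.0]
print(product_model(2.0, factors, exponents))
print(log_additive_model(2.0, factors, exponents))
```

The two functions agree up to rounding, which holds only for positive factors since the logarithm is undefined otherwise.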
Rule 12, Verifying. To improve predictance, repeat the whole predicting in further
situations. This is the acid test in science: an experiment applying all the inductions and
systematizing in Rules 1 through 11. In proportion as the predictions hold in all new situations,
the generalizations in Rules 1-11 are verified and tend to become methodological laws. This
assumes the basic postulate of all science. This is the principle of the uniformity of nature —
colloquially stated as "like conditions, like results." In proportion as the new situations are like
the situations previously studied, the principles of prediction stated in Rules 1-11 should give
like results. Insofar as the results are unlike, unlike conditions are inferred. The scientist then
starts a re-search to re-list them, define them, etc. to a closer approximation.
IV. Possible Predicters of a Specified Mass Behavior
A study of this list should help the less experienced pollers to observe a more complete
set of predictor dimensions.11 The 20 dimensions most used in polls are starred (*); those
included in Gallup's Quintamensional design are flagged (#).
Dimensional index           Predicter

I. BEHAVIOR: respondent's asserted past behavior relevant to the predicted mass behavior.
(General question: What has he done?)
A. Individual behavior (General question: What has he done mostly alone?)
$I_1$                       *1. Experience, i.e., behavior like the predictee
$I_2$                       2. Aptitude
$I_3$                       3. Intelligence
$I_4$                       4. Skill
$I_5$                       5. Training
$I_6$                       6. Accomplishments, performance
$I_7$                       7. Exposure
$I_8$                       8. Connectedness
$I_9$                       9. Abnormalities: psycho-neurotic behavior, etc.
$I_{10}$                    10. Possessions as indices of the above
B. Institutional behavior (General question: What has he done mostly in groups?)
$I_{11}$                    11. Reputation: appraising behavior of the respondent
$I_{12}$                    *12. Schooling: amount and excellence
$I_{13}$                    *13. Income
$I_{14}$                    14. Family status
$I_{15}$                    15. Employment record
$I_{16}$                    16. Occupation: industry and job
$I_{17}$                    17. Political affiliation
$I_{18}$                    18. Religious affiliation
$I_{19}$                    19. Health record
$I_{20}$                    20. Welfare record
$I_{21}$                    21. Military record
$I_{22}$                    22. Recreational activities
$I_{23}$                    23. Participation in voluntary groups
$I_{24}$                    24. Membership in associations, etc.
$I_{25}$                    25. Artistic activities
$I_{26}$                    26. Scientific activities
$I_{27}$                    27. Linguistic activities (including languages learned)
II. ATTITUDES: respondent's asserted present readiness for the future predicted behavior.
(General question: What is he likely to do?)
A. Intentions: respondent's motor readiness for future behavior (General question: What will
he do?)
$I_{28}$                    #*28. Statements of his intention to act relevantly to the predictee.
$I_{29}$                    #*29. Statements of his purpose, goals, values, ideals, ends,
                            "reasons why" with future reference, etc.
$I_{30}$                    30. Statements of his plans, programs, means-to-ends, etc.
$I_{31}$                    31. Commitments: agreements, contracts, registrations, etc.
$I_{32}$                    32. Expectations: hopes, estimates of likeliness, degrees of
                            certainty, etc.
$I_{33}$                    33. Responsibility: reliance on own or outside forces
$I_{34}$                    34. Reward seeking (or punishment avoiding)
B. Feelings: respondent's affective readiness for future behavior (General question: What does
he feel?)
$I_{35}$                    *35. Pro-con opinion: the prime opinion, or content or direction of
                            opinion, or substantive opinion, about which the intensity,
                            formedness, etc., may be asked
$I_{36}$                    #*36. Intensity of opinion, expressed in adjectives, etc. "I feel very
                            strongly," etc.
$I_{37}$                    *37. Intensity of desire indicated by willingness to pay for ..., to try
                            for ..., to work for ..., to sacrifice for ..., to exchange Y for X,
                            etc.
$I_{38}$                    38.
$I_{39}$                    39. Satisfaction-dissatisfaction
$I_{40}$                    40. Degree of liking-disliking
$I_{41}$                    41. Interest in the predictee
$I_{42}$                    42. Importance of the predictee to the respondent or his in-groups
$I_{43}$                    43. Desirability of the predictee to the respondent or his in-groups
$I_{44}$                    44. Emotional tone of the pro-con opinion
$I_{45}$                    45. Ego-involvement
$I_{46}$                    46. Rigidity: resistance to pressures to change the pro-con opinion
$I_{47}$                    47. Identifications with the predictee
$I_{48}$                    48. Polarity (feels all is black or white, no grays)
$I_{49}$                    49. Suggestibility (tends to say "yes," or responds differently with
                            anonymity, secret ballot, etc.)
$I_{50}$                    50. Saliency (what's uppermost in mind)
$I_{51}$                    51. Abnormal complexes
$I_{52}$                    52. Prestige
$I_{53}$                    53. Taboo
C. Information: respondent's cognitive readiness for the future predicted behavior (General
question: What does he know or recall?)
$I_{54}$                    *54. Formedness of opinion or degree of structuring: respondents are
                            ready, hesitant, need probing, or "don't know"
$I_{55}$                    #*55. Informedness on issues
$I_{56}$                    56. Knowledge of the past conditions
$I_{57}$                    57. Knowledge of possible future conditions or implications
$I_{58}$                    58. Linguistic informedness: knows the language and all words
                            used by interviewer
$I_{59}$                    59. Knowledge (and approval) of the poll's auspices, purpose,
                            conduct, etc.
III. CONDITIONS: other variables which may vary with the respondent's predicted behavior.
(General question: What correlates?)
These conditions may vary:
a. By natural design, as things happen within the sample; or
b. By question design in asking conditional questions of the form "If A, then what?"
c. By poll design in extending the polling so as to vary any specified conditions.
A. Spatial conditions (General question: Where?)
$L_{60}^{0}$                60. Point locations of spot polls, sampling centers, respondents'
                            addresses, etc.
$L_{61}^{1}$                61. Distances between ...
$L_{62}^{2}$                *62. Areas: sampled or asked about, densities of people or
                            anything per unit area
$L_{63}^{3}$                63. Volumes: crowding in housing, floors per building, quantity
                            of materials, etc.
B. Temporal conditions (General question: When?)
$T_{64}^{0}$                64. Dates
$T_{65}^{+1}$               65. Durations
${}^{-,0,+}T_{66}^{+1}$     *66. Tenses: past, present, future; durations relative to the
                            present as origin
${}_{t}T_{67}^{+1}$         *67. Sequences: ordinal durations, 1st, 2nd, 3rd; means-and-ends;
                            cause-and-effect; "reasons why" if asserting sequenced
                            conditions, etc.
${}^{t,0}_{t}T_{68}^{+1}$   *68. Ages: cardinal durations from variable origins to the present;
                            young-old; new-ancient, etc.
${}^{t,0}_{t}T_{69}^{+1}$   69. Intervals: durations from any origin to any terminal date;
                            short-long; soon-later; recent-long past, etc.
$T_{70}^{-1}$               70. Speeds: occurrences per period, rate of change, etc.
$T_{71}^{-2}$               71. Celerations: accelerating or decelerating of a speed or of a
                            process
C. Population conditions (General question: Who?)
$P_{72}^{0}$                *72. Persons: the respondent's own behaviors and attitudes listed
                            above expand this heading
$P_{73}^{1}$                *73. Plurals: categories of persons such as the sample and the
                            parent population; the respondent's sex, race, etc.
$P_{74}^{2}$                *74. Groups: plurals of interacting members of any kind; all
                            conversational groups (including interviewer-interviewee); all
                            formal membership groups, etc.
$P_{75}^{3}$                75. Organizations: unified groups with specialized members; all
                            institutional organizations
D. Residual conditions: further indices or combinations of any above (General question: How?)
$I_{76}$                    76. Physical conditions of any kind relevant to the predictee
$I_{77}$                    77. Biological conditions of any kind relevant to the predictee
$I_{78}$                    78. Human physiological conditions of any kind not itemized above
$I_{79}$                    79. Cultural conditions of any kind not itemized above
$I_{80}$                    *80. Stimulus conditions specific to the referent of each question
$I_{81}$                    *81. Semantic conditions, specific to the phrasing of a question and
                            the relation of symbol to symbolized; the definitions of terms for
                            interviewers and coders, etc.
$I_{82}$                    *82. Situational conditions: specific to all the variables in the
                            interview; instructions to interviewer, or to respondent, etc.
To use this check list of predicters the ideal operations, to be carried out as far as
resources permit, are:
Select predicters for trial by means of a small group of experienced judges who have
the objectives of the poll and the predictee variables clearly defined;
Phrase these predicters in pollable questions which can yield correlatable indices;
Pretest them on 50 or so persons properly sampled with skilled interviewers who follow
up the questions with probing in order to develop better phrasings;
Correlate each predicter and predictee, in four fold tables, to choose the most predictive
variables and to estimate roughly the level of the multicorrelation or predictance to be
expected.
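Correlating a predicter with a predictee in a fourfold table can be sketched as follows. The text does not name a particular index; the phi coefficient used here is one common choice for two dichotomies, and the cell counts are hypothetical pretest figures.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Correlation for a fourfold (2 x 2) table.

    Cells: a = predicter yes, predictee yes;  b = predicter yes, predictee no;
           c = predicter no,  predictee yes;  d = predicter no,  predictee no.
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every row and column must contain at least one case")
    return (a * d - b * c) / denominator

# Hypothetical pretest: which of two candidate questions predicts the later act better?
candidates = {
    "intention question": (30, 10, 10, 30),
    "interest question": (24, 16, 16, 24),
}
for name, cells in candidates.items():
    print(name, round(phi_coefficient(*cells), 2))
```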
These operations complete the scientist's job in a pretest, namely, to estimate optimally
the costs and consequences of any behavior — which in a pretest is "predictive polling."
Notes
1. One of the most basic studies of prediction principles is Paul Horst’s "The Prediction of
Personal Adjustment," Social Science Research Council Bulletin No 48, 1941, pp. 455.
2. Thus any quality ($X^0$) or quantity ($X^1$) or relation ($X^2$) or system ($X^3$) may be qualified by
"multiplying" it by a specified quality. This "logical product," as logicians call it, asserts
something ($X_A$) with a condition ($X_C^0$) qualifying it ($X_A X_C^0$). Thus a percentage from a poll
($X_A^1$) might be qualified by being a percent for the second response, to the fourth question
of Poll number eight of the W surveying agency ($X^1 X_{W:8:4:2}^0$) when asked with a card ($X_C^0$)
providing visual supplementing of the oral stimulation, by a trained ($X_T^0$) set of ten
interviewers ($10PX_T^0$), of respondents selected by a specified sampling design ($X_S^0$), etc.
The algebraic formula $X_A^1 X_{W:8:4:2}^0 X_C^0 (10PX_T^0)^0 X_S^0$ specifies all this and specifies it in such a
way that rigorous logical and mathematical inferences can be made. In short, we believe
the zero exponent to be one of the most powerful symbolic tools scientists have developed
since the digit zero made the decimal system and modern mathematics possible, in polls or
in any scientific research, to combine qualities and quantities with equal rigor in the same
algebraic expression. One can thus solve for qualitative or quantitative unknowns using
appropriate rules of algebra.
3. Fuller evidence on the sociological use of the exponents and dimensional analysis
generally is presented in the author's Dimensions of Society, Macmillan, 1942, pp 944, and
Systematic Social Science, University Bookstore, Seattle, 1947, pp. p88.
4. From a strictly logical viewpoint these practical steps in scientific research are not pure
analysis. They also involve synthesis increasingly, especially at the systemic level.
5. .5 is an arbitrary guess to yield a definite statistical test, until experience makes better
guesses possible. The correlation will be curvilinear since the increment in predictance will
tend to become smaller as the number of variables in the set gets larger — the diminishing
returns principle.
6. See Dodd, S. C. and Buechley, R. W., "On Subrelation — the Statistics of Part-Whole
Relations," ms available from the Washington Public Opinion Laboratory until published.
7. A useful technique much used by psychologists and statisticians but apparently little known
to many pollers is to express the variables in standard scores, i.e., in standard deviation
units. This makes polled responses more comparable and greatly facilitates correlating and
many further analyses. For a dichotomous question scored 1 for presence and 0 for
absence of the attribute and where p is the mean or proportion present and q = 1-p, the
standard score for "1" is $\sqrt{q/p}$ and for "0" is $-\sqrt{p/q}$ (which may be read from tables).
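The standard scores for a dichotomous question in note 7 can be checked with a short calculation. This is a minimal sketch; the function name and the proportion used are illustrative.

```python
from math import sqrt

def dichotomous_standard_scores(p):
    """Standard scores for a question scored 1 (present) or 0 (absent).

    p is the proportion scoring 1 and q = 1 - p.
    Returns (score for "1", score for "0") = (sqrt(q/p), -sqrt(p/q)).
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1 - p
    return sqrt(q / p), -sqrt(p / q)

# Hypothetical question answered "yes" by 20 per cent of respondents.
p = 0.2
z_one, z_zero = dichotomous_standard_scores(p)
mean = p * z_one + (1 - p) * z_zero                 # should be 0
variance = p * z_one ** 2 + (1 - p) * z_zero ** 2   # should be 1
print(z_one, z_zero, mean, variance)
```

The mean of zero and variance of one confirm that the two values are the responses expressed in standard deviation units.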
8. A simplified formula for multiple correlation to aid in finding the predictance where there are
many variables in a large sample and where IBM equipment for matrix multiplication is
available is the following:
$$R_{C \cdot n}^{2} = 1 - \frac{\left| {}^{p}\sigma X'_{n+C} \; {}^{p}\sigma X_{n+C} \right|}{\left| {}^{p}\sigma X'_{n} \; {}^{p}\sigma X_{n} \right|}$$
where
$C$ denotes the predicted variable or criterion;
$n$ denotes the set of predictor variables, n in number;
$X$ denotes a response by one person to one question;
$\sigma X$ denotes a response in sigma units;
${}^{p}X$ denotes the response of p persons;
$X_n$ denotes the response on n questions;
${}^{p}\sigma X_n$ denotes the matrix of p rows and n columns of standardized responses;
$X'$ denotes the transpose matrix (i.e., rows and columns interchanged);
$X_{n+C}$ denotes the matrix with the C array added;
$|\ |$ denotes the determinant of the matrix.
In these terms machines can compute predictances rapidly. The brevity and the
standardization of this multiple correlation formula in terms of standard scores on IBM
cards is increased by the dimensional scripts. This script notation has been standardized
by us for all its uses in Logic, Statistics and the Social Sciences. To anyone knowing this
highly general notation the key here is unnecessary since this multiple R formula is only
one of a myriad possible instances of our dimensional analysis.
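The determinant ratio in note 8 can be tried on a small scale without special equipment. The sketch below rests on one assumption not spelled out in the note: each product of the transposed and untransposed matrices is averaged over the p persons, so that it becomes a matrix of intercorrelations. All function names and data are hypothetical.

```python
import math

def standardize(column):
    """Convert raw scores to standard scores (population standard deviation)."""
    p = len(column)
    mean = sum(column) / p
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / p)
    return [(x - mean) / sd for x in column]

def correlation_matrix(z_columns):
    """Average cross-products of standardized columns over the p persons."""
    p = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ca, cb)) / p for cb in z_columns]
            for ca in z_columns]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for i in range(size):
        pivot = max(range(i, size), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            return 0.0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= factor * m[i][c]
    return det

def squared_multiple_correlation(predictors, criterion):
    """R squared of the criterion on the predictors, as 1 minus a determinant ratio."""
    z_pred = [standardize(col) for col in predictors]
    z_crit = standardize(criterion)
    r_n = correlation_matrix(z_pred)
    r_n_plus_c = correlation_matrix(z_pred + [z_crit])
    return 1.0 - determinant(r_n_plus_c) / determinant(r_n)

# Hypothetical responses of five persons to one predictor question and one criterion.
print(squared_multiple_correlation([[1, 2, 3, 4, 5]], [2, 1, 4, 3, 5]))
```

With a single predictor the result reduces to the square of the ordinary correlation, which gives a convenient check on the arithmetic.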
9. Rough approximations to factor analysis and regression weights may be computed from
standard scores, σX, without the labor of computing the $(n^{2} - n)/2$ raw intercorrelations of
the n variables in a poll. List the standard scores in p rows for the p persons in the sample
and in columns for the n variables or questions to be studied in a poll. Add the entries along
each row to get the sum, $X_{\Sigma} = \sum_{I=1}^{n} {}_{\sigma}X_{I}$, of all the n variables (or of any subset desired). If $\sigma_{\Sigma}^{2}$
denotes the variance of this sum, the average intercorrelation is simply:
$$\bar{r} = \frac{\sigma_{\Sigma}^{2} - n}{n^{2} - n}$$
The correlation of any two variables, including sums of variables, when in standard
scores, σX, is easily computed as their average product (of two columns in the matrix):
$$r_{AB} = \frac{1}{p} \sum_{1}^{p} {}_{\sigma}X_{A} \; {}_{\sigma}X_{B}$$
Manipulation of these two formulas can give the poller rough but quick indications
somewhat analogous to factor loadings or to multiple regression weights of what each
variable in a given set contributes to predicting any predictee.
The actual prediction in standard scores is the simple regression equation:
$${}_{\sigma}X_{C} = r_{AC} \; {}_{\sigma}X_{A}$$
where A denotes the predictor, C the predictee and r the predictance, and σ denotes that
the responses, X, are in standard deviation units.
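The shortcuts of note 9 may be sketched as follows. The sketch assumes that standard scores are formed with the population standard deviation (dividing by p), under which the shortcut for the average intercorrelation agrees exactly with the average of the separate intercorrelations. The function names and the three columns of data are hypothetical.

```python
import math

def standardize(column):
    """Convert raw scores to standard scores (population standard deviation)."""
    p = len(column)
    mean = sum(column) / p
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / p)
    return [(x - mean) / sd for x in column]

def average_intercorrelation(z_columns):
    """Average intercorrelation from the variance of the row sums."""
    n = len(z_columns)
    p = len(z_columns[0])
    row_sums = [sum(col[i] for col in z_columns) for i in range(p)]
    mean_sum = sum(row_sums) / p
    variance_of_sum = sum((s - mean_sum) ** 2 for s in row_sums) / p
    return (variance_of_sum - n) / (n * n - n)

def correlation(z_a, z_b):
    """Correlation of two standardized columns as their average product."""
    return sum(a * b for a, b in zip(z_a, z_b)) / len(z_a)

def predict(r_ac, z_a):
    """Simple regression in standard scores: predicted score on C from A."""
    return r_ac * z_a

# Hypothetical responses of five persons to three questions.
raw = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [5, 3, 4, 1, 2]]
z = [standardize(col) for col in raw]
print(average_intercorrelation(z))
print(predict(correlation(z[0], z[1]), z[0][0]))
```

The first figure printed replaces three separate intercorrelations with one variance of row sums, and the saving grows rapidly with the number of questions.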
10. See Dodd, S. C., "The Interactance Hypothesis," American Sociological Review, April, 1950,
Vol. 15, No. 2, and "A Measured Wave of Interracial Tension" (expected to appear in Social
Forces, Spring, 1951).
11. To help develop a more complete set of predictors for any given variable to be predicted
this list is offered. Study of the list may suggest predicters otherwise unthought of. The list
is incomplete and tentative. Some of the variables are ill-defined still and some may
correlate so highly with others as to differ chiefly in name. Much research is needed to sift
this list and standardize predicters which best predict the largest number of predictees at
most times and places and under most varied conditions.
#12. Scientific Methods in Human Relations
The American Journal of Economics and Sociology, Volume 10, April, 1951, Number 3
I. Scientific Method of Solving Human Problems
Can Science Save Us? This question is discussed by Professor George A. Lundberg of
the Department of Sociology of the University of Washington in his article and book of the
same title.1 His argument is that science, being the study of how things actually work or how
people behave, is the most hopeful way of learning how to predict and control such behavior.
Other methods of solving problems, such as depending upon traditions from the past, appealing
to the supernatural for aid, or relying on leaders with current knowledge only, are proving less
effective than scientific solutions. Science observes thoroughly and exactly what the factors are
in each situation, how they work, how these uniformities can be stated as 'laws', and how they
can then be manipulated to get whatever man desires. Science is thus a means for discovering
what works, or what will solve our problems in practice.
The ends for which science may be used, of course, lie beyond science, since the
problems man wants solved come within the realm of value judgments. These value judgments
can themselves be observed and measured scientifically in observing what men actually
choose or prefer (as distinguished from what they "ought" to choose).2
Increasingly during the past 300 years, men have found that the scientific attack of
observing how things actually work, instead of how they have been believed to work by
tradition, increases men's power to manipulate phenomena according to the rules they
discover, and thus their power to get the results they want. Sociologists believe that these
same pragmatic methods applied to human behavior and human relations will increasingly give
us control over them, just as these methods have given us control over physical phenomena.
The objection often advanced that human beings are different, that they have a
personal equation, that they have a soul, or in some way are not amenable to being observed,
is the sort of objection that has been advanced against science at every step since it began.
To be sure, human beings are highly variable and complex, but that only makes the observing
more difficult. Sociologists are increasingly finding that when they observe human beings in the
mass, regularities emerge, just as in physics where molecules, etc., are always observed in
enormous masses, never as tagged individuals. The accuracy of all insurance, for instance, is
evidence of the regularity of the behavior of human beings in the mass where predictions can
be made to several decimal places with extremely high probability of their being correct.
In short, the working faith of the younger sociologists is that if they were given the
resources in manpower, brains, funds and authority which the physicists have had for several
generations, similar improved understanding of human relations, and consequent ability to
predict and control them, would emerge. Hitherto, society has devoted millions or even billions
of dollars for a 200 inch telescope, for research on plastics or on atomic fission. But funds of
only a few thousand dollars are usually available for research on human behavior and human
relations. If we are to control our problems of periods of industrial depression and
unemployment, of racial discrimination, of ignorance and backwardness throughout a large
part of the human race, or of wars with all their attendant destruction, we must have scientific
research first and foremost, getting at the causes and conditions and methods of controlling
these. To exhort nations to be good and not to make war, or to promise in treaties to keep the
peace, is less effective than finding the causes that impel nations, often almost against their
will, into wars, and to change those causes. It has been done in removing wars between
families and between cities and subdivisions of one sovereignty to a large extent, and seems
as if it might be possible on a larger scale.
The social scientist realizes that he cannot as yet prove the effectiveness of scientific
methods on human problems, any more than the physicist could prove the effectiveness of
scientific methods on physical problems in the early days of physics. But in proportion as he
delivers the goods through use of scientific methods, society's confidence in him and in these
methods should increase, and society should progressively gain control over these problems.
It will, of course, be a slow process from which immediate results cannot be hoped for. It must
be preceded by basic research on principles which at first seem to have little application.
II. Scientific Methods Described
To develop this thesis that science can solve problems, or more exactly, that social
science can solve our human problems, let us specify what we mean by science and scientific
methods and then go on to give three examples of its application which hold out promise for
the future.
The distinctive thing about science is the scientific method. Any knowledge that is
secured by these methods we call a science, and any field of knowledge can become a
science by using these methods. The essential thing to know about science, then, is the nature
of scientific methods.
Scientific methods to some may mean anything from specialized techniques up to the
broadest procedure of logical reasoning. In general, however, there is a pattern of five steps
which serves as a somewhat standardized model for the scientific method, though there are
many variations from this model in practice. This pattern of five steps can be simply described,
as an introduction for students, as the five steps in reasoning or problem solving stated by
John Dewey. This is a primitive form of scientific method which all of us, including animals, use
in daily life in trying to solve the problems that confront us. Dewey's steps are:
1. A difficulty is felt. The organism finds its habitual behavior unsatisfactory in some
respect and senses a difficulty.
2. The difficulty is defined. The organism observes more exactly what the difficulty is,
whether it is a thorn pricking or a question of what career to adopt.
3. Suggestions arise as to a possible solution for the difficulty. The richness and
adequacy of the suggestions, of course, will depend upon the intelligence and
experience of the organism, and the extent to which it has a cultural fund of the
experience of others to draw upon, such as the rules of modern science.
4. Implications of each suggestion are thought out. With lower organisms they may
actually be tried out in trial and error attempts. In higher organisms they will be
covertly tried out within the nervous system in thinking of the costs and
consequences of each suggestion and comparing them before acting.
5. Finally a decision is reached, after comparing the implications of various
suggestions. One of the suggestions is adopted for trial. If it succeeds, the problem
is solved. If it does not the process has to be repeated.
The scientific method simply makes these steps more exact and formalized, with all
sorts of instruments and techniques for carrying out each one in a thousand and one different
situations. Thus we might give a generalized description of scientific method in five steps
paralleling Dewey's steps in problem solving, as follows:
1. Problem formulation. Every piece of scientific research must start with the
formulation of a problem, whether it is in the developed form of a hypothesis statable
in an equation to be crucially tested, or in more exploratory form of an attempt to find
out how phenomena in a certain field behave or even in the instrumental form of
developing an instrument with which to observe better and attack more effectively a
more ultimate problem later. Thus it took many decades for the microscope and
telescope to be developed, and it is similarly taking time for demoscopes or polling
agencies observing human populations by sampling techniques to become
developed to a similar degree of precision.
2. Observation of the facts. The scientist then observes, with the help of the best
instruments available, all the facts relevant to his problem. It is often difficult to
determine what is relevant, and frequently much time is wasted in irrelevant attacks
through the use of concepts that are not fully observable. Semantic difficulties are
great, especially in the social sciences, where our folk words for such things as
"conflict," "competition," "co-operation," do not have exact meanings and therefore
lead to fuzzy observing. Discovery of the best concepts or categories by which to
guide and organize the observing takes many decades, as physicists and chemists
found in following up famous false leads, such as in the pursuit of 'phlogiston' before
oxygen was discovered, and in the pursuit of alchemy.
3. Generalization. Observing the facts yields principles which, if uncertain, are called
hypotheses, and if well established or verified are called laws. These are brief
statements of how phenomena behave under specified conditions, verified by
obtaining identical results in many particular instances. Generalizing takes many
forms. One form is classifying the phenomena in some orderly scheme. Another
form is calculating averages, and other statistics, that summarize the phenomena.
Other forms of generalizing are stating a relationship between two or more variables,
such as the volume and pressure of a gas, malarial parasites and malarial fever, and
perhaps absence of international authority and international wars under conditions of
expanding national interests.
4. Deduction. Deducing goes hand in hand with inducing of generalizations from
particular cases. Deducing means going from a principle to particular instances of it,
from a class of behavior to predicting how members of that class will behave. A set
of principles is given, whether derived from empirical observation; or derived from
theorems, proved from earlier principles; or derived as axioms assumed to start a
theoretical system. The implications or consequences of these principles can then
be deduced by use of the rules of symbolic logic.
Logic is the science of deduction. It develops a set of rules for valid deducing
which are used in common by all the empirical sciences. Mathematics is a further
extension of logic in developing rules for drawing inferences about variables or
numbers, whereas logic deals primarily with qualitative symbols such as words and
sentences. Both are combinable, and also are combinable with the phenomena to
which the symbols refer, as in dimensional analysis. Dimensional analysis in physics
and in sociology is a system for dealing with phenomena by the exact rules of
symbolic logic and of mathematics.3
A typical outcome of the deduction is "if principle X holds for phenomena Y
under conditions Z, etc., then observing a case of phenomena Y under conditions Z
will show the behavior X." The conditions may be an experimental set up or a
statistical selection of similar conditions happening in astronomy or in sociology, in
neither of which is experimental manipulation always possible.
5. Verification. After such deductions are formulated, trials are made, and if the
phenomena behave as deduced or predicted, we say that the principle is verified
under those conditions. It is important to note that every scientific law without
exception, even the theorems of mathematics, holds only under the conditions under
which it is derived (which includes the terms in which it is stated). Man strives to
generalize these conditions as much as possible. But at any given time he can make
predictions or use scientific laws only within limits of their appropriate conditions.
Thus gravity on our earth has never yet obeyed the exact mathematical law. This
law holds only under conditions of a vacuum, which has never been perfectly
achieved in human experiments. Euclid's theorems in geometry hold only under
conditions of the axioms and definitions of terms in his type of geometry. A large part
of scientific method then is the search for the relevant conditions under which the
generalization holds good.
The above steps are not by any means always taken in the order stated here. Like a
moving staircase, many researches begin anywhere along the line. The steps may be in
varying combinations. They are, however, a useful, standardized pattern, since all the steps
have to be taken in some form or other at some time or other in scientific problem solving.
Another note should be made that it is technology which solves problems by applying
known principles to particular situations. Pure scientific research usually does not solve
immediate problems but discovers principles which can be manipulated more powerfully and
generally to solve specific problems in the future. A major difficulty in social science is that the
public is impatient to have immediate problems solved and does not realize the necessity of
spending decades — and the efforts of thousands of able researchers equipped with hundreds
of millions of dollars of resources — to develop through pure research the basic laws on which
technology and the ultimate solution of immediate problems can be built.
III. An Example of Scientific Method in Dealing with England's Wartime Food
Problem
In England, in the early part of the War, the system of food rationing and distribution
was in a very unsatisfactory state, with much complaint from the public about it. The problem
was to reduce public dissatisfaction and provide an adequate diet for every person in the
British Isles under conditions of wartime, with the British Isles not producing enough to feed
themselves and having to import much of their food with limited shipping through submarine
infested areas.
The second step of observing the relevant facts consisted of setting up a system of
ration points and cards which assured every citizen of an adequate diet, with no waste. Issuing
these cards to all the population then determined the exact amount of total foodstuffs needed
every day for the entire country. Every grocery store and food distribution center kept daily
records of every commodity and No. 11 Downing Street knew the entire consumption of every
commodity every day through every grocery store in the country. They also knew the exact
amount of shipping tonnage and places from which food must be imported, and the competing
need for using shipping space for munitions and other war purposes.
All this gave the facts for the past up to the current moment, but it was necessary to
forecast the immediate future. How much would the housewives want of spinach and carrots,
etc., six months away in the summer time? Orders for allotting shipping space, farm quotas,
etc., must be made six months to a year in advance. The government developed a surveying
agency, or demoscope, to interview a representative sample of housewives. When this sample
of a few thousand yielded results on actual consumption that agreed within a percentage point
or so with the known consumption of all foodstuffs in the British Isles, they considered their
instrument for observing was sufficiently accurate to use in forecasting the future.
Then the interviewers asked the representative sample of housewives how much they
expected to use of each commodity one month, six months, etc. in the future. The percentages
and averages made up were the generalizing of the particular data from interviews. From
these, tables of forecasts and deductions were made as to how much of each commodity
should be brought in from each source on each ship on each voyage.
This scheme was increasingly put into practice and verified the previous inductions and
deductions. During the latter part of the war, the excellence of the food administration became
one of the most satisfactory parts of the war effort to the British public. The forecasting through
polling not only adjusted the supply to the demand in detail but assured maximum satisfaction
of the maximum number of housewives under the given conditions. It is a remarkable and little
known triumph of social science that the economy of the British Isles of nearly fifty million
people almost gave up, for distributing essential foodstuffs, the use of money in determining
prices, demand, and supply. It used the artificial system of ration points as a currency,
determining needs through polled expectations months in advance of the actual demand in the
stores. In other words, the ability to observe, generalize and predict the complexities of an
entire distributive system were found to be within our present state
Of course, there were costs for this. Food purchase and production had to be
subsidized. Certain freedoms of buying and selling where and when one liked had to be
sacrificed for the security of getting food without waste and getting it equally for all. These were
values accepted by the British nation in order to win the War.
IV. An Example of Scientific Method in the G. I. Bill of Rights
After World War I and many previous wars, demobilization created a great problem. If it
was carried on too fast, transportation facilities were glutted, business could not absorb the
returning soldiers, and there was much unemployment and bitter dissatisfaction from the
returning men who had no jobs and crowded housing. If it went too slowly, there was resentment
among the soldiers, and frequent rioting over the delay in getting home.
The American government foresaw this problem and sought to return ten million or more
service men to civilian life as promptly and smoothly as possible after World War II. In addition
to planning the transportation facilities, etc., a major plan for a G. I. Bill of Rights to provide
funds and educational opportunity was proposed to ease the expected unemployment.
No one knew in 1942 and 1943 how to draw up such a Bill of Rights, or what it would
cost. How many of the twenty million eligible men would want to go to school instead of
marrying or hunting jobs? Estimates of costs varied from three to twenty-four billion dollars.
To solve this problem, more exact facts or observations were needed. The Army
requested the Morale Branch, a surveying agency, to ask a representative sample of the troops
in all theatres what they intended to do. This sample survey (duly corrected for the intensity of
the answers and the probability of married men and men at different levels of schooling actually
doing as they said they intended to do) revealed that some 8 per cent of the Armed Forces
would use the G.I. Bill of Rights if drawn with certain specifications. This generalization was
acted upon and the Bill was drawn up. A point priority system was developed for demobilizing,
based upon the troops' own weighting of such factors as the individual's number of years in
service, number of dependents, combat experience, etc. Trial of the system, as a verification
step, showed less dissatisfaction with the demobilization system in this war than in perhaps
almost any other war in our history. There were no riots. The expected unemployment problem
did not develop (due not only, of course, to the educational opportunities but also to the
reconversion program and many other factors). Always in a social situation many factors are to
be expected and problems are to be solved by attack upon more than one. Solutions are to be
expected only in progressive degrees as more factors are more completely taken into account.
The final verification of the whole experiment was dramatic. Here was a case of social
scientists making an exact prediction that 8 per cent of some twenty million eligible service
men and women would behave several years later in a new way (since none of them had had
to make a decision about demobilization before in their lives). The writer was told about a year
ago that the outcome had been that some 8.1 per cent of the eligible G.I.'s had used the Bill of
Rights since the war. Here is a degree of precision in predicting the behavior of many people
well in advance which can compare favorably with predictions in the physical sciences.
A warning should be noted that all prediction has a certain error, from sampling or
experiment or observation, and the more precisely the science can estimate the size of this
error and control it, the more it is entitled to be known as an exact science. Predictions of
human behavior from polls should be stated as a certain amount, plus or minus a certain probable error. This error, when due to sampling, can be calculated by mathematical formulae. The
best scientific work, whether predicting elections or something else, should state a figure with
the caution that it may wobble on either side within probable error limits, and thus not lead the
public to expect greater precision than the instrument can currently deliver. In polling, this error
in sampling is well known, and while it may cost thousands of dollars and take thousands of
hours to compute, with a mathematical formula that will cover a whole blackboard if expanded,
yet it is possible to get any degree of precision if sufficient funds are available. The sampling
error can be reduced toward zero, of course, by using more refined but more costly
procedures.
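For the simplest case, the sampling error of a polled percentage can be sketched in a few lines. The sketch assumes simple random sampling, which is only the most elementary of the designs a poll may use, and takes the probable error in its usual sense of 0.6745 standard errors. The sample size of 5,000 and the function names are hypothetical and are not taken from the surveys described above.

```python
import math

# Half of a normal distribution lies within 0.6745 standard errors of its mean.
PROBABLE_ERROR_FACTOR = 0.6745

def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion p from a simple random sample of n."""
    return math.sqrt(p * (1.0 - p) / n)

def probable_error_of_proportion(p, n):
    """Probable error: the proportion falls within this margin half the time."""
    return PROBABLE_ERROR_FACTOR * standard_error_of_proportion(p, n)

def sample_size_for_probable_error(p, target):
    """Smallest simple random sample whose probable error does not exceed target."""
    return math.ceil(p * (1.0 - p) * (PROBABLE_ERROR_FACTOR / target) ** 2)

# Hypothetical poll: 8 per cent of a sample of 5,000 give a certain answer.
margin = probable_error_of_proportion(0.08, 5000)
print(f"8 per cent plus or minus {100 * margin:.2f} percentage points")
print(sample_size_for_probable_error(0.08, 0.005))
```

Because the error shrinks only with the square root of the sample size, halving it requires about four times as many interviews, which is why greater precision is chiefly a matter of funds.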
V. An Example of Scientific Method in Basic Social Research—the
Hypothesis of Group Gravity, or Interactance
The above examples of scientific methods helping us with our problems of human
relations are unusually favorable ones. A completer survey of the situation would show a vastly
larger number of human problems still unsolved. The scientists' working hypothesis in such
cases is that we need more basic research digging up fundamental principles from which, in
time and often in unexpected and indirect ways, solutions to immediate problems may be
forthcoming. Have the social sciences evidence of such basic principles or incipient laws
developing? Again, out of many possible examples only one can be given here. It is selected
as one which is potentially of great importance and which is at an early stage of development
and therefore shows the methodology and some of its uncertainties.
This example has been called the "interactance hypothesis," or the principle of
demographic gravitation between human groups. Following up the work of Zipf, Stewart and
others, our Public Opinion Laboratory is contributing a number of studies to test it.4
A. The Problem. The problem is "What is the quantity of interaction between two
groups; or, how many of the interacts of members will be interacts between the two
groups?" Thus, in the United States, if the interaction is telephoning, what will be the
expected number of telephone calls between any two cities? Other examples of
interacting that have been studied are migrating of people between states of the
United States; travel of passengers by bus, plane or train between states or cities;
students from different states attending universities in other states; post office
money orders between Seattle and other cities; automobiles by states visiting Mount
Rainier Park; people in one town intermarrying with people in other towns; spreading
of a rumor about an interracial rape case; and a considerable number of other kinds
of actions between people.
The problem may be put as: Is there any principle expressible in a
mathematical formula that would predict the most probable amount of one kind of
interacting of any two groups under given conditions?
B. Observation of Facts. In order to get the facts systematically observed, a matrix
must be used to define a group. A matrix is a rectangular arrangement of numbers
or items in rows and columns. It is a mathematical name for a tabulation. In the case
of telephoning in the United States, for example, the matrix will have a column for
each city and a row for each city in the same order. The cell where a row and
column intersect will give the number of telephone calls between the row and
column cities. The cities are subgroups of the total U.S. group. The telephone
company has given us the data of the number of phone calls in each cell, thus
specifying the actual amount of interacting between each sub-group and every other
sub-group in the entire U.S. group. Further observation by Zipf at Harvard and
Stewart at Princeton and others have indicated that this number of actual telephone
calls is highly correlated with three factors. These are
1) the total number of telephone calls in city A,
2) the total number of telephone calls in city B, and
3) the distance between those two cities — for every pair of cities.
For one period of time, the most probable number of phone calls or interacts
between any two cities is expected to be proportional to the product of the number of
phone calls of city A (whether within itself or outward bound) times the number of
phone calls (of all kinds) in city B, divided by their intervening distance, and
multiplied by a constant k, which adjusts the units. This constant k can be proved by
dimensional analysis to be the reciprocal of the total number of phone calls in all
U.S. cities. (It is the analogue of the physicist's constant of gravity, g, at the surface
of our planet.) The formula in simplified form is
$$I_e = k \, \frac{P_A P_B}{L} \qquad \text{(The Interactance Defined)}$$
where P is the total number of phone calls for the sub-group identified by the subscript, L is the intervening distance, and $I_e$ is the expected number of phone calls
between sub-groups A and B. From this formula it is possible to calculate the
expected number of phone calls between every one of the possible pairs of cities.
The number of pairs is the combination of n things taken two at a time. This means
that if there are n cities there will be (n² - n)/2 pairs of cities, or observations of
actual telephoning to compare with the expected telephoning.
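A minimal sketch of the calculation for every pair of cities follows. The three cities, their call totals and their distances are invented for illustration, and the function name is hypothetical. The constant k is taken as the reciprocal of the total number of calls, as stated above.

```python
from itertools import combinations

def expected_interactance(calls, distances):
    """Expected interacts for every pair of sub-groups: k * P_A * P_B / L.

    calls maps each city to its total number of calls; distances maps each
    alphabetically ordered pair of cities to the intervening distance.
    """
    k = 1.0 / sum(calls.values())
    expected = {}
    for a, b in combinations(sorted(calls), 2):
        expected[(a, b)] = k * calls[a] * calls[b] / distances[(a, b)]
    return expected

# Hypothetical data for three cities.
calls = {"A": 1000, "B": 500, "C": 250}
distances = {("A", "B"): 100.0, ("A", "C"): 200.0, ("B", "C"): 50.0}

expected = expected_interactance(calls, distances)
n = len(calls)
print(len(expected), (n * n - n) // 2)  # number of pairs, counted two ways
for pair, value in expected.items():
    print(pair, round(value, 3))
```

The expected values are in relative units here, so it is their proportions from pair to pair, rather than their absolute size, that would be compared with observed telephoning.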
C. Generalizing. The interactance formula above is a generalization drawn from sets of
similar data which social scientists have reported in the last few years. It is
somewhat simplified, for there are actually more variables than are shown here in
the rough or first approximation form of the formula. The hypothesis of interactance
is that interactance as defined by this formula will correlate highly with observed
amounts of interacting of the sub-groups in pairs.5
Alternative generalizations or hypotheses may be constructed to be tested to
see whether they fit the data or agree with the actual interacting better than the
hypothesis above does. Thus at first the hypothesis was tried in a simpler form,
namely, that the product of the two populations of the two cities divided by their
intervening distance would be proportional to the actual amount of interacting. It was
found that if the two cities had different levels of interacting (i.e., different amounts of
per capita acting of the given kind) this introduced a correction or refinement in the
formula. The expected interaction given by the formula then correlated more closely
with the amount of observed interacting. This revised the formula to be as above,
where the population of each city multiplied by its per capita amount of telephoning
becomes simply the total number of telephone calls of that city. Other variations in
the formula have been tried, and many more probably will be tested.
D. Deduction. From the induction or generalization expressed in the formula above, the
deduction can be made that interacting between two groups in some particular way,
such as in sending first class mail to each other, will be proportional to the factors in
the formula above.
It is possible in this case of group gravity, however, to start deduction further
back. Instead of beginning with the generalization derived from observation, we may
begin with assumptions or first principles. Thus by the law of joint probability, the
probability of two independent events occurring simultaneously is the product of their
independent probabilities. The probability of any telephone call in the United States
being a telephone call involving city A is the ratio of A's telephone calls to all
telephone calls in the United States. Similarly, the probability of city B's telephone
calls is the proportion that B's calls are of all calls. The joint probability of an interact
between A and B, then, is the product of these two probabilities. It may be written in
the cell where the A row and B column intersect as the product of the telephone calls
of each community, divided by the square of the calls in the United States. Every cell
will have this square of calls in the United States in the denominator so that it is a
constant throughout the matrix and is absorbed in the constant, k. Then the
expected number of calls between any two cities appears in the matrix as the
product of the calls of the row city and the column city. Thus by deduction from the
mathematical law of joint probability, the product of the two P’s appears as the most
probable or expected number of telephone calls (when further corrected by dividing
by the intervening distance).
By similar mathematical reasoning it can be shown that the distance operates
as a divisor taken to the first power. This has been empirically shown by solving for
the exponent on the distance, or L factor in the formula, and finding that this
exponent is close to unity. It can be deductively shown by geometric proofs from
theorems about the areas of concentric circles. Thus the factors in the interactance
formula can be either induced from observed acts of cities or deduced from
algebraic and geometric principles.
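The deduction can be illustrated numerically. The following is a minimal sketch only: the city labels, call totals, distances, and the function name `expected_interaction` are hypothetical and are not taken from the Seattle tests. It assumes the squared national total is absorbed in the constant k and that distance enters as a divisor to the first power.

```python
def expected_interaction(calls_a, calls_b, distance, k=1.0):
    """Expected interacting between two cities: k * calls_a * calls_b / distance.

    The squared national total from the joint probability is the same in
    every cell of the matrix, so it is folded into the constant k.
    """
    return k * calls_a * calls_b / distance


# Hypothetical call totals and intercity distances (illustrative only).
calls = {"A": 400.0, "B": 200.0, "C": 100.0}
miles = {("A", "B"): 100.0, ("A", "C"): 200.0, ("B", "C"): 50.0}

# One expected value per pair of cities, i.e. per cell of the matrix.
expected = {
    pair: expected_interaction(calls[pair[0]], calls[pair[1]], distance)
    for pair, distance in miles.items()
}
```

With k left at 1.0 the values are only relative; in practice k would be fitted so that expected and observed totals agree.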
E. Verification. To verify the interactance hypothesis, we calculate the correlation
coefficient between the observed and the expected numbers of interactions between
cities in all possible pairs. If this correlation coefficient is near unity, it means high
agreement of observation with theory. The formula defining the expected interacting
fits the observed interacting in proportion to the size of this correlation coefficient. In
the case of the telephone data, the correlation coefficient is over .90, which is
extraordinarily high for data in the social sciences.
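As a sketch of this verification step, the correlation coefficient between expected and observed interacting can be computed as follows. The figures are invented for illustration and the helper name `pearson_r` is an assumption; only the ordinary product-moment formula is used.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expected and observed numbers of calls for four city pairs.
expected_calls = [800.0, 200.0, 400.0, 100.0]
observed_calls = [760.0, 230.0, 410.0, 90.0]

r = pearson_r(expected_calls, observed_calls)
```

A value of r near unity would indicate close agreement of the formula with observation for these pairs.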
In general, in the studies made to date, the correlation coefficient runs in the
.80's or .90's in proportion as the conditions hold under which this interactance
hypothesis is expected to operate. One of these conditions is that there must be a
large number of acts so that the theory of probability can work out smoothly. In
physical gravity, the smallest dust particle or observation ever made about gravity
involves many billions of molecules and consequently the probability works out very
smoothly, giving very close agreement between the formula for gravity and the
actual observed gravitational energy between two masses.
Further sociological conditions are not well known yet, but it has been
suggested that if the acts are evenly distributed in the population, in space (a
surface), and in the successive time periods, then the formula will be in its simplest
form. If the acts are not evenly distributed in these three ways, the formula may
become more complicated. These further conditions are further hypotheses,
however, and are only dimly discerned as yet, with almost no experimental testing of
them reported in the literature to date.
The principle of group gravity may become as important for human relations as physical
gravity is in physics. It is an interesting fact that they both have the same formula or
operational definition, so that the energy of physical gravity and human interactance of groups
can be thought of as two special cases of a more generalized principle. The energy of gravity
is proportional to the product of the two masses divided by their intervening distance. The two
masses are proportional to the number of molecules in each, if the molecules are all of one
kind.
The generalized principle may prove to be something like the following: Given
independent particles which interact with each other in some way, then any two aggregates of
them will interact in proportion to the product of the number of particles in each aggregate and
inversely to the distance between the aggregates. If the particles are molecules, that principle
becomes the law of physical gravity; if the particles are persons, the principle states the
hypothesis of interactance of human groups; if the particles are dice, the principle states the
mathematical law of joint probability. The particles, however, might be bacteria or mice or any
independent particles whatever. If this principle of human interactance is more fully verified by
testing on more kinds of interacting and in more situations and by more scientists so that it
becomes an established law, then it would be an outstanding case of scientific law in the social
sciences and, for that matter, a law unifying or holding in common between the physical and
the social sciences.
Before this principle can be more than a hypothesis, however, it must be retested many
times and under many conditions. Any graduate student for a thesis could take some kind of
interacting (such as registered mail between cities or between nations, or a clientele visiting a
store, library, doctor's office, church or other center) and test the factors in the interactance
hypothesis one at a time, in isolation first and later in combinations. Here is a virgin field of
great promise in social science, wide open to graduate students.
A warning is needed about further experiments in this field: that in proportion as the
number of cases is small, the agreement of theory and observation will be low. In proportion as
the number of cases is large and representative, the agreement should be much closer. It is to
be expected that in dealing with small samples, random fluctuations and many other factors
will overlay the gravity principle, just as they do in the physical sciences. When such
discrepancies from the model or formula are found, they become problems to measure and
explain by some further principle. Thus science progresses by successive approximations
developing a system of laws. These laws explain and predict particular events with ever increasing precision.
These three examples of scientific methods applied to human relations are only a
glimpse of what some sociologists see as possibilities in the future. To realize even a fraction
of these possibilities, however, society must support social research on a par with physical
research. Young men of talent must respond to the challenge and devote themselves to
applying scientific methods to human problems with the objectivity and persistence and
exactness hitherto shown by students of the more exact sciences. In such ways the sociologist
believes that science can contribute in the long run to resolving, to some extent, the crisis in
our modern culture. So we conclude: If we use them rightly, scientific methods can help to
save us.
Notes
1. George A. Lundberg, Can Science Save Us? New York, Longmans, Green & Co., 1947.
2. Social scientists are increasingly analyzing every "ought" sort of statement into an "if ...
then" type of scientific statement. To say "A ought to do X" can be recast by the scientist
into the statement "If A does X, then some implicit desideratum will probably follow."
Depending on how strongly A wants that desideratum in the given situation and on his
estimate of the probability of its following, the "ought" has strong or weak obligatory force to
him.
3. Dimensional analysis in sociology is developed in detail in the author's Systematic Social
Science (Seattle, University Bookstore, 1947, 987 pp).
4. See "The Interactance Hypothesis" by S. C. Dodd, expected to appear in The American
Sociological Review in April 1950. Mr. Joseph Cavanaugh has worked out the tests made
in Seattle.
5. This interactance hypothesis is stated in the inequality: r > .95, where r is the correlation
coefficient between the expected and observed numbers of phone calls. A correlation
above .95 is very high, showing close agreement of the theoretical and the actual. If r = 1.0
the expected and the observed numbers of interacts are identical.
#13. On Reliability in Polling
A Sociometric Study of Errors of Polling in War Zones
Abstract
At the request of the Allied authorities, surveys were conducted in Syria and later in
Sicily to test the reliability and utility to administrators of public opinion polls in occupied
territories.
Reliability was operationally defined in terms of agreement of reobservations. The
various dimensions of fallibility were observed as errors correlated with
1) informants,
2) interviewers,
3) their interrelations,
4) schedules,
5) media,
6) dates,
7) residual errors.
Interrelations were further analyzed into differences of language, sex, sect, status and two
degrees of acquaintance, as in the informant and interviewer being either personal friends or
strangers. Novel among the indices to measure each dimension was the probability, P, of
goodness of fit of two distributions from a survey and a resurvey, as this measured agreement
a) for all statistical moments simultaneously and
b) between schedules of miscellaneous qualitative items mixed with quantitative items.
Experimental designs eliminated each error or isolated each in turn for measurement.
This report emphasizes the statistical techniques for measuring the more strictly sociometric
errors, especially the effect of friendship and its correlated sincerity of response upon these
polls among former enemies.
The outstanding finding on almost all questions was of imperfect reliability when
reobserving the individual (63% - 81%) but almost perfect reliability (99%) when reobserving
the plural.
American University of Beirut
I. The Problem of Unreliability in War Zones
Several polls of public opinion have been held in the Middle East and Mediterranean
Theater, in part for the purpose of determining the feasibility of polling in liberated,
cobelligerent, or enemy territory. The question at issue was whether valid and reliable civilian
intelligence, needed by the administrative and propaganda authorities in wartime, could be
gathered by public opinion polls in a population, of whom many were enemy sympathizers.
The present paper is not concerned with the content of the enquiry, nor with its main purpose
of exploring the military utility of polls in occupied territory, but only with the methodology. It is
a report on techniques developed to isolate and measure the chief factors of reliability in
interviewing. These techniques apply to sociometric research generally and transcend these
specific surveys.
For the first poll, radio listening habits were chosen as a non-political issue, to be polled
in Syria and Palestine which were territories with pro-Axis factions and accessible at the
beginning of 1943. A ten percent sample of the radio public was surveyed. No public opinion
poll had ever been held among this Arab population. The people in those countries have much
less of a tradition of free speech, conducive to frank and fearless replies to a stranger’s
questioning, than in Anglo-Saxon countries where polling has been developed. Part of the area
had recently been a battle-ground and occupied by a liberating army, and the tensions of war
and restriction of censorship, with political concentration camps in the vicinity, were expected
to increase the tendency of these people to calculate possible consequences to themselves
and to consider what the surveyors might want them to say before answering the interviewer’s
questions. Another source of insincere responses which were expected to lower the reliability
and validity was certain nationalistic feelings against the two mandatory powers, whose radio
stations and governments joined in sponsoring the poll.
As a result of the success of the Syrian poll, further exploratory surveying in Sicily was
requested by the authorities. In a three-month trial a hundred Italian interviewers and clerks
under Anglo-American officers were trained, and a polling organization developed. Eight
surveys were carried out on a tenth of one percent sample of the four million Sicilians. They
covered shelter conditions after the bombing, clothing needs, food rationing and distribution,
confidence in public officials, public security, public information, radio listening, and
cobelligerency.1
II. The Dimensions of Reliability
Reliability was defined as the degree of the agreement among reobservations under
specified conditions. Its complementary aspect is unreliability, or error, which is whatever
decreases agreement among reobservations. Each specified condition of reobserving defines
one dimension of reliability. The length of each dimension is composed of two segments: the
degree of agreement and the residual degree of disagreement. If perfect agreement is called
100%, each dimension will comprise a percentage of agreement and its complementary
percentage of error. Indices for measuring the length of a dimension are described below.
In sociometric situations the chief dimensions of reliability are most observable as the
errors, or variation, in:
1  The informants                 Populational dimensions
2  The interviewers               Populational dimensions
3  Their interrelations           Indicatory dimensions
4  The schedule cards             Indicatory dimensions
5  The media of communication     Sensori-spatial dimensions
6  The dates of the interviews    Temporal dimensions
7  Residual factors               Indicatory dimensions2
1. As the informants vary, their individual differences yield sampling errors whenever
data about a whole population are inferred from a part of that population.
2. As the interviewers vary, individual differences yield observing errors due to the
human instrument.
3. As each interperson relation between the interviewer and the informant varies, this
variation yields interrelation errors. This dimension of reliability has been largely
neglected and unmeasured hitherto. The most important types of interrelation were
conditioned by:
a. Differences of language. Among the many languages in Syria and Palestine the
interviewers must among them be able to speak the language of any informant.
b. Differences in sex. In Moslem households a male caller may not talk with a
woman and cannot enter if the menfolk are absent.
c. Difference of sect and nationality. Foreigners, Jews, Arabs, Moslems and
Christians were best approached by members of their own in-group.
d. Difference in status. The interviewer was respected in proportion as his social,
economic and educational status was equal to, or higher than, that of the
informant.
e. Degree of acquaintance. Under wartime conditions the informants were expected
to talk more sincerely and frankly to a personal friend than to a stranger.
Suspicion and fear were very real, with the proximity of concentration camps, full
of political prisoners.
4. The various schedule cards used in any survey specify the questions surveyed and
so define the opinion or behavior that is observed. When one schedule card or
instrument of observation is correlated with another well established one, the latter is
called a criterion and the former is said to be "valid" in proportion as it correlates with
the criterion. "Validity" is thus made a sub-class of "reliability" since validity is the
special case of reobserving with a different indicator of the same phenomenon.
5. As the media of communication vary from face-to-face interviews, to telephonic
conversations, and to mailed questionnaires, the sensory distance of the
communicators varies from being within full sight and hearing of each other, to being
partially within hearing range only, and finally to being beyond either sense directly,
and dependent on mediating symbols. As the sensory distance between the
interactors increases, the informants’ interest is apt to decrease and many errors of
polling can be best measured as correlated with the medium of communication.
6. As the dates between observations vary, their intervening events yield errors of
change, or unreliability that is correlated with the time interval.
7. A residual dimension of reliability provides for all the error which is unanalyzed as
yet, and which awaits further research to be isolated, measured and controlled.
a. For the plural:
1) The significance ratio of a difference in indices, interpreted in a probability
index, P.
2) The goodness of fit of two distributions, the chi square test interpreted in a
probability index, P.
b. For the individual:
3) The reliability correlation, r.
4) The proportion of discrepant responses, d.
As all of these measure the agreement of two observings of the same phenomena,
they may be called agreement indices. They have the property in common of ranging in value
from zero to unity, from absence of agreement to perfect agreement, or from absence of
probability to perfect probability of agreement. The significance ratio and reliability correlation
are customary indices for quantitative variables. As the other two indices are less customary
as measures of reliability, some description of them may be useful here.
The proportion of discrepant responses, d, may be described as applied to the radio poll
in Syria. The schedule card in this radio poll had more than a hundred items of response,
whether checking a “yes,” or “no,” or entering a number. The schedule card recording the first
interview was compared with the card recording the second interview and the number of items
marked differently, regardless of how great the discrepancy, was counted and reduced to a
percent of the possible number of discrepancies. These percents were averaged for the plurals
of N persons. This percent of discrepancy for the individual measure will be called the
individual error, d. Its complement, d’, measures the “individual reliability” (d’ = 100 – d).
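The counting procedure just described may be sketched as follows. This is an illustration under stated assumptions: the cards are represented as equal-length lists of item responses, the sample responses are invented, and the function names are hypothetical.

```python
def percent_discrepant(first_card, second_card):
    """Percent of items marked differently on two schedule cards.

    Any difference counts once, regardless of how great it is.
    """
    if len(first_card) != len(second_card):
        raise ValueError("both cards must record the same items")
    differing = sum(1 for a, b in zip(first_card, second_card) if a != b)
    return 100.0 * differing / len(first_card)


def mean_individual_error(card_pairs):
    """Average the per-person percents of discrepancy over N persons."""
    percents = [percent_discrepant(a, b) for a, b in card_pairs]
    return sum(percents) / len(percents)


# Two hypothetical informants, each interviewed twice on four items.
pairs = [
    (["yes", "no", 12, "yes"], ["yes", "no", 20, "yes"]),  # 1 of 4 items differs
    (["no", "no", 5, "yes"], ["yes", "no", 5, "no"]),      # 2 of 4 items differ
]

d = mean_individual_error(pairs)   # individual error, in percent
d_prime = 100.0 - d                # individual reliability, in percent
```

Here the two informants show 25% and 50% of discrepancies, so d is 37.5 and d’ is 62.5.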
In addition to this individual error, there is plural error, or unreliability for the plural of N
individuals. The frequency distribution of the answers to a question represents the answer of
the plural to that question. To the extent that the frequency distribution differs on
reobservation, the plural’s reliability is low. To compare the two distributions the chi square is
calculated and the goodness of fit probability, P, is read from a chi square table. This P is the
probability that the two distributions may be samples from the one parent population. It is a
convenient index to measure the degree of approach to identity in the two distributions. As P
approaches unity, the reobservations of the plural agree, and there is no plural error.
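A sketch of this index follows, assuming the two distributions are given as counts over the same answer categories and are compared as a two-row contingency table. The counts are invented; the function names are hypothetical; and the probability is computed from a series for the incomplete gamma function rather than read from a printed table. Every category is assumed to have at least one answer in one of the two surveys.

```python
import math


def chi_square_two_samples(first, second):
    """Chi square and degrees of freedom for two frequency distributions
    over the same k answer categories (a 2 x k contingency table)."""
    total_first, total_second = sum(first), sum(second)
    grand_total = total_first + total_second
    chi2 = 0.0
    for f, s in zip(first, second):
        column_total = f + s
        for observed, row_total in ((f, total_first), (s, total_second)):
            expected = row_total * column_total / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(first) - 1


def chi_square_p(chi2, df):
    """Upper-tail probability P of a chi square value, using the series
    expansion of the regularized lower incomplete gamma function."""
    a, x = df / 2.0, chi2 / 2.0
    if x <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Hypothetical answers of one plural to one question, survey and resurvey.
survey = [50, 30, 20]
resurvey = [48, 33, 19]

chi2, df = chi_square_two_samples(survey, resurvey)
p = chi_square_p(chi2, df)
```

The nearer p lies to unity, the closer the two distributions approach identity.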
The individual and the plural errors have a one-way relation. If there are no individual
errors, there can be no plural errors. But the converse is not true.
If there are no plural errors, there still may be (and usually are) individual errors. For these
individual errors may cancel each other out so that the distribution curve is unchanged, even
though the individuals exchange places within it.
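A small invented example may make the one-way relation concrete. In the sketch below (the informant labels and answers are hypothetical), two of four informants exchange answers between survey and resurvey, so that individual error is present while the distribution of answers is unchanged.

```python
from collections import Counter

# Hypothetical answers of four informants to one yes-or-no question.
first_survey = {"p1": "yes", "p2": "no", "p3": "yes", "p4": "no"}
second_survey = {"p1": "no", "p2": "yes", "p3": "yes", "p4": "no"}

# Individual error: percent of informants whose answer changed.
changed = [p for p in first_survey if first_survey[p] != second_survey[p]]
individual_error = 100.0 * len(changed) / len(first_survey)

# Plural error: compare the frequency distributions of the answers.
same_distribution = (
    Counter(first_survey.values()) == Counter(second_survey.values())
)
```

Half the informants changed their answers, yet the plural still divides two "yes" to two "no" on both occasions.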
These two indices were used chiefly because they applied alike to both qualitative and
quantitative variables in the schedule card, which is usually composed of both kinds of
variables. Thus whether the question had qualitative answers, such as “Which broadcasters
do you like?” or a quantitative answer, such as “How many times a month do you listen to…?”
the same index could be used to compare their reliability. 3
II. Control of Reliability
Specifying the various dimensions of reliability and the measuring of each constitutes
the identifying type of operational definition. The other type of operational definition requires
specifying the procedures and materials for producing or modifying that which is defined. These
specifications are summarized for the various dimensions of sociometric reliability in the
tabulation in Figure 1.
The topics in the middle column of the tabulation specify in outline the procedures and
materials to produce greater reliability. These specifications are expanded in detail, of course,
in manuals on social research, and in the two survey reports already cited. They have been
tabulated here merely as part of a systematic study of reliability in one investigation. In
general, these operations to produce reliability are of two types: either one selects the factors to
be as desired, or one modifies them by appropriate administrative procedures. All these
operations to control reliability were carried out in the surveys as described in the printed
reports.
The summarizing topics in the last column specify how to measure the degree of
reliability that has been produced. This constitutes the identifying type of operational definition
since as identification becomes more exact it merges into measurement. The operations to
calculate the agreement indices, which were described above, are specified more fully in
statistical text books. But the operations, including the experimental design, by which the
variables are observed and fed into the statistical formulae, require further specifying. This is
discussed in Section D, in which the principles just outlined are applied to the wartime survey
made in the Mediterranean Theater.
Figure 1:
Summary of Operational Definitions of Sociometric Errors

For each dimension of reliability the operations, i.e. the procedures and materials, are specified:
   A. To produce reliability
   B. To measure the reliability: compute the appropriate “indices of agreement” when other
      errors are controlled between resurveys, comparing the items listed under B.

1. Informant or sampling error
   A. Make the population sample
      a) Adequate in size
      b) Representative in composition
      c) Identical if repeated, i.e. a plural
   B. a) Same and different samples varying in size or in composition

2. Interviewer or observer error
   A. a) Select competent persons (by interviews, application forms, aptitude tests, etc.)
      b) Train the interviewers (up to standard set by achievement tests)
      c) Use identical teams (for comparisons of groups or resurveys)
   B. b) Same and different interviewers

3. Interrelation errors
   Due to differences, especially if with inferiority, in
   a) sex, b) age, c) color, d) language, e) sect, f) nationality, g) organization, h) status
   A. a) Select interviewers of classification identical or superior to the informants,
         wherever this matters
      b) Enhance interviewers’ status by dress, introduction, endorsements, honorific symbols
      Guide a) and b) by tests of social distance and status
   B. c) Same and different interrelations, i.e., differences in classification

   Also due to
   i) Degree of acquaintance
   A. a) Select friends or strangers as interviewers
      b) Develop acquaintance by introductions, visits, publicity, etc.
   B. d) Two degrees of acquaintance (i.e., friends and strangers)

   j) Conditions of interviewing
   A. a) Select or arrange favorable conditions
   B. e) Same and different conditions

   k) Procedures of interviewing
   A. a) Teach techniques of skilled interviewing up to a standard of 1% of interviews completed
   B. f) Same and different techniques

4. Schedule card errors
   A. a) Specify a criterion: an accepted index of the phenomena to be measured
      b) Define the phenomena operationally
         i) Standardize questions and answers
         ii) Discover ambiguities from field tests
         iii) Prepare a Manual of Instructions
   B. g) Same and different schedules (criteria)

5. Media errors
   A. a) Select media with minimal sensory distances
   B. h) Same and different media

6. Temporal errors
   A. a) Select identical dates or minimal intervals
   B. i) Same and different dates

7. Residual errors (still unisolated)
   A. a) First step: Analyze and invent hypotheses
   B. j) Invent experimental designs
III. Measurement of Reliability in the Syrian Poll
The remainder of this paper passes from theory to application in describing how the
indices for measuring each dimension of reliability were carried out in the Syrian poll of radio
listening habits. The general principle running through all these measurements was to isolate
each error as free as possible from the other five identified errors and measure it as it varied
alone. Usually its variation was still at the primitive all-or-none level of comparing the absence
of the error with its presence in some specified way. Thus false responses may be expected to
be absent between friends, but possibly present between strangers under the local war
conditions.
For most dimensions of reliability, however, the measurement of errors is more
complicated in that it requires three sets of observations: one to establish the data at a given
point on the dimension at issue, a second set of observations to determine how much the data
change on mere resurveying on the same point on that dimension, and a third set to determine
how much more the data change for a different point on that dimension. This isolates the
difference in the dimension from other uncontrolled factors which usually vary in repetition of a
survey. Thus, for example, in the sample shown in Figure 3 three interviews of each informant
were required to isolate the acquaintance dimension and measure it separately from other
dimensions of reliability. If one survey, using strangers as interviewers, had been compared
only with a second survey, using friends of the informants as interviewers, the difference would
have measured the difference in the acquaintance relation plus other differences or errors
which are inherent in mere repetition of a survey, even when all conditions are apparently
identical. Hence a survey by friends was compared with a second survey by the same friends
to isolate and measure the errors inherent in mere repetition. Then the difference with a third
survey by strangers measured the acquaintance plus repetition errors. Subtracting the
repetition errors from this isolates the amount of error along the acquaintance dimension.
In generalized terms, a first survey establishes data, a; a second survey, under
apparently identical conditions, establishes data a + e, the e representing errors inherent in
mere repetition, i.e. variations escaping control between surveys; and a third survey, although
apparently changing only by the difference, d, actually establishes a + e + d. In order to isolate
the d, the difference between the first and second surveys, (a + e) – a = e, must be subtracted
from the difference between the first and third surveys, (a + e + d) – a = e + d, giving
(e + d) – e = d. In the example above, d is the difference between friends and strangers, i.e., it
is the acquaintance variable observed in a primitive all-or-none way. (These formulae assume, as a
first approximation, that the errors are additive and uncorrelated. If otherwise, the formulae
become more complicated.)
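The subtraction may be sketched in a few lines. The two discrepancy figures are hypothetical, and the function name is an assumption; the sketch holds only under the first approximation stated above, that the errors are additive and uncorrelated.

```python
def isolate_dimension_error(repetition_error, changed_error):
    """Return d = (e + d) - e, assuming additive, uncorrelated errors.

    repetition_error: discrepancy between the first and second surveys (e)
    changed_error:    discrepancy between the first and third surveys (e + d)
    """
    return changed_error - repetition_error


# Hypothetical percents of discrepancy (not findings of the Syrian poll).
e = 12.0          # friends resurveyed by the same friends
e_plus_d = 30.0   # friends compared with strangers

d = isolate_dimension_error(e, e_plus_d)
```

With these invented figures, 18 of the 30 percentage points would be assigned to the acquaintance dimension and 12 to mere repetition.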
A. Sampling errors
1. Adequacy of sampling
The adequacy of the size of a sample is usually measured by calculating the
significance ratio or its fiducial limit. A more rigorous test of adequacy is comparing the
goodness of fit indices from the distributions of a sample with its subsamples. To the
distribution curve of a thousand cases as a criterion, the distribution of five hundred cases, of
four hundred cases, of three hundred, of two hundred and of one hundred, were fitted on each
of the more important questions. As the median probability index, P, was .99, even down to the
sample of one hundred in both the Syrian and Sicilian surveys, the conclusion was that the
sampling could be cut to one-tenth and still keep to within one percent of the same degree of
accuracy. This indicated reducing the Syrian sample from ten percent to one percent of radio
listeners, and reducing the Sicilian sample from .1% down to .01% of the citizenry. These
findings answer the major question in these trial polls, namely as to the cost of further polling,
since about half the budget varies directly as the number of interviews.
Another more economical way of determining the adequate size would be to ask, “What
is the smallest sample whose distribution fits that of another equal sample with a probability of
99%?” and then to explore successively larger samples until the standard is met.
2. Representative sampling
The answers to the more important questions were broken down in respect to region,
sex and income, and in some cases in respect to occupation and age. On these questions and in these
respects the extent to which the reliability depends on the composition of the sample is exactly
determinable. For the new means and other indices could be readily calculated for a sample of
any other composition in these respects by reweighting the components appropriately from the
breakdowns.
For a simple example of this, suppose the average number of occasions per month of
listening to the radio were found to be 20 for men and 30 for women, in a sample composed
45% of men and 55% of women. The average for that whole sample, calculated without regard
to sex, automatically weights the men’s mean by the factor of .45 and the women’s mean by a
factor of .55 to yield 25.5 as the general mean:
20 X .45 =  9.00
30 X .55 = 16.50
           -----
           25.50
But if the true proportions of the two sexes in the population were known to be .5 and .5,
the true mean for the whole population is readily recalculated and found to be 25 occasions of
listening per month:
20 X .5 = 10.00
30 X .5 = 15.00
          -----
          25.00
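The reweighting can be sketched as a short calculation, using the means of 20 and 30 that appear in the arithmetic above. The function name `weighted_mean` and the dictionary layout are assumptions made for illustration.

```python
def weighted_mean(group_means, proportions):
    """General mean from group means weighted by each group's proportion."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return sum(group_means[g] * proportions[g] for g in group_means)


# Occasions of listening per month, by sex, as in the example.
means = {"men": 20.0, "women": 30.0}

# Mean as the sample happened to be composed (45% men, 55% women).
sample_mean = weighted_mean(means, {"men": 0.45, "women": 0.55})

# Mean recalculated for the known population proportions (50% each).
population_mean = weighted_mean(means, {"men": 0.5, "women": 0.5})
```

The same function serves for any other breakdown, such as region or income, provided the group means and the true proportions are known.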
Three indices were used in studying the representativeness of sampling:
1) a correlation or contingency coefficient between the answers to one question, X, and
a characteristic, Y, suspected of creating bias.
2) a goodness of fit probability, P, between the proportion of a characteristic or its
subclasses, Y, in the parent population and those proportions in a sample.
3) a significance ratio, (t), of a difference between any index as calculated from a
biased sample and as recalculated from an unbiased sample, as above.
The correlation diagnosed the relevant characteristics, Y, which had to be
representatively sampled, i.e., whose proportions had to be matched in the parent population
and in the sample.
From experience to date, a correlation greater than .5 is taken to indicate a
characteristic which must be representatively sampled. Thus, if sex correlates with a question
above .5, then the sexes must be sampled in proportions matching those of the parent
population.
The goodness of fit test measured how well the matching of proportions had been done.
The significance ratio measured, in standard scores, the amount of bias. A bias was
defined as any characteristic which is unrepresentatively sampled (to the extent of P being less
than .5) and which correlates greater than .5 with the answers to a question. These three
indices made it possible to deal with any bias with quantitative precision by
a) detecting any bias (r)
b) measuring its amount (P), and
c) measuring its effect in distorting the findings (t)
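The definition of a bias given above can be written as a simple rule. This is a sketch only: the function name is hypothetical, the two cutoffs of .5 are those quoted in the text, and taking the absolute value of the correlation is an added assumption to cover negative coefficients.

```python
def is_bias(r, p_fit, r_cutoff=0.5, p_cutoff=0.5):
    """True if a characteristic counts as a bias under the stated rule.

    r:     correlation of the characteristic with the answers to a question
    p_fit: goodness of fit probability between the sample and population
           proportions of that characteristic
    """
    relevant = abs(r) > r_cutoff            # must be representatively sampled
    unrepresentative = p_fit < p_cutoff     # but was not well matched
    return relevant and unrepresentative


# Hypothetical characteristics with invented values of r and P.
sex_is_bias = is_bias(r=0.62, p_fit=0.30)     # relevant and badly matched
age_is_bias = is_bias(r=0.20, p_fit=0.30)     # badly matched but not relevant
region_is_bias = is_bias(r=0.70, p_fit=0.95)  # relevant but well matched
```

Only a characteristic that fails both tests would then call for the significance ratio to measure its distorting effect.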
B. Informant and interviewer errors
The informant error and the interviewer error were measured separately and in
combination by the experimental design below (Figure 2). The informant error occurred in a
somewhat special form, namely the difference between members of one family. In the radio
poll the unit of observation was the family, and the informant error became the unreliability in
observing the family by questioning one member of it. The experimental design in the case of
the Syrian survey yielded the following results.
Figure 2:
Experimental Design for the Quadruple Sample

                          Informants                 Informant discrepancies
Interviewers              C            D             for constant interviewer
A                         AC1          AD2           AC1 – AD2
B                         BC2          BD1           BC2 – BD1
Interviewer discrepancies
for constant informant    AC1 – BC2    AD2 – BD1
Combined interviewer and informant discrepancies: BC2 – AD2 and AC1 – BD1

N = 187 informants
                                                     Individual       Plural
                                                     reliability      reliability
Interviewer discrepancies for constant informant     d’ = 81%         P = 99%
Informant discrepancies for constant interviewer     d’ = 65%         P = 99%
Combined interviewer and informant discrepancies     d’ = 63%         P = 99%
Here A and B represent two interviewers who visited a household together, in which they
found two informants who were both willing to be interviewed and immediately reinterviewed
by the other interviewer. Subscripts 1 and 2 represent a first and second interview. Thus AC1
denotes the answers of informant C in his first interview, which is with interviewer A. BC2 – AD2
denotes the percentage of discrepancies in the answers of informant C interviewed by B,
compared with the answers of informant D interviewed by A. From this design horizontally
calculated discrepancies, d, measure the informant error with the interviewer the same.
Vertically calculated discrepancies measure the interviewer error with the informant the same
(but not necessarily constant in his answering!) Diagonally calculated discrepancies measure
both errors in combination. In all three cases these errors are isolated; all other errors are
controlled since the medium (interviewing), the schedule, and the interrelations between the
informant and the interviewer are constant and the time interval is zero.
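The three kinds of comparison in this design may be sketched as follows. The four schedule cards are invented, each reduced to four items, and the names used are hypothetical; only the pairing of cards follows the design described above.

```python
def percent_discrepant(card_x, card_y):
    """Percent of items marked differently on two schedule cards."""
    differing = sum(1 for x, y in zip(card_x, card_y) if x != y)
    return 100.0 * differing / len(card_x)


# Hypothetical cards: interviewer (A or B), informant (C or D), interview 1 or 2.
cards = {
    "AC1": ["yes", "no", 10, "yes"],
    "BC2": ["yes", "no", 12, "yes"],
    "AD2": ["yes", "yes", 10, "no"],
    "BD1": ["yes", "yes", 10, "yes"],
}

# Horizontal pairs hold the interviewer constant, vertical pairs hold the
# informant constant, and diagonal pairs change both at once.
comparisons = {
    "informant": [("AC1", "AD2"), ("BC2", "BD1")],
    "interviewer": [("AC1", "BC2"), ("AD2", "BD1")],
    "combined": [("BC2", "AD2"), ("AC1", "BD1")],
}

errors = {}
for kind, card_pairs in comparisons.items():
    percents = [percent_discrepant(cards[x], cards[y]) for x, y in card_pairs]
    errors[kind] = sum(percents) / len(percents)
```

In an actual survey each figure would be averaged over all households visited rather than over a single household as here.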
The outstanding finding was the large individual error (d) and the negligible plural error (P).
When the interviewer changed, a reinterview showed 19% of fluctuations in the answers. But
the two distribution curves fitted each other with a probability above 99%. When the
informants were two different members of one family the discrepancies almost doubled,
running up to 35%. But again the two distribution curves fitted each other with a probability
above 99%. When both interviewers and informants were different in the reinterview, the
discrepancies rose only to 37%, and this 2% of increase was statistically insignificant. The curves
again fitted each other with 99% probability. Apparently the discrepancies were sufficiently
random fluctuations as to cancel each other out, leaving the distribution of answers almost
identical for the same or different informants or interviewers. This high reliability for the plural is
of chief interest to the broadcaster and administrator who must act on the basis of proportions
of the population whose opinions are pro and con, and not on the basis of individual opinions.
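The contrast between individual and plural reliability can be sketched with a small hypothetical example. The answers below are invented for illustration and are not data from the Syrian poll; the function names are likewise assumptions. The sketch only shows how half of the paired answers can differ while the two distributions of answers stay identical, and it does not reproduce the goodness of fit test used in the study.

```python
from collections import Counter

def discrepancy_rate(first, second):
    """Share of paired answers that differ between two interviews."""
    if len(first) != len(second):
        raise ValueError("both interviews must cover the same informants")
    differing = sum(1 for a, b in zip(first, second) if a != b)
    return differing / len(first)

def answer_distribution(answers):
    """Proportion of each answer category in one interview round."""
    counts = Counter(answers)
    return {category: n / len(answers) for category, n in counts.items()}

# Hypothetical answers of eight informants to one yes/no question.
first_round = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
second_round = ["yes", "no", "yes", "no", "yes", "no", "no", "yes"]

# Individual level: how often did the same informant change his answer?
individual_discrepancy = discrepancy_rate(first_round, second_round)

# Plural level: did the proportions of answers change?
same_distribution = answer_distribution(first_round) == answer_distribution(second_round)

print(individual_discrepancy, same_distribution)
```

Here the individual changes run in both directions in equal numbers, which is the cancelling of random fluctuations that the text describes.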
A further technique broke down the interviewer error into one of its components, and
isolated this component for measurement. The component was the “recording error,” i.e. the
discrepancies between different interviewers in recording one and the same interview. The
interviewers witnessed one interview in common, and their schedule cards recording it were
compared, and the discrepancies calculated as a percent of all the possible discrepancies.
Early in the training period this test yielded some 10% of discrepancies, but at the end of the
training it was reduced to 3% in both Syria and Sicily on the radio schedule (and to .6% in
Sicily on a mixed schedule of 230 questions dealing with food, shelter, clothing, public officials
and news dissemination). This indicates that the differences in recording between different
interviewers were only a small fraction of the interviewer error (three out of thirteen percentage
points). The balance of the interviewer error seems to be due to the mere repetition of the
interview, i.e. the intrinsic fluctuation of the opinions in the informant. For the next experiment
found 25.5% of discrepancies in the Syrian radio schedule on reinterviews where both
interviewer and informant were identical, but where an average interval of 9.4 days separated the interviews.
C. Interrelation errors
In the experiment shown below the main point was to measure the error due to the
interpersonal relations of acquaintance. Degree of acquaintance was measured as an all-or-none
variable by observing it in the interview between friends and again in the interview
between strangers. An informant may be expected to tell his true opinion to his personal friend
more frankly and sincerely than to a stranger, especially under war conditions in the Arab
East. Thus the interview with a close friend was taken as a criterion of true opinion against
which to validate the opinion given to a stranger in the usual course of surveying. A sample of
374 personal friends of the interviewers was interviewed three times — twice by their friend
and once by a stranger. The sequence factor was properly rotated so as to cancel out. Here, in
the two interviews by the same friend, even though the informant, interviewers, interrelation,
schedule, and medium were constant, and the mean time interval was 9.4 days, yet 25.5% of
discrepancies were found. Compared with this, 28% of discrepancies — constituting a small
but statistically significant 2.5% excess over the 25.5% above — were found between the two
interviews, one of which was with a stranger and one with a friend, (with the same mean time
interval of 9.4 days). This differential of 2.5% is due to the two errors of differing degrees of
acquaintance and of different interviewers. These two errors were inseparable in this experimental
design since the interviewing friend and the interviewing stranger had to be different persons.4
The conclusion from this measurement of the acquaintance error is that it is very small, the 2.5
points being only about 10% of the total discrepancy observed in this experiment (Figure 3).
Another statistical technique, which converges to the same conclusion, was to calculate
the correlation between the degree of acquaintance and the amount of individual error. Taking
each of these as all-or-none variables the following experimental design resulted.
Figure 3: Experimental Design for the “Triple” Sample

                                        Friendship
Individual error, i.e.        Present        Absent
discrepancy in response       (Friends)      (Strangers)     Total
Present                       13%            14%             27%
Absent                        37%            36%             73%
Total                         50%            50%             100%

R = .02    σr = ±.05    N = 374
The contingency coefficient and the fourfold correlation coefficient here were both .023,
showing that the presence or absence of friendship was unrelated to the presence or absence
of error. Reliability was uncorrelated with the degree of acquaintance in interviewing. The
variation of this interpersonal relation is thus shown not to have affected the findings of the
survey.
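As a check on the arithmetic, the fourfold correlation can be recomputed from the cell percentages of Figure 3. This is a minimal sketch that assumes the standard phi formula for a 2x2 table and the usual rough standard error 1/√N for a correlation near zero; the function and variable names are illustrative and not taken from the original study.

```python
import math

def phi_coefficient(a, b, c, d):
    """Fourfold (phi) correlation for the 2x2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Cell proportions from Figure 3.
# Rows: individual error present / absent. Columns: friends / strangers.
error_friends, error_strangers = 0.13, 0.14
no_error_friends, no_error_strangers = 0.37, 0.36

phi = phi_coefficient(error_friends, error_strangers,
                      no_error_friends, no_error_strangers)

# Assumed approximation: standard error of a near-zero correlation, N = 374.
standard_error = 1 / math.sqrt(374)

print(round(abs(phi), 3), round(standard_error, 2))
```

The magnitude of the coefficient comes out at about .023 and its standard error at about .05, in line with the figures reported with Figure 3, so the coefficient is well within sampling fluctuation of zero.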
A third statistical technique, which again converges to the same conclusion, measures
these individual discrepancies in answers by the same goodness of fit test by which the plural
error was measured. The all-or-none distribution of discrepancies observed between friends
was fitted to the same distribution observed between strangers, i.e. the first column in Figure 3
was fitted to the second column. The probability of fit was .99, showing that the two
distributions can be considered as drawn from the same universe with practical certainty, or,
stated differently, that the differences between the two degrees of acquaintance could be
attributed entirely to random fluctuations of sampling.
All this meant that the plural of Syrian informants gave their opinions as sincerely to the
ordinary interviewer, coming as a stranger, as they did to a personal friend. This established
the validity of the public opinion poll by the sociometric criterion of friendship which is assumed
to be highly correlated with sincerity or truthfulness in the informant.
The acquaintance error was the only interrelation error whose variation was isolated
and measured. The other interrelation errors were administratively controlled and eliminated.
The interviewers were selected so as to eliminate unsuitable differences in respect to language, sex,
sect and nationality, and status. Furthermore, in the experiment shown in Figure 3 where the
same interviewer revisited the same informants, all these interrelations were constant.
D. Schedule and media errors
The next two sources of unreliability — the schedule and media error — were not as
fully measured. The schedule was checked on several specific questions against other
surveys, and showed closely comparable findings. But these checks were neither general
enough nor rigorous enough in measurement to report as a contribution to systematic
techniques. For the media an experimental design, such as in Figure 2, was started to
compare interviews with mailed questionnaires when isolated from the racial-linguistic
differences between Jews and Arabs in Palestine, using 1,516 interviews with Arabs and
2,761 questionnaires from Jews. But the usual difficulties of time and funds precluded carrying
out this measurement of cultural group differences disentangled from differences of media.
E. Temporal errors
In order to measure the temporal error, the intervals between a friend's first interview
and the same friend reinterviewing the same informant in the Triple Sample of 374 persons
(Figure 3) were divided at the median into two equal plurals. The mean time interval for one
group was 2.8 days and for the other was 16 days. The average discrepancies between
interviews were 32% and 24% respectively. This shows that the shorter the interval of time
between interviews the greater their discrepancy. Within the limits explored here, longer
intervals yield more similar replies. Our interpretation of this unexpected finding is the "new
conversation" hypothesis. In an immediate reinterview the informant prefers to amplify and
qualify his previous answers rather than repeat them exactly, but after a longer interval the
details of his former conversation have faded more and his replies will tend to repeat his basic
answers. This means that, for maximal reliability here, several weeks should elapse before a
person is reinterviewed.
For the plural, the index of probability of goodness of fit of the findings for the longer
interval compared with the shorter interval shows that the time interval makes no difference.
Without exception on the questions plotted the probability was .99, indicating almost complete
identity between the two interviews by the same friend, whether the interviews were half a
week or over two weeks apart. Similarly the probability of fit was .99 between the schedules
filled out by a friend and by a stranger, regardless of whether longer or shorter intervals
between interviews were studied.
The conclusion about the time interval between interviews is, then, that individual
fluctuations exist and decrease with time, but so cancel each other as to result in a stable
distribution of answers for the plural whatever the time interval (within a month).
From all these measurements of the various dimensions of sociometric reliability the
summarizing findings were that plural reliability was nearly perfect (99%) under the conditions
specified, while individual reliability ranged from 81% at best to 75%. It was 81% when the
interviewer error alone varied out of the six classes of identified error, and it was 75% when
the temporal error alone varied with the other five classes of error held constant.
F. Residual errors
There seems a residual core of individual error, amounting at most to 19% of
discrepancy here, which remains unanalyzed. Towards eliminating it by further analyses and
experimental designs, two explanatory hypotheses were developed but only partly explored.
One hypothesis was that this residual error is partly due to the "new conversation" factor noted
above; the other hypothesis is that the residual error is partly due to the fluctuant nature of
some of the data. On some questions, people's responses are intrinsically superficial and
fluctuating. Thus when asked, "How often did you listen to program (or station) X in the last
seven days?" an informant may reply in his first interview, "Oh, it's hard to say, perhaps two or
three times"; and in his second interview, "I'm not just sure. I'd guess about twice in the last
week, perhaps." The first answer is recorded as 2.5 times a week, the second as 2, and a
discrepancy is marked up because of the shifting phrasing or slightly fluctuant opinion of the
informant.
For the "new conversation" hypothesis further evidence was available. Most people in
conversing prefer to say and hear some new remarks and not repeat the remarks they have
just made. Even the bore — who is unusual — does not repeat himself verbatim. This
tendency leads an informant in a reinterview to go on to new aspects of the questions about
radio listening and not merely reiterate his former answers. In Arabic especially a conversation
is enjoyed for its "sparkle," and interest in well-phrased sentences, and not merely for its
expressing precise facts. Frequently the interviewers found this attitude in the informant explicit
as when one informant, in answer to the question on reactions to performers on the air,
remarked in the reinterview, "I told your companion here all about the speakers X and Y on the
radio, so now I will tell you more about some of the other speakers whom I like." The resulting
difference in recorded statements was counted as discrepancies by the technique used here.
Further evidence for this hypothesis of a "new conversation" factor consists in the observed
fact that the question which asked the informant, "Are there any performers on the air whom
you especially like or dislike?" accounted for some 30% of the average discrepancy,
although the number of its items on the average was only 10% of all items of answering. It
was the question which gave the informant freer latitude than other specific questions (such
as, "In what languages do you listen?") and therefore gave three times its due share of
discrepancies on reinterviewing.
IV. Conclusion
The answer to the methodological question, "How reliable are the surveys in war
zones?" was that the reliability of the individual was imperfect, while the reliability of the plural
was nearly perfect. The reliability of the individual person's answers ranged from 63% to 90%
depending on the variation in the informants, the interviewers, their interrelations, the schedule
card and the time interval between surveys.
On most questions the answers of a plural showed a 99% excellence of fit between a
first survey and a second survey reobserving those answers. This high reliability held
regardless of variation in informants, interviewers, interrelation, or time interval; it held in both
the Syrian radio poll and the Sicilian polls about the radio, shelter, clothing, food, public
officials, and news dissemination. This high reliability of the surveys for the population as a
whole was the important assurance the authorities needed before using the survey's findings in
making administrative decisions.
Notes
1. The Syrian poll was reported in "A Pioneer Radio Poll in Lebanon, Syria and Palestine," by
Stuart C. Dodd and Assistants, American University of Beirut, Lebanon, 1943: pp. 103. This
is a public document as the poll was conducted by a University research staff with the
sponsorship, on behalf of the United Nations' radio stations, of the seven governments
involved in that region.
The 148-page report on the Sicilian surveys has the military classification of a
"Confidential" document, and so at present (Feb. 1944) has only a limited distribution.
Questions about the scientific techniques may be sent to the Director, Stuart C. Dodd, care
of the Publicity and Psychological Warfare Division, Supreme Headquarters, Allied
Expeditionary Force.
2. The theoretical considerations underlying this dimensional analysis are more fully
developed in the author's "Dimensions of Society," (Macmillan, 1942 pp. 944).
3. A second reason for choosing the goodness of fit test is that it seemed a more rigorous test
for the degree of similarity in all respects of the two frequency distributions from the plural
that is twice observed. For the fit can only be good, and P approach unity, when each of the
statistical moments in one distribution tends to equal its counterpart in the other distribution.
This implies that the two means must be equal, and the two standard deviations must be
equal, and the two skewnesses must be equal for P to be unity. To test the difference in the
two means in the usual way, by dividing it by its standard error and finding the probability of
the significance ratio, leaves the equality of the second and of higher moments
undetermined. The means may not be significantly different, but the variances may differ
significantly. The goodness of fit seems to summarize the fit of the various moments all in
one index. Since, however, its use as an index of reliability seems unorthodox, it is
described here to invite criticism of this procedure.
4. Another technique would start with a stranger and develop friendship in semi-social
revisiting and thus permit comparison (with a time interval, however) between the stranger
relation and the friend relation, with the interviewer constant.
#14. The Standard Error of a "Social Force"
The Annals of Mathematical Statistics, Vol. 7, No. 4, December 1936
I. Definitions
In the theory of measurement of social forces certain special cases of frequent
occurrence where the population shifts from one date of measurement to the next require the
derivation of appropriate standard error formulae.
The theory may be briefly restated1 in equations as follows: any measurable change, C,
in a population, P, may be defined as the difference in mean scores, S, from surveys or
measurements on the dates denoted by subscripts
\[ C_{2-1} = S_2 - S_1 = \frac{\sum s_2}{P} - \frac{\sum s_1}{P} \tag{1} \]
The momentum of a social change may be defined as the product of the time rate of change
(per year) and the population that is being changed
\[ M_{2-1} = P V_{2-1} = \frac{P\,C_{2-1}}{Y_{2-1}} \tag{2} \]
\[ M_{2-1} = \frac{P}{Y_{2-1}}\,(S_2 - S_1) \tag{2a} \]
where Y2-1 is the period from date 1 to date 2 and V is the velocity, or speed of change, in that
period. The acceleration of a social change is definable as the rate of change of the velocity of
change
\[ A = \frac{V_{4-3} - V_{2-1}}{.5\,Y_{(4-3-2+1)}} \tag{3} \]
where each velocity being an average for its period, is taken as representing the mid-date of
that period.
The resultant social force which produces a measured change is now definable as that
which accelerates the change in a population. It is measurable as the product of the
acceleration and the population.2
\[ F = AP \tag{4} \]
\[ F = \frac{P}{.5\,Y_{(4-3-2+1)}}\left(\frac{S_1}{Y_{2-1}} - \frac{S_2}{Y_{2-1}} - \frac{S_3}{Y_{4-3}} + \frac{S_4}{Y_{4-3}}\right) \tag{5} \]
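The definitions in equations (1) to (5) can be sketched as a few small functions. This is an illustrative sketch only: the function names, the scores and the survey dates are hypothetical, and the denominator of the acceleration is assumed to be the interval between the mid-dates of the two periods, following the remark that each velocity represents the mid-date of its period.

```python
def change(s_later, s_earlier):
    """Equation (1): difference in mean scores between two dates."""
    return s_later - s_earlier

def velocity(s_later, s_earlier, years):
    """Change per year over one period."""
    return change(s_later, s_earlier) / years

def momentum(population, s_later, s_earlier, years):
    """Equation (2a): population times velocity."""
    return population * velocity(s_later, s_earlier, years)

def force(population, scores, dates):
    """Equations (3) to (5) for four surveys of a constant population."""
    s1, s2, s3, s4 = scores
    y1, y2, y3, y4 = dates
    v_first = velocity(s2, s1, y2 - y1)
    v_second = velocity(s4, s3, y4 - y3)
    # Assumption: the velocities are dated at the mid-points of their periods.
    years_between_mid_dates = 0.5 * (y4 + y3) - 0.5 * (y2 + y1)
    acceleration = (v_second - v_first) / years_between_mid_dates
    return population * acceleration

# Hypothetical population of 100 surveyed at four yearly dates.
example_momentum = momentum(100, s_later=20, s_earlier=10, years=1)
example_force = force(100, scores=(10, 20, 30, 50), dates=(0, 1, 2, 3))
print(example_momentum, example_force)
```

In this made-up case the velocity rises from 10 to 20 points a year across mid-dates two years apart, giving an acceleration of 5 points per year per year and a force of 500.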
II. The Sampling error of one case (momentum)
The formulae for the standard errors of sampling for the above concepts, social change,
velocity, momentum, acceleration and force, (C, V, M, A, and F) have been published for the
case where the population, P, is the same on all dates of measurement. But it is not always
possible to observe the ideal experimental technique of holding the population unchanged in
number nor to select out individuals common to all the surveys and to neglect the rest.
Ordinarily there will be different P's, P1, P2, P3, and P4, at the different dates.
To derive the standard errors of (2) and (4) when P shifts, each P is considered to be a
sub-sample3 of the main sample which is (P1 + P2 + P3 + P4). The orthodox view of sampling is
taken where the sub-samples may differ in size but maintain fixed proportions in each main
sample which is drawn from the "parent" population.
Let primes denote an M, or other function of (1) to (5), which is an approximation due to
the shifting of the population and the use of an average P.
To simplify and generalize the notation, let k denote the constant term compounded of
P's and Y's which is associated with each S. The first subscript of k denotes the function, f,
which is any particular one of the left hand members of equations (1) to (5) and the second
subscript denotes the date of its S. Thus, from (2a)
\[ k_{M1} = -\frac{P_1 + P_2}{2\,Y_{2-1}} = -k_{M2} \tag{6} \]
Then (2) may be rewritten:
\[ M'_{2-1} = S_1 k_{M1} + S_2 k_{M2} \tag{7} \]
\[ M'_{2-1} = \sum_{1}^{2} S\,k_M \tag{7a} \]
To derive the standard error of (7) the total differential is:
\[ dM'_{2-1} = k_{M1}\, d\!\left(\frac{\sum s_1}{P_1}\right) + k_{M2}\, d\!\left(\frac{\sum s_2}{P_2}\right) \tag{8} \]
Let Q12 denote the population common to both dates of measurement, so that:
\[ P_1 = Q_{12} + Q_1, \qquad P_2 = Q_{12} + Q_2 \tag{9} \]
Then, since the differential of a sum is the sum of the differentials of the several terms, (8) becomes
\[ dM'_{2-1} = \frac{k_{M1}}{P_1}\left(\sum_{1}^{Q_{12}} ds_1 + \sum_{1}^{Q_1} ds_1\right) + \frac{k_{M2}}{P_2}\left(\sum_{1}^{Q_{12}} ds_2 + \sum_{1}^{Q_2} ds_2\right) \tag{10} \]
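Squaring gives
\[ (dM'_{2-1})^2 = \frac{k_{M1}^2}{P_1^2}\Big(\sum ds_1\Big)^2 + \frac{k_{M2}^2}{P_2^2}\Big(\sum ds_2\Big)^2 + \frac{2 k_{M1} k_{M2}}{P_1 P_2}\Big[\sum_{1}^{Q_{12}} ds_1 \sum_{1}^{Q_{12}} ds_2 + \sum_{1}^{Q_{12}} ds_1 \sum_{1}^{Q_2} ds_2 + \sum_{1}^{Q_1} ds_1 \sum_{1}^{Q_{12}} ds_2 + \sum_{1}^{Q_1} ds_1 \sum_{1}^{Q_2} ds_2\Big] \tag{11} \]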
On summing and dividing by the number of cases to get the expected values, the last
three terms in the square brackets vanish. Using the relation where, in random sampling, the
correlation between two variables is the same as the correlation between their means
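\[ r_{12} = r_{s_1 s_2} = \frac{\sum s_1 s_2}{Q_{12}\,\sigma_1\sigma_2} = \frac{\sum\left(\dfrac{\sum s_1}{Q_{12}}\cdot\dfrac{\sum s_2}{Q_{12}}\right)}{Q_{12}\,\dfrac{\sigma_1\sigma_2}{\sqrt{Q_{12}\cdot Q_{12}}}} \tag{12} \]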
gives
\[ \sigma^2_{M'_{2-1}} = \frac{k_{M1}^2\sigma_1^2}{P_1} + \frac{k_{M2}^2\sigma_2^2}{P_2} + \frac{2\,k_{M1} k_{M2}\, Q_{12}\,\sigma_1\sigma_2\, r_{12}}{P_1 P_2} \tag{13} \]
Standard error of momentum when the population shifts
The best estimates of σ1 and σ2 are the standard deviations of the scores, s1 and
s2, and the best estimate of r12 is, strictly, the covariance of the common cases divided by
the two sigmas. Unless the selection of Q12 out of P1 and P2 curtails the range in some
way (i.e., Q12 is not a random selection), then, except for sampling variation, σ1 and σ2
are the same in the Q12 population as in the P1 and P2 populations so that there is only a
sampling discrepancy between the ratio above and the r12, the observed correlation
between the s1 and s2 scores in the Q12 population.
III. The generalized standard error
The above standard error may be readily generalized. Any of the equations (1) to
(5) may be expressed as a simple linear sum of the products of a variable, S, and its
appropriate constant, k.
\[ f = \sum_{i=1}^{n} S_i k_{fi} \tag{14} \]
where f is any one of the concepts S, C, V, A, M or F defined by (1) to (5) and n is the
number of surveys, or different S's involved, and i denotes each survey in turn from 1 to n.
Thus where f means F, (5) becomes:
\[ f_{F'} = F' = k_{F1}S_1 + k_{F2}S_2 + k_{F3}S_3 + k_{F4}S_4 = \sum_{i=1}^{4} k_{Fi}S_i \tag{15} \]
where
\[ k_{F1} = -k_{F2} = \frac{P_1 + P_2 + P_3 + P_4}{2\,Y_{(4-3-2+1)}\,Y_{2-1}} \tag{16a} \]
\[ k_{F4} = -k_{F3} = \frac{P_1 + P_2 + P_3 + P_4}{2\,Y_{(4-3-2+1)}\,Y_{4-3}} \tag{16b} \]
In the special case when a force, F, has been determined from only three surveys using
two consecutive periods, n = 3 and
\[ k_{F1} = \frac{P_1 + P_2 + P_3}{1.5\,Y_{3-1}\,Y_{2-1}} \tag{16c} \]
\[ k_{F2} = -\frac{(P_1 + P_2 + P_3)(Y_{2-1} + Y_{3-2})}{1.5\,Y_{3-1}\,Y_{3-2}\,Y_{2-1}} \tag{16d} \]
\[ k_{F3} = \frac{P_1 + P_2 + P_3}{1.5\,Y_{3-1}\,Y_{3-2}} \tag{16e} \]
If the difference between two forces (or other functions, f) has been measured in either
the same or in different populations and the significance of the difference in terms of its
standard error is desired, f of (14) can also denote that difference.
\[ f_{dF} = F_a - F_b; \qquad f_{dM} = M_a - M_b; \quad \text{etc.} \tag{17} \]
It is only necessary to write the difference as a linear sum of products of S and k on the model
of (2a) or (5) to get the k-values for that particular f.
It is now possible to write the standard error formula for f in a single generalized form
that covers all the concepts and their differences as defined in equations (1) to (5), (14) and
(17). Observing that (14) is the general case for n surveys of the particular case (7a) where n =
2, it becomes evident that on taking differentials, squaring, summing, and dividing the linear
sum of the n terms of (14) there result n² terms, of which n are variances (times
constants) of the sort k²σ²/P and (n² – n)/2 are different terms, each occurring twice, that are
covariances (times constants) of the sort kkQσσr/PP. From these rough considerations as
well as from rigorous derivation, the generalized standard error of (14) is found to be:
\[ \sigma_f^2 = \sum_{1}^{n^2} \frac{k_{fi}\sigma_i\; k_{fj}\sigma_j\; Q_{ij}\, r_{ij}}{P_i P_j} \tag{18} \]
The generalized standard error.
Here i and j denote each of the n surveys in turn. There will thus be n² terms to be summed,
namely the number of combinations of i with j including the cases where i = j.
The derivation of (18) as well as its computations from data and its interpretation in
special cases can all be made clearer by arranging the terms in a square array as follows:
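\[
\begin{array}{cc|cccc}
 & \text{Coefficients} & k_{f1}\sigma_1/P_1 & k_{f2}\sigma_2/P_2 & \cdots & k_{fn}\sigma_n/P_n \\
\text{Coefficients} & j \backslash i & 1 & 2 & \cdots & n \\ \hline
k_{f1}\sigma_1/P_1 & 1 & P_1\;(\quad) & Q_{12}r_{12}\;(\quad) & \cdots & Q_{1n}r_{1n}\;(\quad) \\
k_{f2}\sigma_2/P_2 & 2 & Q_{12}r_{12}\;(\quad) & P_2\;(\quad) & \cdots & Q_{2n}r_{2n}\;(\quad) \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
k_{fn}\sigma_n/P_n & n & Q_{1n}r_{1n}\;(\quad) & Q_{2n}r_{2n}\;(\quad) & \cdots & P_n\;(\quad)
\end{array}
\]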
To get σf write the computed values of the coefficients kσ/P as captions of rows and of
columns and write each computed Qr value in its appropriate cell, noting that in the main
diagonal cells the self-correlations are unities and the population common to both column and
row surveys, Qij, is the entire population of that survey as Qij = Pi when i = j. Thus Q11 = P1.
Next in each cell's parenthesis enter the product of three factors, namely:
a) the cell Qr term,
b) the column coefficient, and
c) the row coefficient.
The sum of these products in the parentheses, n² in number, is σf² of (18).
From the above square array it becomes clear that whenever in (17) the difference of
two observed forces, or other functions, is derived from different populations the Q between
these populations is zero so that the entire product terms in those cells vanish. Thus in the
very simplest and familiar case of comparing two means from different populations, n = 2, Q12
= 0, k = 1, and (18) reduces to the usual sum of the variances of the two means
\[ \sigma^2_{\text{difference in means}} = \frac{\sigma_1^2}{P_1} + \frac{\sigma_2^2}{P_2} \tag{19} \]
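The square-array procedure for (18) can be sketched as a double loop over the n surveys. The function name, argument layout and sample figures below are assumptions made for illustration; the check at the end uses the simplest case just described, two means from different populations, where the result should equal (19). The constants are written as -1 and +1 for a difference; their signs do not matter here because the off-diagonal cells vanish.

```python
import math

def generalized_variance(k, sigma, P, Q, r):
    """Sum the n*n cells of the square array of equation (18).

    k, sigma, P: per-survey constants, standard deviations and populations.
    Q[i][j]: cases common to surveys i and j, with Q[i][i] = P[i].
    r[i][j]: correlation between surveys i and j, with r[i][i] = 1.
    """
    n = len(k)
    # Row and column captions of the array: k * sigma / P for each survey.
    coefficient = [k[i] * sigma[i] / P[i] for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Cell product: Qr term times column coefficient times row coefficient.
            total += coefficient[i] * coefficient[j] * Q[i][j] * r[i][j]
    return total

# Hypothetical comparison of two means from different populations (Q12 = 0).
k = [-1.0, 1.0]
sigma = [12.0, 15.0]
P = [36, 25]
Q = [[36, 0], [0, 25]]
r = [[1.0, 0.0], [0.0, 1.0]]

variance = generalized_variance(k, sigma, P, Q, r)
standard_error = math.sqrt(variance)
usual_variance = sigma[0] ** 2 / P[0] + sigma[1] ** 2 / P[1]
print(variance, usual_variance, standard_error)
```

With these invented figures both routes give a variance of 13 (4 + 9), which illustrates the reduction of the general formula to the familiar one.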
IV. Some special cases
It should be observed that the above formulae for the standard errors when P shifts all
become identical with the simpler formulae previously derived for the case of a constant P. In
this case, every Qpq = Pp = Pq and in the square array (in addition to k's which no longer
involve an average P), the Q or P of the cells and the P's in the row coefficients, may be
omitted as they cancel each other out.
Another special but very frequent case is where the social change is not even in terms
of a difference in means, S1 and S2, but in terms of difference in percentages, as when a
literacy rate rises from 30% to 40%. A percentage can be viewed as a mean of a two-category,
all-or-none, present-or-absent variable such as: A or non-A (foreign or native born,
literate or illiterate, etc.), where A is assigned a value of 1 and non-A a value of 0. Then the
sum of the values of A, each times its frequency, divided by the population is both a proportion
and a mean. Its standard error in the percentage, p, form of expression is then equal to it in the
mean form:
\[ \sigma_p = \frac{\sqrt{p(1.00 - p)}}{\sqrt{P}} = \sigma_S = \frac{\sigma_s}{\sqrt{P}} \tag{20} \]
(where s = 1 or 0 and p = Σs/P = S)
so that where Si in (14) is a percent, √(p(1.00 - p)) should be substituted for σi (and σj) in (18). In
this case the appropriate formula to use for getting rij in (18) depends on the nature of the
distribution of the variable that is expressed in percentage form. If the distribution is normal,
tetrachoric r may be appropriate, while if the S in percentage form is from a two point
distribution, r from a fourfold point surface may be appropriate.
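The equivalence stated in (20) can be checked numerically. The sketch below uses a hypothetical group of 200 persons, 30% of them literate, and compares the standard error computed from the percentage with the one computed from the raw 1/0 scores; the names and figures are illustrative assumptions, and the standard deviation is taken with divisor P, as the formula implies.

```python
import math

def proportion_standard_error(p, population):
    """Equation (20): sqrt(p(1 - p)) / sqrt(P)."""
    return math.sqrt(p * (1.0 - p)) / math.sqrt(population)

def mean_standard_error(scores):
    """Standard error of a mean from raw scores: sigma / sqrt(P)."""
    n = len(scores)
    mean = sum(scores) / n
    sigma = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    return sigma / math.sqrt(n)

# Hypothetical group: 60 literate persons (s = 1) and 140 illiterate (s = 0).
scores = [1] * 60 + [0] * 140
p = sum(scores) / len(scores)

se_from_percentage = proportion_standard_error(p, len(scores))
se_from_scores = mean_standard_error(scores)
print(p, se_from_percentage, se_from_scores)
```

Both routes give the same standard error, a little over three percentage points for this invented group.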
In all the above cases the usual interpretation of the significance of f in respect
to sampling errors may be used in entering a normal probability table with σf from (18) and
reading the probability of such an f occurring by chance.4
For a numerical illustration of this formula (18), consider the case of two
villages, the statistical significance of whose momenta of a social change is
to be determined. The data are from a study of Syrian villages1 where an itinerant Health Clinic
in two years changed the average hygienic status of the families in each village by amounts of
score (on a scale of 1 to 1000 points, devised for this study) as indicated in the table below.
                                                 Village A    Village B
Mean score in 1931 = S1                          253          321
Mean score in 1933 = S2                          304          526
Population (families) in 1931 = P1               46           46
Population (families) in 1933 = P2               40           32
Standard deviation of scores in 1931 = σ1        54           39
Standard deviation of scores in 1933 = σ2        58           70
Families common to both censuses = Q12           40           32
Correlation of scores from the 2 dates = r12     .00          .19
kM1 = -(P1 + P2)/2Y(2-1)                         -21.5        -19.5
kM2 = -kM1                                       21.5         19.5
kM1σ1/P1                                         25.24        16.53
kM2σ2/P2                                         31.17        42.65
Q12r12                                           0            6.08
σM'2-1                                           261          249*
Momentum = M'2-1                                 1,097        4,037
Significance ratio M'2-1/σM'2-1                  4.2          16.2

*The calculation of this σ by (18) may be illustrated in detail:

Village B
Coefficients, kσ/P      i = 1: 16.53                i = 2: 42.65
j = 1: 16.53            46 (= P1)  (12,571)         6.08 (= Qr)  (-4,286)
j = 2: 42.65            6.08 (= Qr)  (-4,286)       32 (= P2)  (58,208)
The momentum of the movement towards improved hygiene achieved in village A is 4.2
times its standard error, while that of village B is 16.2 times its standard error. The excess
momentum of village B over village A is 8.1 (= 2940/361) times the standard error of their
difference in momenta. Since all three of these significance ratios are well over 3 the
conclusion is that the observed momenta and difference of momenta are statistically significant
and cannot reasonably be due to sampling fluctuations. It may be noted that the significance
ratios for the amounts of this social change, the difference in mean scores, are in close
agreement with the above figures, being 4.1 and 15.9 for villages A and B respectively, instead
of 4.2 and 16.2 as above. These discrepancies of .1 and .3 in the statistical significance of
these social changes compared with the corresponding social momenta are accounted for by
the fact that the shift in the size of the population is allowed for in our formula for the case of
momenta and is not considered in the usual formula for the case of social change.
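The village figures can be retraced with formula (13). The sketch below is illustrative: the function name is an assumption, the period is taken as two years (1931 to 1933), the momenta are taken as printed in the table rather than recomputed, and the standard error of the difference assumes the two villages share no families, so that their variances simply add.

```python
import math

def momentum_standard_error(P1, P2, sigma1, sigma2, Q12, r12, years):
    """Equation (13), with kM1 = -(P1 + P2) / (2 * years) = -kM2."""
    k1 = -(P1 + P2) / (2.0 * years)
    k2 = -k1
    variance = (
        k1 ** 2 * sigma1 ** 2 / P1
        + k2 ** 2 * sigma2 ** 2 / P2
        + 2.0 * k1 * k2 * Q12 * sigma1 * sigma2 * r12 / (P1 * P2)
    )
    return math.sqrt(variance)

# Data from the table for villages A and B.
se_a = momentum_standard_error(P1=46, P2=40, sigma1=54, sigma2=58,
                               Q12=40, r12=0.00, years=2)
se_b = momentum_standard_error(P1=46, P2=32, sigma1=39, sigma2=70,
                               Q12=32, r12=0.19, years=2)

# Momenta as printed in the table (taken as given, not recomputed).
momentum_a, momentum_b = 1097, 4037

ratio_a = momentum_a / se_a
ratio_b = momentum_b / se_b

# Assumption: independent villages, so the variances of the momenta add.
se_difference = math.sqrt(se_a ** 2 + se_b ** 2)
ratio_difference = (momentum_b - momentum_a) / se_difference

print(round(se_a), round(se_b), round(ratio_a, 1), round(ratio_b, 1),
      round(se_difference), round(ratio_difference, 1))
```

Run as written, this gives standard errors of about 261 and 249, significance ratios of about 4.2 and 16.2, and a ratio of about 8.1 for the difference, matching the figures quoted in the text.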
A minimum of three measurements of one population is necessary to determine a social
force. To determine its standard error all the correlations must be secured between every pair
of measurements, each correlation derived from the part of the total population that is common
to that pair of measurements. Obviously the data as currently reported from surveys and
censuses and statistical bureaus do not meet these specifications. More rigorous analysis of
social data and reporting of correlations in it is a prerequisite to the measurement of social
forces and their significance.
Notes
1. A Controlled Experiment on Rural Hygiene in Syria, Dodd, S. C., Publications of the
American University of Beirut, Syria, Social Science Series, No. 7, 1934, pp.336
Also, A Theory for the Measurement of Some Social Forces, Dodd, S. C. Scientific
Monthly, Vol. XLIII, No. 1, July 1936, pp. 58-62
2. Force thus defined in terms of its effect is a resultant force, i.e., the residual force after
deducting all resisting forces from the total force in the direction of the change observed.
This formula defines quantitatively and exactly the “net” force not the “gross” force
producing the change. It thus measures only the observable part of the total forces in the
situation. The fundamental problem remains, as always in science, to observe more
adequately, to devise experimental and statistical techniques for measuring the different
forces (in isolation and in combinations) which facilitate or resist the measured change.
3. The author is indebted to Mr. S. S. Wilks (Princeton) for this method of deriving these
standard errors in a fluctuating population.
4. Mr. Wilks comments here that, "there is a more exact and rigorous test for comparing the
two sets of S’s which enter into a pair of M’s or F’s which involves some recent statistical
theory but it is doubtful if the extra refinement is worthwhile at this stage of sociometric
development."
#15. The Applications and Mechanical Calculation of Correlation
Coefficients
(Presented at a meeting held Thursday, January 14, 1926)
by
Stuart C. Dodd, Special Research Fellow in Psychology, Princeton University
Statistics is the application of mathematics to the social sciences. It is the method for
quantifying complex social phenomena, and attempting to reduce their relationships to
mathematical formulae which will permit us to predict and control the phenomena. Perhaps the
most general problem in science is the search for relationship between phenomena. We never
know the intrinsic nature of a thing but only progressively define it as we know more and more
of its relationships to other things. To learn these relations in physics or chemistry, we can
experimentally vary the conditions one at a time and observe the resulting co-variation of
whatever phenomena we are studying. But we cannot experiment as easily in social,
economic, educational and many psychological and biological investigations. If we wish to
experiment upon the effect of fertilizer on crop yield, we cannot keep the rainfall conditions
constant as yet. We are forced to observe a large number of instances of varying fertilization
and resultant bushels per acre and then select for study only those instances in which the
rainfall has been equal. (I will speak of another method of accomplishing this control through
selection by means of partial correlation later.) If in addition there are many factors to be
controlled, such as chemical constituents of the soil, amount of cultivation, quality and spacing
of seeds, they may require still further selection. It can readily be seen that one must have a
very large number of observations in order to have an adequate sample after several
successive selections.
When we have controlled as many of the experimental factors as possible by selection,
we can proceed to explore for relationships by means of correlation. The correlation coefficient
tells us to what extent two variable phenomena or characteristics are related. It is the measure
of the tendency to co-vary or the tendency of two functions to be associated somehow. Since
in complex phenomena clean-cut co-variation is never found, we can only determine the
amount of the tendency of two things to co-vary or to be associated with each other in some
way. There are many ways of looking at the correlation coefficient. In addition to being an
index of co-variation or correlation, it can, under some special conditions, be regarded as the
percentage of common elements or common causes in the two phenomena studied. The
meaning of the correlation coefficient can perhaps be most conveniently illustrated by a scatter
diagram of the observations, such as in Fig. 1. Here the horizontal axis represents the different
degrees of the x-variable, which is the grade to which children have attained in school. The
y-variable along the vertical axis represents the intelligence test scores of those children. If there
is high correlation the scatter of points will tend to cluster along a straight diagonal line. That
means that for every unit of increase in grade attainment there will be proportional increase in
intelligence test score. If there is low correlation, or if there are many other factors besides
intelligence determining school attainment, then the observed points in the scatter diagram will
be found in a rough circle covering all four quadrants. This means that any intelligence score
would be observed associated with any grade position and that there is therefore no
observable association. This is zero correlation while the grouping along the diagonal line
represents perfect correlation of unity. Intermediate degrees of positive correlation are
expressed by decimal coefficients between 0 and 1.00. The correlation coefficient of Fig. 1 is
.76.
We can go further than merely correlate or test for the presence of relationship. We can
use correlation for prediction. This prediction is subject to limits of error or may be expressed in
terms of increasing probability which are proportional to the size of the correlation coefficient.
Thus, given the intelligence test score of a child, for example the value in the middle row,
275+, we can predict his most probable school grade, 6A, as being the mean of the grades in
that row. This predicted value is stated as that mean, 6A, plus or minus its probable error (one
full grade) which is a measure of the dispersion of the observations about their average. More
precisely the probable error is that amount which includes half the cases or that value within
which the probabilities are 50-50 that a given case will fall. It may readily be seen that, as the
correlation rises, the points constrict more narrowly
towards the diagonal line of relationship, and the scatter of cases on either side of their mean
will shrivel up towards the centre. This means that the probable error of the predicted value will
become smaller or that the prediction increases in precision and certainty. We can express this
efficiency of prediction as a percentage and call it E. It will vary from zero efficiency with zero
correlation or prediction based on pure chance to 100 per cent efficiency of prediction with
correlation of unity. The formula for E is $E = 1 - \sqrt{1 - r^2}$. E is the percentage reduction in the
total range of possible occurrence. It is that part of the scale in which a case in all probability
will not fall. Thus in Fig. 1, if we take the array of Y marked 250±, we see that the cases only
occur in ten of the possible sixteen columns. This is 63 per cent of the range. Cases will not
occur in 37 per cent of the range and we have thus increased our accuracy of prediction 37 per
cent over chance prediction in this particular array. On the average of all the arrays, then, the
range of possible grades that might be found associated with a given intelligence score has been
reduced 35 per cent. Actually, the precision of prediction is better than that, for the majority of
the cases will be grouped near the mean.
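The formula for E can be checked numerically. The following is a minimal Python sketch offered as an editorial illustration, not as part of the original address; the function name `prediction_efficiency` is an assumption introduced here. It evaluates E for zero correlation, for the coefficient of Fig. 1, and for a correlation of unity.

```python
import math

def prediction_efficiency(r):
    """Return E = 1 - sqrt(1 - r^2) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return 1.0 - math.sqrt(1.0 - r * r)

# Zero correlation, the coefficient of Fig. 1 (.76), and unity.
for r in (0.0, 0.76, 1.0):
    e = prediction_efficiency(r)
    print(f"r = {r:.2f}  E = {100 * e:.0f} per cent")
```

For r = .76 this gives the 35 per cent quoted for Fig. 1. Note how slowly E rises: a coefficient must be quite high before the range of likely values is much narrowed.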
With this outline of the meaning of correlation, let me proceed to give illustrations of its
use in different fields. In industry, suppose a manufacturer of some product, such as tires,
wishes to know the relation between longevity of the tire and a number of other factors such as
weight, thickness, composition, quality of materials, etc. With tires varying in these different
respects and tested for wear on a road friction machine, he can determine the exact amount of
correlation or dependence of longevity upon each of these factors in turn. He can even assign
by means of the regression equation derived from the intercorrelations the exact relative
shares of each of these factors in determining the wear of the tire. He can thereupon
proportion the amount of research and factory costs which each of these factors deserve to
produce the tire of optimal longevity. He can strengthen the weak features and save by
reducing the unnecessarily strong ones until he might reach the ideal of the famous one-horse
shay, in which no part was weaker than any other, so that it lived to a good old age and finally
collapsed in all its parts at the same moment.
In insurance, all the different indices of health, such as blood pressure, previous
sicknesses, chemical tests, age, occupation, etc., may be correlated against length of life.
From this calculation we may afterwards predict within known limits of error the most probable
length of a man's life as far as these health indices represent factors which determine the
length of life. There are probably a number of gentlemen here this evening, from whose fields I
am drawing illustrations, who could tell us much more precisely of the use of correlation
coefficients in their problems. I hope that I shall not misrepresent them and that they will
supplement these suggestive illustrations later.
An example might be drawn from the field of politics which has as yet been only very
slightly quantified. Suppose some large civic and research foundation wished to determine
precisely the factors determining the election of mayors, in the interests of improving municipal
administration. From some large group of towns in a given region, data on the mayors elected
and candidates defeated over a period of years might be collected, including such items as
age, length of training in public life, number of times previously appearing on ballots,
education, race, and religion, campaign budget, and indications of strength of the party
organization, such as frequency of meetings between elections, number and value of "plums"
distributed by the party coming into power, etc. Through multiple correlation, which is the
correlation between one variable and a group of several others, we can determine the
influence of the combination of all these factors upon election to the mayoralty. We may write
the regression equation assigning the shares to each of these influences and determine the
share of the unknown or as yet unmeasured residual influences. With this equation a civic
organization might pick its most able potential candidates and decide amongst them by fitting
each candidate's specifications into the equation and determining his probability of election.
An illustration from financial realms might be taken from the stock market. We wish to
predict the course of a given stock or group of stocks. We may collect data or indices of their
strength. We can, by a rather laborious multiple correlation technique, predict the most
probable course of a given set of stocks. The limitations to this are that the available data
represent only a small part of the influences at work in determining prices, and that the
statistical procedure is complex and laborious. It may, however, be worked very successfully
by competent people in the stock market, when proper allowances have been made for
improbability, large enough number of cases, long enough periods of time, and temporary and
unusual disturbances such as the failure of the French debt negotiations, unexpected change
in the rediscount rate, and other less obvious influences which do not appear in the regression
equation.
From all of these illustrations, one should not get the impression that we can predict
with too great precision. Correlation simply enables us to determine the amount of
relationships present in our data. The fundamental and far more difficult problem is to get good
data. To get good data means to measure the subtle and intangible causes which determine
(and so have high relationship to) the phenomena whose behavior we wish to predict and
control.
In biological research we may correlate the relative influence of heredity and
environment in producing the final integrated organism. We may control heredity by
parthenogenetic multiplication of a single ovum and then deliberately vary the environmental
conditions of temperature, humidity, nutrition, stimulation, grafting or extirpation, and correlate
on large numbers of such organisms the resulting conditions with the environmental variations.
Conversely, we may keep constant environmental conditions and vary the heredity by taking
different ova from different parents. We may even make a slight first approximation towards
this sort of experiment in the education of human beings. For example, in a project we were
executing for the National Research Council, it was desired to get intelligence tests which
would measure predominately those factors making for achievement in school, which were
hereditary or due to native ability. All the orphanages in the eastern United States were
canvassed to find the one in which experimental conditions of common environment from as
early childhood as possible could be secured. Then with environmental differences partially
controlled in this way, we sought to predict achievement in school and felt that those tests
which would predict that achievement were measuring differential native ability factors, at least
a little more purely than where environmental training was completely uncontrolled.
In research in higher education it has been found that college admission examinations,
based on preparatory school work for men admitted, have an efficiency of prediction of their
attainments in college courses of about 12 per cent (when r = .45). Intelligence tests have
about an equal efficiency. The school records of the men have a slightly higher efficiency. But
each of these three indices measures some factors not sampled by the others. Consequently,
we find that when we pool their predictions by means of multiple correlation, we increase that
efficiency of prediction from 12 per cent up to 29 per cent (r = .70). In the same way in industry,
if a good criterion of success on a job can be had, a diversity of measures correlating with it
can be found with sufficient research. The prediction by the employment department of an
applicant's future success can be progressively raised in precision.
These efficiencies may seem ridiculously low compared with that possible in the
physical sciences. It must be remembered, however, that predictions we have been making in
the past on such things and considered good by sheer opinion and subjective observation
were even lower. It emphasizes the fact that in social phenomena there are still a vast number
of unmeasured influences at work which await further research.
Correlational techniques are all of very recent development, principally in the last fifteen
years, and have already diversified until their name is legion. There are rectilinear and various
types of curvilinear correlation. There are many approximative as well as very exact formulae.
There are graphical, algebraic, tabular and mechanical methods of solving the formulae in
considerable number. There are further complications of correlation, such as multiple and
partial correlation.
There are a great number of conditions or qualifications which must be observed in their
use and in their interpretation. The popular fallacy that "anything may be proved by statistics"
arises from not observing these limiting conditions. I might mention two of these limitations.
One is the question of adequacy of sampling. One test for that is known as some variety of
reliability correlation. Thus we may have taken observations on some group and we need to
compare the results with an equal number of observations on another similar group. A
difference in averages of the two samples, if large enough to be significant, indicates a
constant difference of some sort. A low correlation coefficient between the observations in the
two samples indicates all sorts of variable influences which are not co-varying or similar in the
two samples. Consequently it is always desirable to know the reliability correlation coefficient
or the correlation between two samples of each of the phenomena correlated before we base
other calculations upon it.
Another precaution is to know the range or variability of the distribution of the two things
correlated. The correlation between ability and achievement of all grade school children will be
very much higher because their range is longer than a similar correlation on a fifth grade group
where the range both of achievement and ability is smaller. Again, in Fig. 1, the cases in any
one array are less widely scattered than in the sum of all the arrays. It is similar to reducing the
labor costs of a large and small firm to percentages of total cost before they can be compared.
In the same way we need to transmute by appropriate formulae, correlation coefficients to
coefficients based upon the same or standard ranges in order to compare amounts of
relationship. So far I have spoken principally of simple correlation between variables two at a
time. The relationships amongst a large number of variables may be analyzed by partial
correlation. A partial correlation coefficient indicates that part of the correlation between two
variables which is independent of the simultaneously varying influence of one or more other
variables. These others are then said to be controlled or partialled out.
In the former agricultural illustration we wish to determine the direct dependence of crop
on fertilizer when rainfall, which inevitably varies at some time, is controlled or partialled out.
We might correlate crop and fertilizer on only those instances where the rainfall was ten
inches, then again where it was twelve inches, and again for fourteen inches, and so on. We
might then average these coefficients, each one of which is obtained for constant rainfall. But
we can do this more conveniently by partial correlation. By a simple formula involving the
observed or zero order correlation coefficients, we may build up successive orders of partial
correlation controlling or partialling out as many extraneous variables as we may have
measured. (This requires many cases as the probable error increases with each order!) Thus
we may learn the relation between fertilizer and crop, independent of rainfall, soil, chemistry,
and other factors.
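The "simple formula" is not written out in the address. The sketch below assumes it is the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The function name and the three zero-order coefficients are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z partialled out."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order coefficients (not taken from the lecture):
r_crop_fert = 0.50   # crop yield with fertilizer
r_crop_rain = 0.60   # crop yield with rainfall
r_fert_rain = 0.30   # fertilizer with rainfall

# Relation of crop to fertilizer with rainfall held constant.
r_crop_fert_rain_out = partial_r(r_crop_fert, r_crop_rain, r_fert_rain)
print(round(r_crop_fert_rain_out, 3))
```

Higher orders are built up in the same way, by applying the formula to the partial coefficients of the order below. This is why every additional variable controlled calls for more cases.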
From these illustrations of the applications of correlations, let us turn to their mechanical
calculation.
Before describing the machine which is illustrated in Fig. 3, let us review the arithmetic
involved, so that you may see the purpose of all the wheels later on. Fig. 2 is a sample
correlation calculation. We start with two series of numbers or observations — an X series and
a Y series. The problem is to determine the amount of correlation between them, or how much
they tend to co-vary. We first get the mean of each series by listing on an adding machine or
otherwise. Next, in the small x column we enter the deviations of each raw score from the
mean of the X series. In the next column we enter the squares of these deviations. Similarly
we get the algebraic deviations and their squares for the Y series. In the last column we get
the cross-products of each x deviation times its paired y deviation. Then we must get the sums
of these columns. These six sums (Σx, Σx², Σy, Σy², Σxy, N) are the ingredients which are
combined in the formula for r, the correlation coefficient given below. In case we know the true
mean, the formula is simple. In case we do not know the true mean, or if it is a decimal value,
we use an arbitrary origin and apply a precise correction factor later on. The formula involving
the arbitrary origin is the second more complicated one.
Variable X     ±x      x²     Variable Y     ±y      y²      xy
     2         -5      25          8          0       0       0
     4         -3       9          5         -3       9      +9
     6         -1       1          7         -1       1      +1
    13         +6      36          9         +1       1      +6
    10         +3       9         11         +3       9      +9
  Sums       Σx = 0  Σx² = 80              Σy = 0  Σy² = 20  Σxy = 25

Mean_x = ΣX/N = 7; M_y = 8
S.D._x = √(Σx²/N) = 4; S.D._y = 2

For true mean as above:
$$r_{xy} = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \cdot \Sigma y^2}} = .625$$
For arbitrary origin:
$$r_{xy} = \frac{N\Sigma xy - \Sigma x \cdot \Sigma y}{\sqrt{N\Sigma x^2 - (\Sigma x)^2}\,\sqrt{N\Sigma y^2 - (\Sigma y)^2}}$$
$$E = 1 - \sqrt{1 - r^2} = 22 \text{ per cent}$$
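The sample calculation above can be retraced in a few lines of Python. This is a sketch for checking the arithmetic, not a description of the author's procedure. The function names are illustrative, and the arbitrary origin of 5 for both series is a hypothetical choice made only to show that the second formula gives the same coefficient.

```python
import math

X = [2, 4, 6, 13, 10]
Y = [8, 5, 7, 9, 11]

def six_sums(xs, ys, origin_x, origin_y):
    """Return N, sum x, sum x^2, sum y, sum y^2, sum xy for deviations from the origins."""
    dx = [x - origin_x for x in xs]
    dy = [y - origin_y for y in ys]
    return (len(xs), sum(dx), sum(a * a for a in dx),
            sum(dy), sum(b * b for b in dy),
            sum(a * b for a, b in zip(dx, dy)))

def r_true_mean(xs, ys):
    """Simple formula: deviations are taken from the true means."""
    n, sx, sxx, sy, syy, sxy = six_sums(xs, ys, sum(xs) / len(xs), sum(ys) / len(ys))
    return sxy / math.sqrt(sxx * syy)

def r_arbitrary_origin(xs, ys, origin_x, origin_y):
    """Longer formula: deviations from any origin, corrected by the sums."""
    n, sx, sxx, sy, syy, sxy = six_sums(xs, ys, origin_x, origin_y)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2))

r1 = r_true_mean(X, Y)
r2 = r_arbitrary_origin(X, Y, 5, 5)   # origin 5 is a hypothetical choice
print(r1, r2)
```

Both formulas return .625 for these five pairs, which agrees with the value worked out by hand.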
The labor involved in getting these six sums is very great and there are numerous
tables, graphs and other short-cut devices that have been put out in the last few years. There
are two machines for this purpose with which I am acquainted. One is a correlation and
forecasting machine developed by Professor Hull, at Wisconsin, and the other one is this
machine which I have developed at the Princeton Laboratory. They are mutually
supplementary, designed for different classes of problems. His is for the Pierce-Arrow trade —
this one for the Ford.
This model (Fig. 4) is a preliminary one with many inadequacies which we hope to
eliminate in the model we are now constructing. This later model is diagrammed in Fig. 3. To
begin a calculation, we group the variables into class intervals, and enter these on the slip of
paper under the setter arms. The operator then sets the X arm and the Y arm at the
appropriate class intervals and turns the crank. This enters the six sums in the six counters at
the right. The X control arm through a gear and rack moves the X pinion to any one of
twenty-one banks of teeth. In each of the d² drums there are ten banks of teeth, each containing as many
teeth as there are units in the successive squares of the numbers from 1 to 10. There is one
tooth of the greatest width, running across the whole face of the drum, four (including the first) of the next width,
nine (including the previous four) of the next shorter, sixteen of the next and so on, up to 100
of the shortest. The X square counter registers one for each arm of its pinion that is hit by a
tooth, and therefore records the square of the X value set by the operator. The +d and -d
drums contain ten banks of ten teeth each. The X counter, which is actuated by the pinion in
mesh with these drums, counts one for each revolution or for every ten teeth. Thus it is
counting the number of units for which the operator has set the X control. Simultaneously and
similarly the Y units and the Y-square units are being cumulated in the two Y counters. This
model will enable the use of the true mean (if that is known) with positive and negative
deviations. The positive deviations will be turned into the counter in a clockwise manner from
drum +d, while the negative deviations will be turned out of the counter through counterclockwise rotation of the pinion. This reversal of the rotation will be accomplished through
having the two duplicate right-and left-hand drums, +d and -d, geared to go in opposite
directions. From either one of them the pinion may take rotation, according as the setter arm is
set above or below the midpoint of its scale.
The xy products are accumulated through the +dxdy, -dxdy mechanism. The +dxdy drum
is mounted rigidly with the x pinion, D, and is therefore making as many rotations as there are
units of deviation in the X score being entered. It is also going in the clockwise or counterclockwise direction according to the sign of the x deviation. The duplicate –dxdy drum, through
an intermediate gear, is always going in an opposite direction to the +dxdy drum. The xy pinion,
C, can take rotation from either drum according to the sign of y and will take rotation from that
bank of teeth which corresponds in number to the units of deviation in the Y score. Thus at
each rotation of the crank shaft the +dxdy drum will revolve x times and on each revolution will
drive the C pinion y teeth, thus achieving the rotation of x times y teeth of the C pinion and the
entry of xy units in its counter. The xy counter then accumulates the values algebraically.
Whether the x deviation is plus or minus or the y is plus or minus, all the four possible
combinations of signs are allowed for and properly entered in the xy counter. A little counter on
the drive shaft cumulates N or the number of cases entered.
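The tooth-counting principle can be imitated in software. The sketch below is a loose model under stated assumptions, not a reconstruction of the actual gearing: each of the 100 teeth of a squaring drum is taken to begin at a certain bank and to span every higher bank, and the cross-product drum is taken to make |x| revolutions of |y| teeth each. All names are hypothetical. The five cases entered are the deviations of the sample calculation.

```python
import math

BANKS = 10
# Tooth i (1..100) is assumed to start at bank ceil(sqrt(i)) and span all higher banks,
# so the single widest tooth reaches bank 1 and all 100 teeth reach bank 10.
FIRST_BANK = [math.isqrt(i - 1) + 1 for i in range(1, BANKS * BANKS + 1)]

def square_drum_hits(k):
    """Teeth striking a pinion set at bank k (k = 0: no bank engaged)."""
    return sum(1 for bank in FIRST_BANK if bank <= k)

def xy_drum_steps(x, y):
    """Signed pinion steps: |x| revolutions, each driving the pinion |y| teeth."""
    sign = 1 if (x >= 0) == (y >= 0) else -1
    steps = 0
    for _revolution in range(abs(x)):
        for _tooth in range(abs(y)):
            steps += sign
    return steps

counters = {"N": 0, "x": 0, "x2": 0, "y": 0, "y2": 0, "xy": 0}
cases = [(-5, 0), (-3, -3), (-1, -1), (6, 1), (3, 3)]   # deviations of the sample calculation
for x, y in cases:                      # one turn of the crank per case
    counters["N"] += 1
    counters["x"] += x
    counters["x2"] += square_drum_hits(abs(x))
    counters["y"] += y
    counters["y2"] += square_drum_hits(abs(y))
    counters["xy"] += xy_drum_steps(x, y)
print(counters)
```

The six totals come out as N = 5, Σx = 0, Σx² = 80, Σy = 0, Σy² = 20 and Σxy = 25, the same sums as in the hand calculation.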
Figure 4
The model in Fig. 3 instead of having the X and Y values entered in by radial setter
arms will have a scatter diagram plotting device. This consists of two arms at right angles, one
with the X scale written along its edge and the other with the Y scale along its edge, and with a
writing point at their junction. With coordinate paper underneath, the operator slides the writing
point to the intersection of the pair of X and Y values, as if he were plotting the points in Fig. 1,
and turns the crank. This yields both the geometric plot with all the information of individual
variations afforded by it and the algebraic coefficient summarizing the amount of relationship in
a single number.
As by-products, the means and standard deviations of both series are secured. The
curvilinear correlation ratio, eta, may be calculated, as may bi-serial correlation. A further
feature is the possibility of taking trial runs, that is, by adding or subtracting cases to find all the
above coefficients for differing population samples and so testing for adequacy of sampling.
For if the coefficients do not vary as more cases are entered in, it is an indication that a
representative sample has been secured. With a Monroe calculator the formula in the case of
the arbitrary origin may be solved in about forty-five seconds from the readings on the dial
faces. With the simpler formula when using the true mean, it may be solved in fifteen seconds
or the operator may solve it roughly mentally and so be able to watch the stability of the
coefficient as he enters more cases.
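The idea of a trial run, in which cases are added or subtracted while the coefficient is watched, can be sketched with six running totals. The class below is an illustrative assumption and not the author's design; it reads r off the totals by the arbitrary-origin formula after each case, using the raw scores of the sample calculation.

```python
import math

class RunningCorrelation:
    """Six running totals from which r can be read off at any moment."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0

    def enter(self, x, y, sign=1):
        """Add one case (sign=+1) or take one back out (sign=-1)."""
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.syy += sign * y * y
        self.sxy += sign * x * y

    def r(self):
        """Correlation by the arbitrary-origin formula, or None if undefined."""
        vx = self.n * self.sxx - self.sx ** 2
        vy = self.n * self.syy - self.sy ** 2
        if vx <= 0 or vy <= 0:
            return None
        return (self.n * self.sxy - self.sx * self.sy) / math.sqrt(vx * vy)

machine = RunningCorrelation()
for x, y in [(2, 8), (4, 5), (6, 7), (13, 9), (10, 11)]:
    machine.enter(x, y)
    print(machine.n, machine.r())      # watch the coefficient as cases enter

full_sample_r = machine.r()
machine.enter(10, 11, sign=-1)         # trial run: subtract the last case
trial_r = machine.r()
print(full_sample_r, trial_r)
```

With only five cases the coefficient swings widely from one entry to the next, which is the instability the operator would be watching for. It should settle as the sample becomes representative.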
As the appreciation of the usefulness of properly controlled statistical work in social and
biometric fields grows, the amount of such calculation is already increasing in geometric ratio.
It is hoped that a mechanical correlation device such as this will greatly reduce the
laboriousness of such calculations and also enable much more research in the statistical field.
#16. A Correlation Machine
Reprinted from Industrial Psychology for January, 1926, Vol. 1, No. 1 Pages 46 to 58
A correlation and standard deviation machine has been developed at the Princeton
Psychological Laboratory. Before describing it, however, an explanation of the industrial uses
of correlation may be convenient for those readers who are not familiar with this statistical tool.
I. The Use of Correlation Coefficients
The coefficient of correlation is an index of the degree of relationship which may exist
between any two series that can be expressed in numbers. It has been defined as the
measure of mutual implication or as the measure of the tendency to co-vary. A few examples
will make these abstract definitions take on more meaning. If we accept the average of a
salesman's earnings for a year as measuring his selling ability, we may determine how well a
given rating by the employment department tends to measure the salesman's ability. We would
correlate the earnings of a group of salesmen with their ratings on the employment
examination and see whether the two series of numbers tend to co-vary, that is, whether the
salesman highest in the ratings is highest in earning power, or whether those lowest in the
ratings are lowest in commissions, and to what extent this tendency is true. If there is a perfect
relationship, each man having the same rank and position in the ratings and earnings, then,
knowing the ratings in advance, we could predict the earnings from them. In complex
phenomena like human abilities, however, there are such a vast number of factors operating that a
perfect correlation like this is never found; the coefficient of correlation tells us to what
extent the tendency towards a perfect correlation may exist. With coefficients less than 1.00
(1.00 indicates perfect correlation; zero, chance relationship or no correlation) we may still
predict the most probable score in one series if we know the subject’s score in the other series
and the correlation between the two series.
II. Errors of Estimate
Since the coefficient is only the most probable value, it is expressed as a certain value
plus or minus its probable error, which is an index of the range in which the coefficient may
fall, or how exactly it is determined. Thus we speak of a correlation of .75 plus or minus .03.
The prediction from one score of its paired score in the other series is also only the most
probable value and has what is called the standard error of estimate which shows the region
above and below the most probable value in which the predicted score will fall two-thirds of
the time. The aim of research by correlation methods is to increase the accuracy of prediction
(or reduce the standard error of estimate) by developing better measures and more accurate
data which will yield higher correlations.
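The article does not give formulas for these two errors. The sketch below assumes the classical large-sample expressions: a probable error of r equal to 0.6745(1 - r²)/√N, and a standard error of estimate equal to the standard deviation of the predicted series times √(1 - r²). The number of cases (100) and the standard deviation (10) are hypothetical values chosen for illustration.

```python
import math

def probable_error_of_r(r, n):
    """Classical probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

def standard_error_of_estimate(sd_y, r):
    """Standard error of estimate: sd_y * sqrt(1 - r^2)."""
    return sd_y * math.sqrt(1.0 - r * r)

r = 0.75
n_cases = 100      # hypothetical number of cases
sd_scores = 10.0   # hypothetical standard deviation of the predicted series

pe = probable_error_of_r(r, n_cases)
see = standard_error_of_estimate(sd_scores, r)
print(f"r = {r:.2f} plus or minus {pe:.2f}")
print(f"standard error of estimate = {see:.1f}")
```

Under these assumptions a coefficient of .75 on one hundred cases carries a probable error of about .03, the figure quoted in the text. Even so, the standard error of estimate remains about two-thirds of the original standard deviation.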
An example of the use of correlation in industry may be taken from the employment
office of a large corporation which is looking for the best applicants for clerks, foremen, or
typists. The corporation would collect data on their employees of that type covering such
variables as length of experience, age, amount of schooling, average previous earnings over
some period, and the score on some aptitude test or questionnaire. They would then form
a criterion with which to correlate these variables. The criterion would be some index of success
in the job such as average wage. Those variables which correlated most highly with this
criterion of ability on the job should be most carefully studied in scrutinizing an applicant's
qualifications, for from them the most certain prediction of success may be obtained.
This method, of course, helps to estimate the contribution towards optimum prediction
of only such variables as can be quantitatively expressed. A thousand and one other variables
of temperament and character and personality traits cannot as yet be quantitatively measured
and must be left to the usual methods of subjective judgment. When, however, it is realized that
this subjective judgment has an enormous standard error of estimate, as has been shown by
many studies, the value of knowing exactly how much dependence can be placed on the
quantitative factors becomes much more evident.
III. Multiple Correlation
Multiple correlation is the next step beyond simple correlation just described. In simple
correlation the relation of the variables two at a time is secured and the prediction of one from
the other alone can be obtained. In multiple correlation, the relation between a team of
variables and some other variable can be obtained. This other variable may be the criterion, or
some social measure, which we wish to predict. To get the multiple correlation coefficient, we
have to secure the correlation of each variable of the team with the criterion, and the
intercorrelations (which are the correlations of each variable of the team with every other
variable). The highest multiple correlation and so the optimum prediction is obtained from a
team each member of which correlates most highly with the criterion and intercorrelates lowest
with the others. For low intercorrelation means that the different variables do not co-vary or
have a very small percentage of common elements, or common causes, and so duplicate each
other least in their contribution towards the measurement and prediction of the criterion.
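The advantage of low intercorrelation can be seen in the simplest case, a team of two variables. The sketch below uses the standard two-predictor formula, R² = (r₁² + r₂² - 2 r₁ r₂ r₁₂) / (1 - r₁₂²), which the article does not state. The coefficients are hypothetical.

```python
import math

def multiple_r(r1c, r2c, r12):
    """Multiple correlation of a criterion with a team of two variables.

    r1c, r2c: correlation of each variable with the criterion.
    r12: intercorrelation of the two variables.
    """
    r_squared = (r1c ** 2 + r2c ** 2 - 2 * r1c * r2c * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)

# Two measures, each correlating .45 with the criterion (hypothetical values).
low_overlap = multiple_r(0.45, 0.45, 0.20)    # the two measures overlap little
high_overlap = multiple_r(0.45, 0.45, 0.80)   # the two measures largely duplicate each other
print(round(low_overlap, 3), round(high_overlap, 3))
```

With an intercorrelation of .20 the team reaches about .58, while with .80 it reaches only about .47, scarcely better than either measure alone.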
IV. Regression Weights
From the multiple correlation calculations, the regression weights for each variable are
obtained. These regression weights indicate the exact relative contribution of each member of
the team of variables towards the optimum prediction of the criterion by that team. Thus in the
examples above with salesmen applicants, we can evaluate the relative importance of
experience, age, amount of education, previous earnings, and test results, and determine the
optimum prediction or most probable selling ability, and its standard error of estimate. It is
necessary that we previously secure the intercorrelations of all these variables and the
criterion of selling ability on salesmen already in the employ of the company. This calculation
yields the equation into which is substituted the applicant's record as far as it can be expressed
in numbers and through which is ultimately secured the most probable prediction with its
known limit of error.
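For a team of two variables the regression weights have a closed form, and a minimal sketch makes the substitution step concrete. It assumes standard-score (beta) weights, β₁ = (r₁ - r₂ r₁₂)/(1 - r₁₂²) and likewise for β₂. The correlations and the applicant's scores are hypothetical, and the article itself gives no numerical example.

```python
import math

def beta_weights(r1c, r2c, r12):
    """Standard-score regression weights for two predictors of one criterion."""
    d = 1.0 - r12 * r12
    return (r1c - r2c * r12) / d, (r2c - r1c * r12) / d

# Hypothetical correlations: experience and test score with selling ability,
# and their intercorrelation.
r1c, r2c, r12 = 0.50, 0.40, 0.30
b1, b2 = beta_weights(r1c, r2c, r12)
r_squared = b1 * r1c + b2 * r2c          # squared multiple correlation

# Hypothetical applicant: one S.D. above the mean in experience,
# half an S.D. below the mean on the test.
z1, z2 = 1.0, -0.5
predicted_z = b1 * z1 + b2 * z2          # most probable standing on the criterion
see_z = math.sqrt(1.0 - r_squared)       # standard error of estimate, in S.D. units
print(round(b1, 3), round(b2, 3), round(predicted_z, 3), round(see_z, 3))
```

The weights show the relative share of each variable in the prediction. The standard error of estimate is the known limit of error that accompanies the predicted value.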
In insurance the longevity of an applicant is to be predicted from data on his
examination blank. The actuarial department may compute the correlation between each
variable such as blood-pressure and other health indices, kind of occupation, age,
presence or absence of certain habits, etc., and length of life, on a large number of cases.
The multiple correlation of the team of these variables gives a much higher prediction than
any one alone or than all of them in combination weighted according to subjective
judgment. The regression coefficients or weights indicate the relative importance of each
of the variables.
An example, from mechanical phenomena, of these correlations might be found in an
automobile company which wishes to predict the length of life of its tires or to evaluate exactly
the different factors which determine long wear. Data would be collected from experimental
tires varying in thickness, size, internal structure of layers, quality of material, air pressure in
the tube, or whatever features were considered worth studying, and the multiple correlation
would be calculated between this team of variables and the resultant criterion of length of wear
in actual use or more exactly as tested by some friction wear apparatus. The regression
weights would indicate the importance of each one of the factors of the team and determine
not only the optimum size, thickness, composition, etc., but also the relative share of the cost
of production or research that each of these features deserved.
In agriculture, the relative importance of rainfall, quality of seed, certain chemical
ingredients in the soil, method of cultivating, or other variables towards producing the
maximum crop may be estimated from the regression weights of each of these variables
through their multiple correlation with the crop records.
V. Sampling
All these examples, and a thousand and one others where correlation may be used,
depend as far as prediction is concerned, on the fundamental assumption that the sampling of
the cases included in the study is representative. If it is representative, then, whatever
relationships are shown will hold for future applicants and the calculations based on that
sample may be used for the prediction of most probable individual achievements. There are
methods for testing for adequacy of sampling to find out whether a given sample group is
representative or not. In general they are to take further samples and see whether the
coefficients change. If the coefficients remain
constant, it is an indication that we have included in our sample all the forces and influences
which are at work.
VI. Reliability
In correlation, this test for adequacy of sampling takes from one point of view a
specialized form known as the coefficient of reliability. (The probable error mentioned above
is an indication of adequacy of sampling from another point of view, namely, the number of
cases involved.) The reliability coefficient is simply the correlation between one set of
measurements of a given variable and a second set of measurements of the same variable.
Thus to find out whether an examination is reliable, we may give a second examination of the
same subject matter to the same class and correlate the first set of scores with the second. If
this correlation is high it indicates that we have taken an adequate or representative sample of
the mental function or skill which the examination measures, while if it is low it indicates that
the sampling is inadequate, that there are other variables (perhaps changes of attitude,
headaches, trick questions) which were neither sampled nor controlled and so have come into
one measure while not present in the other. A reliable examination in this technical sense is one
which when given under standardized conditions will give the same rating or rank position for
the subject when repeated or if given on different days.
V. Partial Correlation
In addition to simple correlation, multiple correlation to secure the prediction from a
team of variables and reliability correlation to determine the consistency or adequacy of the
sampling of the variables, there is a further type of correlation called partial correlation. The
purpose of partial correlation is to determine the pure relationship between any two sets of
facts freed from the extraneous effects of some other simultaneously existing sets of facts. In
complex phenomena, the different measurable variables generally interact upon each other
until the relation between any two of them is greatly obscured by the influence of the others.
In determining the correlation between the length of experience in selling and the
success in salesmanship, for example, a third undesired variable of age enters in for the more
experienced applicants are usually older and hence we cannot evaluate the prediction from
experience alone, as it is always accompanied by the age variable. To secure the true
correlation between experience and success, independent of age, we need to control the
variation of the age series. We might do this by correlating experience and success of men all
of one age but this would require an enormous number of men from which to take those who
satisfy such requirements. By partial correlation, we can do this on fewer cases.
The partial correlation coefficient is the true correlation between two variables with
another or other coexistent variables held constant. In the illustration it might be considered the
average coefficient of the coefficients between experience and success calculated once for
each age group, for here in each calculation age does not vary but is held constant by the
selection of cases. The partial correlation coefficient is secured, by a simple formula, from the
simple correlations between the variables involved. Other examples of the use of "partials," as
they are commonly called, might be taken from many fields. In analyzing the mental processes
in education, the correlation between arithmetic and history achievements is obscured or
enhanced by a common variable of reading ability. If this is measured separately and partialled
out (that is, controlled by the partial correlation formula) the true or pure relationship between
arithmetic and history independent of reading ability, may be determined. In insurance
prediction certain health indices such as blood-pressure, weight, age, or others may
intercorrelate or vary together in such a way that the apparent simple correlation between any
one of them and longevity is obscured by the uncontrolled simultaneous variation.
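The "simple formula" is not printed in the text. As a hedged illustration, the sketch below assumes the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)). The function name and the three coefficients for experience, success, and age are invented for the example and are not data from the study.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    Assumes the standard textbook formula; the source only says that a
    'simple formula' is applied to the simple correlations.
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical simple correlations (illustrative values only).
r_experience_success = 0.50
r_experience_age = 0.60
r_success_age = 0.40

r_partial = partial_r(r_experience_success, r_experience_age, r_success_age)
print(round(r_partial, 3))
```

When age is uncorrelated with both other variables, the function returns the simple correlation unchanged, which is a quick sanity check on the formula.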
VI. Range “σ”
Just as in economic fields such things as the amount of rent in relation to the total
expenses of two firms, differing greatly in size, cannot be directly compared by naming the rent
in absolute dollars and cents but must first be thrown into percentages to be made comparable
for the two firms, so coefficients of correlation need to be corrected for the range on which they
are calculated before they can be compared.
The size of the coefficients will vary greatly with the range of the two variables from
which they are derived. The range is conventionally expressed by the standard deviation or
sigma, "σ," of each variable. It indicates the amount of spread, or dispersion, or variability, of
that series of numbers on both sides of the mean. It is that deviation or distance
from the mean within which two-thirds of the cases of that variable fall. Thus the distribution of
two variables such as an intelligence test score in the fourth grade and mechanical test score
in the first through the eighth grades might have the same average but very different standard
deviations. The group from the eight grades would show a much greater standard deviation.
The coefficients of correlation developed from them would be derived from longer range and
would be much higher than coefficients from a shorter range. A correlation of .40 between two
traits on a fifth-grade population may be equal to a correlation of .90 on a population ranging
from the first through the eighth grades.
We thus see that we need a standard or normal range into which any derived correlation
may be converted by a simple formula to be made comparable with correlations derived from a
different group. In education the range of unselected twelve-year-olds where the standard
deviation is twenty-five "mental months" is frequently called the standard range and used as a
basis of comparison, since twelve-year-olds can be found more completely than any one other
age in the public schools. With the growth of statistical methods, similar standard ranges will
be developed in other fields in order to make comparable coefficients of correlation indicating
simple relationship between any two quantitative phenomena, multiple correlations between
teams of variables, and partial correlations between sets of facts freed from the extraneous
and obscuring influence of other simultaneously existing facts.
VII. The Mathematics of Correlation Coefficients
The coefficient of correlation, r, involves the calculation of the means or averages of
each of the two series of numbers that are being correlated and the calculation of the
deviations of each number from the average of its series. These deviations must be algebraically
summed, they must be squared and their squares summed, and their cross-products must be
summed. A cross-product is the deviation from the mean of one series, denoted by x, times
the corresponding deviation made by the same individual on the other series, denoted by y.
Thus, if the average examination mark of a class is 80% and their average term mark is 75%,
and if one pupil secures an examination mark of 60% and a term mark of 60%, the paired
x and y in his case are -20 and -15, giving a cross-product of +300.
The simplest form of the correlation formula is
$$r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$$
in which Σxy means the sum of the cross-products, Σx² means the sum of the squares of the x
deviations, and Σy² means similarly the sum of the squares of the deviations of each number of the
y series from the average of the y series.
The formula for the standard deviation is
$$\sigma_x = \sqrt{\frac{\sum x^2}{N}}$$
in which the symbols are as explained above and N is the number of cases. The usual form of
the correlation formula is
$$r = \frac{\sum xy}{N \sigma_x \sigma_y}$$
which simplifies to the formula above by the cancellation of the Ns in the denominator.
When an arbitrary origin, or assumed mean, is used in place of the true average, the more
complicated formula is
$$r = \frac{\sum xy - \dfrac{\sum x \cdot \sum y}{N}}{N \sqrt{\dfrac{\sum x^2}{N} - \left(\dfrac{\sum x}{N}\right)^2} \, \sqrt{\dfrac{\sum y^2}{N} - \left(\dfrac{\sum y}{N}\right)^2}}$$
in which the only new symbols are Σx and Σy, which are the sums of the deviations of the x and y
series. The exact average of a series almost always ends in decimals, and deviations from it
would be extremely laborious to square and to cross-multiply. So an assumed mean integer is always
used, enabling convenient deviations, and an exact mathematical correction (the Σx · Σy / N
term in the numerator and the second terms under the radicals in the denominator) is applied
in the final formula to give precisely the same value for r as would have been secured if all
deviations had been calculated from the true mean. If the assumed mean be zero, raw scores
may be used throughout instead of plus and minus deviations. This arbitrary origin form of the
formula looks a little formidable, but is in reality very simple to apply.¹
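A minimal Python sketch of the equivalence claimed above: the arbitrary-origin formula should return the same r as deviations taken from the true means. The marks and function names are invented for illustration; only the class averages of 80 and 75 echo the example in the text.

```python
import math

def r_true_mean(xs, ys):
    """Simplest form: deviations x, y are taken from the true means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def r_assumed_mean(xs, ys, origin_x, origin_y):
    """Arbitrary-origin form: deviations are taken from assumed means."""
    n = len(xs)
    dx = [v - origin_x for v in xs]
    dy = [v - origin_y for v in ys]
    sum_x, sum_y = sum(dx), sum(dy)
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    numerator = sum_xy - sum_x * sum_y / n
    denominator = (n * math.sqrt(sum_x2 / n - (sum_x / n) ** 2)
                     * math.sqrt(sum_y2 / n - (sum_y / n) ** 2))
    return numerator / denominator

# Hypothetical marks: examination average 80, term average 75.
exam = [60, 75, 80, 85, 100]
term = [60, 70, 75, 80, 90]

r_exact = r_true_mean(exam, term)
r_shifted = r_assumed_mean(exam, term, 78, 70)   # convenient integer origins
r_raw = r_assumed_mean(exam, term, 0, 0)         # assumed mean zero: raw scores
print(r_exact, r_shifted, r_raw)
```

The three printed values agree up to floating-point rounding, whatever origins are chosen.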
The value of the coefficient must always be between -1.00 and zero for negative correlation,
or between zero and +1.00 for positive correlation. The interpretation of the coefficient varies
with the kind of facts studied and requires a background of knowledge of the size of the
coefficients usual in that particular kind of calculation. A coefficient which may be a high one
for the relation of some health variable such as blood-pressure to longevity in insurance may be
a low one for the correlation of a child's marks in history with some measure of his reading
ability. In general, however, coefficients of less than .30 indicate low relationship, .30 to .50
fair correlation and above .70 high correlation.
The multiple correlation coefficients, partial correlation coefficients, and regression weights
are all calculated from the zero order correlation coefficients or the correlations of the variables
taken two at a time. For fuller discussion of the arithmetic of these calculations
and the formulae, the reader is referred to the chapters dealing with them in such standard
texts as "Statistical Methods" by T. L. Kelley.
VIII. The Mechanism and Operating Features of a Correlation Machine
The following section will describe six models with various features, for carrying out the
mathematical formulae described in the preceding section. General principles for them all are:
(1) The reduction of the data to an arbitrary scale of not more than twenty group intervals in
order to simplify all the multiplication and squaring processes. This grouping is generally
considered sufficiently fine to keep the error due to grouping down to less than 1 per
cent.
(2) The use of standard revolution counters capable of algebraic summation through adding
numbers when rotated in one direction or subtracting numbers when rotation is reversed.
(3) The use of an arbitrary origin or assumed mean which requires the application of exact
correction factors in the final formulae.
(4) The conversion of the number punched in by the operator into the degree of rotation received
by the counters through laminated gear wheels with differing numbers of teeth shorn off.
(5) The simultaneous summation of x and y (deviations), x squares and y squares, xy
cross-products, the number of cases in the total distribution, and the number of cases or
frequencies in the cells.
(6) The accumulation of the xy cross-products by the mechanical combination of a drum
revolving x times and driving a pinion y teeth on each revolution.
Fig. 1, which is a diagram of Model B, illustrates these principles. A represents the
laminated gear drum of ten discs, each having all teeth except ten, twenty, etc., as labeled,
shorn off. A' is a similar drum revolving in the opposite direction through an internal differential.
D represents a ten tooth pinion keyed to its axle so that it can slide from left to right and mesh
with any one of the banks of teeth or discs of the drum. It is controlled by the rod OO
connecting with the different x keys of the keyboard, J. The pinion D, therefore, makes as
many revolutions as there are units of deviation in the particular key pressed by the operator.
This number of rotations is cumulated in the x counter and can be read on its dial face
represented by the circle at the end of the D axis at the left. F is a similar pinion controlled from
the y keys through the rod QQ and ringing up the number of y deviation units in the y counter
at the left. K represents the main drive shaft on which the drums are mounted and the handle
which the operator turns once for every pair of x and y measures entered. N, therefore, is a
revolution counter summing up N, the number of cases, for the correlation formula. f is a quick
reset counter geared to the main shaft. It can be reset to zero by a slight touch and is used to
count the number of cases in any cell or number of repetitions of any given pair of values. If
not cleared between pairs of different measures, it will simply cumulate the same sum as the N
counter.
B and B' represent duplicate drums whose banks of teeth are graduated according to
the squares of the numbers from one to ten. The pinion in mesh with this drum is E, keyed to its
axis and sliding on it from left to right, rigidly coupled with the D pinion through the OO control
rod which is actuated by the x keys. The E pinion, therefore, rings up in the Σx² counter at the
left the square of the number of the units of deviation determined by the key the operator
has pressed.
The two sums of deviations counters are revolution counters recording one for a full
rotation, while the sums of squared deviations are direct drive recording one for every tenth of
a revolution or one for each tooth of the ten-tooth pinions E and H. B and B' drums are rotating
in one piece in the same direction, as all squares of deviations are positive numbers. Drums A,
A', B, and B' are ten inches in diameter and of 16 diametral pitch, leaving for the recall
mechanism a gap of 60 teeth, or 135°, where no teeth are in mesh with the pinions.
IX. Cross-product Mechanisms
The cross-product mechanism is represented in the two drums C and C' each carrying
ten banks of teeth varying from one to ten teeth in each. C and C' rotate in opposite directions
through an intermediate gear wheel in order to allow the pinion in mesh with them, G, to add or
subtract numbers by means of clockwise or counter-clockwise rotation. The drums C and C' are
on the D axis and therefore make as many revolutions as there are units in the x deviation
punched. The G pinion slides on its axis right and left to take rotation from differently
numbered banks of teeth. It is controlled through the QQ control rod actuated by the y keys and
therefore works rigidly with the F and H pinions. At each rotation of the C drum the G pinion
will register as many units in the Σxy counter as determined by the y key punched. Therefore,
since the C drum is making x revolutions and the G pinion is recording y on each revolution,
the Σxy counter is recording x rotations times y units or the xy cross-products. It further is
recording these products with their proper sign no matter whether the x deviation or the y
deviation or both are negative. For the D pinion in taking rotation from either the A or A' drum
forces the C drum to rotate either clockwise or counterclockwise recording as the deviation is
positive or negative and the G pinion in taking rotation from either the C or the C' drum rotates
either counter-clockwise or clockwise depending on the sign of the y deviation. If a clockwise
rotation is normal for a positive deviation, then for a negative x deviation the D axis will go
counter-clockwise and the normal direction of the C drum will be reversed. If both deviations
are negative, the direction of rotation will be reversed once for the x and reversed back again
for the y resulting in a positive entry in the Σxy counter.
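The bookkeeping of the counters can be sketched in software. The class below is a toy stand-in, not a model of the gearing: the class and method names are hypothetical, and the entries are invented key values. It accumulates the cross-product the way the text describes, as |x| drum revolutions each registering y units, with the sign of x reversing the direction of rotation.

```python
import math

class CounterModel:
    """Toy software stand-in for the six counters and the f counter."""

    def __init__(self):
        self.sum_x = 0    # Σx counter (signed)
        self.sum_y = 0    # Σy counter (signed)
        self.sum_x2 = 0   # Σx² counter (never decreases)
        self.sum_y2 = 0   # Σy² counter (never decreases)
        self.sum_xy = 0   # Σxy counter (signed)
        self.n = 0        # N counter: total number of cases
        self.f = 0        # quick-reset counter for the current cell

    def turn_crank(self, x_key, y_key):
        """Enter one pair of deviation keys with one turn of the crank."""
        if not (-10 <= x_key <= 10 and -10 <= y_key <= 10):
            raise ValueError("keys run from -10 through 0 to +10")
        self.sum_x += x_key
        self.sum_y += y_key
        self.sum_x2 += x_key * x_key
        self.sum_y2 += y_key * y_key
        # Cross-product: the drum makes |x| revolutions, and on each one the
        # pinion registers y units; a negative x reverses the rotation.
        x_direction = 1 if x_key >= 0 else -1
        for _ in range(abs(x_key)):
            self.sum_xy += x_direction * y_key
        self.n += 1
        self.f += 1

    def reset_f(self):
        """Clear the cell-frequency counter between different pairs."""
        self.f = 0

    def coefficient(self):
        """Combine the dial readings with the arbitrary-origin formula."""
        n = self.n
        numerator = self.sum_xy - self.sum_x * self.sum_y / n
        var_x = self.sum_x2 / n - (self.sum_x / n) ** 2
        var_y = self.sum_y2 / n - (self.sum_y / n) ** 2
        return numerator / (n * math.sqrt(var_x) * math.sqrt(var_y))

machine = CounterModel()
for x_key, y_key in [(-2, -3), (4, -1), (3, 2), (-5, -4)]:  # invented entries
    machine.turn_crank(x_key, y_key)
print(machine.sum_xy, machine.coefficient())
```

Two negative deviations, such as the pair (-2, -3), add a positive amount to the Σxy reading, matching the double reversal described above.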
The arrangement of the counters is such that if, in use, the true mean is known and placed
at the assumed mean in laying off the scale on the keyboard (thus obviating the necessity for a
correction for the use of an arbitrary origin), the simplest form of the correlation formula stands
grouped on the dial faces as in the formula, namely,
$$r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$$
The operator may take the approximate average of the two quantities in the denominator
mentally and divide it into the dial reading of the numerator thus keeping track of the size of the
coefficient of correlation as he enters numbers. This enables him to take "a running coefficient"
which indicates the adequacy of sampling. For if the coefficient does not change as further
cases are punched in, it is an indication that the sampling already entered is adequate or
representative of the total population.
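The "running coefficient" can be sketched as follows, assuming the keyboard zero sits at (or very near) the true means so that the simplest form of the formula applies. The function name and the deviation pairs are hypothetical.

```python
import math

def running_coefficients(pairs):
    """Return r after each entry, using Σxy / sqrt(Σx² · Σy²).

    Assumes deviations are measured from the true means. Entries made
    before both Σx² and Σy² are non-zero yield None.
    """
    sum_xy = sum_x2 = sum_y2 = 0
    history = []
    for x, y in pairs:
        sum_xy += x * y
        sum_x2 += x * x
        sum_y2 += y * y
        denominator = math.sqrt(sum_x2 * sum_y2)
        history.append(sum_xy / denominator if denominator else None)
    return history

# Invented deviation pairs, entered one case at a time.
entries = [(-2, -1), (1, 1), (0, -1), (3, 2), (-2, -1), (1, 0), (-1, 0)]
for case_number, r in enumerate(running_coefficients(entries), start=1):
    print(case_number, None if r is None else round(r, 3))
```

An operator watching these values settle would have the informal signal of adequate sampling that the text describes.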
The rod P represents the recall mechanism for repetition of a pair of numbers or
retrieving an error due to a mis-punched key. In travelling from right to left it pushes both
control rods OO and QQ to the left, thus taking all pinions out of mesh to the left beyond their
drums. On depressing the key for the next entry, a catch releases this rod P, which is carried by
a spring to the right, allowing the pinions to go to the right as far as the particular key pressed
will let them, thus leaving the pinion ready to mesh with the bank of teeth corresponding to that
key. At the end of the turn of the crank handle a roller at R forces the P rod to the left through
running up the inclined plane to its end. It thus clears the machine ready for the next entry. For
correlations having a number of frequencies in each cell requiring repetition of each entry, a
catch at R depressed by the thumb causes the roller to fail to engage the P rod and so leaves
the pinions in position for as many turns of the crank as there are cases at that pair of values.
The keys x and y are identical and number from -10 through 0 to +10. On the keyboard
between the two banks a piece of paper is clipped with spaces for entry of the raw scores of
the particular calculation corresponding to the deviations of the keys alongside them. To start a
calculation the operator enters the raw scores of the two variables on the paper. Thereafter,
he reads and punches only raw scores from his data while the machine automatically
transmutes them into an arbitrary scale of twenty-one class intervals, ten being positive and ten
being negative. To lay off the scale of raw scores on the paper, the operator by visual
inspection from the data subtracts the lowest score from the highest, thus finding the range
of scores, and divides it by twenty-one to determine the number of units in each class
interval. Writing the lowest score at the extreme left hand or -10 key space he writes the others
on up to the right by increments of the class interval. In case the operator knows the true mean
of the series by previous listing on an adding machine or otherwise, he can make that
mean the midpoint of the class interval at the zero key and lay off his scale up and down from the
middle. This will make the assumed mean coincide approximately with the true mean, thus
obviating the necessity of correcting for an arbitrary origin in case the difference between the
two means is not more than one-thirty-fifth of the range and the coefficient is desired accurate to
within only 1 per cent.²
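The laying off of the scale can be sketched in a few lines. The function names and the score range are hypothetical, and placing the single highest score on the +10 key is an assumption of this sketch, since dividing the range by twenty-one puts that score exactly on the upper boundary.

```python
import math

def lay_off_scale(lowest, highest, intervals=21):
    """Return the class-interval width and the score written beside each key."""
    width = (highest - lowest) / intervals
    lower_bounds = {key: lowest + (key + 10) * width for key in range(-10, 11)}
    return width, lower_bounds

def key_for(score, lowest, width):
    """Map a raw score to its key, from -10 through 0 to +10."""
    key = math.floor((score - lowest) / width) - 10
    # Assumption: the highest score, which falls on the upper boundary,
    # is entered on the +10 key.
    return max(-10, min(10, key))

# Invented data range: lowest raw score 40, highest 103.
width, lower_bounds = lay_off_scale(40, 103)
print(width, lower_bounds[-10], lower_bounds[0], lower_bounds[10])
print(key_for(40, 40, width), key_for(72, 40, width), key_for(103, 40, width))
```

With this scale the operator reads a raw score such as 72, finds it beside the zero key, and presses that key without doing any subtraction.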
X. Various Correlation Machines
The A model of the machine, illustrated in the photograph shown below, is a preliminary
experimental model built last year in the Princeton laboratory. It employed only twelve degrees
or class intervals of the variable and hence introduced a considerable grouping error. The
values of x and y were entered into the machine by setting the two radial control arms to points
on a piece of paper on which the operator had laid off the raw score scale of the particular
calculation in progress. The substitution of keys like an adding machine for arms to be set is a
great convenience for the operator and makes for swifter as well as more accurate work.
The use of pins and star wheels in the early model was abandoned in favor of standard
involute gear teeth, as these are quieter and reduce the diameter of the main drums one half. In
the A model the arbitrary origin was placed at zero, making all deviations positive. The
maximum square or cross-product was 144.
It was found to be more satisfactory to change to a machine such as the B model,
having 21 class intervals but only 10 numerical values handled by the machine, as these 10
could be used (by reversing direction of rotation) once as positive deviations and again as
negative deviations. This reduced the grouping error and also reduced the maximum square or
cross-product to 100. As the Σxy counter may have to start and stop ten times in a single
revolution of the crank, this reduction in the number of units which it may have to accumulate
for each case very much decreases the possibility of the counter's going at such high speeds
as to overcount through momentum when disengaged at the end of the run. A positive friction
device on each pinion has solved this difficulty completely in the B model.
The C model substitutes for the double drum rotating in opposite directions in the B
model a device on the Σx and Σy and Σxy shafts which enables reversing their direction of
rotation. This cuts in half the number of banks of teeth needed on the drums, materially
reducing the width of the machine. The pinions take rotation always in one direction and then
that direction is reversed for the negative deviation series and left unchanged for the positive
series.
If the crank is reversed, it will simply take out all positive and negative values just
entered in the preceding turn of the crank, thus allowing the operator to correct an error of a
mis-punched key. This error retrieving feature holds for the B and C models. At present,
however, it is achieved at the expense of the reset feature on the counters. Reset counters
may be driven accurately only in one direction through a ratchet mechanism, and all the dial
rings brought back to zero by turn of the reset knob. I have not yet succeeded in finding any
small, commercially produced counters combining the reset feature with dependable addition
and subtraction. Such a device may be built for each of the six counters but would very much
increase the cost of the machine. The B and C models as at present designed, therefore, will
require a preliminary reading of the dial faces before starting a calculation and subtraction of
these values from the final readings in order to get the six sums for each correlation
calculation.
The D model simply substitutes the reset feature for the error retrieving feature. It employs
one counter to accumulate the positive deviations and one for the negative deviations for each of
the three summations (Σx, Σy, Σxy) which require algebraic addition. Those three sums are then
the difference between the plus and the minus counter readings. In this model all counters have
the reset knob but lack complete dependability when their ratchet mechanisms are reversed in
turning the crank backwards to retrieve an error.
Model E employs the sigma of the x+y sums formula instead of the xy cross-product
formula to which it is an exact mathematical equivalent. This involves securing the deviations
and their squares for three variables namely x, y, and x+y. The operator may punch the first
two and the machine automatically punches the third, though this feature involves a fairly
complicated mechanism whether the entries are made by keys or by arms set to certain
positions. The alternative is for the operator to set the x+y value. This model dispenses with
the intermediate small xy drum of the B model in the diagram as all the ingredients of the
formula are secured from the Sigmas from x, y, and x+y which require only the summation of
deviations, the summation of their squares, and a count of the number of cases. A similar
formula using Sigma of x-y could be used equally well. While model E involves a slightly
simpler mechanism, it would be much less convenient for the operator as he would have to
enter three values instead of two and would have to combine the dial readings into a much
more complex formula at the end.
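The equivalence behind Model E can be checked numerically. The sketch assumes that "the sigma of the x+y sums formula" rests on the identity Σ(x+y)² = Σx² + 2Σxy + Σy², so that Σxy is recoverable from three sums of squares; the text does not print the formula, and the deviations below are invented.

```python
# Invented deviation pairs for illustration.
xs = [-3, -1, 0, 2, 4]
ys = [-2, -2, 1, 1, 3]

sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy_direct = sum(x * y for x, y in zip(xs, ys))

# Model E route: square the sums x+y instead of forming cross-products.
sum_plus2 = sum((x + y) ** 2 for x, y in zip(xs, ys))
sum_xy_from_sums = (sum_plus2 - sum_x2 - sum_y2) // 2

# The x-y variant mentioned in the text works the same way.
sum_minus2 = sum((x - y) ** 2 for x, y in zip(xs, ys))
sum_xy_from_differences = (sum_x2 + sum_y2 - sum_minus2) // 2

print(sum_xy_direct, sum_xy_from_sums, sum_xy_from_differences)
```

All three routes give the same Σxy, which is why the cross-product drum can be dispensed with at the cost of a third entry per case.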
XI. Graphic Correlation
Model F is a combination of any one of the above models with a scattergram plotting
attachment. A scattergram or scatter diagram is a graphic instead of an algebraic solution of
the correlation problem. Instead of summing up the relation between two variables by a single
index number, the correlation coefficient, it shows on a graph the amount of relationship
between the two sets of facts studied. Each case is plotted as a single point determined on the
horizontal scale by the x value of that case and on a vertical scale by the y value of that case.
If the scatter of these points over the square tends to be along the diagonal line, the
correlation approaches unity, as for every step increase in x a similar step increase in y is
shown to exist rather uniformly. A scatter along one diagonal indicates positive relationship,
while a scatter along the other indicates negative correlation or inverse relationship as when
the older children in a given school grade tend to be the duller and the younger are more apt to
be the brighter. If the scatter of points in the rectangle is in more or less of a circle or scattered
in all four quadrants, the correlation coefficient approaches zero as that shows that any value
in one variable may accompany any value in the other variable and hence that the two do not
tend to co-vary or correlate.
In ordinary practice correlations are usually plotted on a scattergram and then the
algebraic coefficient derived from it. This has the added merit of indicating whether the
relationship between the two phenomena is a rectilinear one or a curvilinear one. If it is
curvilinear a different set of formulae is needed.
The scattergram attachment to the correlation machine consists of a metal rectangle
carrying two sliding arms at right angles to each other. On the x arm the x class intervals are
written in raw scores and the y values similarly on the y arm. A printed form of cross-section
paper is placed under the rectangle. The two arms are joined by a pin which slides in grooves
of each one so that the operator by pushing in any direction around the rectangle can set both
scales simultaneously. A writing point is attached to this pin. Each sliding arm by means of a
connecting rod d sets its pinions at the corresponding key in case the scattergram device is
connected like a player piano attachment to be superposed on the usual keyboard. The
operator then pushes the pin until the two sliding arms show at their intersection the x and y
value that he is entering and then presses the writing point to plot the case on the scattergram
underneath and turns the crank to enter the algebraic values on the counter dials.
Of course, all these machines may be electrically driven so that a touch of a bar or the
depression of the writing point in the scattergram attachment will cause the drum to be rotated
at once electronically and all values entered in without further effort on the operator's part. Also
magnetic counters might be substituted for gear driven ones but are prohibitively expensive in
the required styles.
For users who require a large number of simple distributions, a counter can be placed at
the end of each array (row or column) of the scattergram to sum up the number of cases in
that class interval of that variable. This would require forty-two counters. The vertical row of
counters would show the frequency distribution of the y variable and the horizontal row would
show it for the x variable. Similarly, counters might be placed, one under each key in the
keyboard models, to tabulate the frequencies of the two variables simultaneously as byproducts in securing the means, standard deviations, and correlation coefficients.
For curvilinear correlations this is essential, as the curvilinear correlation ratio “eta” is
the ratio of the standard deviation of the means of the arrays over the standard deviation of the
whole distribution. (There are two correlation ratios, one for x on y, and one for y on x, one
being derived from the means of the rows or horizontal arrays and the other from the means of
the columns or vertical arrays.) The scattergram attachment will thus tabulate the arrays, the
means of which may be most simply secured by listing all the values in that array on an adding
machine; then, with these means as one variable and the whole distribution of the same set of
facts as the second table, their standard deviations may be simultaneously secured on the
machine. The ratio of these standard deviations will be one of the two curvilinear correlation
coefficients.
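A short sketch of the correlation ratio as described: every case is replaced by the mean of its array, and the standard deviation of those means is divided by the standard deviation of the whole distribution. The function name and the data are hypothetical; the points are chosen to follow a curve so that the relation is entirely curvilinear.

```python
import math
from collections import defaultdict

def correlation_ratio(xs, ys):
    """Eta of y on x: SD of the array means over SD of all y values."""
    arrays = defaultdict(list)
    for x, y in zip(xs, ys):
        arrays[x].append(y)          # one array per x class interval
    n = len(ys)
    grand_mean = sum(ys) / n
    # Each array mean is counted once for every case in that array.
    between = sum(len(values) * (sum(values) / len(values) - grand_mean) ** 2
                  for values in arrays.values())
    total = sum((y - grand_mean) ** 2 for y in ys)
    return math.sqrt(between / total)

# Invented grouped data in which y rises at both ends of the x scale.
xs = [-2, -2, -1, -1, 0, 0, 1, 1, 2, 2]
ys = [4, 4, 1, 1, 0, 0, 1, 1, 4, 4]

eta = correlation_ratio(xs, ys)
sum_xy = sum(x * (y - sum(ys) / len(ys)) for x, y in zip(xs, ys))
print(eta, sum_xy)
```

Here the cross-product sum is zero, so the rectilinear coefficient would report no relationship, while eta reports a perfect curvilinear one.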
By adjusting the size of the spaces which the two sliding arms of the scattergram
attachment cover at each step or class interval, this attachment can be used for a great many
other varieties of plotting, tabulation or table-reading. It could be superimposed on a table of
squares or cross-products and the values read with less probability of error and much more
ease for the operator in routine work.
The combination of the dial readings into the various formulae may be made fool-proof
for a clerk who knows nothing about correlations or statistics but can operate an adding
machine like the Monroe by providing "a box step solution chart," such as illustrated in the first
note at the end of this paper. In following through this chart from box to box each detailed
addition, multiplication or other process is distinctly labeled.
The machine may be calibrated or its accuracy tested at any time very simply. If the x
and y are set for the same group interval the two counters summing their deviations should
show identical readings and the three counters showing their squares and their cross-products
should show identical readings. Going up the keyboard from left to right systematically testing
each corresponding pair of keys in turn should diagnose any flaw in the mechanism such as
the breaking of the teeth on some bank of one of the drums.
In addition to assisting with curvilinear correlations the machine can be used to secure
biserial correlations or in general to count any two-category distribution such as boys and girls,
passed and failed, existence or nonexistence of a certain fact. Thus, if one wished to determine
the amount of relationship between size of salary and years of experience in a job,
and one variable, for example, the salary, was known only in two categories, as all
those below a certain amount and all those above it, biserial r would give the answer.
Mechanically the x keyboard would be used to enter the number of years of experience of one
group of salaried employees and the y keyboard to enter the years of the other group of
salaried employees. The f counter would be attached so as to count only the cases entered on
the x keyboard. The difference between the N counter, adding the total number of cases in
both categories, and the f counter, gives the number of cases in the second category. The
means of each category would be calculated just as explained above for the means of each of
the two variables and the standard deviation of the whole distribution would be secured from
adding together the reading of the Σx2 and Σy2 counters and adding together the Σx and Σy
readings.
XII. Possible Developments
In future development a machine could be made to calculate several correlations
simultaneously, as when with one turning over of the data pages a criterion is to be
correlated with several tests. This would require a separate set of six counters for each
correlation calculation. A key would throw out of mesh all of one set and throw in all of the other
set. Thus the criterion score might be punched only once in the x bank of keys, the first
test score punched in the y bank, then touching the key to throw in the second set of
counters, the second test score could be entered in the y bank with the same entry of the
criterion.
The principles of this machine could be further extended to build a machine which would
calculate the first four moments of any given set of data to apply in the equations to find the
best fit curve. It would simply be necessary to group the data and have a succession of drums
and pinions so arranged as automatically and simultaneously to accumulate from an arbitrary
origin the sums of the first powers, second powers, third powers and fourth powers of the
deviations, much as the present correlation machine gets their first and second powers. At
present, however, there is probably not enough work of this kind done to justify the
construction of a curve-fitting machine.
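The accumulation described above can be sketched as follows. The code sums the first four powers of the deviations from an arbitrary origin and then converts them to the mean and the moments about the mean by the standard correction formulas; those formulas, the function name, and the omission of grouping are assumptions of this sketch.

```python
def moments_from_arbitrary_origin(values, origin):
    """Return (mean, mu2, mu3, mu4) from power sums taken about an arbitrary origin."""
    n = len(values)
    power_sums = [0.0, 0.0, 0.0, 0.0]   # sums of d, d^2, d^3, d^4
    for v in values:
        d = v - origin
        term = 1.0
        for k in range(4):
            term *= d                   # d, then d^2, d^3, d^4
            power_sums[k] += term
    m1, m2, m3, m4 = (s / n for s in power_sums)   # raw moments about the origin
    mean = origin + m1
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mean, mu2, mu3, mu4

# Hypothetical data with an arbitrary origin of 2.
print(moments_from_arbitrary_origin([1, 2, 3, 4, 5], origin=2))
```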
We are at present going through the stages of experimenting to find the most
mechanically durable, compact, and simple design. The preliminary A model was built last year
of the simplest design with only twelve degrees of variables and with none of the operating
features such as keyboard, error retrieving mechanism, etc., which would be desirable in
routine office use. The difficulties of getting a competent designer, financing, and patenting will
delay the availability of the machine for some time.
Another much more elaborate correlation machine has been developed by Professor
Clark Hull at the University of Wisconsin. It possesses automatic recording features which
make it more useful than the models described here for work with a large number of variables,
as in problems of multiple or partial correlation which require all the intercorrelations. The
operator enters the raw scores on two duplicate ribbons by means of a perforating unit. The
rest of the machine is largely a device for securing a sum of products. As the two ribbons are
automatically fed through the machine corresponding entries are multiplied together and the
products summed. On a sixteen variable problem, for example, the sixteen pairs of ribbons
would be punched. Each would be fed through once with each of the other fifteen, securing
the one hundred and twenty cross-product sums needed for the one hundred and twenty
correlations. Each ribbon would be fed through with its own duplicate, securing the sum of the
squares of that variable for the standard deviation. It thus secures the six sums needed for
correlation calculation, successively, requiring the presence of an operator only to change
ribbons. The machine described in this article secures the six sums simultaneously but
requires an operator to make each entry as it is not equipped with a ribbon control in the
present design. Professor Hull’s machine has a further very useful application in solving
regression equations or performing the forecasting function in predicting an individual’s most
probable achievement from preliminary measures. In this case the regression coefficients on
one ribbon are multiplied by the individual’s scores on the different measures on the other
ribbon and the sum of products is the prediction desired.
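The two uses of the sum of products described above can be illustrated with a short sketch. The function names and the sample weights are hypothetical; the count of pairings simply enumerates every distinct pair among sixteen variables.

```python
from itertools import combinations

def sum_of_products(ribbon_a, ribbon_b):
    """Multiply corresponding entries of two ribbons and sum the products."""
    return sum(a * b for a, b in zip(ribbon_a, ribbon_b))

def ribbon_pairings(n_variables):
    """Every distinct pair of variables, one ribbon run per cross-product sum."""
    return list(combinations(range(n_variables), 2))

def forecast(regression_coefficients, individual_scores):
    """Prediction as a sum of products (any constant term is left out of this sketch)."""
    return sum_of_products(regression_coefficients, individual_scores)

print(len(ribbon_pairings(16)))          # cross-product runs for sixteen variables
print(forecast([0.5, 0.25], [80, 60]))   # hypothetical weights and scores
```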
In the writer's opinion, the two machines fill rather different needs. Professor Hull's
machine is much more efficient for large bureaus doing a great deal of intercorrelation and
forecasting. The machine here described should prove more convenient and very much
cheaper for small laboratories and offices which want miscellaneous calculations on different
variables and a machine that can be handled by students and clerks without an expert.
Notes
1. On a Monroe calculator such as would be in almost any statistical office, it may solve in
some forty seconds in three continuous runs as indicated in the following procedure.
The six sums Σx, Σy, Σxy, Σx², Σy², and N are read from data already listed and
entered in their appropriately labeled parentheses. The Σy² is entered on the keyboard of
the Monroe and cranked in N times as indicated by the A • F symbols. The C value is
entered on the keyboard and cranked out C times, thus simultaneously squaring and
subtracting it and the result which is the second parenthesis in the denominator is noted
down in the G box. Similarly, the A and E values are multiplied and the B value
subtracted out B times, the result is multiplied by G, and its square root is read from a table
and noted down in the H box. The A value is added in D times and the B value is
subtracted C times, giving the numerator which is then divided by the H value and the quotient
is the correlation coefficient. This procedure requires only two readings to be noted
down.
Quantity:  N    Σx   Σy   Σxy  Σx²  Σy²  σ1/N  Denominator  r
Box:       A    B    C    D    E    F    G     H            r
For arbitrary origin:
1. A • F - C² = G. Note dial reading.
2. (A • E - B²) • G. Read √ = H.
3. (A • D - B • C)/H = r
For true mean:
1. E • F. Read √ = H.
2. D/H = r
The form of the formula is
\[ r = \frac{N\Sigma xy - \Sigma x \cdot \Sigma y}{\sqrt{N\Sigma x^2 - (\Sigma x)^2}\,\sqrt{N\Sigma y^2 - (\Sigma y)^2}} \]
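The three runs for an arbitrary origin can be followed step by step in a short sketch, using the box letters of the procedure (A = N, B = Σx, C = Σy, D = Σxy, E = Σx², F = Σy²). The function name and the sample data are hypothetical, and a computed square root stands in for the table lookup.

```python
from math import sqrt

def r_from_six_sums(A, B, C, D, E, F):
    """Correlation coefficient from the six sums, arbitrary-origin form."""
    G = A * F - C ** 2                # run 1: second parenthesis of the denominator
    H = sqrt((A * E - B ** 2) * G)    # run 2: full denominator
    return (A * D - B * C) / H        # run 3: numerator divided by H

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
six_sums = (
    len(x),
    sum(x),
    sum(y),
    sum(a * b for a, b in zip(x, y)),
    sum(a * a for a in x),
    sum(b * b for b in y),
)
print(r_from_six_sums(*six_sums))
```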
2. For more exact conditions when correction for the use of arbitrary origin may be neglected
see T.L. Kelley, Statistical Method, New York, 1923, 385 pp. Section 46
#17. On Predicting Elections or Other Public Behavior
by
Stuart C. Dodd, Director of the Washington Public Opinion Laboratory in Seattle, and research
professor in the Department of Sociology, University of Washington.
I. A Diagnosis
The job of scientists is to learn how to predict and eventually control phenomena in their
fields. The social scientist is outgrowing mere description of interhuman behavior and learning
how to predict it as a first step toward control of it. Elections provide the social scientist with
one of the best opportunities for experiment that he has. For elections test rigorously his
accuracy in predicting. Pre-election polls measure the intentions of people, their voting on election day measures their election behavior, and the degree of agreement between polled and
voted percentages measures the accuracy of the prediction. Whenever this agreement is low,
it challenges the pollers to study and improve their techniques. A mis-predicted election can
stimulate scientific progress in demoscopy far more than a poll predicting an election well.
With this philosophy of science in mind, let us study the national elections in the United
States on November 2, 1948. This paper will try
(a) to diagnose the factors in mis-predicting the elections, and
(b) to outline remedies.
The factors listed below seem the important ones in this election but may also be general to
any election, anywhere, any time. This paper will not review the detailed evidence, since the
Social Science Research Council's Committee is doing this thoroughly for the nation, and a
Bulletin of our Laboratory will do it fully for the State of Washington and the studies by our
Washington P.O.L. The remedies outlined here aim to improve not only election polls but also
any polls predicting some public behavior.
Before diagnosing, the chief facts about the U. S. election may be reviewed. About half
the American electorate of some ninety-five million people turned out to vote on November 2.
Roughly fifty per cent (twenty-three million odd) voted for Truman (Democrat), about
forty-six per cent for Dewey (Republican), about two per cent for Wallace (Progressive, with
Communist support), and about two per cent for Thurmond (Dixiecrat). They gave Truman some
two million more votes than they gave Dewey which practically reversed the predictions of the
polls. Almost all the polls (along with most of the press and political commentators) had
confidently predicted Dewey's victory. The polls had been of larger samples than usual,
selected by their usual quota method, conducted by their customary techniques, by different
agencies, and held at various dates from the spring up until a fortnight or so before election
day. They showed a "don't know" percentage somewhat larger than usual, varying with the
date and the poll but running around fifteen down to eight per cent. Both pollsters and public
were much surprised at the unexpected mis-prediction and are still busy explaining it, either
humorously or scientifically — as in this symposium.
We diagnose the observable factors of mis-predicting as:
1. Shifting opinions: shifting from a major party to the victor,
shifting from a minor party to the victor,
shifting from the "don't know" plural disproportionately;
2. Strength of opinions (i.e. intensity), resulting in differential turn-out;
3. Sampling certain strata of the population inadequately (especially the low
income, rural, and labor strata).
Each of these three factors is measurable, its share assessable and, therefore, its effect
seems reducible in future polling. Let us look at each.
II. Shift of Opinion
If the polls rightly represented the opinions of the electorate, there was a change in
opinion of some four percentage points between the time of the polls and the voting. Whatever
the true amount of the shift, it seems to have accelerated rapidly at the end. Dewey seems to
have lost and Truman gained right up to election eve. About a million leftish Wallace
supporters (judging from the difference between earlier and later polls) seem to have deserted
him by election day to vote for Truman and against the more rightist Dewey.
Previous studies by Lazarsfeld and others have shown little shift of opinion during the
pre-election campaign of some four months, but in this election there was a shift. Truman's
game fight against odds, the organized effort of labor unions, and other factors apparently did
change opinions. One result will be that hereafter pollers will measure trends up to the last
moment more exactly. Telegraphed polls in the last forty-eight hours even are possible if
arranged in advance. Also, any poll that is late in the campaign can measure the trend to date
by asking a retroactive question as to the date when the respondent decided on his current
choice and as to his previous preference.
The shift of "Don't know" opinion seems also to have been disproportional. Instead of
the "Don't knows" breaking in proportion to the "Do knows," as is usually assumed, apparently
they broke in favor of Truman and perhaps Wallace. Several principles can be suggested to
explain this, though more research is needed on how "Don't knows" decide. Possibly the
percentage of "Don't knows" who never decide and do not turn out is larger than the total per
cent not turning out. Probably the party which
(a) is growing,
(b) has more than average intensity of opinion, and
(c) has low prestige gets more than its expected proportion of the "Don't knows."
That the growing party with ardent supporters should capture converts more than a shrinking
party with less ardent supporters seems plausible. Whether there is a differential rate of
growth, or a differential intensity, the polls early in the campaign must explore and so guide the
later polling reports. That the party with low prestige gets more of the "Don't knows" than the
parties with high prestige seems implausible, but an experiment in Seattle supports this
hypothesis. A poll in the city here compared overt responding with a secret ballot which was
handed to the respondent by the interviewer. In the secret ballot the "Don't knows" dropped to
about half the per cent they were in overt responding. And this decrease declared itself
consistently mostly for the side with less prestige in such issues as old age pensions, veterans'
bonuses, etcetera. The explanatory hypothesis is that undecided sympathizers for the side
with low prestige (i.e. against the aged and veterans, etcetera) hid by saying, "I don't know,"
more often than similar unsure sympathizers for the side with more prestige who would name it
as their probable choice more freely. In other words, the "Don't know" class conceals more
supporters of the side with low prestige who will actually vote for it on election day. Factors
such as these need controlled research, repeated in many situations until the relative
importance of each can be assigned under specified conditions, by their multiple regression
weights when correlated against voting behavior as the criterion. Polls in the future can
measure rates of growth, intensities of opinion, and prestige of the rival parties, and thus refine
the present crude prediction of "Don't know" behavior as proportioned to "Do know" behavior.
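The difference between the usual proportional assumption and a disproportionate break of the "Don't knows" can be made concrete with a small sketch. All percentages and weights below are hypothetical illustrations, not figures from the 1948 polls, and the function name is an assumption.

```python
def allocate_undecided(decided, undecided, weights=None):
    """Distribute the undecided percentage among the parties.

    decided:   party -> polled per cent among those who named a choice
    undecided: per cent answering "Don't know"
    weights:   relative shares of the undecided going to each party;
               defaults to the decided percentages (proportional break)
    """
    if weights is None:
        weights = decided
    total_weight = sum(weights.values())
    return {
        party: decided[party] + undecided * weights[party] / total_weight
        for party in decided
    }

# Hypothetical poll: 90 per cent decided, 10 per cent "Don't know".
polled = {"Dewey": 45.0, "Truman": 41.0, "Others": 4.0}
proportional = allocate_undecided(polled, 10.0)
# Hypothetical disproportionate break, three to one toward the trailing party.
tilted = allocate_undecided(polled, 10.0, weights={"Dewey": 1, "Truman": 3, "Others": 0})
print(proportional)
print(tilted)
```

Under these assumed figures the proportional break leaves the leader ahead, while the tilted break reverses the order, which is the kind of effect the passage attributes to the undecided vote.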
III. Strength of Opinion (Intensity)
Just as the Literary Digest's fiasco in mis-predicting Roosevelt's defeat in 1936 taught
the pollers and the public to depend on representative samples and not merely large samples,
so the 1948 mispredictions may teach more of us to measure intensity of opinion as well as its
direction, if we want to reduce the last few percentage points of uncertainty and error in our
predictions. For intensity of opinion, the emotional strength backing an idea, is a cause of
action. It predicts what a person will probably do. The intensely motivated persons will turn out
to vote in spite of rain, long waiting in line, or other obstacles, more often than the indifferent
persons. In general, intensity of opinion is one good measure of the probability of later
behavior. This may be stated with more scientific rigor as the hypothesis that any index of
intensity of an opinion will correlate highly with an index of relevant behavior.
Therefore, wherever differential turnout to vote may be a deciding factor in an election,
polls on intensity of opinion are essential. Where voting is compulsory for everyone or where
intensities may be equal on both sides, measuring it may not be essential. But in the American
election this year, there was differential intensity. Republicans were overconfident, sure the
election was won beforehand; consequently many did not bother to go and vote (as Mr. Dewey
noted afterwards). However strong their convictions, these were often neutralized by an
overconfident attitude resulting in weakened net motivation to turn out. Democrats, on the
other hand, knew they could win only by great effort to turn out every vote. Truman
campaigned strenuously. Organized labor (the C.I.O. and A.F.L.) are reported to have spent
five million dollars in fifty critical districts in September and October on behalf of the
Democrats. In one district alone they had one thousand teams of canvassers getting
Democrats out to register and to vote. The report is that fifteen thousand baby sitters were
hired on election day to enable fifteen thousand feminine Democrats to cast their votes.
Consider the effect of differential turnout. Suppose for argument that the polls were right
in showing about fifty per cent of the electorate favoring Dewey and only about forty-six per
cent favoring Truman. If, then, only forty-five per cent of the less intensely motivated
Republicans turned out, while fifty-five per cent of the more intensely motivated or prodded
Democrats turned out, this ten point differential would convert Dewey’s expected excess of two
million votes into the actual two million deficit.
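The arithmetic of this supposition can be checked with a brief sketch. It uses the paper's round figures (an electorate of ninety-five million, fifty and forty-six per cent preferences, forty-five and fifty-five per cent turnouts); the equal-turnout baseline of fifty per cent and the function names are assumptions of the sketch.

```python
ELECTORATE = 95_000_000
DEWEY_SHARE = 0.50    # share of the electorate favoring Dewey, as supposed in the text
TRUMAN_SHARE = 0.46   # share favoring Truman

def votes(share, turnout, electorate=ELECTORATE):
    """Votes cast by one party's supporters at a given turnout rate."""
    return share * turnout * electorate

def dewey_margin(dewey_turnout, truman_turnout):
    """Dewey votes minus Truman votes; negative means a Dewey deficit."""
    return votes(DEWEY_SHARE, dewey_turnout) - votes(TRUMAN_SHARE, truman_turnout)

equal_turnout = dewey_margin(0.50, 0.50)          # both sides turn out at one half
differential_turnout = dewey_margin(0.45, 0.55)   # the ten-point differential
print(f"equal turnout margin:        {equal_turnout:,.0f}")
print(f"differential turnout margin: {differential_turnout:,.0f}")
```

With these round numbers the equal-turnout case gives Dewey an excess near two million, and the ten-point differential gives a deficit of somewhat more than two million.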
Granted then that intensity of opinion is important, how may it be measured? At least
sixteen indices of intensity are known and more seem inevitable. Some of the indices already
in use, in varying degree, are the following:
1. Past behavior of the same kind. How often has the respondent voted in previous
elections that were open to him?
2. Efforts made along this line.
(a) Has the respondent registered?
(b) Worked for a party?
(c) Talked for it?
(d) Attended meetings or otherwise devoted time to it?
(e) Read up or become informed?
(f) Contributed money?
(g) Worn a button or publicly given support?
(h) Sacrificed for it in any way?
(i) Ever exchanged any experience for it?
3. Group memberships. Is the respondent a member of any groups which have a
plurality of members for one party? In proportion as such plurality is large and he
belongs to several such consistent groups, the probability of his supporting the
correlated party is greater. Such group memberships are indirect indices of intensity
of such opinions as may be correlated to candidates or issues in a given election.
Early polls can determine these correlations. Catholic-Protestant
membership, for example, became of great importance when the Catholic, Al Smith,
was a candidate in 1928.
4. Assertions of intense opinion.
(a) Does the respondent say he feels "very strongly" or "strongly" or "not at all
strongly"?
(b) Where does he mark the "intensity thermometer" when it is shown him on a card?
(c) Does he score high on some one of the many types of attitude tests reflecting
intensity of feeling?
(d) What does he say he will do, or give, to support his opinion?
(e) What degree of physiological deprivation will he undergo, if necessary, from a
headache or missing a meal, up to risking torture and death for "the cause?"
The sixteen indices above, each suggested in a phrase, need development through
research. Each could be amplified in various cultures or for predicting various classes of
behavior other than elections. Each needs to be scaled. Scaling converts it from a crude all-or-none variable answered yes or no to an ordinal variable or, better, to a cardinal variable which
measures the intensity in many equal degrees. For polls to do this better requires prior
research developing scales by the best current techniques, such as a combination of the
Thurstone, Likert and Guttman techniques.5 This will give the polling profession quantitative
indices of intensity of opinion which have the desired technical features of reliability (of several
kinds), validity, reproducibility, unidimensionality, equal intervals, transcendence of the deriving
group, etc.
Chief among these indices of intensity of opinion relevant to voting is the effort of
registering to vote. This act both legally excludes the unregistered as ineligible to vote and also
selects a sub-population who have a high probability of turning out on election day. Wherever
polls aim to predict elections (and not necessarily to measure the opinion of all the electorate),
their sampling should be confined to registered voters. The percentages for the various
candidates among the registered voters have been found to agree with the percentages voting
for those candidates more closely than do the percentages among the unregistered voters.
Wherever the lists of registered voters are accessible, every nth name can be drawn as the
best randomized sample to interview: otherwise the poller can ask if the respondent has been
registered and predict only from percents among registered voters.
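Drawing every nth name from a registration list can be sketched as below. The random starting point within the first interval is a common refinement assumed here rather than stated in the text, and the function name and list of names are hypothetical.

```python
import random

def every_nth_name(registered_voters, n, seed=None):
    """Systematic sample: a random start in the first n names, then every nth name."""
    rng = random.Random(seed)
    start = rng.randrange(n)
    return registered_voters[start::n]

# Hypothetical registration list of 100 names, sampled at every tenth name.
voter_list = [f"voter_{i:03d}" for i in range(100)]
print(every_nth_name(voter_list, 10, seed=0))
```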
IV. Sampling
A major factor in mis-predicting is sampling error of both the variable and the constant
(i.e. biased) types. The familiar variable sampling error is measured by the standard error
formulas but is often forgotten or not stressed to the public. Pollers should continually give their
readers some such reminder as "if many similar polls were made, in about half of them the
percentages reported here might vary as much as one percentage point more or less." If
people expect this wobble in the polls, they will not try to interpret the polls too exactly and will
not be disappointed when the wobbling shows up. American pollers have oversold their public
in getting them to expect more accurate predicting than most current polls can give. It would be
more healthy and honest to release polled percentages with a "± 1%" (or whatever the
probable error is) alongside.
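A reminder of this kind can be computed in a few lines. The sketch assumes the simple random-sampling standard error of a percentage, the square root of pq/N, and the conventional factor 0.6745 that converts a standard error into a probable error (the limit within which about half of repeated polls would fall). The case of 50 per cent among 1200 respondents is the one cited in the paper's note on this formula; the function names are hypothetical.

```python
from math import sqrt

PROBABLE_ERROR_FACTOR = 0.6745   # half of a normal distribution lies within this many SEs

def standard_error_pct(p_pct, n):
    """Standard error, in percentage points, of a polled per cent p among n respondents."""
    q_pct = 100.0 - p_pct
    return sqrt(p_pct * q_pct / n)

def probable_error_pct(p_pct, n):
    """Probable error, in percentage points, under simple random sampling."""
    return PROBABLE_ERROR_FACTOR * standard_error_pct(p_pct, n)

print(round(standard_error_pct(50, 1200), 2))
print(round(probable_error_pct(50, 1200), 2))
```

As the paper cautions, this formula assumes strictly random selection, so it is only a rough guide for quota samples.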
The less familiar constant errors in sampling, called biases, can be measured, for one
technique, by the discrepancy between the percentage from the sample and the universe
sampled. These discrepancies are due in part to random sampling error and in part to constant
sampling error or the biases that make up unrepresentative sampling. Representative
sampling means that sample and universe should have matched proportions of each subclass
of age, sex, economic or educational level or other relevant characteristic. "Relevant" means
the characteristic is correlated with the polled responses — which early polls can find out in an
election campaign. Only correlated characteristics need to be representatively sampled, for
they (and only they) will produce a bias. Their bias will be in proportion to
(a) the size of the correlation coefficient (between a characteristic and the poll's
responses in the sample) and
(b) the extent of mismatching the proportions (of the subclasses of that characteristic in
the sample compared with the corresponding proportions in the universe).
Thus, for example, economic subclasses seem to have been correlated in this election
with political party. The "haves" supported Dewey more; the "have nots" supported Truman
somewhat more. Under-representing the lowest income levels, therefore, could be a major bias
in this situation. Therefore sampling economic subclasses accurately was vital for accurate
prediction in this election.
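How mismatched proportions combine with a correlated characteristic to produce bias can be shown with a small sketch. The three income strata, their support rates, and the sample and universe proportions are all hypothetical figures chosen for illustration.

```python
def weighted_percentage(stratum_support, stratum_proportions):
    """Overall support when strata are present in the given proportions."""
    return sum(stratum_support[s] * stratum_proportions[s] for s in stratum_support)

# Hypothetical support for one candidate within each income stratum.
support = {"low income": 0.60, "middle income": 0.50, "high income": 0.35}
# Hypothetical proportions of each stratum in the universe and in a quota sample
# that under-represents the lowest income level.
universe = {"low income": 0.30, "middle income": 0.50, "high income": 0.20}
sample = {"low income": 0.20, "middle income": 0.55, "high income": 0.25}

true_value = weighted_percentage(support, universe)
polled_value = weighted_percentage(support, sample)
bias = polled_value - true_value
print(f"universe: {true_value:.4f}  sample: {polled_value:.4f}  bias: {bias:+.4f}")
```

If support were the same in every stratum, the same mismatch would produce no bias, which is the sense in which only correlated characteristics need representative sampling.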
These basic principles for representative sampling seem to have been inadequately
carried out by most polls using the quota method of sampling. And since most American polls
use the quota method, most of them fell into the same pit. On the other hand, the polls using
the areal method of sampling predicted the elections much more accurately. This difference
between quota and areal sampling in preventing bias and mis-prediction, we believe, is a
major finding from the U.S. election polling of 1948.
For explanation of the greater accuracy of the areal sample, we believe it is due to the
strict randomized selection of respondents which means a high probability of being
representative of all characteristics. In contrast, the quota sampling is apt to under-represent
the lower extreme of society (who voted Democratic more) because these people are less
accessible or convenient. Quota interviewers find it inconvenient to tap remote lumber camps
or farms, transients and tenements, and the unstable isolated types of voters. (Gallup's last
three presidential elections have under-predicted the Democratic vote, probably because the
quota method under-represents the lowest income and less accessible groups.) The areal
sampling design compels the interviewers to reach into all areas, rural or urban, reputable or
disreputable, etcetera, and get all kinds of people more exactly in the long run than most quota
sampling currently does. Quota sampling may predict well for a while and then turn up an
occasional but serious mis-prediction whenever the questions polled are correlated with some
characteristic of the population which may or may not be representatively selected. Areal
sampling will guarantee representative selection within limits of its standard error and,
therefore, eliminates bias within pre-assignable limits. In fact, we are working out, in
cooperation with the Laboratory for Statistical Research of our University of Washington, a
formula relating cost, accuracy, call backs, such that the cost of an areal poll of a pre-assigned
accuracy can be read off from curves beforehand (or alternatively, the accuracy attainable with
a given budget).
The mis-prediction due to the shift, strength, and sampling of opinions has
been described, for elections and more generally for any polls predicting public behavior. But
the relative share of each mis-prediction factor requires further research. Our prescription from
present data is
(a) to measure shifts of opinion by repolling at successive dates and by inquiring
retroactively about dates of decisions (the time dimension),
(b) to measure strength of opinion by those intensity indices which correlate highest with
the behavior to be predicted (the intensity dimension),
(c) to measure the biases and eliminate them by rigorously representative selection of
the population (the population dimension), and
(d) to measure in all geographic areas more representatively to assure fully
representative selection of the population (the space dimension).7
Notes
1. The Washington P.O.L designed and carried out an experiment to measure factors in pre-election polls and prediction. For the shift factor, the poll was made during the last pre-election week and trends were probed by a retroactive question about date of decision. For
the strength factor, registered voters were tabulated separately and voting intentions were
asked. For the sampling factor, an areal and a quota sample were matched in using the
same interviewers and questionnaire, in the same place and period. This poll was not
strictly an election prediction, since its findings were not tabulated till after election day, and
the percentages were not announced before the voting; it was a study of election prediction
factors to enable more accurate prediction in the future. The findings, which shed clear light
on these factors, will be first published in full detail in a special Bulletin of the Washington
Public Opinion Laboratory (available on request from the Laboratory's State College office
in Pullman or University office in Seattle, Washington) and as part of the SSRC's inquiry.
The SSRC financed a repoll after the election, probing the actual behavior and reasons for
it among the respondents in our pre-election areal sample. The principles in the diagnosis
and the prognosis in this paper are based in part on the data from these studies in this
State.
2. Note that this amount of four percentage points seems to be also about the normal
discrepancy between polls and national elections. This is the average discrepancy in 1391
pre-election polls made in many countries by diverse agents in the past dozen years as
summarized by Gallup in a recent privately published paper. It is also the average
discrepancy our Laboratory has found on twelve indices (percents of age, sex, schooling;
marital, veteran, and economic status; occupation; ownership of home, car, and phone;
nativity; and voting) between our state-wide areal samples and state-wide complete
statistics. (See Bulletin 1, Section C, "Validation," Washington Public Opinion Laboratory,
University of Washington, Seattle, April 1948, pp. 22.) The U. S. election polls were unusual
in having this normal amount of discrepancy all in one direction, instead of the usual over-and-under prediction. It seems highly improbable that random sampling error would
account for the generality with which the U.S. pre-election polls showed their constant, or
one-way, error. This improbability further supports attributing the mis-predictions to factors
other than random sampling, such as to the shift in time, to the strength of opinion, and to
the sampling bias discussed here.
3. Of course, if any corrections are made in the raw figures, both should be reported fully to
the public through the same channel to enable checking.
4. Conducted by Professor Allen Edwards and his class in Public Opinion Analysis and not yet
published.
5. A model for this has been worked out in our Psychology Department. See F. P. Kirkpatrick,
"A Technique for the Measurement of Attitudes," M. A. thesis in Psychology, University of
Washington, 1948.
6. The amount, of course, varies with the per cent and size of sample; the "1%" is for a
reported "50%" of people giving a particular response among 1200 respondents, by the
crude formula $\sigma_p = \sqrt{pq/N}$, where p = the per cent and q is its complement.
This formula becomes more complicated as it becomes more exact in stratified
samples. It is not strictly applicable in quota sampling, which doesn't satisfy its basic
assumption that the respondents are selected strictly at random. In quota sampling the
respondents are selected chiefly, within strata, by their accessibility or convenience to the
interviewer.
7. For readers interested in polls not only as instruments (demoscopes) for observing
samples of people, but also in their contribution to systematic social science, the
dimensional analysis of this paper may be noted. The mis-prediction factors here involve
the four chief dimensions, namely time, space, people, and residual characteristics as
specified in the author's Systematic Social Science (University Book Store, Seattle, 1947)
and his Dimensions of Society (MacMillan, 1942). Comprehensive treatment of situations
in the social sciences usually involves all four dimensions. Thus, the general dimensional
formula for all the social sciences (of which the factors discussed in this paper are
subcases) may be written in the particular subform here as:
S = tT-1 : 1L2 : Pp : I 2i
This generalized formula for the polling situation (S) discussed in this paper asserts it to be
composed of a set of intercorrelated indices (I 2i) corresponding to (:) various classes of
people (Pp) in various unit areas (1L2) at dates in time (tT). The quantic number classifying
all such
#18. A Call for Experimental Designs for Election Polling
From the International Journal of Opinion and Attitude Research, Vol. 4, No. 2, 1950
Mexico, D. F., Mexico
(Journal Editor's Note: Stuart C. Dodd, Director of the Washington Public Opinion Laboratory,
University of Washington, suggested that the International Journal publish a symposium on
experimental designs for election polling for the 1952 elections in the United States. The
Journal accepted this suggestion with pleasure and asked Dr. Dodd to write the Call for
contributions to this symposium.
The Editor urges all interested persons to participate, so that the symposium may
achieve the greatest possible usefulness for those who will carry out pre-election polling.)
Many analyses have been published about what was done wrongly or left undone in
polling before the past elections. Now the need is for exact plans for polling before the 1952
elections, in order to achieve better prediction of this basic political behavior. We need precise
yet comprehensive experimental designs: a) for pre-election polling with the best techniques
known to date; and b) for methodological experiments to find the best techniques wherever the
current ones are in doubt or unproved.
The International Journal of Opinion and Attitude Research proposes to conduct such a
symposium in the pages of its next several issues. Anyone interested in pre-election polling is
invited to participate in either or both of two ways:
1. To write out and submit for publication an experimental design or designs.
2. To submit criticisms of designs published in the International Journal.
Participants in the symposium will thus obtain the benefits of criticisms by other
interested persons and will be enabled to improve their designs preparatory to executing them
before election time.
Such systematic publication and criticism of experimental designs can be expected to
yield the following results:
1. Better techniques will be used and more agencies will use them, thus forecasting
more accurately.
2. Coordinated attack can be planned between agencies and between polls within
agencies.
3. Grants from Foundations and other funds can be raised in advance to carry out
these experimental designs.
4. Controversies about alternative techniques can be factually decided by the acid test
of science, i.e., the question of "which technique improves prediction the most?"
5. Standards in polling will be developed in these experimental designs such that a poll
may be reported as done by Design X as described in the International Journal.
As a preparatory step for such a symposium it might be desirable to set up
specifications to which the described experimental designs ought to conform. To provide initial
impetus for the discussion, the present author proposes the following set of specifications.
1. The designing — is the purpose to test technique A? Or to forecast the election using
the set of specified techniques B? Or to explore a problem such as "How will the
'don't knows' vote?" Etc.
The most precise designs will state each purpose as the testing of a
hypothesis to the effect that some statistical index, X, (e.g., the observed
percentage) may be regarded with a stated degree of confidence, Y, as falling
beyond or within , certain limits, Z, (e.g., deviation from actual percentage). Thus, a
pre-election poll might test the hypothesis that "the X per cent observed with a given
technique may be accepted on a Y confidence level as forecasting the actual vote
within Z percentage points." But often hitherto the X, Y and Z are left somewhat
vague or at least not published in advance. In proportion as these are explicitly
stated in the prediction, polling will be less of an art and become more of a verifiable
science.
The most comprehensive designs will inventory the problems (such as
gauging turnout, voting behavior of "don't knows," change of opinion, bandwagon
effects, etc.), and provide an experimental design or even alternative designs for
each. The most comprehensive design will also inventory the hypotheses collected
from the literature and the techniques proposed for testing them. Less
comprehensive designs dealing with a single hypothesis or set of alternative
hypotheses should also be welcomed in the symposium provided each clearly states
its scope.
The essence of an experimental design is specifying the variables studied,
specifying how certain of them are controlled or held constant, and then specifying
how the relations among the other variables are to be measured.
2. The districting: is it a national design? Is it suitable for reporting breakdowns by
states? Etc.
3. The timing: is the experimental design for one poll or a series? For what dates
relative to convention date and election date? Does it provide for post-election
check-backs? Does it include retroactive questions on past behavior or opinions and
prospective questions on intentions and expectations, as well as questions on
current opinion?
4. The sampling: is the universe "all adults," "all the electorate," "registered voters" or
what? What is the size and stratification breakdown of the samples? What are the
exact techniques for drawing the sample? What are the instructions for executing the
sampling design, especially as to refusals, substitutes, call-backs, inapplicable
instructions, failures of interviewers, etc.?
5. The interviewing: what limits of qualifications, training, experience, instructions,
supervision, are called for in the experimental design?
6. The questioning: what are the questions, their phrasing, the response categories,
the cards or other aids, etc.? Are intensity of opinion, expected turnout, relevant past
behavior, relevant group memberships, information and connections with issues,
cognate opinions, and other predictor variables included in the questionnaire?
7. The tabulation: what tabulation and cross-tabulations, what chi square tests or
correlations, what standard errors or other tests of significance should be made and
reported? What indices and what limits for them will be taken as crucially testing the
hypotheses? This is the heart of every experimental design. Without it, the design for
polling is not a scientific experiment and should not be called "an experimental
design."
8. The reporting: where may summary and detailed reports be obtained? What
should they include directly and what by citing other documents (such as a sampling
design, a previous study of the interviewing staff, etc.)?
9. The agency: for what type of surveying agency is the experimental design most
suitable: a journalistic, market research, governmental, or academic agency, or
any type? The best designs might be built around the resources of a particular
agency. This will help to make its pre-election polling plans as effective as open
criticism of fellow scientists can make them. It will also help any donor of subsidies
to decide where they can be best utilized.
10. The financing — does the experimental design include a budget, or estimate of
direct costs at least, wherever a subsidy is needed to supplement the usual
resources of the polling agencies?
11. The coordinating — does the experimental design include an administrative plan,
such as an SSRC Committee might arrange, for dividing up the experimentation
among several polling agencies? This should provide for:
a. the most comprehensive testing to get done, and
b. desirable duplication with identical techniques to corroborate findings (without the
present wasteful duplication with techniques whose comparability is low or
unpublished).
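The X, Y, Z form of hypothesis described under the first specification can be illustrated with a short sketch. It assumes simple random sampling and the normal approximation to a percentage, neither of which the text specifies; the function names and the sample size n are hypothetical. Given an observed percentage X from n respondents, it computes the sampling margin at confidence level Y and checks whether that margin lies within the stated limit of Z percentage points.

```python
from math import sqrt
from statistics import NormalDist


def sampling_margin(x_percent, n, y_confidence):
    """Half-width, in percentage points, of the normal-approximation
    interval around an observed percentage (assumes simple random sampling)."""
    p = x_percent / 100.0
    # Two-sided critical value for the stated confidence level Y.
    z = NormalDist().inv_cdf(0.5 + y_confidence / 2.0)
    return 100.0 * z * sqrt(p * (1.0 - p) / n)


def within_limits(x_percent, n, y_confidence, z_points):
    """True if the sampling margin at confidence Y is no wider than Z points."""
    return sampling_margin(x_percent, n, y_confidence) <= z_points


if __name__ == "__main__":
    # Hypothetical figures: 50 per cent observed among 1,000 respondents.
    margin = sampling_margin(50.0, 1000, 0.95)
    print(round(margin, 2), within_limits(50.0, 1000, 0.95, 4.0))
```

Sampling error is only one source of deviation from the actual vote. Turnout, late shifts of opinion, and the other problems the specifications list lie outside this sketch.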
Each contribution to the symposium should be accompanied by an abstract of two
hundred words or less summarizing its content.
Notes:
1. The SSRC report "The Pre-election Polls of 1948" (SSRC Bulletin No. 60) analyzed past
polls without recommending integrated designs for future polls.
#19. On Estimating Latent Versus Manifest Undecidedness:
The "Don't Know" Per Cent as A Warning of Instability among the Knowers
By
Stuart C. Dodd, University of Washington
and
K. Svalastoga
Educational and Psychological Measurement, Vol. 12, No. 3, Autumn, 1952
I. Problem
Can the stability of opinion or identical response from one poll to a repoll be predicted
from the first poll alone? What relation has the size of the undecided response to the stability
of response among those who are decided (i.e., those who respond "yes" or "no" or any
definite answer other than "don't know")? Does the "don't know" category as a measure of the
manifest undecidedness also predict the extent of latent undecidedness which may result in
change of opinion?
An experiment in the Washington Public Opinion Laboratory found a high negative
correlation of -.91 in a set of questions1 between the percentage of "don't knows" and the
percentage of stable responses among the decided responders. This means that the stability
of opinion was here highly predictable from the amount of indecision observed in a single poll.
If this observation is found to hold generally, then repoll reliability can be well estimated without
a repoll.
II. Definitions
To present the evidence let the variables first be defined operationally as follows:
Ni = the number of respondents asked question i in both the poll and the repoll, where i denotes
each of our seven questions in turn, i = 1, 2, …, 7.
Di = the number of respondents among Ni who give a decided response (i.e., other
than "don't know") to statement i in the first poll.
di = the "decided" per cent = 100 Di/Ni
Ui = the number of respondents among Ni who give an undecided response
ui = the undecided or "don't know" per cent = 100 - di
Si = the number among Di who give identical responses to question i in the poll and in
the repoll.
si = the "stable" per cent among the decided = 100 Si/Di
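As a minimal sketch of these definitions, the three percentages can be computed from the raw counts. The function name is hypothetical; the counts in the usage lines are those printed for statement 1 in Table 1 (267 respondents, 5 undecided, 235 stable).

```python
def poll_percentages(n_i, d_i, s_i):
    """Return (decided %, undecided %, stable % among the decided)
    from the counts Ni, Di and Si defined in the text."""
    decided_pct = 100.0 * d_i / n_i        # di = 100 Di/Ni
    undecided_pct = 100.0 - decided_pct    # ui = 100 - di
    stable_pct = 100.0 * s_i / d_i         # si = 100 Si/Di
    return decided_pct, undecided_pct, stable_pct


if __name__ == "__main__":
    # Statement 1: Ni = 267, Di = 267 - 5 undecided, Si = 235.
    d, u, s = poll_percentages(n_i=267, d_i=267 - 5, s_i=235)
    print(round(d, 1), round(u, 1), round(s, 1))
```

Note that si is taken over the decided respondents only, so the numerator of di reappears as the denominator of si. Note 3 of the paper takes up that point.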
Table 1: Relation of Indecision to Instability in Poll and Repoll. Data on Seven Questions

Statement   Stables   Unstables   Undecided   Total   Per cent    Per cent stable
No.         Si        Di - Si     Ni - Di     Ni      undecided   among the decided
1           235       27          5           267     1.8         89.7
2           201       51          11          263     4.1         79.8
3           209       45          11          265     4.1         82.3
4           216       41          10          267     3.7         84.0
5           175       55          25          255     9.2         76.1
6           158       80          25          263     9.2         66.4
7           174       71          20          265     7.7         71.0

The stables and unstables together make up the decided, Di. Per cent undecided or "don't knows" = 100 (Ni - Di)/Ni. Per cent stable among the decided = si = 100 Si/Di.
III. Experiment
At the end of an interview with 522 respondents in this poll on supranational control of
national affairs each was handed a second questionnaire to be filled out at leisure and mailed
in to the Laboratory. Among other questions, the seven questions studied here were repeated
verbatim in this repoll by mail. 271 respondents or 51 per cent mailed in the repoll and are the
group analyzed here. Their responses are recorded in Table 1.
IV. Findings
The correlation scattergram for the seven "don't know" percentages (ui) and the stable
percentages (si) among the decided responders in the first poll is shown in Chart 1. The
correlation coefficient between indecision and instability was .91. Alternatively stated, the
"don't know" percentages correlated at .91 with the percents of altered responses among the
decided in poll and repoll. This correlation is significant at the one per cent confidence level
even though it is based on only seven observations.2
[Chart 1. Regression of Per Cent Stability among Those Who Give Decided Response, upon
Per Cent Undecided in the Total Sample]
The regression equation for the estimation of the stable percents (si) from the "don't
know" percents (ui) was
si = 92.6 - 2.5 ui          (Equation 1)
The "don't know" percentage is observable from one poll; the stable percentage
requires two polls repeating the questions of the first poll exactly. Since repolling the same
persons is often difficult and expensive it is useful to be able to estimate this stability, or repoll
reliability, of a question from the "don't know" percentage of the first poll. The reason, in
statistical terms, for this high predictance (as a correlation predicting a future event may be
called) is that the proportion of people with ill-structured opinion who prefer to say "I don't
know" tends to be constant. The complementary proportion who assert "yes" or "no" and later
change their minds also tends to be constant in the data here. (Let "ill-structured opinion" be
defined as the opinion of the undecided responders plus the decided but unstable responders.)
"The ratio of the frequencies of the undecided to the unstable tends to be a constant" is
another way of stating the phenomenon observed.
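The correlation and regression reported above can be rechecked from the percentages printed in Table 1. The sketch below is an illustration, not the authors' computation: it uses ordinary least squares and the usual t statistic for a correlation coefficient, and the function names are hypothetical. Because the printed percentages are rounded, the results should come out close to, but not exactly at, the published -.91 and the coefficients of Equation 1.

```python
from math import sqrt

# Percentages as printed in Table 1 (statements 1 to 7).
U = [1.8, 4.1, 4.1, 3.7, 9.2, 9.2, 7.7]          # per cent undecided, ui
S = [89.7, 79.8, 82.3, 84.0, 76.1, 66.4, 71.0]   # per cent stable among the decided, si


def correlation_and_regression(x, y):
    """Pearson r plus the least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    return r, intercept, slope


def t_statistic(r, n):
    """t for testing a true correlation of zero, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


if __name__ == "__main__":
    r, a, b = correlation_and_regression(U, S)
    print(round(r, 2), round(a, 1), round(b, 1), round(t_statistic(r, len(U)), 1))
```

With seven pairs there are five degrees of freedom, for which the tabled two-tailed one per cent value of t is 4.032. A t statistic larger than that in absolute size agrees with the significance claim in the Findings.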
The psychological reason for this high predictance seems to the authors to be that the
opinion is ill-structured. A high percentage of don't knows is a symptom of an ill-structured or
undecided opinion. Ill-structured opinions will shift with time and its intervening stimuli more
than well-structured opinions. If indecision at one moment and instability over a period are but
observed indices of ill-structuring then they should correlate highly — as this experiment
indicates. Of course, many corroborating experiments are needed to verify this hypothesis that
ill-structure breeds manifest indecision and instability and to map the limiting conditions of the
hypothesis. Without inferring a common factor of ill-structuring, this hypothesis of "sure and
stable" opinion is better expressed by the inequality
rds > .9,          (Equation 2; see Note 3)
the "decided and stable" hypothesis.
This hypothesis asserts the questioned generalization that "the per cent of change (100 - s)
goes with the per cent of 'don't knows'." It asserts a correlation higher than an arbitrary high
amount (such as .9) chosen to permit exact tests of sampling significance. This states in
statistical terms our experimental finding: "When few know their minds, those who know may
change."
Notes
1. The questions used in this repolling experiment were the following quasi-scale: On the
whole do you agree or disagree with this statement?
1) This country should always keep its right to go to war for a just cause.
2) In world affairs this country has a right to do as it pleases when it pleases.
3) Our country should feel free to use its military power whenever that can give us what we
want.
4) If this country cannot settle its differences with another nation through friendly talks,
then it should have these differences settled by an international agency.
5) This country should avoid offending world public opinion in its foreign policy.
6) We should work for a world organization with a police force stronger than the armed
forces of any one country including our own.
7) We should give an international body the power to make laws on world affairs.
2. More exactly stated, a correlation of .91 or higher, based on seven pairs of observations,
would only occur in less than one per cent of many similar samples if the true correlation
coefficient in the population of all such samples was zero. Hence the hypothesis of zero
correlation between d and s may be rejected.
3. Some readers may ask whether u and s as defined are variables that by logical necessity
will have a non-zero or spurious correlation, for the numerator of d is the denominator of s.
It will be shown, however, that if u and s are considered as random variables, then their
covariance and hence their product-moment correlation coefficient is zero.
The covariance is
\[ \sigma_{us} = \sigma_{1 - D/N,\; S/D} = -\sigma_{D/N,\; S/D} = -E\big[(D/N - E(D/N))\,(S/D - E(S/D))\big] = -\{E(S/N) - E(D/N)\,E(S/D)\}, \]
where E is the expected or mean value. Then we obtain
\[ E(S/N) - E(D/N)\,E(S/D) = p(x) - p(y)\,p_y(x) = p(x,y) - p(x)\,p_x(y) = 0. \]
That is, the joint probability of x and y minus the marginal probability of x times the
conditional probability of y is zero. But then we also have r_us = 0. This demonstrates
that there is no spurious correlation. The high observed correlation is not due to chance nor
to a spurious common factor. It can only be due to some real and large factor such as the
degree of structuring or decidedness of the opinion in the population.
#20. Research note on the “Law of Forecast Feedback”
Letter by Stuart C Dodd
The American Statistician, Vol. 19, No. 2, April 1965, p. 57
Dear Sir:
"The Law of Forecast Feedback" proposed by G. C. Smith (in The American Statistician
for December 1964) becomes reversed upon more adequate analysis of the psychological
factors (showing it to be a hypothesis of forecast feedback and not a scientific "law" as yet).
Smith hypothesizes: "If a system of forecasting achieves a significant reputation for infallibility,
its forecasts tend to become part of the chain of events, affecting the outcome in an
unpredictable manner." He further states that "The last four words, 'in an unpredictable
manner,' are the most important and probably the most controversial part of the Law."
A revised and more positive hypothesis might claim for testing that knowledge of the
forecast would affect the outcome in a manner highly predictable from adequate foreknowledge obtainable through polls as to
1) how strongly the people concerned liked-or-disliked the forecasted outcome, and
2) how likely-or-unlikely they claimed they were to take appropriate steps to remake
the future outcome more to their liking.
Thus consider people who are much overweight for their age and height and then learn
the medical scientists' forecast of a proportionately shortened life expectancy for them. Is the
effect of this forecast unpredictable? Is it not predictable from polling that: Insofar as those
people
1) assert strong liking for longer life and
2) assert their belief in the likelihood of the forecasted outcome (that overweight tends
to shorten life), and
3) assert their intention or further likelihood of acting so as to reduce their own
overweight, in just so far the forecast will tend to be modified in the direction of less
overweight and longer living?
Or consider Smith’s example of the U.S. presidential elections of November 1948. The
American people widely knew the summer-time forecasts of the pollsters predicting a large
Republican victory. With our present hindsight on such dynamic psychological factors, one
could forecast the chief effects of the pollers' forecasts, namely: The party expecting sure
victory will tend to relax its efforts, while the party warned of probable defeat will strengthen its
efforts in proportion to
1) its strength of desire to alter the threatening outcome and
2) its perception of its own power (in know-how, ability, resources, etc.) to alter that
outcome.
Polls might not be able to measure fully these factors of likings and likelihoods before
they become manifest as in Truman's "give 'em hell" campaign and the Labor Unions' house-by-house canvass in crucial areas. Actuarial experience is needed, of course, for predicting the
amounts of these forecast feedback factors just as much as for the factors in the original
forecast. But the direction of many such post-forecast factors is known and supports our
revised hypothesis that effects of forecasts of human behavior can themselves be forecast
increasingly with research. Their unpredictability, claimed by Smith's hypothesis, seems a
function of the forecaster's current ignorance or inadequate analysis of all the factors in the
situation. Incidentally, a bit of indirect and tenuous support for the revised hypothesis is shown
in the correct prediction of Truman's victory in 1948 by our Washington Public Opinion
Laboratory's poll. Whereas other polling stopped earlier in the campaign period, the
Washington poll operated up to election eve. It thus (unwittingly) reflected the net result of the
post-forecast feedback factors of Truman's dramatic campaign, etc., and the voters' liking for it,
which came to full operation only at the end of the campaign period. The main inadequacy in
Smith's hypothesis seems in treating people as random and unpredictable reactors to
feedback information instead of being purposive reactors who are likely to try to prevent
disliked outcomes. But instead of polemics, let us revise the hypothesis of forecast feedback in
explicit terms of four necessary and sufficient factors which can now be clearly measured in
indices from polls designed to test the hypothesis. This can replace argument by the agreement-compelling power of controlled experiments.
Hypothesis A: "Insofar as a forecasted outcome is
a) widely known and
b) judged likely to happen and
c) strongly disliked and
d) seen as alterable by the actions (or inactions) of the population concerned,
in just so far that forecast is likely to become a further factor stimulating those people to try to
alter the outcome so as to be more likely to be nearer to their liking."
For fuller statement and experimental testing of this transactional "likability" theory for
predicting human behavior including feedbacks in terms of its 3 general modes or factors,
namely: Likenesses known (in 10-point scales); Likings felt (in 10-point scales); and
Likelihoods of doing consequent acts (in 10-point scales), see: Dodd, S. C. "The Likability
Models" to appear in Systematics in 1965.
Sincerely,
Stuart C. Dodd,
Research Professor Sociology
University of Washington, Seattle
#21. The Momental Models for Diffusing Attributes1
Dimensional Formulas for Spreading All-or-None Acts among People in Time
I. The Need for Systematizing
A. The diffusion data in Project Revere
In the course of a research program for the Air Force on spreading messages through a
target population by leaflets dropped from planes, data on diffusion of 51 messages were
measured in some 30 American communities. Three quarters of a million leaflets altogether
were dropped in 27 experimental designs or series of tests, upon some three quarters of a
million citizens. The populations varied from "captive" ones in schools and camps to open
communities ranging from villages to metropolises in the states of Washington, Utah, and
Alabama.
The Air Force wanted to learn principles for predicting and producing diffusion of a
message. They wanted this "Project Revere" to chart principles of high generality for any future
situations or theater of psychological warfare. These principles should guide leaflet operators
in maximizing the diffusion and compliance with leaflet messages under specifiable conditions.
These conditions, in turn, should be reliably observable beforehand and so highly correlated
with the diffusion criterion as to predict it well.
B. The diffusion models indicated
During the course of the three-year project some eighteen types of diffusion curves or
mathematical models were explored in varying degree. With our general dimensional system 2
as a guiding frame of reference or conceptual scheme, more specific models were developed
aimed at isolating each dimension or precondition of diffusion in turn. The curves and data
dealing with the time dimension will be the theme of this paper.3 This means we shall try to
systematize here chiefly those models which predict the growth of diffusion, the actual
spreading of the message through a population during a period.
The 18 curves of diffusion growth, or distribution in time, which were observed to fit data
in Project Revere in some degree, from very close to very loose fits, were:
Rectangular
Power series
Binomial expansion (Bernoulli and Poisson)
Waning exponential
Linear logistic
Gompertz
Linear
Logarithmic
Normal
J-shaped
Harmonic
Harmonic normal
Waxing exponential
Cubic logistic
Random net
Harmonic exponential
Harmonic logistic
Harmonic random net
The diversity of these models and their apparent preconditions calls for systematizing.
Can they be unified as variant forms of one master model for diffusion? Can their preconditions
be so fully specified that the operator can predict which sub-form of that master model the
diffusing will follow in a given situation? We believe the momental growth model described
below and its extensions go far towards systematizing these diffusion formulas.
C. An example of a tested model.
But before describing this systematizing, let an example of diffusion data and modeling
be noted to make the theoretical discussion more concrete. Figure 1 records a controlled
experiment testing the logistic model in diffusing a message through a set of people in a series
of consecutive periods. This laboratory experiment in sociology synthesizes the diffusing of a
communicated item when its necessary and sufficient preconditions are exactly known and
controlled. After much trial in open communities and in captive populations, we had observed
that the diffusion data tended to fit the S-shaped logistic curve better and better in proportion
as the interacting of people was exclusively
1) in pairs;
2) at a steady time rate of meetings; and
3) with equal chance for every person to meet any other person.
Random steady pairing, then, are the three behavioral preconditions, or algebraic assumptions,
underlying this model for diffusing an attribute (such as the hearing and consequent knowing of
a message here.)
These three preconditions of random steady pairing may be set up in a classroom
experiment and verified any time. Thus in Figure 1 a class of 62 students, each starting a
unique message (such as his own birth date, or initials) paired off just once at will in each of
seven successive equal periods. At each pairing each recorded in effect all the messages
known to the other up to date by simply recording his partner's name. When every student
knew every message, diffusion of all the messages was complete. This full diffusion took 25
minutes. At the end of each period a quick tally of the total number of messages actually
known agreed excellently with the number expected by the logistic curve as shown in Figure 1.
This agreement can be dramatized by plotting the logistic expectation on a blackboard in
advance and then plotting the actual diffusion period by period as visible evidence. For a
descriptive index of this closeness of fit in these data, a correlation coefficient of .96 was
observed between the actual and the expected increments in the message knowers. The more
exacting intra-class correlation of the data to the logistic hypothesis was also .96. For a
sampling index of goodness of fit, the chi square test of the discrepancies of the data from the
model was not significant at the one percent level. Such discrepancies would arise by chance
98 times in a hundred among like samples. The hypothesis that these data could have been
drawn from a universe specified by the logistic model could not here be rejected. We conclude
from the visual indications of Figure 1 and from the descriptive and sampling indices that the
logistic model fits these data from a controlled experiment excellently.
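The classroom procedure can be imitated in a few lines of simulation. This is a minimal sketch under stated assumptions, not a reproduction of Figure 1: partners are drawn by lot (as in the earlier experiments the text mentions), every pair exchanges all messages known, and the "expected" curve is an assumed discrete logistic recursion in which the knowers k of any one message grow by k(n - k)/(n - 1) per period. The function names and the random seed are hypothetical.

```python
import random


def simulate_pairing(n_people, n_periods, seed=0):
    """Total messages known after each period under random steady pairing."""
    rng = random.Random(seed)
    known = [{i} for i in range(n_people)]     # each person starts one unique message
    totals = [sum(len(k) for k in known)]
    for _ in range(n_periods):
        order = list(range(n_people))
        rng.shuffle(order)                     # equal chance of meeting anyone
        # Everyone meets in pairs, once per period (one sits out if n is odd).
        for a, b in zip(order[0::2], order[1::2]):
            merged = known[a] | known[b]       # partners exchange all they know
            known[a], known[b] = merged, set(merged)
        totals.append(sum(len(k) for k in known))
    return totals


def expected_totals(n_people, n_periods):
    """Assumed logistic expectation: knowers of one message grow by
    k * (n - k) / (n - 1) in each period; all n messages behave alike."""
    k = 1.0
    totals = [n_people * k]
    for _ in range(n_periods):
        k += k * (n_people - k) / (n_people - 1)
        totals.append(n_people * k)
    return totals


if __name__ == "__main__":
    for observed, expected in zip(simulate_pairing(62, 7), expected_totals(62, 7)):
        print(observed, round(expected, 1))
```

Printing the two series side by side gives the period-by-period comparison that the text suggests plotting on a blackboard. Replacing the shuffle with a non-random pairing rule would show how departures from the randomness precondition affect the fit.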
We predict that any time any diffusion is similarly tested, the fit of the logistic curve will
be excellent — within sampling limits, of course. The three preconditions are controlled in this
laboratory experiment by making the interactors meet in pairs at a steady rate of one meeting
per period and with equal opportunity to pair off with anyone in the group. Rigorous
randomizing was guaranteed in earlier experiments by drawing partners by lot. This rigor was
relaxed in this experiment to let them meet at will — with just as excellent fits resulting. Nonrandom or systematic factors, such as preferences for pairing off with a friend or with a
member of the opposite sex, could be controlled further in the future (as they were not
controlled in this coeducational class) by using a group of strangers of one sex. Still subtler
systematic psycho-analytic factors cued from personal appearances, etc, can be further
reduced towards greater randomness by darkening the room till each partner is chosen with a
minimum of sensory cues.
The three preconditions seem often likely to be approximated in communities wherever
the diffusion is through conversational or other personal meetings, in a stable situation, among
large numbers of people each influenced by many separate or independent reasons for a
particular act of communicating to a particular person at a particular time and place. Such a
multiplicity of thousands or even millions of diverse influences determining the aggregate of
interacts which diffuse an attribute is what approximates a chance distribution, of course, and
tends to fulfill the randomness precondition.
II. Algebraic Development of the Momental Growth Model
A. Definitions and assumptions
Diffusing requires just three sectors or classes of variables to make a closed system.
Diffusing can be completely predicted and its preconditions can be specified in terms of:
A set of Acts, i.e., transmitting an attribute;
A set of Actors, i.e., a population of people or any diffusible entities;
A set of Time intervals, i.e., a growth period.
For simplicity at first the acts will be limited to transitive actions which are observed in
all-or-none or dichotomous categories so that A will have values of 1 or 0 only. Their definition
will be further limited here to first acts, such as a person's first hearing a message and
becoming a knower. His further rehearings are neglected. Thus, the act of a new hearing or
not-a-new hearing becomes identical with the classes of knowers or non-knowers. (The model
stops short of studying any forgetting over a longer time period.) The dimension called acts, A,
and the dimension called population, P, thus merge into one dimension, defined by just two
points or categories, namely "knowing", and "not knowing". In effect this reduces diffusion to
two variables — a proportion of actors growing in time.
The definition of "a set" as an aggregate of interchangeable elements implies equal
opportunity for each element or rectangular distribution when operating with sets (as in forming
an intersect or product of two sets). A rectangular distribution of the acts over the successive
time intervals means a steady acting or constant speed of that activity. A rectangular
distribution of the positive acts among the larger number of persons means a random
distribution as long as the acts in a set of acts are equally operative, as implied by their
definition.
Thus the momental growth models, by assuming sets of acts, sets of people, and sets of
periods, imply the preconditions of steady, random acting. Then further assumptions about
the number and sign of the acts will be seen below to differentiate the members of the
momental family of models.
B. Statistical moments of an attribute—the elementary forms of probability
Given a set of acts distributed randomly in a population and steadily in time, the next
step in deriving the momental models is to compute the statistical moments of an act when
observed as an all-or-none variable or statistical attribute (so A = 1, 0). For this we expand the
discrete form of the formula for the attribute's moments of any order from the zeroth to the fourth,
namely:
\[ M_a \overset{\text{def}}{=} \sum_{1}^{P} A^a / P \quad (= \overline{A^a}) \]          (Equation 1)
defining the a'th moment where a = 0, 1, 2, 3, 4 in turn.
The results are presented in Table 1.
The raw moments may be computed from either zero as origin point or from unity if the
attribute is reversed and its complementary frequency is to be computed. The raw moments of
an attribute are identical for all orders. Every raw moment whether a frequency, a mean, a
variance, a skewness, or a kurtosis is just p, the mean or proportion of attribute holders. For
the deviations are either 0 or +1 and all powers of 0 are 0 and of 1 are 1 so the average of the
deviations is constant and equal to q, the complement of p, so p + q = 1. Thus, the two raw
moments of an attribute measure a simple probability and its complementary probability.
Next, the central moments (about the mean, p, as origin) when computed yield the five
formulas (p + q; 0; pq; q - p; p + q - 3pq) which measure respectively an alternative probability, a
nul probability, a joint probability, a difference probability, and a composite of alternative, joint
and difference probabilities (Table 1). By simply multiplying an attribute by itself a times
and averaging this power of A, formulas for the seven elementary forms of probability are
developed. Each formula is well known by itself, but the systematizing series of them as
moments of an attribute seems little noted in the statistical textbooks.
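The moments described above can be checked by direct computation for an all-or-none attribute held by a proportion p. The sketch below uses exact fractions; the function names are hypothetical. One observation is offered as an interpretation rather than as the author's statement: computed directly, the third and fourth central moments come out as pq(q - p) and pq(1 - 3pq), so the forms q - p and p + q - 3pq given in the text correspond to those moments divided by the second moment pq.

```python
from fractions import Fraction


def raw_moment(p, a):
    """a'th moment about zero of an attribute A that is 1 with
    probability p and 0 with probability q = 1 - p."""
    q = 1 - p
    return p * 1**a + q * 0**a


def central_moment(p, a):
    """a'th moment about the mean p of the same attribute."""
    q = 1 - p
    return p * (1 - p) ** a + q * (0 - p) ** a


if __name__ == "__main__":
    p = Fraction(3, 10)   # hypothetical proportion of attribute holders
    print([raw_moment(p, a) for a in range(1, 5)])
    print([central_moment(p, a) for a in range(0, 5)])
```

The first printed list shows every raw moment of order one or higher equal to p. The second shows the central moments of orders zero to four in turn.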
This series is summarized in Equation 1 for its general statistical formula and by
[M = A^a]          (Equation 1a)
for its dimensional formula. This, by reporting the exponent parameter only, tells
the shape of the curve and leaves the fuller and more local information as to the curve's slope
and origin point to the statistical formula. Several interesting and important properties and
symmetries emerge from Table 1 relating attribute moments to probability formulas.
The five central moments specify basic operations of mathematics and logic. For the
zeroth moment expresses an algebraic sum (p+q) and in the pre-quantitative level, a logical
sum of classes or union of sets. The first moment expresses an algebraic zero amount or a
denial of a class in the language of symbolic logic. The second moment expresses an
algebraic product (pq), and expresses a logical product or intersect of sets in those fields. The
third moment expresses an algebraic difference or one-way complement of classes (in saying
"the members of one class which are not members of the other class"). The fourth moment
expresses all these operations jointly in asserting a difference of a product from a sum (p + q - 3pq).
Thus, these moment formulas specify a sum and a difference, a product, and a negation
between a probability and its complement. In terms of the calculus of classes, these five
combinations of classes involve the logical operations of a sum (∪), a denial (~), a product (∩), an
inclusion (⊃), and an equality (=). (Equality is the case where a sum, a product, and an
inclusion become identical.)
Table 1: Moments of an Attribute related to probabilities
Another emergent symmetry is that the first three moments in Table 1 are computable
at one point in time and so deal with static populations. i.e., with a probability of being. The last
three of the seven moments in Table 1 are meaningless if computed at one point in time. For
p and q are mutually exclusive sets (or classes) by definition and so their product or overlap in
common elements is nul. But if observed in a changing population between two dates (where
there is a positive probability of a non-knower becoming a knower, for example) these three
highest order moments become non-nul and meaningful. Then the middle row in Table 1
expressing negation is midway between the static and dynamic moments in that it negates
both the probability of being and the probability of becoming.
In respect to the range of values for each moment, the two raw moments can vary from
0 to 1, each being the complement of the other since p+q = 1. Among the central moments, the
zeroth is a constant at unity, while the first moment is a constant at zero. The second moment
varies from 0 to .25 at maximum while the fourth moment varies over the rest of that range
from .25 to 1. The third moment varies most widely from +1 to -1 at which extremes it
measures perfect probability of an event and of its opposite event.
C. Powers of the moments — specifying laws of probability growth
The second step in deriving the power moments model is to raise the moments to the tth
power where t is the number of unit periods of time that are observed in the diffusing process.
This multiplying of each form of probability by itself in each successive period means that the
population is interacting or mixing thoroughly and randomly in each period.
Raising the moments of an attribute (in the various forms as specified) to the time power
yields the following family of dynamic probability models or sets of probable diffusion curves,
    $p_t = (\overline{A^a})^t$        Equation (2)
    the momental growth model, the tth power of an attribute's moment
As the exponent a, denoting the order of the moment, takes on successive integral
values, the model becomes in turn the Bernoulli binomial (and approximately the normal
probability curve as t enlarges), the decaying exponential curve, the logistic curve, a difference
binomial, and a binomial of the logistic complement. The exponential curve arises from the first
moment about zero; the others arise from the moments about the mean. This set (with its
variants) includes the most important curves found to fit diffusion data in Project Revere. The
momental growth model thus systematizes in a single three-letter formula a highly predictive
family of diffusion models. Its fuller implications and variant submodels are discussed in
sections ahead. More of its details are given in Tables 2 and 3.
Table 2: The Momental Growth Model For Probable Growth of Diffusing

Dimensional    Simplest statistical        Name of probability growth curve formula
formula        formula
$[A^0]^t$      $(p + q)^t$                 Normal curve (if t is large)
$[A^1]^t$      $q^t$                       Decaying exponential curve
$[A^2]^t$      $pq^{(t)}$                  Linear logistic curve
$[A^3]^t$      $(p - q)^t$                 A difference binomial
$[A^4]^t$      $(1 - 3pq)^{(t)}$           A stochastic function of the 4th moment
$[A^a]^t$      $(\sum_1^P A^a / P)^t$      The momental growth models or set of curves
The exponent in parentheses denotes a stochastic process or chain product of factors, t
in number, which are not constant as each grows from its predecessor.
D. Operational definitions of the models
The first three diffusion models in Tables 2 and 3 are more fully restated as operational
definitions in three tenses. An operational definition is most complete when its operations are
stated in the past, the present, and the future tenses. These tell respectively how to make, or
to know, or to use whatever is defined.
In the past tense the operator is told what operations to perform on what materials and
in what relations in order to generate whatever is defined — the diffusion of a message here.
Thus in Table 3 column 2 gives the formulas (i.e., the rules in algebraic shorthand) for
generating each curve as when p + q is raised to a large power, t, to generate an
approximately normal distribution. This tells the social engineer how to produce the normal
distribution by the rule: Mix many uncorrelated and equivalent acts thoroughly in the population
or distribute randomly n equiprobable attributes in a population.
Another form of generative operational definition is the differential equation such as in
column 3 of Table 3. This states the preconditions as a rate of diffusing at any current instant.
Then integrating this equation which means adding the increments period by period, generates
the curve that is wanted. This tells the social engineer how to develop diffusion stochastically
— that is, how to spread the all-or-none state step by step from any current state.
Next, the present tense of operational defining specifies the operations to identify, or to
classify, or to measure, whatever is defined. Identifying operations are specified in column 4 of
Table 3 in "rectifying" formulas. These recast each curve as a straight line. Then the operator
can plot his diffusion percents against time and see if his points align. To the extent that they
do align, the data are thus easily identified as fitting that shape of curve.
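The rectifying test described above can be sketched in Python. Table 3 is not reproduced in this section, so the two transforms used here are assumptions: the standard logit, ln(p/q), for the logistic curve and ln(q) for the decaying exponential. The diffusion percents are hypothetical values generated from the curves themselves, and the helper names are illustrative.

```python
import math

def logit(p):
    """Rectify a logistic proportion: ln(p / q) is linear in t."""
    return math.log(p / (1.0 - p))

def log_nonknowers(p):
    """Rectify a decaying exponential: ln(q) is linear in t."""
    return math.log(1.0 - p)

def is_straight(values, tol=1e-9):
    """True if successive differences are constant, i.e. the points align."""
    steps = [b - a for a, b in zip(values, values[1:])]
    return all(abs(s - steps[0]) <= tol for s in steps)

# Hypothetical diffusion proportions generated from the two curve shapes.
times = range(6)
logistic = [1.0 / (1.0 + math.exp(-(-3.0 + 1.2 * t))) for t in times]
exponential = [1.0 - math.exp(-0.5 * t) for t in times]

logistic_aligns = is_straight([logit(p) for p in logistic])
exponential_aligns = is_straight([log_nonknowers(p) for p in exponential])
# Applying the wrong transform leaves the points curved.
cross_check = is_straight([logit(p) for p in exponential[1:]])
```

With observed rather than generated data the points will align only approximately, so a tolerance or a fitted line with a goodness-of-fit standard would replace the exact check.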
Another form of the present tense of operational definitions is the dimensional formula
given in column 5 of Table 3. This classifies the shapes of curves by their exponents which are
the natural numbers 0, 1, 2, 3, 4. Finally, the future tense of operational definitions states the
function, or future use, of whatever is defined. Scientists generally want to use diffusion
formulas for predicting the course of a diffusing. For this, the integrated growth formulas in
column 6 of Table 3 are useful.
III. Behavioral Development of the Momental Growth Model
A. Definitions and assumptions in behavior terms
The usefulness and predictivity of any model depends upon the symbols for the
variables, assumptions, and operations having reliably observable referents, of course, either
in the phenomena or in the researcher's behavior. Thus the population, P, must mean
observable persons (or other diffusing entities). The all-or-none acts, A, must be observable
either while happening or from traces later, as when a knower of a new and distinctive
message is reliably observed and the previous acts of a telling and a hearing of the message
can be safely inferred.
As long as the act or its result is reliably observable, its content can be very general. It
need not be limited, as it is in the present paper, to human transferring of a novel attribute. For
the actors can be people, protozoa, or physical particles. The all-or-none act may be diffusing
news or a culture trait, a possession or a disease, a gene or physiological trait, any chemical or
electrical condition, etc. The diffusion model here is intended to be an operationally specified
theory of diffusion which is general to any field of science. It seeks to specify the preconditions
in general language (such as "steady, random, acting") so as to enable predicting this simplest
form of diffusion of anything in any set of entities. In short, our modeling seeks formal
generalizing of the most diverse contents.
But what may be the social counterpart or material interpretations of the formal
algebraic operations used in deriving the momental growth model? Just what behavior of
people corresponds to "raising the moment to the time power", or to "averaging the attribute,"
or to the assumption of "randomness"? The answers to these questions are spelled out in the
forthcoming volume, Revere Studies on Interaction,4 and can only be summarized here in a
few sentences which risk oversimplifying.
The algebraic operations of computing the seven statistical moments of Table 1 here
represent the behavior of the researcher in observing the frequencies of occurrence of the
attribute in the whole population and in each of seven ways, i.e., as "present" (=p), as "absent"
(=q), as "either" (=p+q), as "neither" (0), as "both" (at two dates) (=pq), as "one and not the
other" (=q-p), and as perhaps "either and not both" (=1-3pq).5
Algebraic addition (as in the numerator of every moment, i. e., ΣAa) represents, or may
be interpreted as, the researcher's behavior of counting the attribute in some way in order to
observe the plural as a whole. Algebraic multiplication (as in pq) represents observing the
interaction of well-mixed proportions of the population, p and q, i. e., group behavior (of dyads).
It consists of getting everyone to meet each other in pairs. Algebraic raising to a power (as in
$M_a^t$) represents observing the moment ($M_a$) after the population has interacted fully with itself in
each of t successive periods. Thus acts of plurals are algebraically symbolized as sums;
interacts of groups are symbolized as products (of acts and reacts); and such interactings
repeated t times develop the tth power of the act.
The operations of raising a set of persons to the ath power (which is implied in raising
the attribute-in-persons to the ath power in getting the ath moment) can be portrayed in matrix
terms as:
$P^0$ = a cell in an array, a 0-matrix of order 1 x 1 representing a person;
$P^1$ = an array, a 1-matrix of order 1 x n representing a sociological plural, a list of
persons;
$P^2$ = a cross-classifying of 2 arrays, a sociomatrix or 2-matrix of order n x m, a
sociological group of pairs;
$P^3$ = a cross-classifying of 3 arrays, a 3-matrix of order n x m x r, a pro-con plural (?);
$P^4$ = a cross-classifying of 4 arrays, a 4-matrix of order n x m x r x s, a pro-con group
(?).
Thus the five successive moments describe respectively the attribute as observed in
persons, in plurals, in groups, in pro-con plurals (?), and in pro-con groups (?). (The social
meaning of the third and fourth moments is not fully clear and is questioned here pending
further semantic and field research.)
Finally, the algebraic assumption of random distribution of the attribute in the
population represents behaviorally a law-abiding distribution resulting from many, small,
independent influences, or "multiplex" causation. It represents the democratic ideal of equal
opportunity (for anyone to possess the attribute). It also represents zero correlation with all
variables (other than P and T) and hence means "all else constant" or "other things being
equal." It thus says the attribute system of actors-acting-in-time is isolated or uncorrelated with
other variables and so the diffusing becomes a closed system in the situation studied.
These principles may now be applied to restate the algebraic momental growth formula,
$[A^a]^t$, in terms of diffusing behaviors. Accordingly, the five underlined sentences below will try to
spell out the model for each order of moment in turn, in terms of behavior which a social
engineer may manipulate.
B. Normal Diffusing, $p_t = (\overline{A^0})^t$
The normal probability case of the momental growth model can be stated as follows:
Insofar as n, small, independent all-or-none acts or attributes, each as likely to be
present as absent, are distributed with equal chance for everyone in a population, their
sum will tend to fit the nth order binomial distribution and the normal bell-shaped
distribution curve as n gets large. The simplest statement of normal diffusing in
mathematical terms is the differential equation for its rate of change, namely dp/dt = -kpt.
The primary hypothesis here, as in any model, is that if the specified curve or model
proper fits the data acceptably, then the preconditions specified in the "if," or "insofar as,"
clause may be inferred to be fully and solely present in the given situation. Secondary
hypotheses may then isolate each precondition in turn for separate testing of its full and sole
presence. An acceptable fit may be compatible with acceptable fits for alternative models or
alternative preconditions. In such cases, of course, further experimenting should be designed,
or other criteria should be used to choose, between the alternative equally well-fitting
hypotheses.
The social engineer's rule for one of the many ways of producing normally distributed
diffusion by any one date is then: Mix n equivalent and independent all-or-none influences
thoroughly in a population. The essential preconditions are many, small, independent acts
distributed among people at whatever speed.
An example might be a classroom where many, small items of knowledge are
distributed with equal chance for every student with a resulting normal distribution of such
knowledge in a school test. For a controlled experiment verifying this law one could pitch many
pennies, of course. But for a social experiment, each of n news items may be chosen (not
assigned) by half of a group of people where many small and uncorrelated influences
determine each choice by each person.
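The penny-pitching case can be worked out exactly rather than by experiment. The sketch below, offered only as an illustration, expands (p + q)^n term by term for n = 20 independent all-or-none acts, each assumed as likely to be present as absent (p = q = 1/2); the function name and the choice of n are hypothetical.

```python
from math import comb

def binomial_distribution(n, p=0.5):
    """Terms of (p + q)^n: the chance of holding exactly k of n attributes."""
    q = 1.0 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

dist = binomial_distribution(20)

# The mean number of attributes held, and the most probable number.
mean = sum(k * pk for k, pk in enumerate(dist))
mode = max(range(len(dist)), key=dist.__getitem__)
```

The resulting terms are symmetric about n/2 and take on the bell shape more closely as n is enlarged.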
C. Exponential diffusing, $p_t = (\overline{A^1})^t$
The momental hypothesis, or candidate social law, for the case of the first moment
(about zero) can be stated as: Insofar as all-or-none reacts of one kind in a plural
(reacting to a common stimulation) are randomly distributed in a population and steady
in time, their sum will tend to follow the convex-upwards decaying exponential curve. Its
simplest mathematical form states its rate of change as dp/dt=kq.
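A discrete-period reading of dp/dt = kq may make this hypothesis concrete. In the sketch below, which is an illustration under stated assumptions and not the procedure of Project Revere, a fixed fraction k of the current non-knowers becomes knowers in each period, so the non-knowers shrink as (1 - k)^t. The value k = 0.2 and the function name are hypothetical.

```python
def exponential_diffusion(k, periods, p0=0.0):
    """Step dp = k * q per period, starting from a proportion p0 of knowers."""
    p = p0
    path = [p]
    for _ in range(periods):
        p += k * (1.0 - p)  # a constant fraction of non-knowers is converted
        path.append(p)
    return path

path = exponential_diffusion(k=0.2, periods=10)

# Closed form for comparison: q_t = (1 - k)^t, so p_t = 1 - 0.8^t.
closed_form = [1.0 - 0.8**t for t in range(11)]

# Gains per period shrink steadily, giving the convex-upwards curve.
gains = [b - a for a, b in zip(path, path[1:])]
```

If q is read as the per-period chance of remaining a non-knower, the closed form is a tth power of the kind Table 2 lists as q^t; that reading is an assumption of this sketch.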
Again for the exact preconditions of "randomness" the approximate but socially more
meaningful phrasing of "many, small, different determiners" or "equal opportunity for everyone
to get the attribute" may be substituted. As usual, observing an acceptable fit by the
researcher's pre-announced standards does not prove that the consequent clause (stating the
formula or curve that is expected) follows from the antecedent or "if" clause. That is a matter of
logical or mathematical proof. An acceptable fit tests whether the preconditions could exist fully
and solely in the observed situation. It tests whether such preconditions could explain the data
in the observed situation.
The essential preconditions for predicting and producing exponential growth of diffusing
may be stated like a recipe: mix repeatedly, steadily, and with equal chances, one all-or-none
act in a population, or (for a paraphrase) — repeatedly and randomly distribute an attribute to a
constant percent of a population. An approximate example could be the diffusing of some item
of news or advertising that is communicated daily by a different but about equally effective
channel to a public. For a controlled experiment, arrange for a set of people to add randomly a
fixed percent to their current knowers (or subtract from their current non-knowers). For
randomness to result from internal purposive decisions (not imposed by the experimenter)
arrange or select any situation which gives "multiplex causation" for each person's decisions.
D. Logistic diffusing, $p_t = (\overline{A^2})^t$
The simplest logistic case of the momental growth model, or operationally specified
theory, stemming from the second moment, can be stated as: Insofar as every actor meets
any other actor with equal opportunity and at a steady rate in time and if holders of an
attribute transfer it whenever they meet a non-holder, then the number of holders will
tend to grow in the S-shaped simplest logistic curve defined by dp/dt = kpq, where k = 1.
(The more general case does not restrict k to 1 which means letting holders transfer the
attribute only in a proportion, k, of their meetings with non-holders.)
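The logistic hypothesis can be sketched the same way, stepping dp/dt = kpq one period at a time. This is a hedged illustration only: the starting proportion of 1 percent, the number of periods, and the function name are hypothetical, and the discrete step is an assumed approximation to the differential equation.

```python
def logistic_diffusion(p0, k=1.0, periods=20):
    """Step dp = k * p * q per period: holders meet non-holders at random."""
    p = p0
    path = [p]
    for _ in range(periods):
        p += k * p * (1.0 - p)  # meetings of holders (p) with non-holders (q)
        path.append(p)
    return path

path = logistic_diffusion(p0=0.01)

# Gains per period first rise, then fall, which gives the S shape.
gains = [b - a for a, b in zip(path, path[1:])]
peak = gains.index(max(gains))
```

The largest gain falls in the period where p passes through one half, where the product pq is at its maximum of .25.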
The social engineer's rule for controlling this simple logistic diffusion is thus: Get people
to meet in pairs, with equal and steady opportunity. One example of logistic growth is the
population growth from people mating in pairs when new technology of food production may
permit a new cycle of population expansion. A controlled experiment confirming the logistic
model was the classroom test on "rumor" spreading outlined above.
E. Summary of behavioral implications
Let the behavioral implications of the momental growth models now be summarized.
This model tells the social engineer how to predict and produce diffusion of an attribute through
a population during the period under the general democratic precondition of somewhat equal
chance for every person.6 "Randomness" is here hypothesized to mean in loosely equivalent
phrasing: "a precondition of equal opportunity for everyone," or "a precondition of many, small,
independent determiners," or "a precondition of a homogeneous population." These models
test highly democratic situations in the sense of equality of opportunity in respect to the
interacting at issue.
Then further special preconditions in addition to steady random acting produce the five
submodels as follows:
$(\overline{A^0})^t$: many acts at any speed yield binomial distributions approaching a normal
distribution as the acts become n.
$(\overline{A^1})^t$: one repeated act yields the decaying exponential submodel
$(\overline{A^2})^t$: one interact in pairs yields the linear logistic submodel
$(\overline{A^3})^t$: many pro-or-con acts are surmised to yield the "difference" binomial
submodel
$(\overline{A^4})^t$: one pro-or-con interact in pairs is surmised to yield a submodel defined by a
stochastic or chain product built on the fourth moment, namely $(1 - 3pq)^{(t)}$
In summary, the momental growth model predicts that the diffusing of any attribute or
all-or-none act through a population of any entities in any science (such as a leaflet message
dropped from planes in the case of a human population) will tend to follow one of the
momental growth curves insofar as conditions are stable with fairly equal chance for everyone
to hear the message. This means that the model is independent of the nature of the message,
the culture, or the situation in the community, as these factors affect only the size of the
parameters in the model — especially the "potency" parameter. The shape of the particular
growth curve will be specified algebraically by a particular power of the index of the act ($A^a$)
which is then averaged in the population ($\overline{A^a}$) and raised to the time power ($(\overline{A^a})^t$). The shape of
the particular momental curve will be specified behaviorally according as the "act" is a "multi-act" or a react, or paired act or a pro-con act.
The general slope of the momental growth curve or speed of growth of pt (the
proportion of knowers at time t) will be a function of the local potency. This is measured by k in
the equations above in general and is operationally defined by the number of acts per person
and period. This potency is determined by the degree of interest a particular message may
have to a particular population in a particular culture and current situation.
A general formula for the momental growth model of steady random diffusing of an
attribute or item is:
    $p_t = k\,(\overline{A^a})^t$        the momental growth model        Equation (3)
where
$p_t$ = the cumulated proportion of attribute holders at time t
$A = 1, 0$ = the attribute or all-or-none act diffused
$a = 0, 1, 2, 3, 4$ = the exponent algebraically specifying the order of the moment and
socially specifying the aspect of the activity or the subset of
persons that are observed
$\overline{A^a}$ = the mean of $A^a$ in the population
$t$ = the time power
$k$ = a potency constant, or function of acts per actor-period;
$k = 1$ in the simplest cases to which the present paper was limited.
Insofar as the diffusing is neither steady nor with equal opportunity, departures from this
model will, of course, arise. The momental model provides baselines of the simplest
probabilistic forms of diffusion. From such baselines more complicated models describe and
predict variant and mixed forms of diffusion. The content diffused may be items of culture or
knowledge, disease or inheritance, electrical or chemical states or any other attributes,
whether in a human or other population of entities, under further specifiable conditions.
Notes
1. This research was supported in part by the United States Air Force under Contract
AF33(038)-27122, monitored by the Human Resources Research Institute (now Officer
Education Research Laboratory, Air Force Personnel and Training Research Center), Air
Research and Development Command, Maxwell Air Force Base, Alabama; Permission is
granted for reproduction, translation, publication, and distribution in part and in whole by or
for the United States Government.
2. See S.C. Dodd, Dimensions of Society, Macmillan, 1942, 944 pp. and Systematic Social
Science (offset edition), University Bookstore, Seattle, 1947, 788 pp.
3. More detailed reporting is contained in 17 technical reports to the Air Force, 54 papers in
journals that grew out of Project Revere, and a volume entitled Revere Studies on
Interaction which is ready for press. Five summarizing papers by the author are:
"All-or-None Elements and Mathematical Models for Sociologists," American Sociological Review,
Vol. 17, No. 2, April 1952.
"Diffusion is Predictable: Testing probability models for laws of interaction,"
American Sociological Review, Vol. 20, No. 4, August, 1955.
"The Transact Model - a predictable and testable theory of social action, interaction,
and role action," Sociometry, Vol. XVIII, No. 4, December, 1955.
"Testing Message Diffusion in Harmonic Logistic Curves," Psychometrika, Vol. 21,
No. 2, June, 1956.
"Formulas for Spreading Opinions - a report of controlled experiments on leaflet
messages in Project Revere," Public Opinion Quarterly, 1956.
4. Dodd, Stuart C., Rainboth, F D. and Nehnevajsa, J., Revere Studies on Interaction, circa
1,000 pp. (to be published).
5. This social interpretation of the logical and algebraic formulas will vary somewhat with the
form of formula used (i.e., its units and origin points).
6. Just how to equalize all opportunities in specific situations requires, of course, much
further research and consequent know-how by the social engineer.
Section 3: Studies on Semantics in Polling
Focus on opinion factors of Wordings, W
#22. Public Opinion Definitions
Reprinted from the International Journal of Opinion and Attitude Research, Vol. 2, No. 3, Fall,
1948, Donato Guerra 1, desp. 207, Mexico, D. F.
Concepts of "Public" and "Public Opinion"
In response to the editor's request to discuss the term "public opinion," I shall not
describe its historical usage. Rather, let us strive for an operational definition which will be
most fruitful for scientific progress. We need a precise meaning for what has been a vague folk
term. Since any definition is relevant to some purpose, whether technical or general, I propose
the following definitions expressly for the technical purposes of polling:
1) "opinion" as the statements of respondents (including their agreement to
statements);
2) "public" as any specified population polled (and connoting "all adults of a region" if
not otherwise limited); and
3) "public opinion" as the statements of a specified population as inferred from a
sample polled.
These definitions make "opinion" equivalent to "polled speech behavior" of people,
which is in fact what the polls now record, whether those statements of respondents are about
their knowledge, such as about their own age, or about attitudes or about anything else.
Subclasses of public opinion would be denoted by suitable adjectives (forming logical
products) such as:
a) "informational opinion" (or "information" for short) as verbal responses of
interviewees on matters of their knowledge or information;
b) "informed opinion" as verbal responses of those interviewees who have shown a
specified amount of information in their responses to previous questions;
c) "attitudinal opinion" (or "attitudes" for short) as statements by the respondents
indicating their attitudes;
d) "political opinion" as public opinion on governmental interests, etc. Further adjectives
would qualify opinion as far as needed, including such a term as "popular opinion"
denoting that the opinion of the people was inferred without a poll.
These definitions are consistent with technical usage at present and provide a
consistent system of terms which can be operationally and therefore reliably specified in
greater detail by any manual of polling procedures.
Towards further standardization of technical terms in polling, the following set of
definitions is proposed:
A poll:
One set of person-to-person interviews, on one sample of people,
in one period; a measurement of public opinion.
A survey:
Any poll, or other canvassing type of investigation by other
techniques, of a specified problem in a specified population and
period (e.g., a geological oil survey, a recreation survey, a survey
of churches, etc.).
An inquiry:
A set of questions on one general topic usually for a separate client
or purpose, and these questions may be only a part of one
questionnaire or of one poll or may include more than one poll.
A polling:
A series of polls; a unitary set of polls in several periods, as when
studying change of opinion in a panel.
A poll-by-mail:
Any canvass of persons by mailed questionnaires with one
questionnaire, in one period.
A poll-by-phone:
Any canvass of persons by telephone with one questionnaire in one
period.
A written poll:
A poll where the respondents write their answers, as in a poll-by-mail, or a questionnaire administered to a classroom, etc.
A poller:
A polling agency; any person or organization making a poll or polls.
A repoll
A set of interviews with the same or different respondents with a
repeated questionnaire.
A prepoll
A poll of a full sample inquiring about future polls.
A pretest
An informal poll of a subsample with a trial version of the
questionnaire to test that version in the field.
A question:
One interrogative sentence or phrase to be answered.
A sub-question:
A question asked only of those who responded in a particular way
to a previous question; a conditional question to be asked of those
who fulfill some condition as "If married ask: (the sub-question)."
A co-question:
A follow up question, depending on a previous question which it
explores further but asked of all respondents, e. g., a "reason why"
question or an "intensity" question.
A para-question:
Alternative phrasing of a question, as in open end or closed end
forms.
A filter question:
A question whose responses classify some respondents for further
questioning or for a subquestion, e.g., "If 'yes,' to filter question,
ask."
A questionnaire:
A set of questions.
A check list:
A set of possible answers for the respondent to check the ones he
endorses.
An item
(of a check list):
One possible answer in a set.
A version:
A particular draft of a question or of a questionnaire, usually dated.
A response:
The respondent's answer to one question, including "no response."
Interviewee:
The person interviewed — to be used to denote that a poll is by
face-to-face interviews and not by telephone, mail or other
methods. This is systematic usage with "interviewer," "interview,"
etc.; it denotes a subclass of respondent.
Respondent:
The person responding to a face-to-face interview, or telephonic
interview, or mailed questionnaire, or other form of personal
canvass; a subclass of informant.
Informant:
Any person giving information whether by letter, printed publication,
interview, or otherwise, and whether in response to a question or
not.
A schedule or a
ballot or a format:
The sheet(s) recording one person's responses (a "blank ballot"
before the responses are recorded). (The "schedule" is most used
but it has other meanings also as in "time schedule." The term
"ballot" is preferred by some pollers while others object to its
connotation of a political poll. The term "format," with its connotation of
a typesetter's layout of a page, is used by some as a technical
name with least irrelevant meanings to pollers. What folk term
ultimately acquires a technical meaning must be decided by usage
of technicians.)
A sub-percent:
A percent of a percent, such as a percent of those who gave a
particular response to a previous question. The sub-percents will
add up to 100 percent which is only a part of the sample.
Percentages whose denominators are the entire sample polled
should not be called sub-percents.
A sub-opinion
An opinion whose affirmers all affirm opinion X. An opinion of all
members of a subclass of the class of respondents who hold
opinion X; part of those holding opinion X hold its sub-opinion; and
an opinion of a sub-percent.
Affirmative or
"pro-opinion":
A "yes" response to the question, however it is phrased.
Negative or
"nul-opinion":
A "no" response, denying the question, however it is phrased.
A counter or
"con-opinion":
A response affirming the opposite of some statement.
No opinion:
A "don't know" answer, whether undecided, uninformed, or other
subclass of non-response when the question has been asked.
Not asked:
A question omitted, whether by error, or by instructions as
inapplicable, or for other reason.
No contact:
Interviewer was unable to find or to meet an assigned respondent.
Refusals:
Any respondent who is contacted but stops the interviewing for any
reason.
Rejects:
Responses which are unclassifiable by the editor into the other
categories.
Call back:
An interview after previous unsuccessful attempt(s) to contact the
respondent.
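The distinction drawn above between a percent and a sub-percent can be illustrated with a small tally. The following Python sketch uses a hypothetical sample of eight respondents answering a filter question and a sub-question; the data, variable names, and the use of None for "not asked" are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical responses to a filter question and its sub-question.
# None marks "not asked" (the sub-question did not apply).
responses = [
    ("yes", "pro"), ("yes", "con"), ("yes", "pro"), ("yes", "no opinion"),
    ("no", None), ("no", None), ("no", None), ("no opinion", None),
]

sample_size = len(responses)
filter_counts = Counter(first for first, _ in responses)
sub_counts = Counter(second for first, second in responses if first == "yes")

# Percent: the denominator is the entire sample polled.
percent_yes = 100.0 * filter_counts["yes"] / sample_size

# Sub-percents: the denominator is only those who answered "yes".
sub_percents = {
    answer: 100.0 * count / filter_counts["yes"]
    for answer, count in sub_counts.items()
}
```

Here the sub-percents add up to 100 percent of the "yes" respondents, who are themselves only half of the sample.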
Further statistical terms related to the sampling of polls are better standardized. See, for
example, A Statistical Dictionary, Kurtz and Edgerton, John Wiley, 1939, page 191; Descriptive
and Sampling Statistics, Peatman, J. G., Harper & Bros., for such terms as: sample, universe,
quota, stratum, cell, representative, primary sampling unit, secondary sampling unit, domal
sampling, areal sampling, random sampling, systematic (random) sampling, random
(numbers), error, sampling error, random error, two-way errors, variable errors, bias, one-way
error, constant error, etc.
#23. The Interrelation Matrix
Reprinted from Sociometry, Vol. III, No. I, 1940, American University of Beirut
The purpose of this paper is to apply algebra to the data of interpersonal relations in
order to increase both the precision and the generality of any analyses or syntheses of those
data.
I. The Matrix
In any set of persons whose interrelations have been observed and recorded, let the
data be arranged in a rectangular tabulation where each person heads one column and also
heads one row. Let the indicator of the nature and degree of the interrelation between a pair of
persons be entered in the cell at the intersection of the row and column which represent those
two persons. These cell entries (between the double vertical lines in Table 1) are called a
“matrix” by mathematicians and matrix algebra has been developed since the middle of the last
century for dealing with such data. The algebraic treatment of this common form of tabulation
of data may be made clearer by an example. Table 1 on the following page records the
interrelating attitudes of 261 freshmen at the American University of Beirut as measured by a
calibrated social distance test between religious groups.
In this example of the Beirut test the properties of a matrix that make it a useful scientific
instrument may be noted.
A. Inclusiveness
By providing a cell for every possible pair of members of the group studied the matrix
compels the observer to observe completely and not select the dramatic relation, and neglect
others. Reasoning from particular cases which may or may not be representative is made less
likely. When all the interrelations are observed and recorded, indices summarizing them, or
predicting consequences in the future, are more soundly based. Sociograms are excellent to
visualize patterns of interrelations and to facilitate many inferences, but they have the limitation
when the group is large of either becoming confusing or (in order to restore simplicity) of
selecting some relations and omitting others. It is the essence of scientific induction that any
generalization or theory must fit all the facts and not merely a selection of them when such
selection may be open to the bias of the observer in selecting such facts as fit his theory. The
matrix can handle, in orderly fashion, all interrelations of a group of any size even up to the two
billion odd living human beings.
Table 1: Social Distances between Religious Groups*
P = 261 Freshmen of the American University of Beirut
Scale of this Beirut test: 0% = maximal friendliness, 50% = indifference, 100% = maximal hostility.
Pp    Responders
61    Greek Orthodox
26    Jew
39    Protestant
25    Catholic
82    Moslem Sunnis
28    Moslem Shiites
Popularity
Respondees
Catholic
Moslem
Sunnis
Greek Orthodox
(5)
Jew
Protestant
65
25
30
45
45
42
45
35
37
37
(6)
37
50
70
10
(11)
25
32
32
35
(2)
37
45
52
35
(2)
45
60
57
30
39
44
41
41
40
75
52
42
20
(0)
42
38
62
29
35
41
43
41
Moslem Shiites
Friendliness
"Popularity" = weighted average distance towards recipient (respondee) group
excluding its in-group distance.
"Friendliness" = average distance from expressor (responder) group excluding its
in-group distance.
* A brief report on the construction of this scale and findings from its application is in:
Dodd, S. C., "A Social Distance Test in the Near East," American Journal of Sociology, Vol.
XLI, No. 2, Sept. 1935.
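The two summary indices defined under the table can be computed from any such matrix. The sketch below uses hypothetical figures for three groups, not the Beirut data, and assumes that the weights in the "weighted average" are the sizes of the responding groups, which the source does not state. All names are illustrative.

```python
# Hypothetical data for three groups; these are not the Beirut figures.
groups = ["X", "Y", "Z"]
sizes = {"X": 10, "Y": 30, "Z": 20}

# distance[responder][respondee] in percent (0 = friendly, 100 = hostile).
distance = {
    "X": {"X": 5, "Y": 40, "Z": 60},
    "Y": {"X": 30, "Y": 2, "Z": 50},
    "Z": {"X": 70, "Y": 20, "Z": 0},
}

def friendliness(responder):
    """Average distance expressed by a group, excluding its in-group cell."""
    others = [g for g in groups if g != responder]
    return sum(distance[responder][g] for g in others) / len(others)

def popularity(respondee):
    """Average distance received by a group, excluding its in-group cell,
    weighted by the size of each responding group (an assumed weighting)."""
    others = [g for g in groups if g != respondee]
    total_weight = sum(sizes[g] for g in others)
    return sum(sizes[g] * distance[g][respondee] for g in others) / total_weight
```

On this scale a lower index means less distance, so the group with the lowest "popularity" figure is the one regarded most favorably by the others.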
B. Isolation
A second property of the matrix is that, in common with the sociogram, it isolates one
type of interrelation from the many types that may be in a group to concentrate study upon.
This isolation of each factor to study it first alone and then in combination is an essential
scientific technique in seeking to understand the complicated web of interrelations enmeshing
people. All the cells must have entries of one type of indicator, whether of an attitude, such as
in the Beirut test above, or an indicator of some other interrelation.
C. Joint relationship
The cell, located jointly by the row and column coordinates, emphasizes its entry as an
interrelation between two parties (A and B); it is not a characteristic of either party, but only of
the pair. As there are two cells for each pair of parties, located symmetrically about the main
diagonal of the matrix, the existence of two possible relations, of A to B and of B to A, is
made explicit. The row and column feature calls for identifying the persons or groups between
whom interrelations are alleged. This should tend to eliminate unscientific talk about some
interrelations which are so vaguely conceived that no persons nor groups can be specified as
row or column headings.
D. Aggregation
The cell entries are not added nor multiplied but merely collected and listed.
Mathematicians call this type of combination an "aggregation" and matrix algebra is a body of
rules for handling aggregations. Thus two matrices may be multiplied together yielding a third
matrix. The sequence of factors, which is immaterial in ordinary multiplying, makes a difference
here in that the product of matrix $M_1$ times matrix $M_2$ may not be the same as the product of
matrix $M_2$ times matrix $M_1$ (i.e., $M_1 \cdot M_2 \neq M_2 \cdot M_1$). One is a product obtained by
"post-multiplying," the other by "pre-multiplying."2
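The dependence of a matrix product on the order of its factors can be checked on a small worked case. The sketch below is illustrative only: the two 2 x 2 matrices and the helper name `matmul` are assumptions chosen for the example and do not come from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][c] for k in range(inner)) for c in range(cols)]
            for row in a]

# Hypothetical 2 x 2 matrices, chosen only to show the effect.
M1 = [[1, 2],
      [0, 1]]
M2 = [[1, 0],
      [3, 1]]

post = matmul(M1, M2)  # M1 post-multiplied by M2
pre = matmul(M2, M1)   # M1 pre-multiplied by M2

print(post)  # [[7, 2], [3, 1]]
print(pre)   # [[1, 2], [3, 7]]
```

The two products share no more than a single off-diagonal entry, which is the sense in which the sequence of factors "makes a difference."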
The interrelations of people in pairs may not be subject to significant combination by
ordinary adding, subtracting, multiplying or dividing but they can be listed, i.e., aggregated.
This branch of mathematics, then, provides rules well adapted to deal with the data of
interrelations.
As a result of these properties the matrix offers an orderly way of specifying the
complete structure of a human group in respect to one kind of interrelation at a time. It is
inferior to the sociogram in visualizing that structure but superior to it in providing a basis for
further mathematical analyses or syntheses.
II. The Units
The units in the inter-person matrix may be generalized in several directions.
A. Population Units
Any row or column, instead of being headed by a person, may be headed by a plural, i.e.,
more than one person. The indicator in the cell then would be the average indicator for the
persons in the plural, as are the social distances in the Beirut test above. Let the term "party"
denote either a person or a plural. The inter-party matrix then can provide arrays (i.e., rows and
columns) and resulting cells for all interrelations between persons, between persons and
plurals, and between plurals and plurals, in any or all combinations, as completely as they may be
observed.
Algebraic symbols, which generalize arithmetic quantities, will serve to generalize the
units of the interrelation matrix. Let P denote the number of persons in a party whether one or
more. Let the subscript p denote the aggregation, or listing, of parties, p in number. Thus in the
Beirut matrix above p = 6 (the six religious plurals) and P is a variable with a different value for
each plural as listed in the first column. Let the double colon :: denote a cross-classifying of
these parties against each other, so that the symbols $P_p :: P_p$ denote rows and columns
headed by the parties. Let S denote the situation as recorded, i.e., the matrix. Let I denote the
indicator of the interrelation between the parties of each pair, so that I is the cell entry and
varies over as many values as there are cells (which are $p^2$ in number). Then the matrix can
be specified in the algebraic equation:
$S = P_p :: P_p : I$
Equation (1)
where the single colon denotes "correspondence," i.e., that in every cell there is a value
of the indicator I which corresponds or belongs in that cell.3 In the Beirut matrix above the
indicator I is the average score of a social distance test having a particular value in each cell.
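A small inter-party matrix of the form of Equation (1) can be written out directly, together with the two marginal averages defined under the Beirut table. This is a sketch under stated assumptions: the three parties, their sizes and all cell values are hypothetical (they are not the Beirut figures), and weighting "popularity" by the size of each responder group is one reading of "weighted average," not something the text spells out.

```python
# Hypothetical parties and their sizes P; not the Beirut figures.
sizes = {"A": 10, "B": 30, "C": 60}
parties = list(sizes)

# S[responder][respondee] = I, a distance score in percent (hypothetical values).
S = {
    "A": {"A": 5,  "B": 40, "C": 60},
    "B": {"A": 30, "B": 10, "C": 50},
    "C": {"A": 20, "B": 70, "C": 0},
}

def friendliness(responder):
    """Average distance expressed by a responder group, excluding its in-group cell."""
    others = [S[responder][q] for q in parties if q != responder]
    return sum(others) / len(others)

def popularity(respondee):
    """Average distance received by a group, weighted by responder size (an assumption),
    excluding its in-group cell."""
    others = [q for q in parties if q != respondee]
    total = sum(sizes[q] * S[q][respondee] for q in others)
    return total / sum(sizes[q] for q in others)

n_cells = sum(len(row) for row in S.values())  # p squared cells for p parties

print(n_cells)            # 9
print(friendliness("A"))  # 50.0
print(popularity("C"))    # 52.5
```

Each cell belongs to an ordered pair of parties, so `S["A"]["B"]` and `S["B"]["A"]` are free to differ, as the section on joint relationship requires.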
B. Indicator Units
The indicator I may be generalized in two directions. First there may be more than one
indicator in a cell, i.e., more than one kind of interrelation between any pair of parties. Thus in
Moreno's spontaneity test one person on the stage may express several different emotions
towards another person. Let the aggregation of such indicators, i in number, be denoted by the
i written as a subscript. The symbol $I_i$ then means a list of indicators, listing i different kinds of
interrelations. This $I_i$ expands the second degree matrix of two sets of arrays (i.e., rows and
columns) into a third degree matrix of three sets of arrays. If the second degree matrix with
one indicator is written on a page and rewritten with a different indicator on each of i pages,
these superposed pages would serve to visualize the third degree matrix. The degree of the
matrix is denoted by the number of its subscripts.
$S = P_p :: P_p : I_i$ = the third degree interrelation matrix
Equation (2)
This matrix of interrelations of any kinds between people in any groupings is obviously a
more complete and flexible symbolizing of the complex phenomena of inter-human relations
(which we mean by the term "interrelations"). It needs still further generalizing, however, as will
be seen below.
Second, indicators may be generalized in another direction as either qualitative or
quantitative.
We have elsewhere elaborated a theory of measurement that man's observing of
phenomena starts with the qualitative and proceeds to increasing quantitative precision. He first
distinguishes and names a qualitative something and then, on reobserving it, notes first its
presence or absence, which converts that constant quality into a primitive all-or-none, or
two-point, variable. Then he distinguishes ordinal degrees of the quality, as in positive,
comparative, and superlative degrees of adjectives and adverbs. Ranks are ordinal units
stating a sequence without any assertion that the units (i.e., intervals between ranks) are
equal. Moreno's sociograms recording attitudes in three degrees of "attracted," "indifferent,"
"repelled" are in ordinal units. Still greater precision comes with the devising of cardinal units.
Cardinal quantities are multiples of equal and standardized units. Cardinal units, finally, may
be calibrated in measuring the accuracy (i.e., the closeness to perfect equality) of those units.
The Beirut test above is believed to be the first social distance test to be constructed with
cardinal units. The intervals between its five statements ranging from "I would be willing to
marry--" to "I wish all these (persons) could be killed" were equalized by Thurstone's
technique.4 In this theory of measurement we define as "quantitative" all degrees of precision
from the all-or-none through the cardinal inclusive. An indicator, I, is understood to have an
exponent of unity and this $I^{+1}$ denotes any quantitative indicator of the interrelation studied.5
Now when the exponent of unity changes to zero, $I^0$, the indicator denotes a quality. Since a
zero exponent in algebra reduces its base letter to a value of one, the indicator-to-the-zero power
means "one," but one of the kind of interrelation specified by its particular subscript.
$I^0 = 1$
Equation (3)
But $I_A^0$ = anger, $I_B^0$ = friendliness, $I_C^0$ = any specified attitude or other interrelation, etc.
Thus the indicator-to-the-zero power symbolizes algebraically any qualitative
interrelation which has not been differentiated into degrees or amounts of that quality. $I^{+1}$
symbolizes any qualitative interrelation which has been differentiated into degrees or amounts
of the quality, thus becoming a quantitative indicator. This invention of using the zero exponent
combined with subscripts in matrix algebra has enormous possible fruitfulness for sociometric
research. It brings qualitative phenomena into the realm of exact mathematical treatment (in so
far as the qualitative data are reliable, of course).
Using this notation we have developed tentative mathematical rules for adding,
subtracting, multiplying, and dividing qualitative indicators with each other and with quantitative
indicators. These rules represent refinements in logic beyond the accuracy which is usually
possible when logic is limited to verbal language where most words are primitive all-or-none
variables and do not permit of expressing all intermediate degrees of variation between "all"
and "none," between meaning "this" and "not this."6
The exponent on the indicator may take other values besides zero and unity but the
development of their meaning is too long a story to enter into here. In general i in
post-superscript position, $I^i$, can denote the varying values which the exponent may have in
specified situations.
C. Time Units
One further generalizing of the units of the matrix is useful. Thus far the indicator I has
implied a static interrelation observed at some moment of time. To include dynamic data or
interrelations changing in time, let T represent the length of any time period. Then if I
represents the amount of change, growth, process or action in that period, the rate of change
is I/T. This can be written in equivalent form as $T^{-1} I$. Since everything happening in time has a
time series, whether explicit or only implicit, the symbol $T^{-1} I_i^i$ represents the changing of an
aggregation of qualitative and quantitative interrelations of any number of kinds. If, further, this
changing is observed for a series of periods, this aggregation of time periods and their
number may be denoted by t used as a pre-subscript, ${}_tT$. This expands the third degree matrix
of Eq. 2 into a fourth degree "interaction" matrix as there are now four subscripts.
$S = P_p :: P_p : {}_tT^{-1} I_i^i$ = the fourth degree interrelation matrix
Equation (4)
This fourth degree matrix, S, is visualizable as a second degree matrix of p rows and p
columns for the parties studied, duplicated on i pages for the i kinds of interrelations studied,
and this "book" of i pages reproduced t times once for each of the t periods observed. With
these generalizations of its units, the interaction matrix becomes a powerful tool for
symbolizing inter-human relations and for manipulating them with the precision and generality
of algebra.
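The "book of pages" picture can be mirrored in a nested data structure: one page of p rows and p columns per kind of interrelation, and one book per period. The sketch below is only an illustration; the party names, kinds of interrelation, period lengths and the single filled cell are all hypothetical, and storing the amount of change I in each cell and dividing by the period length T is one straightforward way to obtain the rate I/T.

```python
parties = ["A", "B"]                # p = 2 parties (hypothetical)
kinds = ["friendliness", "trade"]   # i = 2 kinds of interrelation (hypothetical)
periods = [1.0, 1.0, 2.0]           # lengths T of t = 3 observed periods, arbitrary units

def blank_page():
    """One second degree matrix: a row party by column party grid of zeros."""
    return {a: {b: 0.0 for b in parties} for a in parties}

# book[period index][kind][row party][column party] = amount of change I in that period
book = [{k: blank_page() for k in kinds} for _ in periods]
book[2]["trade"]["A"]["B"] = 6.0    # hypothetical entry

def rate(t, kind, a, b):
    """Rate of change for one cell: the amount I divided by the period length T."""
    return book[t][kind][a][b] / periods[t]

n_cells = len(periods) * len(kinds) * len(parties) ** 2

print(n_cells)                     # 24
print(rate(2, "trade", "A", "B"))  # 3.0
```

The four levels of nesting correspond to the four subscripts t, i, p and p of the fourth degree matrix.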
III. Some Applications
As evidence for the utility of this algebraic analysis, consider the improvement it yields
in defining some sociological concepts such as "plural," "group," "community," "mobility,"
"social forces" and "social control," "the economic process," and others discussed later on. We
define a plural as simply an aggregation of persons. The arrays of the matrix when no
interrelations are asserted (i.e., with all cells blank) represent a plural. The matrix of an
interrelation specifies a sociological group. We thus define a group as a plural of interrelated
persons. A community is specifiable as a group with more than one kind of interrelation in
common. Thus the "Italian community" in New York City not only has their language in
common but other culture traits from Italy which interrelate them. Thus we define a plural by P
> 1 whether I ≥ 0. A group we define as a subclass of plurals by the conditions that P > 1 and I >
0. A community we define as a subclass of group by the previous conditions of a positive
interrelation between people and the further condition that i > 1, i.e., that there be more than
one kind of interrelation. The nature and degree of the community is specified by the number
of people, P, the kind and number of their interrelations, i, and the amount of each
interrelation, I, between every pair of persons as specified by the third degree matrix, Eq. 2.
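The three definitions nest, and the nesting can be stated as a short decision rule. The sketch below rests on an interpretation: it reads the conditions as P > 1 for a plural, P > 1 with I > 0 for a group, and additionally i > 1 for a community. The function name `classify` is hypothetical, and treating I as a single summary amount (rather than one value per cell) is a simplification made for the example.

```python
def classify(P, I, i):
    """Return the most specific class for an aggregation.

    P = number of persons, I = amount of interrelation (simplified to one number),
    i = number of kinds of interrelation.
    """
    if P <= 1:
        return "person"       # a single person is not a plural
    if I > 0 and i > 1:
        return "community"    # a group with more than one kind of interrelation
    if I > 0:
        return "group"        # a plural of interrelated persons
    return "plural"           # an aggregation with no interrelation asserted

print(classify(5, 0, 0))  # plural
print(classify(5, 1, 1))  # group
print(classify(5, 1, 3))  # community
```

Since every community is also a group and every group is also a plural, the function reports only the narrowest class that applies.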
The mobility in a population is exactly specified when plurals head the arrays of the
matrix and the number of people moving from one plural to another in a time period is the entry
in each cell. Gross and net gains and losses, or turnover, can be calculated in various ways
from the basic data of this mobility matrix.
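A mobility matrix of this kind, and a few of the summaries that can be drawn from it, might look as follows. Everything here is illustrative: the three plurals and the counts are hypothetical, rows are taken as origins and columns as destinations, and "turnover" is computed as gains plus losses, which is only one of the "various ways" the text leaves open.

```python
plurals = ["rural", "urban", "abroad"]   # hypothetical plurals

# moves[origin][destination] = persons moving in the period (hypothetical counts)
moves = {
    "rural":  {"rural": 0,  "urban": 40, "abroad": 5},
    "urban":  {"rural": 10, "urban": 0,  "abroad": 15},
    "abroad": {"rural": 2,  "urban": 8,  "abroad": 0},
}

def gross_loss(p):
    """Persons leaving plural p: the sum of its row."""
    return sum(moves[p].values())

def gross_gain(p):
    """Persons entering plural p: the sum of its column."""
    return sum(moves[q][p] for q in plurals)

def net_gain(p):
    return gross_gain(p) - gross_loss(p)

def turnover(p):
    """Gains plus losses (an assumed definition of turnover)."""
    return gross_gain(p) + gross_loss(p)

for p in plurals:
    print(p, gross_gain(p), gross_loss(p), net_gain(p), turnover(p))
# rural 12 45 -33 57
# urban 48 25 23 73
# abroad 20 10 10 30
```

Net gains over all plurals sum to zero, because every move out of one plural is a move into another.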
We have elsewhere defined7 a societal force as the acceleration of change in a
population. The amount of the force is measurable as the product of the acceleration and the
population:
$F = P I T^{-2}$ = a societal force
Equation (5)
The unit of force is one person changed one unit of any specified kind per period. The
per period calculation of this formula provides an exact operational definition of a frequently
used but somewhat vague concept. We propose to define societal control as an "interparty
force," i.e., the accelerating of change in some people by other people. The parties controlling
head rows of the matrix, the parties controlled head columns, and the indices of acceleration of
change, $I T^{-2}$, are the cell entries. In algebraic notation:
$C = P_p :: P_p : I T^{-2}$ = societal control
Equation (6)
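The force formula and the control matrix can be put into numbers as a rough illustration. The population, the amount of change, the period length and the party names below are all hypothetical; the function names are assumptions made for the example, and the units are simply whatever units P, I and T are recorded in.

```python
def acceleration(I, T):
    """Acceleration of change: the amount I divided by the period length T squared."""
    return I / T ** 2

def societal_force(P, I, T):
    """Product of population and acceleration, following the form of Equation (5)."""
    return P * acceleration(I, T)

# Hypothetical case: 1000 persons, 3 units of change, a period length of 2.
print(societal_force(1000, 3, 2))  # 750.0

# Control matrix in the form of Equation (6):
# rows are controlling parties, columns are controlled parties (hypothetical).
controllers = ["press", "school"]
controlled = ["voters", "pupils"]
change = {("press", "voters"): 4.0, ("press", "pupils"): 0.0,
          ("school", "voters"): 1.0, ("school", "pupils"): 8.0}
T = 2.0
C = {r: {c: acceleration(change[(r, c)], T) for c in controlled} for r in controllers}

print(C["school"]["pupils"])  # 2.0
```

Listing the controlling and controlled parties explicitly is the step the text asks for before any claim about "societal control" can be checked.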
When studies of societal control come down from the clouds of nebulous verbiage to
specify the list of parties controlling and controlled and the kind and amount of change
accelerated in each pair of parties, then such studies can be verified and serve as a factual
basis for induction of scientific theories of societal control.
For another application, the fourth degree matrix can provide the basic data for
economics. The interrelations are exchanges, usually of goods for money. The parties are
buyers and the sellers. The matrix symbolized in Eq. 4 can specify every person and economic
grouping of persons on earth and every commodity or service ever bought or sold in the past
or the future. Of course most cells have zero entries as most people have only a few dozen
parties who are economically related to them out of the two billion odd people in the world
today. Every account of any business firm is but a selection of specified arrays from this
matrix. Every inventory of stock is but another selection of arrays. Wage schedules, price lists,
catalogues of goods, are but other selections from this all-inclusive economic matrix.
Budgets and production schedules and tariffs project certain arrays of the matrix ahead
in time. In short, economic statistics of all sorts can be shown to be summaries or derivations
from the data of this fourth degree matrix. Our hypothesis of the "economic matrix" claims that
all quantitatively recorded economic data and principles are derived from some selection of
cells from this matrix. We have yet to find any statistical economic data not derived from it.
Evidence disproving this hypothesis is being sought in order to test it more fully. This formula,
Eq. 4, seems to specify the general form of "the economic process" as an immense
aggregation of economic acts.
For one more field of application it may be noted that every sociogram, every formula,
and every tabulation in Moreno's "Who Shall Survive?" have been expressed by us as an
interrelation matrix or derivation from it. We went through that volume and wrote the variant
form of the formulae presented here for every tabulation and graph without exception. In fact it
was a study of this book which led us to extend a general theory (which we were developing
for systematizing quantitative data in all the social sciences in a complex metrical equation) in
such a way as to include the interrelation matrix.
The interrelation matrix has been developed as a special case of a still more general
and flexible formula which seems able to subsume quantitatively recorded data of any kind in
any social science. This general formula, specified in a matrix equation, is called "S-theory"
from its most distinctive symbol. It has been presented elsewhere8 but here the fact may be
noted that the interrelation matrix fits in as a part of a comprehensive algebraic systematics for
the social sciences.
IV. Further Analysis
The above sketch of the application of algebra to the data of inter-human relations is
only a foundation. Algebraic superstructures may be built upon it by various methods. Three
such methods of analysis will be sketched in another paper. One method involves analysis of
the surface, or contours, of the matrix and clarifies the sociological concepts of "isolation,"
"contact," "interaction," "leaders," "stars" and "heroes"; "in-group" and "out-group," etc.
This method involves simple coordinate geometry. A second method involves analysis
of the structure of the group into all possible subgroups under specified conditions. This
method applies the algebraic theorems of combinations and permutations. A third method
involves analysis of the variances of arrays in relation to the whole matrix, to allocate
proportions of the interrelations determined by each of the parties in general. This method
involves the statistical theory of variance and correlation.
Notes
1. The research reported in this paper is more fully presented in a volume entitled Dimensions
of Society which, along with a companion volume on postulates entitled Foundations of
Sociology, by George A. Lundberg is in press (Macmillan).
2. An excellent exposition of matrix algebra for those unversed in mathematics beyond high
school algebra is given by L. L. Thurstone in the first chapter of his Vectors of Mind
(University of Chicago Press, 1936, p. 266).
3. The single colon denotes one-way dependence (of the I values on the P values) or
subclassification while the double colon denotes two-way or mutual dependence or
cross-classification. Both are our symbols, describing the subaggregating, or listing, of entities
within a total aggregation.
4. Thurstone, L. L. and Chave, E. J., The Measurement of Attitude, Univ. of Chicago Press,
1929, p. 97.
The ambiguity of each of the five statements selected was measured by the
dispersion of the judges' placement of that statement. The reliability correlation of the test
on repetition was found to be r = .91. The results from both trials are averaged in the table
here for greater stability. The findings in this sample of 261 freshmen were corroborated in
another sample of seniors and a third sample of Beirut townspeople.
5. Prescripts such as IT can be used to specify the all-or-none, ordinal or cardinal degree of
precision of the indicator, if this is desired.
6. In our view, statistics and mathematical variables generally are to a large extent a
development of non-Aristotelian logic. Thus Aristotle's dichotomous laws that "A is A" and
"A is not B" are violated by functions such as A = B where A approaches B as a limit and
any boundary point is purely arbitrary. Every percentage, every correlation coefficient (such
as $r_{AB}$) and many other indices seem to us to measure the degree to which A is B, i.e., the
degree of overlap or partial identity in A and B.
7. Dodd, S. C., "A Theory for the Measurement of Some Social Forces," Scientific Monthly,
July 1936.
8. Dodd, S. C., "An Operationally Defined System of Concepts for Sociology," American
Sociological Review, Vol. IV, No. 5, Oct. 1939. Also see footnote 1.
#24. Comparison of Scales for Degrees of Opinion
This department is devoted to shorter articles and notes on research in the
communications field, either completed or in progress. Readers are invited to submit reports
on investigative studies which might prove useful to other students because of content,
method, or implications for further research.
Copyright 1960, Journalism Quarterly, University of Minnesota, Minneapolis 14, MN
Researchers in public opinion and communication are often baffled by their failure to
predict more exactly the behavior of a population from a set of poll responses. One of the
methodological factors which may affect the measurements and predictions is the number of
degrees in which an opinion is observed. The present paper reports research which
compared several well-known rating scales for degrees of an opinion with respect to a set of
criteria to be explained below.
Pollers in general assume that the higher the degree of sureness, feeling, or
frequency with which an opinion is expressed, the more its expresser will tend
(a) to resist changing it and
(b) to act on it, including expressing it to others, and so strengthening the opinion in a
potentially like-minded group of people.
The earnest attempt to tap the degree of intensity of an opinion for better prediction may be
glimpsed in the following statement:
One thing we asked the people was why they were going to vote for Roosevelt or
Dewey, and we tried to get the intensity of their response, that is, how sure they
were that they were going to vote for a given candidate. We asked them whether
they had made up their minds, and tried to set up a study on that, and we also
asked a series of attitude questions. We even went into the negative and said, "It
won't matter much to you if the other candidate is elected, will it? You aren't
going to vote, are you?"1
A series of questions as suggested above may give us some indication of answering in
degrees but the uses of the answers to such questions are very limited because we do not
know their reliability, validity and other related characteristics. To help meet this problem, some
well-tested standardized sets of responses in degrees have to be provided. The number of
degrees and the forms of stating them vary considerably from a simple Yes-No type to an
elaborate graphic scale such as Cantril's thermometer scale. The present study dealt with
1) the Stapel 10-point scalometer,
2) the 7-digit scale,
3) the 5-digit scale, and
4) the 5-graded-words scale.
Descriptions of these four scales follow:
1. Stapel scalometer: The Stapel scalometer was developed by Jan Stapel, the director
of the Netherlands Institute of Public Opinion, and is widely used by the Gallup polls.
Structurally, the Stapel scalometer consists of ten squares vertically drawn: five
white squares in the upper half and five black in the lower half of the scale. Positive
numbers ranging from one to five were here assigned in the upper half on the right
side of the white squares in an ascending order from the mid-half of the scale,
whereas the negative numbers ranging from minus one to minus five were in a
descending order in the lower half. The white color, the high position, and the
positive numbers are designed for registering the degrees of favorable opinion; the
counterparts in the lower half for the degrees of unfavorable opinion. In order to
facilitate the use of the scale, anchoring words such as "very good," "very bad," etc.,
were added to the side of the scale. There is no middle square, since the forced
choice technique is used in order to reduce the number of undecided or indecisive
respondents.
2. The 7-digit scale: The 7-digit scale consists of a horizontal line with seven points
represented by positive integers ranging from one to seven: the lowest number,
together with some anchoring phrase, registering a respondent's least degree of an
opinion; the highest number, the highest degree of an opinion.
3. The 5-digit scale: The 5-digit scale is exactly like the 7-digit scale, except for the fact
that it has five steps instead of seven.
4. Word scale: A word scale consists of a series of words or phrases expressing
equal-appearing intervals of degrees of an opinion. Five "degree words" covering a
nine-point range from 1 to 9 were used here. The particular word scale that was
compared with the other scales just described consisted of the following phrases:
"very good" (scale value = 8), "good" (scale value = 7), "fair" (scale value = 5), "bad"
(scale value = 3), and "very bad" (scale value = 2). The word scale had been
calibrated by the senior author and his student Thomas Gerbrick.2
The comparison of the four rating scales required construction of some appropriate
"content questions," the answers to which could be expressed in degrees. The present
research was conducted during the 1956 Presidential election campaign. Therefore, the
following "content questions" relevant to the election were asked:
1. How do you feel about having Stevenson as our next President?
2. In general, how would you think the prosperity of our economy will be during the next
four years if we elect a Republican administration?
3. In general, how well does the Republican Party (or Democratic Party) represent
people like you?
Note that the answers to the above content questions have to be in some quantitative
form if they are to be meaningful in terms of the way in which the questions were asked. Thus,
each of the above three content questions was followed by each of the four rating scales
described above.
In addition to the three content questions, one validating and four "evaluative questions"
were asked. The validating question was designed to measure the correlation between a given
rating scale as a predictor variable and the assertion of one's expected total behavior as an
approximation to the criterion variable of actual voting behavior. Thus the question read:
4. If you were to vote in the national election, would you vote for Eisenhower or
Stevenson?
The evaluative questions, on the other hand, were intended to measure the respondents'
opinions about each of the four rating scales following their firsthand experience of using them
in answering the three content questions. The four evaluative questions consisted of the
following items:
5. In your opinion, which of the scales was easiest to use?
6. In your opinion, which of the scales measured your opinion most accurately?
7. Which of the scales do you like best?
8. If you were going to conduct a survey, which of the scales would you prefer to use?
In order to facilitate discrimination, the four rating scales were presented in pairs in all
possible combinations.
The respondents who participated in the present study were 400 evening class students
registered in social science courses in the summer school, University of Washington, in 1956.
The present study had as its purpose a comparison of the four methods in terms of
(a) reliability,
(b) validity,
(c) consistency, and
(d) respondents' evaluations.
Reliability refers to the degree to which a given measuring instrument agrees in results
from one measurement to another when the phenomena being measured remain constant. In
the present study, the self-correlation (of a test and its retest) after a time interval was
operationally defined as the index of reliability.
By consistency is meant the degree to which a given rating scale agrees with another
rating scale in results. Operationally, the average intercorrelations of each rating scale with the
three other rating scales constituted the index of consistency.
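The two operational definitions above reduce to ordinary correlation arithmetic. The sketch below assumes the product-moment (Pearson) correlation, which the paper does not name explicitly; the scale labels and the five ratings per scale are hypothetical, and a real analysis would use each group's full set of responses.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equally long lists of ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Reliability: the same scale given twice to the same (hypothetical) respondents.
test = [1, 2, 3, 4, 5]
retest = [1, 3, 2, 4, 5]
reliability = pearson(test, retest)

# Consistency: average correlation of one scale with the three other scales.
scales = {
    "stapel": [1, 2, 3, 4, 5],
    "seven_digit": [1, 3, 2, 4, 5],
    "five_digit": [2, 1, 3, 4, 5],
    "word": [1, 2, 3, 5, 4],
}

def consistency(name):
    others = [s for s in scales if s != name]
    return sum(pearson(scales[name], scales[s]) for s in others) / len(others)

print(round(reliability, 2))            # 0.9
print(round(consistency("stapel"), 2))  # 0.9
```

Because the correlation is unaffected by a change of unit or origin, scales with different numbers of steps can be compared without first being rescaled.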
The 400 respondents were subdivided into eight comparable groups of 50 persons
each, so that the four rating scales could be compared with respect to the four criteria just
described — reliability, validity, consistency, and the respondents' evaluation. For the reliability
study, some groups had to be exposed three times to the same content questions followed by
the same rating scales. The correlation between the responses in the first and the second
exposures gave the reliability index when there was a time interval of 25 minutes between test
and retest. The correlation between the second and third exposures gave the reliability index
when the time interval was one week.
For the validity study, the responses given by each of the four rating scales were
correlated with the responses to the question, "If you were to vote in the national election,
would you vote for Eisenhower or Stevenson?" Every respondent was given this validity question, since each had the opportunity to use at least one of the four rating scales and a validity
index was needed for each rating scale.
For the consistency study, some groups had to be given the three content questions
four times, once with each of the four rating scales. The intercorrelations among the four sets
of responses to the rating scales constituted the basic data from which the consistency index
was derived. In presenting the four rating scales, the effect of difference in sequence of
presentation was minimized by alternating the sequences.
For the analysis of the respondents' evaluations of the four rating scales, the proportion
of endorsements for each rating scale was computed first. Then the obtained proportions were
ranked in terms of the size as a measure of superiority with respect to
(1) ease of use,
(2) accuracy in measuring an opinion,
(3) how well liked the scales were, and
(4) extent to which respondents recommended them for future use in polls.
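The evaluation step, proportions of endorsements followed by a ranking, can be sketched for a single evaluative question. The ten answers below are hypothetical and stand in for one criterion only (for example, ease of use); the scale labels are shorthand introduced for the example.

```python
from collections import Counter

# Hypothetical answers to one evaluative question, one answer per respondent.
answers = ["word", "word", "stapel", "seven_digit", "word",
           "seven_digit", "five_digit", "stapel", "word", "seven_digit"]

counts = Counter(answers)
proportions = {scale: counts[scale] / len(answers) for scale in counts}

# Rank the scales from most to least endorsed on this criterion.
ranking = sorted(proportions, key=proportions.get, reverse=True)

print(proportions["word"])  # 0.4
print(ranking)              # ['word', 'seven_digit', 'stapel', 'five_digit']
```

Repeating the same tally for each of the four evaluative questions gives one ranking per criterion.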
The analysis of the data showed that:
1. The Stapel scalometer tended to be slightly more reliable than the other scales.
2. The validity tests seemed incoherent — due, perhaps, to an inadequate external
criterion of validity.
3. All four scales were about equally consistent with one another.
4. Students judged
(a) the Stapel scalometer as most accurate;
(b) the word scale as the easiest to use;
(c) the seven-digit scale as most liked; and
(d) the seven-digit scale as the one most preferred for use in future polls.
The five-digit scale tended to score lower than the other three rating scales.
Our net judgment, based largely on these findings but also on much further experience,
is that in this homogeneous range of U.S. college students none of the four forms of rating
scales was clearly superior to all the others, although the five-digit scale was generally
somewhat inferior. Pending fuller testing in wider ranges and other cultures, we recommend for
most purposes and cultures the Stapel scalometer, if its stimulation to respond in 10 degrees
by using black and white colors and vertical arrangement of spatially separated boxes is
reinforced by the digits +5 to -5 and by words anchoring the extremes appropriately to the
question asked.
The seven-digit scale and the word scale are limited to positive opinions (unless
awkwardly repeated for the negative extension of an opinion). The word scales are further
limited at present to polls in English where the standardizing of the phrases of the 13
"scalettes" developed by Dodd and Gerbric is appropriate.3
Seven seems to be a more advantageous number of degrees than five. This represents
our compromise between the finding that too many degrees were unused or were confusing to
respondents and the researcher's desire for many degrees (if of constant reliability) in order to
increase precision of measurement.
Stuart C Dodd
Sumo Chick Bong
Washington Public Opinion Laboratory University of Washington
Notes
1. C. West Churchman et al., Measurement of Consumer Interest, Philadelphia: University of
Pennsylvania Press, 1947, p. 12.
2. Dodd, S C and Gerbric, Thomas, "Word Scales for Degrees of Opinion — Use in Polls
Measuring Intensity, etc.," Washington Public Opinion Document No. U:56-91, 18 pp.
3. Loc. cit. (For a detailed report of the present study, see the junior author’s M.A. thesis,
University of Washington, Seattle, 1956, "Comparison of Four Scales for Degrees of an
Opinion. ")
#25. Simple Test for Predicting Opinions from Their Subclasses
The author is Director of the Washington Public Opinion Laboratory in Seattle, and member of the
Department of Sociology, University of Washington.
Reprinted from the International Journal of Opinion and Attitude Research, Spring, 1948,
Donato Guerra 1, desp. 207, Mexico, D. F.
I. What the Test Is
The subclass test discussed in this paper deals with any two dichotomies such as any
question in a poll that is answered "yes" or "no," "I approve of —," or "I disapprove of —." The
usual "no opinion" means that the population is dichotomized into those having and not having
an opinion and then those having opinions are further dichotomized into those having positive
or negative opinions.1 Similarly, any question with many answers, whether degrees of one
answer or a set of qualitatively different answers, can be treated as a set in which each answer is a
dichotomy in being either checked or not checked. The test applies to any pair of frequency
variables which are expressible in two class-intervals, but for the readers of this journal its use
in all-or-none opinions will be illustrated chiefly. It also is extendible to variables with more than
two class intervals and to sets of more than two variables. This is the "scaling of attributes"
developed by Louis Guttman and others in his "Cornell Technique."2 Except for a few
indications of these extensions in this paper, we shall discuss the test only in the simple
all-or-none case in a pair of opinions as this has wide but little appreciated usefulness in polling and
sample surveying.
To illustrate in familiar examples the principles to be discussed below in various abstract
or generalized symbolic systems, consider sets of opinions or other data from a poll such
as the following:
Set  Kind       Question                                             No           Yes
                                                                     (X- or Y-)   (X+ or Y+)
A    Opinion    X = "Do you expect a third World War?"               ( )          ( )
     Opinion    Y = "Do you favor conscription now?"                 ( )          ( )
B    Behavior   X = "Do you smoke?"                                  ( )          ( )
     Behavior   Y = "Do you smoke brand Q?"                          ( )          ( )
C    Condition  X = "Do you live in France?"                         ( )          ( )
     Condition  Y = "Do you live in Paris?"                          ( )          ( )
D    Opinion    X = "Do you favor international participation
                     more than national isolation?"                  ( )          ( )
     Opinion    Y = "Do you favor the Marshall Plan?"                ( )          ( )
     Opinion    Z = "Do you favor the loan to Italy?"                ( )          ( )
E    Condition  X = "Are you less than 40 years old?"                ( )          ( )
     Condition  Y = "Are you less than 30 years old?"                ( )          ( )
F    Opinion    X = "Do you side with the Dutch?"                    ( )          ( )
                X1 = "Or the Indonesians?"                           ( )          ( )
     Opinion    Y = "Do you favor Jim Crow Laws?"                    ( )          ( )
Suppose that each set here (even with its over-simplified wording) passes the subclass
test. What would "passing" this test mean? In answer, consider first what the test is, then what
the test will do, and finally what the tester did in using it.
The test for a class being a subclass of another is best stated in statistical language as
a vanishing frequency. It applies to a fourfold frequency table, or scattergram, (Fig. 1) between
two attributes or all-or-none variables, X and Y. In the population (P) that is the sample polled,
let the frequency of persons holding both opinion X and Y be c, the frequency holding neither
opinion be d, the frequency holding Y but not X be a, and the frequency holding X but not Y be
b as in Figure 1. (See appendix)
Under these conditions, the subclass test is simply that any one of the four frequencies
must vanish. If any frequency is zero, the two opinions form a class and its subclass for that
population. If any of the four quadrants has no one in it then the two opinions (or attributes
more generally) "will scale," that is, they can be ranked as more and less of one opinion. Under
these conditions they are unidimensional or collinear as they are points on one line. This null
quadrant is the necessary and sufficient condition proving that in the population polled, the two
opinions are qualitatively of one kind and differ only quantitatively in degree. Thus:
a = 0        (the subclass test)        Equation (1)
This subclass test is the simple condition for predicting any opinion from its sub-opinion. It holds
for any quadrant, for if b vanishes it means that X is the subclass instead of Y; if c vanishes it
means that the X scale is reversed or negatively stated; and if d vanishes the Y scale has similarly
had its "yes" and "no" meanings reversed. The absence of any "a" persons means no one
affirms opinion Y while denying opinion X. Thus no one asserts living in Paris but not in France,
or being under thirty years old but not under forty (see sets C and E above), or in general of
being a member of a subclass but not a member of its more inclusive class. Alternatively stated,
everyone in class Y is in class X. The c frequency is part of the c + b frequency. The probability
of holding opinion X is 100% among the holders of opinion Y.
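
The counting behind Equation (1) can be sketched in Python. This is only a minimal illustration under stated assumptions, not the author's own procedure: the function names `fourfold` and `empty_quadrants` and the sample answers are hypothetical, and the quadrant letters follow the definitions given above for Figure 1 (a = Y without X, b = X without Y, c = both, d = neither).

```python
from collections import Counter

def fourfold(answers):
    """Count the four quadrants from (x, y) pairs of yes/no answers.

    Letters follow the text: a = Y but not X, b = X but not Y,
    c = both X and Y, d = neither.
    """
    counts = Counter()
    for x, y in answers:
        if x and y:
            counts["c"] += 1
        elif x:
            counts["b"] += 1
        elif y:
            counts["a"] += 1
        else:
            counts["d"] += 1
    return {q: counts[q] for q in "abcd"}

def empty_quadrants(table):
    """Return the quadrants with zero frequency (the strict subclass test)."""
    return [q for q in "abcd" if table[q] == 0]

# Hypothetical answers for set C: (lives in France, lives in Paris).
sample = [(True, True)] * 3 + [(True, False)] * 5 + [(False, False)] * 12
table = fourfold(sample)
print(table)                   # {'a': 0, 'b': 5, 'c': 3, 'd': 12}
print(empty_quadrants(table))  # ['a']
```

An empty list from `empty_quadrants` would mean that the pair does not pass the strict test in that sample.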
This simple condition for establishing a subclassification may not be perfectly met in
practice. The subclass test may be only nearly passed by having a near-zero frequency. The
practical question then becomes to measure the approach to passing, the degree of
approximation to having one opinion a sub-opinion of another opinion. The simplest index3 to
measure this is the probability that one class is wholly a subclass of the other. This probability
is simply the proportion that the members of the joint X and Y class are of the Y class, namely:
Pxy/y = c/(c + a)        (the subclass probability in Y)        Equation (2)
This is the probability that all members of Y are also members of X. When a vanishes this
probability is unity, expressing certainty, for then all members of Y are members of X and Y is
wholly a subclass of X. When Pxy/y (read as "p sub xy over y") is zero, Y is not at all a
subclass of X. The size of this "subclass probability" (as it may be conveniently named) then
measures how well the subclass test is met, i. e., how fully one opinion subsumes another.
There are two subclass probabilities. The second one is the proportion of common
members in class X (Pxy/x). If the subclass test is passed so that all Y members are members
of X also, then the subclass probability in X is simply the ratio of Y to X, i. e., the Y
membership divided by the X membership.
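
As a hedged sketch of the two subclass probabilities, the ratios can be computed directly from the fourfold frequencies. The function name `subclass_probabilities` and the counts are hypothetical; Pxy/y follows Equation (2), and Pxy/x is taken here to be c/(c + b), the proportion of common members in class X described in the text.

```python
from fractions import Fraction

def subclass_probabilities(a, b, c):
    """Return (Pxy/y, Pxy/x) for a fourfold table.

    Pxy/y = c / (c + a): share of Y holders who also hold X.
    Pxy/x = c / (c + b): share of X holders who also hold Y.
    """
    if c + a == 0 or c + b == 0:
        raise ValueError("class X or class Y has no members")
    return Fraction(c, c + a), Fraction(c, c + b)

# Hypothetical counts: a = 2 stray cases keep the test from being fully passed.
p_in_y, p_in_x = subclass_probabilities(a=2, b=50, c=48)
print(p_in_y)  # 24/25
print(p_in_x)  # 24/49
```

When a is zero the first value is exactly 1 and the second reduces to the Y membership divided by the X membership, as stated above.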
This subclass probability needs a further measure of its sampling error whenever the
tester wants to infer from his observed sample to another population. When he generalizes his
findings about two opinions in one poll to a larger universe of people he needs to know the
attendant error in this statistical estimation. One measure for this is chi square. In order to rule
out the hypothesis that the near-null frequency is due to chance, chi square must be equal to or
greater than 3.841 at the 5% level of significance (in the 2 × 2 table, as here). As long as
chi square is less than 3.841 the subclass test is passed.4
(χ² < 3.841) = passing the subclass test (at the 5% level of significance)        Equation (3)
II. What the Test Will Do
The "subclass test" for predicting, which is discussed in this paper, will do a number of
things which need several languages to describe. One is our common sense or folk language;
another is the more exact qualitative language of symbolic logic, which is quantified by
statistical language and visualized in geometric figures; and all of these can be further
interpreted as special cases of a dimensional language. Thus folk terms talk about "more and
less of one opinion" and "predicting a person's opinions"; logical terms talk about "one class
being included in another" or "one opinion5 implying another"; statistical terms talk about a
"zero frequency in a fourfold scattergram," "scaling two opinions," and "a correlation of unity
between opinions"; geometric terms talk about "longer and shorter lines measuring two
opinions" and "angles between them"; metaphysical terms talk about "qualitative vs. quantitative
differences" in opinions; and dimensional terms talk about all of these in an orderly system.
These languages (or scientific dialects) are partly just differing symbols for the subclass test
and partly they are different interpretations of it. All together they can help the reader see more
fully than from one language alone what the subclass test will do. A major intention in this
paper is to familiarize polling practitioners a bit with these diverse languages for dealing more
exactly with opinions.6
A. In Qualitative Terms—"classes" in symbolic logic and "attributes" in statistics.
In all-or-none terms, passing the subclass test means "all Y's are X's and none of the
Y's are non-X's." Thus in set C all in Paris are obviously in France and no one in Paris is not-in-France. But less obviously, in set F, it also means that all persons favoring Jim Crow laws
side with the Dutch in Indonesia. Again in set A, passing the test proves, for the population
polled, that everyone favoring conscription expects a third world war. The test leaves the
converse statement undecided for one cannot assert from it, that "all X's are Y's" but only that
"some X's are Y's," i. e., that some Frenchmen are Parisians, some persons under 40 are
under 30 years old, etc.
Whenever two opinions pass the test, then in the language of symbolic logic the class of
persons having opinion X is said to include the class of persons having opinion Y, or, more
briefly, class X includes class Y. In the notation of the calculus of classes in symbolic logic
"X ⊃ Y" says that "X includes Y" or that "Y is included in X." In a diagram where a circle
represents a class and dots represent its members, one circle, Y, lies wholly within the other circle,
X.
This again can be read as "all holders of opinion Y are holders of opinion X, but not
vice versa," since not all members of class X are members of class Y.
In the calculus of sentences (which in symbolic logic are compounds of classes and
their relations of specified sorts) this inclusion notion becomes "implication". A logician would
say "Opinion Y implies opinion X" 7, (where an opinion is a sentence and not a simple class of
members). Thus he asserts "favoring the Italian loan implies favoring the Marshall plan" in set
D above. And with this calculus of sentences and the calculus of classes and with the third
calculus of relations, logicians can apply their wealth of theorems and proofs to opinions,
making deductions and predicting new and unsuspected relations to then check in
observations, etc.8 Thus a theory of opinion, a systematic body of propositions about human
opinions, can be built up that is rigorously consistent in logic and also corresponds to the
observed facts of people's responses to interviewers. In the future, pollers will be as
amateurish without training in symbolic logic and semantics as engineers without training in
mathematics.
In class calculus, there are exactly four possible relations between any two classes.
This makes a series of four degrees of pairing in opinion polls. These are the different degrees
of overlap of two classes, or proportion of members in common, from "none" through "some" to
"all." These four cases (diagrammed in Figure 2 and again in Figure 5) have either "no
overlap," "two-way overlap," "one-way overlap," or "full overlap." Thus the simplest relation of
complete difference is called the disjoint case as it has no overlap, or no common members, in
class X and class Y. The next case, in Figure 2 (See appendix), is that of overlap proper or
two-way overlap, where each class shares some, but not all, of its members with the other.
This shared subclass is the logical product of classes X and Y denoting members who have
both the characteristics of X and the characteristics of Y. Thus a "wife" is the logical product of
"married" and "woman." All adjectives modifying nouns, all adverbs modifying verbs, all
phrases or clauses modifying whatever they modify, or any qualifiers and what they qualify in
any language, can be interpreted as logical products and their meaning diagrammed by
overlapping circles.
Next, the third case is the relation of inclusion of one class in another or one-way overlap.
This is the case which the subclass test identifies in any poll. The subclass test proves that no
matter how different the polled questions may seem to be from their wording and from the
reader's interpretation of the words, yet to the population polled they function semantically as
one whole question (including a part of itself). Passing the test means that the two opinions are
related as whole and part, as class and subclass. Thus in set F, "Jim Crow favorers" are a
subclass of "Dutch sympathizers." This is the observed fact, in any population polled,
whenever the subclass test is passed, regardless of how improbable it may seem to the reader
who is using his own semantic interpretations of those phrases. In other words, the subclass
test diagnoses differences in meaning of one phrase or opinion to different people. It is a tool
for dealing more exactly with meanings. It is an operational test for getting behind the apparent
constancy of symbols and getting at their varying designata to different people.
The subclass test can be used to test variant phrasing of questions in polls or
translations of a poll into another language. Thus if two versions of a question are given to one
population (such as bilingual people in the case of a translation), the correlation between
versions measures their degree of qualitative similarity from none (r = 0) to complete (r = 1.0).
But this correlation can be less than unity and yet the opinions be qualitatively the same and
differ only in degree. For, if they pass the subclass test, they are thereby proved to be simply
two degrees of one opinion — a more inclusive and a less inclusive opinion — a class of
opinion and a sub-class — more and less of one kind of opinion.
The fourth case of possible relations between two classes is that of complete similarity,
or identity of the classes. Here all members of each are members common to both. The circles
diagramming the two classes coincide in this case. (See Figure 2, appendix)
Now all four cases are logical products. They range from the minimal logical product of
zero (XY = 0) in the disjoint case up to the maximal logical product (XY = X = Y) in the identity
case. In a later section, (Figure 5), it will be shown that the logical product is a primitive or
qualitative correlation so that the series above can also represent correlation coefficients
ranging from zero to unity.
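
The four relations can be told apart mechanically once each class is held as a set of respondents. The following Python sketch is illustrative only; the function name `class_relation` and the respondent numbers are hypothetical.

```python
def class_relation(x_members, y_members):
    """Name which of the four relations holds between two non-empty classes."""
    x, y = set(x_members), set(y_members)
    if not x or not y:
        raise ValueError("both classes need at least one member")
    if not (x & y):               # logical product XY is empty
        return "no overlap (disjoint)"
    if x == y:                    # XY = X = Y
        return "full overlap (identity)"
    if x < y or y < x:            # one class is a proper subclass of the other
        return "one-way overlap (inclusion)"
    return "two-way overlap (overlap proper)"

# Hypothetical respondent numbers.
married = {1, 2, 3, 4}
women = {3, 4, 5, 6}
france = {1, 2, 3, 4, 5}
paris = {1, 2}

print(class_relation(married, women))  # two-way overlap (overlap proper)
print(class_relation(france, paris))   # one-way overlap (inclusion)
print(class_relation(paris, {7, 8}))   # no overlap (disjoint)
print(class_relation(paris, {2, 1}))   # full overlap (identity)
```

The inclusion branch is the case that the subclass test identifies; the intersection `x & y` plays the part of the logical product.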
B. In Quantitative Terms—scaling and probability
What the subclass test of opinions will do for the poller can be analyzed beyond the
qualitative level of symbolic logic with its calculus of classes and of sentences to the still more
exact quantitative level. All-or-none classes and subclasses can be quantified in two steps
yielding either an ordinal scale or a cardinal scale.
For an ordinal scale, the subclass test arranges a set of more than two opinions into a
series of inclusions. By successive applications to the opinions in pairs (or by Guttman's more
powerful techniques for sets) they can be put into an order from most to least inclusive. This is
simply diagrammed as the concentric circles in Figure 3B (appendix). It means that the test
can discover the opinions that pass it out of any set of opinions that may be polled. The set
passing the test is thus "ordered," i. e., put into a series according to some order which may be
called an ordinal scale. Usually, a set of polled opinions, say n in number, may have any tangle
of relations as suggested in the overlaps of the n circles in Figure 3A. (Appendix) The test
spots concentric sets of circles such as in Figure 3B which may exist unsuspected in the data
represented by Figure 3A. The opinions represented by the concentric circles are then said "to
scale" together, i.e., to form an ordinal scale.
This ordinal scale of opinion has a number of uses. One use is to predict the opinions of
a person when only one of his opinions is known. For if that opinion is exactly located at a
point on a scale, it can be asserted that he will hold all opinions on one side of that opinion on
the scale, for these include the opinion he has endorsed. Furthermore, depending on whether
the opinion was worded in some form implying the logician's phrase "at least—", or "exactly—"
or "at most—", further assertions can be made about his holding none of the opinions on the
other side of the scale. This enables the poller to predict all the opinions in a scale by getting
an answer to only one question — a case of the parsimony principle in science and lowest cost
principle in business and least effort principle in behavior generally.
Another use is that the set of scaled opinions being a linear variable can have its
dispersion, its correlation with other opinions or circumstances, and other calculations, made
with greater accuracy than for an all-or-none variable or attribute.
A third use is that the multiple correlation of a variable with any sub-set of a scaled set
of opinions is constant whatever the sub-set used. For convenience, one sub-set can be polled
and correlated to the outside variable with assurance that the correlation would be the same
for any other sub-set.
A fourth use is to reduce the field of those n opinions to a more orderly and simpler
structure. N primitive all-or-none dimensions are converted into a single more discriminating
ordinal dimension.
For a still more exact cardinal scale, one must define cardinal units as equal
interchangeable units, multiples of which measure any cardinal amount. (The ordinal scale
merely stated an order, such as runners in winning a race, which may have very unequal
intervals or units.) To get a cardinal scale for attributes, one way is to use each person polled
as the unit. By counting these, one gets a frequency in the classes having and not having an
opinion. The ratio of a frequency in one class to the whole population (P) is the proportion of
persons in that class and defines the probability of that class (e.g., c/P in Figure 1). Thus by
simply counting the members in a class and dividing this number by another, one gets a
probability index (p) which predicts the proportion in that class which is most likely to recur if
many such samples of people were observed. Similarly, the probability of the members of any
subclass recurring within a specified class is simply the ratio of the memberships. Thus the
subclass test predicts or states the four probabilities of holding both opinions X and Y of these
two classes, or neither, or either one without the other. When two opinions pass the test a
subclass predicts its class with 100% probability, for all subclass members are members of its
including class.
Thus, obviously, the probability that the subclass "smokers of brand Q" will be "smokers"
is 100%. But the subclass test also proves similar perfect probability in more obscure cases of
sub-opinions such as that the probability of being an Indonesian sympathizer is 100% among
those opposing Jim Crow laws. (In set F above)9
A cardinal scale can be formed by combining the ordinal scale with these probabilities
or ratios of frequencies. The classes are laid off in order of inclusion along a line from one
origin point, each as long as its probability, as this is also the arithmetic mean. (An arithmetic
mean of an all-or-none variable, or attribute, is a proportion — they are but two names for the
same quantity.) Then, as in Figure 4, the included class, i. e., opinion Y, is the shortest line
from the left (c), the including class X is the longer line (c+b), and the third segment is the residual
class whose members are neither in X nor in Y. From Fig. 1, b is seen to be the frequency in
the non-vanishing unlike-signed quadrant. All this simply undergirds the practice in plotting
percents of "yes" and "no" opinions by a theory of measurements which explains and justifies
each generalized step in quantifying qualitative classes.
From simple probabilities, compounds are built up giving many of the distribution curves
found in polls. Thus, for one compound, by combining the Law of Alternative Probability (which
quantifies the logician's logical sum by counting the class members) and the Law of Joint
Probability (which quantifies the logical product) the binomial distribution is built up. This
states the number of combinations of n opinions taken p at a time when p varies from 0 to n.
As n gets large it becomes the normal probability distribution — another useful tool for
predicting opinions whenever they are normally distributed.
Another compound probability given by the subclass test directly answers the question,
"What is the probability for a class low in the scale given the probabilities above it?" Thus in
Figure 3B what is the probability of W's in class X given the probability of W's in class Z, the
probability of Z's in class Y, and the probability of Y's in class X? The desired joint probability
is, by the Law of Joint Probability, the product of the independent probabilities 10 (which is also
the quantified logical product of the four factors X, Y, Z, and W). Working out these probability
ratios of frequencies shows that this probability of the smallest class within the largest one is
their simple probability or ratio of their two memberships. Probabilities within a series of scaled
classes are thus transitive and can yield other probabilities up or down the scale.
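
The transitivity can be checked with exact fractions. The memberships below are hypothetical, as is the helper name `chained_probability`; the sketch only assumes the four classes are nested as in Figure 3B.

```python
from fractions import Fraction

# Hypothetical memberships for the concentric classes of Figure 3B,
# listed from most inclusive (X) to least inclusive (W).
sizes = {"X": 80, "Y": 60, "Z": 30, "W": 12}

def chained_probability(sizes, order):
    """Multiply the probability of each class within the next larger one."""
    prob = Fraction(1)
    for outer, inner in zip(order, order[1:]):
        prob *= Fraction(sizes[inner], sizes[outer])
    return prob

p = chained_probability(sizes, ["X", "Y", "Z", "W"])
print(p)                                # 3/20
print(Fraction(sizes["W"], sizes["X"])) # 3/20
```

The intermediate memberships cancel in the product (60/80 × 30/60 × 12/30), leaving the ratio of the smallest class to the largest.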
C. In Relative Terms — Correlation
The meaning of the subclass test in terms of correlation formulas will suggest further
interpretations for polls. One form of correlation formula is the common elements formula. Here
the correlation coefficient is the geometric mean of the proportions that the common members
are of each of the two classes that are correlated. (See correlation formula in appendix)
Each proportion is the probability of the common members either in class X or in class
Y. Their product would be the joint probability of common members occurring amongst
members of either class X or class Y if the two probabilities were independent. But as their
numerators are in common they are not independent — they are correlated probabilities to the
extent of those common elements. This product of the two proportions is the correlation
coefficient squared and is called the coefficient of determination. 11 Thus, the squared
correlation of two classes is their joint "correlated probability." This is also the logical product of
the two classes expressed quantitatively (through counting their members in finite samples of
some observed population). The square root of the probability, Pxy/x, of the common members
in class X may be called "the sub-correlation of Y in X" (rxc), and the square root of the
probability, Pxy/y, of the common members in class Y may be called "the sub-correlation of X
in Y" (rcy). The full correlation of X and Y is then the geometric mean of the two probabilities and
is also the product of the two sub-correlations.12
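
A short numerical sketch of these quantities is given below. It assumes the common-elements formula has the form described in the text (the appendix formula itself is not reproduced here), so that r equals c divided by the square root of (c + a)(c + b). The function name and the counts are hypothetical.

```python
import math

def common_elements_correlation(a, b, c):
    """Return (r_xc, r_cy, r) from fourfold frequencies.

    r_xc = sqrt(Pxy/x), the sub-correlation of Y in X.
    r_cy = sqrt(Pxy/y), the sub-correlation of X in Y.
    r    = r_xc * r_cy, the geometric mean of the two proportions.
    """
    p_in_x = c / (c + b)
    p_in_y = c / (c + a)
    r_xc = math.sqrt(p_in_x)
    r_cy = math.sqrt(p_in_y)
    return r_xc, r_cy, r_xc * r_cy

# Hypothetical inclusion case: a = 0, so Y lies wholly within X.
r_xc, r_cy, r = common_elements_correlation(a=0, b=27, c=9)
print(r_xc, r_cy, r)  # 0.5 1.0 0.5
```

Here the full correlation is only 0.5, yet one sub-correlation is unity, which is the pattern that marks inclusion rather than overlap proper.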
Now the usefulness of the sub-correlations is that they distinguish the case of inclusion
from overlap proper in Figure 5, whereas correlation does not distinguish them. To see this
clearly, consider Figure 2, interpreted now in Figure 5 in terms of correlation coefficients. (See
appendix)
Here note that in the inclusion case one sub-correlation is unity; the probability of the
subclass being wholly within its class is unity or certainty; the prediction of Y from c from this
one-way regression equation is perfect. In fact, this probability of unity is one way of stating
that for two given opinions, the subclass test described above has been passed.
Now the degree of similarity of two opinions is measured ordinarily by their correlation
coefficient. A correlation of unity means the two opinions are identical, and a correlation of zero
means the opinions are wholly different. But intermediate degrees of the correlation coefficient
do not tell whether the apparent difference in the two opinions is a qualitative or only a
quantitative difference, a difference of kind or of degree only. If the two opinions pass the
subclass test (as shown by one sub-correlation being unity), then their difference is one of
degree only and the amount of that difference is specified by the relative frequency in the non-vanishing unlike-signed quadrant of Figure 1.
One use of sub-correlation is in analyzing causation. If a cause is defined as any
antecedent correlate of its effect under specified duplicative conditions then the correlation
measures the amount of the causal relation. But it measures the causal relation both ways; it is
the geometric mean of the relation of the cause to the effect and the relation of the effect to the
cause. But only the relation of cause to effect is wanted in isolation. This isolated relation of
cause to effect might be measurable by the sub-correlation of the cause in the effect. Still
better than the sub-correlation is its square, i.e., the sub-probability index, Pxy/x. This states
the probability that the common class is the cause (Y) of the effect (X) or more exactly, it states
the proportion of occurrences of the effect that are preceded by that cause. If the causal class
of events passes the subclass test then it becomes identical with the common class and the
ratio of its membership to the membership of the effect class of events is the probability of a
causal relation. Thus, if out of every 100 voters for the Marshall plan, 70 had previously said
they favored the loan to Italy, then it may be asserted that favoring the Italian loan was 70%
the cause of, or causally linked with, voting for the Marshall plan. For the effect had been
preceded 70% of the time by opinions favoring the Italian loan. The reliability of this 70% if
reobserved on further samples of 100 persons each is what the significance test then
measures (see Equation 3).13
D. In Geometric Terms—points, lines, planes
The foregoing qualitative classes, quantitative scales, and correlations of them can all
be interpreted geometrically, making spatial or visual models of opinion polls. This uses an n-dimensional space where n is the number of all-or-none opinions (or attributes, as the
statistician calls them). In this n-space a positive opinion, i.e., having X, is a point somewhere
and, when considered jointly with the zero point, forms a dimension, since any two points
determine a line through them. The distance of the positive opinion point from zero along this
dimension is measurable as the proportion who hold that opinion out of the population of P
persons. Going on along this dimension, the complementary proportion, i.e., all who do not
hold that opinion, fixes the third point which is the total population considered as unity, as in
Figure 4.
Now any other point, representing another opinion (Q), may or may not lie on this line
that is fixed by opinion X. Passing the subclass test tells that it does lie on this line and so is
"co-dimensional with X." But, in general, the point Q may be anywhere off the line X and the
correlation between opinions X and Q tells how near or far off Q is. For this, the Q line (or
dimension or vector as it is variously called) is fixed by the point Q and the same zero point as
for X, so the Q and X lines, intersecting at zero, form an angle. This angle is measurable by its
cosine or ratio of base to hypotenuse in a right angled triangle. This cosine in statistical
language is operationally defined by calculating the correlation of opinion X and opinion Q in
some population. The correlation coefficient and the cosine are interchangeable.
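
One way to see the link between cosine and correlation is sketched below. It rests on an assumption not spelled out in the text: each opinion is written as a vector of 1's and 0's over the respondents (rather than as an axis in opinion space). Under that representation the cosine of the angle between two such vectors equals the common-elements coefficient c divided by the square root of (c + a)(c + b). The data and the name `cosine` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return dot / (norm_u * norm_v)

# Hypothetical answers of 8 respondents (1 = yes, 0 = no).
x = [1, 1, 1, 1, 0, 0, 0, 0]  # c + b = 4 hold X
q = [1, 0, 0, 0, 0, 0, 0, 0]  # c + a = 1 holds Q, with c = 1 and a = 0

print(cosine(x, q))  # 0.5
print(cosine(x, x))  # 1.0
print(cosine(q, [0, 1, 0, 0, 0, 0, 0, 0]))  # 0.0
```

Identical opinions give a cosine of 1, disjoint opinions give 0, and the included opinion above gives 1 divided by the square root of 4, matching the common-elements value for c = 1, a = 0, b = 3.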
In this way all the intercorrelations in a set of n opinions define all the angles and
determine a sheaf of lines or dimensions, all meeting at their common zero point and going off
in n directions in n-space. From such analyses in the language of coordinate geometry, vector
algebra, matrix algebra and other branches of mathematics, one derives the formulas for all
partial and multiple correlation and regression equations for predicting one opinion from
another; all factor theory analyzing opinions into fewer and more basic opinions; and much
more still undeveloped theory for dealing with opinions more exactly toward better prediction
and eventual control.
E. In Dimensional Terms
All the foregoing languages exploring what the subclass test will do are themselves
subclasses of a dimensional language.
For dimensions include, as three special cases:
a) All qualitative terms, from the exact "classes" of symbolic logic to the inexact folk words and
sentences of daily speech. Symbolized by zero exponents, a postsubscript (and subscripts): X^0
b) All quantitative terms, from ordinal through cardinal variables. Symbolized by non-zero
exponents, chiefly unit exponents (and further scripts): X^1
c) All relations, especially correlations of quantified qualities. Symbolized by exponents larger
than unity, chiefly 2 (and further scripts): X^2
Dimensional language specifies with algebraic precision in exponents of zero, one and two the
categories of quality, quantity, and relation. These are "categories of the understanding," or
classes of all that is knowable, in metaphysics from Aristotle's day down to Immanuel Kant. But
whereas the philosophers depend on their subjective reasoning to distinguish the "qualitative"
from "the quantitative," the subclass test provides an operational definition making the
distinction objective. Whatever one's personal views about calling two things qualitatively
different in kind, or alike in kind and differing only quantitatively in degree, the subclass test
measures these differences objectively. It gets behind the symbols and discriminates their
meanings to a given population; it proves how the symbols function semantically among a
given set of interpreters. Thus the correlation between two opinions (or anything else)
measures jointly how similar they are in kind and in degree. Then to the extent that the
subclass test is satisfied, the two opinions are operationally demonstrated to be one kind of
opinion differing only to the degree that their correlation is short of unity. It thus analyzes an
observed difference into its qualitative and quantitative components (See Equation 7B vs.
Equation 7C).
The logical, algebraic, geometric and folk languages all relate symbols to symbols in
dealing with a class and its subclass. Whether these symbols have a one-to-one
correspondence to phenomena is left to the interpreter's judgment. The semantic ideal of "one
symbol, one meaning" is often unrealized so that, while Y is called a subclass of X, the actual
Y's (the designata of Y) may not all be included in X. This semantic error is corrected in
statistical language by the subclass test, which relates symbols to their designata in counting
the frequencies of members included and not included in a class.
In summary, the three time-honored philosophical categories of quality, quantity, and
relation may be applied to our subclass test for opinions. In these terms, perhaps the best
summary statements14 (see note 14 in appendix) of what the subclass test will do are:
Qualitative: any opinion is predictable from its subclass.
Quantitative: any person's opinion is predictable from another opinion scaled with it, with
specifiable probability.
Correlative: the prediction of opinion X from opinion Y via their regression equation approaches
perfection in proportion to their part-whole correlation.
III. What the Tester Did
In applying this subclass test to two opinions, the tester has always done several things
that partly fix the results whether he does them consciously or not. Hence it is well to be
conscious of them so that he realizes just how he has fixed his data so as to help it to pass or
not to pass the test. What the tester necessarily did was:
A. He chose the dimensions to test, namely the X and the Y.
B. He chose which to call "X" and which to call "Y."
C. He chose in wording the question what to call the two directions of each dimension
— namely, which is "positive" and which "negative."
D. He chose in wording the question an origin or point for dichotomizing each opinion
— namely, the point that is the arithmetic mean.
E. He chose the units for measuring lengths along the dimension — namely, the
persons (or other units) of the population tested.
F. He chose the confidence limits or acceptable degree of probability that one opinion
is a sub-opinion.
G. He chose the number of such dimensions, namely 2 or more up to n.
How much freedom has the tester in this choosing? To answer this, note what would happen if
he chose differently. Let us consider each of the seven choices in turn.
A. In choosing the two opinions which he will test, he defines the problem. Any two
opinions whatever may be tested and each pair can define a separate problem. A
set of n opinions can be chosen as a larger problem.
B. If he calls the more inclusive opinion "X" and the less inclusive one "Y", as we have
in this paper, the empty quadrant will be the upper left one. Interchanging X and Y
interchanges the zero and positive frequencies in the two unlike signed quadrants. If
he exchanges labels, calling the includer "Y" and the included one "X", then the
lower right quadrant will vanish. The former usage is followed in this paper and is
recommended to standardize, as it conforms to the statistical convention in
scattergrams that abscissa scales start at the left and ordinate scales at the bottom.
Then positive correlation scattergrams run from lower left to upper right and negative
correlation runs from upper left to lower right. One can then always interpret
scattergrams as "right up" for positive correlation and "left down" for negative
correlation.
C. The tester can choose to reverse either dimension. By saying, for example, "Do you
disapprove" instead of "Do you approve," he changes "Yes" to "No" and "No" to
"Yes." This reverses the positive and negative segments of that dimension. This
merely shifts the empty quadrant. Thus if the X opinion has its dimension reversed in
direction, the nul frequency will shift from the upper left to the upper right quadrant.
The X-present and X-absent columns in the fourfold scattergram will be
interchanged. Thus nobody will assert "I do not live in France" and "I live in Paris."
Again if the Y scale is reversed, as in restating the question with opposite meaning,
the vanishing frequency shifts from the upper left to the lower left quadrant.
Thus any one of the four quadrants may be the empty one. Which quadrant is
nul depends merely on which opinion is called X and which alternative in each
opinion is called positive and which negative.
D. The tester can choose any origin or dichotomizing point and may, if he wishes and is
skillful enough, choose it so that the two opinions are more likely to scale. By
choosing the words of the question, he defines to the interviewee what is the
opinion, the X, whose presence or absence he asks the interviewee to show by
responding "Yes" or "No". Thus by substituting "Europe" for "France," the boundary
dichotomizing "all residents anywhere" into "residents in France" vs. "non-residents
in France" has obviously shifted. Again, by substituting "Do you approve of the
amount of the loan in the Marshall plan?" for "Do you approve of the whole Marshall
plan?" the boundary dichotomizing all those who have an opinion about the Marshall
plan is obviously shifted. This principle permits trial of various wordings toward
finding that dichotomizing point which, if a nul quadrant results, classifies
opinion-with-wording-A as included in the opinion-with-wording-B.
The actual boundary or dichotomizing point in opinion X is the arithmetic
mean of opinion X. It is also the proportion (or percentage if multiplied by 100) of
people who have that opinion. For every positive X is assigned the value "1" and
every negative reply is assigned the value "0," so that on adding all the values and
dividing by their number to get the mean, one also gets the proportion of people who
respond positively to that opinion. Thus in every dichotomous opinion, including
every all-or-none opinion, the two subclasses are always those persons who are
above the mean and those below the mean. The people who have the
opinion are all above the mean, and the people who do not have that opinion are
below the mean, for the mean will always lie between "1" and "0" (as long as there
are two subclasses of that opinion).
In the case of non-dichotomous opinions where there are three or more
degrees of response, the tester can choose to simplify the situation by dichotomizing
it. By choosing any one cutting point he can re-express any multi-valued variable as
a two-valued variable (with loss of precision from this lumping of class-intervals
together). This makes the test for scaling in its simple nul quadrant form applicable
to any pair of frequency variables. This may be useful for rough exploring of a
situation to get clues as to how more exact prediction could be made with more
precise techniques.
E. The tester chose the population that he polled and its units, whether persons,
households or otherwise. Choosing another population might give different results. If
two opinions do scale in one observed population, the inference that they scale in
the universe sampled or in another sample depends on the usual principles of
sampling.
In choosing "persons" or other units to observe in that population, the tester
simply fixes on the units for expressing the percentages of each opinion. This
permits more exact quantifying of the qualitative assertion "X includes Y" to become
"Y is 65% of X," etc.
Thus, the frequency in the cross quadrant which does not vanish measures in
percentile units how much larger the including opinion is than the included
opinion. This fact enables the scale which the two or more opinions form to be more
accurate than a mere ordinal scale ranking the opinions. It makes a cardinal scale
fixing the distances between each opinion in exact and interchangeable units of
population. (Subject as always to sampling fluctuation if inferences about its parent
population or about another sample are made.) The opinion itself is still in
all-or-none units having values of 1 or 0 only. The frequency percentages measure it more
finely but in units of another variable, namely the population. Much confusion will
result if this distinction is not made. It is more accurate technique to express the
opinion itself in ordinal degrees (e. g., none, some, most, more, all, etc.), or in
cardinal units (1, 2, 3, 4... etc.) of its own kind, i. e., to measure it directly, wherever
possible, rather than to use units of some other variable such as the population
which measure it indirectly. The indirect population units are used here only
because the opinion itself has not been refined beyond the primitive all-or-none form
of statement.
F. The tester chose the confidence limits — such as the 1% level or the 5% level —
within which he found the probability that these two opinions, if reobserved on many
samples, would continue to scale. These involve the conventional interpreting
necessary when the frequency in one quadrant does not quite vanish. Is the
non-vanishing frequency due to chance or does it mean the two opinions just do not
scale? Some probability index is needed in practice (such as Equation 3) to judge
the "near nul" frequency. This specifies the degree of approximation to a perfect
scale that is tolerated by the tester.
G. Finally, the tester may choose more than two opinions. He can examine any set of
opinions, n in number, to see if they all scale and so are points along one dimension.
The test here requires that every pair of opinions shows a nul quadrant. Among n
opinions there will thus be at least $(n^2 - n)/2$ conditions to be met, or that number of
nul quadrants to be found. Tests for this general case of scaling n attributes have
been extensively developed by Louis Guttman and others and will be fully reviewed
in his forthcoming book. From these tests, if the set of opinions do scale, one can
then make accurate predictions about which opinions of the set a person holds from
knowledge of one of his opinions only. The set of n opinions has been reduced to
one opinion with n degrees of it.
If the set of n opinions do not scale, the tester can make further choices such
as:
1) Abandon the search for scales.
2) Try another population.
3) Try other wordings, i. e., other opinions (an opinion being defined as a verbal
statement of an attitude). This shifts the dichotomizing boundary.
4) Relax his standards of statistical significance.
5) Select a sub-set of opinions that do scale.
Thus he may find one or more sub-sets within the set of n opinions that he has
studied which do scale within each sub-set. In this case, he has partially factored the n
opinions. For scaling is one form of factor analysis. It identifies "factors" which form a
linear pattern or order of increasing inclusiveness within a set of n observed variables.
This means reducing the n observed dimensions to fewer dimensions — a case of the
parsimony principle in science. At the limit in complete scaling, n dimensions are
re-expressed completely in terms of one dimension.
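
The pairwise test of choice G can be sketched in a few lines. This is only an illustration under stated assumptions: the replies are hypothetical, every opinion is worded in the positive direction, and a pair is said to scale only when one of the two unlike-signed cells is exactly empty (no allowance is made for a "near nul" frequency).

```python
from itertools import combinations

def null_quadrant(x, y):
    """True if one of the two unlike-signed cells of the fourfold table is empty.

    x and y are equal-length lists of 0/1 replies from the same persons.
    """
    only_y = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    only_x = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    return only_y == 0 or only_x == 0

def scalable_pairs(opinions):
    """Test every pair among n opinions: (n*n - n)/2 conditions in all."""
    return {(i, j): null_quadrant(opinions[i], opinions[j])
            for i, j in combinations(sorted(opinions), 2)}

# Hypothetical replies of six persons to three nested residence questions.
replies = {
    "Europe": [1, 1, 1, 1, 0, 0],
    "France": [1, 1, 1, 0, 0, 0],
    "Paris":  [1, 0, 0, 0, 0, 0],
}
results = scalable_pairs(replies)
all_scale = all(results.values())
print(len(results), all_scale)
```

With three opinions there are three pairs to examine, and the set scales only if every one of them shows an empty cell.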
Notes
1. Similarly, the poll may first dichotomize those informed and uninformed on an issue. By
dealing with the informed class, only the information variable is thus partialed out or
controlled. Its irrelevant varying is eliminated in such a poll. This partials out the
"uninformed" more accurately than a partial correlation coefficient would do. For the
information variable is exactly eliminated by this simple experimental design whereas
partial correlation eliminates it only on an average. Full exploration of this partial correlation of
attributes is promised in Paul Lazarsfeld’s forthcoming book.
2. See Louis Guttman, "A Basis for Scaling Qualitative Data," American Sociological Review,
Vol. IX, No. 2 (April 1944), and other articles. These are well marshaled in his forthcoming
book Attitude and Opinion Analysis, which the author is told will treat fully the all-or-none
case among others. The "subclass fact," however, was independently discovered by the
author as a rigorous logical deduction from the dimensional S-system referred to below.
3. Other indices were developed and discarded as unnecessarily complicated. Thus one
index is the subclass correlation coefficient or part-whole correlation. The subclass
probability later will be seen to be the square of the subclass correlation coefficient and is
identical with the subclass determination coefficient, r²cx (see Equation 6). Another index
was built which applied chi square and the coefficient of contingency to the near-vanishing
cell by dividing the expected proportion or population (expected in that cell as the product of
its row and column proportions) into the discrepancy between that theoretical and the
actually observed proportions in the near-vanishing cell. A third index measured asymmetry
about the diagonal by the ratio a/(a+b) which would run from .5 to .0 for a vanishing and
from .5 to 1.0 for b vanishing. Multiplying by 200 and subtracting 100 would yield a
percentage m of asymmetry or tendency for one class to be absorbed into the other. A
fourth measure index uses the evaluated determinant, namely, ab-cd in which ab tends to
vanish as a does.
4. Thus, for example, take a sample of 100 persons, 50 of whom hold opinion Y, of whom 5
do not hold opinion X. The "a" quadrant has 5 persons instead of the 0 needed for Y to be
wholly a subclass of X. What is the probability that Y is a subclass of X? If the theoretical value of
Pxy/y is .95 (i. e., 95% of Y members are members of X) then the "a" value is 50 × .05 and a
is taken as the nearest integer. (If 3 were taken instead, the test would be even more clearly
passed.)
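
The arithmetic of this example can be sketched as follows. The cell frequencies are those of the fourfold table in Note 5; the function name and the cell ordering are choices made for this illustration, and 3.841 is the usual 5% critical value of chi square for one degree of freedom. The "theoretical" frequencies here are those of the subclass hypothesis, not those of independence.

```python
def yates_chi_square(observed, theoretical):
    """Sum of (|O - E| - 0.5)^2 / E over the cells of a fourfold table."""
    return sum((abs(o - e) - 0.5) ** 2 / e
               for o, e in zip(observed, theoretical))

# Cells in the order a (Y only), c (both), d (neither), b (X only).
theoretical = [2, 48, 23, 27]   # a = 50 * .05 = 2.5, entered as 2 in the table
observed = [5, 45, 20, 30]      # 5 of the 50 holders of Y do not hold X

chi2 = yates_chi_square(observed, theoretical)
CRITICAL_5_PERCENT = 3.841      # chi square, 1 degree of freedom
passes = chi2 < CRITICAL_5_PERCENT
print(round(chi2, 2), passes)
```

The sum comes out a shade above the 3.75 quoted in the note, a difference of rounding only, and still below the critical value.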
5. The fourfold table for this example:

                 X-              X+           Total
   Y+       a    = 2        c    = 48
            a'   = 5        c'   = 45          50
            a''  = 3        c''  = 3
            a''' = 2.5      c''' = 2.5

   Y-       d    = 23       b    = 27
            d'   = 20       b'   = 30          50
            d''  = 3        b''  = 3
            d''' = 2.5      b''' = 2.5

   Total         25              75            100

Where absence of prime denotes theoretical frequencies, single primes denote observed
frequencies, double primes denote the discrepancy of observed and theoretic frequencies,
and triple primes denote Yates' corrected discrepancies.

$\chi^2 = 6.25/2 + 6.25/48 + 6.25/23 + 6.25/27 = 3.75$

Squaring each discrepancy with Yates' correction, dividing each squared discrepancy by its
theoretic value, and summing gives chi square as 3.75. Since this is less than 3.841, the
subclass test is passed at the 5% level of significance and we say that Y here is a subclass
of X.
Under a laxer standard, where Pxy/y = .9 (instead of .95), the subclass test is passed if there
are as many as 8 persons in the "a" quadrant (at the 5% level of significance in a 2 × 2
table when the marginals are 50 persons and 50 persons as here).
6. An opinion is defined here as a verbal statement of some attitude, and therefore "a
sentence" in the sentential calculus of symbolic logic.
7. All these languages are systematized by dimensional analysis and become special cases
of the dimensional S-formula. This comprehensive dimensional formulation of whatever can
be symbolized in any language or notation is developed in the author's Dimensions of
Society (Macmillan, 1942, pp. 944) and more fully and simply in his Systematic Social
Science (University Bookstore, 4386 University Way, Seattle, Washington, 1947, pp. 785),
a volume offset for criticism and revision.
8. To prevent confusion, note that the active voice interchanges with the passive voice in the
English phrases used when passing from class to sentence calculus. Thus "includes" between
classes shifts to "is implied by" between sentences, and "is included in" between classes
shifts to "implies" between sentences.
9. For a simple example: in set D above, suppose that the Marshall plan includes articles
defining an Italian loan. Then this Italian plan is a subclass of the Marshall plan. But from
the equations of symbolic logic we deduce that endorsers of the whole Marshall plan will be
a subclass of the endorsers of the Italian plan. We predict this reversal of the class and
subclass with complete certainty. Anyone observing these facts and accepting these
definitions given above, in any poll, anywhere, anytime, will confirm our prediction. For this
prediction in a popular sense, or deduction more exactly, admits of no exception. Any
apparent exception is a mistake in classification of that person.
Consider first the logical proof of this reversal of subclasses and then its "common
sense" analysis. Let the Marshall plan be denoted by M, a class of treaty articles; let I
denote its subclass of Italian treaty articles; let PM denote the class "people endorsing all the
Marshall plan's articles"; let PI denote the class "people endorsing all the Italian articles";
and let PNI denote the class "people endorsing all the non-Italian articles in the Marshall
plan." Then:
(1) M ⊃ I
    The Marshall plan includes the Italian plan.
    (By definition.)
(2) PM = PI PNI
    PM is the logic product of PI and PNI.
    (By definition, since Marshall plan endorsers are here defined as those who endorse
    all its articles.)
(3) ∴ PM ⊂ PI
    Therefore endorsers of the Marshall plan are included in endorsers of the Italian plan.
    (Since a logical law states that every logical product is included in each of its factor
    classes.)
This reversal of the part-whole relation arises through imposing the condition (2) that slips
in hidden in the word “whole” (Marshall plan). The poller, by choosing to dichotomize the
universe of people who have opinions about the Marshall plan into those who endorse all
its clauses and those who do not endorse them all (whether endorsing some or none), has
laid down the condition expressed in (2) which restricts the pro-Marshall class of people to
being a part of the pro-Italian class. This prediction is without exception, so that if a poll
finds any exception that response is proven to be an error of observation of some kind —
that respondents did not know the meaning of the questions or answered by guessing, etc.
This provides a technique of identifying errors of observation, for spotting persons who
misunderstood the questions asked.
From a common-sense analysis it is evident (but not at first sight) that anyone
endorsing every article in the Marshall plan will have endorsed all the Italian articles. But
conversely, some may endorse all the Italian articles but not all the other articles. This
yields more Italian plan endorsers than Marshall Plan endorsers and makes the latter a
subclass of the former. Alternately stated, the subclass test is fulfilled here in that one
quadrant has nobody in it, for no one can assert endorsement of all the Marshall Plan
articles and non-endorsement of the Italian part of them.
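
The reversal can be checked mechanically with sets. The respondents and article labels below are invented for the illustration; only the three relations (1), (2), and (3) come from the note.

```python
# Hypothetical articles of the plan: M includes I.
italian_articles = {"I1", "I2"}
non_italian_articles = {"N1", "N2", "N3"}
marshall_articles = italian_articles | non_italian_articles

# Hypothetical respondents and the articles each one endorses.
endorsements = {
    "ann":  {"I1", "I2", "N1", "N2", "N3"},   # endorses the whole plan
    "bob":  {"I1", "I2", "N1"},               # all Italian, not all the rest
    "cara": {"N1", "N2", "N3"},               # all non-Italian articles only
    "dan":  set(),                            # endorses nothing
}

def endorsers_of_all(articles):
    """Persons who endorse every article in the given class of articles."""
    return {p for p, liked in endorsements.items() if articles <= liked}

PI = endorsers_of_all(italian_articles)
PNI = endorsers_of_all(non_italian_articles)
PM = endorsers_of_all(marshall_articles)

product_holds = PM == (PI & PNI)   # relation (2)
reversal_holds = PM <= PI          # relation (3)
print(sorted(PM), sorted(PI), product_holds, reversal_holds)
```

Whatever endorsements are entered, the set PM can never contain a person absent from PI, which is the empty quadrant of the subclass test.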
10. Since X including Y implies that the complement of X is included in the complement of Y, or
in dimensional notation where the presubscript denotes a subclass and the dot-colon
denotes either "implies" or "is included in": YX .: xY.
11. The equation is:
$P_{W/X} = P_{W/Z}\,P_{Z/Y}\,P_{Y/X}$,    Equation (4)
where the subscript denotes the numerator and denominator classes in the probability ratio.
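
A small numerical check of Equation (4) follows. It assumes, as a reading of the note rather than something the note states outright, that the four classes are nested (W within Z within Y within X) and that each probability is the ratio of the two class sizes; the sizes themselves are hypothetical.

```python
from fractions import Fraction

# Hypothetical sizes of four nested classes, W inside Z inside Y inside X.
size = {"W": 12, "Z": 30, "Y": 60, "X": 100}

def p(sub, whole):
    """Proportion of `whole` that belongs to the nested class `sub`."""
    return Fraction(size[sub], size[whole])

chained = p("W", "Z") * p("Z", "Y") * p("Y", "X")
direct = p("W", "X")
print(chained, direct)
```

The intermediate class sizes cancel in the product, so the chained and the direct proportions agree exactly.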
12. $r_{xy}^2 = P_{XY} = P_{XY/X}\,P_{XY/Y} = \frac{c}{c+b} \cdot \frac{c}{c+a}$
Equation (5)
13. The sub-correlations seem related (identifiable?) to the two correlation ratios for curvilinear
correlation since the geometric mean of the correlation ratios is the correlation coefficient
(when the regression is rectilinear).
Some statistical readers may want to relate this common members correlation to the
fourfold correlation of two attributes. They differ because they measure correlation in
differently defined populations. The common members correlation (rc) is among the
members of class X or class Y only, i.e., frequencies a, b, c in Figure 1. The fourfold
correlation (rf) is among these with the addition of members of neither class, which is
frequency d in Figure 1. Here one starts with a specified population regardless of whether
they belong to class X or to class Y. The formulas for each are given in the appendix.
#26. The Coefficient of Equiproportion as a Criterion of Hierarchy1
Fellow of the National Research Council, U. S. A. and of the Rockefeller Foundation,
University College, London
The Journal of Educational Psychology, Vol. XIX, April 1928, Number 4
The term "equiproportion" is proposed2 to define exactly the looser concept of "hierarchy."
Equiproportion is defined by equation
$r_{st}/r_{sv} = r_{ut}/r_{uv}$    Equation (1)
or its "tetrad difference" form:
$r_{st} r_{uv} - r_{sv} r_{ut} = 0$    Equation (1a)
A table of intercorrelations is equiproportional when all its tetrad differences are zero,
within limits of the probable error of sampling. This more exact terminology seems desirable, for
"hierarchy" often meant subjective estimation of the tendency of the coefficients in the rows and
columns to decrease, or else it meant the particular and approximative criterion of the
inter-columnar correlation equaling unity. It further connoted a ranking one above another, which
was misleading since all the intercorrelations may be equal or zero and still equiproportion be
perfect.
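
The definition can be tried on a small table. In this sketch the four correlations with a single common factor are invented, and the table is built as products of those values so that equiproportion is perfect by construction; the function simply forms the three tetrad differences for every set of four variables.

```python
from itertools import combinations

def tetrad_differences(r):
    """All tetrad differences of a symmetric correlation table (dict of dicts)."""
    out = []
    for a, b, c, d in combinations(sorted(r), 4):
        p1 = r[a][b] * r[c][d]
        p2 = r[a][c] * r[b][d]
        p3 = r[a][d] * r[b][c]
        out.extend([p1 - p2, p1 - p3, p2 - p3])
    return out

# Hypothetical correlations of four variables with one common factor.
g = {"s": 0.9, "t": 0.8, "u": 0.7, "v": 0.6}
r = {x: {y: g[x] * g[y] for y in g if y != x} for x in g}

diffs = tetrad_differences(r)
equiproportional = all(abs(x) < 1e-12 for x in diffs)
print(len(diffs), equiproportional)

# Raising one correlation (a stand-in for a group factor) breaks the pattern.
r["s"]["t"] = r["t"]["s"] = r["s"]["t"] + 0.1
disturbed = max(abs(x) for x in tetrad_differences(r))
```

In observed data the differences would be judged against their probable error rather than against an arbitrary small tolerance as here.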
In practice, tables which are strongly equiproportional will never be quite perfectly so
due to
(a) errors of sampling or
(b) the presence of group3 factors over and above the general and the n specific factors
into which the n equiproportional variables can always be analyzed.
The object of this paper is to develop the best criterion of equiproportion in practice. It
must measure and interpret both the absolute amount of the divergence from perfect
equiproportion (presumably due to group factors) and the probability that this amount is, or is
not, due to sampling error.
As an approximative criterion the inter-columnar correlation equaling unity was first
developed because it enabled estimating the sampling error. But this proved unsatisfactory
because:
(a) The inter-columnar r might be unity although (1) was not satisfied, as instanced by
such coefficients as .3, .5, .6 being matched with .3, .5, .7 in the next column.
(b) When the r's in a column tend to be equal, the inter-columnar r tends to become an
indeterminate quantity since all deviations are shrinking towards zero. Before this
point is reached, however, those deviations become smaller than and therefore
swamped by, the probable error of the rs.
(c) To avoid this a "correctional standard" was required which rejected columns whose
deviations were not significant. This meant that this criterion of hierarchy could then
be applied only to a part of the data.
This inter-columnar criterion was replaced by the tetrad difference criterion4 when its
probable error was worked out by Spearman and Holzinger. The full formula for the probable
error of the fundamental criterion of a single tetrad was found and several briefer
approximations to it.
To summarize the table a formula for a general probable error was worked out which
depends upon the mean and the standard deviation of all the correlation coefficients in the
table. The proof of this formula has not yet been published and the author states that it "in
some theoretical points requires elucidation still." This present criterion of equiproportion is the
significance of the median tetrad difference as compared with this probable error from the
whole table. The ratio of such a quantity to its probable error may be called the significance ratio,
a term of convenience in interpreting many types of statistical data.
A significance ratio of unity indicates a tetrad difference which would as often as not
occur by mere sampling and which, therefore, needs no further explanation. If the significance
ratio is larger than 6, it is currently considered established that, with quite high probability, the
quantity is not due to sampling error. In the case of equiproportion, this means that a departure of the
observed median tetrad difference from zero by a significance ratio of 5 or more
definitely indicates group factors preventing the clean analysis of the variables into one general
and n specific factors.
This generalized tetrad difference criterion leaves room for improvement (aside from the
fact that its proof is incomplete, though promised) because of the following features:
(a) It does not satisfactorily interpret the absolute amount of the divergence from perfect
equiproportion, but rather performs the second function mentioned above, namely
that of giving the probability that this amount may be due to sampling error.
(b) It is very laborious to compute for a large table. There are $3\binom{n}{4}$ tetrads in a table of n
variables. This is 3003 for n = 14.
(c) It does not clearly isolate the group factors, which may be present between each
pair of variables, and measure their size.
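
The count in (b) is quickly verified: each set of four variables yields three tetrad differences. The function name is chosen for this illustration only.

```python
from math import comb

def tetrad_count(n):
    """Number of tetrad differences in a table of n variables: 3 * C(n, 4)."""
    return 3 * comb(n, 4)

print(tetrad_count(14))
```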
Now it has been shown by Garnett5 as a special case of the cosine law, that when
equiproportion exists, the intercorrelation of every pair of variables in the table can be expressed as
the product of the correlations of those variables with the general factor, g, thus:
$r_{xy} = r_{xg} r_{yg}$    Equation (2)
This is also derivable6 from the partial correlation formula, which is zero upon the
elimination of the general factor as the residual specific factors are uncorrelated:
$$r_{xy \cdot g} = \frac{r_{xy} - r_{xg} r_{yg}}{k_{xg} k_{yg}} = 0$$
Equation (3)
For here the numerator must equal zero if the fraction does and this gives (2). The rxg
and ryg can be readily determined from the inter-correlations as explained in the Appendix of
Abilities of Man already referred to.
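
Equation (3) is easily computed once the correlations with g are in hand. The sketch below takes k to be the coefficient of alienation, the square root of one minus the squared correlation; the numerical values are hypothetical.

```python
from math import sqrt

def alienation(r):
    """Coefficient of alienation, k = sqrt(1 - r^2)."""
    return sqrt(1.0 - r * r)

def partial_r_given_g(r_xy, r_xg, r_yg):
    """Equation (3): correlation of x and y with the general factor g eliminated."""
    return (r_xy - r_xg * r_yg) / (alienation(r_xg) * alienation(r_yg))

# Hypothetical values. The first pair obeys Equation (2) exactly; the second
# carries an extra .05 of raw correlation, as a group factor would add.
perfect = partial_r_given_g(0.7 * 0.6, 0.7, 0.6)
with_group_factor = partial_r_given_g(0.7 * 0.6 + 0.05, 0.7, 0.6)
print(perfect, round(with_group_factor, 4))
```

When Equation (2) holds, the numerator vanishes and the partial is zero; any surplus correlation survives in the partial, enlarged by the division by the two alienation coefficients.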
Under each of the $(n^2 - n)/2$ observed inter-correlations in the table the $r_{xg} r_{yg}$ value,
expected if equiproportion were perfect, may be written, and the $(n^2 - n)/2$ discrepancies
found. These discrepancies between the two factor hypothesis and observation are the
numerators of the ratio in (3). A measure might be built up of squared discrepancies similar to
the coefficient of contingency, which summarizes the amount of relation existing in all the cells
over and above that expected by the hypothesis of chance. But the fact of correlation between
correlation coefficients existing and the further fact that this correlation surface has not been
worked out makes such a measure a difficult proceeding. But on dividing this discrepancy by
the product of the two alienation coefficients as in (3), there results the residual correlation
after the elimination of the general factor.7 It is proposed to use the average of these partial
correlation coefficients with g eliminated, i.e., rxy•g, as the coefficient of equiproportion to
measure the average amount of the group factors, and so the extent to which the variables fall
short of being perfectly expressed by one general and n specific factors. 8
The partial correlation with g eliminated measures in familiar terms the absolute amount
of the group factors present. Its significance ratio to its probable error indicates the probability
of this correlation being due to sampling error. But it is not necessary to rest with this
probability for a crucial test can be made. This test is to increase N, the number of the subjects
(or, if they are already sufficiently numerous, to take sub-samplings). For if the partial correlation
with g eliminated is due to group factors it will become more and more constant as N
increases. But if it is due to sampling error it will decrease in accordance with the law of
sampling error expressed in the probable error formula:
$$\mathrm{PE\ of\ } r_{xy \cdot g} = .6745\,\frac{1 - r_{xy \cdot g}^2}{\sqrt{N}}$$
Equation (4)
On the hypothesis that the exclusive presence of one general and n specific factors is
obscured solely by sampling error, the $r_{xy \cdot g}^2$ is zero and so the PE limit is $.6745/\sqrt{N}$. In the
absence of group factors the PE of the observed partials will never exceed this limiting value
and will tend to range just under it as N increases. So closely will they approach this limit that
often, as shown in the examples later, they must be calculated to seven or eight decimal
places to observe any divergence from it. Therefore, instead of using an average probable
error to divide into the average partial with g eliminated, it is proposed to use this PE limit,
$.6745/\sqrt{N}$, in getting the significance ratio of the coefficient of equiproportion.9
The complete criterion of equiproportion proposed is then: For four variables either the
significance ratio of the basic criterion of the tetrad difference, or the coefficient of
equiproportion, $\bar{r}_{xy \cdot g}$, and its significance ratio.
For more than four variables:
(a) The coefficient of equiproportion (which is the average of equations (3)) measuring
the amount of group factors and sampling error present,
(b) Its significance ratio, $1.48\,\bar{r}_{xy \cdot g}\sqrt{N}$ $(= \bar{r}_{xy \cdot g} / \mathrm{PE}\ r_{xy \cdot g})$, measuring the probability that
the observed partial correlation is due to sampling error,
(c) And, if desired, an auxiliary index, the sigma of the partials, to measure the
distribution of group factors evenly among all the variables (when σ = 0) or their
concentration in a few variables (when σ > 0).
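
The three parts of the criterion can be computed together. The sketch below uses the six partials printed for Form A in Table 3 (N = 2599); the function and key names are chosen for the illustration, and taking the sigma over the partials without regard to sign is an assumption, since the text does not say whether signs are kept.

```python
from math import sqrt
from statistics import mean, pstdev

def equiproportion_criterion(partials, n_subjects):
    """Coefficient of equiproportion, its PE limit, significance ratio, and sigma."""
    magnitudes = [abs(p) for p in partials]       # average taken neglecting signs
    coefficient = mean(magnitudes)
    pe_limit = 0.6745 / sqrt(n_subjects)
    return {
        "coefficient": coefficient,
        "pe_limit": pe_limit,
        "significance_ratio": coefficient / pe_limit,   # about 1.48 * r * sqrt(N)
        "sigma": pstdev(magnitudes),                    # assumption: signs neglected
    }

# Partials with g eliminated, Form A of the Civil Service tests, N = 2599.
form_a = [0.0069, -0.0036, -0.0027, -0.0038, -0.0027, 0.0057]
result = equiproportion_criterion(form_a, 2599)
print({k: round(v, 5) for k, v in result.items()})
```

The coefficient and the ratio come out close to the .00425 and .322 reported for that table; the small differences presumably reflect rounding in the printed partials.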
Examples
A few illustrations have been worked out of the features of this criterion and
comparisons with other criteria. A table of inter-correlated anthropometric traits has been taken
as one in which equiproportion was extremely imperfect; another of earlier mental tests
illustrates good equiproportion; and a third of more recent mental tests illustrates excellent
equiproportion and at the same time its complication by experimentally introduced group
factors in the shape of alternative forms of the same tests. The tables of raw correlations are to
be found in Spearman, "Abilities of Man," pp. 44ff.
Table 1:
Doll's Anthropometric Correlations
N = 477
Test ..........................
Name .......................
1
Right
grip
rag, ............................. .8016
2
Left
grip
8630
3
Height
1
2
.0270
.6402
5
Weight
.7684
4
Sitting
height
.8277
.6941
6
Vital
capacity
.6362
.0432
.0420
.0430
.0377
.0432
.0455
.0432
.0443
.0356
.0438
.0427
.0451
.0451
PE’s
3
4
-.2377
-.2490
- .2881
-.4210
.4708
Partials
5
- 2357 - .0798 .2100
.2812
.0454
6
.2385
.1822 -.1193
-.1192
-.0929
Average partial
(neglecting
.3202
.3222 12,852
.3042
.1759
.1504
signs).
The partial correlation coefficients are in the lower left triangle of the table, and their probable
errors are in the upper right triangle.
In Table 1 no equiproportion is found among the anthropometric tests. The inter-columnar
correlation is -.02; the observed median tetrad difference is more than 10 times the
probable error expected by random sampling alone; and the correlation due to the group
factors after eliminating g is .2564. This latter value is over 8 times the probable error and
would occur by random sampling only once in 100,000,000 times. These variables then, after
eliminating the general factor, still contain large group factors, which cannot possibly be
attributed to sampling error.
Bonser’s Mental Correlations
N = 757
Table 2:
Test……………
Name…………
1
Mathematical
judgment
rag………………
.701
1
2
…….
.0265
2
Controlling
Association
.672
3
Literary
interpretation
.607
4
Select
judgments
.550
5
Spelling
.0363
…….
.0363
.0364
.0364
.0363
.0363
.0363
…….
.002
.0364
…….
.0363
.0363
398
PE’s
3
4
-.046
.018
-.019
.044
Partials
5
.024
-.029
.044
-.031
Average Partial,
(neglecting
.029
.030
.028
.024
.032
signs)
The partial correlation coefficients are in the lower left triangle of the table, and their probable
errors are in the upper right triangle.
In Table 2 good equiproportion is found in Bonser’s correlations of earlier
intelligence and schooling tests. The inter-columnar r is .96; the observed median tetrad
difference is only 1.18 times the probable error of sampling; the coefficient of equiproportion
giving the average raw correlation of the group factors, plus sampling error, is .0284. This has
a significance ratio of 1.16 times the probable-error-limiting-value obtainable by sampling. Note
that the significance ratio of the coefficient of equiproportion is almost the same as that of the
tetrad difference criterion, as it should be, since the methods are based on the same formulae.
The slight correlation here after eliminating g would occur by random sampling 44 times in
100 and hence there is no ground for supposing that it indicates any group factors at all. Note
how closely the probable errors of the observed partials in the table approach the probable
error limit, which would hold were there no group factors but only random sampling error
present. The observed and expected PE's differ only in the fourth decimal place.
Table 3:
Spearman’s Civil Service Correlations
Form A "Selective," or Controlled Response Type; N = 2599

Test13            1A           2A           3A           4A
Name              Completion   Analogies    Passages     Instructions
rag               .7306        .5640        .5954        .5900
1A                ……           .01322762    .01322791    .01322489
2A                 .0069       ……           .01322799    .01322799
3A                -.0036       -.0027       ……           .01322765
4A                -.0038       -.0027        .0057       ……
Average partial    .0048        .0041        .0040        .0041
PE                 .0132        .0132        .0132        .0132
The partial correlation coefficients with g eliminated are given in the lower left triangle of the
table, and their probable errors are given in the upper right triangle.
The tests as given in Spearman, "Abilities of Man," p. 153 have been renumbered to
distinguish the two forms. The tests here 1A, 2A, 3A, 4A, 1B 2B 3B are there 1, 4, 7, 5, 3, 2, 6
respectively. Test 4 (Spearman No. 5) only had one form and was introduced to get the
necessary fourth test with which to calculate tetrad differences in each form group.
(a) Coefficient of equiproportion = average partial r of the table = .00425 = average raw
correlation due to group factors and sampling error.
(b) PE maximum limit = .01322808 = .6745/√N.
(c) Significance ratio = .322 = a/b.
(d) Probability of the coefficient of equiproportion being due to sampling error = 83 in
100.
In Table 3 is shown a triumph of precise experimentation in psychology. The tests were
built to measure g with as little group factor as possible. Due to using the best types of verbal
intelligence tests, care in construction and administration, controlled response (multiple choice)
form of response, and a large number of subjects (N = 2599) the resulting equiproportion is
almost perfect. The average correlation of the group factors together with the sampling error is
.00425 which is the coefficient of equiproportion. As this amount is only about a third of the
probable error of pure sampling, it has a probability of being due to that cause alone of 83 in
100. So that even this vanishingly small correlation is not likely due to group factors at all. The
probable errors of the observed partials have to be calculated to seven and eight decimal
places, as shown, before any discrepancy from the theoretical probable error of sampling
without group factors is revealed. Herewith psychological experimentation begins to achieve
the precision of the older sciences of physics and chemistry! If reported to the conventional two
decimal places, every partial correlation coefficient but one in the table would appear as zero,
indicating complete absence of both group factors and also of significant sampling error.
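
The probabilities attached to the significance ratios in these tables can be reproduced from the normal curve. The sketch assumes normally distributed sampling errors and a two-sided probability; one probable error is .6745 of a standard deviation, so that a ratio of unity is exceeded "as often as not."

```python
from math import erfc, sqrt

def chance_probability(significance_ratio):
    """Two-sided normal probability of a deviation of at least this many probable errors."""
    z = 0.6745 * significance_ratio    # probable errors converted to standard deviations
    return erfc(z / sqrt(2.0))

for ratio in (1.0, 0.322, 1.23):
    print(ratio, round(chance_probability(ratio), 2))
```

A ratio of .322 gives roughly 83 chances in 100, as reported for Form A, and a ratio of 1.23 roughly 41 in 100, as reported for Form B in Table 4.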
Table 4:14
Spearman’s Civil Service Correlations
Form B "Inventive," or Free Completion Response Type; N = 2599

Test15            1B           2B           3B           4B
Name              Completion   Analogies    Passages     Instructions
rag               .6792        .7130        .6106        .6190
1B                ……           .0132        .0132        .0132
2B                -.0317       ……           .0132        .0132
3B                 .0263        .0011       ……           .0132
4B                 .0010        .0285       -.0241       ……
Average partial    .0197        .0104        .0172        .0179
PE                 .0132        .0132        .0132        .0132
The partial correlation coefficients with g eliminated are given in the lower left triangle of the
table, and their probable errors are given in the upper right triangle.
The tests as given in Spearman, "Abilities of Man," p. 153 have been renumbered to
distinguish the two forms. The tests here 1A, 2A, 3A, 4A, 1B, 2B, 3B are there 1, 4, 7, 5, 3, 2, 6
respectively. Test 4 (Spearman No. 5) only had one form and was introduced to get the
necessary fourth test with which to calculate tetrad differences in each form group.
(a) Coefficient of equiproportion = .0163 = average raw correlation due to group factors
and sampling error.
(b) PE maximum limit = .0132 = .6745/√N.
(c) Significance ratio = a/b = 1.23.
(d) Probability of (a) being due to sampling error = 41 in 100.
In Table 4 the same tests on the same subjects are analyzed. But this time the tests
were thrown into the inventive or free completion type of response. This apparently introduced
some subjective scoring elements which act as group factors in slightly increasing the partial
correlations with g eliminated. Although the differences from Table 3 are not statistically
significant, yet there is an indication that the controlled response type measures g with greater
precision than the inventive response type with its scoring variables.
Table 5:
Spearman’s Civil Service Correlations
Form A vs. Form B (Introducing Group Factors Due to Alternative Forms of the
Same Tests); N = 2599
Test .......................................
1A
Name..................................... Completion
A* ...................................
.7973
rag
B**..................................
.7363
1B
2B
3B
Average partial17 ....................
.0464
- .1802
- .0718
.0833
2A
Analogies
.6059
3A
Passages
.5988
.7389
.3419
-.0968
.0603
- .0687
.0914
-.0821
.0094
.0500
.0500
*Determined by formula (19) of Spearman, "Abilities of Man," and averaging the three such
ratios for each form.
**Throughout these tables the average partial is taken without regard to sign, to indicate the
absolute size of the group factor correlation. The average partial here is the average of both
forms from the three rows and the three column cell.
The partial correlation coefficients with g eliminated are given in the lower left triangle of
the table, and their probable errors are given in the upper right triangle.
The tests as given in Spearman, "Abilities of Man," p. 153 have been renumbered to
distinguish the two forms. The tests here 1A, 2A, 3A, 4A, 1B, 2B, 3B are there 1, 4, 7, 5, 3, 2, 6
respectively. Test 4 (Spearman No. 5) only had one form and was introduced to get the
necessary fourth test with which to calculate tetrad differences in each form group.
(a) Coefficient of equiproportion = .0749 = average raw correlation due to group
factors and sampling error.
(b) PE maximum limit = .0132 = .6745/√N.
(c) Significance ratio = a/b = 5.67.
(d) Probability of (a) being due to sampling error = 2 in 10,000.
In Table 5 a very neat experimental situation has enabled the isolation of group factors
due to alternative forms (selective and inventive) of the same tests. Although these tests are
known from Table 3 and Table 4 to give excellent equiproportion with no significant group
factors, yet when the calculation is based on both forms jointly there appears a coefficient of
equiproportion measuring the correlation of group factors of .0749. As this is 5.67 times the
probable error of sampling in the absence of group factors, and would occur once in 5000
times by chance, this correlation indicates significant group factors.
It is worth noting that the effect of group factors seems to be to increase the r_ag's, the
correlation of each test with the general factor, g, as determined from all the tests. Part of the
tangled mass of group factors seems to be appropriated by the formula as a generalizable
factor, leaving the balance as a structure of positive and negative¹⁰ group factors as shown by
the partials. Note that, in Table 5, the r_ag's differ markedly according as they are determined
from one form or the other, and these again from the "pure" forms of Tables 3 and 4. This
illustrates the chief defect in the procedure of partialling g out to get the coefficient of
equiproportion as a criterion of equiproportion. For in proportion as the variables contain group
factors, the determination of r_ag will become less accurate (increasing it) and the resulting
partials will be less accurate. They will still indicate by their general size how imperfect the
equiproportion is and which are the offending variables, but they will not measure the size of
the group factors so reliably.¹¹
It should be remembered that a given set of correlations can be synthesized by group
factors in an enormous number of sizes and arrangements. The particular group factors here
analyzed out are determined by the formula for getting r_ag, and then r_xy•g. When the basal r_ag
becomes changed, the ensuing calculations will change accordingly. The intricacies of this are
very puzzling, as shown by much further data worked out but not here published. More
research is needed on possible structures of group factors and their effects on the correlation
coefficients. At present, however, the accuracy of the r_ag determination can be increased in
many ways. For example, after working out the partials, as in the above tables, those variables
showing the largest group factors might be dropped and the r_ag's recalculated from the others.
The method of "reference values" is described by Professor Spearman in the Appendix of his
book, "The Abilities of Man." Of course the principal method is so to build the tests and
conduct the experiment that group factors will not appear, if g is to be measured.
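The partialling of g discussed above rests on the ordinary first-order partial correlation formula. As a minimal sketch in Python (not from the original paper; the function names and the sample raw correlations are illustrative assumptions), the partial with g eliminated and the average partial taken without regard to sign might be computed as follows.

```python
import math

def partial_r(r_xy, r_xg, r_yg):
    """First-order partial correlation of x and y with g held constant."""
    return (r_xy - r_xg * r_yg) / math.sqrt((1 - r_xg ** 2) * (1 - r_yg ** 2))

def mean_absolute_partial(partials):
    """Average of the partials taken without regard to sign."""
    return sum(abs(p) for p in partials) / len(partials)

# r_ag values for tests 1B and 2B as printed in Table 4.
r_1g, r_2g = 0.6792, 0.7130

# Hypothetical raw correlation equal to the product r_1g * r_2g:
# with no group factor the partial with g eliminated vanishes.
no_group_factor = partial_r(r_1g * r_2g, r_1g, r_2g)

# Hypothetical raw correlation slightly above that product:
# the excess appears as a positive partial, i.e. a group factor.
with_group_factor = partial_r(0.50, r_1g, r_2g)

print(no_group_factor, with_group_factor)
```

The raw correlations of the original tables are not reproduced in this section, so the two inputs above are stand-ins chosen only to show the behavior of the formula.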
In Table 6 a summary of the foregoing discussion and indices from the three criteria of
equiproportion on the various sets of data are given. The chief fact to note here is that the
three criteria agree; even the approximative inter-columnar correlation gives a dependable
verdict. The tetrad difference criterion gives the same order of probability that an observed
imperfection in the equiproportion is due to sampling error (and therefore not to be attributed to
group factors) as the coefficient of equiproportion. But the latter also affords a measure of the
absolute size of the group factors in familiar correlation terms (compare Tables 2 and 3). There
is no doubt after inspecting the vanishingly small partial correlations of Table 3 that this is a
more nearly perfect equiproportion than in Table 2. Yet because Table 2 is based on one-third
as many cases, its PE of the tetrad differences is larger and by this criterion shows a
significance ratio that is alike for both tables. But the coefficient of equiproportion distinguishes
the greater perfection of Table 3. Since the original data, or the correlations based on a larger, or
a smaller, number of subjects, N, were not available for the data of Tables 1-5, the crucial test,
going beyond the mere probability, that the apparent group factors were really group factors
and not sampling error, could not be made here.
Summary
From these illustrations of the analysis of the variables made possible by the coefficient
of equiproportion criterion, its features may be enumerated, as follows:
1. For many variables it is much less laborious to calculate than the tetrad difference
criterion. For 14 tests there are 3003 tetrads to work out, but only 91 partials, or one
for each correlation coefficient.
2. It not only measures the probability that the observed imperfection of equiproportion
is due to sampling error (and so the complementary improbability that it is due to
group factors), but it also provides a measure of the absolute size of the group
factors, if they exist.
3. It further identifies the precise pair of variables in which the group factors exist more
simply than by a complicated comparison of various tetrads.
4. It is an analysis which would be undertaken in any case after the application of the
tetrad difference criterion in order to explore the constitution of the variables. It thus
does not involve the extra work of working out the criterion as do the former criteria.
5. It provides a crucial test to distinguish between imperfect equiproportions being due
to sampling error or to group factors.
6. This test is the constancy, or decrease, of the partials in proportion to 1/√N, as N
increases.
7. It is based on familiar and well established formulas throughout.
8. It possesses the disadvantage of measuring the group factors with decreasing
accuracy as these grow larger and make the determination of r_ag less accurate.
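The counts in item 1 follow from elementary combinatorics: each pair of tests yields one partial, and each set of four tests yields three tetrad differences. A brief sketch in Python (illustrative only; the function names are not from the original) reproduces them.

```python
from math import comb

def partial_count(n_tests):
    """One partial correlation for each pair of tests."""
    return comb(n_tests, 2)

def tetrad_count(n_tests):
    """Three tetrad differences for each set of four tests."""
    return 3 * comb(n_tests, 4)

for n_tests in (4, 8, 14):
    print(n_tests, partial_count(n_tests), tetrad_count(n_tests))
```

For 14 tests this gives the 91 partials and 3003 tetrads cited above, and the gap widens rapidly as tests are added.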
Notes
1. Indebtedness is acknowledged to the criticisms of Professor C. Spearman and Professor
G. H. Thomson in preparing this study.
2. The proposal is made jointly with Prof. C. Spearman.
3. These group factors must be distinguished from the dice-like group factors dealt with by Professor
G. H. Thomson, else the present confusion will be increased. For an exposition of the
different properties of the different types of factors see Dodd, S.C.: “A Review of the
Theory of Factors” (shortly to appear in Psychological Review). Garnett has shown that the
dice-like group factors when arranged according to the laws of probability, are not
inconsistent with, but only a variant mathematical form of expressing the general and
specific factors of the two factor theory. Otherwise expressed, the dice-like group factors
are only mathematical functions of the general and specific factors, while conversely the
latter are only mathematical functions of the former. Either can be expressed in terms of the
other. The conversion formula is g = (1/√N)(ε₁ + ε₂ + ε₃ + … + εₙ), where g is the general
factor of the two factor theory and the ε's are the dice-like elements composing Thomson's
group factors. Garnett, J. C. M.: “The Single General Factor in Dissimilar Mental
Measurements”, British Journal of Psychology, Vol. X, 1920.
The group factors referred to in this paper may be defined as that part of the
observed variables which produces residual intercorrelation after the general factor has been
partialled out, as described later.
4. For an exposition of this criterion in detail together with the exact and the approximation
probable error formulae see Spearman, C.: "Abilities of Man", Macmillan Company, 1927, pp
415, Appendix II.
5. Garnett, J. C. M.: Proceedings Royal Society, A96, 1919, pp. 91-110.
6. Spearman, C. and Hart, B.: “Mental Tests of Dementia”, Psychological Review, 1914.
7. Ordinarily the partial correlation coefficient cannot be so cleanly interpreted. But when
dealing, as here, with independent factors, additively combined and completely determining
the dependent variable, the partial correlation with one factor controlled does accomplish
the clean elimination of that factor and leave the residual raw correlation of the variable with
the remaining factors.
8. After working out this proposal, the author discovered that, as often before, Professor
Spearman had anticipated it in explicitly suggesting the partial correlation with g controlled
as an alternative criterion to the tetrad difference. This paper, therefore, is but an
exploration of this alternative criterion and an attempt to point out what seems its superior
features. See Abilities of Man, Appendix IV (10). The nomenclature there used of calling it
the "specific correlation" seems unsuitable, for when factors are correlated they are group
factors and cease to be specific to one variable alone. This extension of the meaning of
specific factor will lead to confusion when it is wanted to denote its proper meaning of a
factor not shared by any other variable. Consequently the "partial correlation with g
eliminated" or "the group factor correlation" would seem more precise terminology.
9. Strictly this PE is not the PE of r̄xy•g for two reasons. First, r̄xy•g is the average of the rxy•g
partials. But the sampling error of an average of indices is less than of a single one of them.
Hence the PE formula overstates the error and may be used for the r̄xy•g as well as for the
rxy•g’s. Again, in using the limiting PE value the maximal PE replaces the observed one. This
obviates the necessity of averaging the observed ones in summarizing the table. The error
is again on the conservative side of overstating the amount of sampling error. For both
reasons the significance ratio of the Coefficient of Equiproportion may be depended upon (and
is actually larger than given by the formula).
10. For a pioneer exploration into negative group factors, or "interference factors," and their
correlations see Thompson, J. R.: British Journal of Psychology, Vol. X, 1920.
11. This in part accounts for the divergent significance ratio given by the tetrad difference
criterion and the coefficient of equiproportion criterion in Table 6, the rows for Tables 5 and
1. Another part of this divergence in rows for Tables 3, 4 and 5 is due to a slightly differing
combination of the tests. Thus for the tetrad difference criterion for the Table 5 analysis, the
Test No. 4, Instructions (which had only one form) was required for a fourth test, while for
the coefficient of equiproportion criterion it was left out in order to isolate the group factors
due to alternate forms of the same tests more purely.
#27. Note on an Index of Conformity
University of Washington, Seattle
An index to measure the degree of conformity to some norm, or single class interval, of
a variable is here proposed. It is an improvement in percentage form of the 4th moment (in
sigma units) or Pearson's beta sub-two (taken from an arbitrary origin), which Peters proposed
under the label of an "index of institutionalization” (1). This index had grown out of studies such
as those by Allport on the "J curve of conforming behavior." The formulas for our "index of
conformity" (Cfy) and graphs of its behavior are given in Fig. 1. Its derivation is simply that,
since the 4th moment varies from unity to infinity, its reciprocal will vary between the limits of 1
to 0. This measures nonconformity so that the complement from unity of this reciprocal is taken
to measure degree of conformity. This proportion is multiplied by 100 to express it in familiar
percentage units. The origin about which the moment is calculated is the norm or class interval
of expected behavior, i.e., any arbitrary origin to which the degree of conformity of the data is
to be tested.¹
Cfy = 100 [1 - (Σx²)² / (N Σx⁴)]
where x = X - norm.
This index measures the degree of concentration in, or dispersion from, one class
interval which may be the norm in the social mores, or may be any class interval set up by the
analyst as a hypothesis for testing the degree of conformity of the data to it. It measures
kurtosis on a scale where 100% is maximum, around 67% is mesokurtic, around 50% is
platykurtic, and 0% is negatively leptokurtic or maximal anticonformity.² In Fig. 1, the base
lines correspond to the amount of conformity of each graph in the conformity scale (Cfy) at the
left. Graphs are presented for the simple two-category cases of conforming or nonconforming
in the first column, and for the case of five categories or class intervals of the variable in the
second and third columns (which differ only in that the norm is at one extreme in the second
column and is at the middle in the third column). The graphs show clearly for some common
types of distributions how perfect conformity means that all the frequencies are in the norm
class interval; around 50% conformity means a rectangular distribution in general, with the
frequencies equally divided among all class intervals, and 0% conformity means that the
frequencies are all concentrated in the class intervals furthest away from the norm. Further
features of this index are seen to be that it is applicable to all shapes of distributions and that it
is independent of the unit in which the variable is expressed. It is a percentage or pure
number, since in its formula the dimensions of the numerator and of the denominator cancel
each other out, leaving a dimension-less ratio. It thus measures conformity to a norm
expressed as a percentage of maximal conformity. A further feature of this index is that in the
two-category cases of conformity or nonconformity it becomes identical with the simple
percentage of persons conforming, and thus is readily interpretable by laymen.
Within certain limits, this conformity index indicates unimodality to bimodality. It
indicates this best in symmetric distributions when the norm is central. Then a Cfy of 100 %
indicates perfect unimodality; while Cfy of 0% indicates perfect bimodality. Intermediate
degrees of Cfy, while not measuring the tendency for the distribution to have one versus two
peaks, measure the underlying tendency for the population to be concentrated around one
class interval (the norm at the center) or dispersed into two concentrations around the two
extreme class intervals. Cfy is not a constant measure of bimodality, since the relation of Cfy to
degree of bimodality will shift with the conditions. Thus, under conditions of an asymmetric
distribution with the norm not central, 0% will not indicate perfect bimodality.
As the index drops it reflects a homogeneous or unified population, becoming separated
into two opposite camps with respect to the characteristic measured. Cfy thus can measure the
degree of enmity or opposition of two groups along a given dimension. Cfy does not measure
the dichotomizing of a group into two camps under all conditions; rather it measures the
conformity or absence of deviation of a group from a norm.
This index should have wide usage and great convenience in sociology, psychology,
and other fields in measuring the degree of deviation from some expected behavior or
conditions wherever that is one of two or more possible class intervals of such behavior or
conditions. It can crucially test hypotheses.
For judgments of sampling reliability, one may use the standard error of the 4th moment,
whether calculated about the mean or about an arbitrary origin. For large samples, this is the
usual σ⁴√(96/N). Or one may use the standard error of beta sub-two about an arbitrary origin.
For an example of its use, Peters' data of automobile drivers keeping in their proper
lanes on the highway curves may be taken. His data are 85.3% in lane, 12.1% crossing less
than half, 1.7% crossing more than half, 0.8% crossing fully into the other lane, giving an index
of institutionalization of 15.3, which cannot be readily interpreted unless one has many such
indices in mind to compare it with. The index of conformity for these data is 93.5%, which has
immediate interpretation even to laymen as being 93.5% of maximal or perfect conformity to
the traffic regulation.
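The index can be computed directly from a frequency table. The following is a minimal sketch in Python, not part of the original note. It assumes that the four lane categories are scored as equal steps 0, 1, 2, 3 from the norm and that the reported percentages serve as frequencies. The function name is illustrative.

```python
def conformity_index(deviations, frequencies):
    """Cfy = 100 * (1 - (sum x^2)^2 / (N * sum x^4)), where x = X - norm."""
    n = sum(frequencies)
    sum_x2 = sum(f * x ** 2 for x, f in zip(deviations, frequencies))
    sum_x4 = sum(f * x ** 4 for x, f in zip(deviations, frequencies))
    if sum_x4 == 0:
        return 100.0  # every case lies in the norm class interval
    return 100.0 * (1.0 - sum_x2 ** 2 / (n * sum_x4))

# Peters' lane-keeping percentages, scored 0..3 steps away from the norm
# (the equal-step scoring is an assumption of this sketch).
peters = conformity_index([0, 1, 2, 3], [85.3, 12.1, 1.7, 0.8])

# Two-category case: 70 conforming, 30 nonconforming.
two_category = conformity_index([0, 1], [70, 30])

print(round(peters, 1), round(two_category, 1))
```

Under these assumptions the lane-keeping data come out close to the 93.5% reported above, and the two-category case returns the simple percentage conforming, as the text states.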
Section 4: Studies on Value in Polling
Focus on opinion factors of values, or things-liked, V.
SD: 67-421
#28. The Likability Theory
for predicting probable acts of men - a transactional theory of values-in-context
Institute for Sociological Research, University of Washington, Seattle
I. The "Likes" Models
A. The Problem
Towards solving the problem of forecasting a man's acts, there are many historic
theories of man that have used many synonyms for "As a man feels, knows, and has done, so
he is likely to do again." These three we shall call the three "modes" of action. They weave
through man's experience in many ways. Here is a table of some of these ways for a quick
look. (See Refs. 15, 21.)
Table 1:
Correlates of the Three Modes of Speech Behavior
Psychology .................  Affective                Cognitive        Conative
Sociology ..................  Appreciative             Cognitive        Evaluative
General ....................  Emotions                 Intellect        Conduct
Philosophy .................  The beautiful            The true         The good
Kant's "Ultimate modes" ....  Feeling                  Knowing          Willing
Chief institutional field ..  Art                      Science          Ethic
Somatology .................  Endomorphs               Ectomorphs       Mesomorphs
Neurology ..................  Afferent and autonomic   Central nervous  Efferent nervous
                              nervous system           system           system
Physiology .................  The senses (and glands)  The brain        The muscles
This table shows that a large part of the problem has been one of the lack of
standardized terms, like cgs units in physics, by which to name and measure these modes.
B. The Observing
To solve the problem of words the Likes Model (Ref. 21) gives three standardizing
indices of a Liking felt, a Likeness known and a Likelihood of an act done.
In "Likes" wording, the modes answer these questions: How well do you like A? How
alike is A to the standard B? How likely are you to do C? This operationalizes the terms
"affection, cognition, and conation." Exhibit A shows these "Likes-scales." It is an adaptation of
the Stapel scale and the Cantril ladder.
"I like" is the term used for feelings because of the weighty evidence that these words
work better than any others in English. Thus Edwards’ much studied Personal Preference
Scale puts its 450 items in terms of "I like," and Cantril finds strong and growing theoretical and
empirical neurological considerations supporting this formulation.
Likeness indices can measure knowledge about the behavior to be foretold. Thus one
might ask: Do you think Person X will be most alike to or unlike the best of our presidents
(checked by later using "worst" instead of "best")? Which of the income classes listed here is
most alike to yours? Which of these alternative answers is most alike to your idea of the best
answer? These "likeness scales" are often similar to a semantic differential scale.
To construct a likeness index simply take the n multiple-choice questions of any poll and
let each choice implicitly say: This answer, out of the set here, is most alike to my opinion.
Obviously a set of likeness indices may be called for to probe similarities and differences seen
in as many respects as may be relevant to the prediction at issue.
Exhibit A shows a Likeness rating scale. Liking ratings differ from Likeness ratings in
that the former say, "I like this," while the latter say, "This is alike to that." Liking-ratings relate the
whole person to an object in an approaching or withdrawing response while likeness-ratings
relate object A to object B.
The doing mode is better measured by a Likelihood rating too. To estimate future from
present behavior, two questions may be risked: "How often could you have done this, and how
often did you do it during the past year?" "How often could you do this during the next twelve
months: and how often do you plan on doing it?"
Another and more easily answered question asks: "How likely or unlikely is it that you
will do it in the next twelve months?" Exhibit A also shows a Likelihood Rating format.
The liking, likeness, and likelihood ratings may be averaged to get a geometric mean
"likes" rating. This likes-rating measures the ancient trio of feeling, knowing, doing in an analytic
and standardizing way that might develop c.g.s.-like units for behavioral science.
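The geometric-mean "likes" rating can be illustrated with a short sketch in Python. This is not from the original paper. Exhibit A is not reproduced here, so the sketch assumes that the three ratings have been expressed on a strictly positive scale (for example 1 to 10), since a geometric mean is undefined for zero or negative values. The function name and sample ratings are hypothetical.

```python
import math

def likes_rating(liking, likeness, likelihood):
    """Geometric mean of the three ratings; all must be strictly positive."""
    ratings = (liking, likeness, likelihood)
    if any(r <= 0 for r in ratings):
        raise ValueError("geometric mean needs strictly positive ratings")
    # Averaging the logarithms and exponentiating gives the cube root of the product.
    return math.exp(sum(math.log(r) for r in ratings) / len(ratings))

# Hypothetical ratings on a 1-to-10 scale.
print(likes_rating(8, 6, 9))

# One very low rating pulls the combined rating down sharply,
# which is how factors in a product behave.
print(likes_rating(8, 6, 1))
```

The second call shows why a product-based average suits a model whose factors are meant to combine multiplicatively: a near-absent factor cannot be offset by the other two.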
C. The Likes Hypotheses
Let us now state our hypotheses to help in forecasting some isolable acts of men.
What men like most they are most likely to try to get, other factors being constant. But
this tendency is often complicated by inadequate knowledge and experience of the best means
to get what is wanted most. Also besides his wants and his knowledge, man's habits will
determine much of his current and future acts.
From such common-sense considerations, we will formulate our three likes hypotheses.
Instead of using the all-or-none "If A, then B" form, the graded form will be used which says,
"Insofar as A, the antecedent conditions, exist fully and solely, the consequent behavior B is
expected." The "likes model" jointly expects:
Polls foretell probable acts of men insofar as:
1. they measure and record well -- the subhypothesis re methodology
2. what men see -- the subhypothesis re perception
3. as most liked, -- the subhypothesis re feelings
4. most alike to it, -- the subhypothesis re knowings
5. and most likely -- the subhypothesis re doings
6. while unseen factors are negligible. -- the subhypothesis re constant context
This statement includes explicitly the methodology along with substantive content. It
specifies the antecedent behavior of both the observer and the observee since either alone is
less than the "full context" needed for fullest prediction. It specifies that its component
hypotheses combine as factors in a product; not just as addends in a sum.
For the "core subhypotheses" on the three modes in terms of likes phrasing, a simpler
formula may help in daily use. It is to remind oneself that: Acts seen as most alike to the acts
that are most liked and recently likely are likeliest. This is our "like rule." It is the core of our
"likability transaction" for which all the other hypotheses in this paper are the context. Its
simplest phrasing in eight words is, "Acts alike to those liked are most likely."
All this can be extended into a "modes-and-tenses model." (Refs. 15, 21.) Ask each of
the questions about a person's likes, in the past, in the present, and in the future tense. (See
Exhibit B.)
The mode-tense model is useful in analyzing the formation of norms. For what are
norms if not the three likes factors among people with a common history? For example, in the
analytic "likes" language, the Golden Rule might be paraphrased as: "Do to others what seems
most alike to what you would like most to have others likely to do to you in like circumstances."
D. The testing (Ref. 26)
Likes hypotheses are tested by people everywhere, who are mostly unaware of it. To
test them more explicitly, observe and record the following steps.
1. For each person, P, inventory his relevant "likes," i.e., appropriate antecedent acts,
A, such as:
a. P's likings, Af, pertinent to the predictand object or behavior, B, i.e. his acts of
preferring or exchanging some effort, time, or money to get (or keep) B;
b. P's knowledge and skills, Ak, most alike to the means of his attaining B;
c. P's recent acts, Ad, likely to forecast B.
2. Measure the degree of each of these three "likes," or predictor variables, Af, Ak, Ad.
3. Correlate them in all pairs including the criterion B to be predicted to get regression
weights;
4. Systematize them in a causal hierarchy, trying to assign causal shares to each factor
in the product. (Log scores should be tried out appropriately.) See Exhibit A.
5. Using computers this is not as hard as it sounds. In the absence of computers, a test
would be to have subjects rank items by their liking for each item and then by
likeness of each item to whatever other item is most liked. If the two rank orders
correspond, the likes hypothesis would be confirmed (as to its subhypothesis
expecting correlation of likings and relevant likenesses).
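The rank-order check described in step 5 can be sketched in Python. This is an illustration rather than the author's procedure. It assumes untied scores and uses Spearman's rank-difference formula as the measure of correspondence. The item scores are hypothetical.

```python
def ranks(scores):
    """Rank items from 1 (highest score) downward; assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(rank_a, rank_b):
    """Rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical ratings of five items by one subject.
liking = [9, 7, 4, 2, 6]    # how well each item is liked
likeness = [8, 7, 3, 1, 5]  # how alike each item is to the most-liked item

rho = spearman_rho(ranks(liking), ranks(likeness))
print(rho)
```

A value of rho near 1 would count in favor of the subhypothesis that likings and relevant likenesses correlate. A value near 0 or below would count against it.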
E. The Applications (Refs. 3, 4, 5, 7, 10, 11, 15, 16, 21, 22, 23)
The likes and likability models may be more useful than three formal models extant for
predicting human behavior. To test this, compare the alternative models in the following
respects.
1. Testability -- the model seems to us to be in principle wholly testable by polling,
insofar as society may desire to do so.
2. Precision -- in offering operationally specified scales for its variables.
3. Universality -- in applying to most behaviors of most men at most times in most
situations.
Submit it to the test of experiment. How well will it predict under highly controlled
conditions? What percent of the variance of the dependent variable, B, is accounted for? A case
of such testing is reported below in Table 5.
F. The Systematizing
Next note how this model or operationally-specified theory is simply a part of the over-all
transact model which grew out of our earlier descriptive dimensional system. (Refs. 1, 2, 13,
14, 17)
The transact model starts with proposing six basic necessary categories which are
logical factors forming a product, not addends forming a sum, in analyzing any social event or
behavior. These six factors are
(1) actions of
(2) people for
(3) wants in
(4) time and
(5) space under
(6) residual specified circumstances.
Next, each factor is always observed at some level or power, i.e., at the qualitative level
as a set of elements, at the quantitative level as a sum of units, or at the relational level as a
product of factors, or finally at the systemic level as a power or self-product of a specified
order. Of course, any combinations may be observed of these necessary four "facets" standing
for the four levels of each factor. Eight versions or paraphrases of these four levels of
thoroughness of observing and recording the data are tabled herewith.
Table 2:
Eight Versions of the Facets of Every Variable
Facet  Lay        Geometric   Math      Logical          Systemic        Corner position scripts, s
0      Cases      Points      Sets      Cases            Part            Presuperscripts, ^sX
I      Degrees    Line-sects  Sums      Subclasses       Subsystem       Presubscripts, _sX
II     Relations  Vectors     Products  Classes          System-in-hand  Postsubscripts, X_s
III    Systems    Spaces      Powers    Classifications  Supersystem     Postsuperscripts, X^s
The "corner script" (=s) standing for the four facets of every variable (=X) may then be spelled
out as fully as needed in a notation standardizable throughout all the special cases of ^s_s X ^s_s.
Our transact analysis, thirdly, analyses the operators (called "functors," such as the
familiar signs +, -, ×, /, X^e, =, ≠, etc.). (Refs. 14, 17) These functors tell exactly how to combine
and relate together both the symbols of the factors and facets and their empirical referents in
the behavior of the observees and observers.
The essence of our transact analysis then is to analyze any recorded behavior-in-context,
any transact, into its "features" (factors, facets, and functors) so fully and exactly that
their synthesis restores that transaction at the current moment and predicts its recurrence in
recurring context in the future. This formally develops the "If A, then B" form of a statement that
is the acme of all scientific hypotheses and laws.
So far in describing our likability model we have analyzed only the Acts factor, as
polled responses of liking, likening, "making likely." Let us now enlarge this "like model."
II. The "Likables" Models
A. The Problem
The problem in our likables models can be stated as: How well do three likes-ratings
and indices of the things-liked predict an action of men? This enlarges the relevant context that
is explicitly observed and measured.
B. The Observing
To gather relevant observations, note first that combinations of man's likes and
things-liked involve much of the fields of psychology and physiology, economics and political science,
sociology and anthropology, religion and philosophy, education and recreation, in short the
social sciences and the humanities.
What the terms "likes" and "things-liked" connote in diverse products may be further
suggested in the set of paired terms below.
Table 3:
Products with Acts and Objects or "Likes" and "Things-Liked" As Factors
(in further contexts)
Likes acts    Things-liked
Wantings      Wanteds
Desirings     Desiderata
Valuings      Valued-objects
Demand        Supply
Effort        Achievement
Means         Ends
Stimuli       Responses
Quid          pro quo
Give/         /get (ratio)
Strivings     Goals
Costs         Benefits
Payments      Purchases
Strivings     Results
Sacrifices    Rewards
Worth         Amount available (of the object)
The most important case of functors, or combined algebraic behavioral operators, is to
represent interaction of people, or group behavior, always as an algebraic product and never
as a sum. This distinction spells the difference between chance prediction from r = 0 and near
perfect prediction from r = 1 as in Exhibit C. For here combining the knowers (=p) of an item
who interact with the non-knowers (=q) as their product (=pq) yields almost perfect correlation
and maximal prediction of the obsessed increments in knowers, i.e., r(pq)B = .997.
But combining the selfsame data from interaction inappropriately as a sum, p + q, must
always yield a zero correlation of a constant, p+q = 1, with a variable and therefore minimal
prediction, i.e., r(p+q)B = 0.
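The contrast between the product and the sum can be sketched numerically. The following is a minimal illustration, not the experiment's own computation: the group size, starting count, and growth rate are assumed values, and the `pearson` helper adopts the convention (matching the text) of reporting r = 0 when one series is constant, although strictly the coefficient is undefined in that case.

```python
def simulate_knowers(total=1000, start=10, rate=0.5, periods=25):
    """Return the number of knowers at the start of each period (assumed model)."""
    counts = [start]
    for _ in range(periods):
        n = counts[-1]
        # New knowers are proportional to knower x non-knower contacts.
        counts.append(n + round(rate * n * (total - n) / total))
    return counts


def pearson(xs, ys):
    """Pearson r; returns 0.0 when either series is constant (a convention)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


TOTAL = 1000  # hypothetical group size
counts = simulate_knowers(total=TOTAL)
increments = [b - a for a, b in zip(counts, counts[1:])]
products = [n * (TOTAL - n) for n in counts[:-1]]  # p * q, in head counts
sums = [n + (TOTAL - n) for n in counts[:-1]]      # p + q, always TOTAL

r_product = pearson(products, increments)
r_sum = pearson(sums, increments)
print(f"r(pq, increments)  = {r_product:.3f}")
print(f"r(p+q, increments) = {r_sum:.3f}")
```

Head counts are used instead of proportions so that the sum is exactly constant and no rounding noise enters the second correlation.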
Along with products three other classes of functors are useful, namely, appropriate
forms of ratios, bits, and matrices.
A ratio answers the question: what will you give to get what you want? Let us call this
the "give-get" ratio.
Bits are the units of "a choice made between two nearly equal alternatives."
Matrices are usually needed in observing data for likables modeling to arrange sets of
elements in rectangular arrays along n axes. Matrix algebra seems to us an essential tool in
studying likables models.
C. The Hypotheses
Let the likes hypothesis be enlarged now to include things-liked in a system of eight
subhypotheses which may be christened the "likables" theory, a part of the full likability theory
(below), viz.:
"A Poll tends to predict a probable action of men in context (called a "transaction")
insofar as:
1. it measures and records well
2. that those men see
3. as most liked,
4. most alike to it, and
5. most likely
6. all relative to some "thing-liked"
7. as fittingly recorded (in matrices of give/get ratios, etc.)
8. while nothing else changes."
D. The Testing
Out of the myriad implicit and partial testings of these likables hypotheses
that go on daily among people everywhere, we report here (see Exhibit C) three
relevant sets of experiments (Refs. 28, 22, 25, 31).
The fuller meaning of the first of these eight subhypotheses, namely "to
measure and record well" the transaction at issue, is spelled out in our Scient-scales (Ref. 28).
These 100 scales rate and start testing the excellence of methodology as one hundred
itemized ways in which the behavior of the observer and reporter of the transaction at issue
may demonstrably contribute to better explaining of it and better predicting of its recurrence
under recurring conditions.
The objective of the second experiment was to spread news items through a set of
persons. The desired behavior to be predicted and controlled was that "Many people came to
know the item" (well enough to retell it).
The chief causal preconditions were that the target population reacts in pairing off,
steadily, and randomly. These preconditions were stated in the "If clause" or "insofar as"
preamble to the "logistic hypothesis." This is paraphrased here in likables terms as:
"A poll tends to predict the probable acts of men in specified settings which are
here their learning of item messages through t periods in a growth curve
specified by the logistic equation, insofar as:
1. it measures well
2. what each of the P men heard
3. and told to a new man
4. with liking enough to retell it
5. with likeness to the first telling in wording and in speed enough to neglect unlikeness
6. and with likelihood that is on the average nearly the same for each man, item, and
period
7. all relative to some "thing-liked" which is here the P news items
8. as in the give/get ratio: give a telling/get a hearing
9. while nothing else relevant changes."
A classroom of 58 college freshmen boys was selected so as to control 52 variables
within usual limits. Each boy was a starter of a different message item. All persons paired off at
will in t successive minute-long periods.
The findings (Refs. 22, 31) clearly and reliably confirmed the whole set of the
hypotheses of this base-line likables model. For the observed increments in diffusing
correlated with the expected increments almost perfectly at r = .997. We furthermore expect
that this logistic growth curve for the variance of an interact will become tested and recognized
as one of the most basic and exact laws of elementary social behavior (in communicating and
acculturating) that the social scientists can point to.
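The expected shape of such a baseline curve can be sketched as follows. This is an illustrative approximation, not the authors' procedure: it follows a single item, assumes that a non-knower's partner is a knower with probability n/total (a large-group simplification of random pairing), and treats expected counts as if they were exact. The function name and default values are hypothetical, apart from the class size of 58 taken from the text.

```python
def expected_knowers(total=58, start=1, periods=12):
    """Expected number of knowers per period under random pairing (approximate)."""
    counts = [float(start)]
    for _ in range(periods):
        n = counts[-1]
        # Each of the (total - n) non-knowers meets a knower with
        # probability of roughly n / total, so growth is proportional to p * q.
        counts.append(n + (total - n) * n / total)
    return counts


curve = expected_knowers()
for period, n in enumerate(curve):
    print(f"period {period:2d}: {n:5.1f} knowers")
```

The printed values rise slowly at first, fastest near the midpoint, and level off as the group saturates, which is the S-shape the logistic hypothesis describes.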
In this base-line experiment, the three modes were held constant in order to establish a
model for controlled experiments on human groups which contains a firm zero point on the
interaction continuum.
Our next step, built upon the Logistic Experiment, was the third, the "Experiment on Clique
Size" (see Exhibit C). This begins to measure from a baseline of random behavior (caused by
"many small independent influences") the non-random or systematic social influences in which
sociologists are primarily interested. In life each person's clique could be observed, much as
Moreno did, by asking him
(a) Whom he liked most to interact with in some way such as "telling news to" here, or
(b) Who was most alike to his ideal listeners, or
(c) Whom he was most likely to tell news to.
This "cliques model" in logistic diffusing expects that limiting the retelling to one's own
clique would slow down or decelerate the diffusing and so reduce the steepness of the logistic
diffusion curve (see Exhibit C).
This "cliques model" varies the likables factors thus:
1. The thing-liked (the clique for communicating) varies from 1 up to P-1;
2. The liking varies between "persons chosen" and "unchosen" for communicating to;
3. The likeness similarly varies as an all-or-none alikeness vs. unlikeness to whatever
defines membership in one's clique or set of daily contacts;
4. The likelihood similarly varies from unitary to zero probability of any person being
told the news item by another person, according as the two persons were in the
same clique or not.
Our "correlating of modes" hypothesis expects that abnormally low or negative
correlations between feelings, knowing, and doings may flag schizophrenic personalities or
societies in respect to the predicted behavior, while high correlations may indicate integrated
or normal personalities and a society fulfilling its roles and living up to its norms in the
measured respects. For fuller testing of our "modes and tenses hypotheses" see Ref. 26.
The m resulting curves from this cliques submodel are shown in Exhibit C. The diffusion
curves did become steeper with increasing clique size as the hypothesis predicted.
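The direction of this clique-size effect can be illustrated with a small simulation. The sketch below rests on assumptions that are not in the original experiment: people sit on a ring, each person's clique is the `clique_size` nearest neighbors, and in every period each knower tells one randomly chosen clique member (rather than strictly pairing off). All names and parameter values are hypothetical.

```python
import random


def periods_to_reach(total, clique_size, target_share=0.9, seed=0,
                     max_periods=10_000):
    """Count periods until target_share of a ring of people know the item.

    Each person's clique is the clique_size people nearest on the ring
    (half on each side), so clique_size should be even here.
    """
    rng = random.Random(seed)
    half = clique_size // 2
    offsets = [d for d in range(-half, half + 1) if d != 0]
    knows = [False] * total
    knows[0] = True
    target = target_share * total
    for period in range(1, max_periods + 1):
        tellers = [i for i in range(total) if knows[i]]
        for i in tellers:
            # Retelling is limited to the teller's own clique.
            listener = (i + rng.choice(offsets)) % total
            knows[listener] = True
        if sum(knows) >= target:
            return period
    return max_periods


def mean_periods(total, clique_size, trials=20):
    """Average the waiting time over several seeded runs."""
    runs = [periods_to_reach(total, clique_size, seed=s) for s in range(trials)]
    return sum(runs) / len(runs)


TOTAL = 58  # class size from the baseline experiment
results = {size: mean_periods(TOTAL, size) for size in (2, 8, 56)}
for size, periods in results.items():
    print(f"clique size {size:2d}: about {periods:.1f} periods to reach 90%")
```

Fewer periods to reach the same share of knowers corresponds to a steeper diffusion curve, so the smallest clique should show the slowest spread.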
The controlled experiment seems to us to confirm crucially this part of the likables model.
But more importantly it opens a vista whereby controlled social experiments under more and
more complete and lifelike conditions can help develop a more exact behavioral science. We
see scientific experimentation as highly promising for increasingly solving the ancient problem:
How may the probable acts of men be predicted?
E. The Applying
One potential application of this likables model is to help men answer the question:
Whither mankind? by the scientific answer:
(1) towards whatever men like most
(2) towards whatever seems most alike to what men prefer most; and
(3) towards whatever seems to men most likely of achievement (within the limits of what
they like most).
This answer might be condensed to: "Whither man likes or seems alike to it, and likely."
Towards finding out how to achieve agreement on the content of statements such as
the above, we are at present engaged in a research project called "Project Consensus."
Results from a series of pilot studies and later rigorously controlled experiments indicate that
discussion with intent-to-agree is an efficient mechanism in norm formation. The hypothesis
implies the common-sense notion that discussion with intent-to-agree is a major factor in
promoting agreement. These experiments, not yet published, repeatedly demonstrate, on
diverse issues and in different groups, that sets of dyads discussing an issue with
intent-to-agree, versus matched control groups with no such intent, will, on the average, change their
opinions towards a consensus by reducing their initial variance within pairs by 25% to 50%.
The intent-to-agree factor here eliminated about a third of their initial disagreement.
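How a reduction in within-pair variance might be computed can be sketched as follows. The ratings below are hypothetical 7-point values chosen only for illustration, not data from Project Consensus, and the rule that each member moves a fixed share of the gap toward the partner is an assumption made to keep the arithmetic transparent.

```python
def within_pair_variance(pairs):
    """Mean variance of the two opinions inside each pair."""
    # For two values a and b the (population) variance is ((a - b) / 2) ** 2.
    return sum(((a - b) / 2) ** 2 for a, b in pairs) / len(pairs)


def move_toward_partner(pairs, share):
    """Each member moves `share` of the pair's gap toward the other member."""
    return [(a + share * (b - a), b + share * (a - b)) for a, b in pairs]


# Hypothetical ratings for three dyads (illustrative, not study data).
before = [(1, 7), (2, 5), (3, 4)]
after = move_toward_partner(before, share=0.1)

v_before = within_pair_variance(before)
v_after = within_pair_variance(after)
reduction = 1 - v_after / v_before
print(f"within-pair variance reduced by {reduction:.0%}")
```

When each member closes a tenth of the gap, the gap shrinks to 80% of its size and the variance to 64%, a reduction of 36%, which lies inside the 25% to 50% range reported above.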
This consensus experiment, from simple beginnings, can develop a technique for
helping to form and measure people's value systems as explicitly, comprehensively, and
sincerely as people may like enough to pay its social costs. It further offers a technique for
reforming any person's or group's value-systems as far as mutual persuadings can do so when
all else is kept unchanged in tightly controlled experiments.
This instrument may be found by many groups to be useful for discussion and education
of the members. It is the essence of the democratic process, focused and controlled in
scientific experiments. In short, it is a scientific technique for making the democratic process
and the educational process a little more efficient.
For example, let us hypothesize a high school faculty which has become polarized
around two views with regard to discipline. The principal, by judicious use of this technique and
with the consent of all and the coercion of none, can set up the group in such a way that
discussion will reduce the dispersion about the mean, resulting in change towards consensus
of the whole group.
Or, again, a high school may be aware that some of its students demonstrate socially
approved attitudes, while others are "troublemakers." By organizing the students in groups
weighted in favor of those demonstrating the approved attitudes, the "troublemakers'" attitudes
are likely to move toward the group mean, or in other words, tend to "straighten out."
F. The Systematizing
This likables model can enlarge the transaction (called the "likables" transaction) to
predict some later behavior or second transaction by using as predictors both the three likes
acts and the things-liked. The dimensional formula for the likables model explicitly develops
the facets of the two factors — Acts and Values in give/get ratios — so as to try to maximize
the correlation between the earlier predictor behavior, the tension, and that behavior as it later
is observed to occur. The size (and the reliability) of this correlation between the likables
predictor and the outcome may be said to validate the model or verify its hypotheses — within
limits of the situation studied, of course.
III. The Full “Likability” Models
A. The Problem
What further factors beyond the Acts and Values (the three likes and things-liked), with
what facets modifying them and functors combining them, seem likely to raise the predicting of
the outcome behavior at issue (such as diffusion of knowledge in the logistic transaction
above)?
B. The Observing
We observe first the standard transactional context factors as follows:
1. Acts-in-the-setting. Thus in the logistic situation above the core acts were simplified
to the tellings-and-hearings and recordings of items between persons who were
almost unaware of the contextual acts by the experimenter. These acts of the
observer included previously selecting a homogeneous set of core actors, controlling
irrelevant influences, etc.
2. Actors of the core acts and also the actors of the conditioning acts.
3. Timings of any and all acts as systematized in our dimensional analysis (Refs. 1, 2,
5, 6, 13, 14). These timing factors in the context of any transaction at issue include
most importantly of all its features the past history and experience and future
intentions of the transactors.
4. Spacings of the foregoing in the points, lines, areas, and volumes that condition
human and all other action.
5. Values or relevant things-liked that in the three tenses (as past memories, current
likings, and future purposes) may operate, consciously or not, on most if not all of
the human behavior that may be involved in the prediction.
6. Material or relevant material equipment, energies, and resulting states (i.e., any
relevant factors expressible in c.g.s. units).
7. Symbols insofar as they may correlate observably with the behavior to be predicted.
8. Residual circumstances or whatever else that correlates with the predicted
transaction.
C. The Hypotheses
With two further hypotheses included, the ten hypotheses in the full "likability theory"
expect:
"A poll tends to predict a probable transaction (i.e., behavior-in-context) insofar as:
1. it measures and records well
2. that most men see
3. as most liked,
4. as most alike to that, and
5. as likely,
6. all relative to specified things-liked,
7. as fittingly recorded in matrices of give/get ratios, etc., and
8. all in relevant setting of the eight factors called the antecedent Transaction A
(especially the transactors' perceptions of the past and the probable future of the
"likables" factors in the transaction at issue),
9. insofar as the features of the Transaction A and B are well matched,
10. while nothing else changes."
D. The Testing
Towards testing the predictivity of this inclusive likabilities model an empirical survey
and a logical argument will be reviewed here.
Empirically, we report briefly on a part of a nationwide poll by mail. This poll compared
the correlations from the nine questions in the likability model (in less-developed form) with
correlations from 162 rival scale questions (Ref. 26). The Transaction B to be predicted was the
respondents' degree of support or opposition to a national voluntary membership organization
dedicated to one side of a highly controversial issue in the U.S. Jewish community. The 4-point
index of this supportive behavior, B, was the organization's records for the 1,364 respondents,
classifying them into "paid-up members," "lapsed members," "non-members possibly
'susceptible' for recruiting," and "resigned ex-members."
The mailing went to a random sample of a universe defined by full lists of persons
supplied by the organization as in the four categories above.
The questionnaire measuring the likability questions was substantially the nine
"mode-and-tense" questions shown here as Exhibit B.
The comparison of the nine likes ratings with the other 162 "Comparison" ratings was
stacked against the former to make the test more convincing if the likes questions showed
superior predictivity as the likes hypotheses expected. The findings are shown in Table 4.
Table 4:
Criterial Correlations from Modal vs. Non-Modal Ratings (Ref. 21)
Predictor sets of variables                                          r
9 modes questions                                                    0.78
9 randomly-selected non-modal scales (of 10 questions each)          0.75
9 most predictive non-modal scales (of 10 questions each)            0.75
9 least predictive non-modal scales (of 10 questions each)           0.47
17 Thurstone and Likert non-modal scales (of 10 questions each)      0.75
The nine single questions from the likes model out-correlated each of four sets of nine
scales, each scale comprising some ten questions. Clearly and against odds the nine likes
questions out-correlated a large field of 162 comparison questions, even when the latter were optimally
organized in scales with optimal regression weightings.
This poll:
1. involved phrasing largely in likes terms;
2. involved rough statements of amounts of things-liked with give/get tensions;
3. involved 162 largely alternative context variables;
4. involved a high degree of matching to the criterion behavior. (Hyp. #9)
In trying to predict the transaction called "supporting organization X" we postulate that
predictive correlations will rise according as:
1. the predictive acts match the acts in the outcome;
2. the date of the predictor matches the date of the outcome; i.e. as the time-interval
shrinks;
3. the predictor population observed matches the predictand population;
4. the same four subsets of the population are observed in the poll as in the criterion;
5. mean behaviors are observed in both predictor and predictand;
6. etc. (Refs. 7, 8, 16, 20.)
The principle that matching all features of predictor and the predictand improves their
correlation is necessarily true at the limit where they become one and the same.
In short, the prediction-by-matching hypothesis seems useful but always in need of better
specifying, to avoid superficial matching and include more comprehensive matching of all
features and deeper functional, even cause-effect, matching.
E. The Applying
The likabilities model is in such constant daily use everywhere in informal pieces that a
formal list would far exceed the bounds of an article. We merely note that any aptitude tests for
a job, or college admission, or marital prospects, etc. seek to match and measure whatever is
most like the predicted behavior. Inquiries as to an applicant's experience and skills and
interests try to match what he has been likely to do and has liked with what is most like his
future role. Unconsciously, do not most people, when forecasting their own or other persons'
probable acts in the future, try in a common-sense way to base the forecast on as closely
matched past behavior in as closely matched contexts as they can?
F. The Systematizing
The likability dimensional formula can now be written as a special case of our full
transact formula. (Refs. 13, 14)
This means that the likability models can be exactly viewed, in both their symbolic and
their referent behavioral features, as a subsystem within the
larger system of human behavior. This transactive subsystem consisting of the measured and
recorded acts of men is offered as a useful analyzing and resynthesizing technique for both the
behavioral scientist interested in rigorizing laws and the psychologist interested in people and
what they want most.
Appendix A: The Likes-Rating Scales
Appendix B: Ratings of Organizations
After each opinion (A to L) below, please encircle the one rating (1 to 7) which best expresses
your own degree of that opinion about Organization Y.
A. "I like the GOALS of Organization Y"
B. "I like the ACTIVITIES of Organization Y"
C. "I like the ACHIEVEMENTS of Organization Y"
D. "As far as I know, the GOALS of Organization Y fill the needs of its clientele"
E. "As far as I know, the ACTIVITIES of Organization Y are appropriate to its goals"
F. "As far as I know, Organization Y has been ACHIEVING its goals well"
G. "In some past years I HAVE SUPPORTED Organization Y with money or membership"
H. "At present I AM SUPPORTING Organization Y with money or membership"
I. "Next year I INTEND TO SUPPORT Organization Y with money or membership"
J. "I LIKE THE REPUTATION among people like me from supporting Organization Y"
K. "As far as I know, supporting Y IMPROVES ONE'S REPUTATION among people like me"
L. "I try to BUILD UP THE REPUTATION among people like me for supporters of Organization Y"
[Rating grid: each opinion, A to L, is followed by the ratings 1 2 3 4 5 6 7 0.]
Rating labels, in the order given in the original: I have no opinion; I agree very much;
I agree quite a lot; I agree slightly; I am indifferent; I disagree slightly;
I disagree quite a lot; I disagree very much.
Column headings: Ratings 1-7; Opinions, A – L, About Organization Y.
SD:63-84
Appendix C: Logistic Diffusion When Clique Size Varies
[Figure panels: "Baseline" Experiment; Simulated Experiment]
The Author's Bibliography on Values
A list of the author’s references cited here leads up to these summarizing Likability Models.
Title
1. Dimensions of Society, Macmillan, 1942, 944 pp.
2. Systematic Social Science, American University of Beirut, Social Science Series, No. 16,
University of Washington Bookstore, Seattle, 1947, 788 pp.
3. "Progress Inductively Defined," The International Journal of Ethics, Vol. XLIV, No. 3, April
1934.
4. "A Social Distance Test in the Near East" The American Journal of Sociology, Vol. XLI, No.
2, September 1935.
5. "A Tension Theory of Societal Action" American Sociological Review, Vol. IV, No. 1,
February 1939.
6. "Of What Use is Dimensional Sociology?" "A Report of Further Research upon the Utility,
Precision and Parsimony of Dimensional Analysis," Social Forces, Vol. 22, No. 2,
December 1943.
7. "A Verifiable Hypothesis of Human Tensions," International Journal of Opinion and Attitude
Research, Vol. IV, No. 1, Spring, 1950.
8. "How to Measure Values," Proceedings of the Pacific Sociological Society, Research
Studies of the State College of Washington, Vol. XIII, 1950.
9. "On Classifying Human Values," American Sociological Review, Vol. 16, No. 5, October
1951.
10. "Historic Ideals Operationally Defined," Public Opinion Quarterly, Vol. 15, No. 3, Fall, 1951.
11. "A Statement of Human Wants," Educational Theory, Vol. III, No. 2, April 1953.
12. "Symbolizing the Values of Others," with William R. Catton, Jr., Chapter XXXIV, Symbols
and Values: An Initial Study, Conference on Science, Philosophy, and Religion, Harpers,
1954.
13. "A Dimensional System of Human Values," with Chahin Turabian, Transactions Second
World Congress of Sociology, International Sociological Association, 1954, pp. 100-105.
14. "The Transact Model — a predictive and testable theory of social action, interaction and
role action," Sociometry, Vol. XVIII, No. 4, December 1955.
15. "A Predictive Theory of Opinion — using nine 'mode-and-tense' factors," Public Opinion
Quarterly, Vol. XX, No. 3, Fall, 1956, 23 pp.
16. "Conditions for Motivating Men — the valuance theory for motivating behaviors in any
culture " Journal of Personality, Vol. 25, No. 4, June 1957.
17. "Strengthening Technical Aid by Social Research," PROD, Vol. II, No. 2, November 1958.
18. "The Reiteration Rule — a cyclic system for syntax, neurograms, and all laws," Synthese,
Vol. XI, No. 1, March 1959.
19. "Can Science Improve Praying?" Darshana, Vol. I, No. October 1961.
20. "The Logistic Law of Interaction When People Pair Off 'At Will'," with Garbedian, P.G.,
Journal of Social Psychology, 1961, No. 53.
21. "Ascertaining National Goals: Project Aimscales," American Behavioral Scientist, Vol. IV,
No. 7, March 1959.
22. "The Logistic Law in Communication" with McCurtain, M. Symposia Studies Series No. 8
of the National Institute of Social and Behavioral Science, Washington, D.C., September
1961, pp. 9.
23. "The Concord Index for Social Influence," with Klein, Louise B., Pacific Sociological
Review, Vol. V, No. 1, Spring, 1962, pp.
24. "Like Ratings in the Prediction of Human Behavior," with Louise B. Klein, Language and
Speech, Vol. IV, part 2, April-June, 1962, pp. 54-66.
25. "Clique Size as a Factor in Message Diffusion" with Garbedian, P.G., Sociological Inquiry ,
Vol. XXXII, No. 1, Winter, 1962, pp. 71-81.
26. "A Testing of the Modes, Theory," with Anderson, Ronald, Pacific Sociological Review, Vol.
8, Spring 1965, pp. 23-34.
27. "Two Consensus Forming Experiments: Demonstrating Transactional Models for
Modernizing," Transactions of the International Conference on the Problems of
Modernization in Asia under the auspices of the Asiatic Research Center, Korea University,
Seoul Korea,1966.
28. "'Scient-scales' for Measuring Methodology — and Rating Scientific Excellence of
Research Behavior", American Behavioral Scientist, June 1966, Vol. IX, No. 10.
29. "Use Scientific Methods in Planning", Arab Journal, Vol. IV, No. 1, Winter 1967.
30. "Hypotheses Defining Scientific Human ", The Humanist, accepted for Fall 1967
31. "Logistic Diffusion in Randomly Overlapped Cliques" with McCurtain, M., 20 pp. (to
appear).
32. "How Momental Laws Can Be Developed in Sociology," Synthese, Vol. XIV, No. 4,
Figure 1
Conformity Distributions
Conformity = Every person in the standard or norm class interval (underlined) which is
expected by the mores. Cfy = 100%
Nonconformity = Every person furthest from the norm class interval (underlined) which is
expected by the mores. Cfy = 0%
* It makes no difference whether many or no categories intervene in a two-category
distribution.
Notes
1. At the limit of complete conformity, when all deviations from the norm are zero, beta
sub-two becomes indeterminate, needing evaluation. Aside from the mathematics of this case,
its computation gives no trouble since conformity is evidently maximal and can be so
recorded on mere inspection.
2. The mesokurtic and platykurtic percentages will vary somewhat, depending on the shape of
the distribution, the number of class intervals, and the location of the norm between the
center and one end. Thus the mesokurtic normal probability curve with β2 = 3 has an index
of conformity of 67% when the norm is at the mode. For another example, the platykurtic
rectangular distribution of five class intervals illustrated in Fig. 1 has a conformity index of
50% when the norm is at one end and 41% when the norm is the middle class interval. Cfy
measures kurtosis strictly only when the norm is the mean.
3. Peters, C. and Van Voorhis, W.R. Statistical Procedures and Their Mathematical Bases.
New York: McGraw-Hill, 1940, pp. 82-84.
#29. A Tension Theory of Societal Action
American Sociological Review, Vol. IV, No. 1, February 1939, pp 56-77
I. The General Equation Measuring Societal Tensions
A. Definitions
The first assumption in this theory is that behavior is determined by the principle that
"People desire objects of value," meaning that people continually want objects of psychic and
physiological satisfaction and therefore behave so as to increase or retain such objects. Let
the term "desideratum" designate an object of value, i.e., anything desired by people, since
"value" has many confusing connotations. To make this platitude useful, as in an equation from
which we can solve for unknowns, let P be the number of people desiring a specified
desideratum, let D be their average intensity of desire for it,1 and let V denote the available
quantity of that desideratum. Thus, twenty athletic competitors may desire one of three prizes
with an average intensity of desire expressible in standard deviation units on some
behavioristic scale, such as the number of hours each will spend in training for that contest, or
on an attitude scale of endorsing such statements as, "I would rather win this prize than
graduate." (P = 20, D = xσ, V= 3.) Towards defining these three basic concepts, note that a
positive desideratum is anything towards which people behave so as to increase it or retain it.
A negative desideratum is that towards which they behave so as to decrease it or avoid it.
Desiderata may be economic goods and services, or a political office, a mate, graduation,
prestige, "a clear conscience," or any other tangible or intangible object of conscious or
unconscious desire. Such objects are desiderata strictly relative to specified persons at
specified times. While the object may be unchanged, the desire for it, the evaluation of it, may
vary, of course, between persons and dates. This variable intensity of desire is very difficult to
observe objectively, but, with ingenuity in research, attitudinal and behavioristic indicators of
desire are being progressively developed.2 The concept of "desire" should be interpreted
broadly so as to include all synonyms of wishing, wanting, appetites, urges, attitudes, interests,
or any internal states of the organism which, combined with stimulus situations, result in
behavior tending to increase or decrease the desideratum in that person's experience. Now let
the ratio of total desire of a population to the quantity of the desideratum available for satisfying
it define a fourth concept denoted by the letter E, thus:
PD/V = E    (Definition of "tension," E).    Equation (4)
This dependent variable, E, is an equilibration ratio relating the three observed
quantities in a balanced equation as more clearly seen (after multiplying both sides by V) in the
form:
PD = VE (The simple tension theory).
Equation (5)
The equilibration ratio may be thought of as measuring the degree of tension in a
population with respect to the specified desideratum. If the desire greatly exceeds the quantity
of the desideratum, societal tension is high; while if the desideratum becomes abundant
relative to the desire, tension decreases. At this point, our initial assumption enters; it asserts
that societal behavior or actions are determined by such societal tensions. This is a hypothesis
upon the degree of truth of which evidence is to be gathered. Equation (5) is not a hypothesis;
it is a self-consistent definition. In it, desiderata are defined as what is desired and tension is
defined as the resulting ratio. Therefore Equation (5) is true by definition. The open question is
whether Equation (5) is useful in describing and predicting behavior, or, more exactly, whether
tension, as defined by E, is a necessary and sufficient cause of societal action.
B. Numerical Illustrations
To explore the possible utility of this formulation in Equation (5), as well as to clarify its
meaning, let us consider some examples from various social disciplines.
1. A political example
In a democracy, each voter's electoral desire is arbitrarily considered to be a unit, equal
to any other voter's desire. This is a rough weighting adopted for practical reasons when the
privileges of nobility were given up and no other better political units for measuring individual
differences of desire seemed compatible with the widely prevalent philosophy of equality.
Thus, in a situation where a million voters elect members to 20 seats in a Congress, by
applying Equation (4)
P = 1,000,000, D = 1, V = 20, E = 50,000,
we find that the worth of each seat, or the societal tension towards it, is 50,000 of these
vote-units. The worth of an office is thus proportional to its electorate, i.e., the mayorship of New
York outweighs that of a small town; the presidency is more striven for than a governorship.
For the relative tensions towards the various candidates for one office, the equation is applied
to each of the candidates A to N, as the desideratum, which results again in an E directly
proportional to the votes cast for that candidate.
Here, D_A = 1, V_A = 1, ∴ E_A = P_A
D_B = 1, V_B = 1, ∴ E_B = P_B
D_N = 1, V_N = 1, ∴ E_N = P_N
Thus, in a democracy, elections are a crude but simple device to measure and apply equation
(5). (The equation is extendible to cases of nonvoters, proportional voting techniques,
restricted franchise, and indirect voting, as in the Electoral College, etc.)
2. An educational example
To determine the relative tension towards graduating in two colleges, the intensity of
desire for the degree might be measured by various units such as years of collegiate effort
sufficient to avoid failing out, or money spent, or an attitude test score, or the discrepancy
between marks achieved and marks predicted by the multiple regression weighting of college
entrance criteria. Suppose that one college weeded out its students earlier than another so
that the average number of years of effort put in by all entering students is 3 years and 1.8
years respectively. If, with 500 entering Freshmen each, College P customarily eliminated 50
percent before graduation and College Q, 40 percent, then the number of degrees, the
probable quantity of the desideratum, would be 250 and 300 respectively. The tensions then
would be:
E_P = 500 × 3/250 = 6
E_Q = 500 × 1.8/300 = 3
showing a higher tension (in these units) towards a degree in the first college because
of its higher probability of failures and because of its more prolonged struggle among the
competitors for academic survival. This tension varies inversely with the probability of
graduating (measured by V/P) and directly with the duration of effort (taken as a measure of D,
in this case). By hypothesis, this tension motivates study, so that E is a summarizing measure
of the causes of behavior directed towards securing the college degree.
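The arithmetic of Equation (4) in these illustrations can be sketched in a few lines. The function and variable names are hypothetical; the figures are those given in the political and educational examples above.

```python
def tension(people, desire, supply):
    """Tension E = P * D / V, following Equation (4)."""
    if supply <= 0:
        raise ValueError("supply V must be positive")
    return people * desire / supply


# Political example: a million voters, unit desire, 20 seats.
seat_tension = tension(1_000_000, 1, 20)

# Educational example: 500 entrants each; average effort of 3 and 1.8 years;
# 250 and 300 degrees awarded.
college_p = tension(500, 3, 250)
college_q = tension(500, 1.8, 300)

print(f"tension per seat:   {seat_tension:,.0f} vote-units")
print(f"tension, College P: {college_p:.1f}")
print(f"tension, College Q: {college_q:.1f}")
```

Keeping P, D, and V as separate arguments makes it easy to vary one while holding the others fixed, which is how the examples reason about rising or falling tension.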
3. A biological example
In the Malthusian theory, a population, P in number, has a food supply of amount V.
They may be conceived as having various intensities of desire to eat and survive and,
therefore, an average desire D (in whatever units measured). The tension, the severity of the
struggle to survive will increase:
(a) with increased mouths to feed,
(b) with decreased food supply, or
(c) with increased desires, i. e., psychic standards of living.
Given Malthus' assumption of population increase outstripping food increase, Equation
(5) justifies his conclusion that competitive tension tends to increase by misery (poverty,
famine, etc.), which both checks the population increase and lowers the psychic standards of
living. But given the new inventions of contraceptives reducing the birth rates and agricultural
technology (which has increased the food supply faster than the population increase in the
past century), it could have been predicted from Equation (5) that one or both of two results
would follow:
(a) tension would be eased, i.e., the struggle to survive would become less rigorous; or
(b) desires would expand, i.e., people would want higher standards of living.
These conclusions are not new, but the tool for analysis provided by Equation (5) may
assist the student to reason out consequences with greater precision. Superficial analysis, for
example, might easily jump to either conclusion (a) or (b) and ignore the possibility, compelled
by Equation (5), of both results occurring. Another utility of an equation is to challenge
research to measure the phenomena its letters symbolize with ever increasing precision, in
order that the unknown quantities (such as E) and the relationship revealed in derived
equations may become more accurately known.
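The possibility of both results occurring together can be sketched numerically, assuming Equation (5) has the form PD = VE used in the earlier examples. The growth factors below are hypothetical and chosen only for illustration; the function names are not the author's.

```python
def tension_ratio(p_growth, d_growth, v_growth):
    """Factor by which E = P * D / V changes when P, D, and V
    are each multiplied by the given growth factor."""
    return p_growth * d_growth / v_growth


def desire_growth_at_constant_tension(p_growth, v_growth):
    """Growth factor of D that leaves the tension E unchanged."""
    return v_growth / p_growth


# Hypothetical century: population grows 1.5-fold, food supply 2-fold.
p_growth, v_growth = 1.5, 2.0

# Result (a) alone: desires unchanged, so the tension eases.
eased = tension_ratio(p_growth, 1.0, v_growth)
# Result (b) alone: tension unchanged, so desires may expand.
expanded = desire_growth_at_constant_tension(p_growth, v_growth)
# Both together: desires rise by 20 percent and the tension still falls.
both = tension_ratio(p_growth, 1.2, v_growth)

print(eased, round(expanded, 3), round(both, 3))
```

Any growth of desire below the factor `expanded` leaves room for some easing of tension as well, which is the mixed outcome the equation allows.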
4. An economic example
In economics, our PD is total demand, V is supply, and E is the economic value which,
when related to monetary units, is the price. The price goes up with increased demand or
reduced supply and vice versa. This is the elementary principle which of course is qualified by
refinements, such as the volume and velocity of circulation of money and credit, which also
affect the price, and by monopolistic conditions, substitute goods meeting a given demand, etc.
Value theory is highly developed in mathematical economics in the marginal utility theory of
value, the cost of production theory, the displacement cost theory, and other theories of
economic value. Without going into detailed exploration of these, it simply may be noted that
the present tension theory, which includes more than economic types of values, seems
compatible with these as a special case. The present theory is also extendible to qualitative
desiderata, to negative desiderata, and to desiderata unlimited in amount which transcend
scarcity economics and deal with an "economy of abundance."
5. A philosophic example
The population increases naturally by births, and desires tend to increase naturally in
creatures possessing intelligence and not rigidly grooved by instincts. Therefore, tension tends
to increase since the numerator of (4) tends to increase spontaneously, and the denominator,
the desideratum, only increases in general as a result of human activity induced by mounting
tension. In this sense, the goal of societal activity is the easing of tension4 to bring desires and
satisfactions into equilibrium.
$PD \doteq V$ or $E \doteq 1$ (The goal of action).
Equation (6)
Three alternative philosophies of life have been offered in history to realize (6). One
philosophy for reducing tension is exemplified by Nietzsche's advocacy of ruthless elimination
of the less fit, reducing the population to supermen, fewer in number. A second solution is that
of Buddhist renunciation of desire as illusion until the saints ultimately reach Nirvana, the
blessed state of absence of all desire. A third philosophy is that of the go-getter who strives
mightily to increase the quantity of the desideratum available. By invention and machine
production and collective efficiency, he works to get more of whatever is desired. These three
solutions, singly or in combination, would seem to be the only possibilities of reducing the
tensions of life: increase the production of desiderata, decrease desires for them, or decrease
the number of sharers for limited desiderata.5
C. The Cases of Qualitative, Negative, and Unlimited Desiderata.
A qualitative desideratum, undifferentiated into quantities of it or into degrees of any
kind, can be considered a unitary desideratum which is mathematically denoted by an
exponent of zero.
$V^0 = 1$ (A qualitative unitary desideratum.)
Equation (8)
Thus, in nationalism, the desideratum may be the complex but unitary quality "national
aggrandizement." Then, substituting (8) into (4) yields the conclusion that
E = PD (Tension for a unitary desideratum).
Equation (9)
This states that the societal tension of nationalism is equal to the number of persons in
the nation times their average intensity of desire for national aggrandizement.6 For such
unitary desiderata, tension becomes identical with the total intensity of desire of the population.
Another special case is that of negative desiderata, or aversions. The object of value is not
necessarily negative since it may exist in amount greater than zero. The desire varies from
positive through neutral, or zero, to negative degrees, while often the object of value may
remain unchanged. A person may shift from love to hatred of some object, the amount of
qualitative existence of which may be a constant. These aversive situations are simply
symbolized by minus signs before the D, denoting aversion, displeasure, disapproval,
withdrawal, or in general, "negative" desire. This is balanced by a negative E denoting tension
away from the undesired desideratum.
P(-D) = V(-E) (The case of a "negative" desire).
Equation (10)
Care must be taken in manipulating this equation, since psychological causal
connections exist between the factors limiting their mathematical degrees of freedom. Thus, if
a negative desideratum, such as some danger, increases, the tension will not decrease
compensatingly as it might for a positive desideratum, because the negative desire, the
aversion, will increase as a response to the stimulus of danger, resulting usually in actual
increase of tension.
A third special case is that of desiderata whose quantity is unlimited. Free
goods of economics such as the air we breathe, friendship, communion with God through
prayer, beliefs, are some examples of unlimited desiderata. Although potentially unlimited,
actually each person uses or experiences a finite amount which may be termed a "share." As
the shares, in whatever imaginable units expressed, vary in size between individuals, the
average share may be conceptually taken as the unit of V. The number of these average
shares is the quantity of the desideratum realized. Thus,
$V/P = 1$ and $V = P$, and $\therefore E = D$ (The case of an unlimited desideratum expressed in
units of the average "share").
Equation (11)
This states that for an unlimited desideratum, the number of individual shares of which
equals the number of sharers, the tension is the average intensity of desire for that object of
value. Thus, the tension toward the desideratum "communion with God" is proportional to the
average desire of a population for that desideratum and is not limited by the amount available. 7
For another example, the value, "friendships," is unlimited in potential quantity available. The
societal tension or urge towards friendships, then, is seen from (11) to be equal to the average
desire of that population for friendships. Alternatively, the tension towards the unitary value
"friendship" ($V^0$) is seen from (9) to be equal to the total desire of that population for friendship.
Thus, (9) and (11) are consistent since the collective noun "friendship" means the sum of
"friendships."
D. Systems of Values
To describe societal situations more adequately the tension theory defined by the
simple equation (5) must be extended to a matrix equation. It is obvious:
(a) that desiderata may occur in interrelated sets;
(b) that their evaluations, i.e., corresponding desires, may vary with different plurals; and
(c) that all these resulting tensions may vary with time.
These facts may be tabulated using v columns for the different desiderata, v in number;
using p rows for the p different plurals that may be desiring each value; and using t pages for
the t dates or periods at which observations are made. The letters v, p, and t, attached as
scripts to the factors in the value equation will denote such a tabulation. 8
The tension equation (5) expanded by these scripts into such a tabulation becomes a
matrix equation and may be symbolized by
${}^{t}_{p}(PD = V^{e}E)_{v}$ (The matrix equation for a system of values).
Equation (12)
This equation may be verbalized as: "For each of v desiderata at each of t dates there is
a societal tension determined by the ratio of the total intensity of desire of the p plurals to the
qualitative or the quantitative desideratum desired." For qualitative desiderata, the exponent, e,
is zero; for the quantitative desiderata, e = 1.
Two examples of this matrix equation may be
suggestive. A familiar example of a system of values is the process of making a budget
whether personal, institutional, or national. Here the various items of the expenditure-half of
the budget, are the v different desiderata, the quantity of each of which is measurable in
dollars (or alternatively in percentages of a limited total of dollars). The aim of the budget
making is to equalize the tension towards each of the desiderata. This may be represented in
the chain equation:
$E_1 = E_2 = E_3 = \dots = E_v$
or
$D_1/V_1 = D_2/V_2 = D_3/V_3 = \dots = D_v/V_v$
(where P cancels out, since it is a constant throughout).
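The equilibrium described by the chain equation can be sketched as follows. With a fixed total of dollars, equal ratios D/V are obtained when each item receives dollars in proportion to the desire for it. The item names, desire intensities, and the function `equalize_budget` are hypothetical illustrations, not part of the original paper.

```python
def equalize_budget(desires, total_dollars):
    """Allocate a fixed total so that every ratio D_i / V_i is equal.

    Equal ratios with a fixed total imply V_i = total * D_i / sum(D).
    """
    total_desire = sum(desires.values())
    if total_desire <= 0:
        raise ValueError("total desire must be positive")
    return {
        item: total_dollars * desire / total_desire
        for item, desire in desires.items()
    }


# Hypothetical intensities of desire, in arbitrary units.
desires = {"housing": 5.0, "food": 3.0, "books": 2.0}
allocation = equalize_budget(desires, 1000.0)

# After allocation, every budget item shows the same tension ratio D/V.
ratios = {item: desires[item] / allocation[item] for item in desires}
print(allocation)
print(ratios)
```

In practice the budget maker reaches this state by successive reallocations rather than by one calculation; the sketch shows only the end state toward which those adjustments tend.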
As soon as the budget maker realizes that the intensity of desire (such as D3) for one
desideratum is so large in comparison with the dollars allocated to it (V3) as to make that ratio
exceed the others, this excess is reduced by taking dollars from desiderata showing subaverage ratios and reallocating these dollars to V3 till D3/V3 is equalized with the other ratios.
The process of determining D usually is crude. In the case of one person, he may not verbalize
D in any units but merely have a total feeling that V3 ought to get more dollars and that some
other item (Vx) can be pruned a bit. In the case of a group, speakers will voice a desire, a
canvass may roughly gauge its extent, or a majority vote on an amendment will formally show
it. These and other techniques roughly tend to actions equalizing the tensions for the various
budgeted items, i.e., equilibrating the system.
For a second example of the matrix equation, in
the Russian Five-Year Plan a schedule was made up tabulating five annual values such as
coal and wheat production, kilowatts from waterpower, etc., for each district (i.e., plural) of the
U.S.S.R. If the intensities of desire of each plural for each value had been measured in some
common unit such as man-days of labor devoted to producing it, the societal tensions for each
district as planned, and for the whole system in Russia (12), would be determinable. Now a
similar matrix could be tabulated for the amounts not as planned but as achieved. The matrix
may be summarized into single indexes to facilitate comparison or may be left as detailed
tabulations. The discrepancy between prediction and achievement is definable as a measure
of societal fulfillment. The more closely the achievement corresponds to the plan the greater
the degree of fulfillment. The degree of fulfillment is the percentage to which the plan is
achieved.
This tension theory of societal action, briefly outlined here, needs qualifying in several
respects, especially as to the mere difficulty of measuring the symbolized quantities; but more
important is the question of its utility when and if it is applicable at all. The next part of this paper
explores this utility further by showing how its derived equations can define some twenty
societal processes and reduce some of them to exact measurement, e.g., "competition" and
"accommodation."
II. The Effective Societal Processes Derived From the Tension Theory
The proposals towards defining and measuring the societal processes discussed below
are all derivable from the fundamental matrix equation (12) defining the tension theory of
societal action. Although in presenting these proposals, the deductive approach may seem
dominant (for clarity in exposition), actually their development was an inductive one. We set
out to develop a theory of societal processes which would provide criteria for reconciling the
highly diverse lists of "social processes" presented by different sociologists. Such criteria
should define each process in objectively observable terms and relate them into an orderly
system so as to show which processes are compounds of more elementary ones, which are
subclasses of more general subclasses, which are verbal synonyms masquerading as denoting separate content, etc.
The first step was to comb the literature and collect definitions and examples of all the
alleged social processes. Analysis of these revealed four highly general concepts running
through them. One concept, time, was universal; all authors define a process as something
going on in time. A second concept, a population, was implicit at least in all, as otherwise the
processes would be "natural" or mechanical, or at least nonsocietal processes. A third concept
denoting, in spite of many synonyms, a goal, a purpose, a motivation, a desire for something, a
cause of human action, seemed summarizable in our twin concepts of a desideratum as the
object of the desire and the more subjective "intensity of the desire" for that desideratum. The
possibility of relating these concepts of "population" (P), "desideratum" (V), and "desire" (D), as
in equation (4) next emerged. In applying this to the analysis of societal situations, the matrix
scripts specifying and allowing for variation between plurals, and between different desiderata,
as well as for variation in time, were found as summarized in (12).
A. First Order Processes
Four general processes emerge from this analysis defined by the increasing or
decreasing of each of the four factors P, D, V, and E. Confining our consideration at first to the
situations in which these factors can be clearly defined, let us consider each in turn and the
sub-processes each yields, as the particular plurals and desiderata they denote may vary.
When the population defined by a specified desideratum, or system of desiderata,
increases or decreases, there are numerous names for this two-way process, depending on the plural, such
as:
conscripting and discharging, as in an army;
hiring and firing, as in a factory;
enrolling and resigning, as in a society;
immigrating and emigrating, as in a country;
getting born and dying, as in a region;
baptizing and excommunicating, as in a church;
matriculating and graduating, as in a college; etc., etc.
Let the term "populating"9 denote these in any form, and let them be symbolized by:
$+{}_{T}P$ = the adpopulating process    Equation (13a)
$-{}_{T}P$ = the depopulating process    Equation (13b)
$\pm{}_{T}P$ = the populating process    Equation (13)
where the presubscript T denotes the period of time and the P with that subscript denotes the
numerical increase or decrease of the population in that period. The most important subtype of
depopulating is when it has the further characteristic of involving force, since this may be used
to define "conflict." In conflict, the opposed plurals of a population who desire a limited
desideratum carry their striving for it to the point of trying to eliminate the opponent. The
primitive case of conflict is that of the caveman, who, finding his competitor for game so
successful as to result in starving him, turns upon his competitor and tries to kill him. When
competition for national aggrandizement becomes too acute, it breaks over into war in the
attempt to eliminate, or reduce, the effective military population of the opposing nations.
Competition in sport turns into conflict when, instead of struggling for exclusive possession of
the desideratum desired by both parties, namely, "victory," the players start slugging and trying
to cripple each other so as to eliminate opponents from activity in the realm of that
desideratum. As means to the end of getting more of the desideratum, each party develops a
supplementary desire to decrease the population of the opponent. In conflict, the population
tends to become decreased, regardless of which party suffers most in the mutual attempt to
exterminate each other (as far as that particular arena of conflict is concerned). For a rigorous
definition to hold, this means that whenever conflict exists, (13b) tends to be true, and
conversely, whenever (13b) exists accompanied by force, conflict exists. "Tends to be" is
inserted since, obviously, actual slaughter may take time, as when a country mobilizes or a
Kentucky feudsman stalks his enemy for months. Also, obviously people may die from disease
and accidents making (13b) true, but here there is conflict with germs or physical forces as the
opponents. In this theory, however, we are concerned only with death, or elimination from
effective opposition in some value field, which is caused by other people, for only this is
societal conflict.
The second factor of desire in the tension equation may be used to define the
paired process of "valuating," comprising "evaluating" and "devaluating," according as the
desire of the population for a specified desideratum is increasing or decreasing:
$+{}_{T}D$ = the evaluating process    Equation (14a)
$-{}_{T}D$ = the devaluating process    Equation (14b)
$\pm{}_{T}D$ = the valuating process    Equation (14)
A major subtype of devaluating, for example, is the familiar process of
"accommodating." In accommodating to each other, the parties curtail their mutually exclusive
desires for the limited supply of the desideratum so as to get along together without
competition or conflict. The intensity of the desire, D, is decreased. People realize that new
desiderata such as "living together peaceably" are more important. Some of the desire has
been transferred from the former desideratum, V1, to a new cooperative desideratum, V2
(which is in the realm described by another equation). If the decrement of desire of all parties
is such as to make their desires equal, we may have coordinate statuses resulting. If the
decrements yield unequal D's, we should expect to find superordinate-subordinate statuses
resulting. The ratios among the desires may be a measure of the steepness of ordination. (It is
relatable to a "social distance" margin.)
Changes in the third factor of desiderata in the
Equation (12) are what is ordinarily connoted by "progress" or "regress," because "progress" to
a people is an increase of whatever they consider desirable. For a term meaning a change in
either direction, the prefixes meaning "forward" and "backward" may be dropped in the Latin
root which means "a step," giving the English participle "grading." This word, fortunately,
connotes an evaluation.
$+{}_{T}V$ = progressing    Equation (15a)
$-{}_{T}V$ = regressing    Equation (15b)
$\pm{}_{T}V$ = grading    Equation (15)
Two subtypes of this process distinguish whether the increase of the desideratum
desired is due to human effort or to other causes. If due to human interaction, it is what
commonly is meant by cooperating, the working together towards a common goal. If not due to
human effort, it may be termed "accumulating," as in the growth of forests or increased
abundance of a fishing ground. The opposite process from cooperating would be maloperating or "destroying" a desideratum, as in working together to plow under part of a crop.
Two charity agencies may cooperate by pooling records, etc., in order to reduce waste and
give more effective relief, which is the desideratum V. Our business system in the United
States, A.D. 1939, though called "competitive," is only partially so. Along with the competitive
attempt to get business away from the other fellow, this fear of losing, as well as desire to gain,
is a strong incentive to cooperating as here defined, i.e., to produce more desiderata for all.
Thus, automobile manufacturers are partly competing, partly differentiating in offering diverse
cars, partly accommodating in collective agreements, and partly cooperating in producing more
objects of value (automobiles) for the public to enjoy. Thus, the three conventional processes
of "conflict," "accommodation," and "cooperation" emerge as subtypes of three processes
which are all on the one continuum E, the societal tension. Any of these three processes being
sub-forms of depopulating, devaluating, and progressing, will decrease the tension. The
opposite processes to these three are the three possible forms in which tension can be
increased. Thus, internationally, increase of population, or of imperialist ambitions, or decrease
(or lag of growth) in possessions creates tensions resolvable only by conflict (as in war), by
accommodation (as in mutual compromises), or by cooperating (as in international
organization). For the case of the individual, where P = 1, the alternative ways of reducing his
tension are only two: conditioning him emotionally so that he desires less, or operating on his
environment so that more of the desideratum comes to him.
$+{}_{T}E = (P + {}_{T}P)(D + {}_{T}D)/(V - {}_{T}V) - E$ = the attensing process    Equation (16a)
$-{}_{T}E = (P - {}_{T}P)(D - {}_{T}D)/(V + {}_{T}V) - E$ = the detensing process    Equation (16b)
$\pm{}_{T}E$ = the tensing process    Equation (16)
The change in tension, TE, is the difference between the new tension defined by
Equation (7) and the old tension, E. Tensing is thus seen to be a compound process
composed of particular forms of the three elementary processes. Mathematically, all four
processes defined by (13) to (16) are derivable from the matrix equation of the tension theory
(12) by isolating one factor on one side of the equation for one desideratum and expanding it
for at least two dates. These definitional equations hold whether the population consists of one
plural (as assumed in (13) to (16)) or of many plurals. Since in all four processes the exponent
of the factor isolated is unity, these processes may be called first order processes. The scripts
in (12) therefore are the following:
$t > 1,\ p \geq 1,\ v = 1,\ e = 1$
Equation (17)
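The four first order processes defined in (13) to (16) can be sketched as a comparison of the factors at two dates. This is a minimal illustration for one plural and one desideratum, assuming E = PD/V; the function name, the data layout, and the sample figures are hypothetical.

```python
# Names of the increasing and decreasing form of each process.
PROCESS_NAMES = {
    "P": ("adpopulating", "depopulating"),
    "D": ("evaluating", "devaluating"),
    "V": ("progressing", "regressing"),
    "E": ("attensing", "detensing"),
}


def first_order_processes(before, after):
    """Name the first order process shown by each factor between two dates.

    `before` and `after` are (P, D, V) tuples for one plural and one
    desideratum; the tension E is derived as P * D / V.
    """
    p0, d0, v0 = before
    p1, d1, v1 = after
    changes = {
        "P": p1 - p0,
        "D": d1 - d0,
        "V": v1 - v0,
        "E": p1 * d1 / v1 - p0 * d0 / v0,
    }
    result = {}
    for factor, change in changes.items():
        increase, decrease = PROCESS_NAMES[factor]
        if change > 0:
            name = increase
        elif change < 0:
            name = decrease
        else:
            name = "unchanged"
        result[factor] = (change, name)
    return result


# Hypothetical plural: ten members leave while the desideratum grows by ten units.
outcome = first_order_processes(before=(100, 2.0, 50), after=(90, 2.0, 60))
for factor, (change, name) in outcome.items():
    print(factor, change, name)
```

In this case depopulating and progressing occur together with desire unchanged, and the compound result is detensing, in line with the statement that tensing is composed of the three elementary processes.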
B. Second Order Processes
The next set of processes whose essential characteristics, induced from our survey of
the literature, fitted them into the categories deduced from Equation (12), are such standard
processes as "competition" and "mobility." Since in the indices which measure, and are here
proposed to define, these processes, the four factors P, D, V, and E, appear to the second
power as their squares, these processes are here classed as "Second Order Processes." The
outstanding characteristic of competing is that each party wants to get a larger share of the
limited desideratum, V, for which they compete. By as much as one competitor wins, the other
loses. Each customer gained by A is one less customer, whether potential or actual, for B.
Thus, Japan's textile trade gains in South America are at the expense of other competitors.
The victory of one football team is a defeat for the other. Each competitor strives for exclusive
possession, or at least a larger share, of the limited desideratum, thereby displacing other
competitors. Now this shift of some of the quantity of the desideratum from some competitors
to others offers a way to measure, and so to define, competing. Let us then define effective
competing during a period as the transferring of the desideratum from some competitors to
others. The standard deviation of these gains and losses seems the best measure to
summarize the extent of such transfers. (Note that the mean transfer is zero as gains must
equal losses by definition. Any net gain or loss for the whole population measures the
"progressing" or "regressing" process, as defined above that has taken place in addition to the
competing process.) To illustrate the measurement of effective competing, consider the simple
case of two competitors. In 1820, 90 percent of U.S. foreign trade was carried in U.S. vessels
but only 10 percent was so carried in 1900, the rest being in foreign vessels, almost all British.
                      Beginning of     End of          Gain or    Square of
                      Period (1820)    Period (1900)   Loss       Difference
1st competitor's V    90%              10%             -80        6400
2nd competitor's V    10%              90%             +80        6400
Sum                                                    0          12800
Mean change = 0; $\sigma_V = \sqrt{12800/2} = 80\%$ = percentage of maximal competing.
The index of effective competing between the American and foreign marines in these
eighty years was 80 percent.11 This is 80 percent of the maximum possible intensity of the
process of competing such as would exist when the initial monopoly of one competitor became
the terminal monopoly of the other yielding a σv of 100 percent. The minimum competing at the
other limit would be 0 percent, as when all competitors end up with the same relative shares
they started with, so that no transfer of the desideratum would have taken place.
This formula is general to any number of competitors, P, and a desideratum in any units
which can be re-expressed as each competitor's percentage of the total desideratum. Let V
denote a competitor's percentage. Let TV denote any change in that percentage, whether gain or
loss, in the period T. Then the index of effective competing is
$C_p = \sqrt{0.5 \sum_{1}^{P} ({}_{T}V)^{2}}$ (The index of effective competing).
Equation (18)
This index is a percentage measure of the actual amount of shifting of the desideratum
between competitors. The amount is taken as a percentage of the maximum possible amount
of shifting.
It is derived as a ratio of the standard deviation of the gains and losses of the
competitors to the standard deviation of the situation where the initial monopoly of one
competitor becomes the terminal monopoly of another.12
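The index of effective competing can be sketched in a few lines, assuming shares are expressed as percentages that sum to 100 at each date. The function name `effective_competing` is illustrative; the shipping figures are those of the example above.

```python
import math


def effective_competing(start_shares, end_shares):
    """Index of effective competing, in percent of the maximum possible.

    Computed as sqrt(0.5 * sum of squared changes in percentage share).
    """
    if len(start_shares) != len(end_shares):
        raise ValueError("each competitor needs a start and an end share")
    for shares in (start_shares, end_shares):
        if abs(sum(shares) - 100.0) > 1e-6:
            raise ValueError("shares must sum to 100 percent")
    changes = [end - start for start, end in zip(start_shares, end_shares)]
    return math.sqrt(0.5 * sum(change * change for change in changes))


# U.S. versus foreign vessels in U.S. foreign trade, 1820 to 1900.
shipping = effective_competing([90, 10], [10, 90])
# Upper limit: one competitor's monopoly passes wholly to another.
monopoly_swap = effective_competing([100, 0, 0], [0, 100, 0])
# Lower limit: every competitor keeps the share it started with.
no_transfer = effective_competing([60, 40], [60, 40])

print(shipping, monopoly_swap, no_transfer)
```

The factor 0.5 is what scales the index to 100 percent in the monopoly-transfer case, since the squared changes there sum to twice 100 squared.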
"Competing" as defined above is "effective competing" and must be distinguished from
"causative competing." It is measured (as are all the processes in this paper) by its effect. The
intensity of competing is measured by the gains and losses that result. "Causative competing"
is the effort put forth by competitors, which may or may not result in any effective competing,
since intense efforts of competitors may cancel each other and result in no gains or losses as
often happens in advertising. Causative competing, though highly important, is less tangible
and requires inventing special indices such as the cost and area of advertising space, etc.
Effective competing has the immense advantage for scientific purposes of being readily
measured, but the reader should clearly note that the effective competing (as with all
"effective" processes defined here) is only part of the phenomena of competition. In terms of
the basic tension equation, the causative competing would be measured by E, the tension,
because the tension increases, the competitive struggle becomes keener, when the number of
competitors increases or their desires go up or the desideratum becomes relatively scarcer.
The effective competing is defined as the redistributing of a given amount of a desideratum
between the competitors. It should thus be evident that our competitive capitalistic economic
system is only partially competitive. While each competitor is struggling to get as large a share
of the desideratum for himself as he can, this causative competing (i.e., motivation or tension,
E) makes him work to produce more of the desideratum which is our definition of progress in
its subform of effective cooperating. In the production of wealth (and other desiderata), its
redistribution is a byproduct, and a byproduct controlled in varying degrees by governmental
action to prevent its becoming monopolistic. The formulas proposed here enable the isolation
of the competitive from the cooperative component in any situation where the desiderata
involved are measurable (including the case of unitary qualitative desiderata). The cooperative
component is measurable by the mean amount of change in the desiderata, the statistical first
moment of changes ($= \sum {}_{T}V / P = M$), while the competitive component is measurable by the
redistribution of the desiderata, the statistical second moment of changes ($= \sum ({}_{T}V)^{2} / P = \sigma^{2}$).
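The separation of the two components can be sketched as follows. This is an illustration under one assumption not stated in the text: when the mean change is not zero, the second moment is taken about that mean, so that the spread reflects redistribution only. The function name and the sample amounts are hypothetical.

```python
import math


def change_components(before, after):
    """Split changes in a desideratum into a mean shift and a spread.

    Returns (mean_change, sigma): the first moment of the changes and the
    square root of their second moment about that mean.
    """
    if len(before) != len(after) or not before:
        raise ValueError("need one before and one after amount per competitor")
    changes = [a - b for b, a in zip(before, after)]
    count = len(changes)
    mean_change = sum(changes) / count
    variance = sum((c - mean_change) ** 2 for c in changes) / count
    return mean_change, math.sqrt(variance)


# Hypothetical output of two firms, in arbitrary units.
mean_change, sigma = change_components(before=[100, 100], after=[150, 90])
print(mean_change, sigma)
```

Here the mean gain of 20 units per firm would be read as the cooperative (progressing) component, and the spread of 30 units around that mean as the competitive component.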
Consider next the process definable by the second moment of the other factors of population,
desire, and tension in the basic tension equation (4).
Suppose that in a constant population, there is a redistribution of persons between the
plurals composing that total population. Some plurals may gain in membership while others
may lose. Employees may be hired and fired, migrants come and go, social status plurals
climb or sink, or enrollments in non-overlapping organizations rise and fall. This is "mobility"
with its many particular sub-forms depending on the values which define the plurals given in a
population. A standardized measure of the amount of mobility of a defined population in a
period is the equivalent of the index of competing, an index of net mobility, Mb. This is
definable as the percentage that the standard deviation of the observed net gains and losses
between plurals is of the maximum standard deviation. This is symbolized by:
$Mb_N = \sqrt{0.5 \sum_{1}^{p} ({}_{T}P)^{2}}$ (The net mobility index)
Equation (23)
where p is the number of plurals and TP is the net gain or loss in percent of persons in a plural
in the time T. The index varies from 0 percent where no membership shifts occur to the upper
limit of 100 percent, as when the initial monopoly by one plural of all the population becomes a
terminal monopoly of another plural which gains the whole population.13
Second order processes may next be illustrated in the case of the redistributing of the
intensities of desires of the persons, or of the plurals, in a population. For a simplified
numerical illustration, consider the shifts of public opinion in the United States as measured by
one of the successive polls14 before the presidential election of 1936.
                              January 1936    October 1936    Gains or losses = TD
Votes for Roosevelt           60.8%           59.2%           -1.6%
Votes for other candidates    39.2%           40.8%           +1.6%
Rv = 1.6%
According to these data, the election campaign hardly affected the desires of the voters
as the redistribution of valuation as measured by our index of "revaluating" is only 1.6 percent.
This index of revaluating has the same form of formula as before, with the change in desire (TD)
substituted for the change in the desideratum (TV) in the index of competing (18), or for the change in
population (TP) in the index of net mobility (23) (or "repopulating," as it might be termed). As before, the revaluating index varies from zero on
up to a maximum sigma which would occur when all the population's desire was initially
concentrated on one desideratum and was completely transferred finally to another
desideratum.
$R_v = \sqrt{0.5 \sum ({}_{T}D)^{2}}$ (Index of revaluating).
Equation (24)
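The poll illustration can be checked with a short sketch. It assumes the poll figures are read as Roosevelt's share falling from 60.8 to 59.2 percent and the other candidates' share rising from 39.2 to 40.8 percent, which is the only reading under which the shares sum to 100 at each date. The function name and the dictionary layout are illustrative.

```python
import math


def revaluating_index(before, after):
    """Index of revaluating between desiderata, in percent.

    `before` and `after` map each desideratum to the share of total desire
    (in percent) it receives; the sum runs over the desiderata.
    """
    if before.keys() != after.keys():
        raise ValueError("both dates must cover the same desiderata")
    changes = [after[name] - before[name] for name in before]
    return math.sqrt(0.5 * sum(change * change for change in changes))


january = {"Roosevelt": 60.8, "other candidates": 39.2}
october = {"Roosevelt": 59.2, "other candidates": 40.8}

rv = revaluating_index(january, october)
print(round(rv, 2))
```

Summing over plurals instead of desiderata, as the text describes for revaluating between plurals, would need only a different set of dictionary keys.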
The election illustration given above happens to be a versatile one as it illustrates
secondary processes of all the four factors in the tension equation (4). The votes are a
desideratum competed for by the Roosevelt and anti-Roosevelt parties; they also represent the
net mobility of persons between the pro-Roosevelt to the anti-Roosevelt plurals; they also
measure the shift of popularity, or of relative desiring, between two values by the electorate,
and finally, since in this situation the societal tension equals the average desire (see 1,
Example A, above), the shift of votes measures the relative "retensing," or shift of tension of
the electorate, towards two camps. The votes happen to measure these four factors
simultaneously in this situation because the unit of desire in a democracy is defined as one
vote and is identical with one person. The votes are a desideratum to be competed for only from the standpoint of
the candidates; to the electorate, each candidate is the unitary qualitative desideratum, the
desire for whom is measured by votes.
Whether some quantity measures the quantity of a desideratum or the intensity of
desire for it depends on the plural relative to which the desideratum is defined. What is a
desideratum to one plural may not be such to another. The desideratum and the plural and the
desire can only be defined relatively to each other.
Revaluating is here illustrated as between several desiderata, with corresponding
variable plurals; but revaluating may also exist between several constant plurals which may
differentially change their intensities of desire for one desideratum. Thus, on revaluating
between desiderata, the Σ in (24) is over the v desiderata; while in revaluating between plurals,
the Σ in (24) would be over the p plurals.15
C. Zero Order Processes
The first order processes where the factors, P, D, and V, have an exponent of one (e =
1) in equation (12) and the second order processes where the factors have an exponent of
two16 (e = 2) as in (18), (23), and (24), have been sketched. There remain the important
processes definable by an exponent of zero and which may therefore be termed "Zero Order
Processes."
Reversing the laborious inductive approach by which these zero order processes were
identified in the research reported here, consider for neater exposition the deductive approach.
Starting with the matrix equation (12) as given, isolate the population and the desideratum
factors in turn on one side of the equation and consider the case where:
the exponent is zero (in order to reduce the quantity of the factor to unity and isolate its
quality);
the date script is (at least) two (t1= 2) (so that we are considering a time interval and not
an instant, a process not a status); and
the script that does not correspond to the factor isolated is unitary (in order to isolate
the varying of one script at a time).
These two cases are expressed in the formulas:
${}^{2}_{1}V^{0}_{v} = {}^{2}_{1}(PD/E)^{0}_{v} = {}^{2}_{1}1_{v} = {}_{T}v$
Equation (28)
and
${}^{2}_{p}P^{0}_{1} = {}^{2}_{p}(VE/D)^{0}_{1} = {}^{2}_{p}1_{1} = {}_{T}p$
Equation (29)
Since any quantity with a zero exponent is unity, these two cases simplify to:
the number of desiderata, v, desired by one plural on each of the two dates (28); and
the number of plurals, p, desiring one desideratum on each of the two dates (29).
These mean the change, positive or negative, in the number of desiderata (±Tv) and the
change in the number of plurals (±Tp), respectively, during the time interval (T1) between the
two dates.
A desideratum was defined as any qualitatively distinguishable object of human desire.
The increasing of the number of different desiderata is substantially the kind of process, for
phenomena involving human valuations, that sociologists currently call "differentiating," or
"dissimilarizing." Thus, in increasing the styles of beds to 78, manufacturers have been
dissimilarizing this particular economic desideratum. The Department of Commerce in seeking
to reduce these to 4 standardized styles is reversing the process.17 When two doctors, who
have been competing for the desideratum "all patients," start to specialize and thereafter one
desires the desideratum "medical patients," and the other "surgical patients," differentiating has
gone on, increasing the number of desiderata from 1 to 2. When charity organizations
standardize their diverse case report blanks, v in number, to one blank, the reverse process,
opposite to dissimilarizing, has taken place to the extent v or 100 percent. This reverse
process might be termed "assimilarizing," leaving the root "similarizing" without any prefix to
denote the process going in either direction.
-Tv = assimilarizing                                    Equation (30a)
+Tv = dissimilarizing (qualitative differentiating)     Equation (30b)
±Tv = similarizing (= t1v)                              Equation (30)
This process is readily measurable whenever the desiderata are definable and the
number of them desired by the plural on given dates can be tabulated. 18 Next, whenever the
plurals in a population are non-overlapping groups of one kind, increasing of their number is a
measure of the process "dissociating" and decreasing of their number measures "associating"
as ordinarily understood by sociologists. For whenever the groups are definite, effective
dissociating must increase their number, and effective associating must decrease their
number. Merging of several corporations into one, federating of churches or states, the
membership of several unions collecting into one of them alone, are examples of associating
as here defined. The separation of Ulster and the former Irish Free State, of the Northern and
the Southern Baptists, of the old Standard Oil Co. into the Standard Oil of New York, of New
Jersey, etc., are examples of dissociating. These processes may be symbolized via (29) as:
-Tp = effective associating     Equation (31a)
+Tp = dissociating              Equation (31b)
±Tp = sociating                 Equation (31)
where Tp denotes the change in the number of groups (i.e., interacting plurals) with
non-overlapping membership during the time T. These formulas measure the end result, the
effective sociating, which culminates all the complex behavior tending towards sociating. 19
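As a minimal sketch of how formulas (30) and (31) might be applied, the following Python counts the distinct desiderata or groups tabulated on two dates and names the resulting process. The function names and the corporation labels in the merger case are illustrative assumptions and do not come from the original text. The doctors and the Standard Oil items follow the examples above, listing only the two successor companies that the text names.

```python
def zero_order_change(initial, terminal):
    """Signed change in the number of distinct kinds between two dates.

    Applied to desiderata this is Tv of formula (30); applied to
    non-overlapping groups it is Tp of formula (31).
    """
    return len(set(terminal)) - len(set(initial))


def name_process(change, decreasing, increasing):
    """Label the direction of change (hypothetical helper)."""
    if change < 0:
        return decreasing
    if change > 0:
        return increasing
    return "no effective change"


# Two doctors specialize: one desideratum becomes two.
tv_doctors = zero_order_change(
    ["all patients"], ["medical patients", "surgical patients"]
)

# Standard Oil separates; only the successors named in the text are listed.
tp_standard_oil = zero_order_change(
    ["Standard Oil Co."],
    ["Standard Oil of New York", "Standard Oil of New Jersey"],
)

# Hypothetical merger of three corporations into one.
tp_merger = zero_order_change(["Corp A", "Corp B", "Corp C"], ["Corp A"])

print(tv_doctors, name_process(tv_doctors, "assimilarizing", "dissimilarizing"))
print(tp_standard_oil, name_process(tp_standard_oil, "associating", "dissociating"))
print(tp_merger, name_process(tp_merger, "associating", "dissociating"))
```

The sketch measures only the end result, as the formulas do; it says nothing about the behavior that led to the change in number.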
Summary
The exploration of the societal processes definable by equations derived from the
tension theory equation (12) may now be systematically summarized in a tabulation.
Summary of the Effective Societal Processes Based on the Tension Equation (12) a,b

General form of the equation: ${}^{t}_{p}(PD = VE)^{e}_{v}$, where e is the exponent.

Factors: P, the population; D, intensity of desire; V, quantity of desideratum; E, social tension.

Zero order processes (e = 0): frequencies in distribution
    P:  $P^{0}_{p} = 1_{p} = p$ = number of plurals
        Sociating, ±Tp; Associating, -Tp; Dissociating, +Tp
    V:  $V^{0}_{v} = 1_{v} = v$ = number of desiderata
        Similarizing, ±Tv; Assimilarizing, -Tv; Dissimilarizing, +Tv

First order processes (e = 1): means in distributions
    P:  Populating, ±TP; Adpopulating, +TP; Depopulating, -TP
        subvariety: Conflict
    D:  Valuating, ±TD; Evaluating, +TD; Devaluating, -TD
        subvariety: Accommodating
    V:  Grading, ±TV; Progressing, +TV; Regressing, -TV
        subvariety: Co-operating, +TV; Accumulating, +TV
    E:  Tensing, ±TE; Attensing, +TE; Detensing, -TE

Second order processes (e = 2): standard deviations in distributions c,d
    P:  Mobility or Repopulating,  $Mb = \sqrt{0.5\,\Sigma\,{}_{T}P^{2}}$
    D:  Popularity or Revaluating,  $Rv = \sqrt{0.5\,\Sigma\,{}_{T}D^{2}}$
    V:  Competing or Regrading,  $Cp = \sqrt{0.5\,\Sigma\,{}_{T}V^{2}}$
    E:  Retensing,  $Rt = \sqrt{0.5\,\Sigma\,{}_{T}E^{2}}$
a. Completely generalized processes, of which the above being limited to tension
phenomena are special cases, have been worked out from an entirely general matrix
equation to be published later in fuller form.
b. The processes above, strictly, are limited to those societal situations where the
quantities denoted by the symbols are measurable. They may be extended as
hypotheses to situations where quantification is conceivable but not achieved.
c. Other types of second order processes based on the correlation coefficient instead of
upon the standard deviation as here, have been derived and classified but are not here
presented.
d. Higher order processes involving the third moment (e=3) as in skewing of a distribution
and involving the higher moments are possible but still rare in the sociological literature.
The processes defined by a single observed factor such as P, D, and V are elementary
processes in distinction to compound processes such as defined by E or any sum, product, or
other function of single observed factors. To date some score of such compound processes
have been tentatively defined from the sociological literature and fitted to formulas derived
from (12). Such processes include redefinitions of "opposition," "toleration," "persecution,"
"coercion," "exploitation," "socialization," "variation," "ordination," "stratification," "stabilization,"
"integration," etc. The purpose of this system of definitions and symbols for societal processes
is partly classificational, i.e., to make an orderly system of concepts built on a logical basis that
can be indefinitely extended. Perhaps more useful than this is the more precise observation
and even measurement of these processes that is promoted by these definitions which reduce
the concepts to measurable entities. The measurement of the effective or completed aspects
of these processes can be used as more stable criteria against which to correlate and check
the less stable, more intangible, and varied phenomena which constitute the causative aspects
of these processes. Furthermore, if the effective processes can be measured with increasing
reliability their relations to other phenomena of society can be progressively worked out with
greater precision, furthering the aim of science to understand phenomena in order to predict
and control them as desired.
Notes
If each person's intensity of desire, in whatever units it may be expressed, be denoted
by d, then the mean desire is
$D = \sum_{1}^{P} d / P$    Equation (1)
$PD = \Sigma d$ = the total desire of that population for that desideratum.    Equation (2)
Many readers are likely to balk at this point, wanting evidence of the measurability of
sociological phenomena, most of which seem so unsusceptible of measurement. Fuller
treatment of this issue will shortly be published in the following form:
A discussion of the philosophic foundations of such measurement: In two papers by
George A. Lundberg, "The Nature of Sociological Laws," Foundations of Social Philosophy,
(forthcoming), and "The Concept of Law in the Social Sciences," Philosophy of Science, Vol. 5,
No. 2, April 1938.
A presentation of postulates, methodology, and a system of quantitative sociology built
up of equations of measured entities in a pair of volumes in preparation by Mr. Lundberg and
the author.
This may be suggested by modifying the stimulus response formula B=f(O, S) (behavior
is a function of the organism's internal states and the stimulus-situation) to
Bv= f(D, S)
Equations (3)
meaning that behavior with respect to an object of value is a function of the resultant of
all internal states predisposing the organism towards or away from that object of value and of
the stimulus situation.
Of course, activity to ease one tension may increase others. The essence of analysis by
this intellectual tool, equations (5 and 6), is to define rigorously the desideratum involved and
the desire, the population, and the tension relative to it alone. Interaction of desiderata in
systems is dealt with later.
Using the presubscript T to denote change with time and a plus sign to denote an
increase and a minus sign, a decrease, these three philosophies together are symbolized by:
$(P - {}_{T}P)(D - {}_{T}D)/(V + {}_{T}V) = (E - {}_{T}E)$
Equation (7)
which is very similar in essential meaning to (6) where E at the mathematical limit,
which is the psychic goal, is unity. This requires expressing the factors in suitable units and
this is a very difficult matter technically, though it is simple theoretically.
This use of a zero exponent to denote unitary qualities has far-reaching consequences
when combined with the notation of matrix algebra. It makes possible the handling of
qualitative phenomena in the same equations with quantitative phenomena by rigorous
mathematical methods.
In these cases of unlimited desiderata, the concept of tension, E, is usually superfluous
as it is simpler to think in terms of average desire. But this is a special case; usually in life, with
limited desiderata, tension and desire do not coalesce as in (11).
Any such rectangular arrangement of numbers, in rows and columns, is called by
mathematicians, a matrix. Matrix algebra provides a useful tool in dealing with such
aggregations of facts. Only an example of this matrix technique is presented here; its fuller
development will be published in a two-volume study, which is in preparation in collaboration
with George A. Lundberg. This study presents a quantitative systematics for sociology of which
the tension hypothesis (5) of the present paper is a section of one chapter put out here to invite
criticism and trial.
The present active participle ending in -ing will be consistently used to name processes,
as this emphasizes them as an activity-in-time and distinguishes them from their
corresponding relationships. Thus, "competing" is a process and "competition" will denote here
the competitive relation.
Throughout this paper, the presubscript T denotes a period of time. Any letter with the
subscript T attached symbolizes the change in that period in the quantity denoted by the letter,
(such as above). A plus sign denotes an increase; a minus sign, a decrease; the absence of a
sign denotes a change, the direction of which is not specified.
The velocity of this process of competing was one percent per year. This is the average
velocity, disregarding fluctuations from the steady trend, for foreign shipping to outcompete
American vessels.
The derivation and a special case or two may be noted here although fuller discussion
with derivation of standard errors is reserved for treatment elsewhere. The algebraic sum of
the changes is zero, and therefore the mean is zero also, i.e.,
$\sum_{1}^{P} {}_{T}V / P = 0.$
The standard deviation of these changes is given by
$\sigma_{v}^{2} = \Sigma\,{}_{T}V^{2} / P.$
Equation (19)
This is maximal when one competitor loses all the desideratum and another gains it all,
while others have none. Thus, two cases show 100 percent and the rest 0 percent. This
maximum σ is given by:
$\max \sigma_{v}^{2} = 2(\Sigma V)^{2} / P.$
Equation (20)
The ratio of these two sigmas, (i.e., the ratio derived from (19) divided by (20) multiplied
by 100 to make the ratio a percentage), gives (18) when it is recalled that by definition ΣV=
100%. For one special case, note that if the change, V' is given not as the difference in
amounts on two dates as in the case above, but as events, earnings, or happenings in a
period, TV becomes each competitor's percentage share of such change expressed as a
deviation from the mean share (100%/P):
${}_{T}V = 100V'/\Sigma V' - 100/P$
Equation (21)
Thus, four basketball teams winning points in a season of 400, 200, 100, and 300
respectively, converted into percentages and then into deviations from the mean give an index
of competing of 15.8 percent.
$Cp = \sqrt{0.5(15^{2} + 5^{2} + 15^{2} + 5^{2})} = 15.8$
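The arithmetic of this index can be checked with a short Python sketch. It applies (21) to turn each team's points into a deviation from the mean percentage share and then takes the index of competing as the square root of half the sum of squared deviations. The function name `competing_index` is an assumption made for illustration; the two seasons are the one just given and the more equal one discussed next.

```python
import math


def competing_index(amounts):
    """Index of effective competing, in percent, from amounts won in a period.

    Each amount is converted to a percentage share and expressed as a
    deviation from the mean share 100/P, as in (21).
    """
    total = sum(amounts)
    mean_share = 100.0 / len(amounts)
    deviations = [100.0 * a / total - mean_share for a in amounts]
    return math.sqrt(0.5 * sum(d * d for d in deviations))


unequal_season = [400, 200, 100, 300]
more_equal_season = [300, 230, 210, 260]

print(round(competing_index(unequal_season), 1))     # 15.8
print(round(competing_index(more_equal_season), 1))  # 4.8
```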
Were their winnings more equal, as in 300, 230, 210, 260 points, the effective
competing would be reduced to Cp = 4.8%, as here there is less dispersion and no reranking of
the competitors. In this situation, since no team plays in all games, it is physically impossible to
get a monopoly of the points, as would be possible were the number of competitors reduced to two.
Even with two competitors, the maximum Cp is 50 percent since their status at the start is
equal and only at the end is it a monopolistic status. 100 percent competing would be
represented by the initial possession of a championship cup by one of the competitors who
loses it while the other gains it, thus reversing one initial monopolistic status. Such a cup is an
all-or-none desideratum, $V^{0} = 1$ (8). A second special case of common occurrence is when the
data are given as rankings. The formula here is:
$Cp = \sqrt{0.5 - 0.5r}$ = Cp from rankings.
Equation (22)
For the sigma of gains or losses in ranks is
$\sigma_{v''} = \sigma_{v'}\sqrt{2(1 - r)}$
Equation (22a)
which is the usual sigma of a difference when the two sigmas are equal, as the initial and
terminal sigma of ranks, σv', must be as long as P is unchanged:
$\sigma_{v'}^{2} = (P^{2} - 1)/12$
Equation (22b)
(22a) is maximal when r = -1.0, making:
$\max \sigma_{v''} = 2\sigma_{v'}$
Equation (22c)
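Relations (22) to (22c) can be verified numerically. The sketch below is illustrative only: the helper names and the partial reranking of five competitors are assumptions, and r is taken as the product-moment correlation between initial and terminal ranks.

```python
import math


def pstdev(values):
    """Population standard deviation (divides by the number of cases)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))


def correlation(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return cov / (pstdev(xs) * pstdev(ys))


def competing_from_ranks(initial, terminal):
    """Equation (22): Cp = sqrt(.5 - .5 r), as a fraction of the maximum."""
    r = correlation(initial, terminal)
    # The clamp guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 0.5 - 0.5 * r))


initial = [1, 2, 3, 4, 5]
terminal = [2, 1, 3, 5, 4]      # hypothetical partial reranking
shifts = [u - t for t, u in zip(initial, terminal)]

r = correlation(initial, terminal)
sigma_ranks = pstdev(initial)   # (22b): sqrt((P**2 - 1) / 12)
sigma_shifts = pstdev(shifts)   # left side of (22a)

print(round(sigma_shifts, 4), round(sigma_ranks * math.sqrt(2 * (1 - r)), 4))
print(round(competing_from_ranks(initial, terminal), 3))
print(round(competing_from_ranks(initial, initial[::-1]), 3))
```

The last line reverses the ranking completely, the case in which (22) reaches its maximum.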
Finding the percentage that (22a) is of (22c) gives (22), since (22) is but a special case
of (18) where the ranked data are by definition of the units in a rectangular distribution and
cannot appear in the two all-or-none categories of a monopoly. Equation (22) reaches 100
percent whenever the ranking of the competitors is completely reversed. Cp being a sigma of
a difference is a function of the change in the dispersion and in the correlation (i.e., reranking)
of the competitors. In the case of ranked data, the dispersion (σv) is held constant by the
nature of the units, leaving the reranking (r) to determine the intensity of the competing, Cp. A
third special case is where the situation in some way limits the complete transfer of the
desideratum between competitors even in the maximal instance. Thus, among pupils in school
marked 100 percent for perfect work and 0 percent for none, no one pupil can accumulate the
marks of all the others as in a monopoly. Maximal effective competing is where initial perfect
marks become zero ones or vice versa. Here, instead of only two deviations of 100 percent
being possible, P such deviations are possible so that the 2 in the denominator of (18) must be
replaced by P, resulting in Cp becoming identical with σv of (19). In this case, the standard
deviation of gains and losses in percentage marks is also itself the index of effective
competing, expressing the percentage of maximal competing. For illustration:
Pupil   Marks on date T   Marks on date U   Gains or losses = TV   Gains or losses squared = TV²
A       75%               74%               -1                      1
B       70%               72%                2                      4
C       80%               80%                0                      0
D       80%               85%                5                     25
E       65%               65%                0                      0
Etc.    Etc.              Etc.
                                                                    ΣTV² = 30
$\sqrt{\Sigma\,{}_{T}V^{2}/P}$ = Cp = 2.4%,
showing very small effective competing for higher school marks in this illustration.
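The tabulated illustration can be reproduced in a few lines of Python. This is a sketch under the assumption that only the five listed pupils enter the sum, which is what the printed total of 30 implies; the variable names are illustrative.

```python
import math

marks_t = [75, 70, 80, 80, 65]   # pupils A to E on date T
marks_u = [74, 72, 80, 85, 65]   # the same pupils on date U

shifts = [u - t for t, u in zip(marks_t, marks_u)]   # gains or losses, TV
sum_squares = sum(s * s for s in shifts)             # sum of TV squared

# With marks, the index is the plain standard deviation of the shifts (19).
cp = math.sqrt(sum_squares / len(shifts))

print(shifts, sum_squares, round(cp, 1))
```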
Special cases occur again. Chiefly, net mobility, and gross mobility (or "turnover") must
be distinguished. The sum of the incoming and the outgoing members is gross mobility, while
their difference is net mobility (23). There is no definite upper limit of gross mobility; it may
exceed 100 percent indefinitely if calculated by formula (23). For one plural alone, the point
where Mb = 100 percent is the conventional "100 percent turnover," as where the number of
replacements equals the total number of employees. But Mb of gross mobility depends on the
number of plurals in the situation as well as on the rate of turnover. To make Mb, when applied
to gross mobility, measure the velocity of turnover alone, free from the factor of the number of
plurals, p, Equation (23) may be divided by p to make it an average measure thus:
$Mb_{a} = \sqrt{\sum_{1}^{p} {}_{T}P^{2} / 2p}$
(The gross mobility index).
Equation (23a)
This is the simple standard deviation of gains and losses. It is a percentage measure
varying from zero on up to 100 percent and may exceed 100 percent. As with the index of
competing, the mean gain or loss is zero by definition. Any amount of the mean above or
below zero measures the populating process, ±TP, defined above as increases or decreases
in a total population. Mobility as here defined is purely the redistribution of the population
between plurals within the total population and is therefore measured by a function of the
statistical second moment, i.e., either the index of net, or the index of gross mobility.
Of course, if gain in membership is a desideratum desired by the plurals, mobility and
competing become one and the same thing, i.e., (18) = (23) since in this case TP=TV.
Another special case is that of plurals with overlapping membership as where persons
may be members of more than one fraternal society. This and other more complicated cases
of mobility are dealt with more fully in the forthcoming two volumes on systematic quantitative
sociology which is in preparation in collaboration with George A. Lundberg.
D. Katz and H. Cantril, "Public Opinion Polls," Sociometry, Vol. I, Nos. 1 and 2, July-Oct.
1937, 175.
A special case for all these secondary processes (Cp, Mb, Rv), but particularly useful in
the case of revaluating where the desires may be unmeasurable, is the case in which the
persons (or plurals) start equal and end up in a monopoly by one of them. Thus, alternative
desiderata, v in number, may be rated as equally desired at first, but on study, one finally
emerges as the best and all the desire of the plural is concentrated on it alone. Here, the
formula becomes a function of the number of desiderata, v. The initial equal desires for each
desideratum is the mean, namely,
$M_{D} = \sum_{1}^{v} D / v.$
The standard deviation of the terminal "monopoly," recalling that the shifts balance each
other (i.e., ΣTD = 0), is
$\sigma_{D} = \sqrt{\Sigma\,{}_{T}D^{2} / v}$
Equation (25a)
As the deviation of the one monopolistic value is ΣD - MD and that of the v - 1 other values is MD,
(25a) becomes:
$\sigma_{D} = \sqrt{((\Sigma D - M_{D})^{2} + (v - 1)M_{D}^{2}) / v}$
Equation (25b)
which on expanding simplifies to:
$\sigma_{D} = M_{D}\sqrt{v - 1}$ = revaluating between values in the "equality-to-monopoly" case
Equation (25)
Analogously:
$\sigma_{v} = M_{V}\sqrt{P - 1}$ = competing in the "equality-to-monopoly" case
Equation (26)
and
$\sigma_{p} = M_{P}\sqrt{p - 1}$ = competing mobility in the "equality-to-monopoly" case.
Equation (27)
The largest numerical value of this index (25, 26, 27) is 50 percent, which is reached
when v = 2 (or P = 2, or p = 2), because this initial "equality to terminal monopoly" is half the
range from an initial monopoly to a terminal reversed monopoly.
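A brief Python sketch can confirm both the simplification from (25b) to (25) and the 50 percent ceiling. It assumes the total desire is expressed as 100 percent, so that the mean is 100/v; the function names are illustrative and not part of the original notation.

```python
import math


def sigma_from_shifts(v, total=100.0):
    """(25a) and (25b): standard deviation of shifts from equality to monopoly."""
    mean = total / v
    # One desideratum gains everything above its mean; the others lose their mean.
    shifts = [total - mean] + [-mean] * (v - 1)
    return math.sqrt(sum(s * s for s in shifts) / v)


def sigma_closed_form(v, total=100.0):
    """(25): the mean times the square root of (v - 1)."""
    return (total / v) * math.sqrt(v - 1)


for v in range(2, 7):
    print(v, round(sigma_from_shifts(v), 2), round(sigma_closed_form(v), 2))
```

For v = 2 both forms give 50, and the value declines as v grows, in agreement with the statement that 50 percent is the largest value of the index.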
The second order processes and attendant relations of competition, mobility, and
popularity have been defined in terms of standard deviations of shifts. They may be also
defined in terms of the average deviation (A.D.) or other measure of dispersion of the shifts in
the factors during that period. Substituting A.D. for σ in (19) and (20) gives (18) as:
ADCp = Σ(+TV) (Index of effective competing in average deviation terms)
Equation (18a)
where +TV denotes the positive shifts, i.e., gains in percentages. (Note that Σ(+TV) = -Σ(-TV) = .5 Σ|TV| = .5 (AD)P,
since ΣTV = 0.) Similarly mobility and popularity may be expressed:
ADMb = Σ(+TP) (Index of net mobility in terms of the average deviation)
Equation (23a)
ADRv = Σ(+TD) (Index of revaluating in terms of the average deviation).
Equation (24a)
These indices in average deviation terms are simpler for persons not trained in statistics
to comprehend, but as a basis for correlation and further statistical study the standard
deviation is usually preferable.
This item is quoted from a list of other similar ones given in Stuart Chase, Tragedy of
Waste, 168-9, New York, 1929.
The completely general societal process of differentiating or dissimilarizing and its
opposite is dealt with in the systematic volumes in preparation; the present paper on tension
theory is limited to those societal processes definable by the factors in the tension equation
(12).
The limits of the range of the zero order processes may be noted in passing. If Tp, or Tv,
denotes the number of plurals or values at an initial date, T1, the limits of change up to a
terminal date, U1, are:
Uv ≐ 1 (limit of assimilarizing)
Up ≐ 1 (limit of associating)
Uv = Tv (boundary between assimilarizing and dissimilarizing)
Up = Tp (boundary between associating and dissociating)
Uv ≐ ∞ (indefinite limit of dissimilarizing)
Up ≐ P (limit of dissociating: the number of groups equals the number of persons).
#30. On Criteria for Factorizing Correlated Variables
Biometrika, Vol. XIX, Nos. 1 and 2, July, 1927
Princeton University, Fellow of the National Research Council and Rockefeller Foundation,
U.S.A.
(1) Let us suppose that the system of correlated variables x1, x2, ... xt are functions of the
variables v1, v2, ... vm, where the number of the latter is less than that of the former, i.e. m < t.
Mathematically we may say that
x1 = f1(v1, v2, …, vm)
x2 = f2(v1, v2, …, vm)
x3 = f3(v1, v2, …, vm)
…………………….
xt = ft(v1, v2, …, vm)
Equation (1)
Theoretically we can eliminate the v's from these equations and we shall be left with t –m
equations of condition, say:
xm+1 = m+1(x1, x2, …, xm)
xm+2 = m+2(x1, x2, …, xm)
xm+3 = m+3(x1, x2, …, xm)
……………………………
xt = t(x1, x2, …, xm)
…………………………………………...Equation (2)
Statistically xm+1, xm+2, …, xt will be absolutely determined by x1, x2, …, xm; or, looked at
geometrically, the regression surface for xm+1 on x1, x2, …, xm contains all the points, i.e. there
is no variability in any array of xm+1 for given x1, x2, …, xm. This amounts to saying statistically
that the multiple correlation ratio of xm+1 on x1, x2, …, xm is perfect or:
$\eta_{x_{m+1} \cdot x_1, x_2, \ldots, x_m} = 1$    Equation (3)
Now if in the population sampled such a relation holds, it must also hold for every sample from
that population. For every individual randomly selected can only have variates lying on the
surface
xm+1 = m+1(x1, x2, …, xm),
and thus there can be no variation of xm+1 for given values of x1, x2, …, xm. In other words
every truly random sample will also give
xm+1• x1, x2, …, xm = 1
and there is no variation of this correlation ratio with sampling, i.e. this mth order multiple
correlation ratio has a zero "probable error." Accordingly if xm+1• x1, x2, …, xm be not unity or
any sample, then either the original series is not factorizable into in factors, or the deviation
from unity is not to be estimated by the "probable error" of random sampling, for the "probable
error" is essentially zero, but only by some measure of the accuracy of the observations on
which the data depend. All tests by means of the standard deviations of random sampling are
illusory, when we are seeking to determine whether a given correlation ratio (or a correlation
coefficient) is perfect in the population sampled. If it be perfect in that population it will be
perfect in the sample, for the standard deviation in samples will be zero. Errors of observation
are of course a different matter, but their extent and variation are not controlled by the same
laws as those of random sampling, and it is not legitimate to apply the latter laws to treatment
of such errors.
In the above discussion we have taken the variate xm+1 and supposed it expressed as a
unique function of x1, x2, …, xm so that $\eta_{x_{m+1} \cdot x_1, x_2, \ldots, x_m}$ is unity. But clearly we are dealing
here with any m + 1 of the t variates, and we must therefore conclude that it is a needful
condition of t variables factorizing into m other variables that all the possible mth order multiple
correlation ratios should be perfect. There are
t! / m! (t - m-1)!
such correlation ratios and every one of these should be perfect in the sample. It does not,
however, follow that there are this number of conditions to be independently satisfied. As a
matter of fact there are only t - m. For, speaking statistically, if
$\eta_{x_{m+1} \cdot x_1, x_2, \ldots, x_m} = 1$
then we may interchange any of the x1, ... xm variates with xm+1, and the perfect association will
still be maintained. In other words the perfect association indicated by a certain number of
multiple correlation ratios carries with it the perfect association of a number of other multiple
correlation ratios. An elementary illustration of this may be given for a simple case. Let us
suppose that three variates x1, x2, and x3 are linear functions of v1, and v2. Then it follows that
any x variate is a linear function of the other two. Accordingly the three multiple correlation
coefficients
ρ1.23, ρ2.31 and ρ3.12
are perfect, or we have:
$\rho^{2}_{1.23} = \dfrac{r_{12}^{2} + r_{13}^{2} - 2r_{12}r_{13}r_{23}}{1 - r_{23}^{2}} = 1$
$\rho^{2}_{2.31} = \dfrac{r_{23}^{2} + r_{21}^{2} - 2r_{23}r_{21}r_{31}}{1 - r_{31}^{2}} = 1$
$\rho^{2}_{3.12} = \dfrac{r_{32}^{2} + r_{31}^{2} - 2r_{32}r_{31}r_{12}}{1 - r_{12}^{2}} = 1$
Equation (4)
These give each of them
$1 - r_{12}^{2} - r_{23}^{2} - r_{31}^{2} + 2r_{12}r_{23}r_{31} = 0$, that is,
$\begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 0$    Equation (5)
or the discriminant is zero, which is the essential condition of the three equations not being
independent.
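The elementary illustration can be tried numerically. In the Python sketch below, the two underlying variables v1 and v2 and the coefficients of the three linear functions are arbitrary hypothetical values chosen only for the check; any other non-degenerate choice should behave the same way. The discriminant of (5) and the squared multiple correlation of (4) are computed from the sample correlations.

```python
import math


def correlation(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values of the two underlying variables.
v1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v2 = [2.0, -1.0, 0.0, 3.0, 1.0, -2.0]

# Three variates, each a linear function of v1 and v2 (coefficients assumed).
x1 = [a + b for a, b in zip(v1, v2)]
x2 = [2 * a - b for a, b in zip(v1, v2)]
x3 = [-a + 3 * b for a, b in zip(v1, v2)]

r12 = correlation(x1, x2)
r13 = correlation(x1, x3)
r23 = correlation(x2, x3)

# Discriminant of (5) and squared multiple correlation of (4).
discriminant = 1 - r12**2 - r23**2 - r13**2 + 2 * r12 * r23 * r13
rho2_1_23 = (r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2)

print(abs(discriminant) < 1e-9, abs(rho2_1_23 - 1) < 1e-9)
```

Because the relation holds exactly for every individual, dropping or adding cases leaves the discriminant at zero apart from rounding, which is the sense in which its "probable error" vanishes.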
Now the "probable error" of a multiple correlation coefficient for a system of linearly
related variates is well known to be $0.67449\,(1 - \tilde\rho^{2}_{1.23})/\sqrt{N}$, where N is the size of the sample
and $\tilde\rho^{2}_{1.23}$ the value in the sampled population. Accordingly if ρ is perfect, i.e. unity, the
probable error is zero, or the value in the sample can have no variation. Hence we conclude
generally that since when the discriminant vanishes there is no variation in the discriminant:
(i) The discriminant must always factorize the standard deviation of the discriminant of
samples.
(ii) The difference between the value of the discriminant in the sampled population and
its mean value in samples must contain the discriminant in the sampled population
as a factor.
The latter conclusion follows from the fact that it must always be possible to draw a
sample having its discriminant equal to that of the population value, hence since when the
discriminant of the sampled population vanishes there is no variation of the discriminant of the
samples, the mean value of the discriminant in samples must take the population value, i.e. if
D be the discriminant in the sampled population, $\bar\Delta$ the mean value of the discriminant in
samples, Δ its value in a single sample, and σΔ the standard deviation of Δ, then
$\bar\Delta = D + X_{1}D$    Equation (6),
$\sigma_{\Delta} = X_{2}D$    Equation (7),
where X1 and X2 are certain functions of the correlation coefficients.
(2) We may illustrate this by determining X1 and X2 directly in a simple case.1 Let $\bar r_{ts}$ be the
mean value of rts in samples and δrts in any sample be measured from $\bar r_{ts}$. Then if ρts be the
value of the correlation in the sampled population for the t th and s th variates we know that2
$\bar r_{ts} = \rho_{ts}\left(1 - (1 - \rho_{ts}^{2})/2N\right)$    Equation (8),
if we neglect terms of the order 1/N2 and higher orders.
It simplifies matters if we put
rts = rst
only in our final result. Now let $\bar\Delta$ be the mean value of Δ in samples, then $\bar\Delta$ is not equal
to D, nor on the other hand is it equal to $\tilde D$, where we write
$\tilde D = \begin{vmatrix} 1 & \bar r_{12} & \cdots & \bar r_{1n} \\ \bar r_{21} & 1 & \cdots & \bar r_{2n} \\ \vdots & \vdots & & \vdots \\ \bar r_{n1} & \bar r_{n2} & \cdots & 1 \end{vmatrix}$
Equation (9)
We put $r_{st} = \bar r_{st} + \delta r_{st}$ and note that Δ is a linear function of rst. Accordingly if $\tilde D_{st}$ be the (s, t)
minor of $\tilde D$:
$\tilde D + \delta\Delta = \tilde D + \sum_{st} \tilde D_{st}\,\delta r_{st} + \text{product terms}$,
and we shall write
$\delta\Delta = \Delta - \tilde D = \sum_{st} \tilde D_{st}\,\delta r_{st} + \text{product terms}$    Equation (10).
Here δΔ is not measured from its mean value $\bar\Delta$, but from $\tilde D$. Squaring and taking mean values,
which mean values we shall denote by curled brackets, we have
$\{(\delta\Delta)^{2}\} = \sum_{st} \tilde D_{st}^{2}\,\{\delta^{2} r_{st}\} + 2\sum_{st,\,uv}{}^{\prime}\, \tilde D_{st}\tilde D_{uv}\,\{\delta r_{st}\,\delta r_{uv}\} + \text{higher terms}$    Equation (11),
where Σ denotes a sum for all values of s, t, (s not equal to t) and Σ' denotes a Sum for all
values of s, t, u, v, where s and t are not equal at the same time to u and v respectively. Now it
has been shown by Pearson and Filon3 that we have for material following a normal
distribution:
$\{\delta^{2} r_{st}\} = (1 - \rho_{st}^{2})^{2}/N$    Equation (12),
where N is the size of the sample, and
$\{\delta r_{st}\,\delta r_{uv}\} = \frac{1}{2N}\big[(\rho_{su} - \rho_{st}\rho_{ut})(\rho_{tv} - \rho_{tu}\rho_{vu}) + (\rho_{sv} - \rho_{uv}\rho_{su})(\rho_{ut} - \rho_{st}\rho_{su})$
$\qquad + (\rho_{su} - \rho_{sv}\rho_{uv})(\rho_{tv} - \rho_{st}\rho_{sv}) + (\rho_{sv} - \rho_{st}\rho_{tv})(\rho_{tu} - \rho_{tv}\rho_{uv})\big]$    Equation (13)
But if we put u = s, v = t in this expression, remembering that ρkk = 1, we find that it reduces to
$\{\delta^{2} r_{st}\} = (1 - \rho_{st}^{2})^{2}/N$,
as before. We can accordingly combine the Σ and the Σ' summations and write
$\{(\delta\Delta)^{2}\} = \frac{1}{2N}\sum_{s,t,u,v} \tilde D_{st}\tilde D_{uv}\,[(\ldots)(\ldots) + (\ldots)(\ldots) + (\ldots)(\ldots) + (\ldots)(\ldots)] + \text{higher terms}$
Equation (14)
the eight factors in round brackets being as above, and Σ being a sum which extends to all
values of s, t, u, v, including the cases where s = u and t = v.4 Here to the first order of
approximation we may write Dst for $\tilde D_{st}$, etc.
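The reduction of (13) to (12) for u = s, v = t can be confirmed numerically. The Python sketch below transcribes the four products of (13) term by term; the function name, the sample size, and the 4 by 4 matrix of population correlations are hypothetical values chosen only for the check.

```python
def pearson_filon_cov(rho, s, t, u, v, n):
    """Equation (13): mean product of the deviations of r_st and r_uv."""
    total = (
        (rho[s][u] - rho[s][t] * rho[u][t]) * (rho[t][v] - rho[t][u] * rho[v][u])
        + (rho[s][v] - rho[u][v] * rho[s][u]) * (rho[u][t] - rho[s][t] * rho[s][u])
        + (rho[s][u] - rho[s][v] * rho[u][v]) * (rho[t][v] - rho[s][t] * rho[s][v])
        + (rho[s][v] - rho[s][t] * rho[t][v]) * (rho[t][u] - rho[t][v] * rho[u][v])
    )
    return total / (2 * n)


# Hypothetical symmetric correlation matrix with unit diagonal.
rho = [
    [1.0, 0.5, 0.3, 0.2],
    [0.5, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.6],
    [0.2, 0.1, 0.6, 1.0],
]
n = 100  # assumed sample size

from_13 = pearson_filon_cov(rho, 0, 1, 0, 1, n)   # (13) with u = s, v = t
from_12 = (1 - rho[0][1] ** 2) ** 2 / n           # (12) directly

print(from_13, from_12)
```

With u = s and v = t the second and fourth products vanish and the first and third are equal, which is why the factor 1/2N turns into 1/N.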
Consider the first pair in round brackets first:
$\sum_{s,t,u,v} D_{st} D_{uv}\,(\rho_{su} - \rho_{st}\rho_{ut})(\rho_{tv} - \rho_{tu}\rho_{vu})$    Equation (15)
$= \sum_{tu}\Big[\sum_{s} D_{st}(\rho_{su} - \rho_{st}\rho_{ut})\Big]\Big[\sum_{v} D_{uv}(\rho_{tv} - \rho_{tu}\rho_{vu})\Big]$    Equation (16)
But
$\sum_{s} D_{st}\rho_{st} = D$, $\sum_{v} D_{uv}\rho_{uv} = D$    Equation (17)
$\sum_{s} D_{st}\rho_{su} = 0$, unless u = t when it equals D
$\sum_{v} D_{uv}\rho_{tv} = 0$, unless u = t when it equals D    Equation (18)
If εut be Kronecker's symbol, = 0 when u is not equal to t and = 1 when u = t, we have
our expression in (16)
$= D^{2}\sum_{tu}(\varepsilon_{ut} - \rho_{tu})^{2}$    Equation (18a),
$= D^{2}\sum_{tu}{}^{\prime\prime}(\rho_{tu}^{2})$    Equation (19),
where Σ" denotes a sum for all values of t and u except those for which t = u.
We proceed in exactly the same way for the other pairs of factors. In the second pair we
sum first over the suffixes v and t; in the third over u and t, and the fourth over s and u.
Accordingly there results:
$\{(\delta\Delta)^{2}\} = \frac{1}{2N}\,D^{2}\cdot 4\sum_{tu}{}^{\prime\prime}(\rho_{tu}^{2})$
Any correlation coefficient will occur twice, i.e. as $\rho_{tu}^{2}$, and as $\rho_{ut}^{2}$. Accordingly we find
$\{(\delta\Delta)^{2}\} = 4D^{2}$(sum of squares of correlation coefficients)/N    Equation (20)
We have not, however, yet established that this is the squared standard deviation of Δ, i.e. $\sigma_{\Delta}^{2}$.
The above is the mean square deviation of Δ from $\tilde D$ and we have for $\sigma_{\Delta}^{2}$:
$\sigma_{\Delta}^{2} = \{(\delta\Delta)^{2}\} - (\{\delta\Delta\})^{2}$    Equation (21)
It is needful accordingly to find how the mean value of Δ in samples, i.e. $\bar{\Delta}$, differs from
the population-value D and the value $\tilde{D}$ of the determinant with the mean correlation
coefficients, like $\bar{r}_{ts}$, inserted. Now for any sample:
$$\Delta = \tilde{D} + {\sum_{s,t}}' \tilde{D}_{st}\,\delta r_{st} + \tfrac{1}{2} {\sum_{s,t,u,v}}''' D_{stuv}\,\delta r_{st}\,\delta r_{uv} + \text{higher order products} \qquad \text{Equation (22)}$$
where the odd subscripts refer to rows and the even to columns, and we retain initially the
distinction between $r_{ij}$ and $r_{ji}$. The only limitation is that s and t shall not be equal to u and v
respectively. The ½ is again introduced to allow u, v freedom from s, t. (See note 4)
When we take mean values the linear terms vanish and we have:
$$\{\delta\Delta\} = \{\Delta\} - \tilde{D} = \frac{1}{4N} {\sum_{s,t,u,v}}''' D_{stuv}\,[(\ldots)(\ldots)+(\ldots)(\ldots)+(\ldots)(\ldots)+(\ldots)(\ldots)] \qquad \text{Equation (23)}$$
(see note 5), where the curved brackets contain the same factors as in Equation (13). We shall evaluate the
four series in order:
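$$Q_1 = {\sum_{s,t,u,v}}''' D_{stuv}\,(\rho_{su} - \rho_{st}\rho_{tu})(\rho_{tv} - \rho_{tu}\rho_{uv})$$
$$= \sum_{\substack{t,u,v \\ t \neq v}} (\rho_{tv} - \rho_{tu}\rho_{uv}) \sum_{s \neq u} D_{stuv}\,(\rho_{su} - \rho_{st}\rho_{tu})$$
$$= \sum_{\substack{t,u,v \\ t \neq v}} (\rho_{tv} - \rho_{tu}\rho_{uv})\, D_{uv}\,(\varepsilon_{tu} - \rho_{tu}) \qquad \text{Equation (24)}$$
where $\varepsilon_{tu}$ is, as before, Kronecker's symbol,
$$= {\sum_{t,u}}' (\varepsilon_{tu} - \rho_{tu}) \sum_{v \neq t} D_{uv}\,(\rho_{tv} - \rho_{tu}\rho_{uv})$$
$$= {\sum_{t,u}}' (\varepsilon_{tu} - \rho_{tu}) \Big[\sum_{v} D_{uv}\,(\rho_{tv} - \rho_{tu}\rho_{uv}) - D_{ut}\,(1 - \rho_{tu}^2)\Big]$$
$$= {\sum_{t,u}}' (\varepsilon_{tu} - \rho_{tu}) \big[(\varepsilon_{tu} - \rho_{tu})\, D - D_{ut}\,(1 - \rho_{tu}^2)\big]$$
$$= D \sum_{t \neq u} \rho_{tu}^2 + \sum_{t \neq u} \rho_{tu}\, D_{ut}\,(1 - \rho_{tu}^2)$$
$$= D \sum_{t \neq u} \rho_{tu}^2 + nD - \sum_{t \neq u} D_{ut}\,\rho_{ut}^3 \qquad \text{Equation (24a)}$$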
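The second term
$$Q_2 = \sum_{\substack{s,t,u,v \\ s \neq u,\ t \neq v}} D_{stuv}\,(\rho_{sv} - \rho_{su}\rho_{uv})(\rho_{ut} - \rho_{us}\rho_{st})$$
$$= \sum_{\substack{s,t,u \\ s \neq u}} (\rho_{ut} - \rho_{us}\rho_{st}) \sum_{v \neq t} D_{stuv}\,(\rho_{sv} - \rho_{su}\rho_{uv})$$
$$= \sum_{\substack{s,t,u \\ s \neq u}} (\rho_{ut} - \rho_{us}\rho_{st})(-D_{ut} - \rho_{su} D_{st}),$$
because $D_{stuv} = -D_{svut}$,
$$= -n(n-1)\,D + D \sum_{s \neq u} \rho_{su}^2 \qquad \text{Equation (25)}$$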
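The third term
$$Q_3 = \sum_{\substack{s,t,u,v \\ s \neq u,\ t \neq v}} D_{stuv}\,(\rho_{su} - \rho_{sv}\rho_{vu})(\rho_{tv} - \rho_{ts}\rho_{sv})$$
$$= \sum_{\substack{s,t,v \\ t \neq v}} (\rho_{tv} - \rho_{ts}\rho_{sv})\, D_{st}\,(\varepsilon_{sv} - \rho_{sv})$$
$$= {\sum_{s,v}}' (\varepsilon_{sv} - \rho_{sv}) \big[D\,(\varepsilon_{sv} - \rho_{sv}) - D_{sv}\,(1 - \rho_{sv}^2)\big]$$
$$= D \sum_{s \neq v} \rho_{sv}^2 + nD - \sum_{s \neq v} D_{sv}\,\rho_{sv}^3 \qquad \text{Equation (26)}$$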
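Finally:
$$Q_4 = \sum_{\substack{s,t,u,v \\ s \neq u,\ t \neq v}} D_{stuv}\,(\rho_{sv} - \rho_{st}\rho_{tv})(\rho_{tu} - \rho_{tv}\rho_{vu})$$
$$= \sum_{\substack{s,t,v \\ t \neq v}} (\rho_{sv} - \rho_{st}\rho_{tv}) \sum_{u \neq s} D_{stuv}\,(\rho_{tu} - \rho_{tv}\rho_{vu})$$
$$= \sum_{\substack{s,t,v \\ t \neq v}} (\rho_{sv} - \rho_{st}\rho_{tv})\,[-D_{sv} - \rho_{tv} D_{st}]$$
$$= -n(n-1)\,D + D \sum_{t \neq v} \rho_{tv}^2 \qquad \text{Equation (27)}$$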
Substituting Q1, Q2, Q3, Q4 in
$$\{\delta\Delta\} = \{\Delta\} - \tilde{D} = \frac{1}{4N}\,(Q_1 + Q_2 + Q_3 + Q_4)$$
$$\begin{aligned}
= \frac{1}{4N}\Big[\; & D \sum_{t \neq u} \rho_{tu}^2 + nD - \sum_{t \neq u} D_{ut}\,\rho_{ut}^3 \\
& - n(n-1)\,D + D \sum_{s \neq u} \rho_{su}^2 \\
& + D \sum_{s \neq v} \rho_{sv}^2 + nD - \sum_{s \neq v} D_{sv}\,\rho_{sv}^3 \\
& - n(n-1)\,D + D \sum_{t \neq v} \rho_{tv}^2 \;\Big] \qquad \text{Equation (28)}
\end{aligned}$$
we see that there are in all four complete summations of the $\rho_{ij}^2$, and that in each $\rho_{ji}^2$ can occur
as well as $\rho_{ij}^2$ owing to our generalization of the original sums. In the same way we have in the
cubic terms $D_{ut}\,\rho_{ut}^3$ and $D_{tu}\,\rho_{tu}^3$. Thus we have
$$\{\delta\Delta\} = \frac{1}{4N}\Big[4D \sum_{t \neq u} \rho_{tu}^2 + 2nD - 2n(n-1)\,D - 2 \sum_{t \neq u} D_{ut}\,\rho_{ut}^3\Big] \qquad \text{Equation (29)}$$
Now: $\tilde{D} = |\bar{r}_{ut}|$ but, as Soper has shown,
$$\bar{r}_{ut} = \rho_{ut}\Big(1 - \frac{1 - \rho_{ut}^2}{2N}\Big) + \text{higher order terms} \qquad \text{Equation (30)}$$
Hence:
$$\tilde{D} = |\rho_{ut}| - \frac{1}{2N} \sum_{u \neq t} \rho_{ut}\,(1 - \rho_{ut}^2)\, D_{ut} + \text{etc.}$$
$$= D - \frac{nD}{2N} + \frac{1}{2N} \sum_{u \neq t} D_{ut}\,\rho_{ut}^3 + \text{etc.} \qquad \text{Equation (32)}$$
Clearly, in terms divided by N, $\tilde{D}$ or $\tilde{D}_{ut}$ may be replaced to our order of approximation by D
or $D_{ut}$. Accordingly
$$\{\Delta\} - D = (\tilde{D} - D) + \{\delta\Delta\} = \frac{1}{4N}\Big[4D \sum_{t \neq u} \rho_{tu}^2 - 2n(n-1)\,D\Big] \qquad \text{Equation (33)}$$
Or, disregarding the double occurrence of $\rho_{tu}^2$ and $\rho_{ut}^2$, we have as far as order 1/N:
$$\{\Delta\} = D + \frac{1}{N}\Big[2D\,(\text{sum of squares of correlation coefficients}) - \tfrac{1}{2}\, D\, n(n-1)\Big] \qquad \text{Equation (33)}$$
Now the above equations show that {δΔ} is of the order 1/N and accordingly its square
is of the order 1/N². Hence unless in our evaluation of {(δΔ)²} we go to terms of the order 1/N²,
we are not justified in retaining the term ({δΔ})² in equation (21).
Thus to the order 1/N:
{Δ} = mean discriminant in samples
$$\{\Delta\} = D + \frac{D}{N}\Big[2\,(\text{sum of squared correlation coefficients}) - \tfrac{1}{2}\, n(n-1)\Big] \qquad \text{Equation (34) (see note 6)}$$
$$\sigma_\Delta^2 = \frac{4D^2}{N}\,(\text{sum of squared correlation coefficients}) \qquad \text{Equation (35)}$$
It is clear from these equations, that to the above order, when D = 0 in the sampled population,
$\sigma_\Delta$ = 0 in the samples, and every sample must take for its Δ also the zero value. But it might be
argued that the mean and standard deviation of Δ depend, when D = 0, on the terms in 1/N²
and higher powers, and that if these could be worked out they would show that Δ is subject to
variation although D = 0. Such terms however, albeit they might be hard to compute, must also
have D for a factor, and vanish like the terms in 1/N for D = 0. This follows at once from the
previous and more general method of treating the problem at the beginning of this paper, and
the present inadequate consideration of a special case has only been included because it
represents the line of approach adopted by certain psychologists. It is clear that no deviation
from zero in the discriminant of a sample can be attributed to errors of random sampling. Either
the discriminant does not vanish in the population sampled or, if it does, the deviation from
zero in the sample must be due to inaccuracy of observation, and the laws of such
inaccuracies are not those of random sampling. The obvious course is to repeat the
observations with greater care; if on reduction the discriminant shows no closer approach to
zero, it is fairly certain that it cannot be zero in the sampled population. This study owes much
to the assistance of Mr. Philip Hall and of Professor Karl Pearson, in whose laboratory it was
made.
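Equations (34) and (35) can be evaluated directly for any given correlation matrix. The sketch below is an illustration only: the function name `discriminant_moments`, the example matrix, and the sample size N = 100 are assumptions, not values from the paper. It also shows that both the correction to the mean and the variance vanish, to this order, when D = 0.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def discriminant_moments(rho, N):
    """Return (mean, variance) of the sample discriminant to order 1/N.

    mean     = D + (D/N) * [2*S - n(n-1)/2]   as in Equation (34)
    variance = 4 * D**2 * S / N               as in Equation (35)
    where S is the sum of squared correlation coefficients, each counted once.
    """
    n = len(rho)
    D = det(rho)
    S = sum(rho[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
    mean = D + (D / N) * (2 * S - 0.5 * n * (n - 1))
    variance = 4 * D ** 2 * S / N
    return mean, variance


# Hypothetical population correlations and sample size.
rho = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
mean, variance = discriminant_moments(rho, N=100)

# A singular population matrix (D = 0) gives zero mean and zero variance.
singular = [[1.0, 1.0],
            [1.0, 1.0]]
mean0, variance0 = discriminant_moments(singular, N=100)
print(mean, variance, mean0, variance0)
```

For the example matrix D = 0.68 and S = 0.38, so the mean falls slightly below D because 2S is smaller than n(n − 1)/2 = 3.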
Notes
1. It is needful to warn the reader that this is only a very special case, as it has been too often
supposed that the vanishing or non-vanishing of the discriminantal determinant is an
adequate criterion of factorizing.
2. H. E. Soper, Biometrika, Vol. IX. p. 105
3. Philosophical Transactions. Vol. 191 A, pp. 259 and 262 (1898).
4. The additional 2 in the denominator is introduced so as to generalize the sum Σ, and render
u, v not controlled by s, t. It is equivalent to writing ab + bc + ca = 1/2 (ba + cb + ac + ab +
bc + ca), where every pair is taken in both orders.
5. We will represent the four summations as Q1, Q2, Q3 and Q4.
6. This agrees with the value given by Isserlis for the special case of n = 3. See Philosophical
Magazine, Sept. 1917, pp. 205-220. His formula on p. 211 is correct, but on p. 213 there is
a misprint in equation (11), i.e. 2A for 4A2.
#31. The Concord Index for Social Influence (short published
version)
by
Stuart C. Dodd
and
Louise B. Klein
University of Washington
I. Introduction – From “Social Control” to “Social Influence”
The development of concordance theory began with a re-examination of the venerable
sociological concept of "social control." The literature yielded little in the way of consensus on
definition or projection of researchable hypotheses. This raised the question of whether the
concept was sterile as a basis for a unified theory that could lead to cumulative research. The
literature, however, is rich and suggestive, and with some refining of the concept of "social
control" it appears that it could be made to bear fruit in experimental science.
The inefficacy of the concept to date may be due in part to the tendency to reify and
consequently view phenomena as unitary entities rather than as a relationship of component
parts. The difficulties that have beset the sociological study of "leadership," for example, may
be due to one-sided efforts to correlate characteristics of leaders, while ignoring those of their
followers and the context of interaction (note 1). Likewise, in situations of social control, it appears
essential to focus upon relations between the actions of controllers and the reactions of
controllees, and secondarily upon the relations of both with the actions of any agents who
might execute the controls. While other writers have not altogether ignored the relational
aspects of social control, there has not yet been a systematic comparative analysis of these
two or three different kinds of action-within-a-situation, nor any index proposed for measuring
their common interaction. This lack has left us without tools for deriving or testing causal
hypotheses. In short, we have been stuck at the speculative level with no outlook for predicting
or "controlling" control.
When social control is looked upon as a relation among a number of relatees, a
potential measure of the amount of control in a situation becomes apparent. It could be simply
some sort of correlation — thought of as a degree of harmony or agreement among the
behaviors of the relatees. Thus interest is directed toward whether controllers get what they
want, whether they want what controllees do and want, and whether agents use appropriate
means. In other words, our concern is with the over-all "concord," in a situation — with the
extent to which wants, deeds, and other conditions are in balance. A salient question, for
example, would concern the extent to which concord is increased by the alignment or
convergence of contextual factors in the parties to a concord situation, e.g., is concord
increased in proportion as the parties share similar values, or similar cognitive attitudes, or are
near each other in space or time, or are the same people, etc.?
The emphasis upon relationships suggests that the term "social control" might not be
the most fortunate name for a sociological domain that aims to study the processes of social
influence upon behavior. Moreover, the term seems to have acquired certain negative
connotations which may make it a poor symbol for the range of phenomena its fabricators had
in mind. Experience in public opinion polling indicates that "social control" suggests to the
modern mind exploitive manipulation of one group by another, whether by totalitarian ogres,
hidden persuaders, or whatever other objectionable characters. The coiners of the term
certainly did not intend this limitation since they were interested quite as much in the desirable
and necessary controls, including reasonable "law and order," shared ideals, democratic self-control, education and guidance of the young, etc., without which our personalities and
communities would fall apart. Further, the word "control" perhaps suggests that whatever it
designates is relatively successful or complete, whereas many attempts to influence may not,
and may not even be intended to, exercise control in a thorough-going or regulatory sense. We
wish to include in our study all shades of interhuman influence, including the situation in which
intended control fails completely and that in which a gentle influence succeeds in having a
discernible effect.
One further consideration has influenced the present choice of terminology. Taken
collectively, the writers on social control have concerned themselves with almost every
conceivable kind of social situation — every sort at least in which people influence the
behavior of people, directly or indirectly, intentionally or unintentionally, consciously or
unconsciously. The present treatment will not include all of the behaviors that all of these
various writers have treated as manifestations of social control. In particular, situations in
which control is so unconscious that no intention or purpose can be articulated by any group of
controllers are specifically excluded. This stipulation departs from the position taken by a
number of sociologists of whom Paul H. Landis is representative. Landis believes that an
assumption made by other writers "which is questionable (is) that social control should deal
with the conscious, purposeful regulation of society by its members, primarily." Rather it must
include the "impersonal and abstract social and cultural influences which permeate the
personality of each individual, and act as implementing, although often unconscious, forces in
the forming of his behavior." (note 2) The importance of the phenomena to which Landis refers is not
questioned. However, for the present purpose it seems impractical to observe control or
influence with any degree of precision unless there is a statement of intent to control or
influence which can be measured against a degree of its fulfillment. Indeed, to call social
influence without a pre-announced intent to influence a case of "social control" is to make
"social control" equivalent to all social behavior and so a wholly redundant term.
The foregoing considerations have led us to adopt terms such as "intentional social
influence" and the "concordance system of models," rather than to lay claim to a theory labeled
"social control." If the reader finds the range of the system more limiting than the old theoretical
concept of social control, he may perhaps find the limitation compensated by the prospect of
going beyond verbal description to ways and means of measuring, planning, executing, and
evaluating units of socially influential behavior, and eventually compounding these units into
systems of intended influencings.
This article undertakes two pieces of groundwork:
(1) it establishes a set of basic categories of variables common to all the situations of
intended social influencings;
(2) it suggests a way of combining indices of variables from these categories in an
appropriate correlation or index of over-all "concordance."
These are initial and simple steps, and interhuman influence clearly is not a simple
phenomenon. But complexity and process must be observed in terms of some kind of core unit
which can be observed to differ in different time periods or under other differing conditions. It is
such a core unit and an index of it that we attempt to delineate here (note 3).
II. The Basic Categories
Briefly, the categories we propose to use are simply formalizations of the familiar triad,
"goals, means, and results." That this is a fundamental set of categories is an old and
common-sense idea reflected in ordinary language, which has many synonyms for them, such
as "aims, efforts, and achievements." Their use here, however, requires more exacting
specification.
To begin with, "goals, means, and results" or their analogues in situations of social
influence, refer here to three distinct kinds of things that people do. These should not be
confused with the people who do them; influencers, agents, and influencees are not
necessarily three distinct sets of people. The three human groups can be identical, e.g., a club
membership might vote unanimously to assess itself (the goal), members might bill themselves
(the means), and all pay (the results). Or the groups may be only partly distinguishable — a
church group might set up a program for a youth group with overlapping membership. Or all
three groups may be entirely distinct, as when in the mercenary part of the American
Revolutionary War of 1775 Britain hired German troops to help restore political control of
American colonials.
Although we cannot always distinguish our categories by distinguishing groups, we can
always distinguish them as acts. The club membership votes, bills itself, and pays. Britain hired
troops; the troops fought; American colonials resisted control. People, of course, will always be
involved, as will people's desirings and valued objects and various other circumstances; but
these may occur as further context, not as the core determiners, i.e., variables directly
observed in a situation of social influencing (note 4).
Toward operational definition, we need first appropriate terms and symbols. The folk
terminology is loose for present purposes, since we do not mean to include in our study the
influencing of animals, machinery, weather, or a multitude of other legitimate but non-sociological
"goals, means, results" sequences. Let "pro-act" stand as a name for the setting of
goals (which are pre-announced target amounts of change in somebody's behavior), "means-act"
for any behavior serving as a means that may implement a pro-act, and "re-act" for
behavioral results or actual outcomes in the form of the responses of intended influencees
(which can include no response). Let the whole sequence when explicitly recorded be called a
"tri-act." We can symbolize the pro-act as A_I, the means-act as A_III, the re-act as A_II, and the
tri-act as A^3 (note 5).
An operational definition of "conscious or intended social influence" may be stated as
pro-acting which the influencers have pre-announced or verbalized beforehand, or which they
can verbalize under the conditions imposed by whatever scientific instruments of observation
are employed. Since ours is a polling background, our definition might be phrased "... under
the fully standardized conditions of the modern public opinion poll when adapted to basic
research." A similar distinction in terms of an instrumental threshold of observation could
doubtless be made with other instruments, such as the psychologist's tests or the
anthropologist's field techniques.
Pro-acts are thus "speech acts" consciously symbolized in oral exchange or writing.
Familiar examples are such pre-announced intentions as the statement of purposes in a
constitution, a legislative bill, a contract, a plan, or other assertion of objectives by one party in
any interaction. They can be oral responses to the most delicate probing by skilled
interviewers, or at the other extreme they can be the most formally documented political or
religious laws. There is probably no universally fixed or absolute dividing line between speech
and sub-speech acting, but a poll's questions and instructions can operationally define a
working boundary with measurable reliability. Some folk norms, for example, may be
verbalizable by only the more articulate members of a society. Social influence will certainly be
easier to measure the farther we go along a sub-speech to speech continuum, from pro-acting
that may be semi-articulate or insincere or rationalized (and so must be "squeezed" out of the
respondent and reinterpreted by the researcher) to clearly formulated action programs. On the
other hand, the bringing to the surface of semi-conscious intentions of groups can serve the
"social-psychiatric" function of making them subject to conscious manipulation.
Incidentally, it is not necessarily the case that an area of control is likely to be more
latent in simpler societies and more manifest as "law and order" proliferate explicitly in
societies of increasing complexity. For example, it seems probable that a fairly high proportion
of American youth never very consciously entertains the concept of "incest taboo," yet it seems
likely that a very small proportion ever seriously entertains the notion of committing incest and
could tell you so readily enough if asked in terms they understood. By contrast, on the
Micronesian Island of Yap (where the taboo applies to a much wider kin group than the nuclear
family) a perfectly explicit agency of control exists in the form of ancestral ghosts. These hold
the entire kin group responsible by causing the death of any one of its members as
punishment for an occurrence of incest (note 6). Fortunately, since incest does occur, the kin group
also possesses limited control over the ghosts and is sometimes able to avert the catastrophe.
Means-acts may be any acts in such general categories as teaching, suggesting,
persuading, limiting, or coercing. In face-to-face situations they often consist of "unspoken"
speech acts, e.g., the pro-actor simply expects that his intended reactor shares his own
motivating norms, or will observe relevant social amenities, or will expect certain rewards or
penalties. It may sometimes be convenient to observe bi-acts in place of tri-acts. This might be
the case where means-acts are tacit expectations of this sort, though these too can usually be
talked about in response to skilled questioning and so can be made explicit if it is desired.
The re-acts may fall in categories such as learning, imitating, concurring, acquiescing,
complying, or their opposites from not learning to defying. We should not be confused by the
fact that in cases of unsuccessful influence responses may be observed in zero amounts. As
long as some specified group of re-actors is intended by the pro-actors, a re-act is observable
and it will always be a factor in our theoretical construction. Re-actors may, of course, be
aware or they may be unaware that they are being influenced.
The scope of a tri-act will be determined by the observer. One could take as tri-acts the
United Nations' attempting to settle an international conflict, or a hostess sending out
invitations to tea, or simply a tea-drinker asking another to pass the cream. A parent might be
observed to pro-act by wanting to punish a child, to means-act by spanking it, and the child
then observed to re-act by refraining from or not refraining from his misdeed for the rest of the
day. Or one might be interested in the long-term results of a mode of discipline in childhood in
some large population of parents and children. Pro-acting would then presumably include the
purpose of instilling eventual self-discipline, means-acting might consist of corporal
punishment, and re-acting might be observed in some such form as delinquency or non-delinquency in adolescence. As a framework for research, the tri-act must, of course, be
limited by contemporary ability to manipulate the variables at issue while keeping all else
constant. Research may profitably begin with simple situations. The essence of our modeling
is to build it out of and test it by sets of subsets of elements of behavior, i.e., observable
instances of human acts of specified kinds and contexts, and to avoid vague ill-bounded
concepts.
III. The Concordance Index — A Coefficient of Social Influence
Social influence will not be thought of here as some sort of stick (whether big, medium-sized, or a little one). (Recall that we abandoned the term "social control" in part because it
seemed to suggest a "big stick.") "Social influence" is intended here as a strictly relational
concept: in its simplest form it includes at least the three relatees (the three acts), the three
binary relations among them, and the combination of all these in a relationship. Suppose that
we have before us a table of indices for pro-acts, means-acts, and re-acts of some particular
tri-act observed in some particular population in a specified time and place. We wish to
combine the three indices in a relational index that will give us a measure of the over-all
amount of social influence operating in this situation.
A possible relational measure would be a standard multiple correlation with the pro-act
taken as the dependent variable. There is an advantage to be gained, however, in using a non-regression-based correlation. This will be more apparent later when we describe a systematic
rational method for developing hypotheses. In so doing we shall want to be free to let a
condition of any one of the three acts or its context vary independently without having to switch
dependent and independent variables. Two measures which meet this requirement are
Robinson's A and Kendall's W, the former for cardinal scales, the latter for ranked scores or
ordinal scales (note 7). (Intra-class correlation would also work for a bi-act.) These two methods, both
based upon analysis of variance, give identical results when both are used upon the same
ranked data. We will use W as a simpler illustration in the following hypothetical situation. By
"index of concord," or "Cd," we shall, however, refer henceforth to either one of these
measures when used appropriately (note 8) on ordinal or cardinal data. Then subscripts can
distinguish the two forms wherever desired as CdA for the agreement form and CdW for the
concordance form.
Imagine four communities wishing to organize a blood-bank, i.e., to influence blood-donating
behavior of their citizens. Each community pro-acts by setting itself a goal of x cc's of
blood per inhabitant. They vary in size of goal chosen and can thereby be ranked as in the pro-act
column of Figure 1. Their means-acting, observed in terms of man-hours of work, words of
publicity, and dollars' worth of materials is identically ranked (in this example of perfect social
concord) in the three means-acts columns of Figure 1. Finally, the re-acting in actual per capita
donations is ranked in the same way. The formula for W then shows the concordance, as
crudely measured here in ordinal units, to be maximal at unity.
Figure 1:
How to compute the index of concord Cd from fictitious ranks of P parties (4 towns) on n acts
(when starting blood banks) to show full concord of efficiency of social control, as defined by
CdW = 1.0

                   THE END        THE THREE MEANS                                    THE RESULT
ACTS (n = 5)       PRO-ACT        MEANS-ACT 1     MEANS-ACT 2     MEANS-ACT 3        RE-ACT
                   wanting A_I    using A_III:1   using A_III:2   using A_III:3      giving all A_II
                   cc of blood    man-hours       words in        dollars' worth     cc of blood
                   per person     of work         publicity       of materials       per person
Parties (P = 4)    Col. 1         Col. 2          Col. 3          Col. 4             Col. 5
P1                 2nd rank       2nd             2nd             2nd                2nd
P2                 3rd rank       3rd             3rd             3rd                3rd
P3                 1st rank       1st             1st             1st                1st
P4                 4th rank       4th             4th             4th                4th

Party    Sum of ranks R_j (sum of the n ranks R_i)    Square of deviation from the mean
P1       10                                            1.25
P2       15                                            1.25
P3       5                                             11.25
P4       20                                            11.25

Mean of the rank sums as printed: R̄_j = (sum of R_j over the P parties)/P = 40/4 = 10
S = sum over j = 1 to P of (R_j − R̄_j)² = 25

CdW = 12 S / [n² (P² − P)] = 12 • 25 / [5² (4² − 4)] = 300 / (25 • 12) = 1.00
Notice that when concord is perfect as in Figure 1, the variance within each tri-act (within
each row here) is zero. When variance within tri-acts is greater than zero (if among
comparably scaled acts or columns in Figure 1) this measures the amount of discord of social
influence. The greatest possible discord would occur if there were no variance among the
parties (in the columns here) so that the total tri-act variance (i.e., the total variance within
rows) would account for the total variance over all cells. A ratio of tri-act variance to total (act
plus party) variance provides a discord ratio which is free of the original measuring units. The
index of concord is the complement of this ratio.
To illustrate, suppose that the acts in Figure 1 had not all been identically ranked. Let us
say, for example, that in the react column P1, the first community, is ranked first and P3
second. The variance within this column remains unchanged, since it is obviously unaffected
simply by changing the order of the ranks. (It would be affected only in the case that tied ranks
were introduced, in which case there is a correction factor to be introduced into the formula.)
But while column variance is unchanged, the row variance is no longer zero. The discord ratio,
formerly zero, will likewise have increased and computation of CdW reveals concord now to be
reduced from 1.00 to .84.
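The computation of W for the ranks of Figure 1 can be sketched as follows. This is an illustration under stated assumptions: it uses the standard form of Kendall's coefficient, W = 12S / [n²(P³ − P)], with S taken as the sum of squared deviations of the raw rank sums from their mean, and the function name `kendall_w` is introduced here. On that basis the rank sums of Figure 1 (10, 15, 5, 20) average 12.5 and give S = 125, which yields the same value of 1.00 as the arithmetic printed in the figure.

```python
def kendall_w(ranks):
    """Kendall's coefficient of concordance, without a correction for ties.

    ranks[i][j] is the rank of party i on act j, so each column is a
    permutation of 1..P.
    """
    P = len(ranks)        # number of parties (rows)
    n = len(ranks[0])     # number of acts (columns)
    rank_sums = [sum(row) for row in ranks]
    mean_sum = sum(rank_sums) / P
    S = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * S / (n ** 2 * (P ** 3 - P))


# Ranks of the four towns on the five acts, as in Figure 1.
perfect = [[2, 2, 2, 2, 2],   # P1
           [3, 3, 3, 3, 3],   # P2
           [1, 1, 1, 1, 1],   # P3
           [4, 4, 4, 4, 4]]   # P4

# Same ranks, except that P1 is first and P3 second in the re-act column.
swapped = [[2, 2, 2, 2, 1],
           [3, 3, 3, 3, 3],
           [1, 1, 1, 1, 2],
           [4, 4, 4, 4, 4]]

w_perfect = kendall_w(perfect)
w_swapped = kendall_w(swapped)
print(w_perfect, round(w_swapped, 3))
```

Applied to the swap described above, this standard formula gives roughly 0.94 rather than the .84 quoted, so the quoted figure presumably rests on a computation that is not shown here.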
The fact that the concord index summarizes an analysis of variance may not be directly
apparent in the formula for CdW. (The derivation of CdW, which need not concern us here, is
based on the fact that a set of ranks corresponds to the first N natural numbers.) The variance
analysis is easier to see in the formula for Robinson's A. The symbol, A, reflects the fact that
the formula was designed to be a better measure of agreement among judges than ordinary
multiple correlation. Essentially, it is a multiple intra-class correlation. For the reader who is at
home with statistics, Robinson's formula may be stated as A = 1 − D/D_max, where D measures
disagreement among observations within groups by summing the sums of squared deviations
from the group means (more convenient than variances) and D_max represents the maximum
possible disagreement, the sum of squared deviations of all the observations from their
common mean. For the less statistically minded reader it will suffice here to think of the
concord index as the complement of the discord ratio described above. It may also help to
state the formula in alternative forms as:
CdA = concord index = 1 − discord index
= 1 − (σ² acts / (σ² acts + σ² parties))
= 1 − (mean variance within rows / variance of all cells)
= variance between rows / variance of all cells
= σ² parties / σ² cells
= 1 − (σ² within tri-acts / σ² within and between tri-acts)
Note that Cd has limits of zero and one and can thus be viewed as a proportion, or
percentage index when multiplied by 100. It then becomes an operational definition of "the
concord of social influence" in percentage form.
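The agreement form of the index can be sketched in the same way. The code below follows the verbal statement of Robinson's formula given above, 1 − D/D_max, with each party (row) treated as a group; the function name `robinson_a` and the example data are assumptions introduced for illustration, and the columns are assumed to be already on comparable scales, as ranks are.

```python
def robinson_a(cells):
    """Agreement index 1 - D/D_max for a table of comparably scaled indices.

    cells[i][j] is the index of party i on act j.
    D     = sum over rows of squared deviations from the row mean.
    D_max = sum of squared deviations of all cells from the grand mean.
    """
    P = len(cells)
    n = len(cells[0])
    grand_mean = sum(sum(row) for row in cells) / (P * n)
    d = 0.0
    for row in cells:
        row_mean = sum(row) / n
        d += sum((x - row_mean) ** 2 for x in row)
    d_max = sum((x - grand_mean) ** 2 for row in cells for x in row)
    return 1 - d / d_max


# Ranks of Figure 1 (perfect concord) and the variant with P1 and P3
# exchanged in the re-act column.
perfect = [[2, 2, 2, 2, 2],
           [3, 3, 3, 3, 3],
           [1, 1, 1, 1, 1],
           [4, 4, 4, 4, 4]]
swapped = [[2, 2, 2, 2, 1],
           [3, 3, 3, 3, 3],
           [1, 1, 1, 1, 2],
           [4, 4, 4, 4, 4]]

a_perfect = robinson_a(perfect)
a_swapped = robinson_a(swapped)
print(a_perfect, round(a_swapped, 3))
```

On these ranked data the sketch returns the same values as the standard formula for W, in line with the statement that the two measures agree when both are used upon the same ranked data.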
Some of the statistical properties of this concord index in its cardinal subform as
Robinson's coefficient of agreement may be noted. In terms of variance analysis, Cd is a ratio
of variances. Its statistical significance therefore may be tested by an F Test. It is also an index
of correlation among n variables that is not an average intercorrelation nor a multiple
correlation. It is an intra-class correlation extended from two to n variables. From this viewpoint
the classes are the communities or rows of Figure 1, and the concord index is a measure of
the correlation, agreement, like-amount, or absence of dispersion, of the three (or more) action
indices within each community-class when expressed relative to the dispersion or variance
between those community-classes or rows in Figure 1. Thus whatever may make the cell
entries of a row more alike (and so like their mean or "level") makes for more concord;
whatever makes a row's cells unlike each other, and so deviate from their mean, makes for
discord — providing the column acts are comparably rescaled to secure equated means and
variances.
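One way to form the variance ratio mentioned above is the ordinary one-way analysis-of-variance layout, with the parties as classes. The sketch below rests on that assumption and is not taken from Robinson's paper, whose exact test should be consulted before use; the function name `concord_f_ratio` and the degrees of freedom, P − 1 and P(n − 1), are part of the assumed layout.

```python
def concord_f_ratio(cells):
    """F ratio of between-row to within-row mean squares.

    Returns (F, df_between, df_within). With perfect concord the
    within-row sum of squares is zero and F is reported as infinity.
    """
    P = len(cells)
    n = len(cells[0])
    grand_mean = sum(sum(row) for row in cells) / (P * n)
    ss_within = 0.0
    for row in cells:
        row_mean = sum(row) / n
        ss_within += sum((x - row_mean) ** 2 for x in row)
    ss_total = sum((x - grand_mean) ** 2 for row in cells for x in row)
    ss_between = ss_total - ss_within
    df_between = P - 1
    df_within = P * (n - 1)
    if ss_within == 0:
        return float("inf"), df_between, df_within
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Ranks of Figure 1 with P1 and P3 exchanged in the re-act column.
swapped = [[2, 2, 2, 2, 1],
           [3, 3, 3, 3, 3],
           [1, 1, 1, 1, 2],
           [4, 4, 4, 4, 4]]
perfect = [[2] * 5, [3] * 5, [1] * 5, [4] * 5]

f_swapped, df1, df2 = concord_f_ratio(swapped)
f_perfect, _, _ = concord_f_ratio(perfect)
print(round(f_swapped, 2), df1, df2, f_perfect)
```

The resulting F would then be referred to a table of F with the stated degrees of freedom. For ranked data the usual normal-theory assumptions hold only approximately.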
Thus the potential usefulness of the concord index seems to us to be much more than a
tool for summarizing the agreement aspect or concord among goals, means, and results in a
situation of social control (whenever that is specified as "intentional interhuman influencing").
The concord index can serve for much more than measuring the homeostatic or equilibrative
state of the social system that has been observed in Table 1 as actuarial data from a specified
set of groups in a specified set of influential actions.
The concord index when backed by standardizing norms such as in Table 1 can yield
exact predictive inferences as hypotheses, deduced rigorously from the Cd formula and
empirically testable by further observations of such organized group behavior. Thus
sociologists can increasingly predict what the exact percentage effect on the concord of social
influence in a specified population and range of behaviors will be, if specific changes are
engineered or observed in the behaviors represented by any specified cells in Figure 1. Thus
we see this operationally defined index as a useful step in stating testable hypotheses in "IF A,
then B" form. Then, insofar as relevant antecedent A's become more fully and exactly stated
(as by enlarging their Figure 1), in just so far the consequent behavior B of social influencing
can be predicted with improved probability. (A fuller mimeographed version of this paper with
more discussion and application is obtainable from the author). (Editor’s note: see following
paper)
Notes
1. An excellent analysis of leadership which takes full cognizance of its relational nature
appears in: Robert Tannenbaum and Fred Massarik, Leadership: A Frame of Reference,
Institute of Industrial Relations, Reprint No. 68, University of California at Los Angeles,
1958.
The dimensions of groups involving explicit formulas for leader-follower relations and
other relations in social control, along with much of the present article's theory of social
control, were first printed in the chapter under that title in S. C. Dodd, Systematic Social
Science (offset edition), University Bookstore, Seattle, 1947, 788 pp.
2. P. H. Landis, Social Control, New York: J. B. Lippincott, Chapter I, p. 13.
3. In later articles we hope to develop a more adequate model of "intended social influence."
These articles will progressively expand this core model -- first, to a model which deals with
its full context (i.e., to a transaction), then to models dealing with feed-back, multiple
influencings, sequences of the foregoing, and other compounded situations.
4. Analysis of the context is of utmost importance and will be undertaken in the next paper of
this series. Here, however, we are engaged in operationally defining the elementary unit or
instance of influential behavior.
5. The subscripts III and II for means-act and react, respectively, are given in the reverse
order of their time sequence for the reason that it may sometimes be useful to deal with a
situation as a "bi-act," A2, a pro-act-react or interact of influencer and influencee which
omits any observed means-act of a different agent.
6. David M. Schneider, "Political Organization, Supernatural Sanctions, and Punishment for
Incest on the Isle of Yap," American Anthropologist, 59 (No. 5), pp. 791-800.
7. W. S. Robinson, "The Statistical Measurement of Agreement," American Sociological
Review, 22 (February, 1957), pp. 17-25; M. G. Kendall, Rank Correlational Methods, New
York: Hafner Publishing Co., 1955, Chapter 6.
8. In any case, the variables must be comparably expressed, i.e., in scales with like origins,
means, and variances as in using the same number of ranked parties, or in using
percentiles (ordinal units), etc. Robinson's coefficient of agreement can only be used
legitimately here among different indices of acts (such as columns 1-5 in Figure 1) if those
indices have first been converted into comparable scales, i.e., with the same mean and
dispersion. Unless comparable indices are used, the discordance of social influence that is
at issue will be confounded with difference in the indices and units of the different acts.
Rankings, although they are less exact than cardinal scales, assure full
comparability and usually easier observing. A refinement of the concord index in Figure 1
could weight the cell indices so that the three means-acts combined would have a relative
weighting equal to each of the other single acts, i.e., as if constituting a single column with
three functionally alternative subdivisions. Pro-acting or reacting might, of course, also be
observed as sets of multiple acts (i.e., with more than one index for each of them). In
addition to cc's of blood per person, for instance, the communities might have set other
goals such as the acquisition of permanent facilities or building up of a stable set of annual
donors of blood, etc.
SD:62-3
#31. The Concord Index for Social Influence (Full version of paper)
by
Stuart C. Dodd
And
Louise B. Klein
University of Washington, Seattle
I. Introduction – from “social control” to “social influence”
The development of concordance theory began with a re-examination of the venerable
sociological concept of "social control." A search of the literature yielded little in the way of
consensus on a definition of the term or of hypotheses so formulated as to lend themselves
readily to research. This raised the question of whether the concept was sterile as a basis for a
unified theory that could lead to cumulative research. The literature, however, is rich and
suggestive and we most certainly should avoid throwing out the baby with the bath. With some
refining and perhaps renaming, we felt that the idea -- or some part of it -- could be made to
bear fruit in experimental science.
The inefficacy of the concept to date may be due in part to the tendency, so common in
the development of human thought, to reify, and consequently to view relational phenomena
as unitary entities before it becomes apparent that much of their essence lies in the relating of
component parts in various ways and degrees. The difficulties that have beset the sociological
study of "leadership," for example, may be due to one-sided efforts to correlate characteristics
of leaders, while ignoring those of their followers and the context of the interaction. 1 In
situations of social control likewise, it seemed to us crucial to focus upon relations between the
actions of controllers and the reactions of controllees, and secondarily upon the relations of
both with the actions of any agents who might execute the controls. We do not mean to
suggest that other writers have altogether ignored the relational aspects of social control, but
rather that there has not yet been a systematic comparative analysis of these two or three
different kinds of action-within-a-situation, nor any index proposed for measuring their common
interaction. This lack has left us without tools for deriving or testing causal hypotheses. In
short, we have been stuck at the speculative level with no outlook for predicting or "controlling
control."
When we look upon social control as a relation among the two or three relatees named
above, a potential measure of the amount of control in a situation becomes apparent. It would
be simply some sort of correlation -- better thought of here as a degree of harmony or
agreement amongst the behaviors of the relatees, since we will be interested not only in
whether controllers get what they want, but just as much in whether they want what controllees
do and want, and in whether agents use appropriate means. In other words, we are concerned
with the over-all "concord," as we shall henceforth call it, in a situation -- with the extent to
which wants, deeds, and other conditions are in balance. A salient kind of question, for
example, would concern the extent to which concord is increased by the alignment or
convergence of contextual factors in the parties to a concord situation, e.g., is concord
increased in proportion as the parties share similar values, or similar cognitive attitudes, or are
near each other in space or time, or are the same people, etc.?
As our thinking progressed, the emphasis upon relationships seemed to point to a
phase of the concept which is not explicit in "social control," suggesting that the term might not
be the most fortunate name for a sociological domain that aims to study the processes of
social influence upon behavior. Moreover, the term seems to have acquired certain negative
connotations which may make it a poor handle for the range of phenomena its fabricators had
in mind. The antennae of our polling laboratory indicate that "social control" suggests to the
modern mind exploitive manipulation of one group by another, whether by totalitarian ogres,
hidden persuaders, or whatever other objectionable characters. The coiners of the term
certainly did not intend this limitation, since they were interested quite as much in the desirable
and necessary controls, including reasonable "law and order," shared ideals, democratic
self-control, education and guidance of the young, etc., without which our personalities and
communities would fall apart. Further, the word "control" perhaps suggests that whatever it
designates is relatively successful or complete, whereas many attempts to influence may not
be, and may not even be intended, to exercise control in a thoroughgoing or regulatory sense.
We wish to include in our study all shades of interhuman influence, including the situation in
which intended control fails completely and that in which a gentle influence succeeds in having
a discernible effect.
One further consideration has influenced our choice of terminology. Taken collectively,
the writers on social control have concerned themselves with almost every conceivable kind of
social situation -- every sort at least in which people influence the behavior of people, directly
or indirectly, intentionally or unintentionally, consciously or unconsciously. We wish to dispel at
once any thought that we include in our treatment all of the behaviors that all of these various
writers have treated as manifestations of social control. In particular, we specifically exclude
situations in which control is so unconscious that no intention or purpose can be elicited from
any group of controllers. By this stipulation we depart from the position taken by a number of
sociologists of whom Paul H. Landis is representative. Landis believes that an assumption
made by other writers "which is questionable (is) that social control should deal with the
conscious, purposeful regulation of society by its members, primarily." Rather it must include
the "impersonal and abstract social and cultural influences which permeate the personality of
each individual, and act as implementing, although often unconscious, forces in the forming of
his behavior."2 We do not question the importance of the phenomena to which Landis refers,
but it seems impractical to observe control or influence as such with any degree of precision
unless there is a statement of intent to control or influence which can be measured against a
degree of its fulfillment. Indeed, to call social influence without a pre-announced intent to
influence a case of "social control" is to make "social control" equivalent to all social behavior
and so a wholly redundant term.
The foregoing are the considerations that have led us to adopt terms such as
"intentional social influence" and the "concordance system of models," rather than to lay claim
to a theory labeled "social control." If the reader finds the range of the system more limiting
than the old theoretical concept of social control, he may perhaps find the limitation
compensated by the prospect we hope may be brought into view of going beyond verbal
description to ways and means of measuring, planning, executing, and evaluating units of
socially influential behavior, and eventually multiples, sequences, feed-backs, and
combinations of these together with their four standard forms of compounding into systems of
intended influencings.
II. A Unit of Social Influence and Its Coefficient
This first article undertakes two pieces of groundwork:
(1) it establishes a set of basic categories of variables, i.e., those categories which are
common to all the situations that are included in our frame of reference, namely,
those human behaviors correlated with all instances of intended social influencings;
(2) it suggests a way of combining indices of variables from these categories in an
appropriate correlation or index of over-all "concordance."
These are initial and simple steps, and interhuman influence clearly is not a simple
phenomenon. But complexity and process must be observed in terms of some kind of core unit
which can be observed to differ in different time periods or under other differing conditions. It is
such a core unit and an index of it that we attempt to delineate here.
Then in later articles we hope to develop a more adequate model or operationally
defined theory of "intended social influence" (or "intended social control" if the reader prefers).
These articles will progressively expand this core model--first, to a model which deals with its
full context (i.e., to a transaction), then to models dealing with feed-back, multiple influencings,
sequences of the foregoing, and other compounded situations.
A. The basic categories
Briefly, the categories we propose to use are simply formalizations of the familiar triad,
"ends, means, and results." "Ends" (or "goals," as we shall call them, since the word "ends"
can be confused with "results") refers here to those goals which involve the purposive
influencing or motivating of human behavior. That this is a fundamental set of categories is an
old and common-sense idea reflected in ordinary language, which has many synonyms for
them, such as "aims, efforts, and achievements." Their use here, however, requires more
exacting specification.
To begin with, we specify that when we use "goals, means, and results" (or their
analogues in situations of social influence -- see below), we refer to three distinct kinds of
things that people do. These should not be confused with the people who do them; influencers,
agents, and influencees are not necessarily three distinct sets of people. The three human
groups can be identical, e.g., a club membership might vote unanimously to assess itself (the
goal), members might bill themselves (the means), and all pay (the results). Or the groups may
be only partly distinguishable--a church group might set up a program for a youth group with
overlapping membership. Or all three groups may be entirely distinct, as when in the
mercenary part of the American Revolutionary War of 1775 Britain hired German troops to help
restore political control of American colonials.
Although we cannot always distinguish our categories by distinguishing groups, we can
always distinguish them as acts. The club membership votes, bills itself, and pays. Britain hired
troops; the troops fought; American colonials resisted control. People, of course, will always be
involved, as will people's desirings and valued objects and various other circumstances; but
these may occur as further context, not as the core determiners, i.e., variables directly
observed in a situation of social influencing. Analysis of the context is of utmost importance
and will be undertaken in the next paper of this series. Here, however, we are engaged in
operationally defining the elementary unit or instance of influential behavior.
Toward an operational definition, we need first appropriate terms and symbols. The folk
terminology is loose for our purposes, since we do not mean to include in our study the
influencing of animals, machinery, weather, or a multitude of other legitimate but
non-sociological "goals, means, ends" sequences. Let us take "pro-act" as a name for the setting of
goals (which are pre-announced target amounts of change in somebody's behavior),
"means-act" for any behavior serving as a means that may implement a pro-act, and "react" for
behavioral results or actual outcomes in the form of the responses of intended influencees
(which can include no response). Let the whole sequence when explicitly recorded be called
for convenience a "tri-act." We can symbolize the pro-act as AI, the means-act as AIII, the react
as AII, and the tri-act as A3. (The subscripts III and II for means-act and react, respectively, are
given in the reverse order of their time sequence for the reason that it may sometimes be
useful to deal with a situation as a "bi-act," A2, a pro-act-react or interact of influencer and
influencee which omits any observed means-act of a different agent.)
We stated in the introduction that we do not propose to study unconscious or
unintended social influence, but an operational definition of "conscious or intended social
influence" is still required. We define it as pro-acting which the influencers have pre-announced
or verbalized beforehand, or which they can verbalize under the conditions imposed by
whatever scientific instruments of observation are employed. Since ours is a polling laboratory,
our own definition might be phrased ". . . under the fully standardized conditions of the modern
public opinion poll when adapted to basic research." A similar distinction in terms of an
instrumental threshold of observation could doubtless be made with other instruments, such as
the psychologist's tests or the anthropologist's field techniques.
Pro-acts are thus seen to be "speech acts," acts consciously symbolized in speech or
writing. Familiar examples are such pre-announced intentions as the statement of purposes in
a constitution, a legislative bill, a contract, a plan, or other assertion of objectives by one party
in any interaction. They can be oral responses to the most delicate probing by skilled
interviewers, or at the other extreme they can be the most formally documented political or
religious laws.
There is probably no universally fixed or absolute dividing line between speech and
sub-speech acting, but a poll's questions and instructions can operationally define a working
boundary with measurable reliability. Some folk norms, for example, may be verbalizable by
only the more articulate members of a society. Social influence will certainly be easier to
measure the farther we go along a sub-speech to speech continuum, from pro-acting that may
be semi-articulate or insincere or rationalized (and so must be "squeezed" out of the
respondent and re-interpreted by the researcher) to clearly formulated action programs. On the
other hand, the bringing to the surface of semi-conscious intentions of groups can serve the
"social psychiatric" function of making them subject to conscious manipulation.
Incidentally, it is not necessarily the case that an area of control is likely to be more
latent in simpler societies and more manifest as "law and order" proliferate explicitly in
societies of increasing complexity. For example, it seems probable that a fairly high proportion
of American youth never very consciously entertains the concept of "incest taboo," yet it seems
a safe bet that a very small proportion ever seriously entertains the notion of committing incest
and could tell you so readily enough if asked in terms they understood. By contrast, on the
Micronesian Island of Yap (where the taboo applies to a much wider kin group than the nuclear
family) a perfectly explicit agency of control exists in the form of the ancestral ghosts. These
hold the entire kin group responsible by causing the death of any one of its members as
punishment for an occurrence of incest. Fortunately, since incest does occur, the kin group
also possesses limited control over the ghosts and is sometimes able to avert the
catastrophe.3
Means-acts may be any acts in such general categories as teaching, suggesting,
persuading, limiting, or coercing. In face-to-face situations they may and often do consist of
"unspoken" speech acts, e.g., the pro-actor simply expects that his intended reactor shares his
own motivating norms, or will observe relevant social amenities, or will expect certain rewards
or penalties. We mentioned earlier that it may sometimes be convenient to observe bi-acts in
place of tri-acts. This might be the case where means-acts are tacit expectations of this sort,
though these too can usually be talked about in response to skilled questioning and so can be
made explicit if it is desired.
The re-acts may fall in categories such as learning, imitating, concurring, acquiescing,
complying, or their opposites from not learning to defying. We should not be confused by the
fact that in cases of unsuccessful influence responses may be observed in zero amount. As
long as some specified group of reactors is intended by the pro-actors, a react is observable
and it will always be a factor in our theoretical construction. Reactors may, of course, be aware
or they may be unaware that they are being influenced.
The scope of a tri-act will be determined by the observer. One could take as tri-acts the
United Nations' attempting to settle an international conflict, or a hostess sending out
invitations to tea, or simply a tea-drinker asking another to pass the cream. A parent might be
observed to pro-act by wanting to punish a child, to means-act by spanking it, and the child
then observed to react by refraining from or not refraining from his misdeed for the rest of the
day. Or one might be interested in the long-term results of a mode of discipline in childhood in
some large population of parents and children. Pro-acting would then presumably include the
purpose of instilling eventual self-discipline, means-acting might consist of corporal
punishment, and reacting might be observed in some such form as delinquency or
non-delinquency in adolescence. As a framework for research, the tri-act must, of course, be
limited by contemporary ability to manipulate the variables at issue while keeping all else
constant. As in all controlled research, we must begin with simple situations. The essence of
our modeling is to build it out of and test it by sets of subsets of elements of behavior, i.e.,
observable instances of human acts of specified kinds and contexts, and to avoid vague
ill-bounded concepts. We seek to use and develop extensional language to displace the current
largely intensional language of sociology -- that is, to use words that point out instances of the
word's referent instead of words that merely rename the property (or properties) of the referent.
Thus by using the term "an influential tri-act" one points to the three necessary and sufficient
behavioral factors (i.e., the setting of goals, AI, the using of means thereto, AIII, and the
changing of the reactions of influencees, AII) in every occurrence more definitely than one does
by using the term "influential behavior."
The problems involved in gathering and analyzing data will differ enormously in different
situations, since the nature of tri-acts can vary over the whole range of interhuman influential
activity. Not being unique to the testing of this particular theoretical construction, such
problems will not be enlarged upon here. It may avoid confusion, however, to point out that
when measuring the three acts of a tri-act in a population, each "individual" that is to be ranked
or scored for each act will consist of a sub-group or unit (which we will call "a tri-party") of the
total population. This tri-party in turn will consist of at least one pro-actor, one role-actor, and
one reactor (who may be wholly, partially, or not at all the same set of persons). In the
illustration which is portrayed in Figure 1 and described in the next section, the total population
consists of four communities engaged in analogous tri-acts, and each community is taken as
a tri-party. In the matrix of Figure 1 one tri-party is represented in each row, and the three
kinds of acts of the tri-act are represented in the columns.
Figure 1
How to compute the index of concord, Cd, from fictitious ranks of P parties (4 towns) on n acts
(when starting blood banks) to show full concord or efficiency of social control, as defined by
Cdw = 1.0

Acts (n = 5):
Col. 1  PRO-ACT (the end): wanting, AI, cc of blood per person
Col. 2  MEANS-ACT 1: using, AIII:1, man-hours of work
Col. 3  MEANS-ACT 2: using, AIII:2, words in publicity
Col. 4  MEANS-ACT 3: using, AIII:3, dollars' worth of materials
        (Cols. 2-4 are the three means)
Col. 5  RE-ACT (the result): giving, AII, cc of blood per person

Parties    Col. 1   Col. 2   Col. 3   Col. 4   Col. 5   Sum of ranks   Squared deviation
(P = 4)                                                 Rj             (Rj - R̄)²/n
P1         2nd      2nd      2nd      2nd      2nd      10              1.25
P2         3rd      3rd      3rd      3rd      3rd      15              1.25
P3         1st      1st      1st      1st      1st       5             11.25
P4         4th      4th      4th      4th      4th      20             11.25

Rj = sum of a party's ranks over the n acts
Mean of the rank sums: R̄ = Σ Rj / P = 50/4 = 12.5
S = Σ (Rj - R̄)²/n, summed over the P parties, = 25
Cdw = 12 S / [n (P³ - P)] = (12 • 25) / [5 (4³ - 4)] = 300 / (25 • 12) = 1.00
B. The concordance index - a coefficient of social influence
Suppose now that we have before us a table of indices for pro-acts, means-acts, and
reacts of some particular tri-act observed in some particular population in a specified time and
place. We wish to combine the three indices in a relational index that will give us a measure of
the over-all amount of social influence operating in this situation.
As was anticipated in the introduction, a measure of social influence can be thought of
as a measure of the degree to which the three acts stand in a relationship of agreement or
balance, or, to use our adopted term, of "concord." As such it is not intended and cannot be a
measure of some sort of absolute power or efficacy possessed by influencers. To the extent
that we may be inclined to think of social influence as some sort of stick (whether a big,
medium-sized, or little one), we must modify our concept here. (Recall that we abandoned the
term "social control" in part because it seemed to suggest a "big stick.") "Social influence" then
is intended here as a strictly relational concept: in its simplest form it includes at least the three
relatees (the three acts), the three binary relations among them, and the combination of all
these in a relationship. (Further factors of the context of the core tri-acts in the total relevant
situations which we call the “transact” will be specified later.)
A possible relational measure would be a standard multiple correlation with the pro-act
taken as the dependent variable. There is an advantage to be gained, however, in using a
non-regression-based correlation. This will be more apparent later when we describe a systematic
rational method for developing hypotheses. In so doing we shall want to be free to let a
condition of any one of the three acts or its context vary independently without having to switch
dependent and independent variables. Two measures which meet this requirement are
Robinson's A and Kendall’s W, the former for cardinal scales, the latter for ranked scores or
ordinal scales.4 (Intra-class correlation would also work for a bi-act.) These two methods, both
based upon analysis of variance, give identical results when both are used upon the same
ranked data. We will use W as a simpler illustration in the following hypothetical situation. By
"index of concord," or "Cd," we shall, however, refer henceforth to either one of these
measures when used appropriately5 on ordinal or cardinal data. Then subscripts can
distinguish the two forms wherever desired as CdA for the agreement form and Cdw for the
concordance form.
Imagine four communities wishing to organize a blood bank, i.e., to influence
blood-donating behavior of their citizens. Each community pro-acts by setting itself a goal of x cc's of
blood per inhabitant. They vary in size of goal chosen and can thereby be ranked as in the
pro-act column of Figure 1. Their means-acting, observed in terms of man-hours of work, words of
publicity, and dollars' worth of materials is identically ranked (in this example of perfect social
concord) in the three means-acts columns of Figure 1. Finally, the reacting in actual per capita
donations is ranked in the same way. The formula for W then shows the concordance, as
crudely measured here in ordinal units, to be maximal at unity.
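The computation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name kendall_w and the list-of-rows layout are introduced here and are not the authors', and the formula is the standard untied form of Kendall's W, 12 S / [n² (P³ - P)], where S is the sum of squared deviations of the parties' rank sums from their mean.

```python
def kendall_w(ranks):
    """Kendall's W for a table of untied ranks.

    ranks: list of rows, one row per party, one column per act.
    (Hypothetical helper; name and layout are assumptions.)
    """
    p = len(ranks)         # number of parties (rows)
    n = len(ranks[0])      # number of acts (columns)
    rank_sums = [sum(row) for row in ranks]
    mean_sum = sum(rank_sums) / p
    # Sum of squared deviations of the rank sums from their mean.
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (n ** 2 * (p ** 3 - p))


# Ranks of the four communities on the five acts, as in Figure 1.
figure_1 = [
    [2, 2, 2, 2, 2],  # P1
    [3, 3, 3, 3, 3],  # P2
    [1, 1, 1, 1, 1],  # P3
    [4, 4, 4, 4, 4],  # P4
]

w_full_concord = kendall_w(figure_1)
print(w_full_concord)  # 1.0, i.e. perfect concord
```

Because every act ranks the communities identically, the rank sums are as spread out as they can be, and the index reaches its ceiling of unity.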
Notice that when concord is perfect as in Figure 1, the variance within each tri-act
(within each row here) is zero. When variance within tri-acts is greater than zero (if among
comparably scaled acts or columns in Figure 1) this measures the amount of discord of social
influence. The greatest possible discord would occur if there were no variance among the
parties (in the columns here) so that the total tri-act variance (i.e., the total variance within
rows) would account for the total variance over all cells. A ratio of tri-act variance to total (act
plus party) variance provides a discord ratio which is free of the original measuring units. The
index of concord is the complement of this ratio. Multiplying by 100 expresses the concordance
as a percentage of the total variance among acts-plus-parties in the situation or transaction
observed. "Transaction" here denotes an action-in-full-context. The context of each
community's action here is the actions of the other communities. The concord of social
influence is measured relative, not to a single act or community, but to the whole set of acts
and communities.
To illustrate, suppose that the acts in Figure 1 had not all been identically ranked. Let us
say, for example, that in the react column P1, the first community, is ranked first and P3
second. The variance within this column remains unchanged, since it is obviously unaffected
simply by changing the order of the ranks. (It would be affected only in the case that tied ranks
were introduced, in which case there is a correction factor to be introduced into the formula.)
But while column variance is unchanged, the row variance is no longer zero. The discord ratio,
formerly zero, will likewise have increased and computation of Cdw reveals concord now to be
.84, or 84 per cent of maximal concord.
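The effect of such a swap can be checked with a short sketch. The helper kendall_w is a hypothetical name introduced here, using the standard untied form of Kendall's W. Under that assumption, swapping P1 and P3 in the react column of the four-party table of Figure 1 gives an index of about .94. The value of .84 is what the same swap yields when only three parties (ranked 2nd, 3rd, 1st) are compared, which appears to be the case the text has in mind.

```python
def kendall_w(ranks):
    """Kendall's W for untied ranks; rows are parties, columns are acts."""
    p = len(ranks)
    n = len(ranks[0])
    rank_sums = [sum(row) for row in ranks]
    mean_sum = sum(rank_sums) / p
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (n ** 2 * (p ** 3 - p))


# Figure 1 with P1 and P3 exchanged in the last (react) column only.
four_parties = [
    [2, 2, 2, 2, 1],  # P1: now ranked first on the react
    [3, 3, 3, 3, 3],  # P2
    [1, 1, 1, 1, 2],  # P3: now ranked second on the react
    [4, 4, 4, 4, 4],  # P4
]
# The same swap when only the first three parties are ranked.
three_parties = four_parties[:3]

w_four = kendall_w(four_parties)
w_three = kendall_w(three_parties)
print(round(w_four, 3))   # 0.936
print(round(w_three, 3))  # 0.84
```

In both cases the column variances are untouched by the swap; only the spread of the row sums shrinks, and the index falls below unity accordingly.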
The fact that the concord index summarizes an analysis of variance may not be directly
apparent in the formula for Cdw. (The derivation of Cdw, which need not concern us here, is
based on the fact that a set of ranks corresponds to the first N natural numbers.) The variance
analysis is easier to see in the formula for Robinson's A. The symbol, A, reflects the fact that
the formula was designed to be a better measure of agreement among judges than ordinary
multiple correlation. Essentially, it is a multiple intra-class correlation. For the reader who is at
home with statistics, Robinson's formula may be stated as 1-D/Dmax where D measures disagreement among observations within groups by summing the sums of squared deviations
from the group means (more convenient than variances) and Dmax represents the maximum
possible disagreement, the sum of squared deviations of all the observations from their
common mean. For the less statistically minded reader it will suffice here to think of the
concord index as the complement of the discord ratio described above.
It may also help to state the formula in alternative forms as:
CdA = concord index = 1 - discord index
= 1 - (σ² acts / σ² acts + parties)
= 1 - (mean variance within rows / variance of all cells)
= variance between rows / variance of all cells
= σ² parties / σ² cells
= 1 - (σ² within tri-acts / σ² within and between tri-acts)
Note that Cd has limits of zero and one and can thus be viewed as a proportion, or
percentage index when multiplied by 100. It then becomes an operational definition of "social
influence" in percentage form.
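The form 1 - D/Dmax can be sketched directly from the verbal definition above. The function names robinson_a and kendall_w and the example table are assumptions made for illustration; D is taken as the sum of squared deviations of each cell from its own row (tri-party) mean and Dmax as the sum of squared deviations of all cells from their common mean. The sketch also illustrates the statement that the two measures agree on the same ranked data.

```python
def robinson_a(scores):
    """Concord index 1 - D/Dmax; rows are parties, columns are acts."""
    cells = [x for row in scores for x in row]
    grand_mean = sum(cells) / len(cells)
    # D: disagreement within rows (squared deviations from each row mean).
    d = 0.0
    for row in scores:
        row_mean = sum(row) / len(row)
        d += sum((x - row_mean) ** 2 for x in row)
    # Dmax: squared deviations of all cells from the common mean.
    d_max = sum((x - grand_mean) ** 2 for x in cells)
    return 1 - d / d_max


def kendall_w(ranks):
    """Kendall's W for untied ranks, for comparison."""
    p = len(ranks)
    n = len(ranks[0])
    rank_sums = [sum(row) for row in ranks]
    mean_sum = sum(rank_sums) / p
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (n ** 2 * (p ** 3 - p))


# Hypothetical ranked table: Figure 1 with one swap in the react column.
ranks = [
    [2, 2, 2, 2, 1],
    [3, 3, 3, 3, 3],
    [1, 1, 1, 1, 2],
    [4, 4, 4, 4, 4],
]

cd_a = robinson_a(ranks)
cd_w = kendall_w(ranks)
print(round(cd_a, 3), round(cd_w, 3))  # 0.936 0.936
```

Multiplying either value by 100 gives the percentage form of the index.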
Some of the statistical properties of this concord index in its cardinal subform as
Robinson's coefficient of agreement may be noted. In terms of variance analysis, Cd is a ratio
of variances. Its statistical significance therefore may be tested by an F Test. It is also an index
of correlation among n variables that is not an average intercorrelation nor a multiple
correlation. It is an intra-class correlation extended from two to n variables. From this viewpoint
the classes are the communities or rows of Figure 1, and the concord index is a measure of
the correlation, agreement, like-amount, or absence of dispersion, of the three (or more) action
indices within each community-class when expressed relative to the dispersion or variance
between those community-classes or rows in Figure 1. Thus whatever may make the cell
entries of a row more alike (and so like their mean or "level") makes for more concord;
whatever makes a row's cells unlike each other, and so deviate from their mean, makes for
discord -- providing the column acts are comparably re-scaled to secure equated means and
variances.
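The proviso about comparable re-scaling can be illustrated with a small sketch. Everything in it is an assumption for illustration: the function names, the choice of z-scores (mean 0, standard deviation 1) as the common scale, and the made-up scores for a goal in cc's, man-hours of work, and cc's donated. The three columns are exactly proportional, yet the raw index is low simply because the units differ; after re-scaling the index is unity.

```python
from statistics import mean, pstdev


def robinson_a(scores):
    """Concord index 1 - D/Dmax; rows are parties, columns are acts."""
    cells = [x for row in scores for x in row]
    grand_mean = sum(cells) / len(cells)
    d = 0.0
    for row in scores:
        row_mean = sum(row) / len(row)
        d += sum((x - row_mean) ** 2 for x in row)
    d_max = sum((x - grand_mean) ** 2 for x in cells)
    return 1 - d / d_max


def standardize_columns(scores):
    """Rescale each column (act) to mean 0 and standard deviation 1.

    Assumes no column is constant (its standard deviation would be zero).
    """
    z_columns = []
    for col in zip(*scores):
        m, s = mean(col), pstdev(col)
        z_columns.append([(x - m) / s for x in col])
    # Transpose back so that rows are parties again.
    return [list(row) for row in zip(*z_columns)]


# Hypothetical cardinal scores: goal (cc), man-hours, donations (cc).
scores = [
    [10, 200, 8],
    [20, 400, 16],
    [30, 600, 24],
    [40, 800, 32],
]

raw = robinson_a(scores)
rescaled = robinson_a(standardize_columns(scores))
print(round(raw, 3), round(rescaled, 3))
```

The low raw value reflects only the difference in units between columns, which is the confounding that the proviso is meant to prevent.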
In proportion as the concord index approaches its maximum at unity, one may interpret
this in various terms such as: There is a tendency for the tri-act indices to form "an equilibrium"
(among the variables specified by the Figure 1 matrix) "homeostasis" (i.e., group-maintained
constancy under changing conditions), a parity or single "level" with each other, a common
proportionality; to be alike or in agreement or highly correlated; to vary less within the tri-party
classes (Figure 1 rows) than between those classes; to approach optimal effectiveness 6--in
short, "to concord" with one another.
Thus this concord index, Cd, of social influence can, whenever so specified, measure
the degree of concord among:
(a) one or many goals or indices of each goal
(b) one or many means to each goal or set of goals
(c) one or many results or indices of each result
A refinement of the concord index in Figure 1 could weight the cell indices so that the
three means-acts combined would have a relative weighting equal to each of the other single
acts, i.e., as if constituting a single column with three subdivisions. Pro-acting or reacting
might, of course, also be observed as sets of multiple acts (i.e., with more than one index for
each of them). In addition to cc's of blood per person, for instance, the communities might
have set other goals such as the acquisition of permanent facilities or building up of a stable
set of annual donors of blood, etc.
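One way such a weighting might be carried out is sketched below. This is an assumption, not a procedure given in the paper: each cell's squared deviation is multiplied by its column weight, row means and the common mean are weighted in the same way, and the three means-act columns each receive a weight of one third so that together they count as a single act. The function name weighted_concord is hypothetical.

```python
def weighted_concord(scores, weights):
    """Weighted form of 1 - D/Dmax; rows are parties, columns are acts."""
    total_weight = sum(weights)
    # Weighted mean of all cells.
    grand_mean = sum(
        w * x for row in scores for w, x in zip(weights, row)
    ) / (total_weight * len(scores))
    d = 0.0
    d_max = 0.0
    for row in scores:
        row_mean = sum(w * x for w, x in zip(weights, row)) / total_weight
        d += sum(w * (x - row_mean) ** 2 for w, x in zip(weights, row))
        d_max += sum(w * (x - grand_mean) ** 2 for w, x in zip(weights, row))
    return 1 - d / d_max


# Figure 1 with P1 and P3 exchanged in the first means-act column only.
ranks = [
    [2, 1, 2, 2, 2],
    [3, 3, 3, 3, 3],
    [1, 2, 1, 1, 1],
    [4, 4, 4, 4, 4],
]

equal = weighted_concord(ranks, [1, 1, 1, 1, 1])
means_as_one = weighted_concord(ranks, [1, 1 / 3, 1 / 3, 1 / 3, 1])
print(round(equal, 4), round(means_as_one, 4))  # 0.936 0.9605
```

With the means-acts down-weighted, a disagreement confined to one of them lowers the index less than it does when all five columns count equally.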
III. Application
It may be helpful to consider the simplest possible case of social influences, namely, a
dichotomous bi-act of just one tri-party as specified by the sentence: A says to B, "Apologize!"
and B reacts by either complying or defying.7 If B complies, as in Figure 2, Case 1, concord is
maximal, i.e., Cd = 1. If B refuses, concord is minimal; Cd = 0.
Figure 2
          Pro-act by A        React by B               σ² act    Cd = 1 - σ² act / σ² act + party
Case 1    1 (= apologize!)    1 (= apologizes)         0         1 - 0/0 = 1
Case 2    1 (= apologize!)    0 (= won't apologize)    .5        1 - .5/.5 = 1 - 1 = 0
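The two cases of Figure 2 can be reproduced with a minimal sketch. The function name concord is an assumption, and so is the handling of Case 1: when there is no variance at all the ratio is 0/0, and the sketch simply follows the figure in treating that case as full concord.

```python
def concord(scores):
    """Concord index 1 - D/Dmax; rows are parties, columns are acts."""
    cells = [x for row in scores for x in row]
    grand_mean = sum(cells) / len(cells)
    d = 0.0
    for row in scores:
        row_mean = sum(row) / len(row)
        d += sum((x - row_mean) ** 2 for x in row)
    d_max = sum((x - grand_mean) ** 2 for x in cells)
    if d_max == 0:
        # No variance anywhere: treated as full concord, as in Figure 2.
        return 1.0
    return 1 - d / d_max


# One party, two acts: the pro-act by A and the react by B.
case_1 = [[1, 1]]  # A says "Apologize!" and B apologizes
case_2 = [[1, 0]]  # A says "Apologize!" and B will not

cd_case_1 = concord(case_1)
cd_case_2 = concord(case_2)
print(cd_case_1, cd_case_2)  # 1.0 0.0
```

In Case 2 all of the variance in the table lies within the single row, so the discord ratio is one and the index falls to zero.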
By now it can be seen that the concord index is versatile in the following respects: it
serves
(1) for any number of parties,
(2) for any number of acts or multiple indices of acts, and
(3) for all-or-none, ordinal, or cardinal variables (if comparably scaled).
The statistically minded reader may note also that the index serves for all statistical
moments of the distributions of each variable, i.e., concord can be maximal only if total
frequencies, means, variances, correlations, skewnesses, and kurtosis are all matched or
made comparable over the set of variables.
Concord, as operationally defined by the formula for the concord index together with the
concord matrix, such as Figure 1 for Cdw, is much more than simply a way of summarizing the
data. It is a tool that may be used for making inferences and predictions and eventually for
better social engineering. Let us look for a moment at the social behaviors that are measured
in various cells of the matrix in Figure 1. Imagine now that the acts are measured by scores on
some sort of comparable scales instead of by ranks so that for convenience we can consider
an alteration in only one cell at a time instead of a switch between two cells as contained in
any ranking. Now in any normally balanced situation or set of parties suppose
(1) that some one of the parties doubles its pro-act by setting a goal of twice as many
cc's of blood per person, or suppose
(2) that pro-acting does not change but that one of the parties gives its program only
one third as much publicity, or suppose
(3) that one community's reacts are lower because the pro-actors failed to take into
account an unusually high proportion of persons in that community who are ineligible
to donate blood by reason of their age.
In each case we can hold all else constant in the matrix (Figure 1 modified) and ask what
effect we would expect the specified change of behavior to have upon its cell index -- will the
index in that cell change toward or away from the row mean? For we know by definition that if
the change is toward the mean, concord will increase, and that if it is away from the mean,
concord will decrease. In these three cases clearly the behavioral changes will cause
deviations away from the row means (since we assumed a normal, well-balanced initial state of
affairs) and so we predict decreases in concord. Thus we can predict a lessening of net social
influence in such situations by exact inferences from specified changes in any parts of the
situation. This is what an operational definition in a statistical index such as Cd enables the
sociologist to do. Hitherto sociologists have all too often attempted to reason or predict from
isolated facts rather than from relational networks of interdependent facts such as "a
transaction" formula specifies. The concord model thus enables the social scientist to predict
the consequences to the whole system (defined by Cd) which will follow from modifying in
specified amounts any part of that system as specified by cells in Figure 1.
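The three supposed changes can be traced on a small matrix. This is a sketch under stated assumptions: the scores are invented, the starting state is taken as perfectly balanced, and Cd is read as one minus the within-row sum of squares over the sum of squares of all cells. The helper names `concord` and `with_cell` are hypothetical.

```python
def concord(rows):
    # Assumed reading: Cd = 1 - SS_within_rows / SS_all_cells.
    cells = [x for row in rows for x in row]
    grand = sum(cells) / len(cells)
    ss_total = sum((x - grand) ** 2 for x in cells)
    ss_within = sum(
        (x - sum(row) / len(row)) ** 2 for row in rows for x in row
    )
    return 1.0 if ss_total == 0 else 1.0 - ss_within / ss_total


def with_cell(rows, party, act, value):
    """Return a copy of the matrix with one cell replaced."""
    copy = [list(row) for row in rows]
    copy[party][act] = value
    return copy


# Hypothetical comparably scaled scores; columns are goal, publicity, result.
balanced = [
    [10.0, 10.0, 10.0],
    [20.0, 20.0, 20.0],
    [30.0, 30.0, 30.0],
]

scenarios = {
    "balanced": balanced,
    "(1) goal doubled": with_cell(balanced, 0, 0, 20.0),
    "(2) publicity cut to one third": with_cell(balanced, 1, 1, 20.0 / 3),
    "(3) react lowered": with_cell(balanced, 2, 2, 15.0),
}

for name, matrix in scenarios.items():
    print(f"{name}: Cd = {concord(matrix):.3f}")
```

Each altered cell moves away from its row mean, so each scenario prints a value below the balanced value of 1.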
Next, going beyond prediction, the behavioral scientist can better control or engineer a
social influence, the structure of which may have been systematically measured in a set of
communities. Suppose the State Director of a blood bank campaign studies Figure 1 as
observed for the last year when the coming year's program for the communities of his state is
being planned. Which cells in Figure 1 for that state in the previous year show largest
deviations, each from its community level or row mean? If these largest discrepancies from a
well-balanced or highly concordant situation can be reduced, both the local and statewide level
of effectiveness and of concord of social influence may be improved. He may note such
modifiable cell entries, for example, as:
1. Community X's goals greatly exceeded its level. Can more realistic goals that are
only slightly above last year's level of means and results serve better this year to
raise the net results?
2. The publicity in Community Y was greatly below par (i.e., below the level or average
for the acts of that community). Can a special effort to increase its publicity
percentage this year help to raise the net results?
3. The results in Community Z were 50 per cent below its level, even though its means
taken were well above par for Z. Why were the means here so ineffective? What
special factors hitherto unnoticed may be operating in Z to make the usual means
produce less results here than elsewhere? Was the large publicity for the blood bank
overdone somehow? Did the Committee wastefully spend too much of its many man
hours talking to each other in meetings, rather than in trying to talk to every fellow
townsman, urging donations to the blood bank? Was the budget used up on too
much radio time when listeners are few? Are there special resistances in Community
Z? Etc.
4. Community W's 25 per cent failure to fulfill its goals last year seems due to 50 per
cent deficient effort in man hours devoted by the Organizing Committee. Will a drive
to recruit a Committee this year who will promise to devote 50 per cent more time to
the campaign bring the results more up to par?
In short, diagnosis of the weak spots in situations of social influencing or intended
achievement can help direct remedial effort more effectively.
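The diagnostic step described above, finding the cells that deviate most from their own community level, can be sketched as follows. All figures are invented for illustration on a common scale where 100 is par, and the function name `largest_deviations` is hypothetical.

```python
# Hypothetical last-year matrix on a common scale (100 = par).
acts = ["goal", "publicity", "man-hours", "result"]
matrix = {
    "X": [150, 100, 100, 100],
    "Y": [100, 55, 100, 95],
    "Z": [100, 120, 120, 50],
    "W": [100, 100, 50, 75],
}


def largest_deviations(matrix, acts, top=4):
    """Rank cells by absolute deviation from their own row mean (the 'level')."""
    found = []
    for party, row in matrix.items():
        level = sum(row) / len(row)
        for act, value in zip(acts, row):
            found.append((abs(value - level), party, act, value - level))
    found.sort(reverse=True)
    return [(party, act, dev) for _, party, act, dev in found[:top]]


for party, act, dev in largest_deviations(matrix, acts):
    print(f"Community {party}: {act} is {dev:+.1f} from its level")
```

The ranked list gives the Director a short agenda of modifiable cells, one per community in this invented case.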
Then when annual standardized records from many communities become cumulated,
one can expect that first norms, and later laws, of social practice might emerge. Is it not
reasonable to expect that norms could be compiled in time for communities of such and such
relevant characteristics maximizing the absolute size of their results (or else their per cents of
fulfillment, etc.) by using such and such amounts of each means and balance among them?
With such norms increasingly built up, inquiries and even experiments can go forward to
produce simple laws or highly general and predictable "if . . . then" statements. We
hypothesize, for example, that the logarithmic law of diminishing returns in this field of group
social work will become visible and tested. Thus if, in any normally balanced transaction of a
measured social influence, one single means-factor is increased excessively and in isolation,
then the result on influencees will increase, but diminishingly so, in a predictable logarithmic
curve.8
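One hedged way to write such a curve, assuming the simplest logarithmic form (the symbols R, M, a, and b are illustrative and do not appear in the original), is

\[
R = a + b \ln M, \qquad b > 0,
\]

where M is the amount of the single means-factor and R is the result on influencees. Under this assumed form the marginal gain is

\[
\frac{dR}{dM} = \frac{b}{M},
\]

which shrinks as M grows, and each doubling of M adds the same fixed increment, \(b \ln 2\), to R.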
Thus to know that a goal was doubled does not tell us whether "social control" was
improved or worsened thereby, until we also know its context. This context includes the
relation of that doubling of the goal to any change of the means-act or of the react of that tri-party and also the relation of all this to all other tri-parties in the situation studied. For every
degree of social influence is always relative to the whole situation observed since the variance
of the rows (Figure 1) is always expressed as a proportion of the variance of all the cells in
defining the concord, Cd, or evenness of that case of social influencing.
The last sentence of the preceding paragraph implies a warning, familiar to
statisticians, but often overlooked among social scientists. Every correlation index (including
this extended index of intra-class correlation called the concord index, Cd here) is relative to
the range of the population in which it is observed. One can modify the index therefore either
by modifying the range of the population (the variance in columns in Figure 1) or by modifying
the range of their acts. Thus, for example, an observed degree of half-perfect social influence
(Cd = .5) can be increased either
(a) by aligning the goals, means, and results more closely together in the observed
situation, or
(b) by altering the population as in omitting extreme tri-parties (with highest or lowest
mean tri-act scores).
The latter, (b), shrinks the inter-party variance (within columns) and so relatively to it
raises any constant inter-act variance (within rows). In short, Cd changes with changes in
either its numerator or denominator in the dispersion of either its parties or their acts. Since
sociologists are ordinarily interested in the variance of the acts, comparison of situations with
differing degrees of social influence should be made strictly only insofar as the populations are
standardized or comparable. This is most readily achieved by keeping a constant set of tri-parties, i.e., observing one population over time periods to note changes in social influence
which are then due solely to changing agreement or concordance among the pro-acts, means-acts, and reacts.
In submitting this manuscript in earlier draft to various critics, the following questions
emerged as needing further clarifying:
1. Just what is the sociological meaning of the extended intraclass correlation specified
by the formula for the concord index, Cd, when the five column variables in Figure 1
are very different sorts of acts and not just one variable (like "height" in a population
of N families each having 5 brothers as the 5 entries in each row)? We reply: The
index of concord, Cd, here has at least the following twelve meanings or
interpretations or paraphrasings:
(1) Cd, the index of concord, is an extension from 2 to n variables of an intra-class
correlation where the classes are the tri-acting groups (i.e., the communities in
the rows of Figure 1) and the variable is a "social goal-pursuing behavior"
observed in at least three sub-indices.
(2) Cd measures the degree of correlation among comparably scaled indices of
aims, efforts, and achievements within each of a set of parties, showing whether
these three types of acts are in "due proportion," or of "like-amount," with each
other in the situation observed.
(3) Cd measures the tendency for each of the three acts of a given party (i.e., AI,
the goals set; AII, the means taken; and AIII, the results achieved) to be "on a par
with" the other two acts, thus tending to form a "level" profile when comparably
graphed.
(4) Cd measures how level (i.e., how even to uneven) is each community's profile of
the three types of action in Figure 1. If Cd = 1, every community's profile is a
horizontal line so they form together one perfect ladder with the communities P1,
P2, P3, as rungs, i.e., the profiles are separate and parallel straight lines. If Cd =
0, each community's profile is a different ladder with rungs ranging from top to
bottom and entirely unrelated or unaligned with each other community so that
the profiles become maximally tangled and interlacing.
(5) If Cd = 1, perfect prediction of the means taken and results achieved is possible
from given knowledge of the goals set, for any one community and for all
communities; whereas if Cd is zero no such prediction better than pure chance
is possible. (For intermediate degrees of Cd between these limits of 1 and 0,
prediction would be made by the usual regression formulas from the appropriate
zero order Pearsonian correlations or else from a multiple correlation.)
(6) Similarly, the other interpretations of correlation of two variables become
extended or generalized appropriately to n variables.9
(7) Cd measures the excellence of the social control in showing how fully the goals,
the means, and the results are in a specified state of dynamic equilibrium, or
homeostasis, forming a balanced system in the set of open communities
studied.
(8) Cd measures how disproportioned to well-proportioned the pro-acts, the means-acts,
and the reacts are to each other in the set of parties observed.
(9) Cd measures just how much the social influencing in the whole situation has
been, or will be, improved or worsened by any specific amount of change in any
component index or part of the situation.
(10) Cd therefore can help to test exactly any hypothesis that purports to describe,
predict, or control the causes, contents, or consequences of the behavior called
"social influencing."
(11) Cd is a formula (built as a ratio of variances) which operationally and reliably
defines the efficiency aspect of "intended social influencing."
(12) Cd is an analytic and synthetic model which can aid in describing reproducibly,
predicting verifiably, and controlling usefully specified aspects of the behavior
called "intended social influencing."
2. Are not the complex mass behaviors hitherto called "social control" oversimplified
when reduced to a single index? We reply: Like every scientific index, Cd measures
only those aspects of any complex situation which its component terms specify,
leaving all other aspects to other indices. Thus Cd only measures the correlation or
efficient balance of goals-means-results in a specified set of parties. Other indices
can measure other parts of the total behavior such as:
(1) the absolute size of each of the goals, the means, or the results; or
(2) their average size which is here called the tri-party "level"; or
(3) their size relative to
(a) a base period,
(b) a total amount or percentage base,
(c) any other standard desired;
(d) including the "per cent fulfillment" defined by the result achieved relative to
the goal intended; etc.
We further reply: The Cd index defines and measures only the unit of
efficiency or balance within any transaction of "intended social influencing." Its
further compoundings
(1) among many parties,
(2) in many contexts,
(3) for many kinds of influence,
(4) towards many goals,
(5) by many means and
(6) with many results,
(7) in many combinations and
(8) sequences,
(9) with many interactions and
(10) feedbacks and
(11) with additive or
(12) multiplicative or
(13) other compoundings of all these represent thirteen further inquiries which are
not developed in this introductory article.
Some fuller development may be found in the senior author's Systematic
Social Science (University Bookstore, Seattle, 1947, 788 pp) or in a 62-page
seminar mimeograph and its 34-page revision entitled "The Concordance Models for
Social Control, U:57-17 and U:58-17" or in articles subsequent to this one (if reader
reactions call for such). In short, we reply that this start upon transactional analysis
of "social control" into its three always-present factors of action is offered as a sharp
and standardizing tool for social work and for social research.
3. Is the fallacy of interpreting correlation as causation implicitly involved in this index of
concord? We reply: It is involved if the loose speech of users commits that fallacy;
but the definition of Cd is explicit and, if followed, leaves no "fallacy." Cd measures
correlation. It can also measure causation if three further conditions are fulfilled in
the variables correlated, namely, that they be changes and in a time sequence with
all else uncorrelated. For we define causation as "a part correlation of sequenced
changes" as follows: A is called "a cause" of B to the extent that positive and
negative changes in A (i.e., ± 1ΔA) are correlated to positive and negative, later-occurring changes in B (i.e., ± 2ΔB) which are uncorrelated with any other antecedent
changes in the context of circumstances, C (i.e., ± 1ΔC).
Thus $100\, r^2_{1\Delta A\,(2\Delta B\ 1\Delta C)} \neq 0$ is an index of percentage causation of B by A
when isolated.
Then insofar as our Cd index of the concord in a set of observed instances of
intended social influencings among a set of parties fulfills this definition of causation,
in just so far the influencing may be called "causal." Thus if positive and negative
changes from the status quo at the start of a period in respect to indices of goals,
means, and results are observed to be correlated, and to be uncorrelated with all
else, then the pro-act and means-act may properly be called "causes" of the react at
issue. In short, Cd when fully observed as a partial correlation of sequenced
changes becomes an index of causation in addition to being an index of intra-class
correlation.
4. Does not the sequence of pro-acts stimulating means-acts which stimulate reacts
limit this concordance model for intended social influence to an outmoded simple
stimulus-response psychology that neglects interaction, feedback, situational
context, etc.? We reply: This depends on how the indices for the pro-acts, means-acts, and reacts are internally structured and interrelated to each other and to other
relevant indices. They can be structured, i.e., compounded out of simple stimulus
and response indices or out of more complex indices which take interaction,
feedback, context, and their compoundings into account. But these complexities of
the concord index, Cd, are beyond the scope of the present paper.
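The definition of causation given in reply 3 above, "a part correlation of sequenced changes," can be illustrated with a small computation. This is a sketch under assumptions, not the authors' procedure: it uses the standard part (semipartial) correlation formula, removing the context changes from the later changes in B before correlating them with the earlier changes in A. The data and the function names `pearson` and `percent_causation` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Zero-order Pearson correlation; both sequences must vary."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def percent_causation(d_a, d_b, d_c):
    """100 * r^2 of dA with the part of dB left after removing dC.

    Assumed formula: r = (r_ab - r_ac * r_bc) / sqrt(1 - r_bc^2).
    """
    r_ab = pearson(d_a, d_b)
    r_ac = pearson(d_a, d_c)
    r_bc = pearson(d_b, d_c)
    unexplained = 1.0 - r_bc ** 2
    if unexplained <= 0:
        return 0.0  # dB is fully accounted for by the context changes dC
    part = (r_ab - r_ac * r_bc) / sqrt(unexplained)
    return 100.0 * part ** 2


# Invented changes: earlier changes in A, context changes in C,
# and later changes in B built as 2 * dA + dC.
d_a = [1, -1, 1, -1]
d_c = [1, 1, -1, -1]
d_b = [2 * a + c for a, c in zip(d_a, d_c)]

print(round(percent_causation(d_a, d_b, d_c), 1))
```

In this constructed case, once the context changes are removed, the later changes in B follow the earlier changes in A exactly, so the index reaches its ceiling.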
Notes
1. An excellent analysis of leadership which takes full cognizance of its relational nature
appears in: Robert Tannenbaum and Fred Massarik, Leadership: A Frame of Reference,
Institute of Industrial Relations, Reprint No. 68, University of California at Los Angeles,
1958.
The dimensions of groups involving explicit formulas for leader-follower relations and
other relations in social control were given by us along with much of the present
article's theory of social control in the chapter under that title in S. C. Dodd, Systematic
Social Science (offset edition), University Bookstore, Seattle, 1947, 788 pp.
2. P. H. Landis, Social Control, New York: J. B. Lippincott, Chapter I, p. 13.
3. David M. Schneider, "Political Organization, Supernatural Sanctions, and Punishment for
Incest on the Isle of Yap," American Anthropologist, 59 (No. 5), pp. 791-800.
4. W. S. Robinson, "The Statistical Measurement of Agreement," American Sociological
Review, XXII, No. 1, February, 1957, pp. 17-25; M. G. Kendall, Rank Correlation
Methods, New York: Hafner Publishing Co., 1955, Chapter 6.
5. In any case, the variables must be comparably expressed, i.e., in scales with like origins,
means, and variances as in using the same number of ranked parties, or in using
percentiles (ordinal units), etc. Robinson's coefficient of agreement can only be used
legitimately here among different indices of acts (such as columns 1-5 in Figure 1) if those
indices have first been converted into comparable scales, i.e., with the same mean and
dispersion. Unless comparable indices are used, the discordance of social influence that is
at issue will be confounded with differences in the indices and units of the different acts.
Rankings, although less exact than cardinal scales, assure full
comparability and are usually easier to observe.
6. Perhaps the expectation here that the effectiveness of social influencing rises as concord
rises needs expounding. It may be explicitly stated as the hypothesis that: If any set of
increments, A, in a set of means-to-ends is a full and sole cause of (i.e., an effective means
to produce) the observed increments in the result B, such that A and B are in perfect
concord (Cd=1) and if an observed set of increments-in-means, A', is found to be
imperfectly correlated with B when isolated, then we infer that, to the extent of the deficit of
the correlation of A' with B, A is not here the full and sole cause of B and may be said to be
a less effective means than the A-set of means. (The difference, A-A', may represent A'
being not the best kind or degrees, or combinations, or systeming of means to the result B
and/or it may represent A' being overlaid or masked by unmeasured variables which dilute
or replace or partly cancel the A set.) Thus we hypothesize that the discord index between
the observed means and the actual results measures the "ineffectiveness" of these means
to these ends in the situation studied.
7. Note that here A is both pro-actor (in setting B's apologizing as A's goal) and means-actor
(in making a verbal demand as means for attaining that goal), while B is the reactor.
Whenever two of the three parties in a tri-partite tri-act thus merge and become the same
persons, the tri-act may be loosely called a "bi-act."
8. This logarithmic curve forecasting a social process is one of eight "reactants models" (to be
published) which we have developed and tested in our Washington Public Opinion
Laboratory both as a rational model with exact specification of the conditions under which it
will recur and as an empirical model giving close fits to data in groups approximating those
conditions.
9. For 24 different interpretations and forms of correlation each presented in prose, in
algebraic formulas, and in geometric diagrams or graphs, see Chapter 6 in Stuart C. Dodd,
Dimensions of Society (Macmillan, 1942), 944 pp.
#32. The Concord Model for Social Control
#33. Racial Attitude Survey as a Basis for Community Planning:
The Broadview (Seattle) Study
by
Stuart C. Dodd
And
Robert W. O'Brien
Dr. Stuart C. Dodd and Dr. Robert W. O'Brien are Professors in the Department of Sociology,
University of Washington at Seattle.
Journal of Educational Sociology, Vol. 23, No. 2 (Oct., 1949), pp. 118-127
I. Relationship of Research to Planning
Attitude surveys may be conducted to develop instruments of measurement to get at
"the facts" in intergroup relations. Community planning is often a device used to effect "socially
desirable" change. The Broadview Study is an attempt to utilize precise instruments of
measurement to direct community planning.
II. The Incident Pointing up Broadview
In September, 1948, a racial issue flared up in the Broadview community when a mixed
white-Negro family established residence. There had been charges that anti-Semitic and anti-Catholic feelings also existed in the area. Tensions became focused, however, when petitions
were circulated requesting that the family be forced to move. On the other hand some
residents held that there should be no racial or religious restrictions in the district. Broadview is
a middle-class neighborhood, which, like other sections of the Metropolitan District, has grown
rapidly during and since World War II. Local civic leaders say that about one-half of the
families living there have moved in within the past five years, and they estimate the average
income of families is now between $4,000 and $4,500 per year. This had been an all-white
section until 1946 when a Filipino and a Chinese family moved in. But their children took such
a psychological beating that the parents felt it necessary to move.
On July 20, 1948, the present family purchased a home in the district and moved in.
The husband, a Caucasian, was a postal clerk with twelve years of postal service experience:
eleven years in Los Angeles, and one in Seattle. During the first month, his wife, a Negro, was
aware that some of the neighbors stared at her; but none was either overtly hostile or friendly.
By the end of the fourth week, August 15, several threatening, anonymous telephone calls had
been received urging that the family move; they were not wanted. A week later a friendly
neighbor visited to say that she refused to sign a petition to force the family to move, and that
she thought there were many who felt as she did. By early September several attempts had
been made by residents and real estate men to buy out the family; one party offered a $3,000
profit which was refused. The son entered school for the first time and met considerable
hostility from children who called him derisive names. But with the help of teachers,
acceptance began before the first week had passed.
III. Move to Promote Good Feeling
In view of the threatening telephone calls, the family contacted the Urban League for
assistance in working out better relationships. A League staff member visited various
neighborhood leaders in an effort to establish cordial relationships for the family. Finally these
leaders formed a committee consisting of educators, a business man, a house-wife, and
various clergymen to ascertain what constructive steps might be taken to work out the
adjustment of the new family to the community and the community to the family. Pastors of
several churches visited them and also spoke to members of their congregations and parishes
regarding community responsibility for demonstrating democratic attitudes. Some of the
neighbors began to drop in to visit the family socially, to play pinochle, and to invite them to
their homes. The mother was encouraged to attend the meeting of the Broadview P.T.A. and
from all accounts it appears she was cordially received.
IV. Survey of Interracial Attitudes Is Requested
As the committee discussed the problem further, it was decided that more facts were
needed on the opinions and attitudes of householders in order to ascertain the extent and
degree of tensions. The committee considered this a pre-requisite to intelligent community
planning. Accordingly, the Urban League was asked to request the University of Washington to
make a survey. Professors Dodd and O'Brien met with the committee on October 27 to discuss
the desirability and scope of the survey. Between the 28th and the 30th, the preliminary
questions were formulated with the help of students and the University Public Opinion
Laboratory. On November 1, Professors Dodd and O'Brien and staff met in the Broadview
School with the committee to evaluate the questions and to agree on a possible questionnaire.
One significant factor of the study was a dimensional plan of a poll, which was drawn up by Dr.
Dodd. It was a job analysis of the fifty processes. Included were the dates of each process, the
number of man hours, the location of each, and the persons responsible, with their motives
(pay, academic credit, civic service) and the documents needed or resulting. This analysis
revealed in advance exactly what was to be done, where, by, with, and for whom it was to be
done, why it was to be done, and with what materials. The use of such a plan compels
completeness in advance, in an analysis of a social organization. As originally planned, all 400
houses in the area were selected to be surveyed rather than a sample of these. They were
divided into units and were assigned to a staff of forty interviewers. Three hundred and four
residents were interviewed, thirty-five others refused to give information, and no contact was
made at sixty-seven places. In the latter case, either no one was at home, or the houses were
unoccupied, or the residents were too busy to be interviewed during the two days of the
survey.
V. Background Factors Are Explored
Approximately eighty per cent of the residents interviewed were high school graduates,
and more than thirty-two per cent had attended college or university. The area is a white
gentile neighborhood with only the one non-white couple. No Jews were indicated by the
replies on religious preference. Twenty-one per cent of the respondents preferred the Roman
Catholic faith and sixty-five per cent were Protestants. The remainder indicated either "none"
or some "other" preference. Church attendance seemed to be higher than for Seattle
generally. One person in four attended every Sunday and one in two attended regularly or
occasionally. A substantial number of the residents had recently moved into the area. Nineteen
per cent had lived there less than a year and sixty per cent had moved there less than five
years before. Only one family in five lived in the neighborhood at the beginning of World War II.
This factor of mobility is borne out by the geographic backgrounds of respondents in terms of
the places where they attended grade school. Only twenty per cent attended grade schools in
the Seattle area and forty-two per cent in the State of Washington. Twenty-nine per cent
received their grade school education in the Mid-West; eleven per cent in other western and
Pacific Coast states; seven per cent in Europe; three per cent each in Canada, the Mid-Atlantic states, and the South; and one per cent in Alaska. It is more likely that the amount of
mobility is underestimated rather than over-estimated, since three interviews in five of the
study were with women, instead of with the more traditionally mobile males.
VI. The Controversial Question of Property Values Is Appraised
The contention of realtors in the area that property values had decreased because
Negroes were now living in the neighborhood was not supported by the study. To the question:
"Do you feel that property values in your block have increased or decreased recently?" only
about ten per cent of the residents said "decreased." Fifty per cent said "increased." Twenty-seven per cent did not know. Nine per cent said there had been no change and four per cent
gave no answer.
Many respondents were not even aware that a Negro family was living in the area. To
the question: "Are there any Negro families living in your neighborhood?" (within one-half
dozen blocks or so) 193 or about sixty-three per cent of the respondents answered "No" or
"don't know"; 109 or thirty-six per cent said "Yes"; and two or less than one per cent gave no
answer. Only two per cent were acquainted with the family.
All respondents were asked whether they "approved" or "disapproved" of a Negro family
living in the area. More than fourteen per cent said "approve" and another sixteen per cent said
"don't care." Nearly sixty-three per cent said "disapprove."
Mathematical analyses will be made of the correlations of the responses to the various
questions. Only one has been completed to date. It relates the "approval" or "disapproval" of
respondents that the Negro family or a Negro family should live in the district, to whether or not
respondents knew that a Negro family was already living there.
We can assume that the attitudes of those who did not know of the family at the time of
the survey were representative of the attitudes of all respondents prior to the moving in of the
family. Any difference in the attitude of those who knew of the family from that of those who did
not know of the family, is then due to changes caused by knowledge of the family's presence in
the district.
Quite contrary to common expectation, a larger per cent of those respondents who
knew that the family was living in the district were favorable to a Negro family being there than
of those who did not know. The presence of the Negro family in the district has resulted in nine
more persons becoming favorable, and nine less persons being unfavorable than would have
been expected on the basis of the percentages among those who did not know of the family.
Not all who expressed disapproval of Negroes as neighbors felt that property values
would decrease because of them. It has been stated that sixty-three per cent of the
respondents expressed disapproval. Those who disapproved and who also thought property
values would depreciate constituted only about twenty-nine per cent of all the respondents.
This, added to the seven per cent who approved of the Negro family yet believed that
property values would depreciate, makes a total of thirty-six per cent, which represents all of
the respondents who believed that property values decrease because of the residence of a
Negro in the area.
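The arithmetic of this paragraph can be laid out explicitly. The sketch below uses only the rounded percentages reported in the text; the derived share of disapprovers is therefore approximate, and the variable names are illustrative.

```python
# Percentages of all respondents, as reported in the text.
disapprove = 63
disapprove_and_expect_drop = 29
approve_and_expect_drop = 7

# All respondents who believed property values would depreciate.
expect_drop = disapprove_and_expect_drop + approve_and_expect_drop

# Share of the disapproving group that also expected depreciation.
share_of_disapprovers = disapprove_and_expect_drop / disapprove

print(f"expect depreciation: {expect_drop} per cent of respondents")
print(f"disapprovers who expect depreciation: {share_of_disapprovers:.0%}")
```

On these rounded figures, fewer than half of those who disapproved tied their disapproval to a belief about property values.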
A comparison of opinions on property devaluation before and after the respondents had
gained insight that the survey was concerned with the residence of a Negro family in the
neighborhood, reveals an interesting phenomenon. Before any questions on race relations
were asked, only ten per cent of the respondents thought that their property had decreased in
value recently. But after such questions had been posed, thirty-six per cent were of the opinion
that the presence of a Negro family had or would cause property devaluation. Since the family
had lived in the neighborhood for more than three months at the time of the survey, it is clear
that not the presence of the family, but the fears of respondents were responsible for this
opinion.
Among the ten per cent of respondents who originally said that their property values had
declined recently were those who gave as a reason the presence of a Negro family. However,
these amounted to only two per cent of all respondents. The fifty per cent of the respondents,
who originally said that their property had increased recently, gave more acceptable reasons in
terms of: increase in value of all property, new building activity in the area, and the prospect of
more street improvements.
VII. Petitions Pro and Con Are Investigated
Of significance also is an expression of democratic intent as revealed by those who
would sign a petition "to protect the right" of a Negro family to live in the neighborhood. This
response was given by nineteen per cent of the respondents, including some of both those
who approved of a resident Negro family and those who, although disapproving, would not
sign a petition, "to get this Negro family out of the neighborhood." A similar expression of
democratic intent is perhaps implied in the responses to the questions: "Would you sign a
petition to get this family out of your neighborhood?" asked of those who knew of the family;
and, "Would you sign a petition to keep Negroes from living in this neighborhood?" asked of
those who did not know of the family. About thirty per cent of the respondents said "No." Forty-nine per cent said "Yes," with the bulk of this response (thirty-six per cent) coming from those
who were unaware of the family's presence in the area. Thus again there is a substantial
difference between those who disapprove and those who would activate their disapproval by
protest in signing a petition. Respondents who said they would sign such a petition were asked
if they had signed one. Those who said they had signed the petition that had been circulated to
get the family out amounted to only eight per cent of all respondents and not the ninety per
cent alleged by the realtors.
VIII. Tolerant Attitudes Are Shown
Despite the fact that sixty-three per cent of the respondents did not favor Negroes living
in the neighborhood, more than eighty-eight per cent said that they would not go out of their
way to make people of different racial groups feel unwanted. On the other hand, nearly sixty-eight per cent would not go out of their way to make them feel wanted. This may have an
important bearing on tolerance. Not only does it indicate the relative lack of attempts at
intimidation, but there is an indication that more people would favor making people of other
racial groups wanted than would favor making them unwanted. Only five per cent of the
respondents, approximately, said "Yes" they would go out of their way "to make them feel
unwanted." But seventeen per cent said they would go out of their way "to make them feel
wanted."
A majority of the respondents in the Broadview study showed only limited or no
experience with minority group persons in the normally accepted categories of community
living. About seventy-nine per cent had not attended school where there were a "considerable"
number of Negroes, Japanese, or Chinese. The highest number, forty-five per cent, occurred
where persons had worked on jobs with them — primarily Negroes. A similar number,
approximately twenty-nine per cent, had had persons of all three groups as "close friends" and
as "neighbors." Again these were more largely Negroes than others, but not significantly so.
The apparent lack of experience in school may be taken as an indication of the absence of
association with minority group persons during childhood. Hence the limited experience with
minorities which was indicated in the replies to the other questions can be assumed to have
occurred largely during adulthood.
Six social distance questions were included in the study primarily as a pre-testing for a
statewide survey of intergroup relations which the University of Washington will conduct.
These questions, however, are pertinent to this study and constitute an attempt to measure the
attitudes of respondents (all Caucasian gentile) toward Catholics, Chinese, Japanese, Jews,
Negroes and Protestants. Each respondent was handed a card on which were printed the
group names listed above. Each was then asked: "Are there any on the list
1) you would not want to have as close friends?"
2) you would not be willing to marry?"
3) whose teen-agers you would not want to see attend parties with teenage boys and
girls of your own group?"
4) whose teenagers you would not want to have in the same schools as teenagers of
your own group?"
5) that you would avoid sitting by?"
6) you would not want to work beside as equals on the job?"
Respondents consistently expressed greater social distance for Negroes than for any
other groups. The order of increasing acceptance was Negro, Japanese, Chinese, Jew,
Catholic, and Protestant. Social distance for these groups decreased from marriage, to close
friends, to teenage parties, to teenagers and schools, to work, to sitting beside; and in reverse
order the percentage of respondents who showed no distance to groups increased from three
per cent at the question of marriage, to seventy-six per cent and seventy-seven per cent
respectively for the questions on teenagers and schools, and sitting by members of the groups.
The rising curve was broken at the question about working beside members of groups as
equals. Here it dipped to fifty-six per cent of the respondents who showed no social distance.
IX. Summary and Conclusion
There are good opportunities in Broadview for developing better attitudes toward
colored minorities. The following points are an indication of this:
1. It should be noted that a much smaller percentage of petition signers (eight per cent)
was found by the study than the ninety per cent alleged by local realtors; and that
nineteen per cent of the respondents would be willing to sign petitions to protect the
right of a Negro family to live in the area.
2. Thirty-six per cent of the respondents are willing to have Negroes in Broadview.
3. Seventeen per cent would go out of their way to make a family of a different racial
group feel wanted.
4. Property devaluation was the main reason given by realtors for not wanting Negroes
in the neighborhood. But fifty per cent of the respondents after a mixed racial family
had lived there for three months felt that property values had increased recently.
Only ten per cent said "decreased" among which only two per cent gave a racial
reason, and only thirty-six per cent thought that a resident Negro family would cause a
decrease in values.
5. There are strong suggestions of tolerance which can grow into understanding. The
feeling that people should not go out of their way to make a resident of another race
feel unwanted, is one.
The study findings imply the presence in the neighborhood of a potentially articulate
minority who feel quite strongly about keeping Broadview an all-white and perhaps an all-gentile area. It seems likely that this group is represented among the five per cent who would
go out of their way to make a family of a different race feel unwanted, and among the eight per
cent who had signed a petition to get the family out. This intolerant minority, however, is
balanced by a minority who feel strongly that democracy can be practiced in daily life. The
latter group is represented among the thirty-six per cent who are willing to have Negroes in
Broadview; among the seventeen per cent who would go out of their way to make a family of a
different racial group feel wanted; and by the Broadview Committee which being organized can
be a strong force for influencing the people who feel less strongly about democratic principles
to implement the democracy they would all say they believed in.
U:58-2
#34. Can We Be Scientific about Humanism?
Stuart C. Dodd has taught in London and Beirut, at Harvard, and at the University of New
Mexico. During the last war, he served with the Army's Psychological Warfare Branch, and
since 1947 has been Director of the Washington Public Opinion Laboratory. He is a member of
the Board of Directors of the American Humanist Association.
The Humanist, No. 5, 1958, pp 259-265
Can we be scientific about Humanism? I propose to answer "Yes," and to sketch
forthwith the know-how involved. The know-how stems from the five steps in reasoning handed
down to us by John Dewey, one of the patron saints of Humanism. Dewey analyzed problem-solving into a felt difficulty which is defined by further observations until suggestions occur and
their implications or consequences are foreseen; whereupon one suggestion is tried out.
Let us apply these steps to the Humanists' problem of reformulating the Humanist
Manifesto of 1933, or the principles and beliefs which state the humanist philosophy and way
of life, in some terms appropriate to the present generation.
In John Dewey's time there was still a good deal of hangover from the time when words
were interpreted with theological content. Even scientists were less willing to use words of an
impersonal nature. At that time, values were still "spiritual." In 1933 the use of the term "wants" for
the word "values" would have been considered evidence of a crass materialistic philosophy,
because up until that time the spiritual was held in contrast to the material. Nowadays we are
willing to admit that people's needs may be materialistic, and we shy away from the word
"spiritual" as being mystical—as having no referent to point at.
We describe values today as being personal or social, abstract or concrete, and they
can be arranged in a spectrum from the immediate to the ultimate. "Higher" values today would
mean values that referred to all humanity, or to a longer time, or were more comprehensive.
Now that we believe in the unity of all substance, and have discarded Descartes' dualism, what
used to be called spiritual values are simply the most permanent and comprehensive ones.
Another difference between 1933 and the present is that the scientific method has
become well established in the social sciences. The techniques of polling, on which this
article depends throughout, were undeveloped twenty-five years ago, so that the scientific
formulation of what men desire would have been impossible at that time. A scientific value
system can emerge only when men's choices can be made known and classified.
Let us point to some of the recently developed techniques for observing that were only
embryonic in 1933. How many readers even know the meanings of the following terms?
Cybernetics;
information theory;
axiology;
general semantics;
deontic and symbolic logics; the algebra of sets, matrices, and probabilities;
transactional psychology; dimensional sociology?
These are all tools which today have practical importance.
For our problem of reformulating humanist value systems, the most relevant technique
of observing facts is called a "demoscope," which is, defined simply, an instrument for
observing people by sampling. Demoscopes or pollings can deal with whatever people can talk
about — their needs or behavior, their material conditions, aspirations, or anything else. In its
ultimate significance many social scientists believe this technique will be as important as the
microscope for biology, or the telescope for astronomy.
Hundreds of hours of pretesting go into finding questions which will evoke useful
answers on subjects that have been personally or socially taboo, and will bring out conscious
expression of hitherto unconscious phenomena. No one claims we can do it completely, but
we can make the unconscious more and more conscious, as we devise ways of asking
questions that get around subconscious blocking. In my forthcoming book, which develops
techniques for world polls, I have listed a dozen techniques for overcoming psychological
resistance.
In the war, for example, you could not ask a person if he was doing anything to help the
enemy, for a direct question would certainly get a lying answer. I was transported into Lebanon
in 1942 to make an official test of polling techniques which could be used in administering
occupied territory after the war. I worked in Lebanon, Syria, and Palestine, asking questions
about radio listening habits and reactions to enemy and allied propaganda. Nearly everyone
listened to the enemy radio but no one would admit it. But by asking each man whether in his
opinion the people across the street, or in the next village, or someone of the other religious
sect, listened, we were able to absolve the respondent of blame for his answers, and to get an
indirect measure of the amount of listening. I was, in fact, able to solve by objective,
mathematically demonstrable techniques, the problem of measuring the amount of lying you
would get in a hostile population.
I was able to apply these techniques soon afterwards in Italy, when the question
became acute as to whether Italy would be an active ally after her conquest or whether she
would be a drag. As a matter of fact I found that the Italians were so leaderless after the
dissolution of the Fascist party that they had very little confidence in anyone — and the man
who got the most straw votes for a possible president (though he received less than 5% of the
total) was the British BBC announcer.
Before the war I had been concerned with the values of public health, having been sent
into Syria on Rockefeller Foundation funds to study beliefs, practices, and knowledge
conducive to health among Syrian villagers. Travelling clinics visited certain villages, and by
pairing these with other villages to which the clinic did not go, the improvements effected by
the clinics could be measured.
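The logic of the paired-village comparison can be sketched as follows. This is only an illustration: the index scores are invented, the function name `clinic_effect` is hypothetical, and the actual study used several hundred indices rather than one.

```python
from statistics import mean

# Hypothetical health-index scores (higher is healthier); NOT data from the Syrian study.
# Each tuple: (clinic village before, clinic village after,
#              paired control village before, paired control village after)
village_pairs = [
    (40, 55, 42, 46),
    (35, 50, 33, 36),
    (50, 58, 49, 51),
]

def clinic_effect(pairs):
    """Mean over pairs of (change in clinic village) minus (change in control village)."""
    gains = [
        (clinic_after - clinic_before) - (control_after - control_before)
        for clinic_before, clinic_after, control_before, control_after in pairs
    ]
    return mean(gains)

effect = clinic_effect(village_pairs)
print(f"Estimated improvement attributable to the clinics: {effect:.1f} index points")
```

Subtracting the control village's change removes whatever improvement would have occurred anyway, so that only the difference between the paired villages is credited to the clinic.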
It is easy to see how these methods work on practical values such as public health, but
it may be more difficult to believe that it can work on "absolute" theological values.
Nevertheless, it has been tried with a set of Protestant ministers.
These men listed their "infinite" value-concepts (involving "God," "life," "character,"
"creation") and then chose the more important in each pair when confronted with every
possible pair in their own set of value-concepts. From these pair-comparisons using
Thurstone's Law of Comparative Judgment (a sharp tool, incidentally, for our humanist
research) William R. Catton, Jr., found that the so-called "absolute" values of religion
proved to be relative, as is the case whenever people will choose between values (as Adam
and Eve learned some time ago!).
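How pair-comparisons yield a scale of relative values can be sketched under stated assumptions. The sketch uses the simplest form of Thurstone's law (commonly called Case V, which assumes equal and uncorrelated dispersions); the article does not say which form was used in the ministers' study. The preference counts are invented, and the function name `thurstone_case_v` is hypothetical.

```python
from statistics import NormalDist

def thurstone_case_v(wins):
    """Scale values from a matrix where wins[i][j] counts judges preferring item i to item j.

    Each proportion is converted to a unit-normal deviate; an item's scale value
    is the mean of its deviates against all items (its own deviate taken as 0).
    """
    n = len(wins)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            p = wins[i][j] / (wins[i][j] + wins[j][i])
            p = min(max(p, 0.01), 0.99)  # keep unanimous choices finite
            z_sum += inv_cdf(p)
        scale.append(z_sum / n)
    return scale

# Hypothetical counts from ten judges; NOT data from the study of ministers.
concepts = ["God", "life", "character"]
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]

scale = thurstone_case_v(wins)
for name, value in sorted(zip(concepts, scale), key=lambda pair: -pair[1]):
    print(f"{name:10s} {value:+.2f}")
```

The resulting values lie on an interval scale with an arbitrary zero, so they state only how far apart the concepts stand relative to one another, which is the sense in which "absolute" values turn out to be relative.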
For another example of a demoscope, we point to the incipient Barometer of World
Opinion. This international polling agency is developing in various forms. Commercial polls
now operate in forty or more countries. The World Association for Public Opinion Research
(WAPOR) has become a Non-Governmental Consultant Organization to UNESCO and speaks
for the polling profession in any United Nations research involving polling. Our laboratory has
just completed a contract from UNESCO to prepare the techniques for the Barometer of World
Opinion. The reporting volume, Techniques for World Polls, now ready for press, points out
ways of measuring humanity's value-systems. The new tools for use on our Humanist
problems are thus being forged.
Knowing What We Want
Let us look at some of the consequences of having public knowledge of a universal
value system, so that we can see if it would be worth the vast research.
For most men their religion is only a small part of all they actually feel, know, and do in
living. A polled value system could make religion and living one and the same. For it would
keep "religious life" defined as "all of a man's life," including whatever preponderance of
"upward" striving and aspiring each person might currently express.
All religions and ideologies hitherto have been limited to some fraction of humanity,
however much they may claim all men as potential believers, or in-group members. Perhaps
the United Nations' Universal Declaration of Human Rights states an explicit value-system that
includes more people (by the test of government endorsement at least) than any other today.
But the distribution of responses to polls of values could by definition include every human
being insofar as he becomes represented by some sample of people polled. All theist
responses could be included and even classified as part of man's semantic deviations in a
broadened concept of humanism as "humanity's current aspirations." Thus just as the best
democracy includes all its residents however subclassified and dealt with, so the broadest
humanism eventually may include "all men on earth" and then develop better operating
mechanisms for harmonizing subclasses of men who in varying ways and degrees pursue
values destructive to other men.
Most religion hitherto has been limited to some fraction of culture, whatever the
proportion of a community's total way of life that is classified under the religious institution. But
by sufficiently comprehensive polling of all of men's wants from the most concrete to the most
abstract, this fraction could become the whole. This humanist "religion" would be coterminous
with living. Humanist aspirations would be fully measured by asking in detail about what each
respondent wants most in family and educational life, in economic and political affairs, in
religious and philanthropic concerns, in health and recreational lines, in scientific and artistic
interests, in mass media and military matters. "Aspiring" would then mean "wanting" but with a
connotation of "wanting the best" and with further implied criteria or larger latent values
defining that "best" to each respondent.
The value systems of the world, hitherto numerous and diverse, seem to be condensing
around the two opposed poles of the East and the West. Towards reducing this trend to
opposition (not the trend to just two systems), UNESCO is developing a ten-year major project
called "Mutual Appreciation of Eastern and Western Cultural Values." Our proposed Project
Worth could make yeoman contribution to UNESCO's project insofar as our proposed polling
of the actual value systems of AHA members were extended to larger populations in the
Eastern and Western and uncommitted camps. An essential part of "mutual appreciation of
cultural values" is the exact knowledge polls can build up of what those values are as
perceived by the people themselves about themselves and about each other.
Periodic polling of value systems would have the interesting consequences of helping to
keep them adapting to whatever changes are going on in society. As technology progresses,
the number of possible new combinations of what is known forever expands and complicates
living. New syntheses, new selections, out of all this are forever needed and never finished as
in the Garden of Eden's Tree of Knowledge which required choosing forevermore between
what men think is good for them or evil for them. Traditional religions and ideologies, insofar as
they are based on past revelations and less mystic experience, all tend to become obsolescent
unless reformed periodically. Our proposal to repoll humanists' value systems periodically and
reformulate them constantly and currently should solve man's problem of keeping his religion
adapted to, or in harmony with, his social evolution. Periodic reformulation is a systematic
device for man, the time binder, to adapt to all on-going in time. As social change accelerates,
religious obsolescence will accelerate and will require progressive redefining of religion and a
mechanism to keep pace.
Another major consequence, we foresee, from the process of specification-on-trial, is
that it should help man shift from an absolutistic idealism to a relativistic reality. Man becomes
the measure of all things. "The Good" becomes what is measurably most wanted by most men
in most periods, places, and contexts. As research goes forward in axiology, the science of
values, we shall learn in ever-increasing circumstantial detail just how people choose and
aspire and pursue their goals.
Not everyone, obviously, could be expected to be aware of all these values and their
consequences; but Humanists could easily turn out to be the most aware, especially if the
American Humanist Association is able to carry out its plan (called "Project Worth") 1 of using
itself as a guinea pig and thus giving a demonstration to the world. We believe that a thorough
study of Humanist values — as expressed by members of the A.H.A. — would be a valuable
part of the pretesting for world-wide polling. And when world-wide polling is achieved, the
values freely chosen by man will be, by very definition, a humanist set of values. For an
individual Humanist this point would be a way of helping him to express his philosophy of life
and, by expressing, helping him to develop it. Many incompatible values that some people
harbor will be brought to light. Some incompatible values leading to split personalities or other
mental illness could be increasingly diagnosed at an early stage and perhaps corrected in
time. And for each person it should reduce the gap between the values which he expresses
and those which he practices.
Testing the correspondence of what people profess to what they practice can gradually
develop more realistic value systems for individuals and society. Impossible gaps between
perfectionist professing and one's current practice could be seen as leading, sometimes to
discouragement and hypocrisy, sometimes to split compartments in oneself, sometimes to
other unhealthy forms of rationalizing. The healthier adjustment of setting oneself attainable
goals which forever advance with successful achievement could become increasingly
demonstrated. This contrasts with attempting to reach in a single leap "absolute purity,
absolute honesty, absolute unselfishness and absolute love" — to cite the perfectionist system
of Moral Rearmament. Repeated comparative measuring of professed with practiced values
can help close the perennial gap — however large or small it may be — between speech and
action, faith and good works, ideals and habits.
For Humanists as a group, this project should develop a philosophy of life for the whole
community, which would then become more consciously aware of its own purposes. Men
become mature by taking increased responsibility for their own destiny and that of future
generations. The measured correlations of lesser to larger values, of immediate to more
ultimate values, both when stated as ideals and lived as daily behavior, could help mankind
guide its social evolution ever more purposefully and effectively.
As demoscopes become applied to measuring the aspirations of humanity, I see the
process as a major contribution to social evolution. The development of a world value system
will increasingly make Lincoln's definition of democracy (government of the people, by the
people, and for the people) come true for the whole world. The process of government will
for the first time become sensitive to the wishes of each individual, responsive to all.
Thus, in conclusion, to the question, "Why be scientific about Humanism?" my answer
is, "Because it can help answer the ancient question, 'Whither Mankind?' by pointing out:
'Whither Man decides.'"
Notes
1. For our study of the American Humanist Association we propose three rounds of polls.
These three proposals are described in a research program called "Project Worth," which
my laboratory worked out and which the directors of the AHA have adopted. Briefly, a "Free-Responding" poll invites every AHA member to write out or point to what he thinks is the
best statement of the humanist value-system. The edited results will be recirculated to the
AHA members in a "Focusing" poll of closed-end questions. Its edited results will then be
presented to all AHA members in a third or "Formulating" poll. This will have scaled
questions for each member to specify his value-system by checking the kinds and degrees
of principles he believes in together with each principle's relative "worth" to him (in some
standardizing "units of worth"). The distributions of responses to the formulating poll will then
formulate the current value-systems, or "hierarchy" of values, of the American Humanists.
A fourth round of Validating Polls will be hard to execute, requiring ample time, large
funds, and much research ingenuity. But if little validating can be done at first, we would
argue that measuring Humanists' value-systems in the first three rounds of polling is in itself
a worthwhile project. These polls get Humanists to articulate their own values and clarify
their beliefs up to date. This alone could justify "Project Worth" even if we knew no better
than now just how well those asserted beliefs correlate with later activity.
#35. Use Scientific Methods in Planning
“A message to the Arab People and to all People”
A lecture delivered at the O.A.S. 15th Annual Convention on 'Planning for Development in the
Arab World'. University of Colorado, Boulder, August 30, 1966. This paper was awarded the
Merit Certificate of the OAS.
Dr. Stuart C. Dodd is Professor at the Institute for Sociological Research, University of
Washington, Seattle, and formerly Professor of Sociology and Director Social Science
Research Section, American University at Beirut (1927-1947).
I. Introducing the Message
When Jeha once was unprepared for his Friday sermon in the mosque, he stalled by
asking his hearers: "Do you folks know what I'm going to talk to you about?" They naturally
said "No." Then said Jeha: "What's the use of my talking to people so ignorant that they
haven't a notion of what I am talking about?" So he left the pulpit.
The next Friday Jeha tried this ruse again. But the people were curious as to what he
had to say and didn't want him to leave it unsaid. So they answered "Yes" to his question.
"Then," said Jeha, "since you already know what I will talk about there’s no use in my talking
about it." And he stepped down.
The third Friday the people agreed amongst themselves to catch him either way. So
when Jeha asked again: "Do you folks know what I'm going to tell you?" half the people said
"Yes" and half said "No." "Then," said Jeha, "let those who know tell those who don't
know and there's no need for me to talk further." And he escaped for the third time.
Now, unlike Jeha, I have a message to the Arab people. I want this message to spread
through the Arab lands. I want this message to help build the Arab State of the future. I believe
this message could help the Arabs again to lead the world in the pursuit of science, and the
pursuit of science, especially the behavioral sciences, can lead the world toward solving the
problems of men in society.
This is the theme of my message — that science can save society better than
Communism or capitalism if unaided by science.
Now how better can this message be spread than by the Organization of Arab Students
through its journal and annual conference on planning this year? How better can the Arab
people come to know and believe and act upon this theme than by calling on the readers here,
as Jeha said, "Let those who know tell (and excite!) those who don't know!"
II. Why Use Scientific Methods?
Any proposal to change one's traditional and cherished folkways whether to change to
more scientific ways or to some political ways or to some religious ways or to other ways
meets the usual query: Why change? Will the probable gains be worth the efforts and
sacrifices required?
A. The chief gain to be expected from using scientific methods increasingly in planning is the
guarantee that they work. They are by definition any methods that are effective means to
ends. They include built-in tests to prove their effectiveness under the total local
conditions—including psychological as well as material conditions, social as well as
geographic conditions. Thus one can test the hypothesis: "If the Egyptian Government digs
a latrine for every villager's house, bilharzia will be reduced X per cent." By actual
experiment this hypothesis has been shown to be ineffective if the peasants prefer their
age-old habits of using the open fields. Changing popular attitudes and behaviors may be
more important than changing physical conditions in any areas of planning. Scientific
experiments on whole social systems with each factor or subsystem varied under
conditions that are controlled as rigorously as resources permit, can test just what factors
and weightings are most effective in the total local situation.
For a documented example of a pioneering experiment on improving the health of
whole communities in the Arab world, one may study my A Controlled Experiment on Rural
Hygiene in Syria, American University of Beirut, Social Science Series, 1939, pp. 336.
This experiment, by the Near East Foundation and the American University of Beirut,
set up health-mobile clinics in a circuit of Alaouite villages. Then this experiment, by means
of surveys "before and after" comparing experimental with control villages on several
hundred indices of health and hygiene, tested just how effectively it had worked. It
developed and demonstrated effective and low-cost techniques for improving rural health.
These techniques were then multiplied in government plans in several states in the Middle
East.
B. A second gain from using increasingly the exact methods of science in national planning is
that one can measure just how much better Plan A or Feature X works than its alternatives.
For each alternative subsystem, or part of a Plan, indices measuring target goals and per
cents of their later fulfillment can tell not only which alternative works but also which works
best under the current conditions. Thus to double a nation's agricultural production, only a
series of national plans which are designed as controlled experiments testing a series of
alternative hypotheses, can show what is the best "mix" or combination of steps a
government can take.
Thus to double the agricultural product, just what should be the ratio of increase of
budget and manpower for each factor in that doubled product? — such as building up:
1) irrigation,
2) better breeding stock,
3) seed and
4) fertilizers, better practices in
5) planting,
6) cultivating,
7) harvesting,
8) storing,
9) marketing, and
10) financing,
11) better education and motivation of fellaheen and
12) of agronomic advisers,
13) fuller research on pest control,
14) incentives to better use of
15) land,
16) water and
17) human resources, etc., etc.
C. A third broad set of gains the Arab world may expect in the long run from increasingly
scientific planning is enlarging achievement of over-all goals such as what was advocated
last night, namely: the ideals of Arab freedom and prosperity, equality of opportunity,
economic and social justice, and Arab unity.
For scientific methodology calls for operational definitions of one's concepts such as
the five underlined ideals in the previous sentence. An operational definition (in three
tenses) means developing indices that tell more exactly how to measure, or make, or
manage, the thing defined. Thus the five Arab goals above can be redefined in each of the
three tenses by the first five statistical moments of frequency distributions of any national
desiderata or set of whatever goals the nation when properly polled is shown to desire for
itself. (These statistical and semantic techniques are part of the modern scientific
methodology of the behavioral and social sciences which Arab students, preparing for Arab
leadership, should master for increasingly scientific planning.)
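As an illustration of the statistical side of such operational definitions, the first five moments of a polled frequency distribution can be computed as in the sketch below. The response scale and counts are invented, the function name `first_five_moments` is hypothetical, and the choice of the mean followed by central moments of orders two to five is an assumption; the text does not specify which ideal corresponds to which moment, and no such mapping is attempted here.

```python
def first_five_moments(values, counts):
    """Return [mean, m2, m3, m4, m5] for a frequency distribution.

    m2..m5 are central moments: the average k-th power of the
    deviations of the responses from their mean.
    """
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    moments = [mean]
    for k in range(2, 6):
        moments.append(sum(c * (v - mean) ** k for v, c in zip(values, counts)) / n)
    return moments

# Hypothetical poll: 100 respondents rate how much they want some goal, from 1 to 5.
ratings = [1, 2, 3, 4, 5]
counts = [10, 20, 40, 20, 10]

moments = first_five_moments(ratings, counts)
for label, value in zip(["mean", "2nd", "3rd", "4th", "5th"], moments):
    print(f"{label:>4s} moment: {value:.2f}")
```

For this symmetric example the odd central moments vanish; in a real poll they would measure how lopsidedly the goal is desired across the population.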
III. What Are Scientific Methods?
To spread this message of converting planning from a high art to an exact science one
needs today a more exact knowledge of what the term "scientific methods" stands for.
Knowledge in two categories such as "True or false?" "Yes or No," "Black or white," "All or
none" was inadequate even in Jeha's day and much more inadequate in the 1960's. For
scientific methodology is not a single action of a researcher but is a vast set of acts varying in
form and degree, in sequences and compoundings with the field and the problem, with current
knowledge and resources.
But all scientific methods have a few basic purposes and procedures in common. Most
modern scientists have a common general purpose of so describing phenomena as to improve
their explanation, their prediction and (where possible) their control.
These broad purposes have now been spelled out in detail in our "Scient-scales."1
These provide 100 operationally specified and periodically up-dated rating scales for
measuring the excellence of methodology as reported in any published research in the social
sciences. Although not ready yet for general use, these Scient-scales, along with other
instruments of measurement, can help to define and periodically re-standardize what are
currently considered to be the best scientific methods in the behavioral and value sciences. I
commend to you further study and development and eventual use of Scient-scales as one
technical step in making national planning more of an exact science.
A simple way to sketch the procedures common to scientific methodology is to refine
Dewey's five steps in problem solving. Thus:
Step 1) A "felt difficulty" is refined as:
1a) Formulating the problem for research
e.g. To plan development for 5 years ahead in country X in fields A, B, C,
Step 2) "Defining the difficulty" is refined as:
2a) Observing and measuring all relevant facts
e.g. Study the recent statistics and plans and treatises on them in this and similar
countries
Step 3) "Suggesting solutions" is refined as:
3a) Hypothesizing that One Con of each set of alternative features will work best
e.g. Proposing tables of target indices for every part of the Plan
Step 4) "Thinking out the implications" is refined as:
4a) Designing experiments (or else further series of controlled observations) to
test each hypothesis
e.g. Comparing alternative features among sub-regions or subperiods in a
national plan
Step 5) "Deciding on one suggestion" is refined as:
5a) Adopting the best tested and best working hypothesis (until one still better is
found)
e.g. Adopt and execute the Plan and feedback detailed reports on its fulfillment
Note that this general schema of steps in scientific methodology involves the logical
processes (in ways too intricate to exhibit here) such as
a) Induction of principles (hypotheses) from particular facts;
b) Deduction of further probable facts (consequences) from such principles;
c) Production of creative syntheses, called models or operationally defined "theories"
by compounding these and further logical processes.
Again our Scient-scales specify both Dewey’s five operational steps and the three
logical processes in more analytic detail and reflect their actual execution in a given case
through a 1000-point score. This score measures the degree to which the scientist has applied
the best scientific methods appropriately to his problem in his published research.
Furthermore, in applying these Scient-scales any Arab student can not only rate the quality of
his own or another person's research but he can also learn to recognize and appraise what
currently constitutes the best scientific methods.
Now what is the purport of this sketch of scientific methods for national planning? Just
this:
You students, preparing for leadership in the Arab world, should increasingly try in the
future to refine every Five Year National Plan towards becoming a system of controlled
experiments in macro-sociology. These controlled experiments should test explicit and
alternative hypotheses, or means, according to measured criteria, or ends, fulfilled, with as
much comprehensiveness and precision as resources permit. These criteria of success of a
Plan should include maximal indices of production at least cost in most effective balance and
with optimal polled satisfactions.
This will increasingly develop planning into both an exact and an applied social science.
Experimental social science can in turn become the most effective way man knows to guide in
developing the Arab world. It can try out the best parts of capitalism or communism, or other "isms" and synthesize from these present day conflicts ever-better social systems for the future.
Thus experimental social science with Five Year Plans as Laboratories can
progressively sift out the parts of the century-old Marxian writings that are inadequate in the
rapidly changing, highly technological, internationally little governed, society of today. Such a
Laboratory can sift out the vital versus the obsolescent or inadequate features of an American
(or other) Constitution of almost two centuries ago (with its twenty-odd Amendments and many
up-dating interpretations by the Supreme Court). It can test proposed new changes in a
Constitution or a lesser law by suitable set-ups comparing results and satisfactions with and
without the proposed change. In short, experimental social science with polling feedback can
measure and prove the degree to which such scientific planning most fully satisfies most Arabs
most deeply and durably.
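One way to picture such a with-and-without comparison is an exact permutation test on polled satisfaction scores from sub-regions that did and did not receive a proposed change. This is only a sketch under stated assumptions: the scores are invented, the function names are hypothetical, and the permutation test is a standard technique swapped in for illustration, not a procedure named in the text.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def permutation_test(with_change, without_change):
    """One-sided exact permutation test on the difference of group means.

    Returns (observed difference, share of all regroupings whose
    difference is at least as large as the observed one).
    """
    observed = mean(with_change) - mean(without_change)
    pooled = list(with_change) + list(without_change)
    indices = range(len(pooled))
    at_least_as_large = 0
    total = 0
    # Enumerate every way of relabeling the sub-regions as "with change".
    for chosen in combinations(indices, len(with_change)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_large += 1
        total += 1
    return observed, at_least_as_large / total


# Hypothetical percent-satisfied figures from eight sub-regions.
with_change = [72, 75, 78, 80]
without_change = [60, 65, 66, 70]
difference, p_value = permutation_test(with_change, without_change)
print(difference, p_value)
```

A small share means that few arbitrary regroupings of the sub-regions would show a gap as large as the one observed, which is the kind of evidence a planning authority would want before adopting the change nationally.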
The claims in the preceding paragraph constitute a strategy of societal planning. The
reader should compare this planning strategy with alternative strategies (which include just
continuing present practices in one's nation). Which strategy do you think is most likely to be
effective? Which seems to you likely to prove most satisfying in your country? Which strategy
do you judge to be most readily agreed upon? Etc.
IV. How To Use Scientific Methods In Planning
If the reader is now ready to concede that scientific planning holds promise, he will need
clearer notions of just how to make planning increasingly scientific. To start filling this need, let
us apply Dewey's recipe in the five steps. Let us apply them here just to Stage I in planning,
leaving Stages II and III for the interested reader to think out.
All planning can be divided like Gaul into at least three stages, namely:
Stage I: Setting the goals or ends in schedules of target indices;
Stage II: Executing the program as means to those ends;
Stage III: Feeding back reports on fulfillment of quotas, etc.3
Since the setting of the goals and subgoals is at present more of an art and least of a
science compared with the other two stages, let us focus on it in the rest of this article.
A. Formulating the Problem called "Stating the Goals".
To spell out for you just how a planning authority can go about developing a national
plan more scientifically than hitherto, let us follow Dewey's five steps applied to the first of the
three stages, namely the setting of the goals.
The first step is to formulate the problem manageably and fruitfully. We ask: What shall
be the goals5 of this Plan and how shall they be decided upon? Such questions, and the
answers to them, start most inclusively with general goals and go on down to the most detailed sub-goals. What shall be the general goals, the subgoals and the full hierarchy of
detailed target indices? These queries are repeated in each of the fields covered by a Five
Year Plan. These fields may be the fields of economic capital and consumer gains, the field of
education at its different levels, the field of health reducing specific morbidity and mortality
rates or unhygienic conditions, the field of family planning and consequent population control,
the field of political behavior and government activities and so on for any other fields or
institutional segments of the total culture that the overall national planning is to cover.
There are three types of goals and three styles of executing them to be distinguished.
These three types correspond to the three stages in planning.
First there are the goals of impact or desirable change and outcome (such as a doubled
per capita income or a state of 90% literacy, or an outcome of 50% fewer births per year).
Second, there are the goals of effort or means taken to produce those ends (such as
percentage increase in factories set up, schools built and staffed, birth control clinics
established, agronomists trained, man-hours spent producing and consuming educational
materials in meetings, radio and TV publications, etc.). These more specific goals of effort may
be called the program of activities undertaken to implement the broader goals of impact or
effective result.
Third there are the goals of feedback or reporting on all stages and aspects and parts of
the planning so as better to govern or replan its rates and direction and per cents of fulfillment.
In addition to the three types of goals, one should note three general 'styles' of planning
according to the degree that they involve democratic action by citizens or authoritarian action
by government. These might be seen, somewhat over-simplified, as the American style of
planning, the Russian style and the Swedish style. At one extreme, the most democratic planning
may depend chiefly on financial contributions and voluntary action of individuals and private
groups as in Beautification Campaigns, Good Neighbor Community Chests, Guide-line Policies
to Freeze Prices and Wages to control inflation, etc. This American style of planning works well
in a wealthy technologically advanced society which wants a minimum of regimenting,
compulsion or curtailing of freedoms.
At the other extreme, the most authoritarian planning may depend chiefly on government
actions through officials and agents specially trained for such purposes as road and dam
building, pest control, setting up of clinics, schools, agricultural experiment stations, collective
farms, factories for heavy industry, and other public agencies. This Russian style of planning
works well in a poorer, less technically advanced country wanting large and swift progress and
willing to discipline itself to postpone expanding consumer goods. Between these extremes of
less centralized individualistic competition and more centralized
socialistic competition most national plans vary greatly as a whole and in their different parts.
The Swedish middle way or mixed style of planning may combine individual incentive,
cooperatives, private and public agencies, of municipal to national scope, in a non-doctrinaire
and highly eclectic experimental system.
Here again scientific planning can thrive on this diversity along the private-public
continuum as well as diversity along other continuums. Scientific planning can use diversity as
a set of alternative hypotheses and as an opportunity for exact experimental testing instead of
letting it harden into dogmatic conflict that often leads to military conflict.
Again, the present thirteen Arab States are in a strategic situation to side with neither
Communist nor capitalist regimes or ideology exclusively but to select increasingly by scientific
planning that composite social system which when tested produces most at least cost, with
greatest satisfying power in its total context.
The broad goals and their specific target indices in a comprehensive five or fifty year
plan may require a book filled with tables to specify in detail the subdivisions of the plan by
subperiods, subregions, subpopulations, and subcategories of desiderata, or activity, or
conditions. These answer in full the general question that formulates the problem in planning
— namely: What hierarchy of impact goals, and programs of efforts as means thereto, and
target amounts of reporting as feedback thereon, shall be adopted?
After outlining the stages and the three styles of goal-setting in a Plan we next ask
about the procedure in goal-setting: How shall these goals be decided on? Here scientific
planning "that works best" will go far beyond getting a Planning Commission to formulate, and
a Legislature to adopt, a National x-Year Plan. The most vital factor in planning is the popular
desire for it. Intensifying this desire, this popular motivation for a Plan, may require public
education and propaganda that is massive and continuous, skilled and persuasive. Such
propaganda and education to motivate a plan should be a special sub-program within every
Five or Fifty Year Plan in proportion as it involves popular participation.
The study of past, present and future plans should permeate the entire school system
from nursery school through the university and all adult education. For a major function of
education should be to produce purposeful citizens-in-a-state who unceasingly seek and test
whatever seems best among alternative courses of action that are open to them.
One of the techniques for helping to engineer such popular motivation for a plan is to
secure popular participation in formulating it. Then it becomes internalized as "our plan" and
not "their plan": i.e., the government's plan.
The people should participate in choosing the goals and targets especially wherever
their action and willingness to sacrifice alternatives is at stake. A major new tool here is a
national polling agency, such as England has, in its Social Surveys. Then monthly surveys of
the public attitudes and desires and behaviors, can collect currently needed data for
replanning. This polling can adjust a Plan to be most fully what most people want most deeply
and durably — and therefore to assure that it will work well.
The best decisions on goals and programs and feedbacks should come from
converging agreement among the three partners in a self-governing nation, viz:
1) The citizens should express their broad non-technical desires at understood costs —
e.g. a doubling of heavy industry in 10 years at a cost of keeping consumer goods at
current austere levels. Or a drive in a largely illiterate country for 100% literacy within
x years at a cost perhaps of every literate adult teaching one illiterate person to read
or be fined a month's income.4
2) The relevant knowledgeable experts, appointed by professionals, should be polled
with technical thoroughness for their advice on the programmed steps, their amount,
timing, and compounding, that seems most likely to fulfill the impact goals.
3) The officials responsible for executing the plan should also be polled to measure
their estimates of what amounts of each goal and balances among them seem
feasible.
These three parties (the public, the experts and the officials) are partners in self-governing states. Their roles emphasize respectively the three modes of human behavior:
feeling, knowing, and doing. For the sovereign public chooses goals largely by what they feel
most liking for. The experts advise largely according to what they know to be most alike to
standards of cause and effect, of best means to ends. The officials estimate from previous
experience what they can most likely do. Insofar then as the three partners agree on the goals
of a Plan, one has scientific assurance that the goals are what that nation likes most, thinks
most effective, and is most likely to achieve.
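The idea of converging agreement can be sketched as follows, assuming each candidate goal has been rated on a 0 to 1 scale by polls of the three partners. The goal names, the ratings, the 0.5 threshold, and the rule of ranking by the weakest of the three ratings are all illustrative assumptions, not part of the original method.

```python
def converging_goals(ratings, threshold=0.5):
    """Return goals endorsed by all three partners, strongest agreement first.

    ratings maps a goal name to a dict with keys "public" (liking),
    "experts" (judged effectiveness) and "officials" (judged feasibility).
    """
    agreed = []
    for goal, scores in ratings.items():
        weakest = min(scores["public"], scores["experts"], scores["officials"])
        # A goal counts as agreed only if no partner rates it below threshold.
        if weakest >= threshold:
            agreed.append((goal, weakest))
    agreed.sort(key=lambda pair: pair[1], reverse=True)
    return [goal for goal, _ in agreed]


# Hypothetical poll results for three candidate goals.
ratings = {
    "double heavy industry": {"public": 0.4, "experts": 0.8, "officials": 0.7},
    "full literacy": {"public": 0.9, "experts": 0.7, "officials": 0.6},
    "rural clinics": {"public": 0.8, "experts": 0.9, "officials": 0.8},
}
agreed_goals = converging_goals(ratings)
print(agreed_goals)
```

Ranking by the weakest rating reflects the partnership idea in the text: a goal that one partner rejects does not qualify, however strongly the other two favor it.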
Then the Planning Agency in a second draft of a proposed Plan could synthesize for the
Legislature to adopt the features of the total national plan. It would try to harmonize the
feelings of the public, the knowledge of the experts, and the administrative action of the
government-in-partnership-with-the-experts-and-the-citizenry.
For a fuller discussion of the eight chief dimensions of scientific planning see Chapter
40 on "Social Planning" in my Social Relations in the Middle East (Social Science Series, No.
17. American University of Beirut, 1996, 3rd ed., 904 pp.). This textbook in citizenship was required of all Freshmen in the prewar decades as part of their preparation for national
leadership and independence.
This "Civics" course included a Laboratory in leadership. Each student led a face-to-face
group, directing its weekly meetings throughout the year, under supervision of upperclass
inspectors. Each leader prepared a weekly report that was discussed in class and doubly
graded, once for English by the English Department's teachers, and again for content of social
leadership by the Sociology Department teachers. Each Report had sections on
(1) Problems Emerging in my group last week;
(2) Relevant facts and observations;
(3) Hypotheses or Plans Ahead for reducing these problems; and
(4) Feedback reporting the next week on how those Plans Worked.
For a more developed model or operationally defined theory to help guide, explain, and
predict this whole democratic process of self-governed national planning, I commend for your
study our recent publication on "The Likability Models for Predicting Probable Acts of Man." 7 It
offers a comprehensive model for human individual and mass behavior that is highly testable
and highly applicable to planning. It may be the first comprehensive and predictive theory of
human behavior, especially mass behavior, that is completely testable by polling. National Five
Year Plans seem to be the best laboratories for testing this theory on the one hand, and for
using it as an integrative theory or system for developing national purposes and behavior, on
the other hand.
B. Observing the Facts in Goal-setting
The second step in setting the goals is to observe all the facts that may be relevant to
answering the hierarchy of questions which formulate the problem: What shall be the goals in
this Plan? The facts to observe will be the statistical indices and other data about the region
planned for (and some comparable regions) in recent periods, especially the relevant target
indices of previous plans together with their accurate percentages of fulfillment.
The types of relevant indices to be observed are the standard, necessary and sufficient
factors in any human transaction. A transaction has been defined 8 as a behavioral system or
mathematical product of eight subsystems or sets of factors, namely: the Acts of People for
Wants (i.e. Valued-objects) under context conditions comprising spatial, temporal, material,
verbal and residual circumstances. A planning transaction is then specified by telling: What is
to be done, by whom, why, where, and when, under which material, symbolic, and residual
circumstances.
The transaction called "setting the goals in planning" is then specified by stating the
Plan's goals as follows:
Target indices for each goal or valued-object = the set of "values," V°
Target indices for each act pursuing a goal in units mostly of man-hours of effort = the
set of implementing Acts, A°
Target indices for the subpopulations involved = the set of human agencies, P°
Target indices differentiating by regions = the set of subspaces, L°
Target indices for subperiods, rates, dates = the set of timings, T°
Target indices for material equipment involved, often measured in money units of a
budget = the set of materials, M°
Target indices for symbolic records, publications, etc., used = the set of words, W°
Target indices for residual circumstances, often compounds of the above = the set of
circumstances, C°9
Thus all indices are to be reported in 3 tenses, namely:
In the present tense as current fact;
In the future tense as planned target;
In the past tense as percentage fulfillment of previous targets.
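The reporting of each index in three tenses can be sketched as a small record type. This is a hedged illustration only: the class name `TargetIndex`, its field names, the literacy figures, and the reading of "percentage fulfillment" as achieved value divided by the previous target are assumptions made for the example.

```python
from dataclasses import dataclass

# Letter codes for the eight factor sets of a planning transaction:
# Values, Acts, People, Locations, Timings, Materials, Words, Circumstances.
FACTOR_SETS = ("V", "A", "P", "L", "T", "M", "W", "C")


@dataclass
class TargetIndex:
    name: str
    factor_set: str         # one of FACTOR_SETS
    current: float          # present tense: current fact
    planned: float          # future tense: planned target
    previous_target: float  # target set in the previous plan
    previous_actual: float  # what the previous plan actually achieved

    def __post_init__(self):
        if self.factor_set not in FACTOR_SETS:
            raise ValueError(f"unknown factor set: {self.factor_set}")

    def fulfillment_pct(self):
        """Past tense: percentage fulfillment of the previous target."""
        return 100.0 * self.previous_actual / self.previous_target


# Hypothetical literacy index (percent of adults literate).
literacy = TargetIndex(
    name="adult literacy rate",
    factor_set="V",
    current=45.0,
    planned=70.0,
    previous_target=60.0,
    previous_actual=45.0,
)
print(literacy.fulfillment_pct())
```

Keeping all three tenses on one record lets a planning authority compare what was promised, what was achieved, and what is now proposed for the same index.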
C. Hypothesizing a Set of Goals
The third of Dewey's five steps in problem-solving (which is used here as a primitive
model for scientific methods in outline) is to suggest remedies for the felt difficulty. This means
in more formal terms to hypothesize or propose one or more solutions to the problem at issue
— which is here: What shall the goals be? For every Plan implicitly hypothesizes: "If these
goals are implemented and fulfilled, then the people will be better served and satisfied than by
alternative goals or systems of goals." This calls for indices measuring just how well the people
are "served and satisfied" by the goals that are set (as well as by the executing and reporting
stages which are not discussed here). This implicit hypothesis as to the goals chosen requires
both
1) target indices specifying each goal in measurable amount and also
2) satisfaction indices expressing popular endorsement from polls for that degree of
each target.
Most national plans so far lack such indices of the degree of the nation's knowledge and
appreciation and endorsement of the targets, the implementing program, and the feedback
subsystem, within a total Five Year Plan.
More scientific planning will establish regular polling to feed back to the planning
authority the public's satisfactions and dissatisfactions, with the targets as well as with the
program and reporting. Then revisions (and education of the public if this is indicated) can
continually increase the people's liking for the current goals as well as the other features of the
current Plan. This polling of public satisfaction should increase the likelihood that the citizens
will work well to help fulfill the next period's plan. Such frequent feedback from polls is an
essential factor in making successive plans work.
Such feedback makes a more fully cybernetic or self-governing system of the nation's
life that forever tries to convert national aspirations into fulfillments — wi