METHODS OF KNOWING (Charles Peirce)
1. TENACITY
2. AUTHORITY
3. INTUITION
4. SCIENCE

TENACITY
• "We've always done it this way"
• "When I was a kid, school was…"

AUTHORITY
• X says…
• The Bible (Koran, Confucius, Vedic scripture) says…
• The Founding Fathers meant…
• Dr. Saxon's method says…

INTUITION
• It is only common sense that…
• It is obvious…
• Anyone can see…
• Everyone agrees…
• The majority says…

SCIENCE
• DISCIPLINED INQUIRY
1. Self-correction
2. Objective orientation: intersubjective confirmability
3. Active and passive exploration

DESCRIPTION-EXPLANATION
• Specificity (uniqueness) vs. universality (generalizability)
• Phenomenology vs. logical positivism
• Naturalistic (observe the existing) vs. scientific (experimentation, observation)
• Multiple realities (social, personal) vs. objective reality (multiple perceptions)
• Constructivism vs. cognitive science
• Ideological linkage (social justice, etc.) vs. apolitical orientation

EXPERIMENTATION
• Systematic intervention into a natural system
• Careful observation of conditions:
- Before/after
- Comparison with a different case/system
• Attribution of CAUSE to the intervention

CAUSE
• Does not exist: Hume; Bertrand Russell (extreme logical positivist: relationships among natural variables can be described by mathematical statements)
• Essentialists: a cause exists if and only if presence of the cause results in an EFFECT, and the effect occurs only when the cause is present

CAUSE
• Empiricists: J. S. Mill argued three necessary conditions:
- AGREEMENT: the effect occurs after the cause
- DIFFERENCE: the effect does not occur when the cause is absent
- CONCOMITANT VARIATION: support for a causal relationship is greater when AGREEMENT and DIFFERENCE are observed close in time

CAUSE: NONMANIPULABLE AND MANIPULABLE
• Some causes can be observed but not manipulated due to ethical, natural, or practical considerations
• e.g., gender causes differences in career interest
• e.g., PTSS causes hallucinations
• e.g.
Divorce causes insecurity in children

CAUSE: MOLAR AND MOLECULAR
• Molecular causes: focus on small-scale units, subparts
• Molar causes: focus on larger systems that are made up of molecular causes
• e.g., teaching a student to decode words causes improvement in simple comprehension (molar)
• e.g., teaching a student to sound out beginning letters causes decoding gains (molecular)

EXPERIMENTS
• RANDOMIZED
- R. A. Fisher, 1923: split-plot randomization of treatments; treatment as cause vs. chance
• NONRANDOMIZED
- Internal validity threats as alternative causal explanations

NATURAL EXPERIMENTS
• Time series experiments
- Occurrence of a new condition in one place/group
- Nonoccurrence in a similar place/group

[Figure: time series plot of the outcome by case number]
[Figure: time series plot of the outcome by month]

CONSTRUCT VALIDITY
• Single study as a data point
• Variables in a study represent constructs; they rarely define them
• Study results often interact with the construct definitions of the study: revision based on the conduct and findings of the study

EXTERNAL VALIDITY
• 3 constructs:
- Population: persons/groups
- Ecology: settings
- Time: absolute or relative time

SAMPLING ISSUES
• Formal random sampling theory
- Statistical theory of estimation
• Purposive sampling
- Ignore sampling theory; focus on particulars
• Convenience sampling
- Availability of participants/subjects

GROUNDED THEORY OF GENERALIZATION
• Surface similarity of sample to population
• Rule out irrelevant variables, conditions
• Identify limits to generalization: discriminations
• Interpolate and extrapolate data
• Develop causal explanations: covariances and directional relationships

CRITIQUES AND CRITICS
• Kuhn: theories are incommensurable
- theory vs. observation/data
• Collins: science is social construction
• Trust vs.
skepticism in science
• External reality of data vs. theories
• Science as a preferred human predisposition:
- an evolutionary result?

MULTIPLE REGRESSION ANALYSIS
• Two or more interval-scale predictors
• Single interval dependent variable
• Predictors "known without error"
• Model: y = b1x1 + b2x2 + b0 + e
• Predicted score: yhat = b1x1 + b2x2 + b0, where yhat is the estimate from the model
• OLS estimation is most commonly used

MULTIPLE REGRESSION ANALYSIS (MRA)
• The test of the overall hypothesis that y is unrelated to all predictors is equivalent to
H0: ρ²y·123… = 0
H1: ρ²y·123… > 0
and is tested by
F = [ R²y·123… / p ] / [ (1 − R²y·123…) / (n − p − 1) ]
F = [ SSreg / p ] / [ SSe / (n − p − 1) ]

ANOVA TABLE FOR MRA

SOURCE        df       Sum of Squares   Mean Square        F
x1, x2, …     p        SSreg            SSreg / p          (SSreg / p) / (SSe / (n − p − 1))
e (residual)  n−p−1    SSe              SSe / (n − p − 1)
total         n−1      SSy              SSy / (n − 1)

• Table 8.2: Multiple regression table for Sums of Squares

MULTIPLE REGRESSION ANALYSIS PREDICTING DEPRESSION (SPSS output)

Model Summary
Model 1: R = .774, R Square = .600, Adjusted R Square = .596, Std. Error of the Estimate = 6.120
a. Predictors: (Constant), t11, t9, t10

ANOVA
Regression: Sum of Squares = 21819.235, df = 3, Mean Square = 7273.078, F = 194.162, Sig. = .000
Residual:   Sum of Squares = 14571.498, df = 389, Mean Square = 37.459
Total:      Sum of Squares = 36390.733, df = 392
a. Predictors: (Constant), t11, t9, t10
b. Dependent Variable: t6
(Predictors: LOCUS OF CONTROL, SELF-ESTEEM, SELF-RELIANCE)

VENN DIAGRAMS
• Venn diagrams are "heuristic" only for more than two predictors
• They do not correctly separate sums of squares for 3 or more predictors (cannot represent 4 or more dimensions correctly in flat 2-D space)
• Still give us an idea of the rationale for predictor additions to prediction

[Figure: overlapping circles for SSy, ssx1, ssx2; the overlap with SSy is SSreg and the remainder is SSe] Fig.
8.4: Venn diagram for multiple regression with two predictors and one outcome measure

TYPE I AND TYPE III SUMS OF SQUARES
• Type I sums of squares are the SS accounted for by a predictor in a specific order: the analyst specifies the order or allows the MRA program to pick the order
- Forward regression: the best predictor (most SS accounted for) is included first; the second predictor is the one that accounts for the most additional SS
• Type III sums of squares are the unique SS accounted for by a predictor

[Figure: Venn diagram shading ordered contributions] Fig. 8.5: Type I contributions
[Figure: Venn diagram shading unique contributions] Fig. 8.6: Type III unique contributions

MULTIPLE REGRESSION ANOVA TABLE

SOURCE    df     Sum of Squares (Type I)   Mean Square       F
Model     2      SSreg                     SSreg / 2         (SSreg / 2) / (SSe / (n − 3))
x1        1      SSx1                      SSx1 / 1          (SSx1 / 1) / (SSe / (n − 3))
x2|x1     1      SSx2|x1                   SSx2|x1 / 1       (SSx2|x1 / 1) / (SSe / (n − 3))
e         n−3    SSe                       SSe / (n − 3)
total     n−1    SSy                       SSy / (n − 1)

Table 8.3: Multiple regression table for Sums of Squares of each predictor

PATH MODELS FOR MRA
• Predictors are called exogenous variables
- Exogenous variables never have a straight path arrow directed toward them
- They may have curved "correlation" arrows connecting them to other exogenous variables
• Dependent variables are called endogenous variables
- They always have at least one straight path arrow directed toward them
- Endogenous variables may be predictors of other endogenous variables in more complex models

ACCOUNTING FOR CORRELATION IN PATH MODELS
• The correlation between any two variables is the sum of the path effects between them
• Any path effect is a unique pathway that may pass through any number of other exogenous and/or endogenous variables
• The total path effect is computed by multiplying all path coefficients together along a path

PATH DIAGRAM FOR
REGRESSION

[Figure: path diagram with paths X1 → Y = .5, X2 → Y = .6, r(X1,X2) = .4, e → Y = .387]

r(x2,y) = .6 (direct path) + .4 × .5 (unanalyzed path) = .80
r(x1,y) = .5 (direct path) + .4 × .6 (unanalyzed path) = .74

PATH DIAGRAM FOR REGRESSION

R² = [ r²x1y + r²x2y − 2(rx1x2)(rx1y)(rx2y) ] / (1 − r²x1x2)
R² = [ .74² + .80² − 2(.74)(.80)(.4) ] / (1 − .4²) = .85

ESTIMATING REGRESSION WEIGHTS
• Normal equations are computed (OLS)
• Least squares estimates are BLUE (best linear unbiased estimators)
• "Raw" or unstandardized regression weights are used to predict an outcome score for a specific set of predictor values
• The statistical test of significance for a b-weight is t-distributed with n − p − 1 degrees of freedom (n = number of cases, p = number of predictors)

ESTIMATING REGRESSION WEIGHTS
• SPSS Regression (in Analyze) provides b estimates and also "standardized" estimates
• Standardized estimates are also called beta weights
- Beta weights are regression coefficients when all predictors and the dependent variable have mean zero and variance 1 (z-scores)
• Only raw-weight t-statistics can be formally interpreted in hypothesis tests

DEPRESSION (SPSS Coefficients output)

Model 1       B        Std. Error   Beta     t        Sig.
(Constant)    51.939   3.305                 15.715   .000
t9            .440     .034         .471     12.842   .000
t10           -.302    .036         -.317    -8.462   .000
t11           -.181    .035         -.186    -5.186   .000
a. Dependent Variable: t6

[Figure: path diagram with LOC. CON. (.471), SELF-EST (−.317), SELF-REL (−.186) predicting DEPRESSION; error path e = .63; R² = .60]

SHRINKAGE R²
• Different definitions; ask which is being used:
- What is the population value for a sample R²?
R²s = 1 − (1 − R²)(n − 1) / (n − k − 1)
- What is the cross-validation from sample to sample?
R²sc = 1 − (1 − R²)(n + k) / (n − k)

ESTIMATION METHODS
• Types of estimation:
- Ordinary Least Squares (OLS): minimize the sum of squared errors around the prediction line
- Generalized Least Squares: a regression technique used when the error terms from an ordinary least squares regression display non-random patterns such as autocorrelation or heteroskedasticity
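The overall F test and the two shrinkage formulas above can be checked with a few lines of Python. This is a minimal sketch, assuming the depression example's rounded values (R² = .600, n = 393 cases, k = p = 3 predictors); the small differences from the SPSS output come from that rounding.

```python
# Hedged sketch: overall F test and shrinkage adjustments computed from a
# sample R^2. Values (R2 = .600, n = 393, k = 3) are taken from the
# depression example; results differ slightly from SPSS because R2 is rounded.

def overall_f(r2, n, p):
    """F = [R2 / p] / [(1 - R2) / (n - p - 1)] for the overall MRA hypothesis."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

def r2_population(r2, n, k):
    """Estimated population R^2: 1 - (1 - R2)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def r2_cross_validation(r2, n, k):
    """Estimated sample-to-sample cross-validated R^2: 1 - (1 - R2)(n + k)/(n - k)."""
    return 1 - (1 - r2) * (n + k) / (n - k)

r2, n, k = 0.600, 393, 3
print(round(overall_f(r2, n, k), 1))          # 194.5 (SPSS reports 194.162)
print(round(r2_population(r2, n, k), 3))      # 0.597 (SPSS adjusted R2 = .596)
print(round(r2_cross_validation(r2, n, k), 3))
```

Note that the cross-validation estimate shrinks R² more than the population estimate does, which is the point of asking which definition a given program is using.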
– Maximum Likelihood

MAXIMUM LIKELIHOOD ESTIMATION
• There is nothing visual about the maximum likelihood method, but it is a powerful method and, at least for large samples, very precise
• Maximum likelihood estimation begins with writing a mathematical expression known as the likelihood function of the sample data. Loosely speaking, the likelihood of a set of data is the probability of obtaining that particular set of data, given the chosen probability distribution model. This expression contains the unknown model parameters.
• The values of these parameters that maximize the sample likelihood are known as the Maximum Likelihood Estimates, or MLEs. Maximum likelihood estimation is a totally analytic maximization procedure.

[Figure: likelihood function L plotted against b values from 0 to 3, with maximum L at b = 1.345]

MAXIMUM LIKELIHOOD ESTIMATION
• MLEs and likelihood functions generally have very desirable large-sample properties:
- They become unbiased, minimum-variance estimators as the sample size increases
- They have approximately normal distributions and approximate sample variances that can be calculated and used to generate confidence bounds
- Likelihood functions can be used to test hypotheses about models and parameters
• With small samples, MLEs may not be very precise and may even generate a line that lies above or below the data points
• There are only two drawbacks to MLEs, but they are important ones:
1. With small numbers of failures (less than 5, and sometimes less than 10, is small), MLEs can be heavily biased and the large-sample optimality properties do not apply
2. Calculating MLEs often requires specialized software for solving complex non-linear equations. This is less of a problem as time goes by, as more statistical packages add MLE analysis capability every year.
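The idea of maximizing a likelihood function over parameter values, as in the figure above, can be illustrated with a small numerical sketch (this is not the slides' software, and the data values are hypothetical): estimate the rate of an exponential model by scanning candidate rates and keeping the one with the largest log-likelihood, then compare to the known analytic answer.

```python
# Minimal MLE sketch on hypothetical data: grid-search the log-likelihood of
# an exponential model, f(x; rate) = rate * exp(-rate * x).
import math

data = [0.8, 1.1, 0.3, 2.0, 0.6, 1.4]  # hypothetical sample

def log_likelihood(rate, xs):
    """Sum of log f(x; rate) over the sample."""
    return sum(math.log(rate) - rate * x for x in xs)

# Scan a grid of candidate rates; the MLE is the grid point with the
# largest log-likelihood (the peak of the likelihood function).
grid = [i / 1000 for i in range(1, 5000)]
mle = max(grid, key=lambda r: log_likelihood(r, data))

# For the exponential model the analytic MLE is 1 / (sample mean),
# so the grid search should land within one grid step of it.
analytic = len(data) / sum(data)
print(mle, analytic)
```

With real models the likelihood peak usually has no closed form, which is why, as the slide notes, MLE software must solve non-linear equations numerically; the grid search here is just the simplest stand-in for that step.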
OUTLIERS
• Leverage (for a single predictor):
Li = 1/n + (Xi − Mx)² / Σ(Xj − Mx)²   (min = 1/n, max = 1)
• Values larger than 1/n by a large amount should be of concern
• Cook's Di = Σ(Ŷj − Ŷj(i))² / [(k + 1)MSres]: the difference between the predicted Y values computed with and without case i

OUTLIERS
• In SPSS Regression, under the SAVE option, both leverage and Cook's D will be computed and saved as new variables with values for each case

t12  t13  t14  COO_1    LEV_1
63   42   41   .03855   .01520
56   56   41   .02422   .04943
77   52   39   .02065   .02010
30   53   52   .01915   .02349
55   42   59   .01696   .01056
48   50   50   .01689   .02435
55   55   60   .01525   .01520
39   39   65   .01448   .01607
39   45   60   .01425   .02289
55   39   80   .01242   .01346
80   65   44   .01133   .03147
68   46   52   .01060   .00693
57   41   65   .01047   .00512
54   65   60   .00918   .02459
68   68   41   .00907   .01098
68   68   46   .00885   .00160

• Might reanalyze with these data points omitted
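The single-predictor leverage formula above can be sketched in a few lines of Python; the predictor values here are hypothetical, not the SPSS data shown. A useful sanity check: each leverage lies between 1/n and 1, and with one predictor the leverages sum to exactly 2 (one unit for the intercept, one for the slope).

```python
# Sketch of the slide's single-predictor leverage formula:
#   L_i = 1/n + (X_i - M_x)^2 / sum_j (X_j - M_x)^2

def leverage(xs):
    n = len(xs)
    mx = sum(xs) / n                          # M_x, the predictor mean
    ssx = sum((x - mx) ** 2 for x in xs)      # sum of squared deviations
    return [1 / n + (x - mx) ** 2 / ssx for x in xs]

xs = [41, 41, 39, 52, 59, 50, 60, 65, 60, 80]  # hypothetical predictor values
lev = leverage(xs)
print([round(l, 3) for l in lev])
# The largest values belong to the cases farthest from the mean of X,
# which is what "larger than 1/n by a large amount" flags.
```

Cook's D needs the full regression fit with each case deleted in turn, so it is more work to sketch by hand; in practice the SAVE option above (or an equivalent routine in another package) computes both diagnostics per case.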