File

Psyc 6627 – Lab 4 – Correlation use USairpollution data set from the HSAUR2 package Correlation in Excel 1. z-scores: FISHER and FISHERINV Correlation in R 1. Scatterplot a. choose from integer variables b. Identify points: to use cursor to add ID/row number c. Jitter: to see data points that are overlaying d. Log: use log values e. Marginal boxplots: boxplot for each variable f. Least-squares line: solid green; regression line g. Smooth line: solid red, locally weighted, use to look for deviations from linearity h. Show spread: spread around the smooth line i. Span for smooth: span that is used to determine the smooth line j. Change axis labels k. Subset expression: to use a subset of observations (sex == “male”) l. Plot by groups: select a categorical variable to distinguish between groups 2. Scatterplot matrix a. Least-squares lines b. Smooth lines c. Show spread d. Density plots: on diagonal, similar to a histogram, good way to view the distribution e. Histograms: on diagonal f. Boxplots: on diagonal g. One-dimensional scatterplots: not very useful h. Normal QQ plots: Q = quantile; compares the distribution of a variable to the normal distribution (the straight line), good way to look at linearity of the data i. Subset and Plot by groups 3. Calculating correlations a. Statistics, Summaries, Correlation Matrix i. Pairwise p-values: to get matrix of p-values b. Statistics, Summaries, Correlation Text (only possible to select two variables) i. Alternative Hypothesis (one and two-tailed options) ii. Output: t, df, and p-value, statement of the alternative hypothesis, 95% CI of the correlation, sample estimate (the correlation value) 4. Correlations with script #calculate and output the correlations and covariance for two variables cor(temp, precip) cov(temp, precip) ###calculate correlations for all variables in a dataframe or matrix #use: how to handle missing data, other options are all.obs (for no missing data) and complete.obs (listwise deletion) #method: type of correlation, other options are spearman and kendall cor(USairpollution, use="pairwise.complete.obs", method="pearson") cov(USairpollution, use="pairwise.complete.obs") ###Compute correlations between certain variables (not a whole matrix) #these variables will be the rows; the numbers mean variables 1 and 2 from the data set x <- USairpollution[1:2] #these variables will the the columns; the numbers mean variables 3 through 7 y <- USairpollution[3:7] cor(x,y) cov(x,y) ###Calculate correlations and covariances with significance levels. #data structure has to be a matrix. Only uses pairwise deletion #load the Hmisc package library (Hmisc) #convert the data from to a matrix and calculate the full correlation matrix rcorr(as.matrix(USairpollution)) #calculate correlations for pairs of variables rcorr(temp, precip) ###Calculate correlation and covariance for different levels of a categorical variable #attach Cowles data set because it has a categorical variables data(Cowles, package="car") attach(Cowles) #load the package called plyr library(plyr) #ddply = input is a dataframe and output is a dataframe #Cowles is the name of the data set #sex is the variable we are splitting by #neuroticm and extraversion are the variables being correlated ddply(Cowles, .(sex), summarise, "corr" = cor(neuroticism, extraversion)) ddply(Cowles, .(sex), summarise, "covariance" = cov(neuroticism, extraversion)) ###Calculate confidence intervals library(psychometric) CIr(r=.90, n = 100, level = .95) ###Compare independent correlations #Install and load the psych package library (psych) #(r1, r2, NULL, n, n2=) NULL is needed because there is not a third correlation, n2= is optional paired.r(.52, .12, NULL, 60, n2=50) ###Compare dependent correlations #Use the psych package #library (psych) #(r31, r32, r12, N) paired.r(.20,.45,.15,103) 5. Online calculator for p-values: http://www.socscistatistics.com/pvalues/pearsondistribution.aspx

File

Related documents

Products

Support

File

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib