R Language Lecture Notes (Including Various Regression Methods)

R Language Lecture Notes
Li Dawei
With respect to Professor Wu Xizhi
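Advantages of R
• Free
• Portable: runs on Windows, Mac, and the various Unix systems
• Open source (neither a black box nor a miser)
• A syntax that is easy to learn, and programmable for complex tasks
• Extensible: thousands of packages available online, covering different fields, purposes, and methods, help you reach your goal. You can also contribute your own methods
• Powerful graphics
• An excellent built-in help system
• Community support, with continual updates and fixes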
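R:
• A language known to the vast majority of statistics graduate students in the United States
• Berkeley offers R courses to undergraduates in both statistics and applied mathematics
• Most applied statisticians in the United States implement their methods in R first and try to post them on the R website
• In a little over a year the number of packages on the R website tripled, from nearly 1000 to nearly 3000. Most of them include functions for computation, demonstration, and input/output, along with example data
• All code is open and can be modified
• Transparency is the best way to prevent "corruption"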
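Downloading R (http://www.r-project.org/)
Click CRAN to get a list of mirror sites
Click a mirror site, for example Berkeley
Choose base
Choose this to download the installer
Choose this to download packages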
Packages (each comes with plenty of data and with functions/programs that can be read and modified)
base The R Base Package
boot Bootstrap R (S-Plus) Functions (Canty)
class Functions for Classification
cluster Cluster Analysis Extended Rousseeuw et al.
concord Concordance and reliability
datasets The R Datasets Package
exactRankTests Exact Distributions for Rank and Permutation Tests
foreign Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, ...
graphics The R Graphics Package
grDevices The R Graphics Devices and Support for Colours and Fonts
grid The Grid Graphics Package
KernSmooth Functions for kernel smoothing for Wand & Jones (1995)
lattice Lattice Graphics Interface
tools Tools for Package Development
utils The R Utils Package
Packages (continued)
MASS Main Package of Venables and Ripley's MASS
methods Formal Methods and Classes
mgcv GAMs with GCV smoothness estimation and GAMMs by REML/PQL
multtest Resampling-based multiple hypothesis testing
nlme Linear and nonlinear mixed effects models
nnet Feed-forward Neural Networks and Multinomial Log-Linear Models
nortest Tests for Normality
outliers Tests for outliers
pls Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR)
pls.pcr PLS and PCR functions
rpart Recursive Partitioning
SAGx Statistical Analysis of the GeneChip
sma Statistical Microarray Analysis
spatial Functions for Kriging and Point Pattern Analysis
splines Regression Spline Functions and Classes
stats The R Stats Package
stats4 Statistical Functions using S4 Classes
survival Survival analysis, including penalised likelihood.
tcltk Tcl/Tk Interface
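Packages (online)
There are many more online
All of these packages can be downloaded freely
The packages in base contain commonly used functions and data
The other packages contain methods and data developed by statisticians working in various areas.
We hope you will become one of the next authors to contribute to these packages.
Installing packages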
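Installing and loading a package from the command line can be sketched as follows. The package name MASS is only an illustration (it usually ships with R), and install.packages() assumes an internet connection and a reachable CRAN mirror.

```r
# Install a package only if it is not already available, then load it.
pkg <- "MASS"  # illustrative choice; any CRAN package name can be used
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)  # downloads from the selected CRAN mirror
}
library(pkg, character.only = TRUE)  # character.only lets us pass a string
print(packageVersion(pkg))
```

Checking with requireNamespace() first avoids downloading a package again every time the script runs.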
Some useful functions
Function: f(x): name(arguments)
getwd()
setwd(dir = "f:/2010stat") # or setwd("f:/2010stat")
getwd()
x=rnorm(100)
ls()
?rnorm # or help(rnorm)
apropos("norm")
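Assignment and operations
z = rnorm(1000000,4,0.1)
median(z)
Assignment: "=" can be replaced by "<-"
x<-z->y->w
Simple arithmetic operators include:
+, -, *, /, ^, %*% (matrix product), %% (mod),
%/% (integer division), and so on
Commonly used mathematical functions include: abs, sign, log, log2, log10, logb,
expm1, log1p(x), sqrt, exp, sin, cos, tan, acos, asin, atan, cosh,
sinh, tanh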
Assignment and operations
round, floor, ceiling
gamma, lgamma, digamma, and trigamma
sum, prod, cumsum, cumprod
max, min, cummax, cummin, pmax, pmin, range
mean, length, var, duplicated, unique
union, intersect, setdiff
>, >=, <, <=, &, |, !
Order of evaluation, from highest to lowest precedence
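A few one-liners can make the precedence order concrete. This is only a sketch; the variable names r1 to r6 are made up for illustration, and the full table is in ?Syntax.

```r
# Precedence, highest first: ^, unary minus, :, %any%, * /, + -,
# comparisons, !, & &&, | ||, and finally assignment.
r1 <- -2^2                  # ^ binds tighter than unary minus: -(2^2)
r2 <- 1:3^2                 # ^ before ":"  : 1:(3^2)
r3 <- 2 * 1:3               # ":" before "*": 2 * (1:3)
r4 <- 2 * 5 %% 3            # %% before "*" : 2 * (5 %% 3)
r5 <- !1 == 2               # comparison before "!": !(1 == 2)
r6 <- TRUE | FALSE & FALSE  # & before |: TRUE | (FALSE & FALSE)
print(list(r1, r2, r3, r4, r5, r6))
```

When in doubt, adding parentheses costs nothing and documents the intent.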
Some basic examples
x=1:100
(x=1:100)
sample(x,20)
set.seed(0);sample(1:10,3) # random seed!
z=sample(1:200000,10000)
z[1:10] # vector indexing
y=c(1,3,7,3,4,2)
z[y]
Some basic examples
z=sample(x,20,rep=T)
z
(z1=unique(z));length(z1)
z=sample(x,100,rep=T)
xz=setdiff(x,z)
sort(union(xz,z))
sort(union(xz,z))==x
setequal(union(xz,z),x)
intersect(1:10,7:50)
sample(1:100,20,prob=1:100)
Some basic examples
pi * 10^2 # use ?"*" to see help on the basic arithmetic operators
"*"(pi, "^"(10, 2))
pi * (1:10)^2
x <- pi * 10^2
x
print(x)
(x=pi * 10^2)
pi^(1:5)
print(x, digits = 12)
class(x)
typeof(x)
Some basic examples
class(cars)
typeof(cars)
names(cars)
summary(cars)
str(cars)
row.names(cars)
class(dist ~ speed)
plot(dist ~ speed,cars)
Some basic examples
head(cars) # same as cars[1:6,]
tail(cars)
ncol(cars);nrow(cars)
dim(cars)
lm(dist ~ speed, data = cars)
cars$qspeed = cut(cars$speed, breaks = quantile(cars$speed), include.lowest = TRUE)
names(cars)
cars[3]
table(cars[3])
is.factor(cars$qspeed)
plot(dist ~ qspeed, data = cars)
(a=lm(dist ~ qspeed, data = cars))
summary(a)
Some basic examples
x <- round(runif(20,0,20), digits=2)
summary(x)
min(x);max(x)
median(x) # median
mean(x) # mean
var(x) # variance
sd(x) # standard deviation
sqrt(var(x))
rank(x) # rank
order(x)
x[order(x)]
sort(x)
sort(x,decreasing=T)#sort(x,dec=T)
sum(x);length(x)
round(x)
Some basic examples
fivenum(x) # quantiles
quantile(x) # quantiles (different convention); several definitions exist
quantile(x, c(0,.33,.66,1))
mad(x) # median absolute deviation from the median (scaled by 1.4826 by default); see ?mad
cummax(x)
cummin(x)
cumprod(x)
cor(x,sin(x/20)) # correlation
Some basic examples
# histogram
x <- rnorm(200)
hist(x, col = "light blue")
rug(x)
# stem-and-leaf plot
stem(x)
# scatter plot
N <- 500
x <- rnorm(N)
y <- x + rnorm(N)
plot(y ~ x)
a=lm(y~x)
abline(a,col="red") # or abline(lm(y~x),col="red")
print("Hello World!")
paste("minimum of x = ", min(x))
#cat("\\end{document}\n", file="RESULT.tex", append=TRUE)
demo(graphics) # graphics demo
Some basic examples
# complex arithmetic
x=2+3i
(z <- complex(real=rnorm(10), imaginary =rnorm(10)))
complex(re=rnorm(3),im=rnorm(3))
Re(z)
Im(z)
Mod(z)
Arg(z)
choose(3,2);factorial(6)
# solving equations
f =function(x) x^3-2*x-1
uniroot(f,c(0,2)) # iterative root finding
# if the root is known to be an extremum
f =function(x) x^2+2*x+1
optimize(f,c(-2,2))
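Distributions and random number generation
• Normal distribution: pnorm(1.2,2,1); dnorm(1.2,2,1); qnorm(.7,2,1); rnorm(10,0,1) # same as rnorm(10)
• t distribution: pt(1.2,1); dt(1.2,2); qt(.7,1); rt(10,1)
• Also available: exponential, F, chi-squared, Beta, binomial, Cauchy, Gamma, geometric, hypergeometric, log-normal, logistic, negative binomial, Poisson, uniform, Weibull, and Wilcoxon distributions, among others
Arguments can be vectors!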
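The d/p/q/r naming scheme and the vectorised arguments can be illustrated with a short sketch. The specific numbers are arbitrary choices for illustration.

```r
# d = density, p = cumulative probability, q = quantile, r = random numbers.
p <- pnorm(1.2, mean = 2, sd = 1)   # P(X <= 1.2) for X ~ N(2, 1)
x <- qnorm(p, mean = 2, sd = 1)     # the quantile function inverts pnorm
# Arguments may be vectors: one call evaluates several points at once.
probs <- pnorm(c(-1.96, 0, 1.96))
set.seed(1)
draws <- rnorm(5, mean = c(0, 100), sd = 1)  # the two means are recycled
print(x)
print(round(probs, 3))
print(draws)
```

Because the mean vector is recycled, the draws alternate between values near 0 and values near 100.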
Reading and writing data
# the next two lines are equivalent to x=c(1.5,2.6,3.7,2.1,8.9,12,-1.2,-4)
x=scan()
1.5 2.6 3.7 2.1 8.9 12 -1.2 -4
setwd("f:/2010stat") # or setwd("f:\\2010stat")
(x=rnorm(20))
write(x,"f:/2010stat/test.txt")
y=scan("f:/2010stat/test.txt");y
y=iris;y[1:5,];str(y)
write.table(y,"f:/2010stat/test.txt")
w=read.table("f:/2010stat/test.txt",header=T)
str(w)
write.csv(y,"f:/2010stat/test.csv")
v=read.csv("f:/2010stat/test.csv")
str(v)
data=read.table("clipboard")
write.table(data,"clipboard")
Sequences and vectors
z=seq(-1,10,length=100)#z=seq(-1,10, len=100)
z=seq(10,-1,-1) #z=10:-1
x=rep(1:3,3)
x=rep(3:5,1:3)
x=rep(c(1,10),c(4,5))
w=c(1,3,x,z);w[3]
x=rep(0,10);z=1:3;x+z
x*z
rev(x)
z=c("no cat","has ","nine","tails")
z[1]=="no cat"
z=1:5
z[7]=8;z
z=NULL
z[c(1,3,5)]=1:3; z
rnorm(10)[c(2,5)]
z[-c(1,3)] # drop elements 1 and 3
z=sample(1:100,10);z
which(z==max(z)) # gives the index
向量矩阵
x=sample(1:100,12);x
all(x>0);all(x!=0);any(x>0);(1:10)[x>0]
diff(x)
diff(x,lag=2)
x=matrix(1:20,4,5);x
x=matrix(1:20,4,5,byrow=T);x
t(x)
x=matrix(sample(1:100,20),4,5)
2*x
x+5
y=matrix(sample(1:100,20),5,4)
x+t(y)
(z=x%*%y)
z1=solve(z) # solve(a,b) solves the equation ax=b
z1%*%z
round(z1%*%z,14)
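Matrices
nrow(x); ncol(x);dim(x) # numbers of rows and columns
x=matrix(rnorm(24),4,6)
x[c(2,1),] # rows 2 and 1
x[,c(1,3)] # columns 1 and 3
x[2,1] # element [2,1]
x[x[,1]>0,1] # elements of column 1 that are greater than 0
sum(x[,1]>0) # number of elements of column 1 greater than 0
sum(x[,1]<=0) # number of elements of column 1 not greater than 0
x[,-c(1,3)] # x without columns 1 and 3
diag(x)
diag(1:5)
diag(5)
x[-2,-c(1,3)] # x without row 2 and without columns 1 and 3
x[x[,1]>0&x[,3]<=1,1] # elements of column 1 that are > 0 and whose column-3 entry is <= 1 ("and")
x[x[,2]>0|x[,1]<.51,1] # elements of column 1 that are < .51 or whose column-2 entry is > 0 ("or")
x[!x[,2]<.51,1] # elements of column 1 whose column-2 entry is not less than .51 ("not")
apply(x,1,mean);apply(x,2,sum)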
Matrices / higher-dimensional arrays
# upper and lower triangular matrices
x=matrix(rnorm(24),4,6)
diag(x)
diag(1:5)
diag(5)
x[lower.tri(x)]=0 # also x[upper.tri(x)]=0; diag(x)=0
x=array(runif(24),c(4,3,2));x
is.matrix(x) # dim(x) gives the dimensions (4,3,2)
is.matrix(x[1,,])
x=array(1:24,c(4,3,2))
x[c(1,3),,]
x=array(1:24,c(4,3,2))
apply(x,1,mean)
apply(x,1:2,sum)
apply(x,c(1,3),prod)
Matrices / higher-dimensional arrays / scale
# operations between a matrix and a vector
x=matrix(1:20,5,4)
sweep(x,1,1:5,"*")
x*1:5
sweep(x,2,1:4,"+")
(x=matrix(sample(1:100,24),6,4));(x1=scale(x))
(x2=scale(x,scale=F)); (x3=scale(x,center=F))
round(apply(x1,2,mean),14)
apply(x1,2,sd)
round(apply(x2,2,mean),14);apply(x2,2,sd)
round(apply(x3,2,mean),14);apply(x3,2,sd)
Data.frame
x=matrix(1:6,2,3)
z=data.frame(x);z
z$X2
attributes(z)
names(z)=c("TOYOTA","GM","HUNDA")
row.names(z)=c("2001","2002")
z
attach(z)
GM
detach(z)
GM # error: GM is no longer on the search path after detach()
sapply(z,is.numeric) # or apply(z,2,is.numeric)
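As an alternative to attach()/detach(), columns can be reached with with() or with $, which leaves the search path untouched. The sketch below rebuilds a small data frame with the same column names; the added column share is hypothetical and only for illustration.

```r
# Same layout as the data frame built from matrix(1:6, 2, 3)
z <- data.frame(TOYOTA = 1:2, GM = 3:4, HUNDA = 5:6,
                row.names = c("2001", "2002"))
gm_total <- with(z, sum(GM))   # columns are visible only inside with()
z$share <- z$GM / rowSums(z[, c("TOYOTA", "GM", "HUNDA")])  # GM share per year
print(gm_total)
print(z)
```

Because nothing is attached, a later object named GM in the workspace cannot be confused with the column.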
Missing values and related topics
airquality
complete.cases(airquality) # which rows have no missing values
which(complete.cases(airquality)==F)
sum(complete.cases(airquality))
na.omit(airquality)
# append, cbind, rbind
x=1:10;x[12]=3
(x1=append(x,77,after=5))
cbind(1:3,4:6);rbind(1:3,4:6)
# remove duplicated rows of a matrix
(x=rbind(1:5,runif(5),runif(5),1:5,7:11))
x[!duplicated(x),]
unique(x)
List
# a list can hold any collection of objects (including other lists)
z=list(1:3,Tom=c(1:2, a=list("R",letters[1:5]),w="hi!"))
z[[1]];z[[2]]
z$T
z$T$a2
z$T[[3]]
z$T$w
attributes(airquality) # attributes!
airquality$Ozone
attributes(matrix(1:6,2,3))
Categorical data
A survey asks people if they smoke or not. The
data is Yes, No, No, Yes, Yes
x=c("Yes","No","No","Yes","Yes")
table(x);x
factor(x)
•Barplot:Suppose, a group of 25 people are surveyed as to their beer-drinking preference.
The categories were (1) Domestic can, (2) Domestic bottle, (3) Microbrew and (4) Import.
The raw data is 3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1
beer = scan()
3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1
barplot(beer) # this isn't correct
barplot(table(beer)) # Yes, call with summarized data
barplot(table(beer)/length(beer)) # divide by n for proportion
table(beer)/length(beer)
Table/categorical data
library(MASS)
quine
attach(quine)
table(Age)
table(Sex, Age);
tab=xtabs(~ Sex + Age, quine);
unclass(tab)
tapply(Days, Age, mean)
tapply(Days, list(Sex, Age), mean)
#apply, sapply, tapply, lapply
smokes = c("Y","N","N","Y","N","Y","Y","Y","N","Y")
amount = c(1,2,2,3,3,1,2,1,3,2)
(tmp=table(smokes,amount)) # store the table
options(digits=3) # only print 3 decimal places
prop.table(tmp,1) # the rows sum to 1 now
prop.table(tmp,2) # the columns sum to 1 now
# the two lines above are equivalent to the two lines below
sweep(tmp, 1, margin.table(tmp, 1), "/")
sweep(tmp, 2, margin.table(tmp, 2), "/")
prop.table(tmp) # all the numbers sum to 1
options(digits=7) # restore the number of digits
array/matrix <-> table <-> data.frame
## Start with a contingency table.
ftable(Titanic, row.vars = 1:3)
ftable(Titanic, row.vars = 1:2)
w=data.frame(Titanic) # turn the array into a data.frame
a=xtabs(Freq~Survived+Sex, w)
biplot(corresp(a, nf=2)) # one application
## Start with a data frame.
str(mtcars)
x <- ftable(mtcars[c("cyl", "vs", "am", "gear")])
x # an array whose dimensions are ordered as ("cyl", "vs", "am", "gear")
ftable(x, row.vars = c(2, 4)) # choose the row variables of the table from x (an array)
## Start with expressions, use table()'s "dnn" to change labels
ftable(mtcars$cyl, mtcars$vs, mtcars$am, mtcars$gear, row.vars = c(2, 4),
dnn = c("Cylinders", "V/S", "Transmission", "Gears"))
ftable(vs~carb,mtcars) # vs in columns, carb in rows; or ftable(mtcars$vs~mtcars$carb)
ftable(carb~vs,mtcars) # vs in rows, carb in columns
ftable(mtcars[,c(8,11)]) # equivalent to ftable(carb~vs,mtcars) above
ftable(breaks~wool+tension,warpbreaks)
#as.data.frame
(DF <- as.data.frame(UCBAdmissions)) # equivalent to data.frame(UCBAdmissions)
xtabs(Freq ~ Admit+ Gender + Dept, DF) # turn the data frame back into the original contingency table
(a=xtabs( Freq~ Admit + Gender, data=DF)) # with no frequencies (weights), leave the left-hand side empty
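Writing functions
ss=function(n=100){z=2;for (i in 3:n)if(!any(i%%2:(i-1)==0))z=c(z,i);return(z)} # primes up to n (n >= 3)
fix(ss)
ss()
t1=Sys.time()
ss(10000)
Sys.time()-t1
system.time(ss(10000))
# a function need not call return; the value of the last expression is then returned.
# To return several values it is best to use a list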
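Returning several values through a list might look like the sketch below. The function name describe and its fields are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: summarise a numeric vector and return several values.
describe <- function(x) {
  # No explicit return(): the value of the last expression is returned.
  list(n = length(x), mean = mean(x), range = range(x))
}
res <- describe(c(2, 4, 9))
print(res$n)      # components are accessed by name
print(res$mean)
print(res$range)
```

Named components keep the call site readable, and new components can be added later without breaking existing code that uses the old names.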
About plotting
# several plots on one page:
par(mfrow=c(2,4)) # or par(mfcol=c(2,4))
layout(matrix(c(1,1,1,2,3,4,2,3,4),nr=3,byrow=T))
hist(rnorm(100),col="Red",10)
hist(rnorm(100),col="Blue",8)
hist(rnorm(100),col="Green")
hist(rnorm(100),col="Brown")
# par(mar = c(bottom, left, top, right)) sets the margins
# the default is c(5, 4, 4, 2) + 0.1 (in lines of text; use mai for inches)
spring= data.frame(compression=c(41,39,43,53,42,48,47,46),
distance=c(120,114,132,157,122,144,137,141))
attach(spring) # (Hooke's law: f=.5ks)
par(mfcol=c(2,2))
plot(distance ~ compression)
plot(distance ~ compression,type="l")
plot(compression, distance,type="o")
plot(compression, distance,type="b")
About plotting
par(mfrow=c(2,2)) # prepare a 2x2 layout of 4 plots
plot(compression, distance,main= "Hooke's Law") # title only
plot(compression, distance,main= "Hooke's Law", xlab= "x",ylab= "y") # title plus x and y labels
identify(compression,distance) # label points by clicking on them
plot(compression, distance,main="Hooke's Law") # title only
text(46,120, expression(f==frac(1,2)*k*s)) # write text at a given position
plot(compression, distance,main="Hooke's Law") # plot with title only
text(locator(2), c("I am here!","you are there!")) # write text at the two clicked positions
par(mfrow=c(1,1))
plot(1:10,sin(1:10),type="l",lty=2,col=4,main=paste(strwrap("The title is too long, and I hate to make it shorter, !@#$%^&*",width=50),collapse="\n"))
legend(1.2,1.0,"Just a sine",lty=2,col=4)
About plotting
library(MASS);data(Animals);attach(Animals)
par(mfrow=c(2,2))
plot(body, brain)
plot(sqrt(body), sqrt(brain))
plot((body)^0.1, (brain)^0.1)
plot(log(body),log(brain)) # or plot(brain~body,log="xy")
par(mfrow=c(1,1))
par(cex=0.7,mex=0.7) #character (cex) & margin (mex) expansion
plot(log(body),log(brain))
text(x=log(body), y=log(brain),labels=row.names(Animals),
adj=1.5)# adj=0 implies left adjusted text
plot(log(body),log(brain))
identify(log(body),log(brain),row.names(Animals))
About plotting (symbols, colours, sizes, shapes, etc.)
plot(1,1,xlim=c(1,7.5),ylim=c(0,5),type="n") # Do not plot points
points(1:7,rep(4.5,7),cex=seq(1,4,l=7),col=1:7, pch=0:6)
text(1:7,rep(3.5,7),labels=paste(0:6,letters[1:7]),cex=seq(1,4,l=7), col=1:7)
points(1:7,rep(2,7), pch=(0:6)+7) # Plot symbols 7 to 13
text((1:7)+0.25, rep(2,7), paste((0:6)+7)) # Label with symbol number
points(1:7,rep(1,7), pch=(0:6)+14) # Plot symbols 14 to 20
text((1:7)+0.25, rep(1,7), paste((0:6)+14)) # Labels with symbol number
# palettes
par(mfrow=c(2,4))
palette(); barplot(rnorm(15,10,3),col=1:15)
palette(rainbow(15));barplot(rnorm(15,10,3),col=1:15)
palette(heat.colors(15));barplot(rnorm(15,10,3),col=1:15)
palette(terrain.colors(15));barplot(rnorm(15,10,3),col=1:15)
palette(topo.colors(15));barplot(rnorm(15,10,3),col=1:15)
palette(cm.colors(15));barplot(rnorm(15,10,3),col=1:15)
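palette(gray(seq(0,.9,length=15)));barplot(rnorm(15,10,3),col=1:15) # gray() takes levels in [0,1]
palette(grey(seq(.9,0,length=15)));barplot(rnorm(15,10,3),col=1:15) # grey() is the same function; levels reversed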
palette("default")
par(mfrow=c(1,1))
About plotting
#matplot
sines=outer(1:20,1:4,function(x, y) sin(x/20*pi*y))
matplot(sines, pch = 1:4, type = "o", col = rainbow(ncol(sines)))
#legend
x <- seq(-pi, pi, len = 65)
plot(x, sin(x), type = "l", ylim = c(-1.2, 1.8), col = 3, lty = 2)
points(x, cos(x), pch = 3, col = 4)
lines(x, tan(x), type = "b", lty = 1, pch = 4, col = 6)
title("legend(..., lty = c(2, -1, 1), pch = c(-1,3,4), merge = TRUE)",
cex.main = 1.1)
legend(-1, 1.9, c("sin", "cos", "tan"), col = c(3,4,6), lty = c(2, -1, 1), pch
= c(-1, 3, 4), merge = TRUE, bg='gray90')
About plotting
#barplot and table
par(mfrow=c(2,2))
tN=table(Ni=rpois(100, lambda=5));tN
r=barplot(tN, col='gray')
lines(r, tN, type='h', col='red', lwd=2) #- type = "h" plotting *is* `bar'plot
barplot(tN, space = 1.5, axisnames=FALSE, sub = "barplot(..., space=1.5, axisnames = FALSE)")
# with space=1.5 there are wide gaps between the bars
barplot(tN, space = 0, axisnames=FALSE, sub = "barplot(..., space=0, axisnames = FALSE)")
pie(tN) # pie chart
par(mfrow=c(1,1))
# add a grid
plot (1:3)
grid(10, 5 , lwd = 2)
dev.set;dev.off;dev.list
About plotting (pairs / three-dimensional)
#pairs#data(iris)
pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species",
pch = 21, bg = c("red", "green3", "blue")[unclass(iris$Species)])
# iris is a 150x5 data set; this is a scatter plot matrix of the 4 numeric variables (the last variable, iris$Species, is categorical)
#stars#data(mtcars)
stars(mtcars[, 1:7], key.loc = c(14, 1.5), main = "Motor Trend Cars : full stars()",flip.labels=FALSE)
# mtcars is a 32x11 data set; only the first 7 numeric variables are plotted here
#persp
x <- seq(-10, 10, length= 30)
y <- x
f <- function(x,y) { r <- sqrt(x^2+y^2); 10 * sin(r)/r }
z <- outer(x, y, f)
z[is.na(z)] <- 1
persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue")
data(volcano)
par(mfrow=c(2,2))
z <- 2 * volcano# Exaggerate the relief
x <- 10 * (1:nrow(z)) #10 meter spacing(S to N)
y <- 10 * (1:ncol(z)) #10 meter spacing(E to W)
## Don't draw the grid lines : border = NA
#par(bg = "slategray")
persp(x, y, z, theta = 135, phi = 30, col = "green3", scale = FALSE,
ltheta = -120, shade = 0.75, border = NA, box = FALSE)
par(bg= "white")
About plotting (three-dimensional)
#contour
rx <- range(x <- 10*1:nrow(volcano))
ry <- range(y <- 10*1:ncol(volcano))
ry <- ry + c(-1,1) * (diff(rx) - diff(ry))/2
tcol <- terrain.colors(12)
opar <- par(pty = "s", bg = "lightcyan");par(opar)
plot(x = 0, y = 0,type = "n", xlim = rx, ylim = ry, xlab = "", ylab = "")
u <- par("usr")
rect(u[1], u[3], u[2], u[4], col = tcol[8], border = "red") # rect draws a rectangle
contour(x, y, volcano, col = tcol[2], lty = "solid", add = TRUE, vfont = c("sans serif", "plain"))
title("A Topographic Map of Maunga Whau", font = 4)
abline(h = 200*0:4, v = 200*0:4, col = "lightgray", lty = 2, lwd = 0.1);par(opar)
#image
x <- 10*(1:nrow(volcano))
y <- 10*(1:ncol(volcano))
image(x, y, volcano, col = terrain.colors(100), axes = FALSE)
contour(x, y, volcano, levels = seq(90, 200, by=5), add = TRUE, col = "peru")
axis(1, at = seq(100, 800, by = 100))
axis(2, at = seq(100, 600, by = 100))
box()
title(main = "Maunga Whau Volcano", font.main = 4)
par(mfrow=c(1,1))
Working with multiple windows
x11()
plot(1:10)
x11()
plot(rnorm(10))
dev.set(dev.prev())
abline(0,1)# through the 1:10 points
dev.set(dev.next())
abline(h=0, col="gray")# for the residual plot
dev.set(dev.prev())
dev.off(); dev.off()#- close the two X devices
#dev.list()
Miscellaneous plotting
# simulating Brownian motion
n=100;x=cumsum(rnorm(100));y=cumsum(rnorm(100));plot(x,y,type="l")
x=0;y=0;plot(100,ylim=c(-15,15),xlim=c(-15,15)) # slow motion
for(i in 1:200){x1=x+rnorm(1);y1=y+rnorm(1);
segments(x,y,x1,y1);x=x1;y=y1
Sys.sleep(.05)}
# symbol size proportional to the value of the response
x=1:10;y=runif(10)
symbols(x,y,circle=y/2,inches=F,bg=x)
# a Q-Q plot for every column of a data frame
table=data.frame(x1=rnorm(100),x2=rnorm(100,1,1))
par(ask=TRUE) # wait for confirmation before changing the page
results=apply(table,2,qqnorm)
par(ask=FALSE)
# add a small plot inside a plot
x=rnorm(100)
hist(x)
op=par(fig=c(.02,.5,.5,.98),new=TRUE)
boxplot(x)
# mathematical symbols
x=1:10;plot(x,type="n")
text(3,2,expression(paste("Temperature(",degree,"C) in 2003")))
text(4,4,expression(bar(x)==sum(frac(x[i],n),i==1,n)))
text(6,6,expression(hat(beta)==(X^t*X)^{-1}*X^t*y))
text(8,8,expression(z[i]==sqrt(x[i]^2+y[i]^2)))
Changing letter case
x=c("I","am","A","BIG", "Cat")
tolower(x)
toupper(x)
Lecture Notes on Statistical Models in R
# basics
x=rnorm(20,10)
t.test(x,m=9,alt="greater")
t.test(x[1:10],m=9,alt="greater")$p.value
t.test(x,con=.90)$conf
x=rnorm(30,10);y=rnorm(30,10.1)
t.test(x,y,alt="less")
library(TeachingDemos)
ci.examp()
run.ci.examp()
vis.boxcox()
vis.boxcoxu()
Regression
Correlation
# correlation
x=rnorm(20);y=rnorm(20);
cor(x,y)
cor(x,y,method="kendall");
cor(x,y,method="spearman")
cor.test(x,y);
cor.test(x,y,method="kendall");
cor.test(x,y,method="spearman")
cor.test(x,y,method="kendall")$p.value
# correlated?
x=rnorm(3);y=rnorm(3);cor(x,y);cor.test(x,y)$p.value
library(TeachingDemos)
put.points.demo()
Basic principles
# basic principles
set.seed(100)
x1=rnorm(100);x2=rnorm(100);eps=rnorm(100)
y=5+2*x1-3*x2+eps
a=lm(y~x1+x2)
(lm(y~0+x1+x2)) # no intercept: equivalent to (lm(y~-1+x1+x2))
summary(a);anova(a)
names(a)
shapiro.test(a$res)
qqnorm(a$res);qqline(a$res)
# the mathematics
x=cbind(1,x1,x2)
dim(x)
b=solve(t(x)%*%x)%*%t(x)%*%y
b
a$coe
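The agreement between the matrix formula and lm() can be checked numerically. The sketch below re-simulates the same kind of data so that it runs on its own; the names b_normal and b_qr are introduced here for illustration. It also shows a QR-based solution, which is numerically more stable than inverting X'X explicitly.

```r
set.seed(100)
x1 <- rnorm(100); x2 <- rnorm(100)
y <- 5 + 2 * x1 - 3 * x2 + rnorm(100)
X <- cbind(1, x1, x2)                             # design matrix with intercept
b_normal <- solve(crossprod(X), crossprod(X, y))  # solves (X'X) b = X'y
b_qr <- qr.coef(qr(X), y)                         # least squares via QR
fit <- lm(y ~ x1 + x2)
print(cbind(normal = drop(b_normal), qr = b_qr, lm = coef(fit)))
```

All three columns should agree up to rounding error.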
Example 1: cross.txt
[Figure: scatter plot of y against x]
w=read.table("cross.txt",header=T)
head(w)
plot(y~x,w);summary(w)
a=lm(y~x+z,w)
summary(a)
anova(a)
qqnorm(a$res);qqline(a$res)
shapiro.test(a$res)
a1=lm(y~x*z,w)
summary(a1);anova(a1)
qqnorm(a1$res);qqline(a1$res)
shapiro.test(a1$res)
anova(a,a1)
library(party) # a simpler way
wt=mob(y~x|z,data=w)
coef(wt);plot(wt)
plot(y~x,w);abline(coef(wt)[1,],col=2);abline(coef(wt)[2,],col=4)
Regression equation
Poison Experiment
The data give the survival times (in 10 hour units) in a 3 x 4
factorial experiment, the factors being (a) three poisons
and (b) four treatments. Each combination of the two
factors is used for four animals, the allocation to animals
being completely randomized.
Box, G. E. P., and Cox, D. R. (1964). An analysis of
transformations (with Discussion). J. R. Statist. Soc. B, 26,
211-252.
http://www.statsci.org/data/general/poison.html
Example 2: poison.txt: 3 poisons, 4 treatments, an animal experiment with 48 observations
[Figure: scatter plot matrix of Poison, Treatment and Time]
setwd("f:/2010stat")
w=read.table("poison.txt",head=T)
head(w);tail(w)
str(w);summary(w)
dim(w)
w$Poison=factor(w$Poison)
w$Treatment=factor(w$Treatment)
pairs(w)
# direct regression
a=lm(Time~Poison*Treatment,w)
anova(a)
a=lm(Time~.,w)
anova(a)
qqnorm(a$res);qqline(a$res)
shapiro.test(a$res)
# transformation
a=lm(1/Time~Poison+Treatment,w)
anova(a)
qqnorm(a$res);qqline(a$res)
shapiro.test(a$res)
summary(a)
Regression
Transform and fit the main effects
Interpretation of the results
Polynomial regression
# polynomial regression
y <- cars$dist;x <- cars$speed
o = order(x)
plot( y~x )
do.it <- function (model, col) {
r <- lm( model ); yp <- predict(r)
lines( yp[o] ~ x[o], col=col, lwd=3 )}
do.it(y~x, col="red")
do.it(y~x+I(x^2), col="blue")
do.it(y~-1+I(x^2), col="green")
legend(par("usr")[1], par("usr")[4],
c("affine function", "degree-2 polynomial", "degree 2 monomial"),
lwd=3,
col=c("red", "blue", "green"))
n <- 100
x <- runif(n,min=-4,max=4); x <- x + sign(x)*.2 # keep x away from 0
y <- 1/x + rnorm(n) # hyperbola
plot(y~x)
lm( 1/y ~ x )
n <- 100
x <- rlnorm(n)^3.14 # a log-normal random variable is one whose logarithm is normally distributed
y <- x^-.1 * rlnorm(n)
plot(y~x)
lm(log(y) ~ log(x))
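Orthogonal polynomials p, q
# about orthogonal polynomials
y <- cars$dist;x <- cars$speed
# non-orthogonal: terms are added one at a time (they affect each other, and significant coefficients become non-significant)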
summary( lm(y~x) )
summary( lm(y~x+I(x^2)) )
summary( lm(y~x+I(x^2)+I(x^3)) )
summary( lm(y~x+I(x^2)+I(x^3)+I(x^4)) )
summary( lm(y~x+I(x^2)+I(x^3)+I(x^4)+I(x^5)) )
# orthogonal: coefficients that were significant stay significant. poly: Compute Orthogonal Polynomials
summary( lm(y~poly(x,1)) )
summary( lm(y~poly(x,2)) )
summary( lm(y~poly(x,3)) )
summary( lm(y~poly(x,4)) )
summary( lm(y~poly(x,5)) )
# plot the p-values of the coefficients for orthogonal polynomials
n <- 5
p <- matrix( nrow=n, ncol=n+1 )
for (i in 1:n) {
p[i,1:(i+1)] <- summary(lm( y ~ poly(x,i) ))$coefficients[,4]
}
matplot(p, type='l', lty=1, lwd=3)
legend( par("usr")[1], par("usr")[4],
as.character(1:n),
lwd=3, lty=1, col=1:n
)
title(main="Evolution of the p-values (orthonormal polynomials)")
# plot the p-values of the coefficients for non-orthogonal polynomials
p <- matrix( nrow=n, ncol=n+1 )
p[1,1:2] <- summary(lm(y ~ x) )$coefficients[,4]
p[2,1:3] <- summary(lm(y ~ x+I(x^2)) )$coefficients[,4]
p[3,1:4] <- summary(lm(y ~ x+I(x^2)+I(x^3)) )$coefficients[,4]
p[4,1:5] <- summary(lm(y ~ x+I(x^2)+I(x^3)+I(x^4)) )$coefficients[,4]
p[5,1:6] <- summary(lm(y ~ x+I(x^2)+I(x^3)+I(x^4)+I(x^5)) )$coefficients[,4]
matplot(p, type='l', lty=1, lwd=3)
legend( par("usr")[1], par("usr")[4],
as.character(1:n),
lwd=3, lty=1, col=1:n
)
title(main="Evolution of the p-values (non orthonormal polynomials)")
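Why the orthogonal fits behave so differently can be seen directly from the design columns. The following sketch compares the basis produced by poly() with raw powers of x; the object names P, G_orth and C_raw are introduced here for illustration.

```r
x <- cars$speed
P <- poly(x, 3)                  # orthogonal polynomial basis of degree 3
G_orth <- crossprod(P)           # t(P) %*% P: numerically the identity matrix
R <- cbind(x, x^2, x^3)          # raw powers of x
C_raw <- cor(R)                  # pairwise correlations of the raw powers
print(round(G_orth, 10))
print(round(C_raw, 3))
```

The columns of poly() are mutually orthogonal and orthogonal to the constant, so adding a higher-degree term does not change the estimates of the lower-degree coefficients. The raw powers are strongly correlated, which is why their coefficients and p-values shift as terms are added.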
# example
data(beavers)
y <- beaver1$temp
x <- 1:length(y)
plot(y~x)
for (i in 1:10) {
r <- lm( y ~ poly(x,i) )
lines( predict(r), type="l", col=i )
}
summary(r)
[Figure: beaver1 temperature y against observation index x, with the fitted polynomials of degree 1 to 10]
Nonparametric regression
# nonparametric: splines
plot(quakes$long, quakes$lat)
lines( smooth.spline(quakes$long, quakes$lat), col='red', lwd=3)
library(Design) # rcs: Design Special Transformation Functions (Design has since been superseded by the rms package, which also provides rcs)
# 4-knot spline
r3 <- lm( quakes$lat ~ rcs(quakes$long,4) )
plot( quakes$lat ~ quakes$long )
o <- order(quakes$long)
lines( quakes$long[o], predict(r3)[o], col='red', lwd=3 )
r <- lm( quakes$lat ~ rcs(quakes$long,10) )
lines( quakes$long[o], predict(r)[o], col='blue', lwd=6, lty=3 )
title(main="Regression with rcs")
legend( par("usr")[1], par("usr")[3], yjust=0,
c("4 knots", "10 knots"),
lwd=c(3,3), lty=c(1,3), col=c("red", "blue") )
# more splines
library(splines)
data(quakes)
x <- quakes[,2]
y <- quakes[,1]
o <- order(x)
x <- x[o]
y <- y[o]
r1 <- lm( y ~ bs(x,df=10) )
r2 <- lm( y ~ ns(x,df=6) )
plot(y~x)
lines(predict(r1)~x, col='red', lwd=3)
lines(predict(r2)~x, col='green', lwd=3)
# kernel smoothing
plot(cars$speed, cars$dist)
lines(ksmooth(cars$speed, cars$dist, "normal", bandwidth=2), col='red')
lines(ksmooth(cars$speed, cars$dist, "normal", bandwidth=5), col='green')
lines(ksmooth(cars$speed, cars$dist, "normal", bandwidth=10), col='blue')
# weighted local least squares: loess
# various kernel functions
curve(dnorm(x), xlim=c(-3,3), ylim=c(0,1.1))
x <- seq(-3,3,length=200)
D.Epanechnikov <- function (t) {
ifelse(abs(t)<1, 3/4*(1-t^2), 0)
}
lines(D.Epanechnikov(x) ~ x, col='red')
D.tricube <- function (t) { # tricube kernel, unnormalised (the density has the factor 70/81)
ifelse(abs(t)<1, (1-abs(t)^3)^3, 0)
}
lines(D.tricube(x) ~ x, col='blue')
legend( par("usr")[1], par("usr")[4], yjust=1,
c("Gaussian kernel", "Epanechnikov kernel", "tricube kernel"),
lwd=1, lty=1,
col=c(par('fg'),'red', 'blue'))
title(main="Different kernels")
# local polynomial regression
library(KernSmooth)
data(quakes)
x <- quakes$long;y <- quakes$lat
plot(y~x)
bw <- dpill(x,y) # .2
lines( locpoly(x,y,degree=0, bandwidth=bw), col='red' )
lines( locpoly(x,y,degree=1, bandwidth=bw), col='green' )
lines( locpoly(x,y,degree=2, bandwidth=bw), col='blue' )
legend( par("usr")[1], par("usr")[3], yjust=0,
c("degree = 0", "degree = 1", "degree = 2"),
lwd=1, lty=1,
col=c('red', 'green', 'blue'))
title(main="Local Polynomial Regression")
# larger bandwidth
plot(y~x);bw <- .5
lines( locpoly(x,y,degree=0, bandwidth=bw), col='red' )
lines( locpoly(x,y,degree=1, bandwidth=bw), col='green' )
lines( locpoly(x,y,degree=2, bandwidth=bw), col='blue' )
legend( par("usr")[1], par("usr")[3], yjust=0,
c("degree = 0", "degree = 1", "degree = 2"),
lwd=1, lty=1,
col=c('red', 'green', 'blue'))
title(main="Local Polynomial Regression (wider window)")
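The weighted local least squares method mentioned above is available in base R as loess(). A minimal sketch on the cars data follows; span = 0.75 and degree = 2 are simply the function's defaults written out, not tuned values, and the names fit, grid and pred are illustrative.

```r
# Local quadratic fits, weighted by a tricube kernel over the nearest 75% of points
fit <- loess(dist ~ speed, data = cars, span = 0.75, degree = 2)
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 100))
pred <- predict(fit, newdata = grid)   # fitted curve on a regular grid
plot(dist ~ speed, data = cars)
lines(grid$speed, pred, col = "red", lwd = 2)
```

A smaller span follows the data more closely, much like a smaller bandwidth in locpoly().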
Nonlinear regression
# nonlinear regression
library(nls2) # not required for this example: nls() is in the stats package
f <- function (x,p) {
u <- p[1]
v <- p[2]
u/(u-v) * (exp(-v*x) - exp(-u*x))
}
n <- 100
x <- runif(n,0,2)
y <- f(x, c(3.14,2.71)) + .1*rnorm(n)
r <- nls( y ~ f(x,c(a,b)), start=c(a=3, b=2.5) )
plot(y~x)
xx <- seq(0,2,length=200)
lines(xx, f(xx,r$m$getAllPars()), col='red', lwd=3)
lines(xx, f(xx,c(3.14,2.71)), lty=2)
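Quantile regression
Model: $y = x^T\beta + \varepsilon$
Loss function: $\rho(u)$
Find the parameter $\beta$ (possibly a vector) that, for the linear regression model, minimises the total loss $\sum_i \rho(y_i - x_i^T\beta)$
In least squares regression: $\rho(u) = u^2$
In $\tau$-quantile regression: $\rho_\tau(u) = u\,(\tau - I(u<0))$
$\tau = 0.5$: least absolute deviations regression
Definition of the $\tau$-quantile: the value $q$ that minimises $E\,\rho_\tau(Y - q)$
Shape of the loss function $\rho_\tau(u)$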
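The link between the quantile loss and ordinary quantiles can be checked numerically. The sketch below assumes the usual check loss u(tau - I(u<0)), which is not spelled out in the slides, and uses simulated data; the names rho, obj and q_hat are illustrative.

```r
# Check (pinball) loss of tau-quantile regression
rho <- function(u, tau) u * (tau - (u < 0))
set.seed(1)
y <- rexp(1001)                      # simulated, skewed sample
tau <- 0.25
obj <- function(q) sum(rho(y - q, tau))   # total loss for a constant fit q
q_hat <- optimize(obj, range(y))$minimum  # numerical minimiser
print(c(minimiser = q_hat, sample_quantile = unname(quantile(y, tau))))
```

The constant that minimises the summed loss is the sample tau-quantile. Replacing the constant by a linear predictor gives quantile regression as fitted by rq().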






















# quantile regression
library(quantreg)
data(engel);head(engel);plot(engel)
plot(engel, log = "xy",main = "'engel' data (log - log scale)")
plot(log10(foodexp) ~ log10(income), data = engel,main = "'engel' data (log10 - transformed)")
taus <- c(.15, .25, .50, .75, .95, .99)
rqs <- as.list(taus)
for(i in seq(along = taus)) {
rqs[[i]] <- rq(log10(foodexp) ~ log10(income), tau = taus[i], data = engel)
lines(log10(engel$income), fitted(rqs[[i]]), col = i+1)}
legend("bottomright", paste("tau = ", taus), inset = .04,
col = 2:(length(taus)+1), lty=1)
abline(lm(log10(foodexp)~log10(income),engel),lwd=5) # least squares fit: thick black line
plot(summary(rq(log10(foodexp)~log10(income),tau = 1:49/50,data=engel))) # plot the coefficients
# untransformed data
plot(foodexp~income, data = engel, main = "'engel' data")
for(i in seq(along = taus)) {
rqs[[i]] <- rq(foodexp ~ income, tau = taus[i], data = engel)
lines(engel$income, fitted(rqs[[i]]), col = i+1)}
legend("bottomright", paste("tau = ", taus), inset = .04, col = 2:(length(taus)+1), lty=1)
abline(lm(foodexp~income,engel),lwd=5) # least squares fit: thick black line
plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel))) # plot the coefficients
N <- 2000
x <- runif(N)
y <- rnorm(N)
y <- -1 + 2 * x + ifelse(y>0, y+5*x^2, y-x^2)
plot(x,y)
abline(lm(y~x), col="red")
library(quantreg)
plot(y~x)
for (a in seq(.1,.9,by=.1)) {
abline(rq(y~x, tau=a), col="blue", lwd=3)
}
# locally polynomial quantile regression
plot(y~x)
for (a in seq(.1,.9,by=.1)) {
r <- lprq(x,y,
h=bw.nrd0(x), # See ?density
tau=a)
lines(r$xx, r$fv, col="blue", lwd=3)
}
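Logistic & Probit regression
Generalized linear models (GLM)
In $g(\mu) = x^T\beta$, $g$ is called the link function
Log-likelihood function of the GLM
Score function
Logistic regression / Probit regression
Example: ModeChoice
In fact, mode takes only two values, 0 and 1; the other variables are numeric
Consider two logistic models
Model b
Model c
ANOVA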
library(Ecdat);data(ModeChoice) # binary response
w=ModeChoice
# two logistic models
b=glm(factor(mode)~ttme+invt+gc,data=w,family="binomial")
c=glm(factor(mode)~ttme+invc*invt+gc,data=w,family="binomial")
anova(c,test="Chi");summary(c);anova(b,c,test="Chi")
Fit of model b
Fit of model c
Consider two probit models
Model bb
Model cb
ANOVA
# two probit models
bb=glm(factor(mode)~ttme+invt+gc,data=w,family=binomial(link=probit))
cb=glm(factor(mode)~ttme+invc*invt+gc,data=w,family=binomial(link=probit))
anova(cb,test="Chi");summary(cb);anova(bb,cb,test="Chi")
anova(b,bb,test="Chi");anova(c,cb,test="Chi")
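How the logit and probit links compare can be explored on simulated data, which keeps the sketch independent of the Ecdat package. Everything here (sample size, true coefficients, object names) is an illustrative assumption. Since a logit and a probit model with the same predictors are not nested, AIC is shown as one possible way to compare them.

```r
set.seed(2)
x <- rnorm(500)
p <- plogis(-0.5 + 1.2 * x)      # inverse logit link gives the success probability
yb <- rbinom(500, 1, p)          # simulated binary response
fit_logit  <- glm(yb ~ x, family = binomial(link = "logit"))
fit_probit <- glm(yb ~ x, family = binomial(link = "probit"))
# Fitted probabilities are close; coefficients differ mainly by a scale factor
print(rbind(logit = coef(fit_logit), probit = coef(fit_probit)))
print(max(abs(fitted(fit_logit) - fitted(fit_probit))))
print(AIC(fit_logit, fit_probit))
```

The logit slope is typically larger than the probit slope because the standard logistic distribution has a larger variance than the standard normal.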
Fit of model b
Dispersion
library(dglm)
library(statmod)
clotting <- data.frame(
u = c(5,10,15,20,30,40,60,80,100),
lot1 = c(118,58,42,35,27,25,21,19,18),
lot2 = c(69,35,26,21,18,16,13,12,12))
a1=glm(lot1 ~ log(u), data=clotting, family=Gamma)
summary(a1)
# The same example as in glm: the dispersion is modelled as constant
# However, dglm used ml not reml, so results slightly different:
out <- dglm(lot1 ~ log(u), ~1, data=clotting, family=Gamma)
summary(out)
# Try a double glm
out2 <- dglm(lot1 ~ log(u), ~u, data=clotting, family=Gamma)
summary(out2)
anova(out2)
# Summarize the mean model as for a glm
summary.glm(out2)
# Summarize the dispersion model as for a glm
summary(out2$dispersion.fit)
# Examine goodness of fit of dispersion model by plotting residuals
plot(fitted(out2$dispersion.fit),residuals(out2$dispersion.fit))
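Poisson log-linear model: dispersion
Offset
Consider $n$ independent responses $y_i$ with mean $\theta_i = \lambda_i n_i$, where $\log n_i$ enters the linear predictor as an offset.
The Poisson distribution has $E(y_i) = \mathrm{Var}(y_i) = \theta_i$,
but it may happen that the actual variance exceeds the nominal variance under the assumed probability model. Suppose now that $\theta_i = \lambda_i n_i$ is a random variable with mean $\mu_i$ and variance $\phi\mu_i^2$ (for example, gamma distributed).
Thus, it can be shown that $E(y_i) = \mu_i$ and $\mathrm{Var}(y_i) = \mu_i(1 + \phi\mu_i)$.
Hence, for $\phi > 0$ we have overdispersion. It is interesting to note that the same mean and
variance arise also if we assume a negative binomial distribution for the response variable.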
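Overdispersion from a random Poisson mean can be illustrated by simulation. The sketch assumes the mean is gamma distributed with expectation mu and variance phi * mu^2, which is one common choice and an assumption here; the parameter values are arbitrary.

```r
set.seed(3)
n <- 100000; mu <- 5; phi <- 0.5
# Gamma mean: shape 1/phi and scale phi*mu give E = mu and Var = phi * mu^2
theta <- rgamma(n, shape = 1 / phi, scale = phi * mu)
y <- rpois(n, theta)             # Poisson counts with a random mean
print(c(mean = mean(y), var = var(y), theory = mu * (1 + phi * mu)))
```

The sample mean stays near mu while the sample variance is close to mu * (1 + phi * mu), well above the Poisson value mu.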
Poisson log-linear model: dispersion
library(dispmod)
data(salmonellaTA98)
attach(salmonellaTA98)
log.x10 <- log(x+10)
mod <- glm(y ~ log.x10 + x, family=poisson(log))
summary(mod)
mod.disp <- glm.poisson.disp(mod)
summary(mod.disp)
mod.disp$dispersion
# compute predictions on a grid of x-values...
x0 <- seq(min(x), max(x), length=50)
eta0 <- predict(mod, newdata=data.frame(log.x10=log(x0+10), x=x0), se=TRUE)
eta0.disp <- predict(mod.disp, newdata=data.frame(log.x10=log(x0+10), x=x0), se=TRUE)
# ... and plot the mean functions with variability bands
plot(x, y)
lines(x0, exp(eta0$fit))
lines(x0, exp(eta0$fit+2*eta0$se), lty=2)
lines(x0, exp(eta0$fit-2*eta0$se), lty=2)
lines(x0, exp(eta0.disp$fit), col=2)
lines(x0, exp(eta0.disp$fit+2*eta0.disp$se), lty=2, col=2)
lines(x0, exp(eta0.disp$fit-2*eta0.disp$se), lty=2, col=2)
Poisson log-linear model: dispersion
##-- Holford's data
data(holford)
attach(holford)
mod <- glm(incid ~ offset(log(pop)) + Age + Cohort, family=poisson(log))
summary(mod)
mod.disp <- glm.poisson.disp(mod)
summary(mod.disp)
mod.disp$dispersion
# another approach (using Tweedie distributions; see the literature); tweedie() is in the statmod package
tt= glm(incid ~ offset(log(pop)) + Age + Cohort,
family=tweedie(var.power=4,link.power=0))
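Ridge regression
$$\hat\beta^{ridge} = \arg\min_{\beta} \Big\{ \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \Big\}$$
$$\hat\beta^{ridge} = \arg\min_{\beta} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} \beta_j^2 \le s$$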
library(perturb);data(consumption)
A data frame with 28 observations on the following 5 variables.
• year: 1947 to 1974
• c: total consumption, 1958 dollars
• r: the interest rate (Moody's Aaa)
• dpi: disposable income, 1958 dollars
• d_dpi: annual change in disposable income
head(consumption)
library(MASS)
ct1<-c(NA,consumption$c[-nrow(consumption)]) # consumption lagged by one year
a<-lm.ridge(c~ct1+dpi+r+d_dpi, data=consumption, lambda=seq(0, 0.1,length=100), model =TRUE)
names(a) # "coef" "scales" "Inter" "lambda" "ym" "xm" "GCV" "kHKB" "kLW"
a$lambda[which.min(a$GCV)] # the lambda with the smallest GCV (lambdaGCV = 0.014)
a$coef[,which.min(a$GCV)] # the coefficients at the smallest GCV
par(mfrow=c(1,2))
plot(a) # plot the coefficient paths and mark the GCV-minimising lambda (about 0.01) with a red line
abline(v=a$lambda[which.min(a$GCV)],col="red")
plot(a$lambda,a$GCV,type="l") # GCV as a function of lambda
abline(v=a$lambda[which.min(a$GCV)],col="green")
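The penalised criterion has a closed-form solution, which can be sketched in a few lines of base R. The data are simulated and the helper ridge() is hypothetical. Note that lm.ridge() applies its own scaling of the predictors, so its numbers need not match this sketch exactly.

```r
set.seed(4)
n <- 50
x1 <- rnorm(n); x2 <- x1 + rnorm(n, sd = 0.05)   # nearly collinear predictors
y <- 1 + x1 + x2 + rnorm(n)
X <- scale(cbind(x1, x2))        # centred and scaled predictors
yc <- y - mean(y)                # centring the response removes the intercept
# Ridge solution: (X'X + lambda I)^{-1} X'y
ridge <- function(lambda) {
  drop(solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, yc)))
}
b0 <- ridge(0)      # lambda = 0 reproduces ordinary least squares
b1 <- ridge(1)
b10 <- ridge(10)
print(rbind(b0, b1, b10))
```

As lambda grows the coefficient vector shrinks toward zero, and the two collinear coefficients move toward each other.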
[Figure: ridge coefficient paths t(x$coef) against x$lambda, and a$GCV against a$lambda, for lambda from 0 to 0.10]
For orthogonal (independent) data
[Figure: the same two panels, t(x$coef) against x$lambda and a$GCV against a$lambda, for lambda from 0 to 100]
# another example
longley # not the same as the S-PLUS dataset
names(longley)[1] <- "y"
a0=lm.ridge(y ~ ., longley)#lambda = 0
plot(lm.ridge(y ~ ., longley,lambda = seq(0,0.1,0.001)))
select(lm.ridge(y ~ ., longley,lambda = seq(0,0.1,0.0001)))
a1=lm.ridge(y ~ ., longley,lambda=0.0057)
a1$coe
Partial least squares regression
PLSR (Partial Least Squares and Principal Component Regression)
oliveoil {pls}
Sensory and physico-chemical data of olive oils
Description
A data set with scores on 6 attributes from a sensory panel and measurements of 5 physicochemical quality parameters on 16 olive oil samples. The first five oils are Greek, the next five
are Italian and the last six are Spanish.
data(oliveoil)
Format: A data frame with 16 observations on the following 2 variables.
Sensory: a matrix with 6 columns. Scores for attributes ‘yellow’, ‘green’, ‘brown’, ‘glossy’,
‘transp’, and ‘syrup’.
Chemical: a matrix with 5 columns. Measurements of acidity, peroxide, K232, K270, and DK.
Source
Massart, D. L., Vandeginste, B. G. M., Buydens, L. M. C., de Jong, S., Lewi, P. J., Smeyers-Verbeke, J. (1998) Handbook of Chemometrics and Qualimetrics: Part B. Elsevier. Tables 35.1 and 35.4.
[Package pls version 2.1-0 Index]
sensory ~ chemical
# partial least squares regression (principal component regression first)
library(pls);
data(oliveoil);head(oliveoil);dim(oliveoil)
oliveoil$sensory # a 16x6 matrix
oliveoil$chemical # a 16x5 matrix
#PCR
sens.pcr <- pcr(sensory ~ chemical, ncomp = 4, scale = TRUE, data = oliveoil)
summary(sens.pcr);names(sens.pcr)
[1] "coefficients" "scores" "loadings" "Yloadings" "projection" "Xmeans"
[7] "Ymeans" "fitted.values" "residuals" "Xvar" "Xtotvar" "ncomp"
[13] "method" "scale" "call" "terms" "model"
sens.pcr$loadings
sens.pcr$coefficients
sens.pcr$scores
sens.pcr$Yloadings
sens.pcr$projection
sens.pcr$residuals
#PLSR
sens.pls <- plsr(sensory ~ chemical, ncomp = 4, scale = TRUE, data = oliveoil)
summary(sens.pls);names(sens.pls)
[1] "coefficients" "scores" "loadings" "loading.weights" "Yscores"
[6] "Yloadings" "projection" "Xmeans" "Ymeans" "fitted.values"
[11] "residuals" "Xvar" "Xtotvar" "ncomp" "method"
[16] "scale" "call" "terms" "model"
sens.pls$loadings
sens.pls$coef
sens.pls$scores
sens.pls$loading.weights
sens.pls$Yscores
sens.pls$Yloadings
sens.pls$Xvar
sens.pls$Xtotvar
library(pls);data(yarn)
A training set consisting of 21 NIR spectra of PET yarns, measured at 268 wavelengths, and 21 corresponding densities. (Erik Swierenga).
names(yarn) # [1] "NIR" "density" "train"
dim(yarn$NIR) # 28 268, the predictors
yarn$density # the response
summary(yarn$train)
yarn$train
Mode FALSE TRUE NA's
logical 7 21 0
yarn.pls <- plsr(density ~ NIR, ncomp = 4, scale = TRUE, data = yarn)
summary(yarn.pls);names(yarn.pls)
[1] "coefficients" "scores" "loadings" "loading.weights" "Yscores"
[6] "Yloadings" "projection" "Xmeans" "Ymeans" "fitted.values"
[11] "residuals" "Xvar" "Xtotvar" "ncomp" "method"
[16] "scale" "call" "terms" "model"
yarn.pls$loadings
yarn.pls$coef
yarn.pls$scores
yarn.pls$loading.weights
yarn.pls$Yscores
yarn.pls$Yloadings
yarn.pls$Xvar
yarn.pls$Xtotvar
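What principal component regression does internally can be sketched with base R only: compute principal components of the standardised predictors, regress the response on the first few scores, and map the coefficients back. The data are simulated and k = 3 is an arbitrary choice; this is not the implementation used by pcr() in the pls package.

```r
set.seed(5)
n <- 40; p <- 10
Z <- matrix(rnorm(n * p), n, p)
X <- Z %*% matrix(rnorm(p * p), p, p)     # correlated predictors (simulated)
y <- X[, 1] - X[, 2] + rnorm(n)
pc <- prcomp(X, center = TRUE, scale. = TRUE)
k <- 3                                     # number of components kept (arbitrary)
scores <- pc$x[, 1:k]                      # component scores
fit <- lm(y ~ scores)                      # regression on the scores
# Map the component coefficients back to the standardised predictors
beta_std <- pc$rotation[, 1:k] %*% coef(fit)[-1]
print(summary(fit)$r.squared)
print(drop(beta_std))
```

In practice the number of components would be chosen by cross-validation rather than fixed in advance.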