Bootstrap Support for Mary-Lee's Clusters

efg, 5 April 2006
1 Step-by-Step Analysis Procedure
1.1 Starting Point
1.2 R Script: SetupConsenseData
1.3 PHYLIP neighbor
1.4 PHYLIP consense
2 Consense Output – Baseline Case
3 Unrooted Trees in TreeView
3.1 Baseline
3.2 Delete-1 Jackknife
3.3 Delete-2 Jackknife
3.4 Delete-4 Bootstrap
3.5 Delete-6 Bootstrap
4 Appendix A. SetupConsenseData.R
U:\efg\Research\Olivier\Mary-Lee\17PointSeries\NeighborConsense.doc 9 October 2006
1 Step-by-Step Analysis Procedure
The following analysis uses two PHYLIP (PHYLogeny Inference Package) programs, "neighbor" and
"consense", from http://evolution.genetics.washington.edu/phylip/general.html, which are installed
in the Stowers Linux environment. The goal is to measure how much bootstrap support there is for the
clusters in Mary-Lee's dataset of 90 selected genes with 17-point time series.
1.1 Starting Point
In directory: U:\efg\Research\Olivier\Mary-Lee\17PointSeries, the Book2.xls Excel file contains the
90 selected genes and their 17-point time series.
1.2 R Script: SetupConsenseData
The script SetupConsenseData (see Appendix A) was used to create several datasets with multiple
distance matrices for use with the neighbor PHYLIP program. Correlation matrices were computed
after deleting "d" points from the time series. Each correlation matrix was converted to a distance
matrix with this formula:
Distance <- (1 - CorrelationMatrix)/2
See Efron and Tibshirani, p. 149, for information about "delete-d jackknife."
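As a small illustration of this conversion, the sketch below applies the formula to a made-up matrix of 3 genes and 5 time points. The gene names and values are hypothetical and are not taken from Book2.xls. Perfectly correlated profiles end up at distance 0 and perfectly anti-correlated profiles at distance 1.

```r
# Toy expression matrix: rows are genes, columns are time points.
# Names and values are invented for illustration only.
toy <- rbind(geneA = c(1, 2, 3, 4, 5),
             geneB = c(2, 4, 6, 8, 10),   # perfectly correlated with geneA
             geneC = c(5, 4, 3, 2, 1))    # perfectly anti-correlated with geneA

CorrelationMatrix <- cor(t(toy))          # gene-by-gene Pearson correlation
Distance <- (1 - CorrelationMatrix)/2     # maps r in [-1, 1] to d in [0, 1]

print(round(Distance, 6))
```

Because the mapping is linear in the correlation, strongly anti-correlated genes are treated as the most distant pairs rather than as similar ones.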
Run this R script under Windows:
File | Change Dir … | U:\efg\Research\Olivier\Mary-Lee\17PointSeries
source("SetupConsenseData.R")
[NOTE: This script is extremely slow when run directly from the "U" drive. Perhaps a 100X speed
improvement can be seen by copying and running this script from the C:\ drive. Unfortunately, this
script does not work from Linux because of the use of the RODBC package to read the unmodified
Excel file.]
In R, issue the following commands:
• CreateBaseline()
• CreateDelete1Jackknife()
• CreateDelete2Jackknife()
• CreateDeleteNBootstrap(1000, 4)
• CreateDeleteNBootstrap(1000, 6)
• CreateDeleteNBootstrap(1000, 8)
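The "Delete1" procedure creates the 17 "delete 1" jackknife samples. Likewise, the "Delete2"
procedure creates the 17*16=272 "delete 2" jackknife samples; it loops over ordered pairs of deleted
points, so each unordered pair appears twice. For larger deletions, 1000 bootstrap samples (random
choices of the d deleted points) were used to approximate the exhaustive delete-d jackknife.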
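The sample counts above can be checked with a few lines of arithmetic. This is only a sketch of the counting, with variable names chosen for illustration; it also shows how quickly the number of exhaustive delete-d subsets grows, which is one plausible reason for drawing 1000 random subsets instead.

```r
n_points <- 17

# Delete-1: one sample per deleted time point.
n_delete1 <- n_points                               # 17

# Delete-2 as generated by the script: every ordered pair (i, j) with i != j,
# so each unordered pair of deleted points is written twice.
n_delete2_ordered   <- n_points * (n_points - 1)    # 272
n_delete2_unordered <- choose(n_points, 2)          # 136

# Number of distinct subsets for the larger deletions.
n_exhaustive <- choose(n_points, c(4, 6, 8))
names(n_exhaustive) <- c("delete4", "delete6", "delete8")
print(n_exhaustive)
```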
The functions above created the following files: infile-baseline, infile-1, infile-2, infile-4, infile-6,
infile-8. These files were moved to corresponding directories, Baseline, Delete1-17, Delete2-272,
Delete4-1000-Boot, Delete6-1000-Boot, and Delete8-1000-Boot, and renamed "infile". This
kept the various cases separate, so no additional renaming was needed to keep all the results from
neighbor and consense.
Note: Each matrix in these files is a "distance" matrix, so its main diagonal should be all 0s.
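A quick sanity check along these lines can catch a correlation matrix written by mistake. The sketch below is an assumption about what such a check might look like; the function name IsDistanceMatrix and the tolerance are illustrative and do not come from the original script.

```r
# Returns TRUE when m looks like a valid distance matrix for this workflow:
# square, symmetric, zero main diagonal, and no negative entries.
# The function name and tolerance are illustrative choices.
IsDistanceMatrix <- function(m, tol = 1e-8)
{
  is.matrix(m) &&
    nrow(m) == ncol(m) &&
    all(abs(diag(m)) < tol) &&
    all(abs(m - t(m)) < tol) &&
    all(m > -tol)
}

good <- matrix(c(0.0, 0.2, 0.7,
                 0.2, 0.0, 0.5,
                 0.7, 0.5, 0.0), nrow = 3, byrow = TRUE)
bad <- good
bad[2, 2] <- 1   # a 1 on the diagonal suggests a correlation matrix

print(IsDistanceMatrix(good))
print(IsDistanceMatrix(bad))
```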
1.3 PHYLIP neighbor
Run the PHYLIP program, neighbor, under Linux after changing to each of the directories created
above (Baseline, Delete1-17, Delete2-272, Delete4-1000-Boot, Delete6-1000-Boot, and
Delete8-1000-Boot). This PHYLIP program reads the distance matrices and creates a file with the
corresponding unrooted trees, which will be processed by the consense PHYLIP program.
For the baseline case, accept the neighbor defaults. For the other cases, select the "M" option
("Analyze multiple data sets"), and specify the following values:
Directory            # Data sets   Random Number Seed
Delete1-17                    17                   19
Delete2-272                  272                   29
Delete4-1000-Boot           1000                   71
Delete6-1000-Boot           1000                  255
Delete8-1000-Boot           1000                19937
neighbor
Neighbor-Joining/UPGMA method version 3.6a3
Settings for this run:
N       Neighbor-joining or UPGMA tree?  Neighbor-joining
O                        Outgroup root?  No, use as outgroup species  1
L         Lower-triangular data matrix?  No
R         Upper-triangular data matrix?  No
S                        Subreplicates?  No
J     Randomize input order of species?  No. Use input order
M           Analyze multiple data sets?  No
0   Terminal type (IBM PC, ANSI, none)?  (none)
1    Print out the data at start of run  No
2  Print indications of progress of run  Yes
3                        Print out tree  Yes
4       Write out trees onto tree file?  Yes
Y to accept these or type the letter for one to change
M
How many data sets?
17
Random number seed (must be odd)?
19
Y to accept these or type the letter for one to change
Y
. . .
Cycle   2: node 1 (   0.00636) joins node 71 (   0.01994)
Cycle   1: node 1 (   0.00579) joins node 60 (   0.04484)
last cycle:
node 1 (   0.00135) joins node 8 (   0.01066) joins node 15 (   0.01601)
Output written on file "outfile"
Tree written on file "outtree"
Done.
Rename outtree to be intree: mv outtree intree
The intree files for the cases with 1000 distance matrices are fairly large: 31,028 lines.
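The menu answers for the multiple-data-set runs follow the same pattern in every directory: "M", the number of data sets, the seed, then "Y". The sketch below only assembles those answers from the table in this section. NeighborAnswers is a hypothetical helper, not part of PHYLIP, and whether the answers can be supplied to neighbor non-interactively (for example on standard input) is an assumption that depends on the installation.

```r
# Build the keystrokes for one multiple-data-set run of neighbor:
# "M", the number of data sets, the random number seed, then "Y" to accept.
# NeighborAnswers is a hypothetical helper, not part of PHYLIP.
NeighborAnswers <- function(DataSets, Seed)
{
  if (Seed %% 2 == 0) stop("Random number seed must be odd")
  paste(c("M", DataSets, Seed, "Y"), collapse = "\n")
}

# Values copied from the table of directories, data sets, and seeds.
runs <- data.frame(
  Directory = c("Delete1-17", "Delete2-272", "Delete4-1000-Boot",
                "Delete6-1000-Boot", "Delete8-1000-Boot"),
  DataSets  = c(17, 272, 1000, 1000, 1000),
  Seed      = c(19, 29, 71, 255, 19937),
  stringsAsFactors = FALSE)

for (k in seq_len(nrow(runs)))
{
  answers <- NeighborAnswers(runs$DataSets[k], runs$Seed[k])
  cat(runs$Directory[k], ":", gsub("\n", " ", answers), "\n")
}
```

Keeping the seeds in one table makes it easier to rerun a case with exactly the same settings.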
1.4 PHYLIP consense
Run the Linux program consense. This PHYLIP program reads the trees created by the "neighbor"
PHYLIP program and computes a consensus tree by the majority-rule consensus tree method.
[Select "R" to replace files unless old output files are renamed or deleted.]
consense
Consensus tree program, version 3.6a3
Settings for this run:
C         Consensus type (MRe, strict, MR, Ml):  Majority rule (extended)
O                                Outgroup root:  No, use as outgroup species  1
R                Trees to be treated as Rooted:  No
T           Terminal type (IBM PC, ANSI, none):  (none)
1                Print out the sets of species:  Yes
2         Print indications of progress of run:  Yes
3                               Print out tree:  Yes
4               Write out trees onto tree file:  Yes
Are these settings correct? (type Y or the letter for one to change)
Y
Consensus tree written to file "outtree"
Output written to file "outfile"
Done.
The consensus tree is at the bottom of outfile.
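To make the idea of support under majority rule concrete, the toy sketch below counts how often a cluster of genes appears across a set of trees. Each tree is reduced to a list of gene sets, which is a simplification invented for this example; the groupings are made up and ClusterSupport is a hypothetical helper, not something consense provides. It shows plain majority rule only; the extended variant selected in the menu can also add less frequent groups when they are compatible with the rest of the tree.

```r
# Each toy "tree" is represented only by the clusters (sets of genes) it contains.
# This representation and the gene groupings are invented for illustration.
trees <- list(
  list(c("Axin2", "Dkk1"), c("Hes1", "Hes5"), c("Axin2", "Dkk1", "Myc")),
  list(c("Dkk1", "Axin2"), c("Hes1", "Hes5")),
  list(c("Axin2", "Myc"),  c("Hes5", "Hes1")),
  list(c("Axin2", "Dkk1"), c("Hes1", "Egr1")))

# Fraction of trees that contain the given cluster, ignoring gene order.
ClusterSupport <- function(trees, cluster)
{
  key <- paste(sort(cluster), collapse = "|")
  found <- sapply(trees, function(tree)
    key %in% sapply(tree, function(cl) paste(sort(cl), collapse = "|")))
  mean(found)
}

wnt   <- ClusterSupport(trees, c("Axin2", "Dkk1"))   # in 3 of 4 trees
notch <- ClusterSupport(trees, c("Hes1", "Hes5"))    # in 3 of 4 trees
myc   <- ClusterSupport(trees, c("Axin2", "Myc"))    # in 1 of 4 trees

# Majority rule keeps clusters found in more than half of the trees.
print(c(wnt = wnt, notch = notch, myc = myc) > 0.5)
```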
2 Consense Output – Baseline Case
Created by processing the original distance matrix through the neighbor and consense PHYLIP programs.
Wnt Cluster (Axin, Dkk1, …)
Notch Cluster (Hes1, Hes5, …)
NOTE: A better way to look at these unrooted trees is with the TreeView program, shown in the next section.
Extended majority rule consensus tree
CONSENSUS TREE:
the numbers on the branches indicate the number
of times the partition of the species into the two sets
which are separated by that branch occurred
among the trees, out of 1.00 trees
(trees had fractional weights)
[ASCII tree diagram of the 90 genes; every labeled internal branch shows 1.0. Leaves, top to bottom:]
Mxra8, Zfp191, Bcl9l, Dnpep, C80012, Kctd11, Ugp2, Rnf103, Dsp-B, Dsp-A,
Arfl4, Nol5a, Mrps15, Mta3, Gpr175, Nkd1, Hey1, Nrarp, Rabep1, Chd4,
Id1, Hira, Efna1, Lfng, Oact2, Mtm1, Ttc1, Ptpn11, Spry2, Nagk,
Hes5, Egr1, Bcl2l11, Klf10, Hes1, Csnk2a2, Trub2, Gm428, 1427572 at, Nudt13,
Star, 2810437L13, Seh1l, Otud5, 1418669 at, Gfra2, Sh2bp1, Ninj1, Tlr5, Cyp3a11,
Spint2, Ubc-B, Ubc-A, Cflar, Sp5, Kcnmb2, Trim2, Fscn1, Fgf1, Zfp96,
Poldip3, Wdr40a, H2-K1, Snrpd3, Wee1, Mtl5, Plxdc2, Cyp2c50, Tnfrsf9, 2900083I11,
1810008K03, 6330407G11, A830059I20, 2610042O14, 1200011M11, Rpl26-B, Rpl26-A, Tmsb10, Dnmt3a, Mdh1,
Hbb-bh1, Phlda1, Has2, Rbm22, Tnfrsf19-B, Dact1, Axin2, Myc, Dkk1, Tnfrsf19-A
remember: this is an unrooted tree!
3 Unrooted Trees in TreeView
The TreeView program (from http://taxonomy.zoology.gla.ac.uk/rod/treeview.html) reads many different tree formats, including the format
created by PHYLIP, and provides a better way to view "unrooted" trees.
Tree files, outtree, are in the various subdirectories under U:\efg\Research\Olivier\Mary-Lee\17PointSeries:
Baseline, Delete1-17, Delete2-272, Delete4-1000-Boot, Delete6-1000-Boot, and Delete8-1000-Boot
Instructions to view with TreeView:
1. Start TreeView
2. File | Open | Files of type: All files
3. Open the outtree file, then select:
   Tree | Unrooted
   Tree | Show Internal Edge Labels
   Tree | Internal Label Font … | 10-point Arial
3.1 Baseline
3.2 Delete-1 Jackknife
17 "delete 1" jackknife samples
We want to look for clades of genes that consistently group together, say 66% of the time or more (~11 out of 17).
WNT Group
3.3 Delete-2 Jackknife
17*16=272 "delete 2" jackknife samples
We want to look for clades of genes that consistently group together, say 66% of the time or more (~181 out of 272).
3.4 Delete-4 Bootstrap
1000 bootstrap samples of "delete 4" jackknife
We want to look for clades of genes that consistently group together, say 66% (less?) of the time or more (~666 out of 1000).
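The approximate counts quoted in sections 3.2 to 3.4 are consistent with a cutoff of two-thirds of the trees, rounded down. The short sketch below reproduces them; the variable names are illustrative, and the two-thirds cutoff is a reading of the quoted numbers rather than a stated rule.

```r
# Number of trees in each consensus run.
n_trees <- c(Delete1 = 17, Delete2 = 272, Delete4 = 1000)

# Two-thirds of the trees, rounded down, as an approximate support cutoff.
threshold <- floor(n_trees * 2/3)
print(threshold)
```

A clade whose internal edge label in TreeView is at or above the cutoff for its run would count as consistently grouped under this reading.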
3.5 Delete-6 Bootstrap
1000 bootstrap samples of "delete 6" jackknife
4 Appendix A. SetupConsenseData.R
From U:\efg\Research\Olivier\Mary-Lee\17PointSeries
# efg, 4 May 2006. Stowers Institute.
library(RODBC)
filename <- "U:/efg/Research/Olivier/Mary-Lee/17PointSeries/Book2.xls"
connection <- odbcConnectExcel(filename)
sqlTables(connection)
worksheet <- sqlFetch(connection, "Sheet1", as.is=TRUE)
close(connection)
worksheet <- worksheet[,c(1,4,22:38)]
# Get rid of blanks in Column Names
colnames(worksheet) <- gsub(" ", "",colnames(worksheet))
# Change "---" GeneSymbols to be Affy ProbeSetIDs
worksheet$GeneSymbol[ worksheet$GeneSymbol == "---" ] <- worksheet$ProbeSetID[ worksheet$GeneSymbol == "---" ]
# Get rid of overloaded Affy data in GeneSymbol field
worksheet$GeneSymbol <- unlist( lapply( strsplit(worksheet$GeneSymbol, "///"), "[", 1 ) )
DuplicateGeneIDs <- worksheet$GeneSymbol[ table(sort(worksheet$GeneSymbol)) > 1]
# Add "-A" and "-B" to GeneSymbol duplicates
IDtable <- table(worksheet$GeneSymbol)
Duplicates <- names(which(IDtable > 1))
# Assumes each duplicated symbol occurs exactly twice
for (i in seq_along(Duplicates))
{
  IsDuplicate <- worksheet$GeneSymbol == Duplicates[i]
  worksheet$GeneSymbol[IsDuplicate] <- paste(worksheet$GeneSymbol[IsDuplicate], c("-A", "-B"), sep="")
}
rownames(worksheet) <- worksheet$GeneSymbol
d <- data.matrix(worksheet[,3:ncol(worksheet)])
# Base correlation matrix
BaseCorrelationMatrix <- cor(t(d))
# First 5 rows and columns:
#                  Dkk1 Tnfrsf19-A       Hes1      Axin2     Dnmt3a
# Dkk1        1.0000000  0.9254702 -0.9012325  0.8679224  0.3520647
# Tnfrsf19-A  0.9254702  1.0000000 -0.9130869  0.7636788  0.3392926
# Hes1       -0.9012325 -0.9130869  1.0000000 -0.7181422 -0.5555910
# Axin2       0.8679224  0.7636788 -0.7181422  1.0000000  0.2403288
# Dnmt3a      0.3520647  0.3392926 -0.5555910  0.2403288  1.0000000
cor(d[1,], d[1,])   #[1] 1
cor(d[1,], d[2,])   #[1] 0.9254702
cor(d[1,], d[3,])   #[1] -0.9012325
cor(d[1,], d[4,])   #[1] 0.8679224
cor(d[1,], d[5,])   #[1] 0.3520647
# Create Distance Matrix from Correlation Matrix
# See R-Help, ?dist, second example
Distance <- (1 - BaseCorrelationMatrix)/2
# Distance[1:5,1:5]
#                  Dkk1 Tnfrsf19-A      Hes1      Axin2    Dnmt3a
# Dkk1       0.00000000 0.03726489 0.9506163 0.06603878 0.3239676
# Tnfrsf19-A 0.03726489 0.00000000 0.9565434 0.11816062 0.3303537
# Hes1       0.95061627 0.95654343 0.0000000 0.85907112 0.7777955
# Axin2      0.06603878 0.11816062 0.8590711 0.00000000 0.3798356
# Dnmt3a     0.32396763 0.33035369 0.7777955 0.37983561 0.0000000
heatmap(Distance, scale="none", Colv=NULL, Rowv=NULL)
heatmap(Distance)
WriteDistanceMatrix <- function(OutFile, DeleteColumns)
{
  print(DeleteColumns)
  flush.console() # Let Windows display catch up
  if ( length(DeleteColumns) == 0 )
  {
    DeleteD <- d
  } else {
    DeleteD <- d[,-DeleteColumns]
  }
  CorrelationMatrix <- cor(t(DeleteD))
  Distance <- (1 - CorrelationMatrix)/2
  cat(sprintf("%5d", nrow(Distance)), "\n", file=OutFile)
  for (k in 1:nrow(Distance))
  {
    cat( sprintf("%-16s", rownames(Distance)[k]), file=OutFile)
    cat( sprintf("%10.6f", Distance[k,]), file=OutFile)
    cat("\n", file=OutFile)
  }
}
# baseline sample
CreateBaseline <- function()
{
  OutFile <- file("infile-baseline", "w")
  WriteDistanceMatrix(OutFile, NULL)
  close(OutFile)
}
# 17 Jackknife Delete-1 samples
CreateDelete1Jackknife <- function(seed=19)
{
  OutFile <- file("infile-1", "w")
  set.seed(seed)
  for (i in 1:ncol(d))
  {
    DeleteColumns <- i
    WriteDistanceMatrix(OutFile, DeleteColumns)
  }
  close(OutFile)
}
# 17*16 Jackknife Delete-2 samples (ordered pairs, so each pair is written twice)
CreateDelete2Jackknife <- function(seed=19)
{
  OutFile <- file("infile-2", "w")
  set.seed(seed)
  for (i in 1:ncol(d))
  {
    DeleteColumns <- i
    for (j in 1:ncol(d))
      if (j != i)
      {
        WriteDistanceMatrix(OutFile, c(DeleteColumns, j) )
      }
  }
  close(OutFile)
}
# N Bootstrap samples of DeleteCount Jackknife
CreateDeleteNBootstrap <- function(BootCount, DeleteCount, seed=19)
{
  # Bootstrap with "Delete-d jackknife" (See Efron & Tibshirani, p. 149)
  OutFile <- file(paste("infile-", DeleteCount, sep=""), "w")
  set.seed(seed)
  for (i in 1:BootCount)
  {
    # Randomly choose DeleteCount of the time points to delete
    DeleteColumns <- sample(1:ncol(d))[1:DeleteCount]
    WriteDistanceMatrix(OutFile, DeleteColumns)
  }
  close(OutFile)
}