Project in Bioinformatics – 236524

advertisement
Project in Bioinformatics – 236524
Coloring Heuristics for Dotted Graphs
Presented By:
Shai Lubliner 036317048
Talya Gendler 061106597
Contents:
1. Introduction (Biological Background)
2. Modeling the problem
3. Algorithms & their implementation
4. Program design & usage
5. Results
5.1 Biological data
5.2 Randomly generated data
6. References
7. Appendix
2
1. Introduction (Biological Background)
Genotyping
The process of genotyping determines the variants of certain sequences in an individual’s
DNA, out of a known set of polymorphisms which exist in the population. This allows
differentiation between individuals, linkage analysis, prenatal disease detection, etc.
Among the most important types of polymorphisms is MLP (Microsatellite Length
Polymorphism).
Microsatellite Length Polymorphism
Microsatellites are ubiquitous short tandem-repeat sequences widely and randomly
distributed throughout eukaryotic genomes. They are acutely prone to replication errors
that result in expansions and contractions of repeat unit (repeat length variability) because
of misalignment of the template and daughter strands.
The typical repeated sequence is 1-5 bp long, typically repeated 10-20 times. The MLP is
flanked by fixed areas, which enable the design of specific primers.
The genotyping process
The process of determining an individual’s allele is comprised of designing primers
complementary to the flanking regions, enhancement by PCR and electrophoresis.
The amount of resources needed for electrophoresis is measured by two factors: tags and
lanes.
Minimization of both can reduce experiment costs. From here on we will focus on
reducing the number of lanes, under the assumption that only one tag is used.
This can be achieved by multiplexing – using the same lane for more than one MLP.
We seek the minimal number of lanes with which one can conduct genotyping, in a
manner that enables decisive interpretation of the outcome.
3
2. Modeling the Problem
MLP Representation
It is clear that an MLP can be represented as an arithmetic progression, also referred to as
a dotted interval. Formally:
• For any polymorphic microsatellite p, let F be the length of the flanking area for p
and Δ the length of the repeat sequence. Let l and h be the minimum and
maximum number of repeats possible in p, respectively.
• The possible lengths of the DNA fragment corresponding to p are:
Sp = {F + n Δ : l ≤ n ≤ h} which is an arithmetic progression.
Two MLPs are distinct iff their arithmetic progressions share no mutual point.
When genotyping, two MLPs can be run in the same lane iff they are distinct.
Dotted Interval Graphs (DIGs)
Seeking to exploit the dotted interval representation of MLPs, Aumann et Al.[1] present
the following model, which is a variation on the well known Interval Graph:
- Each arithmetic progression is represented as a vertex.
- Two vertices have an edge between them if and only if the corresponding
progressions are distinct.
- The resulting graph is referred to as a DIG (Dotted Interval Graph).
- The notation DIGd represents a DIG with the integer d being its maximal jump.
Solution lies in coloring
It is obvious that a coloring of the graph induces a partition of all MLP samples to lanes.
Unfortunately, coloring DIGs has been proven ([1]) to be NP-complete.
Therefore we turn to approximation algorithms and heuristics.
Our goal is to examine and compare a set of approximation and heuristic algorithms, and
see which gives the best results when run on DIGs. Mainly, we wish to conclude which
algorithm gives the best results for biological data.
The primary parameter by which we rate an algorithm’s performance is having minimal
number of colors assigned to the graph’s vertices.
The secondary one is the skew of the color size.
4
3. Algorithms & their implementation
Sequential coloring based algorithms
The following algorithms are all composed of two main steps. The first one is a vertex
ordering algorithm and the second is the sequential coloring algorithm.
In our implementation of the Sequential Coloring (SC) algorithm, we insert each vertex
into the minimal sized color (if possible), in order to try and minimize the skew of the
color sizes, as proposed in [3].
All three algorithms are heuristic. The first two algorithms are elementary. We regard
them as an experimental control.
1. Ending First Coloring (EFC / Interval Graph Coloring) – Vertices are ordered such
that vertex (dotted interval) a precedes vertex b if a ends before b.
2. Beginning First Coloring (BFC) – Vertices are ordered such that vertex (dotted
interval) a precedes vertex b if a begins before b.
3. Smallest Last Coloring (SLC), which is based on Smallest Last Ordering (SLO),
discussed in [2].
SLC is known to be a useful algorithm for coloring general graphs. Roth et al. [3]
discussed and tested SLC performance on a dotted interval graph representing the
biological data taken from the Weber8 set [4]. SLC outperformed the Heuristic
Approximate Chordal Coloring algorithm, and was also proved to have returned the
optimal result (in this case).
We therefore consider this as “the algorithm to beat”.
5
“DIG aware” approximation algorithms
The following two algorithms are presented in Aumann et al. [1], and exploit the
characteristics of DIGs.
DIG2 Coloring (DIG2) – This algorithm has been proven to color DIG2 graphs with an
approximation ratio of 3/2.
Some definitions for a vertex vi:
bi - the beginning point of the DI represented by vi
Vi - the vertices that are represented by the DIs above the point bi.
Gi - a subgraph of G induced by the vertices of Vi.
Following is the algorithm, which iterates over the vertices according to BFO (beginning
first order):
The “DIG awareness” of this algorithm is epitomized in the first if condition (3-4).
Considering the possibility of multiplexing, it first tries to use one of the colors already
assigned to overlapping DIs.
6
DIGd Coloring (DIGd) – This algorithm has been proven to color DIGd graphs with an
approximation ratio of (7/8)d +3/8.
Some definitions:
Vi - the vertices (DIs) whose jump=i.
Gi - a subgraph of G induced by the vertices of Vi.
Following is the algorithm:
A weakness of this algorithm is the fact that it assigns a new set of colors in each
invocation of a coloring algorithm.
7
A color reducing heuristic
We suggest a simple algorithm that, given a coloring of a graph, attempts to reduce the
number of colors.
The algorithm ReduceCol:
1. sort colors by ascending size → C1,…,Cn
2. for C = C1 to Cn :
2.1. foreach v ∈ C
try to relocate v into another color, in descending size order
if relocation fails, go to 2 for next color
2.2 if C is empty remove it from coloring
The notion of this algorithm arose after we received the DIG2 and DIGd algorithms’
results (presented later). These results were characterized by a distinct skew of the color
sizes.
It is obvious that this algorithm can only improve the results attained by any other
algorithm, regarding number of colors used.
Since this algorithm tries to eliminate colors starting with the smallest, we expect it to
reduce the skew of color sizes whenever it receives a skewed coloring to begin with.
On the other hand, lessening the color number of balanced colorings may come at the
price of enlarging the coloring skew.
Notation issue – We add the ++ sign to an algorithm’s name to indicate that ReduceCol is
run successively. For example, DIG2++ means running DIG2 and immediately after
running ReduceCol on its output.
8
4. Program design & usage
Components
Our code exists of the following:
- A C++ core program
(DIG.cpp\hpp, DIG_DottedInterval.cpp\hpp, DIG_Vertex.cpp\hpp &
DIG_Util.cpp\hpp)
- A C++ random dataset generator
(DIGen.cpp\hpp)
- A perl script which serves as a wrapper for the core program
(DIG.pl)
- A perl script which allows mass testing of the algorithms’ performance
(StatGen .pl)
The core program
The core program receives a data file which specifies a set of dotted intervals and uses it
to build a DIG. The dotted intervals can be infinite as well as finite, so the program
supports the wider notion of DIGs (as presented in [1]).
The program implements all of the algorithms discussed in section 4.
The coloring algorithms specified by a user request are run on the DIG to produce an
output file, specifying the results per algorithm.
The core program design focuses on data safety and correctness, rather than on time and
space performance. As will be later mentioned, for biologically relevant sized datasets,
the time performance is satisfying.
Random dataset generator
This program receives the following parameters:
Dataset size, maximal jump, random seed, previous random seed.
Dataset size – determines the number of DIs in the dataset produced.
Maximal jump – for each DI we randomly select a jump value from the integer range
[1,…,maximal jump].
Random seed – used to initiate the random generation process. For a given nonzero
random seed, the generation is actually deterministic. Hence, given a specific set of
parameters (with a nonzero random seed), the program generates a specific dataset. This
enables reconstruction of previously generated datasets and allows us to include the
option of regression, used in stat_gen.pl (explained below).
Previous random seed – used to ensure that two consecutive calls to the generator will not
result in identical sets. This is relevant for mass testing as in StatGen.pl.
Each DI is characterized by the size of its flanking areas, its jump and the number of dots
it contains. Based on random decisions of these three parameters, we can determine the
DI’s beginning, end and offset.
9
Since we focus on achieving best results for biological data, which of course is made up
only of finite DIs, we only constructed finite DIs.
We chose the range of [1,…,200] for the flanking area size and that of [1,…,50] for the
number of points. These ranges were chosen based on examination of the Weber sets [4].
All random decisions are made to simulate uniform distributions.
StatGen.pl
This program has two operation modes; regression based and non-regression based. In
both modes the program requests the core program to run all of the algorithms on each
dataset.
In the non-regression based, the following parameters are received:
Number of sets per dataset size (num_sets) (default=10), max jump (default=2), output
file and new regression file.
Number of sets per dataset size – specifies number of times a new dataset of size X is
generated and fed to the core program. X belongs to {20,50,100,200,300,500,750,1000}.
Max jump – passed on to the random dataset generator.
Output file – contains the results. For each algorithm and dataset size X the following
parameters are calculated, as an average on all num_sets datasets:
- average number of colors
- average color size
- minimal color size
- maximal color size
New regression file – contains all information needed to recreate the test case.
In the regression based option, the script receives a single parameter – the name of a
regression file. This enables the recreation of a specific test case. Therefore a standard
test case can be used for comparison of future algorithms with the existing ones.
Documentation
Each perl script includes a man page, invoked when the script is called with the ‘-man’
option. This documentation is user-oriented. We found it superfluous to include extensive
developer-oriented documentation, trusting in our code fluency.
Making the programs:
We supply a short Makefile which specifies the needed rules for the making of the core
program and the random dataset generator.
For reasons of simplicity we did not partition our files into subdirectories. Any such
partition calls for changes in the scripts and in the Makefile.
10
5. Results
We discuss the results we received on two different types of data, the first being
biological data and the second randomly generated data.
5.1 Results received on biological data
We run the program on three different sets, all taken from the Weber sets [4]. Each such
set includes multiple MLPs found in the human genome. The Weber sets are not distinct,
and some contain great amounts of data contained in other sets. We chose these specific
sets because they belong to different groups, and seem to represent most of the data
present in all sets.
In order to create the files representing these sets in the format requested by the program,
we preformed some preliminary manual manipulation on the data taken from the web,
followed by the help of a simple perl script (“convert.pl”, submitted).
In doing so, we omitted any MLP with incomplete/indecisive definition (for example a
missing starting\ending point, or starting\ending points not corresponding to the jump
values given).
The results are presented in the following tables:
For set Weber8:
(301 vertices, maximal clique size – 23)
EFC
number of colors 26
max color size 12
min color size
11
EFC++
24
19
6
BFC
24
13
11
BFC++
23
24
1
SLC
23
14
13
SLC++
23
25
6
DIG2
24
21
2
DIG2++
24
21
3
DIGd
26
21
1
DIGd++
23
21
4
SLC
46
8
7
SLC++
45
14
3
DIG2
45
15
1
DIG2++
45
15
2
DIGd
47
18
1
DIGd++
46
15
2
SLC
20
12
11
SLC++
20
24
2
DIG2
22
21
1
DIG2++
21
21
3
DIGd
24
22
1
DIGd++
21
24
3
For set Weber13:
(355 vertices, maximal clique size – 43)
EFC
number of colors 47
max color size 8
min color size
7
EFC++
46
13
3
BFC
46
8
7
BFC++
45
15
1
For set Weber53:
(230 vertices, maximal clique size – 19)
EFC
number of colors 24
max color size 10
min color size
9
EFC++
22
18
3
BFC
21
11
10
BFC++
21
21
3
11
A point worth elaborating on is the results for the Weber8 set, since Roth et al. [3] also
used the SLC algorithm on this set and received a result of 34 colors. They stated that this
was in fact an optimal coloring, since the maximal clique size they found was 34.
The set we used had 365 MLPs in it, and only 301 were left after omission as explained
above.
Their set, however, contained 383 MLPs.
We can only conclude that there is some difference between the sets used.
The best coloring we received was 23 colors, as is the size of the maximal clique.
Therefore, we also achieved an optimal coloring in this case.
Discussion – We recall that we measure the algorithms’ performance according to the
number of colors assigned and according to the skew of the color size.
As can be seen from the tables, the SLC++ algorithm is the best regarding color number.
However, it does tend to greatly enlarge the coloring skew compared with the results of
the SLC algorithm. Considering that the skew is practically important when actually
running DNA samples in gel (we do not want overly crowded lanes), it seems that the
best algorithm for these cases is SLC.
12
5.2 Results received on randomly generated data
We used the StatGen.pl script described above for the generation of this data. The dataset
sizes are {20,50,100,200,300,500,750,1000}. The parameters used were:
Number of sets per dataset size – 100
max jump – 5
The raw output appears in section 7 (Appendix), and contains, in addition to the
information discussed here, the standard deviation of the number of colors assigned by
each algorithm.
The following table contains the average number of colors assigned by each algorithm to
each dataset size:
20
50
100
200
300
500
750
1000
EFC
7.26
14.29
25.03
43.8
62.43
96.52
138.29
178.89
EFC++
6.87
13.08
22.57
39.53
56.19
87.26
124.57
162.01
BFC
7.11
13.85
24.29
43.05
61.72
96
137.48
179.16
BFC++
6.83
12.98
22.24
38.98
55.11
87.59
123.05
159.46
SLC
6.86
12.98
22.26
39.02
54.79
85.05
121.13
157.49
SLC++
6.82
12.79
21.84
38.3
54.01
83.88
119.67
155.59
DIG2
6.88
13.04
22.24
38.53
54.24
83.81
118.37
154.19
DIG2++
6.79
12.79
21.97
38.09
53.65
83.07
117.71
153.25
DIGd
8.35
15.52
25.92
43.59
59.95
91.69
127.51
164.2
DIGd++
6.87
12.98
22.15
38.64
54.33
84.34
119.73
155.33
Best results are marked in blue. As can be seen, ReduceCol indeed manages to reduce the
number of colors. The SLC++ algorithm seems to be the best for smaller datasets though
not by much. For larger datasets, DIG2++ is the best.
Regarding the approximation algorithms presented in [1], it is clear that the DIG2
algorithm’s strategy (as mentioned in section 3) provides good results, while the DIGd
algorithm’s color assignment “promiscuousness” provides poorer results.
The graph on the following page shows the color skewness for each algorithm.
It can be seen from the graph that our expectation was correct and that ReduceCol
reduces the skew of color sizes whenever it receives a skewed coloring to begin with (for
example DIG2), and enlarges the coloring skew of balanced colorings.
13
18
16
14
12
av col num
EFC-min
EFC-max
EFC++-min
EFC++-max
BFC-min
BFC-max
BFC++-min
BFC++-max
SLC-min
SLC-max
SLC++-min
SLC++-max
DIG2-min
DIG2-max
DIG2++-min
DIG2++-max
DIGd-min
DIGd-max
DIGd++-min
DIGd++-max
10
8
6
4
2
0
0
200
400
600
set size
14
800
1000
1200
6. References
1. Y. Aumann, M. Lewenstein, O. Melamud, R.Y. Pinter, Z. Yakhini, Dotted interval
graphs and high throughput genotyping, Proceedings of the sixteenth annual ACMSIAM symposium on Discrete algorithms, Session 4B, Pages: 339 - 348 , 2005.
2. D. W. Matula and L. L. Beck, Smallest-Last Ordering and Clustering and Graph
Coloring Algorithms, J. of the Association for Computing Machinery, vol. 30, no. 3, pp.
417--427, July 1983.
3. R.M. Roth, P. Webb, Z. Yakhini, Tagging DNA Fragments and Graph Coloring
Methods, HP, 1997
4. Web site http://research.marshfieldclinic.org/genetics/sets/combo.html
15
7. Appendix
The following is the raw output of StatGen.pl, upon which we base the discussion in
section 5.2:
SETS PER SIZE: 100
SET SIZE: 20
EFC:
Mean Number of Colors = 7.26 +/- 1.12
Average Color Size = 2.75
Average Max Color Size = 4.15
Average Min Color Size = 1.46
EFC:++
Mean Number of Colors = 6.87 +/- 1.11
Average Color Size = 2.91
Average Max Color Size = 4.82
Average Min Color Size = 1.36
BFC:
Mean Number of Colors = 7.11 +/- 1.14
Average Color Size = 2.81
Average Max Color Size = 4.09
Average Min Color Size = 1.68
BFC:++
Mean Number of Colors = 6.83 +/- 1.19
Average Color Size = 2.93
Average Max Color Size = 4.75
Average Min Color Size = 1.49
SLC:
Mean Number of Colors = 6.86 +/- 1.16
Average Color Size = 2.92
Average Max Color Size = 3.57
Average Min Color Size = 2.40
SLC:++
Mean Number of Colors = 6.82 +/- 1.15
Average Color Size = 2.93
Average Max Color Size = 4.69
Average Min Color Size = 1.42
DIG2:
Mean Number of Colors = 6.88 +/- 1.18
Average Color Size = 2.91
Average Max Color Size = 5.02
Average Min Color Size = 1.36
DIG2:++
Mean Number of Colors = 6.79 +/- 1.17
Average Color Size = 2.95
Average Max Color Size = 4.81
Average Min Color Size = 1.63
DIGd:
Mean Number of Colors = 8.35 +/- 1.15
16
Average Color Size = 2.40
Average Max Color Size = 4.25
Average Min Color Size = 1.20
DIGd:++
Mean Number of Colors = 6.87 +/- 1.17
Average Color Size = 2.91
Average Max Color Size = 4.59
Average Min Color Size = 1.60
SET SIZE: 50
EFC:
Mean Number of Colors = 14.29 +/- 1.45
Average Color Size = 3.50
Average Max Color Size = 5.33
Average Min Color Size = 1.58
EFC:++
Mean Number of Colors = 13.08 +/- 1.47
Average Color Size = 3.82
Average Max Color Size = 6.32
Average Min Color Size = 1.67
BFC:
Mean Number of Colors = 13.85 +/- 1.63
Average Color Size = 3.61
Average Max Color Size = 5.18
Average Min Color Size = 1.79
BFC:++
Mean Number of Colors = 12.98 +/- 1.52
Average Color Size = 3.85
Average Max Color Size = 6.20
Average Min Color Size = 1.73
SLC:
Mean Number of Colors = 12.98 +/- 1.52
Average Color Size = 3.85
Average Max Color Size = 4.51
Average Min Color Size = 3.22
SLC:++
Mean Number of Colors = 12.79 +/- 1.51
Average Color Size = 3.91
Average Max Color Size = 6.41
Average Min Color Size = 1.89
DIG2:
Mean Number of Colors = 13.04 +/- 1.63
Average Color Size = 3.83
Average Max Color Size = 7.13
Average Min Color Size = 1.23
DIG2:++
Mean Number of Colors = 12.79 +/- 1.54
Average Color Size = 3.91
Average Max Color Size = 6.69
Average Min Color Size = 1.84
DIGd:
17
Mean Number of Colors = 15.52 +/- 1.67
Average Color Size = 3.22
Average Max Color Size = 6.18
Average Min Color Size = 1.13
DIGd:++
Mean Number of Colors = 12.98 +/- 1.54
Average Color Size = 3.85
Average Max Color Size = 6.37
Average Min Color Size = 1.79
SET SIZE: 100
EFC:
Mean Number of Colors = 25.03 +/- 1.90
Average Color Size = 4.00
Average Max Color Size = 6.08
Average Min Color Size = 1.77
EFC:++
Mean Number of Colors = 22.57 +/- 1.88
Average Color Size = 4.43
Average Max Color Size = 7.82
Average Min Color Size = 1.63
BFC:
Mean Number of Colors = 24.29 +/- 2.10
Average Color Size = 4.12
Average Max Color Size = 6.03
Average Min Color Size = 1.81
BFC:++
Mean Number of Colors = 22.24 +/- 1.94
Average Color Size = 4.50
Average Max Color Size = 7.63
Average Min Color Size = 1.78
SLC:
Mean Number of Colors = 22.26 +/- 1.93
Average Color Size = 4.49
Average Max Color Size = 5.10
Average Min Color Size = 3.96
SLC:++
Mean Number of Colors = 21.84 +/- 1.98
Average Color Size = 4.58
Average Max Color Size = 8.16
Average Min Color Size = 1.94
DIG2:
Mean Number of Colors = 22.24 +/- 2.11
Average Color Size = 4.50
Average Max Color Size = 8.70
Average Min Color Size = 1.11
DIG2:++
Mean Number of Colors = 21.97 +/- 2.06
Average Color Size = 4.55
Average Max Color Size = 8.29
Average Min Color Size = 1.87
18
DIGd:
Mean Number of Colors = 25.92 +/- 2.52
Average Color Size = 3.86
Average Max Color Size = 7.42
Average Min Color Size = 1.21
DIGd:++
Mean Number of Colors = 22.15 +/- 2.03
Average Color Size = 4.51
Average Max Color Size = 7.94
Average Min Color Size = 1.88
SET SIZE: 200
EFC:
Mean Number of Colors = 43.80 +/- 2.18
Average Color Size = 4.57
Average Max Color Size = 7.21
Average Min Color Size = 1.93
EFC:++
Mean Number of Colors = 39.53 +/- 2.54
Average Color Size = 5.06
Average Max Color Size = 9.11
Average Min Color Size = 1.88
BFC:
Mean Number of Colors = 43.05 +/- 3.04
Average Color Size = 4.65
Average Max Color Size = 7.09
Average Min Color Size = 1.53
BFC:++
Mean Number of Colors = 38.98 +/- 2.66
Average Color Size = 5.13
Average Max Color Size = 8.94
Average Min Color Size = 1.77
SLC:
Mean Number of Colors = 39.02 +/- 2.59
Average Color Size = 5.13
Average Max Color Size = 5.77
Average Min Color Size = 4.54
SLC:++
Mean Number of Colors = 38.30 +/- 2.71
Average Color Size = 5.22
Average Max Color Size = 10.12
Average Min Color Size = 2.22
DIG2:
Mean Number of Colors = 38.53 +/- 3.19
Average Color Size = 5.19
Average Max Color Size = 10.60
Average Min Color Size = 1.12
DIG2:++
Mean Number of Colors = 38.09 +/- 2.96
Average Color Size = 5.25
Average Max Color Size = 10.14
Average Min Color Size = 1.99
19
DIGd:
Mean Number of Colors = 43.59 +/- 3.01
Average Color Size = 4.59
Average Max Color Size = 9.10
Average Min Color Size = 1.23
DIGd:++
Mean Number of Colors = 38.64 +/- 2.62
Average Color Size = 5.18
Average Max Color Size = 9.39
Average Min Color Size = 1.86
SET SIZE: 300
EFC:
Mean Number of Colors = 62.43 +/- 3.06
Average Color Size = 4.81
Average Max Color Size = 7.52
Average Min Color Size = 2.12
EFC:++
Mean Number of Colors = 56.19 +/- 3.14
Average Color Size = 5.34
Average Max Color Size = 9.86
Average Min Color Size = 1.90
BFC:
Mean Number of Colors = 61.72 +/- 3.81
Average Color Size = 4.86
Average Max Color Size = 7.44
Average Min Color Size = 1.62
BFC:++
Mean Number of Colors = 55.11 +/- 2.96
Average Color Size = 5.44
Average Max Color Size = 9.79
Average Min Color Size = 1.77
SLC:
Mean Number of Colors = 54.79 +/- 3.07
Average Color Size = 5.48
Average Max Color Size = 6.09
Average Min Color Size = 4.96
SLC:++
Mean Number of Colors = 54.01 +/- 3.11
Average Color Size = 5.55
Average Max Color Size = 11.36
Average Min Color Size = 2.35
DIG2:
Mean Number of Colors = 54.24 +/- 3.63
Average Color Size = 5.53
Average Max Color Size = 11.30
Average Min Color Size = 1.09
DIG2:++
Mean Number of Colors = 53.65 +/- 3.52
Average Color Size = 5.59
Average Max Color Size = 10.69
20
Average Min Color Size = 1.95
DIGd:
Mean Number of Colors = 59.95 +/- 3.66
Average Color Size = 5.00
Average Max Color Size = 10.00
Average Min Color Size = 1.19
DIGd:++
Mean Number of Colors = 54.33 +/- 3.25
Average Color Size = 5.52
Average Max Color Size = 10.28
Average Min Color Size = 1.86
SET SIZE: 500
EFC:
Mean Number of Colors = 96.52 +/- 3.88
Average Color Size = 5.18
Average Max Color Size = 8.19
Average Min Color Size = 2.32
EFC:++
Mean Number of Colors = 87.26 +/- 3.80
Average Color Size = 5.73
Average Max Color Size = 10.87
Average Min Color Size = 1.89
BFC:
Mean Number of Colors = 96.00 +/- 5.06
Average Color Size = 5.21
Average Max Color Size = 8.22
Average Min Color Size = 1.27
BFC:++
Mean Number of Colors = 85.79 +/- 3.88
Average Color Size = 5.83
Average Max Color Size = 11.08
Average Min Color Size = 1.78
SLC:
Mean Number of Colors = 85.05 +/- 3.98
Average Color Size = 5.88
Average Max Color Size = 6.45
Average Min Color Size = 5.22
SLC:++
Mean Number of Colors = 83.88 +/- 3.72
Average Color Size = 5.96
Average Max Color Size = 12.88
Average Min Color Size = 2.47
DIG2:
Mean Number of Colors = 83.81 +/- 4.49
Average Color Size = 5.97
Average Max Color Size = 12.55
Average Min Color Size = 1.02
DIG2:++
Mean Number of Colors = 83.07 +/- 4.33
Average Color Size = 6.02
21
Average Max Color Size = 12.14
Average Min Color Size = 1.92
DIGd:
Mean Number of Colors = 91.69 +/- 5.08
Average Color Size = 5.45
Average Max Color Size = 10.76
Average Min Color Size = 1.23
DIGd:++
Mean Number of Colors = 84.34 +/- 4.08
Average Color Size = 5.93
Average Max Color Size = 11.22
Average Min Color Size = 1.87
SET SIZE: 750
EFC:
Mean Number of Colors = 138.29 +/- 4.32
Average Color Size = 5.42
Average Max Color Size = 8.76
Average Min Color Size = 2.49
EFC:++
Mean Number of Colors = 124.57 +/- 4.11
Average Color Size = 6.02
Average Max Color Size = 11.43
Average Min Color Size = 2.06
BFC:
Mean Number of Colors = 137.48 +/- 6.23
Average Color Size = 5.46
Average Max Color Size = 8.74
Average Min Color Size = 1.29
BFC:++
Mean Number of Colors = 123.05 +/- 4.29
Average Color Size = 6.10
Average Max Color Size = 11.66
Average Min Color Size = 1.87
SLC:
Mean Number of Colors = 121.13 +/- 4.69
Average Color Size = 6.19
Average Max Color Size = 6.86
Average Min Color Size = 5.67
SLC:++
Mean Number of Colors = 119.67 +/- 4.55
Average Color Size = 6.27
Average Max Color Size = 14.57
Average Min Color Size = 2.61
DIG2:
Mean Number of Colors = 118.37 +/- 5.03
Average Color Size = 6.34
Average Max Color Size = 13.34
Average Min Color Size = 1.04
DIG2:++
Mean Number of Colors = 117.71 +/- 4.84
22
Average Color Size = 6.37
Average Max Color Size = 13.38
Average Min Color Size = 2.01
DIGd:
Mean Number of Colors = 127.51 +/- 4.65
Average Color Size = 5.88
Average Max Color Size = 11.58
Average Min Color Size = 1.17
DIGd:++
Mean Number of Colors = 119.73 +/- 4.55
Average Color Size = 6.26
Average Max Color Size = 12.08
Average Min Color Size = 2.03
SET SIZE: 1000
EFC:
Mean Number of Colors = 178.89 +/- 5.21
Average Color Size = 5.59
Average Max Color Size = 9.14
Average Min Color Size = 2.50
EFC:++
Mean Number of Colors = 162.01 +/- 5.47
Average Color Size = 6.17
Average Max Color Size = 12.01
Average Min Color Size = 2.07
BFC:
Mean Number of Colors = 179.16 +/- 7.57
Average Color Size = 5.58
Average Max Color Size = 8.97
Average Min Color Size = 1.13
BFC:++
Mean Number of Colors = 159.46 +/- 5.70
Average Color Size = 6.27
Average Max Color Size = 12.17
Average Min Color Size = 1.75
SLC:
Mean Number of Colors = 157.49 +/- 5.92
Average Color Size = 6.35
Average Max Color Size = 7.05
Average Min Color Size = 5.80
SLC:++
Mean Number of Colors = 155.59 +/- 5.95
Average Color Size = 6.43
Average Max Color Size = 16.02
Average Min Color Size = 2.68
DIG2:
Mean Number of Colors = 154.19 +/- 6.63
Average Color Size = 6.49
Average Max Color Size = 14.00
Average Min Color Size = 1.01
DIG2:++
23
Mean Number of Colors = 153.25 +/- 6.40
Average Color Size = 6.53
Average Max Color Size = 13.89
Average Min Color Size = 1.94
DIGd:
Mean Number of Colors = 164.20 +/- 6.57
Average Color Size = 6.09
Average Max Color Size = 12.13
Average Min Color Size = 1.15
DIGd:++
Mean Number of Colors = 155.33 +/- 6.09
Average Color Size = 6.44
Average Max Color Size = 12.78
Average Min Color Size = 1.88
24
Download