Supplementary Software 1

Supplementary Information
Active learning framework with iterative clustering for bioimage classification
Natsumaro Kutsuna, Takumi Higaki, Sachihiro Matsunaga, Tomoshi Otsuki,
Masayuki Yamaguchi, Hirofumi Fujii & Seiichiro Hasezawa
Supplementary Software 1 | Pseudocode of the CARTA algorithm. The core routines of CARTA are shown in Lists 1–4.
global parameters
N: number of input images
P: population size (number of individuals) in genetic algorithm (GA)
List 1
function CARTA(images) do
    for i ← 1 to N do
        vectors[i] ← feature vector extracted from images[i]   // Feature Extractor in Fig. 1a
    end for
    // select features & annotated subset of images
    selector, annotatedVectors, annotatedLabels ← iterativeClustering(vectors, images)   // List 2
    display selector to user
    // perform supervised learning and cross-validation
    classifierSub, accuracySub ← trainAndValidate(project(selector, annotatedVectors), annotatedLabels)   // Lists 7 & 9
    classifierFull, accuracyFull ← trainAndValidate(annotatedVectors, annotatedLabels)   // List 7
    // classify all images
    if accuracyFull > accuracySub then
        labels ← classify(classifierFull, vectors)   // use full set of features, List 8
    else
        labels ← classify(classifierSub, project(selector, vectors))   // use selected features, Lists 8 & 9
    end if
    return labels
end function
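The final step of List 1, choosing between the classifier trained on the selected features and the one trained on all features, can be sketched in Python as follows. This is only an illustration under stated assumptions: the selector is assumed to be a 0/1 mask over features, and a nearest-centroid classifier with leave-one-out validation stands in for trainAndValidate and classify (Lists 7 and 8, not reproduced here). The name `classify_all` is hypothetical.

```python
from collections import defaultdict

def project(selector, vectors):
    """Keep only features whose selector bit is set (assumed form of List 9)."""
    return [[x for x, keep in zip(v, selector) if keep] for v in vectors]

def _centroids(vectors, labels):
    """Mean vector per class; this dict plays the role of the classifier."""
    groups = defaultdict(list)
    for v, y in zip(vectors, labels):
        groups[y].append(v)
    return {y: [sum(col) / len(vs) for col in zip(*vs)] for y, vs in groups.items()}

def classify(classifier, vectors):
    """Assign each vector to the class with the nearest centroid."""
    def nearest(v):
        return min(classifier,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(v, classifier[y])))
    return [nearest(v) for v in vectors]

def train_and_validate(vectors, labels):
    """Return (classifier, leave-one-out accuracy); a stand-in for List 7."""
    hits = 0
    for i in range(len(vectors)):
        rest_v = vectors[:i] + vectors[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        hits += classify(_centroids(rest_v, rest_y), [vectors[i]])[0] == labels[i]
    return _centroids(vectors, labels), hits / len(vectors)

def classify_all(selector, vectors, annotated_vectors, annotated_labels):
    """Use the full feature set only if it validates strictly better."""
    sub, acc_sub = train_and_validate(project(selector, annotated_vectors),
                                      annotated_labels)
    full, acc_full = train_and_validate(annotated_vectors, annotated_labels)
    if acc_full > acc_sub:
        return classify(full, vectors)
    return classify(sub, project(selector, vectors))
```

As in the listing, ties in validation accuracy favor the selected feature subset.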
List 2
function iterativeClustering(vectors, images) do
    // constant L: number of generations without improvement after which the GA iteration stops
    generation ← 1
    annotatedVectors ← empty
    annotatedLabels ← empty
    peakGeneration ← 1
    peakFitness ← 0   // minimum possible fitness value
    makeFirstGeneration(population)   // randomly initialize individuals, List 5
    peakSelector ← featureSelector of population[1]
    repeat do
        foreach individual ∊ population do
            evaluate(individual, vectors, annotatedVectors, annotatedLabels)   // Feature Evaluator in Fig. 1a, List 3
        end foreach
        bestIndividual ← individual assigned best fitness in population
        currentFitness ← fitness of bestIndividual
        display currentFitness & featureSelector of bestIndividual to user
        if currentFitness > peakFitness then   // better solution found
            peakFitness ← currentFitness
            peakGeneration ← generation
            peakSelector ← featureSelector of bestIndividual
        else if (annotatedLabels ≠ empty) and (generation − peakGeneration > L) or (interrupted by user) then
            return peakSelector, annotatedVectors, annotatedLabels
        end if
        newAnnotatedImages, newAnnotatedLabels ← acceptAnnotation(peakSelector, vectors, images)   // List 4
        if newAnnotatedImages ≠ empty then
            peakFitness ← 0   // minimum possible fitness value
            peakGeneration ← generation
            peakSelector ← featureSelector of bestIndividual
            for i ← 1 to N do
                if images[i] in newAnnotatedImages then
                    append vectors[i] to annotatedVectors
                end if
            end for
            append newAnnotatedLabels to annotatedLabels
        end if
        population ← makeOffsprings(population)   // Feature Optimizer in Fig. 1a, List 6
        generation ← generation + 1
    end repeat
end function
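The peak-tracking and stopping rule of List 2 can be isolated into a small helper, sketched below in Python. The class name `PeakTracker` and its method names are hypothetical; `patience` corresponds to the constant L. The condition is read with the usual precedence, that is, (annotations exist and the search has stalled) or (the user interrupted).

```python
from dataclasses import dataclass

@dataclass
class PeakTracker:
    patience: int               # the constant L of List 2
    peak_fitness: float = 0.0   # minimum possible fitness value
    peak_generation: int = 1

    def update(self, generation, fitness, has_annotations, interrupted=False):
        """Record an improvement, or report whether the iteration should stop."""
        if fitness > self.peak_fitness:
            # Better solution found: remember it and keep iterating.
            self.peak_fitness = fitness
            self.peak_generation = generation
            return False
        stalled = generation - self.peak_generation > self.patience
        return (has_annotations and stalled) or interrupted

    def reset(self, generation):
        """Restart peak tracking after the user supplies new annotations."""
        self.peak_fitness = 0.0
        self.peak_generation = generation
```

Mirroring the listing, the stop condition is tested only in generations that bring no improvement, and a stalled search never stops by itself while no annotations exist.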
List 3
procedure evaluate(individual, vectors, annotatedVectors, annotatedLabels) do   // Feature Evaluator in Fig. 1a
    if annotatedLabels is empty then   // unsupervised situation
        fitness ← 1
    else   // semi-supervised situation
        vectorsInSubspace ← project(featureSelector of individual, vectors)   // List 9
        som ← train self-organizing map (SOM) using vectorsInSubspace
        fitness ← 0
        foreach class ∊ classes of annotatedLabels do
            classVectorsInSubspace ← project(featureSelector of individual, vectors labeled as class in annotatedVectors)   // List 9
            for i ← 1 to number of classVectorsInSubspace do
                classPoints[i] ← location of best matching unit (BMU) in som to classVectorsInSubspace[i]
                // location: f(x) in Q1×Q2 defined in equations (1, 2)
            end for
            classTree ← construct minimum spanning tree (MST) which connects all classPoints
            fitness ← fitness + 1 / (1 + ∑_{arc ∊ classTree} |arc|)   // compact tree yields high fitness
        end foreach
    end if
    for i ← 1 to N do
        allLocations[i] ← location of BMU in som to vectorsInSubspace[i]   // location: f(x) in Q1×Q2
    end for
    allTree ← construct MST which connects allLocations
    fitness ← fitness / (1 + ∑_{arc ∊ allTree} |arc|)   // adjust fitness by occupancy of SOM nodes
    assign fitness to individual
end procedure
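The fitness arithmetic of List 3 can be sketched in Python once the BMU locations are known. The sketch below assumes that the locations have already been computed as 2-D grid coordinates and that |arc| is the Euclidean length of an MST edge; both are assumptions, as are the function names `mst_length` and `fitness_from_locations`. SOM training itself is not shown.

```python
import math

def mst_length(points):
    """Total edge length of a Euclidean minimum spanning tree (Prim's algorithm)."""
    points = list(points)
    if len(points) < 2:
        return 0.0
    # best[i]: shortest distance from point i to the tree grown so far.
    best = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    total = 0.0
    while best:
        j = min(best, key=best.get)      # closest point not yet in the tree
        total += best.pop(j)
        for i in best:
            best[i] = min(best[i], math.dist(points[j], points[i]))
    return total

def fitness_from_locations(all_locations, annotated_locations, annotated_labels):
    """Sum of per-class compactness terms, divided by the spread of all images."""
    if not annotated_labels:             # unsupervised situation
        fitness = 1.0
    else:                                # semi-supervised situation
        fitness = 0.0
        for cls in set(annotated_labels):
            class_points = [p for p, y in zip(annotated_locations, annotated_labels)
                            if y == cls]
            fitness += 1.0 / (1.0 + mst_length(class_points))
    return fitness / (1.0 + mst_length(all_locations))
```

For example, two classes that each occupy two adjacent nodes contribute 1/(1+1) each, and if the four images span an MST of total length 5, the adjusted fitness is 1/6.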
List 4
function acceptAnnotation(featureSelector, vectors, images) do
    vectorsInSubspace ← project(featureSelector, vectors)   // List 9
    som ← train SOM using vectorsInSubspace
    for i ← 1 to N do
        location ← location of BMU in som to vectorsInSubspace[i]   // location: f(x) in Q1×Q2
        assign location to images[i]
    end for
    foreach node ∊ som do   // display tiled images of SOM
        location ← location of node
        imagesAtXy ← images assigned to location
        display one of imagesAtXy as the tile at location
    end foreach
    if inputs from user exist then
        return images annotated by user, labels annotated by user
    else
        return empty, empty
    end if
end function
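The tiling step of List 4 amounts to grouping images by the location of their BMU and picking one image per node. A minimal Python sketch follows, assuming a trained SOM is available as a dictionary that maps grid locations to weight vectors; this representation and the names `best_matching_unit` and `tile_images` are hypothetical. The listing only says "one of" the images at a node is displayed, so taking the first one is an arbitrary choice made here.

```python
from collections import defaultdict

def best_matching_unit(som_weights, vector):
    """Return the grid location whose weight vector is closest to `vector`."""
    return min(som_weights,
               key=lambda loc: sum((w - x) ** 2
                                   for w, x in zip(som_weights[loc], vector)))

def tile_images(som_weights, vectors, images):
    """Map each SOM node location to one representative image, or None if empty."""
    images_at = defaultdict(list)
    for vector, image in zip(vectors, images):
        images_at[best_matching_unit(som_weights, vector)].append(image)
    tiles = {}
    for loc in som_weights:
        candidates = images_at.get(loc, [])
        tiles[loc] = candidates[0] if candidates else None
    return tiles
```

Nodes that attract no image are kept in the result with `None`, so that the display grid retains its full shape.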