
DCellIQ Software Guide
First Edition
I. Introduction
Welcome to the DCellIQ Software Guide.
DCellIQ is an open-source software system for cellular image segmentation,
multiple-object tracking, feature extraction, and classification (identification). It is
designed to be highly user-friendly. Updated information is available from the website:
http://www.cbi-tmhs.org/Dcelliq/downloading.html
Currently, DCellIQ focuses on the quantitative analysis of time-lapse cellular
image sequences. The system consists of four major components: image segmentation,
multiple-object tracking, feature extraction, and classification. In the segmentation
component, the cells/nuclei are detected and segmented; the segmented objects are
then tracked over time. Each single cell/nucleus is represented by a 211-dimensional
feature vector. Finally, the cell phases can be identified by the online and offline
Support Vector Machine (SVM) classifiers, respectively.
II. Installation
2.1 Download. The DCellIQ software is written in Matlab, chosen for its ease of
use. It can be downloaded from the website listed above.
2.2 Organization. The DCellIQ software consists of two independent interfaces:
DCellIQ and DCellIQui (user interaction). The DCellIQ interface is used to
automatically process the input time-lapse image sequences. The DCellIQui interface
is used to manually correct the classification/identification errors, collect training data
sets, and generate the new problem-specific classifiers that are needed in the DCellIQ
interface.
2.3 Installation. Decompress the .rar file into a directory, e.g.
c:\Program Files\Matlab7.1\Toolbox\Dcelliq\.
When running the DCellIQ system for the first time, users need to:
(1) Change the 'current directory' of Matlab to the directory where the system is
stored, e.g. c:\Program Files\Matlab7.1\Toolbox\Dcelliq\, or add it to the
Matlab path with: addpath('c:\Program Files\Matlab7.1\Toolbox\Dcelliq\').
(2) a. Input 'DCellIQ' in the command window of Matlab.
    b. Input 'DCellIQui' in the command window of Matlab.
For future use, users just need to input 'DCellIQ' or 'DCellIQui' directly in the
command window of Matlab.
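The first-time steps above can be typed directly in the Matlab command window; the install path is the example path from above, so adjust it to your own location:

```matlab
% First-time setup (the path is the example install location from this guide;
% change it to wherever you decompressed the DCellIQ files).
addpath('c:\Program Files\Matlab7.1\Toolbox\Dcelliq\');  % make DCellIQ visible
DCellIQ     % launch the automatic-processing interface
DCellIQui   % launch the user-interaction interface
```

On later sessions, only the `DCellIQ` or `DCellIQui` command is needed if the path was saved.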
III. User’s Guide
DCellIQ software can detect and count cells and cell nuclei, segment images,
track cell movement and changes, quantify cell features/changes, and classify cells
and their specific features and changes. The system can achieve all of these tasks
efficiently, process the results, and automatically store them in one directory that can
be easily accessed or viewed.
All of the detection, segmentation, tracking, feature extraction, and classification
steps are integrated into a single pipeline. The user simply specifies the location of
the raw data directory, which enables batch processing. (Note: no matter how many
data sets are present, they will be processed sequentially and stored in the same way.)
In addition, to use the classification function, users must generate a problem-specific
classifier for their particular study.
3.1 Usage of DCellIQ
3.1.a Interface of DCellIQ. Input 'DCellIQ' in the command window and press
Enter; the DCellIQ interface will appear (Fig. 1).
Figure 1: Interface of DCellIQ (version Beta V1.0).
3.2 Processing data using DCellIQ
3.2.a Obtain the directory of the data. To process data, the user must specify the
location of the raw data directory using the menu 'File -> GetDirectory'. (Users can
download the 'test data' from: http://www.cbi-tmhs.org/Dcelliq/downloading.html .
The test data contain two data sets: "…\testdata\test1" and "…\testdata\test2".) Since
the software has batch-processing ability, users can put all data sets under one
directory, e.g. …\testdata\, and then use the menu 'File -> GetDirectory' to select that
directory.
3.2.b Processing the data. To begin processing data, use the menu 'Image
Processing -> Processing'. A sub-window (Fig. 2.1) will then pop up, in which the
user can choose the tasks to perform after segmentation: tracking, feature extraction,
and classification. To choose an operation, input 'Y' ('Yes', 'yes', or 'y' will also be
recognized); to skip an operation, input 'N' ('No', 'no', or 'n') instead. Every chosen
task (detection, segmentation, tracking, feature extraction, and classification) will
then be performed automatically.
Figure 2.1: Users can choose or skip operations by inputting 'Y' or 'N'.
3.2.c Parameter Setting. The user must set specific parameters for specific
images (Fig. 2.2). For details, please refer to References [2, 3].
Figure 2.2: Parameter Setting interface.
3.3 Checking the processing results using DCellIQ. The user can monitor the
processing results using the DCellIQ interface. Note that all the processing results are
stored automatically in the same directory as the original data sets, under a name
ending in '_Res', e.g. '…\TestData\data1_Res'.
3.3.a Load the processed data. To check the processed results, the user must first
load them using the menu 'Check Processing Results -> Load Data'. Find the
processed results in the same directory as the original data sets and choose the results
directory of the specific data set of interest, e.g. '…\TestData\data1_Res'. The data set
will then be displayed (Fig. 3).
Figure 3: Snapshot of the DCellIQ interface after loading the processed data. Objects are numbered
in green, and their boundaries are delineated in red.
3.3.b Check and correct the segmentation results.
1) Check the segmentation results.
After loading the processed results, the segmentation results can be viewed by
inputting specific frame numbers in the Frame ID box and pressing Enter (Fig.
4). The blue arrow buttons can also be used to view the previous and next
frames (Fig. 4).
Figure 4: Validation of segmentation results.
2) Correct the segmentation results.
Merge over-segmented objects (Fig. 5):
Step 1. Click the 'Merge' radio button.
Step 2. Input the cell IDs to be merged, e.g. 15,16; 33,35, in the pop-up window.
Note: the different merging pairs are separated by a semicolon ';'.
Step 3. Click 'OK'.
Fig 5 shows an example of the pop-up window.
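For illustration only, the merge-string format (pairs separated by ';', IDs separated by ',') can be parsed in Matlab as below; this is a hypothetical sketch, not part of DCellIQ's own code:

```matlab
% Hypothetical sketch (not DCellIQ code): parsing a merge string such as
% '15,16; 33,35', where ';' separates merge pairs and ',' separates cell IDs.
str   = '15,16; 33,35';
pairs = regexp(str, ';', 'split');       % {'15,16', ' 33,35'}
for k = 1:numel(pairs)
    ids = str2num(pairs{k});             %#ok<ST2NM> e.g. [15 16]
    fprintf('merge cells %d and %d\n', ids(1), ids(2));
end
```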
Figure 5: Example of interface of merging cells.
3) Split under-segmented objects (Fig. 6).
Step 1: Activate the 'Selecting Seeds' radio button.
Step 2: Move the mouse to the under-segmented cells. To separate one
under-segmented cell into N parts, select N seeds, one for each part: click
the rough center of each part and then click the 'Add Seeds' button, one
seed at a time. A blue star sign '*' will appear at each selected seed point.
Note: users can add as many centers as needed.
Step 3: Click the 'Split' button to finish the splitting process.
Figure 6: Example interface of splitting cells.
4) Check the general attributes of a single object. By inputting the Cell ID in the
Cell ID box, the user can obtain some basic information about the single object, as
seen in the blue boxes of Fig. 7.
Figure 7: View the basic information of the selected single object.
Avg Int: average intensity; Max Int: maximum intensity; Dev Int: standard deviation of the
intensity; Area: size of the object (number of pixels); PerCov: perimeter of the convex image;
Perimeter: perimeter of the object; Compact: (4*pi*Area)/(Perimeter^2); AxRatio:
LongAx/ShortAx; LongAx: the longer axis; ShortAx: the shorter axis.
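As a minimal sketch of how two of the listed attributes could be computed from a binary object mask, assuming the Image Processing Toolbox's regionprops function (this illustrates the definitions above, not DCellIQ's internal code):

```matlab
% Illustrative sketch: compute Compact and AxRatio for one toy object,
% assuming the Image Processing Toolbox (regionprops) is available.
bw = false(50); bw(10:40, 15:35) = true;          % toy rectangular object mask
s  = regionprops(bw, 'Area', 'Perimeter', ...
                 'MajorAxisLength', 'MinorAxisLength');
compact = (4*pi*s.Area) / (s.Perimeter^2);        % Compact as defined above
axRatio = s.MajorAxisLength / s.MinorAxisLength;  % AxRatio = LongAx/ShortAx
```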
5) Check the tracking results (Fig. 8).
1) All trace numbers (the number outside the parentheses is the cell ID in the
last frame; the number inside the parentheses is the cell ID in the first
frame) are listed in the TraceID box. The user can follow a trace by
inputting its number into the Imaging Interval box at the bottom and
pressing Enter. The object in each frame is labeled with a red star. The
display speed of the frames can be controlled by inputting a number in the
Show Interval box (e.g. 0.02 seconds). Also, the number of frames and the
number of objects in a frame can be viewed via the FrameID and Cell ID
boxes, respectively. In addition, to check the migration of one object in a
specific frame, use the red arrow to step through the trace manually after
inputting the trace number and pressing Enter.
Figure 8: Validation of tracking.
2) View the tree structures of the cell cycle by inputting the cell number in the
first frame (the trace IDs inside the parentheses) in the Draw Trace Tree box.
Some representative cell-cycle tree structures are provided in Fig. 9.
Figure 9: Tree structures of cell cycles. The red numbers in the nodes are the cell IDs in the
corresponding frames. The black numbers are the cell division time points. The red numbers are the
corresponding trace IDs.
IV. DCellIQui
4.1 Usage of DCellIQui. The DCellIQui interface is used to validate and correct
the classification results. In addition, it is used to generate the users’ own SVM
classifiers or update the existing SVM classifiers.
4.2 Interface of DCellIQui. After inputting ‘DCellIQui’ in the command window
of Matlab, the users can see the interface of DCellIQui as in Fig. 10.
Figure 10: Interface of DCellIQui.
4.3 Validate or correct the classification (phase identification) results.
A. Load processed data. To validate the classification results, the user must first
load the processed data. Use the menu 'File -> LoadData' to load the processed
data, the same as in the DCellIQ interface, e.g. '…\TestData\data_1_Res\'.
B. Validate and correct the classification results. By clicking a frame number in
the right list box, the user can view the classification results. Different classes
(phases) are denoted by boundaries of different colors, as seen in the following
figure. To correct the class of a cell, click the cell with the mouse; a 'star' will
appear on top of the cell. Then use the buttons at the right of the interface to
change the phase of the selected cell.
To save the correction results, the user will need to click the blue ‘Save Phase
Correction’ button.
Figure 11: Interface of DCellIQui after loading the processed data. The phases (or classes) of cells
are represented by different colors. In the example image, cells with red boundaries are in
Interphase; green denotes Prophase; cyan denotes Metaphase; blue represents Anaphase; and
purple means the cells are classified into the bad-cell class, which is discarded. The other five
classes have not been defined.
4.4 Generate your own SVM Classifier (Fig. 12).
A. Build the training data set. To build your own SVM classifier, a training data
set is needed. DCellIQui provides a convenient way for users to build a
training data set in four steps:
(1) Choose one frame and correct the cells' classes as introduced in Section 4.3.
(2) Click the 'Select Cells' radio button at the bottom. The cells will be labeled
automatically with green numbers.
(3) Input the numbers of the cells you want to choose in the bottom edit box.
(4) Push the 'Save Selected Cells' button to save the selected cells.
Note: cells can be selected in any frame and any data set using the above four steps.
Figure 12: Related controls of building a training data set.
B. Build your own SVM Classifier.
After building a training data set, a specific SVM classifier can be built as follows:
(1) Click the 'Select Features' radio button at the bottom.
(2) Input the indexes of the selected features in the edit box at the bottom.
(Remember there are 211 features in total.)
Features     number   index
Gabor (W)    70       1-70
CDF (W)      15       71-85
Geometry     11       86-96
Moments      48       97-144
Texture      13       145-157
Shape        54       158-211
For the details of these features, please refer to Reference [1].
(3) Find the training data set and build the SVM classifier via the menu 'Offline
SVM -> Offline SVM Training' (Fig. 13).
Figure 13: Related controls of building the SVM classifier.
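As an illustration of step (2), the index ranges in the table above can be used to assemble a feature selection in Matlab; the variable names here are hypothetical, not DCellIQ's own:

```matlab
% Hypothetical sketch: selecting feature indices by group, using the index
% ranges from the table above (e.g. geometry plus shape features).
geometryIdx = 86:96;                   % 11 geometry features
shapeIdx    = 158:211;                 % 54 shape features
selFeaInd   = [geometryIdx, shapeIdx]; % 65 selected feature indices in total
```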
C. Reclassification via the newly generated SVM classifier. Users can reclassify
the cells via the menu 'Offline SVM -> Phase Identification (offline)'.
4.5 Update the SVM classifier online. Users can update the existing SVM
classifiers, but they cannot change the features used in those classifiers. To update
the SVM classifier, users need to:
(1) Save the information of the cells whose phases were corrected via the menu
'Online SVM -> Save Corrected Cells'.
(2) Find the saved corrected-cell information and update the existing Online
SVM classifier via the menu 'Online SVM -> Online SVM Training'.
Users can then reclassify the cells with the updated Online SVM classifier via the
menu 'Online SVM -> Phase Identification (online)'.
4.6 Replace SVM Classifier Used in the DCellIQ Interface. After building or
updating their own SVM classifiers, users can manually update the SVM classifiers
used in the DCellIQ interface. Please note:
(1) The SVM classifier used in DCellIQ is stored in the directory
'CodesDir\SVM_Info\' as DCSVM.model, restore.txt, and SelFeaInd.mat,
where 'CodesDir' is the directory where the user stores the program codes
of DCellIQ.
(2) The user's newly generated SVM classifier information is stored in
'CodesDir\SVM_Info\OfflineSVM\' as OfflineSVM.model, restore.txt, and
SelFeaInd.mat.
(3) The user's updated SVM classifier information is stored in
'CodesDir\SVM_Info\OnlineSVM\' as OnlineSVM.model, restore.txt, and
SelFeaInd.mat.
If the user wants to replace the current SVM classifier of DCellIQ, i.e.
DCSVM.model, restore.txt, and SelFeaInd.mat, they must copy the newly generated
or updated SVM classifier information into the directory '…\CodesDir\SVM_Info\'
to replace the existing one (NOTE: rename OfflineSVM.model or OnlineSVM.model
to DCSVM.model).
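The copy-and-rename step can be sketched in Matlab as below; the install path is an example, and this snippet is an illustration rather than part of DCellIQ itself:

```matlab
% Hypothetical sketch of replacing the DCellIQ classifier with a newly
% trained offline one. 'codesDir' stands for your DCellIQ code directory
% (the example path is from the installation section of this guide).
codesDir = 'c:\Program Files\Matlab7.1\Toolbox\Dcelliq';
src = fullfile(codesDir, 'SVM_Info', 'OfflineSVM');
dst = fullfile(codesDir, 'SVM_Info');
% Copy the three classifier files, renaming the model file to DCSVM.model.
copyfile(fullfile(src, 'OfflineSVM.model'), fullfile(dst, 'DCSVM.model'));
copyfile(fullfile(src, 'restore.txt'),      fullfile(dst, 'restore.txt'));
copyfile(fullfile(src, 'SelFeaInd.mat'),    fullfile(dst, 'SelFeaInd.mat'));
```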
To update the SVM classifier online, there must be an existing SVM classifier in
the directory 'CodesDir\SVM_Info\OnlineSVM\'. The user can generate their first
OnlineSVM.model by copying OfflineSVM.model or DCSVM.model to that
directory. (Note: rename the copied SVM classifier to OnlineSVM.model.)
V. Introduction of the Processed Data
Suppose that ‘RawImageDataDir’ denotes the directory where the raw image
data are stored. The processed data are automatically stored at the same directory, and
they can be distinguished by their names which contain ‘_Res’.
Each ‘_Res’ corresponds to one data set (image sequence), and there are nine or
ten sub-directories, as seen in Fig. 14.
Figure 14: Organization of the processed data.
5.1 ResIntNor. In this directory, the intensity normalized images are stored. They
are stored in ‘.mat’ files. The users can read them using the Matlab ‘load’ function.
5.2 ResSeg. In this directory, the segmentation results are stored in '.mat' files.
Each segmentation result image is a matrix in which each object is represented by a
number, e.g. 10; the number '0' denotes the background. The maximum number
equals the number of objects in that frame. Users can read the files using the Matlab
'load' function.
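A toy Matlab sketch of how such a label matrix can be interpreted (in practice the matrix would come from 'load' on a file in ResSeg; here it is fabricated for illustration):

```matlab
% Toy label matrix standing in for one ResSeg segmentation result:
% 0 = background, k = pixels belonging to object k.
segImg = [0 1 1 0;
          0 1 0 2;
          3 0 0 2;
          3 3 0 0];
numObjects = max(segImg(:));      % maximum label = number of objects (here 3)
areaObj1   = sum(segImg(:) == 1); % pixel count (Area) of object 1 (here 3)
```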
5.3 ResCelMat. In this directory, the association matrices of objects in
consecutive frame pairs are stored. If the value of element (i,j) is true, the i-th object
in frame t is associated with the j-th object in frame (t+1); false means there is no
association between them. The matrices are stored in '.mat' files and can be read
using the Matlab 'load' function.
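A toy Matlab sketch of how such an association matrix can be interpreted; the matrix here is fabricated for illustration:

```matlab
% Toy association matrix A between frame t and frame t+1:
% A(i,j) is true when object i in frame t maps to object j in frame t+1.
A = false(3, 3);
A(1,1) = true;                     % object 1 -> object 1
A(2,3) = true;                     % object 2 -> object 3
[iObj, jObj] = find(A);            % list the associated object pairs
for k = 1:numel(iObj)
    fprintf('object %d (frame t) -> object %d (frame t+1)\n', iObj(k), jObj(k));
end
```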
5.4 ResCen. In this directory, the coordinates of the objects' mass centers in
each frame are stored. They are stored in '.mat' files. Users can read them using the
Matlab 'load' function.
5.5 ResFea. In this directory, the 211 features of each object in each frame are
stored. They are stored in ‘.mat’ files. The users can read them using the Matlab ‘load’
function.
5.6 ResPha. In this directory, the phase (class) information of each object in each
frame is stored in a matrix entitled 'Phase.mat'. In this matrix, each column
represents a frame: element (i,j) denotes the phase of the i-th object in the j-th frame.
The user can read it using the Matlab 'load' function.
If the users have corrected the phase identification results and saved them, there
will be another file entitled ‘Phase_Manual.mat’ which records the corrected phase
information.
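A minimal Matlab sketch of the Phase matrix layout described above; in practice the matrix comes from 'load' on 'Phase.mat', but here it is fabricated, and the phase codes are invented for illustration:

```matlab
% Toy Phase matrix: each column is a frame; element (i,j) is the phase of
% the i-th object in the j-th frame (phase codes here are invented).
Phase = [1 1 2;
         1 3 3];              % 2 objects tracked over 3 frames
phaseObj2Frame3 = Phase(2, 3);  % phase of object 2 in frame 3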
5.7 ResTra. In this directory, the tracking results are stored in a file entitled
'TraceResult.mat', which contains several variables. The 'TraceAll' variable is a
matrix in which each row is a trace, and its (i,j) element denotes the object (cell) ID
of the i-th trace in the j-th frame. The 'LabelAll' variable contains the phase
information corresponding to the objects in 'TraceAll'. The 'TraceIdAll' variable
contains the IDs of the traces.
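A hypothetical Matlab sketch of recovering the cell IDs along one trace from the 'TraceAll' matrix; the matrix is fabricated, and the convention that 0 marks frames where a trace is absent is an assumption:

```matlab
% Toy TraceAll matrix: row = trace, column = frame, entry = cell ID in that
% frame. ASSUMPTION: 0 marks frames where the trace is absent.
TraceAll = [3 4 4 5;
            0 2 2 2];                        % 2 traces over 4 frames
trace1Ids = TraceAll(1, :);                  % cell IDs of trace 1 per frame
trace2Ids = TraceAll(2, TraceAll(2,:) > 0);  % trace 2 with absent frames dropped
```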
5.8 ResPro. In this directory, there are four files: Celltrace_1.csv, Celltrace_2.txt,
Celltrace_3.csv, and Celltrace_4.csv. 'Celltrace_1.csv' records the time spent in each
phase by the object in every trace; 'Celltrace_2.txt' records the phase information of
the object in each trace; 'Celltrace_3.csv' records the feature information of the
object in each trace; and 'Celltrace_4.csv' records the traces that experience at least
one entire cell cycle, e.g. from the beginning of anaphase to the end of metaphase.
'Celltrace_4.csv' also records the time spent in each phase by the object.
5.9 ResMan. If the users have used the DCellIQui interface and saved the
corrected results using the menu 'Write Result -> Save as Excel', there will be
another sub-directory, 'ResMan'. Its content is the same as that of sub-directory
'5.8 ResPro'; however, the information has been corrected manually.
If there is any problem with the software, please find solutions at:
http://www.cbi-tmhs.org/Dcelliq/downloading.html .
March 2008
References:
1. Li F, Zhou X, Ma J, Wong STC, “Optimal Multiple Nuclei Tracking Using
Integer Programming for Quantitative Cancer Cell Cycle Analysis," IEEE
Transactions on Medical Imaging, 29(1):96-105, 2010.
2. Meng Wang, Xiaobo Zhou, Fuhai Li, Jeremy Huckins, Randy W. King,
Stephen T.C. Wong, “Novel cell segmentation and online SVM for cell cycle
phase identification in automated microscopy”, Bioinformatics, 24(1):94-101,
2008.
3. Fuhai Li, Xiaobo Zhou, Stephen T.C. Wong, “Novel Nuclei Segmentation and
Cell Phase Identification Using Markov Model”, International Symposium on
Computational Models for Life Sciences (CMLS), December 17-19, 2007,
Gold Coast, Queensland, Australia.
4. Fuhai Li, Xiaobo Zhou, Jinmin Zhu, Jinwen Ma, Xudong Huang and Stephen
T.C. Wong, “High content image analysis for H4 human neuroglioma cells
exposed to CuO nanoparticles”, BMC Biotechnology, 7:66, 2007.
5. Meng Wang, Xiaobo Zhou, Randy W. King, Stephen T.C. Wong, “Context
based mixture model for cell phase identification in automated fluorescence
microscopy”, BMC Bioinformatics, 8:32, 2007.
6. Xiaowei Chen, Xiaobo Zhou, Stephen T.C. Wong, “Automated segmentation,
classification, and tracking of cancer cell nuclei in time-lapse microscopy”,
IEEE Transactions on Biomedical Engineering, 53:762-766, 2006.
7. Zhou XB and Wong STC, "High content cellular imaging for drug
development", IEEE Signal Processing Magazine, 23(2):170-174, 2006.
8. Jun Yan, Xiaobo Zhou, Qiong Yang, Ning Liu, QianSheng Cheng, Stephen T.
C. Wong: An Effective System for Optical Microscopy Cell Image
Segmentation, Tracking and Cell Phase Identification. ICIP 2006: 1917-1920