Presented By Dr. Keith Haynes

Introduction
Appearance-Based Approach
Features
Classifiers
Face Detection Walkthrough
Questions

Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images. What does that mean? What are some computer vision tasks? Are there any faces in this image?

A recognition system: Sensing → Preprocessing → Feature Extraction → Classification → Post-Processing → Decision. The test subject is compared against a database of classes to produce a class label.

Images vary due to the relative camera-object pose (frontal, profile, etc.). Components may vary in:
◦ Size
◦ Shape
◦ Color
◦ Texture
Some objects have the ability to change shape, there are many possible objects, and scale and orientation also vary.

[Slide figure: a grayscale face image displayed as its raw pixel-intensity matrix]

As the number of dimensions increases, the volume of the space increases exponentially, and the data points occupy a volume that is mainly empty. Under these conditions, tasks such as estimating a probability density function become very difficult. In high dimensions the training set may not provide adequate coverage of the space.

Machine learning is the science of getting computers to act without being explicitly programmed.
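The emptiness of high-dimensional spaces can be illustrated with a short calculation: the fraction of a unit hypercube occupied by its inscribed hypersphere collapses toward zero as the dimension grows. A minimal sketch (the function name is illustrative, not from the slides):

```python
import math

def hypersphere_fraction(d):
    # volume of the hypersphere of radius 0.5 inscribed in the unit
    # hypercube, divided by the cube's volume (which is 1):
    #   V = pi^(d/2) * r^d / Gamma(d/2 + 1)
    r = 0.5
    return math.pi ** (d / 2) * r ** d / math.gamma(d / 2 + 1)

# the occupied fraction collapses as d grows:
# d = 2 -> ~0.785, d = 10 -> ~0.0025, d = 20 -> ~2.5e-8
```

Almost all of the cube's volume ends up in its corners, which is why training samples give such sparse coverage in high dimensions.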
Applications
◦ self-driving cars
◦ speech recognition
◦ effective web search
◦ understanding of the human genome

[Slide figure: a grayscale face image displayed as its raw pixel-intensity matrix]

Model-Based
◦ Uses 3D models to generate images
◦ Original and rendered images are compared for classification

Appearance-Based
◦ Learns how to classify images via training examples
◦ Training: a set of discriminatory features is learned from the training set
◦ Testing: feature extraction is performed on the test subject, and its feature representation is classified quickly and accurately to produce a class label

Features are learned through example images, usually known as a training set. 3D models are not needed. The approach utilizes machine learning and statistical analysis.

A feature is a calculation performed on a portion of an image that yields a number. Features are used to represent the entity being analyzed.

[Slide figure: a grayscale face image displayed as its raw pixel-intensity matrix]
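As a toy illustration of this definition (the function names and regions are hypothetical, not from the slides), a feature can be as simple as the mean intensity of an image region, and a list of such numbers represents the image:

```python
import numpy as np

def region_mean(img, r0, c0, h, w):
    # a feature: one number computed from a portion of the image
    return float(img[r0:r0 + h, c0:c0 + w].mean())

def feature_vector(img, regions):
    # the entity is represented by the values of several features
    return [region_mean(img, *reg) for reg in regions]

demo = np.arange(16, dtype=float).reshape(4, 4)   # stand-in 4x4 "image"
vec = feature_vector(demo, [(0, 0, 2, 2), (2, 2, 2, 2)])
```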
A Haar feature computes the difference between the sums of two or more rectangular areas, for example an edge detector built from adjacent rectangles A, B, C, and D. Applied to the face image, this feature yields the value -328.

[Slide figure: the feature overlaid on the pixel-intensity matrix; a second feature, with + and - weighted regions, yields the value 262]

Feature representation is determined by:
◦ the task being performed
◦ performance constraints such as accuracy and calculation time

Two groups:
◦ Global – the feature uses the entire image
◦ Local – the feature uses parts of the image

Feature selection attempts to identify the critical areas of a set of images for class discrimination. How are critical areas identified?
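The rectangle sums behind such features are usually computed with an integral image, which turns any rectangle sum into a few table lookups. A minimal sketch of a two-rectangle (edge-style) feature, assuming NumPy and hypothetical window coordinates:

```python
import numpy as np

def integral_image(img):
    # ii[r, c] = sum of img[0:r+1, 0:c+1]
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    # sum of img[r0:r1, c0:c1] from at most four integral-image lookups
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return int(total)

def haar_edge(img, r0, c0, h, w):
    # two-rectangle feature: left half minus right half of the window
    ii = integral_image(img)
    left = rect_sum(ii, r0, c0, r0 + h, c0 + w // 2)
    right = rect_sum(ii, r0, c0 + w // 2, r0 + h, c0 + w)
    return left - right

val = haar_edge(np.arange(16).reshape(4, 4), 0, 0, 4, 4)
```

Once the integral image is built, every feature evaluation is constant time regardless of the rectangle size, which is what makes the large searches below feasible.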
Requires an exhaustive search of the possible sub-windows:

Height   Width   Possible Features    Time
1        1       1                    < 1 ms
2        2       9                    < 1 ms
3        3       36                   < 1 ms
4        4       100                  < 1 ms
24       24      90,000               0.2 ms
128      128     68,161,536           0.1 sec
256      256     1,082,146,816        1.19 sec
512      512     17,247,043,584       18.48 sec

A 2 MP image has 922,944,480,000 possible features; the search took 16.45 min.

A single Haar feature is a weak classifier; a set of features can form a strong classifier.

Features in Set   Number of Sets
1                 90,000
2                 8,099,910,000
3                 7.28976E+14
4                 6.56056E+19
5                 5.90424E+24
6                 5.31352E+29
7                 4.78185E+34
8                 4.30333E+39
9                 3.87266E+44
10                3.48504E+49

Exhaustive search
◦ For 5 features, 5.9 × 10^24 unique sets

Greedy search: find the best features one at a time.
◦ Find the first best feature
◦ Find the feature that works best with the first feature, and so on
◦ For 5 features, only 449,990 sets are searched
◦ Increase step size

Together the selected features form a strong classifier.

[Slide figure: a grayscale face image displayed as its raw pixel-intensity matrix]
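The set counts above can be reproduced directly: the number of ordered, duplicate-free feature sets is a falling factorial, while greedy selection only scans the remaining candidates at each of the k steps. A small sketch (the function names are illustrative):

```python
def exhaustive_sets(n, k):
    # ordered selections without repetition: n * (n-1) * ... * (n-k+1)
    count = 1
    for i in range(k):
        count *= n - i
    return count

def greedy_searched(n, k):
    # greedy picks one feature per step, scanning the remaining candidates
    return sum(n - i for i in range(k))
```

With 90,000 candidate features, exhaustive_sets(90000, 2) reproduces the 8,099,910,000 entry in the table, and greedy_searched(90000, 5) gives the 449,990 figure quoted above.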
Original Image → Feature Set: 47, -229, -498, 179, 106, -157, -346, 11, 24, -99, -257, 423

Feature selection is important and application dependent. Statistical methods are very useful with high dimensionality. Local features identify discriminating areas of the images. There is no universal solution. Features can be combined.

Classifiers include:
◦ Linear Discriminant Analysis
◦ Fisher Discriminant Analysis
◦ Bayesian Classifier
◦ Neural Networks
◦ K-Nearest Neighbor Classifier

Features can be used to form a coordinate space called the feature space. Euclidean distance is used as the metric:

d(x1, x2) = sqrt((x11 - x12)^2 + ... + (xd1 - xd2)^2)

The distance is not used directly for feature selection. The higher the ratio, the better the filter. To prevent one class from dominating, an exponential function was used, and the sum of the function over all test images was used for selection [Liu, Srivastava, Gallivan].

[Slide figure: feature-space plots labeled "Separation and grouping", "Better Classification", and "Low Classification Rates"]

"Divide and Conquer": instead of trying to solve a difficult problem all at once, divide it into several parts. Each of the resulting parts should be easier to solve than the original problem. This performs classifications fast.

[Slide figure: a classification tree; the root holds classes 1,..,20, split into subsets such as {1,13,15,18}, {3,4,8,20}, {5,6,9,10}, and {2,7,11,12,14,16,17,19}, down to leaves such as {1,15}, {13,18}, {2,12}, {7,16}, {14,19}, and {11,17}; a dash indicates that all children are leaf nodes]

Principal Component Analysis (PCA) is a classical technique that is widely used for image compression and recognition.
◦ Produces features with a dimensionality significantly less than that of the original images
◦ The reduction is performed without a substantial loss of the data contained in the image
◦ The analysis is based on the variance of the dataset; variance implies a distinction in class

Feature Set (47, -229, -498, 179, 106, -157, -346, 11, 24, -99, -257, 423) × PCA Matrix → (478, -367, 206, -358, 386), a point in a lower-dimensional space.

In many cases the PCA reduction was not sufficient, so improving the performance of the reduction matrix is necessary. Four methods were implemented:
◦ Gradient Search
◦ Random or Vibration Search (a variation of the Metropolis algorithm)
◦ Neighborhood Component Analysis
◦ Stochastic Gradient Search

Data reduction occurs via a matrix multiplication: x′ = xA. Optimization is achieved by:
◦ defining F as a function of A, F(A)
◦ changing A

Gradient search can be computationally expensive and does not provide a means to escape a local maximum. Random search makes a guess; guesses are fast, and there is a possibility of escaping a local maximum. Restricting the search area increases the probability of finding an increasing path.

[Slide figure: f(x) curves marking a local maximum and the global maximum]

If all data at the node can be classified
accurately:
◦ The classification decision is stored as a leaf node
◦ No further processing occurs down this branch of the tree

If the data cannot be classified accurately:
◦ The problem then becomes a clustering one
◦ Accuracy is now defined in terms of clusters, not classes

The accuracy achieved through class-level clustering will always be no worse than that of individual classes and, in most cases, will be higher.

[Slide figures: feature-space plots of samples 1-10 showing the decision boundary prior to clustering and after clustering, with the clusters mapped onto tree nodes R, A, B, C, D, and E]

Classifier        Dataset         Accuracy   Throughput
KNN               ORL             100%       16,735
Neural Network    ORL             92.50%     53,522
SVM               ORL             86.25%     7,251
RCT               ORL             97%        982,110
KNN               COIL            100%       19,204
Neural Network    COIL            89%        53,709
SVM               COIL            97.75%     3,790
RCT               COIL            100%       3,781,933
SVM               Breast Cancer   96.96%     141,240
RCT               Breast Cancer   97.35%     8,949,638

References

X. Liu, A. Srivastava, K. Gallivan. "Optimal Linear Representations of Images for Object Recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2004, pp. 662-666.
Haynes, K., Liu, X., Mio, W. (2006). "Object Recognition Using Rapid Classification Trees." 2006 International Conference on Image Processing.
Haynes, K. (2011). "Using Image Steganography to Establish Covert Communication Channels." International Journal of Computer Science and Information Security, Vol. 9, No. 9.
Duda, R., Hart, P., Stork, D. (2001). Pattern Classification. Wiley-Interscience, NY.
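The reduction and matching steps described earlier (x′ = xA, followed by Euclidean nearest-neighbor matching in the feature space) can be sketched as follows; the matrix A and the stored prototypes here are toy stand-ins, not values from the experiments:

```python
import numpy as np

def reduce_features(x, A):
    # data reduction via matrix multiplication: x' = xA
    return x @ A

def nearest_class(x_reduced, prototypes, labels):
    # Euclidean distance in the feature space selects the closest class
    dists = np.linalg.norm(np.asarray(prototypes) - x_reduced, axis=1)
    return labels[int(np.argmin(dists))]

# toy 3 -> 2 reduction matrix and two stored class prototypes
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
label = nearest_class(reduce_features(np.array([1.0, 2.0, 3.0]), A),
                      [[0.0, 0.0], [1.0, 2.0]], ["class0", "class1"])
```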