Dictionary learning framework for fabric defect detection

Jian Zhou (a, b), Dimitri Semenovich (b), Arcot Sowmya (b) & Jun Wang (a, c)

a College of Textiles, Donghua University, Shanghai, China.
b School of Computer Science and Engineering, University of New South Wales, Sydney, Australia.
c Key Laboratory of Textile Science & Technology, Ministry of Education, Shanghai, China.

Publisher: Taylor & Francis. Published online: 30 Sep 2013.

To cite this article: Jian Zhou, Dimitri Semenovich, Arcot Sowmya & Jun Wang (2014) Dictionary learning framework for fabric defect detection, The Journal of The Textile Institute, 105:3, 223-234, DOI: 10.1080/00405000.2013.836784

To link to this article: http://dx.doi.org/10.1080/00405000.2013.836784
The Journal of The Textile Institute, 2014, Vol. 105, No. 3, 223–234, http://dx.doi.org/10.1080/00405000.2013.836784

(Received 16 May 2013; accepted 9 August 2013)

We present a new approach that uses a dictionary learning framework to address textile fabric defect detection. Textile fabrics are textured materials whose images exhibit high periodicity among the repeated sub-patterns determined by the weaving structure. Inspired by image de-noising with learned dictionaries, we learn a dictionary from patches of textile fabric images; such a dictionary is able to approximate training samples well through a linear summation of its elements. Fabric defects can be regarded as a local anomaly against the relatively homogeneous texture.
When new samples are modelled with a dictionary learned only from examples of normal fabric, the approximation of an abnormal or defective sample no longer contains the defective region, because the learned dictionary has been tuned to the structural features of normal fabric; the approximation of a defective sample is therefore less similar to its original than that of a normal sample. Simply measuring the similarity between the original and its approximation is thus able to discriminate defective samples from normal ones efficiently, and a recently developed novelty detection algorithm, the support vector data description, is used to handle the classification task. Experimental results show that the proposed algorithm can keep both the false alarm rate and the missing detection rate within 5%; extensions of the method are also investigated.

Keywords: fabric defect detection; dictionary learning; novelty detection; image approximation

Introduction

In the modern textile industry, defect detection on raw woven fabrics is important for visual quality control, as the presence of defects has a great impact on the cost and grading of the final product. Because manual inspection suffers from both low efficiency and high labour intensity, automated inspection using computer vision has drawn considerable attention in recent decades, and numerous algorithms have been proposed to address the problem of fabric defect detection. Surveys can be found in Kumar (2008) and Ngan, Pang, and Yung (2011).

Fabric defects are mainly caused by unplanned machine malfunctions or faulty yarns during the weaving process. Owing to the characteristics of the weaving process used in fabric formation, images of normal textile fabrics are dominated by texture, exhibiting high periodicity in the warp and weft directions. If a defect occurs in a fabric, the local regularity (periodicity) of the fabric is disrupted, causing a local anomaly against its homogeneous texture background. Such an anomaly manifests in various ways, e.g.
local structure or intensity change, and in a broad sense, a defect (anomalous region) in a fabric image can be defined as any abnormality deviating from the homogeneous texture. Therefore, fabric defect detection can be considered as a problem of identifying anomalous pixels/blocks in image samples.

*Corresponding author. Email: junwang@dhu.edu.cn. © 2013 The Textile Institute

In terms of detection strategy, fabric defect detection schemes can be loosely categorised into feature-extraction-based and non-feature-extraction-based. In the first approach, feature extraction is of great importance: it involves extracting efficient textural features in the spatial and/or spectral domains that are capable of robustly characterising background fabric textures while remaining sensitive to the abnormal regions caused by defects. In the spatial domain, some commonly used texture features for detecting defects include neighbouring information (Kumar, 2003), the grey-level co-occurrence matrix (Wen, Chiu, Hsu, & Hsu, 2001) and fractal dimensions (Bu & Huang, 2008; Sari-Sarraf & Goddard, 1999); in the spectral domain, Fourier spectral features (Chan, 2000), wavelet transform coefficients (Kim & Kang, 2007; Yang, Pang, & Yung, 2004), Gabor wavelet features (Hou & Parker, 2007) and auto-regressive spectral features (Bu, Huang, Wang, & Chen, 2010) are used. However, the feature extraction approach always faces a feature selection problem, since there is no straightforward way to guarantee the optimality of the features used when no negative samples (defects) are involved in the training process.
Among the non-feature-extraction detection schemes, Gabor filtering is considered to be the most successful approach for detecting fabric defects (see Arivazhagan, Ganesan, & Bama, 2006; Escofet, Navarro, & Pladellorens, 1998; Kumar & Pang, 2002), in that it does not need an explicit feature extraction stage but uses a set of optimised Gabor filters and segments defects directly from the filtered images. However, the choice of filter parameters is a complicated task, since the detection performance relies heavily on how well the filters match, or can be tuned to, the properties of a specific defect, e.g. its scale and orientation. Although several methods have been developed for the optimisation of filter banks (Bodnarova, Bennamoun, & Latham, 2002; Hou & Parker, 2007; Kumar & Pang, 2002; Mak & Peng, 2008; Srikaew, Attakitmongcol, Kumsawat, & Kidsang, 2011), prior knowledge (e.g. templates or defects) is still required for optimising the filters: artificial defects were involved in training filters (Hou & Parker, 2007); limited defect types were used to optimise filter parameters (Kumar & Pang, 2002); and non-defective samples were required to optimise filters using a Gabor wavelet network (Mak & Peng, 2008; Srikaew, Attakitmongcol, Kumsawat, & Kidsang, 2011).

Ideally, if a normal fabric has a homogeneous sub-pattern and a stable cycle, template matching is likely to offer a good solution to the defect detection problem. However, given the significant degree of stochastic variation in normal fabric samples, choosing a suitable matching template and alignment makes simple template matching a challenging approach to defect detection, and little work has been done in this direction. Inspired by image de-noising techniques using learned dictionaries and sparse representation (Elad & Aharon, 2006), we present a fabric defect detection framework based on dictionary learning. The proposed method is quite similar to template matching but uses an adaptive template.
In the image de-noising application, the target images are usually natural images and a small patch size (e.g. 8 × 8) is used to capture local image features. In contrast, textile fabric images are essentially synthetic texture images with repetitive (not random) spatial structures, so that learning a dictionary from large patches (e.g. 36 × 36) is effective for approximating training samples well even with a very small dictionary. The proposed detection framework involves learning a dictionary from defect-free samples; such a dictionary admits a linear representation of the training patches as a summation of its elements. When unknown samples are modelled with the learned dictionary, normal patches can be approximated well, since the dictionary has captured the key features of normal samples while allowing for their natural variation. On the other hand, anomalous patches (patches with defects) containing structural features not found in normal samples cannot be modelled well, which causes a substantial difference between the original and its approximated version. The classifier adopted in this paper is the support vector data description (SVDD), a recently developed novelty detection algorithm. The proposed method has the following advantages:

(1) Taking advantage of the repetitive texture, the learned dictionary can efficiently model the natural variation of fabric textures.
(2) By using the learned dictionary for approximation, complex fabric defect detection can be converted into a template-matching problem, and the defects can be identified in a more natural way, without needing to consider defect and fabric types, which alleviates the feature selection problem to some extent.
(3) Computational simplicity in the (online) detection stage makes the method suitable for real-time applications.

Dictionary learning

The basic idea of approximating a signal is to find the basis functions or vectors (called a dictionary) whose linear combination approximates the signal as closely as possible, subject to certain constraints.
Basically, finding such a dictionary can be viewed as an optimisation problem whose objective measures the quality of the approximation, most commonly with the $l_2$ norm. Suppose that there is an $M \times N$ data matrix $X = [x_1, x_2, \ldots, x_N]$, $x_i \in \mathbb{R}^M$, that contains $N$ vectors of dimension $M$ in its columns. Then, when the $l_2$ norm is used, the problem of finding such a dictionary can be generally formulated as follows:

$$\min_{D,\,\alpha} \sum_{i=1}^{N} \|x_i - D\alpha_i\|_2^2, \quad \text{s.t. } \forall j,\ \|d_j\| \le 1, \tag{1}$$

where $D = [d_1, d_2, \ldots, d_K]$, $d_j \in \mathbb{R}^M$, is the dictionary to be learned, $\alpha_i \in \mathbb{R}^K$ is the coefficient vector for $x_i \in X$, and the constraint on $d_j$ prevents $D$ from having arbitrarily large values. In this work, since we only use data points $x_i$ obtained from normal fabric images to learn the dictionary with Equation (1), the learned dictionary $D$ will only capture the structural features of normal fabrics.

In Equation (1), both $D$ and $\alpha$ are unknown, and finding $D$ is referred to as a dictionary learning problem (Mairal, Bach, Ponce, & Sapiro, 2010), which is associated with matrix factorisation. This can be made more explicit by rewriting Equation (1) in matrix form:

$$\min_{D,\,A} \|X - DA\|_F^2, \tag{2}$$

where $A = [\alpha_1, \alpha_2, \ldots, \alpha_N]$, $\alpha_i \in \mathbb{R}^K$, is the coefficient matrix. Equation (2) is a generic matrix factorisation problem, which is also related to singular value decomposition (SVD), independent component analysis (ICA) and non-negative matrix factorisation (NMF), depending on the constraints and/or objective functions imposed. SVD constrains the basis functions of $D$ to be orthogonal, and ICA constrains the basis functions of $D$ to be statistically independent. NMF enforces all elements in both $D$ and $A$ to be non-negative. Since dictionary learning can be easily extended to solve the matrix factorisation problem, e.g.
NMF and sparse coding (Mairal et al., 2010), the aforementioned problems can be formulated with some extra constraints as follows:

Singular value decomposition

$$\min_{D,\,\alpha} \sum_{i=1}^{N} \|x_i - D\alpha_i\|_2^2, \quad \text{s.t. } D^T D = I, \tag{3}$$

where the dictionary elements are orthogonal to each other.

Non-negative matrix factorisation

$$\min_{D,\,\alpha} \sum_{i=1}^{N} \|x_i - D\alpha_i\|_2^2, \quad \text{s.t. } D \ge 0,\ \forall i,\ \alpha_i \ge 0, \tag{4}$$

where a non-negativity constraint is imposed on $D$ and $\alpha$.

Sparse representation

Sparse representation (SP) (Elad, 2010) can be seen as an extension of Equation (1), and has recently led to state-of-the-art results in signal-processing tasks such as image de-noising (Elad & Aharon, 2006) and in classification tasks (Yang, Yu, Gong, & Huang, 2009; Yang, Wang, & Huang, 2011). In this case, an over-complete dictionary ($K > M$) is used to represent a signal in a sparse fashion, i.e. using as few elements as possible to achieve the best approximation. To create a sparse representation, an additional penalty measuring the sparseness of the coefficients $\alpha_i$ is introduced, which can be formulated as:

$$\min_{D,\,\alpha} \sum_{i=1}^{N} \left( \|x_i - D\alpha_i\|_2^2 + \lambda \|\alpha_i\|_1 \right), \quad \text{s.t. } \forall j,\ \|d_j\| \le 1, \tag{5}$$

where $\lambda$ is a regularisation parameter controlling the trade-off between approximation error and sparsity.

Novelty detection using SVDD

Since fabric defects differ widely in their characteristics and scale, it is impossible to collect all potential types of defects to train an algorithm, so it is justified to handle defect detection as a novelty detection task (Markou & Singh, 2003a, 2003b). Novelty detection, also called one-class classification or outlier detection, involves learning a model that describes normal data or normal status, and can therefore reject any potential anomalous events that deviate from normality. Novelty detection is useful for applications which have plenty of normal samples, while negative samples (e.g. all possible types of defects) are few or difficult to obtain.

SVDD is an effective algorithm for novelty detection (Tax, 2001). The underlying idea of SVDD is to find a minimal-volume hyper-sphere which encloses a certain proportion of the training samples. Given a training set $\{x_i\}_{1 \le i \le N}$, with $N$ the total number of training samples, the primal SVDD problem is defined as follows (Tax, 2001):

$$\min_{R,\,a,\,\xi} \; R^2 + \frac{1}{\nu N} \sum_{i=1}^{N} \xi_i, \quad \text{s.t. } \|x_i - a\|^2 \le R^2 + \xi_i,\ \xi_i \ge 0,\ \forall i, \tag{6}$$

where $R$ and $a$ are the hyper-sphere radius and centre, respectively; $\nu \in (0, 1)$ is the parameter controlling the trade-off between the sphere volume and the errors; and $\xi_i$ are slack variables that allow for the possibility of outliers. Since Equation (6) can only create a spherical boundary, which is restrictive for many applications, kernel functions are usually introduced to create a more flexible data description, e.g. the Gaussian kernel function $\kappa(x, y) = \exp(-\gamma \|x - y\|^2)$. The kernelised version of Equation (6) can be obtained via its dual:

$$\min_{\alpha} \; \sum_{i,j} \alpha_i \alpha_j \kappa(x_i, x_j) - \sum_{i} \alpha_i \kappa(x_i, x_i), \quad \text{s.t. } \sum_{i} \alpha_i = 1,\ 0 \le \alpha_i \le \frac{1}{\nu N},\ \forall i, \tag{7}$$

where $\kappa(\cdot, \cdot)$ is the kernel function and $\alpha_i$ are the Lagrange multipliers. This is a quadratic programming problem and can be solved by standard algorithms. Let $\alpha \in \mathbb{R}^N$ be a solution of Equation (7); then, heuristically, the majority of its entries will be zero (the constraints in this problem are analogous to an $l_1$-norm penalty term). The data points $x_i$ with $\alpha_i > 0$ are called the support vectors. Suppose that $y$ is an unlabelled sample; then $y$ will be classified as an outlier if its distance from the centre $a$ exceeds the hyper-sphere radius $R$:

$$\kappa(y, y) - 2 \sum_{i} \alpha_i \kappa(x_i, y) + \sum_{i,j} \alpha_i \alpha_j \kappa(x_i, x_j) > R^2. \tag{8}$$

Considering the simplicity and flexibility of the Gaussian kernel, in this paper we also employ the Gaussian kernel function $\kappa(x, y) = e^{-\gamma \|x - y\|^2}$.
When the Gaussian kernel is used, there are two parameters, $\nu$ and $\gamma$, that need to be determined before training SVDD according to Equation (7). $\nu \in (0, 1)$ is a user-specified parameter referring to the rejection fraction of training samples, which can be used to control the false alarm rate (FAR) (Bu, Wang, & Huang, 2009).

Fabric defect detection using learned dictionary

Image patch extraction

The proposed framework is based on image patches, which means that any image patch containing a defective region will be identified as a defect. Image patches are obtained in both overlapping and non-overlapping ways; patch division with overlapping is illustrated in Figure 1. Non-overlapping division means that the patches are taken sequentially from the image samples without overlap. In this paper, non-overlapping division only appears in the SVDD training stage (see the following section). To apply Equation (1) to learn a dictionary for patch representation, each image patch is converted into a column vector by concatenating the columns of the patch, as shown in Figure 1. Hereafter, unless otherwise specified, "image patch" refers to this column vector.

Figure 1. Image patch extraction and column vector formation.

Proposed approach

The whole detection framework is illustrated in Figure 2, and consists of the following stages:

Dictionary learning stage

In this step, non-defective fabric images are partitioned into small overlapping patches of w × w pixels, which allows a wider range of fabric structures to be included in the training data. The dictionary can be learned from the matrix whose columns are image patches by Equation (1) (or Equations (3)–(5)). Since the dictionaries are learned from normal fabrics, the learned dictionary will only capture the normal structural features. Two groups of dictionary elements, learned from a twill fabric and a plain fabric, are shown in Figure 3.
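As a rough illustration of the dictionary learning stage, the sketch below minimises the objective of Equation (1) by plain alternating least squares, rescaling each dictionary element to unit norm after every update. It is a minimal stand-in written in Python with NumPy, not the SPAMS solver used in the paper; the function name `learn_dictionary`, its parameters and the synthetic periodic data are assumptions made purely for illustration.

```python
import numpy as np


def learn_dictionary(X, K, n_iter=20, seed=0):
    """Alternating least squares for min ||X - D A||_F^2 with unit-norm elements.

    X is an M x N matrix holding one vectorised patch per column.
    This is an illustrative solver, not the SPAMS algorithm.
    """
    rng = np.random.default_rng(seed)
    # Initialise the dictionary with K randomly chosen training patches.
    D = X[:, rng.choice(X.shape[1], size=K, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    errors = []
    for _ in range(n_iter):
        # Coefficient step: least-squares coefficients for the current D.
        A = np.linalg.lstsq(D, X, rcond=None)[0]
        errors.append(float(np.linalg.norm(X - D @ A) ** 2))
        # Dictionary step: least-squares dictionary for the current A.
        D = np.linalg.lstsq(A.T, X.T, rcond=None)[0].T
        # Rescale each element to unit norm so that ||d_j|| <= 1 holds.
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    A = np.linalg.lstsq(D, X, rcond=None)[0]
    return D, A, errors


# Synthetic stand-in for vectorised fabric patches: periodic columns plus noise.
rng = np.random.default_rng(1)
t = np.arange(64)
basis = np.stack(
    [np.sin(2 * np.pi * t / 8), np.cos(2 * np.pi * t / 8), np.ones(64)], axis=1
)
X = basis @ rng.standard_normal((3, 200)) + 0.05 * rng.standard_normal((64, 200))

D, A, errors = learn_dictionary(X, K=4)
```

Rescaling the columns of `D` does not change the subspace they span, so the recorded objective value cannot increase from one iteration to the next; a dedicated solver would typically be preferred for real patch data.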
Note that all the dictionary elements shown in Figure 3 are reshaped back into w × w patches for display.

SVDD training stage

A collection of w × w pixel image patches is again obtained from defect-free images (this time without overlapping); the learned dictionary is used to approximate these patches, and similarity-based features are extracted to train the SVDD classifier.

Testing stage

To perform defect detection, overlapping patches are collected from the newly acquired fabric images and features are computed as above. The patches are then labelled with the trained SVDD classifier.

Figure 2. Overview of defect detection framework.

Feature extraction

To better understand how the proposed detection framework can discriminate defects from normal samples, some approximation examples using a dictionary learned from Equation (1) are presented in Figure 4. Figure 4(a) shows defect-free patches and Figure 4(d) shows defective versions of (a). The residual images presented in Figure 4(c) and (f) show the difference between the originals and their approximated versions. For defect-free patches, the approximated versions exhibit a high degree of similarity to their originals (see the slight differences in Figure 4(c)); for the defective patches, the defective regions do not appear in the approximated versions, causing lower similarity to their originals (see the large differences in Figure 4(f)). This is because the dictionaries have been tuned to normal structural features; the error caused by defective regions, which were not seen in the learning stage, is spread over the whole approximation in the attempt to make the overall error as small as possible. This motivates us to use the similarity between the original and its approximated version as a feature to discriminate defects.
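The effect described above can be reproduced on a toy example. The sketch below, in Python with NumPy, approximates a vectorised patch by least squares over the columns of a dictionary and then measures similarity with the Euclidean distance and the Pearson correlation coefficient. The dictionary here is a hand-built set of period-8 sinusoids rather than a learned one, and the "defect" is a simulated local intensity offset; both are assumptions for illustration only.

```python
import numpy as np


def approximate(D, y):
    """Least-squares approximation of patch vector y using the columns of D."""
    coeffs = np.linalg.lstsq(D, y, rcond=None)[0]
    return D @ coeffs


def similarity(y, y_hat):
    """Euclidean distance and Pearson correlation between y and y_hat."""
    distance = float(np.linalg.norm(y - y_hat))
    correlation = float(np.corrcoef(y, y_hat)[0, 1])
    return distance, correlation


# Hypothetical dictionary for a texture with period 8 (hand-built, not learned).
t = np.arange(64)
D = np.stack(
    [np.ones(64), np.sin(2 * np.pi * t / 8), np.cos(2 * np.pi * t / 8)], axis=1
)

rng = np.random.default_rng(0)
normal = D @ np.array([0.5, 1.0, 0.3]) + 0.01 * rng.standard_normal(64)
defective = normal.copy()
defective[20:28] += 1.5  # simulated local defect

dist_normal, corr_normal = similarity(normal, approximate(D, normal))
dist_defect, corr_defect = similarity(defective, approximate(D, defective))
```

In this toy setting the defective patch yields a much larger distance and a lower correlation than the normal one, because the dictionary cannot reproduce the local offset.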
Let $y$ be a column vector representing one of these patches; its approximation $\hat{y}$ can be computed by solving the following least-squares problem:

$$\alpha^{*} = \arg\min_{\alpha} \|y - D\alpha\|_2^2, \qquad \hat{y} = D\alpha^{*}. \tag{9}$$

In our framework, the similarity between $y$ and its approximation $\hat{y}$ can then be measured directly. We select the Euclidean distance and the correlation coefficient as similarity measurements, which are expected to capture intensity variation and structural disparities between $y$ and $\hat{y}$. The two features are calculated as follows.

Feature 1: Euclidean distance

$$F_1 = \|y - \hat{y}\|_2$$

Feature 2: Correlation coefficient

$$F_2 = \frac{(y - \mu_y)^T (\hat{y} - \mu_{\hat{y}})}{w^2 \sigma_y \sigma_{\hat{y}}},$$

where $\mu_y$, $\mu_{\hat{y}}$ and $\sigma_y$, $\sigma_{\hat{y}}$ are the means and standard deviations of the vectors $y$ and $\hat{y}$, respectively. Note that all the features are normalised onto [0, 1] using Softmax scaling (Theodoridis & Koutroumbas, 2008), which is given by:

$$x' = \frac{1}{1 + \exp\left(-\dfrac{x - \mu_x}{q\,\sigma_x}\right)}, \tag{10}$$

where $\mu_x$ and $\sigma_x$ are the mean and standard deviation of $x$, respectively, and $q$ is a constant set to 5 in this work. The normalised features are then used to train the SVDD classifier with the Gaussian kernel.

Figure 3. Dictionary elements of normal fabric samples: (a) and (c) are images of a twill fabric and a plain fabric, respectively; (b) and (d) are 12 dictionary elements learned from (a) and (c), respectively.

Figure 4. Eight approximation examples. (a) defect-free patches; (b) approximated patches corresponding to (a); (c) residual images; (d) defective patches; (e) approximated patches corresponding to (d); (f) residual images.

Dictionary size

The cardinality $K$ of the dictionary $D$ is the key input to our method and needs to be determined before the learned dictionary is used to approximate image patches. The parameter $K$ directly controls the magnitude of the approximation error, which decreases monotonically for all patches as $K$ increases.
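The dependence of the approximation error on K can be checked numerically in the orthogonal case of Equation (3), where the K-element dictionary can be taken as the first K left singular vectors of the data matrix. The sketch below, in Python with NumPy, uses random data as a stand-in for vectorised patches; the variable names and the 6 × 6 patch size are assumptions, and the exact per-patch monotonicity shown here is only guaranteed for such nested orthonormal dictionaries.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((36, 200))  # stand-in for 200 vectorised 6 x 6 patches

# Orthonormal dictionary elements, ordered by decreasing singular value.
U, s, _ = np.linalg.svd(X, full_matrices=False)


def patch_errors(U, X, K):
    """Squared approximation error of every patch using the first K elements."""
    D = U[:, :K]
    X_hat = D @ (D.T @ X)  # least-squares approximation for orthonormal D
    return np.sum((X - X_hat) ** 2, axis=0)


# Row k holds the per-patch errors for a dictionary of size k + 1.
errors = np.stack([patch_errors(U, X, K) for K in range(1, 17)])
```

Each row of `errors` is no larger, entry by entry, than the row above it, and the row sums equal the energy of the discarded singular values.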
To achieve effective defect detection, we wish to find a dictionary size such that "normal" patches can be approximated well, while anomalous patches cannot. More specifically, a large value of $K$ increases the undesired possibility of capturing the appearance of anomalous regions, which lowers the discriminative power for identifying defective patches, while a value of $K$ that is too small results in under-fitting of normal patches, leading to a high FAR. Since negative samples are not available for optimising the parameter $K$, we can only rely on the positive samples to find its value. In this work, we determine the dictionary size heuristically, based on the marginal reduction in the approximation error. To quantify the quality of an approximation, we define a metric, termed the approximation index (AI), with the following formulation:

$$AI = \frac{M_A}{M_O} \times 100\%, \qquad M_A = \frac{\sum_{i=1}^{N} \sigma_{\hat{x}_i}}{N}, \qquad M_O = \frac{\sum_{i=1}^{N} \sigma_{x_i}}{N}, \tag{11}$$

where $\sigma_{x_i}$ and $\sigma_{\hat{x}_i}$ denote the standard deviations of the intensities of the original image patches $x_i$ and their approximations $\hat{x}_i$, with $N$ being the total number of patches. The approximation index lies in the range of 1–100%, attaining the value of 100% when the approximated patches are identical to the originals.

Figure 5. Plots of AI: (a) a twill fabric; (b) and (c) are plain fabrics; (d), (e) and (f) are AI plots of (a), (b) and (c).

Figure 6. Approximation results of a defective image: rows from top to bottom using patch size 16 × 16, 26 × 26 and 36 × 36, respectively.

To evaluate how well the defined metric quantifies approximation quality, we compute the AI of three fabric types with varied regularity (see Figure 5(a)–(c)) using different patch sizes. It can be readily observed that the
twill fabric, which exhibits the highest regularity of texture, can be approximated well with only a small dictionary, while the two plain fabrics need more dictionary elements to capture stochastic variations. In addition, using smaller patch sizes (e.g. 16 × 16) results in a larger AI, as smaller patches contain fewer variations and are easier to approximate with a small number of dictionary elements. Approximation results for a defective sample using different values of $K$ and patch sizes are shown in Figure 6. It can be seen that the smaller patches (16 × 16) model the original image more efficiently than larger ones (26 × 26 or 36 × 36) for the same dictionary size, e.g. given $K = 20$, Figure 6(e) accounts for more particulars of the defective region than Figure 6(k) or (q).

In this work, we chose the $K$ that accounts for around 60% of the AI, which was found empirically to be robust and to perform well for various fabric textures and defect types. To prevent the selection of too small a $K$ (e.g. some twill fabrics can achieve 60% of AI with only two elements), we set the lower bound on $K$ to 4 and the upper bound to 16.

Experiments

Data-set

All textile fabric images used in this work were captured from a production line, and each individual image has a size of 512 × 512 with 256 grey levels. We chose nine different types of textile fabric images (referred to as D1 through D9). Since defects such as dirt and holes, which cause significant changes in recorded intensity, are easy to identify, this study focuses only on those defect types that manifest through structural changes or slight grey-level changes, such as miss pick, slack end, double filling, thread ends, etc. Each data-set (D1-D9) consists of 12 defect-free images and three images containing anomalies.
Two normal images of each type were used for learning the dictionary; one half of the remainder was used for training the SVDD classifier, and the other half, together with the images containing anomalies, was used for testing.

In the dictionary learning stage, the number of dictionary elements $K$ for Equation (1) is chosen according to the AI criterion discussed previously. To acquire training samples, we use two 512 × 512 defect-free images, which are subdivided into w × w patches with a w/2 overlap in both the vertical and horizontal directions. We randomly selected 500 image patches from each image, so that a total of 1000 image patches are used to learn the dictionary.

In the SVDD training stage, image patches are collected from the five defect-free images without overlapping. We employ the SVDD classifier with the Gaussian kernel function $\kappa(x, y) = \exp(-\gamma \|x - y\|^2)$, which can be solved as an OCSVM (Schölkopf, Platt, Shawe-Taylor, Smola, & Williamson, 2001) problem with LIBSVM (Chang & Lin, 2011). The parameter $\gamma$ was estimated by the method proposed in Khazai, Homayouni, Safari, and Mojaradi (2011), which can be formulated as:

$$\gamma = \frac{\ln\left(N(1 - \nu) + 1\right)}{\max\left(\|x_i - x_j\|^2\right)}, \tag{12}$$

where $N$ is the number of training samples. As mentioned before, $\nu \in (0, 1)$ is the parameter that can be used to control the FAR, and we evaluated performance with different values of $\nu$. The image patches used for training SVDD are sampled without overlapping. To prevent defects from falling on the boundary between adjacent patches, in the testing stage we also divide image samples into w × w patches with a w/2 overlap in both the vertical and horizontal directions.

All experiments are implemented in Matlab (R2010a) on a Linux machine. Equation (3) is solved with the built-in Matlab function svd(); Equations (1), (4) and (5) are solved with SPAMS, devised by Mairal et al. (2010).
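The patch division and the kernel-width estimate described in this section can be sketched as follows. The code is in Python with NumPy rather than the Matlab used for the experiments, and the function names `extract_patches` and `estimate_gamma`, the synthetic image and the three toy feature points are all assumptions introduced for illustration. The second function follows Equation (12) as written here, taking the maximum over all pairs of training feature vectors.

```python
import math

import numpy as np


def extract_patches(image, w, overlap=True):
    """Return a (w*w) x P matrix with one vectorised w x w patch per column.

    With overlap=True the window moves by w/2 in both directions;
    otherwise it moves by w (non-overlapping division).
    """
    step = w // 2 if overlap else w
    rows, cols = image.shape
    patches = [
        # Concatenate the columns of the patch (column-major order).
        image[r:r + w, c:c + w].flatten(order="F")
        for r in range(0, rows - w + 1, step)
        for c in range(0, cols - w + 1, step)
    ]
    return np.stack(patches, axis=1)


def estimate_gamma(features, nu):
    """Gaussian kernel parameter from Equation (12); features is N x d."""
    n = features.shape[0]
    # All pairwise squared distances (O(N^2) memory, fine for a sketch).
    diffs = features[:, None, :] - features[None, :, :]
    max_sq_dist = float(np.max(np.sum(diffs ** 2, axis=-1)))
    return math.log(n * (1 - nu) + 1) / max_sq_dist


# Synthetic 512 x 512 image standing in for a fabric sample.
image = np.arange(512 * 512, dtype=float).reshape(512, 512)
train_patches = extract_patches(image, w=16, overlap=False)
test_patches = extract_patches(image, w=16, overlap=True)

toy_features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
gamma = estimate_gamma(toy_features, nu=0.05)
```

For a 512 × 512 image and w = 16, the non-overlapping division yields 32 × 32 patches and the w/2-overlap division yields 63 × 63 patches.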
Evaluation criteria

We adopt receiver-operating characteristic (ROC) curves (Fawcett, 2006) to evaluate detection performance. For this, the FAR is defined as $N_f / N_{nt}$, where $N_f$ is the number of samples that are incorrectly labelled as defects and $N_{nt}$ is the number of normal samples; the correct detection rate (CDR) is similarly defined as $N_m / N_{dt}$, where $N_m$ is the number of anomalous samples that are correctly labelled as defects and $N_{dt}$ is the total number of anomalous samples.

Figure 7. The ROC curves for all datasets D1-D9 using patch size 16 × 16, 26 × 26 and 36 × 36.

Results and discussion

Effect of the patch size

Since the proposed method is based on an image patch strategy for defect detection, the patch size has a direct effect on the performance of the proposed algorithm, e.g. on the AI discussed in the section on dictionary size. To show the effect of different patch sizes on detection performance, we tested all data-sets with patch sizes of 16 × 16, 26 × 26 and 36 × 36. Figure 7 presents the detection results using the dictionary learned from Equation (1); the ROC curves are computed for each data-set (D1-D9) on the hold-out sample under $\nu = 0.01, 0.03, 0.05, 0.07, 0.09$. From Figure 7, it can be seen that small patches generally perform better than larger patches, with the exception of data-sets D5 and D8. Specifically, larger patch sizes (e.g. 36 × 36) have the worst performance on average, while the medium patch size of 26 × 26 achieves the best performance, keeping both the FAR and the MDR (missing detection rate, MDR = 1 − CDR) below 5%. A possible explanation is that the smaller patches are quite sensitive to subtle texture changes and, as a consequence, are prone to a high FAR on less regular plain fabrics.
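For reference, the FAR, CDR and MDR figures quoted in this discussion follow directly from patch-level counts. The sketch below is a minimal Python illustration of those definitions; the function name `detection_rates` and the toy labels are assumptions, with `True` marking a defective patch.

```python
def detection_rates(true_defect, predicted_defect):
    """Return (FAR, CDR, MDR) from per-patch ground truth and predictions."""
    pairs = list(zip(true_defect, predicted_defect))
    n_normal = sum(1 for t, _ in pairs if not t)
    n_defect = sum(1 for t, _ in pairs if t)
    if n_normal == 0 or n_defect == 0:
        raise ValueError("need at least one normal and one defective sample")
    # Normal samples incorrectly labelled as defects.
    false_alarms = sum(1 for t, p in pairs if not t and p)
    # Defective samples correctly labelled as defects.
    detected = sum(1 for t, p in pairs if t and p)
    far = false_alarms / n_normal
    cdr = detected / n_defect
    return far, cdr, 1.0 - cdr


# Toy example: 20 normal patches (1 false alarm), 5 defective patches (4 found).
truth = [False] * 20 + [True] * 5
predicted = [True] + [False] * 19 + [True] * 4 + [False]
far, cdr, mdr = detection_rates(truth, predicted)
```

Sweeping the rejection fraction of the classifier and recording one (FAR, CDR) pair per setting gives the points of an ROC curve.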
Furthermore, our algorithm generally performs better on twill fabrics than on plain fabrics, since twill fabrics exhibit relatively higher regularity than plain fabrics, resulting in a lower FAR. We note that the size of the defect also has an impact on algorithm performance. By examining the defects present in each of the data-sets, we find that D3 and D6 have defects at a relatively small scale and of lower contrast, making large image patches comparatively less effective at identifying them. Meanwhile, defects in D5 are likely to occur at scales which make larger patches more suitable.

The above results indicate that our algorithm is more robust in adapting to different fabric textures than to different defect types, and that the performance partly depends on the types of defects likely to be present. Hence, in practical applications, we recommend using different patch sizes depending on specific process requirements, e.g. if monitoring of small-scale defects is important, a 16 × 16 patch size is to be preferred, even though patches that are too small may suffer from a loss of relevant information. According to Figure 7, the medium patch size of 26 × 26 seems to be a good compromise. Some illustrative examples of detection results are given in Figure 9. They were obtained using the dictionary learned from Equation (1), a patch size of 26 × 26 and ν = 0.05.

Figure 9. Illustrative examples of detection results for D1–D9.

Extensions

As mentioned earlier, dictionary learning can be easily extended to handle matrix factorisation problems. We extend the proposed method to learn dictionaries for defect detection from SVD, NMF and SP. Here, the dictionaries of SVD and NMF are learned from Equations (3) and (4), and the other procedures are identical to those used with Equation (1). For SP, the parameters K and λ of Equation (5) must be chosen; our experiments show that a dictionary size of K = 16 is adequate for a "sparse" representation. A sparsity parameter λ yielding about 11 zero coefficients on average for each training sample was found to result in the best performance. In addition, considering that the objective function of Equation (5) does not focus on pursuing the least-squares error, the Euclidean distance feature is replaced with another feature called residual dispersion, which is given by:

Feature 1: Residual dispersion

$$F_1 = \lVert R - \bar{R} \rVert_2, \qquad R = \lvert y - \hat{y} \rvert,$$

where $\lvert \cdot \rvert$ is the absolute value operator and $\bar{R}$ is the mean of vector $R$.

The detection results for the dictionaries learned from Equations (1), (3), (4) and (5) (referred to as DL, SVD, NMF and SP) are presented in Figure 8, in which the ROC curves are also computed by testing each data-set under ν = 0.01, 0.03, 0.05, 0.07, 0.09 and a medium patch size of 26 × 26.

Figure 8. Detection results of DL, SVD, NMF and SP under patch size 26 × 26 and ν = 0.05.

From Figure 8, it can be seen that DL and SVD exhibit quite close detection performance, while NMF has the worst detection performance, especially for D1 and D9. The poorer performance of NMF compared to SVD and DL could be mainly due to the non-negativity constraint imposed in the NMF problem, with the fact that known solution methods do not guarantee global optimality also playing a role. Since NMF only permits additive combination of its elements (no subtraction can occur), it needs a larger K to achieve the required approximation index, resulting in inefficient modelling of texture and a loss of discriminative power for minor defects.

On the other hand, despite the orthogonality constraint imposed on SVD, it still allows arbitrary signs for its dictionary and coefficient vectors, leaving an approximation error as small as that of the unconstrained dictionary learned from Equation (1) at a given dictionary size. Unlike DL, SVD and NMF, the extra sparsity constraint forces SP to approximate image patches using as few of its dictionary elements as possible. This ability to select only a few dictionary elements for each sample means that SP is capable of restoring the global textural structure but ignores small detail regions such as defects, which increases the discriminative power for defects involving subtle intensity and/or structural changes, e.g. in data-sets D3, D6 and D9.

Conclusions

We propose a dictionary learning framework for textile fabric defect detection. A dictionary is learned from defect-free fabric images; such a dictionary is able to approximate training samples well through a linear summation of its elements. Benefiting from the flexibility of dictionary learning, the major features of a specific fabric type can be incorporated into the learned dictionary, which ensures that the typical structural variation can be effectively modelled. Our algorithm resembles the classic template-matching methods, but uses adaptive templates modelled by the learned dictionary and does not need to consider the alignment between the templates and sample images, alleviating the feature selection problem. Experimental results show that our method can achieve a CDR as high as 95% with an FAR of less than 5%. The extensions to SVD, NMF and SP demonstrate that our algorithm performs very similarly to SVD, and that SP is more advantageous than DL in finding minor defects, since the use of the sparsity constraint helps to maintain discriminative power while reducing approximation error. This also guides us to seek more discriminative dictionaries to further improve performance across defect types.

Acknowledgement

This research was supported by the Fundamental Research Funds for the Central Universities. This work was supported by the National Natural Science Foundation of China (Grant No. 61379011).
References

Arivazhagan, S., Ganesan, L., & Bama, S. (2006). Fault segmentation in fabric images using Gabor wavelet transform. Machine Vision and Applications, 16, 356–363.
Bodnarova, A., Bennamoun, M., & Latham, S. (2002). Optimal Gabor filters for textile flaw detection. Pattern Recognition, 35, 2973–2991.
Bu, H., & Huang, X. (2008). A novel multiple fractal features extraction framework and its application to the detection of fabric defects. Journal of The Textile Institute, 99, 489–497.
Bu, H.-G., Huang, X.-B., Wang, J., & Chen, X. (2010). Detection of fabric defects by auto-regressive spectral analysis and support vector data description. Textile Research Journal, 80, 579–589.
Bu, H. G., Wang, J., & Huang, X. (2009). Fabric defect detection based on multiple fractal features and support vector data description. Engineering Applications of Artificial Intelligence, 22, 224–235.
Chan, C. (2000). Fabric defect detection by Fourier analysis. IEEE Transactions on Industry Applications, 36, 1267–1276.
Chang, C.-C., & Lin, C.-J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2, 1–27. Retrieved from http://www.csie.ntu.edu.tw/~cjlin/libsvm
Elad, M. (2010). Sparse and redundant representations: From theory to applications in signal and image processing. New York: Springer.
Elad, M., & Aharon, M. (2006). Image de-noising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15, 3736–3745.
Escofet, J., Navarro, R., Pladellorens, J., et al. (1998). Detection of local defects in textile webs using Gabor filters. Optical Engineering, 37, 2297–2307.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27, 861–874.
Hou, Z., & Parker, J. M. (2007). Texture defect detection using support vector machines with adaptive Gabor wavelet features. Paper presented at the 7th IEEE Workshop on Applications of Computer Vision, WACV 2005, Breckenridge, CO, USA.
Khazai, S., Homayouni, S., Safari, A., & Mojaradi, B. (2011). Anomaly detection in hyperspectral images based on an adaptive support vector method. IEEE Geoscience and Remote Sensing Letters, 8, 646–650.
Kim, S. C., & Kang, T. J. (2007). Texture classification and segmentation using wavelet packet frame and Gaussian mixture model. Pattern Recognition, 40, 1207–1221.
Kumar, A. (2003). Neural network based detection of local textile defects. Pattern Recognition, 36, 1645–1659.
Kumar, A. (2008). Computer-vision-based fabric defect detection: A survey. IEEE Transactions on Industrial Electronics, 55, 348–363.
Kumar, A., & Pang, G. K. (2002). Defect detection in textured materials using Gabor filters. IEEE Transactions on Industry Applications, 38, 425–440.
Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11, 19–60.
Mak, K. L., & Peng, P. (2008). An automated inspection system for textile fabrics based on Gabor filters. Robotics and Computer-Integrated Manufacturing, 24, 359–369.
Markou, M., & Singh, S. (2003a). Novelty detection: A review – Part 1: Statistical approaches. Signal Processing, 83, 2481–2497.
Markou, M., & Singh, S. (2003b). Novelty detection: A review – Part 2: Neural network based approaches. Signal Processing, 83, 2499–2521.
Ngan, H. Y., Pang, G. K., & Yung, N. H. (2011). Automated fabric defect detection: A review. Image and Vision Computing, 29, 442–458.
Sari-Sarraf, H., & Goddard, J. S. Jr. (1999). Vision system for on-loom fabric inspection. IEEE Transactions on Industry Applications, 35, 1252–1259.
Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J., & Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Neural Computation, 13, 1443–1471.
Srikaew, A., Attakitmongcol, K., Kumsawat, P., & Kidsang, W. (2011). Detection of defect in textile fabrics using optimal Gabor wavelet network and two-dimensional PCA. Artificial Intelligence and Lecture Notes in Bioinformatics, 6939, 436–445.
Tax, D. M. J. (2001). One-class classification (Unpublished Ph.D. thesis). Delft University of Technology, Delft, The Netherlands.
Theodoridis, S., & Koutroumbas, K. (2008). Pattern recognition. London: Academic Press.
Wen, C.-Y., Chiu, S.-H., Hsu, W.-S., & Hsu, G.-H. (2001). Defect segmentation of texture images with wavelet transform and a co-occurrence matrix. Textile Research Journal, 71, 743–749.
Yang, J., Yu, K., Gong, Y., & Huang, T. (2009, June). Linear spatial pyramid matching using sparse coding for image classification. Paper presented at the Computer Vision and Pattern Recognition Conference (CVPR), Miami, FL, USA.
Yang, X., Pang, G., & Yung, N. (2004). Discriminative training approaches to fabric defect classification based on wavelet transform. Pattern Recognition, 37, 889–899.
Yang, J., Wang, J., & Huang, T. (2011, July). Learning the sparse representation for classification. Paper presented at the Multimedia and Expo (ICME) International Conference, Seven Springs, PA, USA.