Decision Boundary

A decision boundary is a partition of an n-dimensional space that divides the space into two or more response regions. A decision boundary can take any functional form, but it is often useful to derive the optimal decision boundary, that is, the one that maximizes long-run accuracy.

The use of decision boundaries is widespread and forms the basis of a branch of statistics known as discriminant analysis. Discriminant analysis usually assumes a linear decision bound and has been applied in many settings. For example, a clinical psychiatrist might be interested in identifying the set of factors that best predicts whether an individual is likely to develop some clinical disorder. To achieve this goal, the researcher identifies a set of predictor variables measured at time 1 (e.g., symptoms, neuropsychological test scores) and constructs the linear function of these predictors that best separates depressed from non-depressed, or schizophrenic from non-schizophrenic, patients diagnosed at time 2. The resulting decision bound can then be applied to symptom and neuropsychological test data collected from new patients to determine whether they are at risk for that clinical disorder later in life. Similar applications can be found in machine learning (e.g., automated speech recognition) and several other domains.

To make this definition more rigorous, suppose there are two categories of individuals, such as depressed and non-depressed, with predictor variables in n-dimensional space. Denote the two multivariate probability density functions f_D(x) and f_{ND}(x) and the two diagnoses R_D and R_{ND}. Assuming the two categories are equally likely a priori, accuracy is maximized by the following decision rule:

If f_D(x) / f_{ND}(x) > 1 then respond R_D; otherwise respond R_{ND}.   (1)

Notice that the optimal decision bound is the set of points satisfying f_D(x) / f_{ND}(x) = 1.

It is common to assume that f_D(x) and f_{ND}(x) are multivariate normal. Suppose that \mu_D and \mu_{ND} denote the depressed and non-depressed mean vectors, respectively, and that \Sigma_D and \Sigma_{ND} denote the corresponding covariance matrices. In addition, suppose that \Sigma_D = \Sigma_{ND} = \Sigma. Under the latter condition the optimal decision bound is linear. Expanding the ratio in Equation 1 yields

f_D(x) / f_{ND}(x) = \frac{(2\pi)^{-n/2} |\Sigma|^{-1/2} \exp[-\frac{1}{2}(x - \mu_D)' \Sigma^{-1} (x - \mu_D)]}{(2\pi)^{-n/2} |\Sigma|^{-1/2} \exp[-\frac{1}{2}(x - \mu_{ND})' \Sigma^{-1} (x - \mu_{ND})]}
                   = \exp[-\frac{1}{2}(x - \mu_D)' \Sigma^{-1} (x - \mu_D) + \frac{1}{2}(x - \mu_{ND})' \Sigma^{-1} (x - \mu_{ND})].   (2)

Taking the natural log of both sides of Equation 2 yields

h(x) = \ln[f_D(x) / f_{ND}(x)] = (\mu_D - \mu_{ND})' \Sigma^{-1} x + \frac{1}{2}(\mu_{ND}' \Sigma^{-1} \mu_{ND} - \mu_D' \Sigma^{-1} \mu_D),   (3)

which is linear in x, so the optimal rule responds R_D whenever h(x) > 0.

As a concrete example, suppose that the objects are two-dimensional with \mu_D = [100 200]', \mu_{ND} = [200 100]', and \Sigma_D = \Sigma_{ND} = 50^2 I = 2500 I (where I is the identity matrix, so each predictor has a standard deviation of 50). Applying Equation 3 yields the bound

.04 x_2 - .04 x_1 = 0,

that is, the line x_2 = x_1, with R_D the optimal response whenever x_2 > x_1.

Further Readings

Ashby, F. G., & Maddox, W. T. (1993). Relations between prototype, exemplar, and decision bound models of categorization. Journal of Mathematical Psychology, 37, 372-400.

Fukunaga, K. (1972). Introduction to statistical pattern recognition. New York: Academic Press.

Morrison, D. F. (1967). Multivariate statistical methods. New York: McGraw-Hill.
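As a computational footnote, the sketch below makes Equations 1 and 3 concrete in Python. It is a minimal illustration under the equal-covariance assumption, not part of the original entry: the function name linear_discriminant, the variable names, and the test observation are all assumptions chosen to match the worked example above.

import numpy as np

def linear_discriminant(mu_d, mu_nd, sigma):
    """Return (w, c) such that h(x) = w @ x + c = ln[f_D(x) / f_ND(x)] (Equation 3)."""
    sigma_inv = np.linalg.inv(sigma)
    w = sigma_inv @ (mu_d - mu_nd)                 # slope vector
    c = 0.5 * (mu_nd @ sigma_inv @ mu_nd
               - mu_d @ sigma_inv @ mu_d)          # intercept
    return w, c

# Concrete example from the text: two predictors, standard deviation 50 each.
mu_d = np.array([100.0, 200.0])
mu_nd = np.array([200.0, 100.0])
sigma = 2500.0 * np.eye(2)                         # Sigma = 50**2 * I

w, c = linear_discriminant(mu_d, mu_nd, sigma)
print(w, c)                                        # [-0.04  0.04] 0.0, i.e., the bound x_2 = x_1

# Optimal rule (Equation 1): respond R_D when h(x) > 0.
x_new = np.array([120.0, 180.0])                   # hypothetical new patient
h = w @ x_new + c                                  # h = 2.4
print("R_D" if h > 0 else "R_ND")                  # prints R_D

Note that scaling h(x) by any positive constant leaves the bound unchanged, which is why every rescaling of the coefficients -.04 and .04 describes the same line x_2 = x_1.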