The profile’s assessment grid as a tool for clinical praxis. An application to functional disability

K. Gibert (1), R. Annicchiarico (2), and C. Caltagirone (2, 3)

(1) Dep. of Statistics and Operations Research, Universitat Politècnica de Catalunya, C. Jordi Girona, 1-3, Barcelona 08034, Spain. karina.gibert@upc.edu
(2) IRCCS (Istituto di Ricovero e Cura a Carattere Scientifico) Fondazione Santa Lucia, via Ardeatina 306, 00179 Rome
(3) Dep. of Neurology, Università Tor Vergata, via O. Raimondo 18, 00173 Rome

Summary. In this paper, the Profile’s Assessment Grid (PAG) is presented. It is a graphical tool which visualizes the results of a complex combination of statistical models devoted to determining the most suitable profile for a given patient, from a predefined set of profiles referring to a certain disease or medical problem. Those profiles may correspond to different stages of a certain disease, different physical characteristics of the patient, and so on. In this work, an application to 4 different profiles of functional disability is presented. Being a graphical tool, it gives the physician the possibility of determining the profile of the patient very quickly and in a user-friendly way during daily activity, without requiring comprehension of the statistical models involved in the process, which usually constitutes a serious limitation for wide application in clinical praxis. The advantages and implications of the PAG for the improvement of the treatment of functional disabilities are also discussed.

Key words: Profile’s Assessment Grid, clustering, class recognition, logistic regression, visualization, functional disability, medical profile’s discovery, data mining.

1 Introduction

In modern medicine it is more and more convenient to have references about standard typologies of patients, as well as standard treatments (or use of services) associated with these typologies, for a certain disease or medical problem. This research began in the particular field of disability (see footnote 4).
It is clear that longevity is rising in our society. Disability is usually defined as the difficulty or inability to independently perform basic activities of daily living or other tasks essential for independent living without assistance. Disability is associated with increasing use of healthcare services and with increasing needs for formal and informal care [?]. This is one of the most critical public health problems that the healthcare systems of developed countries are facing.

Footnote 4: Without loss of generality, this paper focuses on the particular field of disability, but the same technical arguments, as well as the impact on clinical praxis, hold for other fields where standard profiles and/or treatments are not yet well established [?] [?] [?].

838 K. Gibert, R. Annicchiarico, and C. Caltagirone

As in many other medical fields, disability still has no clear limits, and it is very difficult to define different levels in different patients. However, identifying a set of standard profiles of disability may be of great help towards the establishment of standard treatment programs for every profile, as well as the elaboration of general medical guidelines on this matter.

In a previous work [?], innovative clustering techniques [?] were used to identify 4 profiles of disability from a sample of neuropsychiatric patients and their scores on a specific assessment scale proposed by the WHO to measure disability [?]. Those profiles resulted in four increasing levels of disability based on the functionality of the patient rather than on the cause of the disability itself (from very compromised persons who cannot even move to totally independent persons). They are introduced below and are the reference for the present work. Patients of the same profile are qualitatively homogeneous and will presumably respond to the same treatment. In fact, once the profiles are well known, it is of great medical interest to design protocols of standardized treatments associated with each profile [?].
For the particular domain addressed here, the rehabilitative approach is fundamental in the disabled population in order to improve and/or maintain quality of life [?]. Large numbers of disabled patients require treatment; this raises the need to rationalize rehabilitation programs, thus improving the use of both human and economic resources.

From the clinical point of view it is therefore very important to have a tool that assists the physician in properly determining the profile of a new patient in a user-friendly way. This would allow a quick and proper decision about the most suitable treatment for that patient, which can be started immediately, leading to more efficient management and increasing the success of therapies. Such a tool is also of great help in the systematic analysis of the effect of treatment, since it allows quick measurement of changes in the profile of the patient and, as a consequence, of his improvement. This in turn allows assessment of the effectiveness of different treatments for the different profiles, and proposals of new standardized programs which can improve current treatments.

In response to this need, this research is devoted to the development of a tool that, given a new patient, can immediately assign him to one of the previously identified profiles. In order to provide a tool that is easy to use in clinical praxis, our idea was to find a formal model which could be translated into a graphical representation. The result is the Profile’s Assessment Grid (PAG), which is introduced in this paper. The PAG is an interpretative tool that allows, in a graphical way, the effective and immediate identification of the profile corresponding to a single patient. To build the PAG, statistical methods were first used to identify a reduced set of relevant variables that determine the profile of a given patient.
Later, a graphical transformation is applied in such a way that the position of the patient in a unit cube represents his most likely profile. After finding the model, specific software, Grid-KLASS, was developed. In a user-friendly way, the user enters the values of the patient for the small set of relevant variables, and the system displays the position of the patient in a labelled unit cube, providing the most suitable profile.

The PAG as a clinical tool for assessing diagnoses 839

2 The Data

The Sample

The sample included 96 subjects, 60.4% males and 39.6% females; mean age was 56 years.

Patients: 76 neurological patients, from 18 to 80 years old, recovering at the IRCCS S. Lucia hospital in Rome, Italy, from October 1999 to February 2000. They included 20 spinal cord injury patients (mean age 47.20, SD 17.6), 20 Parkinson disease patients (mean age 69.25, SD 6.53), 20 stroke patients (mean age 63.40, SD 15.96), and 16 depressed patients (mean age 46.56, SD 11.15).

Control group: 20 healthy subjects (mean age 55.05, SD 15.57).

Inclusion criterion: absence of cognitive disorders, according to a prior clinical evaluation performed by a neurologist.

Each subject was interviewed by a trained physician according to the WHO-DASII disability assessment scale [?].

Profiles of functional disability

In a previous work [?], the sample was clustered using an innovative technique called Clustering based on rules [?], which consists of a mixture of two main elements: on the one hand, an AI process that manages a Knowledge Base (KB) including prior medical knowledge, even if partial; on the other hand, a later clustering process on the sample, biased through an induction on the KB. This methodology has already been successfully applied in the medical field [?] [?] [?]. Every patient from the sample is labelled with the corresponding profile, according to the clustering results.
The interpretation of the resulting classes produced a new taxonomy of 4 profiles of disability, ordered according to the severity of the functional disability (FD), presented in [?]:

• Low (31 self-dependent subjects);
• IntermediateI (24 subjects with a low to moderate degree of both physical and emotional disability);
• IntermediateII (6 subjects with moderate to severe disability without emotional problems);
• High (32 subjects with the highest degree of disability).

The WHO-DASII assessment scale [?] is the scale proposed by the WHO to measure the degree of disability of a patient. It incorporates a set of instruments for the detection of both physical and mental health factors related to disability. The WHO-DASII version 3.1a is a scale containing 96 items for the assessment of disability levels according to the ICF classification. It includes general information about the patient and self-reported difficulty in functioning in six major domains considered important in most cultures: Understanding & Communicating (6 items), Getting Around (5 items), Self Care (4 items), Getting Along with People (5 items), Life Activities (8 items) and Participation in Society (8 items). The WHO-DASII employs a 5-point rating scale for all items, in which 1 indicates no difficulty and 5 indicates extreme difficulty or inability to perform the activity. The WHO-DASII standardized global score ranges from 0 (non-disabled) to 100 (maximum disability).

3 The Profile’s Assessment Grid

In this section the Profile’s Assessment Grid (PAG) is presented. As it is a graphical transformation of complex logistic modelling built from the data, the methodological components of the Grid are presented first.

Binary logistic regression

First, logistic regression is used to identify a reduced set of items from the WHO-DASII that can significantly characterize every profile.
Since the profiles are ordered by increasing severity, the regressions are embedded:

• First, logistic regression is used to fit a model for the probability of a patient belonging to the High level (πHigh), using the whole sample.
• Next, a second model is built for the probability (πIntII | Level < High) by a logistic regression for the IntermediateII level of FD, disregarding the sample patients labelled as High.
• Finally, only patients labelled as IntermediateI or Low level of FD are considered for modelling (πIntI | Level < IntII) by logistic regression.

Determining the profile of a new patient

Let ξ ∈ [0, 1] be a predetermined threshold used to decide between classes. The values provided by the three models can be used as indicated in the following classification algorithm, where pHigh, pIntII and pIntI denote the estimates of the corresponding probabilities:

    Estimate πHigh by applying equation 1.
    If pHigh >= ξ then assign the patient to the High profile.
    Else
        Estimate πIntII by applying equation 2.
        If pIntII >= ξ then assign the patient to the IntermediateII profile.
        Else
            Estimate πIntI by applying equation 3.
            If pIntI >= ξ then assign the patient to the IntermediateI profile.
            Else assign the patient to the Low profile.

The Profile’s Assessment Grid

The values provided by the three models obtained with the procedure presented above can be used as coordinates on three orthogonal axes of a graphical representation, called the Profile’s Assessment Grid (see Fig. 1). Thus, given the three equations, the corresponding probabilities for a given patient (pHigh, pIntII, pIntI) can be found at once, and the patient can be located as a 3D point inside the PAG, with pIntI represented on the X axis, pIntII on the Z axis and pHigh on the Y axis. The profile can then be assigned rapidly from the position of the patient in the graphic (see Fig. 1), since the cube is divided into colored and conveniently labelled areas according to the threshold ξ. In the application presented here, ξ = 0.5 (see Fig. 1).
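The classification algorithm above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Grid-KLASS implementation (which the paper states is written in Java); the function name `assign_profile` and its argument names are assumptions introduced here. Unlike the algorithm in the text, it takes all three estimated probabilities as inputs rather than estimating them lazily.

```python
def assign_profile(p_high, p_int2, p_int1, xi=0.5):
    """Cascade rule from the text (hypothetical helper, not Grid-KLASS).

    p_high, p_int2, p_int1 are the estimated probabilities from the three
    embedded logistic models; xi is the decision threshold in [0, 1].
    """
    for value in (p_high, p_int2, p_int1, xi):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities and threshold must lie in [0, 1]")
    # The most severe profile is tested first, as in the embedded regressions.
    if p_high >= xi:
        return "High"
    if p_int2 >= xi:
        return "IntermediateII"
    if p_int1 >= xi:
        return "IntermediateI"
    return "Low"


# Illustrative points inside the unit cube (not patient data).
print(assign_profile(0.9, 0.1, 0.1))
print(assign_profile(0.2, 0.7, 0.9))
print(assign_profile(0.2, 0.3, 0.4))
```

Because the tests are applied in order, a point with a high pIntI is still assigned to High whenever pHigh reaches the threshold; this is what produces the nested regions of the cube.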
Very often, assessment scales such as the WHO-DASII are long interviews containing a large number of items, and a long time is required to fill in the scale and analyze it before obtaining the final evaluation. Many of the items are often closely correlated, and the score given to one of them can determine the responses to a certain subset of other items. The goal of the PAG is to obtain a reliable evaluation of the patient using the minimal set of relevant items from the assessment scale, as well as to provide a graphical representation that directly suggests the patient’s evaluation.

The software: Grid-KLASS

The software Grid-KLASS was specifically developed to implement the PAG. Its main technical characteristics are:

Input: the variables included in the 3 logistic equations for the 4 profiles.
Output: two kinds of output:
• Visual: displays the PAG with the patient’s position (Fig. 2) on the screen
• Output files: PDF, PostScript or LaTeX file containing the graphic
Use: assign to the patient the profile indicated by the color and corresponding label of the area where the patient is positioned inside the PAG.
Interface: graphical and user-friendly
Programming language: Java

Fig. 1. The Profile’s Assessment Grid
Fig. 2. Location of a patient in the Profile’s Assessment Grid.

4 Application to functional disability

Following the methodology presented in §3, the first step towards building the PAG is to estimate the three logistic equations for the four profiles of FD found in [?].
Only 7 items are involved in at least one of the three models, which is a substantially reduced set compared with the 96 contained in the WHO-DASII:

B2: Physical health in the past 30 days
B4: Mental health in the past 30 days
B9: How much worry or distress have you had about your health in the past 30 days
S2: Taking care of your household responsibilities
S4: Joining in community activities
S5: Have you been emotionally affected
S9: Getting dressed

First, the whole sample of 96 patients was used to estimate the model providing the probability that a patient has a High degree of disability:

\[ p_{High} = \frac{e^{-35.93 + 1.70\,B2 + 3.35\,B4 + 3.98\,B9 + 2.20\,S4}}{1 + e^{-35.93 + 1.70\,B2 + 3.35\,B4 + 3.98\,B9 + 2.20\,S4}} \qquad (R^2 = 90.1) \tag{1} \]

Then a second logistic regression was built using the 64 patients who were not labelled as High profile. This second model provides the probability of a patient being IntermediateII or not, conditional on the patient not being in the High profile. The resulting logistic equation is:

\[ p_{IntII} = \frac{e^{-2.63 - 1.37\,B4 + 1.30\,S9}}{1 + e^{-2.63 - 1.37\,B4 + 1.30\,S9}} \qquad (R^2 = 49.5) \tag{2} \]

The third model was estimated using the remaining 58 patients, who were labelled as IntermediateI or Low profile. It provides the probability of a patient belonging to the IntermediateI profile or not, given that he is in neither the High nor the IntermediateII profile. The resulting equation to distinguish between IntermediateI (high values of pIntI) and Low (low values of pIntI) is:

\[ p_{IntI} = \frac{e^{-13.20 + 2.20\,B9 + 1.89\,S2 + 1.40\,S5}}{1 + e^{-13.20 + 2.20\,B9 + 1.89\,S2 + 1.40\,S5}} \qquad (R^2 = 83.3) \tag{3} \]

Using the results obtained from these three models, the software Grid-KLASS was specifically developed. It has a user-friendly graphical interface which, given the scores of a new patient on the items B2, B4, B9, S2, S4, S5 and S9, displays the PAG and the position of the patient inside it (see Fig. 2). The color of the area where the patient is positioned indicates the disability profile to be assigned.
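As an illustration of how the three fitted equations turn item scores into a point of the PAG, the following Python sketch evaluates them with the coefficients reported above. It is not the Grid-KLASS code; the names `logistic` and `pag_point` are hypothetical, and the two score vectors (all items at 1 and all items at 5) are artificial extremes chosen only to exercise the formulas, not patient data.

```python
import math


def logistic(z):
    """Logistic function e^z / (1 + e^z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def pag_point(scores):
    """Return (p_high, p_int2, p_int1) for a dict of item scores.

    Coefficients are those of the three logistic equations in the text;
    `scores` must contain the items B2, B4, B9, S2, S4, S5 and S9.
    """
    p_high = logistic(-35.93 + 1.70 * scores["B2"] + 3.35 * scores["B4"]
                      + 3.98 * scores["B9"] + 2.20 * scores["S4"])
    p_int2 = logistic(-2.63 - 1.37 * scores["B4"] + 1.30 * scores["S9"])
    p_int1 = logistic(-13.20 + 2.20 * scores["B9"] + 1.89 * scores["S2"]
                      + 1.40 * scores["S5"])
    return p_high, p_int2, p_int1


ITEMS = ("B2", "B4", "B9", "S2", "S4", "S5", "S9")
lowest = dict.fromkeys(ITEMS, 1)   # artificial case: no difficulty anywhere
highest = dict.fromkeys(ITEMS, 5)  # artificial case: extreme difficulty everywhere

print(pag_point(lowest))
print(pag_point(highest))
```

The returned triple follows the order (pHigh, pIntII, pIntI) used in the text, so it can be read directly as the coordinates of the patient inside the cube.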
Performance assessment

The points (pHigh, pIntII, pIntI) corresponding to the 96 sample patients were computed and located in the PAG to find the profile to be assigned to each patient (see Fig. 2). The profile found with the PAG was then compared with the original profile of every patient. In total, 95.7% of the patients were correctly assigned to the High or non-High profile; 93.4% of the non-High patients were correctly assigned to the IntermediateII or non-IntermediateII profile; and, from the last group, 94.3% were correctly assigned to the IntermediateI or Low level. This gives a global rate of correct assignment of 91.7%.

5 Conclusions and future work

In this paper the Profile’s Assessment Grid and its application to the recognition of the functional disability profile of a patient are presented. The PAG provides a graphical representation that directly suggests the evaluation of the patient. It permits a reliable evaluation of the patient using a minimal set of 7 relevant items from the WHO-DASII, with a global rate of correct assignment of 91.7%. The PAG seems to be a promising tool for supporting decision-making in clinical praxis. However, a new independent sample is currently being collected in order to verify that the rate of correct assignment remains high for new patients who were not involved in the estimation of the logistic models.

The PAG is an instrument specially designed to allow a quick assessment of the patient profile, once the profiles are already known. Although it responds to the very simple idea of graphically representing in a 3D space the results of 3 embedded logistic equations, it offers a great advantage for the daily work of the physician, who is usually not interested in statistical models or in the application of complex procedures such as those proposed in §??. Even if the construction of the PAG is not trivial, its interpretation is really easy, even for persons familiar with neither informatics nor statistics.
The identification of the patient’s profile, instead of requiring any knowledge of logistic regression, is reduced to looking at the color of the area where the patient is located inside the cube, and the label associated with that color. The position of the patient also gives an idea of the confidence of the assignment (patients located near the boundaries have more uncertainty).

The PAG can be used to identify the patient profile at the beginning, which permits decisions about the most suitable treatment. But it is also useful for following the evolution of the patient during the treatment, by observing how his position in the PAG moves over time. The need to treat large numbers of disabled patients raises the need to rationalize rehabilitation programs in order to improve the use of both human and economic resources, and the PAG can also help to evaluate their effectiveness.

The need to apply evidence-based validation criteria to the field of disability and rehabilitation motivated the search for suitable instruments to analyze the outcomes of each different treatment. The Profile’s Assessment Grid responds to this need, but it is not constrained to specific fields. In fact, given a set of ordered profiles of any pathology (extensions to sets with more than 4 profiles are also possible by using several PAGs) and a given set of correctly diagnosed patients, a specific PAG for this set of profiles could be built by finding the corresponding logistic equations and thresholds.

Regarding the thresholds, in this particular application ξ = 0.5 is used. This provides a very clear graphical representation with symmetrical divisions. Thresholds can be adapted to ensure correct treatments for patients. Moving the threshold corresponds to parallel translations of the cutting planes in the PAG: the larger the threshold, the larger the yellow area devoted to the Low profile and the smaller the red area devoted to the High profile (see footnote 5).
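The effect of moving the threshold can be made concrete by computing the volume of each region of the unit cube. The sketch below assumes only the cascade rule described earlier (High is tested first, then IntermediateII, then IntermediateI), under which the regions have volumes 1 − ξ, ξ(1 − ξ), ξ²(1 − ξ) and ξ³ respectively; the function name `region_volumes` is an assumption introduced for illustration.

```python
def region_volumes(xi):
    """Volume of each profile region of the unit cube for threshold xi.

    Derived from the cascade rule: a point is High when p_high >= xi,
    IntermediateII when p_high < xi and p_int2 >= xi, and so on.
    """
    if not 0.0 <= xi <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return {
        "High": 1.0 - xi,
        "IntermediateII": xi * (1.0 - xi),
        "IntermediateI": xi ** 2 * (1.0 - xi),
        "Low": xi ** 3,
    }


for xi in (0.5, 0.7, 0.9):
    print(xi, region_volumes(xi))
```

By this calculation the Low region always grows and the High region always shrinks as ξ increases, while the two intermediate regions need not change monotonically. These volumes describe the geometry of the grid only; they say nothing about how many patients fall in each region.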
Increasing the threshold to 0.9 implies that the treatment associated with the High profile will only be prescribed to the very reduced set of patients with a very high probability of really being High, and that patients with pHigh = 0.8, for example, will be prescribed a treatment for a lower degree of disability. From the graphical point of view this corresponds to a narrower red area in the cube. Future research is required to provide accurate proposals for these thresholds. Fuzzifying the limits between the areas of the cube is also an approach to be considered, and is very likely to provide a decision model closer to the physician’s way of thinking. Research in this direction is currently in progress.

Footnote 5: The area devoted to a certain profile in the cube is due to the embedded regressions, but is not correlated with the probability of observing that profile.

References

[FW74] G. Furnival and R. Wilson, Regression by leaps and bounds, Technometrics 16 (1974), 499–511.
[Har01] F.E. Harrell, Regression modeling strategies, Springer, New York, 2001.
[HTF01] T.J. Hastie, R.J. Tibshirani, and J. Friedman, The elements of statistical learning, Springer, New York, 2001.
[Urb04] S. Urbanek, Model selection and comparison using interactive graphics, Ph.D. thesis, Augsburg, 2004.
[UVW03] A. R. Unwin, C. Volinsky, and S. Winkler, Parallel coordinates for exploratory modelling analysis, Computational Statistics & Data Analysis 43 (2003), no. 4, 553–564.
[VR02] W. N. Venables and B. D. Ripley, Modern applied statistics with S, 4th ed., Springer, New York, 2002.