PROPOSED TECHNIQUE TO INCREASE INTER-RATER RELIABILITY OF PRESSURE ULCER CLASSIFICATION

Harkina Rangi
Department of Industrial Engineering, University of Louisville

ABSTRACT

A process flow diagram (PFD) is proposed to reduce inter-rater variability in the classification of pressure ulcers. It is designed as a simple, effective and standard method that lets nurses classify pressure ulcers consistently and accurately, and it addresses problems noted in previous research. The PFD was designed in four steps: (1) a classification system was chosen as the basis for the PFD (the criteria were that the system be simple, validated and have the lowest inter-rater variability), (2) needed improvements to the current system were identified, (3) questions that reduce ambiguity were formulated and (4) more quantitative data were added as a decision factor. Following this method, the PFD was based on the National Pressure Ulcer Advisory Panel’s staging system. Levels were created in the PFD to address problems with the current system, such as differentiating stage 1 pressure ulcers from deep tissue injuries. Polar (yes/no) questions were introduced to reduce ambiguity for the rater, along with quantitative techniques such as temperature and skin firmness measurements. The PFD is at the proposal stage and must be validated before implementation. The ideal method of implementation would be computer automation.

INTRODUCTION

Pressure ulcers are lesions of the skin that are a major concern for people who are wheelchair bound, on bed rest or otherwise immobile.
The National Pressure Ulcer Advisory Panel (NPUAP) defines a pressure ulcer as a ‘localized injury to the skin and/or underlying tissue usually over a bony prominence, as a result of pressure, or pressure in combination with shear and/or friction.’ The Centers for Disease Control and Prevention estimated that 159,000 nursing home patients suffered from pressure ulcers in 2004 (CDC, 2009). Not only were these patients in pain and discomfort, but the extensive health care costs of treating the ulcers may have added to their distress. The cost to treat a single patient suffering from one ulcer typically ranges from $15,000 to $27,000 in the United States. The overall cost to the US in 2007 was $11 billion (IHI, 2007), while the United Kingdom estimated spending close to £300 million in 1988 (Andersen & Karlsmark, 2008).

Misdiagnosis of pressure ulcers can increase total treatment costs and delay correct treatment, allowing the ulcers to progress to more advanced stages. Moisture lesions are often misdiagnosed as pressure ulcers, although the causes and treatment of the two conditions are very different. Moisture lesions are caused by the presence of moisture (for example, due to incontinence of urine and/or feces) (EPUAP, 2007). They are treated by drying the moist area, whereas pressure ulcers are treated by removing the pressure and applying medication/dressings. Deep tissue injuries, which are also often misclassified as earlier levels of pressure ulcers, are described by the NPUAP (2007) as pressure-related injuries to subcutaneous tissues under intact skin; these lesions initially have the appearance of bluish purple bruises. They evolve much faster than pressure ulcers in the initial stages because they start from the inside and progress outward rather than from the outer surface inward.

To aid in the correct diagnosis and treatment of pressure ulcers, classification systems have been developed.
These systems provide a standard basis on which hospitals, nursing homes and other health care providers can record and document the deterioration or improvement of a patient’s condition. The aim of the classification systems is to provide a consistent method of skin assessment using a numerical score (Russell, 2002). The NPUAP system is the most widely used today; it was adopted by the European Pressure Ulcer Advisory Panel (EPUAP) and is now used as the international standard. Other classification systems include the Torrance system and the Stirling Scale. Advantages of using a classification system include helping to find patients at risk of developing pressure ulcers, allowing limited treatment resources to be allocated appropriately, and supporting treatment plans that prevent further progression of pressure ulcers. There are, however, some disadvantages: limited inter-rater reliability, complexity of the scoring systems and difficulty in assessing the depth of skin damage (Russell, 2002). This paper examines the concept of inter-rater reliability and the studies that have investigated it.

Literature Review

Nurses vary in how they classify the stages of a pressure ulcer; the degree of agreement between them is the inter-rater reliability. A study was conducted at eight hospital sites to assess inter-rater reliability when diagnosing pressure ulcers using the EPUAP scale (Nixon et al., 2005). A group of six clinical research nurses (CRNs) with three years of post-registration experience in the care of older people, medical vascular surgery, or orthopaedic nursing, along with 109 ward-based nurses (WNs) and one CRN team leader, assessed pressure ulcer stages in 378 patients. There were no major differences in classification results between the CRN team leader and the CRNs (100% consensus for pressure ulcer classification).
However, there were many discrepancies between CRN and WN diagnoses, outlined below:
1) CRNs recorded pressure ulcers while WNs did not.
2) WNs recorded pressure ulcers while CRNs did not.
3) Both recorded pressure ulcers, but either the CRN or the WN recorded more than one.
4) CRNs and WNs recorded pressure ulcers at different sites (for example, the CRN may have recorded the ulcer at the sacrum, but the WN recorded it at the buttock).

Based upon these results, it was concluded that there are problems with the early diagnosis of pressure ulcers (Stage I and II) and that nurse assessments of Stage II ulcers with non-blanchable erythema (skin redness) had poor reliability.

Another study, conducted by Russell and Reynolds (2001), compared the inter-rater reliability of the EPUAP classification system (based on the NPUAP system) and the 1-Digit Stirling Scale (a pressure ulcer classification system based on four main stages, each with sublevels). The respondents’ pressure ulcer classifications were compared with those of specialists to measure their accuracy. Respondents’ answers matched those of the specialists 61.9% of the time when using the EPUAP scale, but only 30.2% of the time when using the Stirling Scale. It should be noted that this study was conducted using photographs and not real patients.

Pedley (2004) built on the study of Russell and Reynolds by using patients in place of photographs. The aim was to find whether inter-rater reliability was highest with the EPUAP classification system, the 2-Digit Stirling Scale, or the 1-Digit Stirling Scale (the 2-Digit Stirling Scale has more options and is more complex than the 1-Digit Stirling Scale). A total of thirty-five observations were recorded by two Registered Nurses (RNs). The nurses were given a sheet describing the stages of each classification and were asked to mark which of the descriptors applied to the patients.
At the end of the study, each nurse was also asked which scale they preferred in terms of its wording or lack of wording. The three main problems they had with the scales when diagnosing patients are listed below:
1) The EPUAP and 1-Digit Stirling scales did not provide a category for ulcers with slough, eschar or necrotic tissue.
2) Neither the EPUAP nor the 1-Digit Stirling scale distinguished between erythema and bluish purple discoloration of the skin.
3) With all of the scales, the nurses were unable to diagnose patients accurately when there was only redness of the skin and blanching.

In the first problem, the slough or eschar prevented the nurses from examining the depth of the ulcer. In a Stage III diagnosis no bone, muscle or tendon is visible, whereas in Stage IV it is. The slough or eschar may have prevented the nurses from seeing the exposed areas. The second problem, not being able to distinguish between erythema and bluish or purple bruising, could cause observers to rate the ulcer as Stage I when in reality it could be a deep tissue injury (DTI), which is far more advanced. The last problem was common in other studies as well and can become a greater problem when diagnosing patients with darker skin pigmentation (Nixon et al., 2005; Edsberg et al., 2007). Pressure ulcer grading scales are open to bias and reflect the user’s experience and knowledge.

The 2-Digit Stirling Scale was preferred by both RNs, followed by the EPUAP scale and then the 1-Digit Stirling Scale (which was preferred for only one assessment). Inter-rater reliability followed the same order, being highest when the nurses used the 2-Digit Stirling Scale and lowest with the 1-Digit Stirling Scale. A major limitation of this study was that only two nurses took part.
Because the 2-Digit Stirling Scale has more categories than the other two, there could be greater potential for variation between nurses if more of them took part in the study; with only two nurses, this variation was not high. Other studies comparing the 2-Digit Stirling Scale with the EPUAP scale show higher inter-rater reliability with EPUAP. It should also be noted that the EPUAP scale was updated in 2007 (based upon the NPUAP scale) to include reference to bluish bruising of skin (DTI) and the presence of slough (Edsberg, 2007).

Healey (1995) looked at three scales: 1-Digit Stirling, Torrance and Surrey. A total of 109 nurses looked at photographs of pressure ulcers in white patients and graded them using the three scales. The nurses preferred the least complex scale, the Surrey scale, which is very similar to the EPUAP scale (Russell, 2002). The levels of agreement in diagnosis/classification when using these scales were:
1) Surrey = 67%
2) Torrance = 60%
3) Stirling = 39%

Healey also found that inter-rater reliability was highest when nurses classified severe ulcers and lowest when they assessed skin redness (erythema). The scale associated with the highest inter-rater reliability was the Surrey scale.

Russell (2002) examined three scales qualitatively to determine which one should be used in daily practice: the Torrance, Stirling, and EPUAP (equivalent to the NPUAP) scales. Major pitfalls found in the Torrance scale included (Russell, 2002):
1) The scale does not take into account patients with little to no subcutaneous fat.
2) It does not address deeply bruised intact skin (which could possibly be a deep tissue injury).
3) The scaling system does not accurately follow the stages of pressure ulcer development.
4) There is no specification for deep full thickness necrotic areas.
An important note in addition to the disadvantages listed above is that the Torrance scale was not validated prior to implementation. The Stirling scale, although very descriptive, was complex and rated hard to use (Healey, 1995). It was also not validated prior to implementation. The EPUAP/NPUAP scale was validated and is based upon numerous studies. Nurse practitioners have used it with greater ease than the Torrance and Stirling scales (Russell, 2002).

Based upon the studies above, the EPUAP scale (the same as the NPUAP scale) appears to be the simplest and most effective for nurses compared with the other scales. The remainder of this paper looks solely at the EPUAP scale and examines potential improvements that could increase inter-rater reliability.

A study by Beeckman et al. (2007) investigated inter-rater reliability when classifying pressure ulcers and also looked at how accurately nurses can differentiate pressure ulcers from moisture lesions. A total of 1452 nurses were shown 20 photographs of normal skin, blanchable erythema, pressure ulcers (four grades), moisture lesions and combined lesions. Nurses were required to classify each photograph into one of these categories. The findings are listed below:
1. High inter-rater reliability when differentiating between stage II and stage III ulcers
2. Nurses incorrectly diagnosed blanchable erythema as non-blanchable erythema
3. Moisture lesions were often diagnosed as pressure ulcers

Another study, conducted by Defloor and Schoonhoven (2004) to examine the inter-rater reliability of the EPUAP scale, used 44 nurses who looked at 56 photographs of normal skin, pressure ulcers (four grades), incontinence lesions (moisture lesions) and blanchable erythema. Nurses were found to confuse moisture lesions with pressure ulcers.
Determining the difference between pressure ulcers and moisture lesions is crucial because each diagnosis requires a different preventative or treatment measure.

Andersen and Karlsmark (2008) conducted a study to obtain more objective data to aid in the identification and classification of pressure ulcers. The study followed from a case study of a patient with multiple pressure ulcers, each of which was found on ultrasound to have a hypo-echogenic subepidermal layer (a layer under the skin that does not reflect ultrasound well). In this study, the hypo-echogenic subepidermal layer was tested for in 11 patients (a total of 15 pressure ulcers). In addition, temperature, elasticity and redness were evaluated. The following technologies were used:
1) DermaSpectrometer from Cortex Technology to measure skin redness
2) DermaTemp model DT-1001 from Exergen to measure temperature
3) 20 MHz B-mode scanner from Cortex (ultrasound) to detect a hypo-echogenic subepidermal layer
4) Dermalab USB measurement equipment from Cortex Technology to measure elasticity

Tests were conducted at the location of the pressure ulcer and on normal skin adjacent to the ulcer. The results were then compared to see whether any major differences between the two could be detected. It was concluded that the hypo-echogenic subepidermal layer was found at all pressure ulcer locations but not on normal skin. Skin redness was also a useful indication of a pressure ulcer. Elasticity and temperature, however, did not show significant differences between normal skin and skin at the ulcer site. This could be due to a lack of sensitivity of the measurement devices; perhaps further studies will show greater differences.

This paper proposes the use of a process flow diagram to help address some of the problems found in the literature review.

METHOD

The process flow diagram (PFD) was based on criteria taken from the studies discussed in this paper.
One of the first considerations was the type of scale on which to base the process flow diagram. Because of difficulties with the present scales, improvements based on the discussed studies were made. The following method was followed:
1. Choose a classification system as the basis for the PFD. Criteria for choosing the system:
   a. Validated system
   b. Simple (low number of stages)
   c. Highest inter-rater reliability compared to other systems (from previous studies)
2. Find improvements that need to be made to the current classification system.
   a. Creation of levels to address the problems (using previous studies)
3. Formulate questions to create less ambiguity for the rater.
4. Add more quantitative data as a decision factor.
   a. Quantitative measures: skin firmness and skin temperature

RESULTS

Please see Appendix A for the Process Flow Diagram (PFD).
1. NPUAP Classification System
   a. Validated before implementation
   b. Most widely used (NPUAP system adopted by EPUAP)
   c. Studies show highest inter-rater reliability using this method compared to other methods
2. Current problems:
   a. Differentiating between moisture lesion & pressure ulcer (Defloor & Schoonhoven, 2004; Beeckman et al., 2007; Edsberg, 2007)
   b. Differentiating between blanchable & non-blanchable erythema (Beeckman et al., 2007; Healey, 1995; Nixon et al., 2005)
   c. Differentiating between Stage 1 & Stage 2 ulcers (Beeckman et al., 2007)
   d. Differentiating between Stage 2 & Stage 3 ulcers (Nixon et al., 2005)
3. Yes/No questions in the PFD to decrease ambiguity
4. See the Discussion section for an explanation of the PFD and stage classification

DISCUSSION

The PFD was designed to create a simple and effective way for nurses/caregivers to diagnose and assess pressure ulcers. This proposal addresses current problems that research studies have found with the NPUAP/EPUAP classification system. However, the technique needs to be tested and validated before it is deemed suitable for implementation.
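One way a validation study could quantify inter-rater reliability is with the percent agreement reported in several of the reviewed studies, together with Cohen's kappa, which corrects percent agreement for the agreement expected by chance. The Python sketch below is illustrative only: the function names and the eight ratings are hypothetical, and Cohen's kappa is a standard statistic for two raters rather than a measure prescribed in this paper.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of cases on which two raters chose the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same, non-empty set of cases")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)   # observed agreement
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        # Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical classifications of the same eight ulcers by two nurses.
nurse_a = ["I", "I", "II", "II", "III", "IV", "DTI", "I"]
nurse_b = ["I", "II", "II", "II", "III", "IV", "I", "I"]

print(percent_agreement(nurse_a, nurse_b))  # 0.75
print(round(cohen_kappa(nurse_a, nurse_b), 3))  # 0.66
```

Running the same calculation on classifications made with and without the PFD would give a direct before-and-after comparison of rater agreement.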
The PFD addresses the problems found in previous research studies. Stages 1 and 2, and stages 2 and 3, are differentiated by questions addressing each stage as described by the NPUAP/EPUAP system (see PFD). The PFD is divided into two main sections, the left and the right. The division is made when the user answers the question, ‘Is the skin intact?’ Answering ‘Yes’ leads to the left side of the PFD, which distinguishes between DTI, stage 1 and stage 2 pressure ulcers. Answering ‘No’ leads to the right side of the PFD, which distinguishes between DTI, stage 2, stage 3 and stage 4 pressure ulcers.

Additional recommendations include process flow improvements (PFI):

PFI1. In order to identify erythema (skin redness) in darkly pigmented skin, a chart displaying pictures of each stage for different shades of skin needs to be created from previous documentation. Documented photographs of patients with a stage 1 ulcer across a variety of skin tones would be ideal. The nurse can then compare the patient’s skin color/redness to that in the photograph in order to accurately diagnose the affected area. The DermaSpectrometer can also be used (Andersen & Karlsmark, 2008); however, more studies may need to be conducted to validate its use as an instrument for the classification of pressure ulcers.
a. For raters with color blindness, photographs from previously documented cases can be used to show the contrast between the affected area and normal skin.

PFI2. In order to measure the firmness of the skin (Stage I ulcer descriptor: firm skin), the ultrasonic elastogram can be used. This technique is suggested because ultrasound detection of the hypo-echogenic subepidermal layer gave the best results (Andersen & Karlsmark, 2008).
No literature was found on the use of this technique for the detection of pressure ulcers; however, it has been used in detecting cancerous tumors (Ophir et al., 1999). Studies could be conducted to determine whether it can detect skin firmness.

PFI3. In order to measure the temperature of the skin, a different model of the infrared DermaTemp (~$600) can be used (Andersen & Karlsmark, 2008), one that is more sensitive at detecting temperature differences between two points.

Two other factors that contribute to low inter-rater reliability are the experience and training levels of nurses. Many studies have suggested creating mandatory educational programs that are updated every one to two years to keep up with any changes in pressure ulcer classification (Russell, 2002). There are drawbacks, including the cost of running programs at such frequent intervals; however, the misdiagnosis of ulcers in just one patient at one hospital can cost the hospital tens of thousands of dollars. The PFD will help reduce differences between nurses/caregivers with different levels of experience by asking a set pattern of questions that leads to a standard classification.

FUTURE WORK

The proposed PFD is in its initial stage, and experimental studies need to be conducted to test whether it improves inter-rater reliability. The next step is to test the current PFD to identify improvements to its questions (less ambiguous wording, a different question pattern, etc.). A computer automation system that follows the PFD process would be ideal; it should be built after numerous studies have been conducted on the effectiveness of the PFD. Studies state that redness indexes are the most reliable way to test for a Stage I pressure ulcer (Andersen & Karlsmark, 2008); however, more sensitive systems should be tested.
An ultrasonic elastogram will also need to be tested to determine its accuracy when used for pressure ulcers.

ACKNOWLEDGMENTS

I would like to thank Dr. Grady Hollman and the Department of Industrial Engineering, University of Louisville.

REFERENCES

[1] Black, J., Baharestani, M., Cuddigan, J., Dorner, B., Edsberg, L., Langemo, D., Posthauer, M., Ratliff, C., Taler, G., & The National Pressure Ulcer Advisory Panel (NPUAP). (2007). National Pressure Ulcer Advisory Panel’s Updated Pressure Ulcer Staging System [Electronic version]. Dermatology Nursing, 19, 343-349.
[2] Andersen, E., & Karlsmark, T. (2008). Evaluation of four noninvasive methods for examination and characterization of pressure ulcers [Electronic version]. Skin Research and Technology, 14, 270-276.
[3] Russell, L. (2002). Pressure Ulcer Classification: the systems and the pitfalls [Electronic version]. British Journal of Nursing, 11, S49-S59.
[4] Nixon, J., Thorpe, H., Barrow, H., Phillips, A., Nelson, E., Mason, S., & Cullum, N. (2005). Reliability of pressure ulcer classification and diagnosis [Electronic version]. Journal of Advanced Nursing, 50, 613-623.
[5] Pedley, G. (2004). Comparison of pressure ulcer grading scales: a study of clinical utility and inter-rater reliability [Electronic version]. International Journal of Nursing Studies, 41, 129-140.
[6] Beeckman, D., Schoonhoven, L., Fletcher, J., Furtado, K., Gunningberg, L., Heyman, H., Lindholm, C., Paquay, L., Verdu, J., & Defloor, T. (2007). EPUAP classification system for pressure ulcers: European reliability study [Electronic version]. Journal of Advanced Nursing, 60, 682-691.
[7] Defloor, T., & Schoonhoven, L. (2004). Inter-rater reliability of the EPUAP pressure ulcer classification system using photographs [Electronic version]. Journal of Clinical Nursing, 13, 952-959.
[8] Russell, L., & Reynolds, T. (2001). How accurate are pressure ulcer grades? An image based survey of nurse performance. Journal of Tissue Viability, 11, 67-75.
[9] Healey, F. (1995). The reliability and utility of pressure sore grading scales [Electronic version]. Journal of Tissue Viability, 5, 111-114.
[10] Centers for Disease Control and Prevention. (2009, February). NCHS Data Brief. Retrieved April 12, 2010, from http://www.cdc.gov/nchs/data/databriefs/db14.pdf
[11] European Pressure Ulcer Advisory Panel. (2007). Retrieved April 10, 2010, from http://www.epuap.org/review6_3/page6.html
[12] Ophir, J., Alam, S., Garra, B., Kallel, F., Konofagou, E., Krouskop, T., & Varghese, T. (1999). Elastography: ultrasonic estimation and imaging of the elastic properties of tissues. Journal of Engineering in Medicine, 213, 203-233.
[13] Institute for Healthcare Improvement. (2007, May). Relieve the Pressure and Reduce Harm. Retrieved December 2, 2010, from http://www.ihi.org/IHI/Topics/PatientSafety/SafetyGeneral/ImprovementStories/FSRelievethePressureandReduceHarm.htm

BIOGRAPHICAL SKETCH

Harkina Rangi is currently a Mechanical Engineering graduate student at the University of Louisville. She completed her undergraduate degree in Bioengineering at the University of Louisville, along with a one-and-a-half-year internship at Boston Scientific Corp., before starting her graduate work. Teaming up with the industrial engineering department at the University of Louisville, Harkina was recognized for her preliminary research on inter-rater variability of pressure ulcer classification. She has been both an active member and President of the University’s Biomedical Engineering Society (BMES) as well as an active member on the Engineering Student Councils Board, advocating activities that introduce and teach engineering concepts to middle and high school students. She is currently a member of ASME and BMES.

APPENDIX A

Figure 1: Process Flow Diagram, proposed improvement technique
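
The yes/no branching described in the Discussion, and the computer automation suggested under Future Work, can be sketched as a small decision tree. The Python sketch below is a hypothetical illustration, not the validated PFD: only the root question (‘Is the skin intact?’) and the set of outcomes on each side come from this paper. The wording of every follow-up question is an assumption loosely based on the stage descriptors quoted in the text and in the NPUAP staging system, and the DTI branch on the right side and the moisture lesion check are omitted because their questions are not stated here.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Question:
    """One yes/no node of the flow diagram."""
    text: str
    yes: Union["Question", str]  # next question, or a final classification
    no: Union["Question", str]


# Root question taken from the paper.
Q_INTACT = "Is the skin intact?"
# Hypothetical follow-up questions (assumed wording, not clinically validated).
Q_BRUISE = "Is there bluish purple discoloration?"
Q_NONBLANCH = "Is there redness that does not blanch?"
Q_BLISTER = "Is there a serum-filled blister?"
Q_DEEP = "Is bone, muscle or tendon visible?"
Q_FULL = "Is the tissue loss full thickness?"

# Left side (skin intact): DTI, stage 1 or stage 2.
LEFT = Question(
    Q_BRUISE,
    yes="DTI",
    no=Question(
        Q_NONBLANCH,
        yes="Stage 1",
        no=Question(Q_BLISTER, yes="Stage 2", no="No pressure ulcer classified"),
    ),
)

# Right side (skin not intact): stage 2, 3 or 4 (DTI branch omitted in this sketch).
RIGHT = Question(
    Q_DEEP,
    yes="Stage 4",
    no=Question(Q_FULL, yes="Stage 3", no="Stage 2"),
)

PFD = Question(Q_INTACT, yes=LEFT, no=RIGHT)


def classify(node: Union[Question, str], answers: Dict[str, bool]) -> str:
    """Walk the tree using the rater's yes/no answers, keyed by question text."""
    while isinstance(node, Question):
        node = node.yes if answers[node.text] else node.no
    return node


example = {Q_INTACT: True, Q_BRUISE: False, Q_NONBLANCH: True}
print(classify(PFD, example))  # Stage 1
```

Because every rater is asked the same questions in the same order, two raters who give the same answers always reach the same classification; any remaining disagreement is confined to how individual questions are answered, which is where the quantitative measurements in PFI1 to PFI3 are intended to help.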