THE APPLICATION OF HAZARD FUNCTION IN THE CASE STUDY OF THE USE OF AN IUD FOR WOMEN

NOR AFIAH BINTI NOR HANAFI

A report submitted in partial fulfillment of the requirements for the award of the degree of Master of Science (Mathematics)

Faculty of Science
Universiti Teknologi Malaysia

NOVEMBER 2009

To my beloved mother and father

ACKNOWLEDGEMENTS

First and foremost, all praise be to Allah S.W.T., the Almighty, the Benevolent, for His blessing and guidance and for giving me the inspiration to complete this dissertation. Many thanks to Dr. Zarina Bt Mohd Khalid for her guidance and support throughout the semester. I would also like to express my appreciation to my friends for their motivation throughout this semester. Without their endless support, this dissertation would not have been completed as presented here. Last but not least, I extend my appreciation to all my colleagues and others who provided assistance on various occasions. Their views and tips were a useful aid in completing this project.

ABSTRACT

This dissertation presents the application of the hazard function in the case study of the use of an IUD for women. The data in this case study are secondary data retrieved from Modelling Survival Data in Medical Research, Second Edition, by Collet, 1952. The nonparametric kernel method and the parametric Weibull hazard estimation methods, namely the Maximum Likelihood and Least Square methods, are compared with the Kaplan-Meier estimate of the hazard function. The comparisons are made graphically and through the analysis of standard error. The analysis is carried out using Mathcad and Minitab.
ABSTRAK

This dissertation presents the application of the hazard function in a case study of the use of an IUD among women. The data in this case study are secondary data taken from Modelling Survival Data in Medical Research, Second Edition, by Collet, 1952. The nonparametric kernel technique and the parametric Weibull techniques, namely maximum likelihood and least squares, for estimating the hazard function are compared with the Kaplan-Meier technique for estimating the hazard function. The comparison is made through graphs and the analysis of standard error. The analysis is carried out using Mathcad and Minitab.

TABLE OF CONTENTS

SUPERVISOR AUTHENTICATION
TITLE
DECLARATION
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF APPENDICES

1 INTRODUCTION
  1.1 Background of Problem
  1.2 Statement of Problem
  1.3 Objectives of Study
  1.4 Scope of the Study
  1.5 Significance of the Study
  1.6 Organization of the Report

2 SURVIVAL ANALYSIS
  2.1 Introduction
    2.1.1 The Survival Time
    2.1.2 The Hazard Function
    2.1.3 Weibull Hazard Function
  2.2 The Nonparametric Hazard Function Estimation
    2.2.1 The Kaplan-Meier Estimate of Hazard Function
    2.2.2 The Kernel Estimate of Hazard Function
  2.3 The Parametric Weibull Hazard Function Estimation
    2.3.1 The Maximum Likelihood Estimation of Weibull Hazard Function
    2.3.2 The Least Square Estimate of Weibull Hazard Function

3 THE ESTIMATION OF HAZARD FUNCTION PROCEDURES
  3.1 Introduction
  3.2 The Data of Time to Discontinuation of the Use of an IUD for 18 Women
  3.3 The Nonparametric Estimation of the Hazard Function
    3.3.1 The Kaplan-Meier Estimate of Hazard Function
    3.3.2 Kernel Estimate of Hazard Function
  3.4 The Parametric Estimation of Weibull Hazard Function
    3.4.1 The Maximum Likelihood Estimate of Weibull Hazard Function
    3.4.2 The Least Square Estimate of Weibull Hazard Function

4 RESULTS AND DISCUSSION
  4.1 Introduction
  4.2 The Nonparametric Estimation of Hazard Function for the Time to Discontinuation of the Use of an IUD for 18 Women
    4.2.1 The Kaplan-Meier Estimate of Hazard Function
    4.2.2 The Kernel Estimate of Hazard Function
    4.2.3 The Comparison of Nonparametric Approaches in Estimating the Hazard Function
  4.3 The Parametric Estimate of Weibull Hazard Function for the Time to Discontinuation of the Use of an IUD for 18 Women
    4.3.1 The Maximum Likelihood Estimate of Weibull Hazard Function
    4.3.2 The Least Square Estimate of Weibull Hazard Function
    4.3.3 The Comparison of the Standard Error of Estimated Weibull Hazard Function
  4.4 The Comparison of the Nonparametric and Parametric Approaches in Estimating the Hazard Function

5 CONCLUSIONS AND SUGGESTIONS
  5.1 Conclusions
  5.2 Suggestions
    5.2.1 Suggestions Based on Findings
    5.2.2 Suggestions for Future Research

REFERENCES
Appendices A - E

LIST OF TABLES

TABLE NO.  TITLE
3.1  Time in weeks to discontinuation of the use of an IUD
4.1  Kaplan-Meier type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.2  Kernel estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.3  Maximum Likelihood estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.4  Least Square estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.5  The comparison of standard error of estimated Weibull hazard function
4.6  The standard error (RMSE) of nonparametric and parametric estimation methods of hazard function for the data of the time to discontinuation of the use of an IUD for 18 women

LIST OF FIGURES

FIGURE NO.  TITLE
2.1  The form of the Weibull hazard function, $h(t) = \lambda k t^{k-1}$, for different values of k
2.2  Carl Friedrich Gauss
4.1  Kaplan-Meier type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.2  Epanechnikov Kernel type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.3  Biweight Kernel type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.4  Maximum Likelihood estimate of Weibull hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.5  Least Square estimate of Weibull hazard function for the data of the time to discontinuation of the use of an IUD for 18 women
4.6  The graphical comparison of the nonparametric and parametric approaches in estimating the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women

LIST OF SYMBOLS

$b$ - Bandwidth
$c$ - Censoring time
$d_j$ - Number of deaths at the jth death time
$F_T(\cdot)$ - Cumulative distribution function
$f_T(\cdot)$ - Probability density function
$h_T(t)$ - Failure rate at time t
$H(\cdot)$ - Integrated hazard function
$h(t)$, $\lambda(t)$ - Hazard function at time t
$\hat{h}(t)$, $h^*(t)$, $\hat{\lambda}(t)$ - Estimated hazard function
$K(x)$ - Kernel function of variable x
$k$ - Shape parameter of Weibull hazard function
$L(\theta)$ - Likelihood function of parameter $\theta$
$L_n$ - Modified empirical distribution of L
$n$ - Number of observations
$n_j$ - Number of observations at risk at the jth death time
$S(t)$ - Survival function at time t
$\hat{S}(t)$ - Estimated survival function
$se(\theta)$ - Standard error of $\theta$
$t_{(j)}$ - jth death time
$W(\lambda, k)$ - Weibull hazard distribution
$\lambda$ - Scale parameter of Weibull hazard function
$\hat{\sigma}^2$ - Estimate of variance

LIST OF APPENDICES

APPENDIX  TITLE
A  The application of Minitab in estimating the hazard function
B  Estimated hazard function
C  Standard error of Kernel estimate of hazard function
D  The application of Mathcad in Kernel estimate of the hazard function
E  The Newton-Raphson procedure

CHAPTER 1

INTRODUCTION

1.1 Background of the Problem

The beginning of survival analysis may be traced back to early work on mortality in the seventeenth century, when Graunt published the first Weekly Bill of Mortality in London and Halley published the first life table. Since then, the life table method has been used regularly by actuaries, statisticians, and biomedical researchers in governmental and private agencies. During World War II, the reliability of military equipment became a significant issue. This led to the study of the durability, or “lifetime”, of industrial devices. After the war, the techniques used to analyze the reliability of industrial devices were further developed and applied to the study of the survival time of cancer patients. The phrase “lifetime analysis” used by industrial reliability engineers was changed to “survival analysis” by cancer researchers.
Over the past four decades, survival analysis has become one of the most frequently used techniques for analyzing data in disciplines ranging from medicine, epidemiology, and environmental health to criminology, marketing, and astronomy.

The primary variable in survival analysis is survival time, which is no longer limited to mean the time to death. The term “survival time” is used loosely for the time period from a starting point to the occurrence of a certain event. Examples of survival time are the time to the development of diabetic retinopathy from the time of diagnosis of diabetes, the time to parole of a prisoner, the duration of a first marriage, workman’s compensation or other insurance claims, and the lifetime of electronic devices and computer components.

In summarizing survival data, there are two functions of central interest: the survival function and the hazard function. There are many ways to estimate these functions. This dissertation focuses on estimating the hazard function rather than the survival function. The hazard function will be applied in the case study of the use of an IUD for women. The kernel estimate and the parametric Weibull hazard function estimates, namely the Maximum Likelihood and Least Square methods, will be compared with the Kaplan-Meier estimate in order to find the most suitable approach for estimating the hazard function.

1.2 Statement of the Problem

Two nonparametric estimates of the hazard function are discussed in this dissertation: the Kaplan-Meier estimate and the kernel estimate. For the parametric estimation of the Weibull hazard function, the Maximum Likelihood and Least Square methods are considered. Of these two approaches, one will be more suitable than the other for estimating the hazard function of a finite data set. The problem arises when we want to determine which approach is suitable for estimating the hazard function.
In order to reach a solution, all of these approaches need to be compared, which in turn raises further questions: how to estimate the hazard function using the Kaplan-Meier estimate, the kernel method, the Maximum Likelihood Estimate and the Least Square Method; how to find the standard error of the estimated hazard function; and how to find the most appropriate of the two approaches for estimating the hazard function.

1.3 Objectives of the Study

The objectives of this study are:

1.3.1 To study the basic concepts of the Kaplan-Meier estimate, the kernel method, the Maximum Likelihood Estimate and the Least Square Method in order to estimate the hazard function of the time to discontinuation of the use of an IUD for 18 women.

1.3.2 To compare the nonparametric and parametric estimates obtained using the Kaplan-Meier estimate, the kernel method, the Maximum Likelihood Estimate and the Least Square Method in order to find their advantages and disadvantages.

1.3.3 To determine the most suitable approach for estimating the hazard function through graphical comparison and the analysis of error.

1.4 Scope of the Study

This study discusses the application of two types of approach to the hazard function, nonparametric and parametric, in the case study of the use of an IUD for women. For the nonparametric approach, the Kaplan-Meier estimate and the kernel estimate are discussed; for the parametric approach, Maximum Likelihood Estimation and the Least Square Method are discussed. In addition, this study determines the most precise approach for estimating the hazard function.

1.5 Significance of the Study

The outcomes of this study will benefit both the statistical and medical fields. In statistics, this study will advance knowledge on the use of suitable approaches for analyzing non-negative random variables, for instance the time to a certain event.
In medicine, this research will advance medical practitioners' knowledge of the risk a person faces at certain times after being given some kind of treatment, so that they can improve the treatment given to patients.

1.6 Organization of the Report

This report comprises five chapters: Introduction, Survival Analysis, The Estimation of Hazard Function Procedures, Results and Discussion, and Conclusions and Suggestions.

Chapter 1 introduces the study. It contains six subtopics: background of the problem, statement of the problem, objectives of the study, scope of the study, significance of the study and organization of the report.

Chapter 2 discusses survival analysis. Survival analysis is introduced through the failure time and the hazard function, followed by the nonparametric hazard function estimation and the parametric Weibull hazard function estimation.

Chapter 3 explains the procedures for estimating the hazard function. It contains four subtopics: introduction, the medical data, the nonparametric estimation of the hazard function and the parametric estimation of the Weibull hazard function.

Chapter 4 presents the results and discussion. The main subtopics are the introduction, the nonparametric estimation of the hazard function for the time to discontinuation of the use of an IUD for 18 women, the parametric estimate of the Weibull hazard function for the same data, and the graphical comparison of the nonparametric and parametric approaches in estimating the hazard function.

Lastly, Chapter 5 presents the conclusions and suggestions, which are drawn from the findings as preparation for future research.
CHAPTER 2

SURVIVAL ANALYSIS

2.1 Introduction

Survival analysis is a widely used technique in a variety of disciplines for studying the properties of durations between certain events. Important examples of durations are unemployment spells, survival times of patients, and durations between subsequent transactions in a financial security.

2.1.1 The Survival Time

The main variable in survival analysis is survival time, which is no longer restricted to mean the time to death. The expression “survival time” is used loosely for the time period from a starting point to the occurrence of a certain event. Examples of survival time are the time to the development of diabetic retinopathy from the time of diagnosis of diabetes, the time to parole of a prisoner, the duration of a first marriage, workman’s compensation or other insurance claims, and the lifetime of electronic devices and computer components.

Survival time is a nonnegative random variable measuring the time interval from a starting point to the occurrence of a given event. Because of its two special features, survival time is not amenable to standard statistical methods. First, the exact survival time may be longer than the duration of the study (or observation time) and is consequently unknown. For example, in a ten-year longitudinal study of diabetic retinopathy, many patients may not develop retinopathy in the ten-year period and, as a result, the exact retinopathy-free time is unknown. If a patient with no retinopathy enters the study at the beginning of year 2 and is still retinopathy-free at the end of the study, his retinopathy-free time is at least nine years (usually denoted by 9+ years). In an insurance claim, the total amount of money an insurance company pays in the case of a severe automobile accident at a certain time may not be precisely known. It may be known only that the total amount is at least $5000 (or $5000+).
In many clinical and epidemiological studies, participants may leave the study before it ends and therefore become lost to follow-up. They may die of a cause unrelated to the disease under study or simply refuse to continue their participation. For example, in a three-year study of mortality due to lung cancer, a participant may decline to continue after two years. His survival time is 2+ years. These incomplete observations are known as “censored” observations, and in particular as “right censored” observations, as opposed to “left censored” observations, in which the exact survival time is unknown but is less than the observation time.

The majority of survival analysis methods focus on right censoring, as it occurs far more frequently than left censoring. The incompleteness of the data renders conventional statistical methods inappropriate.

2.1.2 The Hazard Function

The hazard function is a particularly important characteristic of a lifetime distribution. It indicates the way the risk of failure varies with age or time, and this is of interest in most applications. Prior information about the shape of the hazard function can help guide model selection. Finally, if factors affecting an individual’s lifetime vary over time, it is often essential to approach modeling through the hazard function.

The survivor function $\bar{F}_T(t) = \mathrm{pr}(T \ge t)$ and the density $f_T(\cdot)$ provide two mathematically equivalent ways of specifying the distribution of a continuous nonnegative random variable, and there are many other equivalent functions. One of particular value in the present context is the hazard function, or age-specific failure rate, defined by

$$ h_T(t) = \lim_{\Delta \to 0^+} \frac{\mathrm{pr}(t \le T < t + \Delta \mid t \le T)}{\Delta}. \tag{2.1} $$

By the definition of conditional probability, and dropping the suffix T,

$$ h(t) = f(t) / \bar{F}(t). \tag{2.2} $$

If there is an atom $f_j$ of probability at time $a_j$, $h(t)$ contains a component $h_j \delta(t - a_j)$, where

$$ h_j = f_j / \bar{F}(a_j), \tag{2.3} $$

and for a purely discrete distribution with atoms $\{f_j\}$ at points $\{a_j\}$, $a_1 < a_2 < \dots$,

$$ h(t) = \sum_j h_j \delta(t - a_j), \quad \text{where } h_j = f_j / \bar{F}(a_j) = f_j / (f_j + f_{j+1} + \dots). \tag{2.4} $$

For a continuous distribution, the probability density function is

$$ f_T(t) = -\bar{F}_T'(t) = \lim_{\Delta \to 0^+} \frac{\mathrm{pr}(t \le T < t + \Delta)}{\Delta}. \tag{2.5} $$

By (2.2) and (2.5),

$$ h(t) = -\bar{F}'(t) / \bar{F}(t) = -\frac{d \log \bar{F}(t)}{dt}, $$

so that, because $\bar{F}(0) = 1$,

$$ \bar{F}(t) = \exp\left( -\int_0^t h(u)\,du \right) = \exp[-H(t)], \tag{2.6} $$

say, where $H(\cdot)$ is called the integrated hazard. In addition,

$$ f(t) = h(t) \exp[-H(t)]. \tag{2.7} $$

If and only if $h(\cdot)$ is constant, with value $\rho$ say, the distribution is exponential:

$$ \bar{F}(t) = e^{-\rho t}, \quad f(t) = \rho e^{-\rho t}. \tag{2.8} $$

For a discrete distribution, it follows on applying (2.4) recursively, or by a direct application of the product law of probabilities, that

$$ \bar{F}(t) = \prod_{a_j < t} (1 - h_j); \tag{2.9} $$

to have $T \ge t$ it is necessary and sufficient to survive all points of support before t. To define an integrated hazard in the discrete case, the most useful convention is to take

$$ H(t) = -\sum_{a_j < t} \log(1 - h_j), \tag{2.10} $$

so that (2.6) still holds: $\bar{F}(t) = \exp[-H(t)]$. If the $h_j$ are small,

$$ H(t) \approx \sum_{a_j < t} h_j, \tag{2.11} $$

and the right-hand side could be taken as an alternative definition.

Cox and Oakes (1984) stated that there are several reasons why consideration of the hazard function may be a good idea:

(i) It may be physically enlightening to consider the immediate ‘risk’ attaching to an individual known to be alive at age t;

(ii) Comparisons of groups of individuals are sometimes most incisively made via the hazard;

(iii) Hazard-based models are often convenient when there is censoring or there are several types of failure;

(iv) Comparison with an exponential distribution is particularly simple in terms of the hazard;

(v) The hazard is the special form, for the ‘single failure’ system, of the complete intensity function for more elaborate point processes, that is, systems in which several point events can occur for each individual.

Lee and Go (1997) stated that the hazard function gives the conditional failure rate and is defined as

$$ h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}. $$

The hazard function is also known as the instantaneous failure rate, age-specific failure rate, or conditional mortality rate. It is a measure of the proneness to failure as a function of the age of the individual, in the sense that the quantity $\Delta t\, h(t)$ is the expected proportion of individuals of age t who will fail in the short interval t to $t + \Delta t$. It is not actually a probability, as its value can be larger than one. The hazard function plays a significant role in survival analysis.

Assume that the distribution and density functions of T, the “failure time” random variable, are denoted by F and f respectively. Subramanian and Bean (2008) defined the hazard function, denoted by $\lambda(t)$, as the limit of $P(t \le T < t + h \mid T \ge t)/h$ as h tends to 0, which takes the form $f(t)/\bar{F}(t)$, where $\bar{F}(t) = 1 - F(t)$.
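The discrete-time relations (2.4), (2.9) and (2.10) can be checked numerically. The following Python sketch is illustrative only: the four support points and their probabilities are hypothetical values chosen for the example, not data from this study, and the function names are assumptions. It takes the integrated hazard as minus the sum of log(1 - h_j), the sign required for (2.6) to hold.

```python
import math

def discrete_hazards(atoms):
    """Hazard contributions h_j = f_j / (f_j + f_{j+1} + ...), as in (2.4)."""
    return [f_j / sum(atoms[j:]) for j, f_j in enumerate(atoms)]

def survivor(t, points, hazards):
    """Product formula (2.9): multiply (1 - h_j) over support points a_j < t."""
    value = 1.0
    for a_j, h_j in zip(points, hazards):
        if a_j < t:
            value *= 1.0 - h_j
    return value

def integrated_hazard(t, points, hazards):
    """H(t) = -sum of log(1 - h_j) over a_j < t; requires every such h_j < 1."""
    return -sum(math.log(1.0 - h_j)
                for a_j, h_j in zip(points, hazards) if a_j < t)

# Hypothetical distribution: failure times 1..4 with probabilities summing to one.
points = [1, 2, 3, 4]
atoms = [0.1, 0.2, 0.3, 0.4]
hazards = discrete_hazards(atoms)

t = 3  # the survivor function at t = 3 is P(T >= 3) = 0.3 + 0.4
print(survivor(t, points, hazards))
print(math.exp(-integrated_hazard(t, points, hazards)))
print(sum(h for a, h in zip(points, hazards) if a < t))  # approximation (2.11)
```

The first two printed values agree up to rounding, which is the content of (2.6) in the discrete case. The third value, the plain sum of the h_j, only roughly approximates H(t) here because the hazards in this example are not small.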
When data are right censored, the observed data are n independent and identically distributed replicates of $(X, \delta)$, where $X = \min(T, C) = T \wedge C$, $\delta = I(X = T)$, and C is an independent censoring time. This is the classical random censorship (CRC) model, for which estimation of $\lambda(t)$ has been well studied and which serves as an excellent basis of reference for a variety of hazard estimation methods.

In addition, Horová et al. (2009) stated that the survival process can also be described by the hazard function $\lambda = \lambda(x)$, that is, the instantaneous risk that an individual dies at time x, conditional on he or she having survived to that time. If the lifetime distribution F has a density f, then for $\bar{F}(x) = 1 - F(x) > 0$ the hazard function is defined by

$$ \lambda(x) = \frac{f(x)}{\bar{F}(x)}. $$

Since $\bar{F}(0) = 1$, the survival function can be expressed by the formula

$$ \bar{F}(x) = \exp\left( -\int_0^x \lambda(t)\,dt \right). $$

Let a cohort of initial size $N_0$ die out with the time-dependent death rate $\mu = \mu(x)$, so that the size of the cohort $N = N(x)$ at time x evolves according to the differential equation

$$ N'(x) = -\mu(x) N(x), \quad N(0) = N_0, $$

whose solution is given by

$$ N(x) = N_0 \exp\left( -\int_0^x \mu(t)\,dt \right). $$

In this context the survival function is defined as

$$ \bar{F}(x) = \frac{N(x)}{N_0}. $$

Hence, the death rate $\mu$ equals the hazard function $\lambda$. Consequently,

$$ \lambda(x) = -\frac{N'(x)}{N(x)}. $$

2.1.3 Weibull Hazard Function

A probability distribution that plays a central role in the analysis of survival data is the Weibull distribution, introduced by W. Weibull in 1951 in the context of industrial reliability testing. Indeed, this distribution is as central to the parametric analysis of survival data as the normal distribution is in linear modeling. In practice, the assumption of a constant hazard function, or equivalently of exponentially distributed survival times, is rarely tenable.
Collet (1952a) stated that a more general form of hazard function is

$$ h(t) = \lambda k t^{k-1}, \tag{2.12} $$

for $0 \le t < \infty$, a function that depends on two parameters k and $\lambda$, both of which are greater than 0. In the particular case where k = 1, the hazard function takes the constant value $\lambda$, and the survival times have an exponential distribution. For other values of k, the hazard function increases or decreases monotonically, that is, it does not change direction. The shape of the hazard function depends critically on the value of k, and so k is known as the shape parameter, while the parameter $\lambda$ is a scale parameter. The general form of this hazard function for different values of k is shown in Figure 2.1.

Figure 2.1: The form of the Weibull hazard function, $h(t) = \lambda k t^{k-1}$, for different values of k.

For this particular choice of hazard function, the survival function is

$$ S(t) = \exp\left\{ -\int_0^t \lambda k u^{k-1}\,du \right\} = \exp(-\lambda t^k). $$

The corresponding probability density function is then

$$ f(t) = \lambda k t^{k-1} \exp(-\lambda t^k), \tag{2.13} $$

for $0 \le t < \infty$, which is the density of a random variable that has a Weibull distribution with scale parameter $\lambda$ and shape parameter k. The reparametrization $k = \alpha^{-1}$ is frequently used in place of k. This distribution will be denoted W($\lambda$, k). The right-hand tail of this distribution is longer than the left-hand one, and thus the distribution is positively skewed.

Franklin (2005) stated that the exponential distribution is not very flexible. It imposes the characteristic that the hazard function is constant. Substantively, this means that the chance of failure is the same at all survival times: whether the case has survived 10 days, 10 months or 10 years, it has the same chance of failure in the next moment. Many find this property highly restrictive. The Weibull has a much more flexible form and includes the exponential as a special case, which makes it more attractive than the exponential.
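The behaviour of the Weibull hazard in (2.12) for k below, equal to and above one can be illustrated with a short Python sketch. The function names and the parameter values are assumptions made for illustration, not estimates from the IUD data. The sketch also approximates the integrated hazard by a midpoint rule so that the survival function exp(-λt^k) can be compared with the exponential of minus the integrated hazard.

```python
import math

def weibull_hazard(t, lam, k):
    """Weibull hazard h(t) = lam * k * t**(k - 1), for t > 0, as in (2.12)."""
    return lam * k * t ** (k - 1)

def weibull_survival(t, lam, k):
    """Survival function S(t) = exp(-lam * t**k)."""
    return math.exp(-lam * t ** k)

def weibull_density(t, lam, k):
    """Density f(t) = h(t) * S(t), as in (2.13)."""
    return weibull_hazard(t, lam, k) * weibull_survival(t, lam, k)

def integrated_hazard(t, lam, k, steps=1000):
    """Midpoint-rule approximation of H(t), the integral of h(u) from 0 to t."""
    width = t / steps
    return width * sum(weibull_hazard((i + 0.5) * width, lam, k)
                       for i in range(steps))

# Hypothetical parameter values, chosen only to show the three shapes:
# decreasing (k < 1), constant (k = 1) and increasing (k > 1) hazard.
for k in (0.5, 1.0, 2.0):
    print(k, [round(weibull_hazard(t, 0.1, k), 4) for t in (1.0, 5.0, 10.0)])
```

The midpoint rule is used here only because it avoids evaluating the hazard at t = 0, where it is unbounded for k < 1; any other quadrature rule would serve the same purpose.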
2.2 The Nonparametric Hazard Function Estimation

2.2.1 The Kaplan-Meier Estimate of Hazard Function

The estimation of the hazard function, as well as of the survival function, has been an important topic in survival analysis when the data are censored. In the analysis of survival data, the hazard function is estimated from the observed survival times.

Estimation of the hazard function has drawn much less attention than that of the survival function. For the survival function, the Kaplan–Meier estimator is the most widely used. This estimator is self-consistent, and its asymptotic properties, such as strong consistency and asymptotic normality, are well established.

Furthermore, the Kaplan-Meier estimate can also be used to estimate the hazard function. Collet (1952b) stated that a single sample of survival data may also be summarized through the hazard function, which shows the dependence of the instantaneous risk of death on time, and that the Kaplan-Meier approach is one way of estimating the hazard function. It is a natural way of estimating the hazard function for survival data: it takes the ratio of the number of deaths at a given death time to the number of individuals at risk at that time.

2.2.2 The Kernel Estimate of Hazard Function

A kernel is a weighting function applied in nonparametric estimation techniques. Kernels are used in kernel density estimation to estimate the density functions of random variables, or in kernel regression to estimate the conditional expectation of a random variable. Kernels are also used in time series, in the use of the periodogram to estimate the spectral density. A further use is in the estimation of a time-varying intensity for a point process.

A kernel is a non-negative real-valued integrable function K satisfying the following two requirements:

$$ \int_{-\infty}^{+\infty} K(u)\,du = 1 \quad \text{and} \quad K(-u) = K(u) \text{ for all values of } u. $$

The first requirement ensures that the method of kernel density estimation results in a probability density function, and the second requirement ensures that the average of the corresponding distribution is equal to that of the sample used.

If K is a kernel, then so is the function $K^*$ defined by $K^*(u) = \lambda^{-1} K(u/\lambda)$, where $\lambda > 0$. This can be used to select a scale that is appropriate for the data.

Huang (2005) stated that common choices for the kernel are the Epanechnikov kernel, with

$$ K(u) = 0.75\,(1 - u^2) \quad \text{for } -1 \le u \le 1, $$

and the biweight kernel, with

$$ K(u) = \frac{15}{16}(1 - u^2)^2 \quad \text{for } -1 \le u \le 1. $$

Both kernels are zero for u outside this range. Here $u = (x - x_i)/h$, where h is the bandwidth, the $x_i$ are the values of the independent variable in the data, and x is the value of the scalar independent variable at which an estimate is sought.

2.3 The Parametric Weibull Hazard Function Estimation

2.3.1 The Maximum Likelihood Estimation of Weibull Hazard Function

Maximum likelihood estimates of the two defining parameters of the Weibull distribution are the most efficient estimates (lowest variances), but they are numerically complex and are usually calculated with a computer program.

Lawless (2003) stated that to obtain a likelihood function, $L(\theta) \propto \Pr(\text{Data}; \theta)$, or the properties of statistical procedures based on censored data, it is necessary to consider the process by which both lifetimes and censoring times arise. To do this, a probability model is needed for the censoring mechanism. Interestingly, it turns out that the observed likelihood function for lifetime parameters takes the same form under a wide range of mechanisms.

First, some notation for censored data is set up. Assume that n individuals have lifetimes represented by random variables $T_1, T_2, \dots, T_n$.
Instead of the observed value of each lifetime, however, there is a known time $t_i$ which is either the lifetime or a censoring time. A variable $\delta_i = I(T_i = t_i)$ is defined as

$$ \delta_i = \begin{cases} 1, & T_i = t_i \\ 0, & T_i > t_i. \end{cases} $$

The observed data then consist of $(t_i, \delta_i)$, $i = 1, \dots, n$. With this notation, $t_i$ stands for either the random variable or its realized value. This departs from the convention in which capital letters denote random variables and lowercase letters denote realized values, but no confusion should arise.

The most important result is that, for a variety of censoring mechanisms, the observed likelihood function is

$$ L = \prod_{i=1}^{n} f(t_i)^{\delta_i} S(t_i)^{1 - \delta_i}. $$

Perhaps the most important property of the maximum likelihood procedure is that it produces an estimate of the variance of the distribution of the estimated quantities. This estimated variance is also adjusted to account for the incomplete information from the censored observations. As mentioned, the influence of censored values is rarely a concern once it is accounted for in the estimation process.

Consider the hazard function or escape rate

$$ h(y \mid x, \theta) = \lim_{\Delta \to 0} \frac{\Pr(y \le Y < y + \Delta \mid y \le Y, x)}{\Delta} = \frac{f(y \mid x, \theta)}{1 - F(y \mid x, \theta)}. $$

The hazard function is just another way of characterizing a distribution, like the density function, the distribution function, the survivor function, the moment generating function or the characteristic function. It is simply a particularly convenient and interpretable way of describing a distribution of durations. Given the hazard, the distribution function can be calculated as

$$ F(y \mid x, \theta) = 1 - \exp\left( -\int_0^y h(s \mid x, \theta)\,ds \right), $$

and hence so can the density function.

The exponential model implies that the hazard function remains constant over the duration of the spell, equal to $\exp(x'\beta)$ in the previous specification. To explain this, take a person and look at their chances of finding a job on the first day of being unemployed.
These chances are the same as the chances that this same person would find a job on the fiftieth day, given that he has failed to find work in the first forty-nine days. This may be reasonable, but it may not be something that one wishes to impose from the outset. Therefore, an extension that allows the hazard function to increase, stay constant, or decrease over time is considered. This extension is known as the Weibull distribution:

$$ h(y \mid x, \beta, \alpha) = (\alpha + 1)\, y^{\alpha} \exp(x'\beta). $$

Note that this reduces to the exponential distribution if $\alpha = 0$. The implied density function for the Weibull distribution is

$$ f(y \mid x, \beta, \alpha) = (\alpha + 1)\, y^{\alpha} \exp(x'\beta) \exp\left( -y^{\alpha + 1} \exp(x'\beta) \right). $$

The moments of this distribution are

$$ E[y^k \mid X] = \exp\left( -\frac{k x'\beta}{\alpha + 1} \right) \Gamma\left( \frac{\alpha + 1 + k}{\alpha + 1} \right). $$

(Note that for the case with $\alpha = 0$ this reduces to the exponential case with $E[y^k \mid X] = \exp(-k x'\beta)\,\Gamma(1 + k)$, and thus with k = 1 the mean of the exponential distribution is $E[y \mid X] = \exp(-x'\beta)$.)

The log likelihood function for this model is

$$ L(\alpha, \beta) = \sum_{t=1}^{N} \left( \ln(\alpha + 1) + \alpha \ln y_t + x_t'\beta - y_t^{\alpha + 1} \exp(x_t'\beta) \right). $$

2.3.2 The Least Square Estimate of Weibull Hazard Function

The Least Square Estimate is so frequently applied in engineering and mathematics problems that it is often not thought of as an estimation problem. Least squares is a time-honored estimation procedure that has been in use since the early nineteenth century. It is possibly the most widely used technique in geophysical data analysis.

The technique of least squares developed in the fields of astronomy and geodesy as scientists and mathematicians sought to provide solutions to the challenges of navigating the Earth's oceans during the Age of Exploration. An accurate description of the behaviour of celestial bodies was key to allowing ships to navigate in open seas, where previously sailors had depended on land sightings to determine the positions of their ships.
The method was the culmination of several advances that took place during the course of the eighteenth century:

(i) The combination of different observations taken under the same conditions, as opposed to simply trying one's best to observe and record a single observation accurately. This approach was used by Tobias Mayer while studying the librations of the moon.

(ii) The combination of different observations as being the best estimate of the true value; errors decrease with aggregation rather than increase, an idea perhaps first expressed by Roger Cotes.

(iii) The combination of different observations taken under different conditions, as done by Roger Joseph Boscovich in his work on the shape of the earth and by Pierre-Simon Laplace in his work explaining the differences in motion of Jupiter and Saturn.

(iv) The development of a criterion that can be evaluated to determine when the solution with the minimum error has been achieved, developed by Laplace in his Method of Situation.

Figure 2.2: Carl Friedrich Gauss

Carl Friedrich Gauss is credited with developing the fundamentals of least-squares analysis in 1795, at the age of eighteen; however, Legendre was the first to publish the method. An early demonstration of the strength of Gauss's method came when it was used to predict the future location of the newly discovered asteroid Ceres. On January 1, 1801, the Italian astronomer Giuseppe Piazzi discovered Ceres and was able to track its path for 40 days before it was lost in the glare of the sun. Based on these data, astronomers wished to determine the location of Ceres after it emerged from behind the sun without solving Kepler's complicated nonlinear equations of planetary motion. The only predictions that successfully allowed the Hungarian astronomer Franz Xaver von Zach to relocate Ceres were those made by the 24-year-old Gauss using least-squares analysis.
Gauss did not publish the method until 1809, when it appeared in volume two of his work on celestial mechanics, Theoria Motus Corporum Coelestium in sectionibus conicis solem ambientium. In 1829, Gauss was able to state that the least-squares approach to regression analysis is optimal in the sense that, in a linear model where the errors have a mean of zero, are uncorrelated, and have equal variances, the best linear unbiased estimator of the coefficients is the least-squares estimator. This result is known as the Gauss–Markov theorem. The idea of least-squares analysis was also independently formulated by the Frenchman Adrien-Marie Legendre in 1805 and the American Robert Adrain in 1808.

Agnew and Constable (2004) stated that, unlike maximum likelihood, which can be applied to any problem for which the general form of the joint probability density function is known, in least squares the parameters to be estimated must arise in expressions for the means of the observations. When the parameters appear linearly in these expressions, the least squares estimation problem can be solved in closed form, and it is relatively simple to derive the statistical properties of the resulting parameter estimates.

One very simple example, treated in some detail in order to illustrate the more general problem, is that of fitting a straight line to a collection of pairs of observations $(x_i, y_i)$, where $i = 1, 2, \ldots, n$. A model of the form

$$y = \beta_0 + \beta_1 x \tag{2.14}$$

is assumed, and a mechanism for determining $\beta_0$ and $\beta_1$ is needed. This is just a particular case of the more general problem of fitting a polynomial of order $p$, for which one would need to find $p + 1$ coefficients.

The most frequently used method for finding such a model is least squares estimation. It is supposed that $x$ is an independent (or predictor) variable which is known exactly, while $y$ is a dependent (or response) variable.
The least squares (LS) estimates of $\beta_0$ and $\beta_1$ are those for which the predicted values of the line minimize the sum of the squared deviations from the observations. The problem is to find the values of $\beta_0$, $\beta_1$ that minimize the residual sum of squares

$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2. \tag{2.15}$$

Note that this involves the minimization of vertical deviations from the line (not the perpendicular distance) and is therefore not symmetric in $y$ and $x$. In other words, if $x$ is treated as the dependent variable instead of $y$, one might well expect a different result.

To obtain the minimizing values of $\beta_i$ in (2.15), the equations resulting from setting

$$\frac{\partial S}{\partial \beta_0} = 0, \qquad \frac{\partial S}{\partial \beta_1} = 0 \tag{2.16}$$

are solved, namely

$$\sum_i y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_i x_i, \qquad \sum_i x_i y_i = \hat{\beta}_0 \sum_i x_i + \hat{\beta}_1 \sum_i x_i^2. \tag{2.17}$$

Solving for the $\hat{\beta}_i$ yields the least squares parameter estimates:

$$\hat{\beta}_0 = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \hat{\beta}_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}, \tag{2.18}$$

where the sums are implicitly taken from $i = 1$ to $n$ in each case. Having obtained these estimates, it is natural to ask how much confidence can be placed in the values of $\hat{\beta}_0$ and $\hat{\beta}_1$, and whether the fit to the data is reasonable. Possibly a different functional form would provide a more appropriate fit to the observations, for example one involving several independent variables, so that

$$y \approx \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \tag{2.19}$$

or decay curves

$$f(t) = A e^{-\alpha t} + B e^{-\beta t} \tag{2.20}$$

or periodic functions

$$f(t) = A \cos \omega_1 t + B \sin \omega_1 t + C \cos \omega_2 t + D \sin \omega_2 t. \tag{2.21}$$

In equations (2.20) and (2.21) the functions $f(t)$ are linear in $A$, $B$, $C$ and $D$, but nonlinear in the other parameters $\alpha$, $\beta$, $\omega_1$ and $\omega_2$. When the function to be fitted is linear in the parameters, the partial derivatives of $S$ with respect to them yield equations that can be solved in closed form. In general, nonlinear least squares problems do not have a closed-form solution, and one must resort to an iterative procedure.
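As an illustration of the closed-form solution (2.18), the following minimal Python sketch computes the two estimates for a straight-line fit. The function name `least_squares_line` and the four data points are hypothetical; the points are chosen to lie exactly on the line y = 1 + 2x so that the result can be checked by hand, and they are not taken from the case study.

```python
def least_squares_line(xs, ys):
    """Closed-form least squares estimates (2.18) for y = b0 + b1 * x."""
    n = len(xs)
    sx = sum(xs)                               # sum of x_i
    sy = sum(ys)                               # sum of y_i
    sxx = sum(x * x for x in xs)               # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x_i * y_i
    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b0 = (sxx * sy - sx * sxy) / denom
    b1 = (n * sxy - sx * sy) / denom
    return b0, b1


# Hypothetical, noise-free data lying on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b0, b1 = least_squares_line(xs, ys)
print(b0, b1)
```

For data without noise the estimates recover the intercept and slope of the generating line; with noisy data they give the line that minimizes the residual sum of squares (2.15).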
In some cases, however, a nonlinear function to be fitted can be converted into a linear form. As an example, the Arrhenius equation models the rate of a chemical reaction as a function of temperature via a two-parameter model with an unknown constant frequency factor $C$ and activation energy $E_A$, so that

$$\alpha(T) = C e^{-E_A / kT}, \tag{2.22}$$

where Boltzmann's constant, $k$, is known a priori. If one measures $\alpha$ at various values of $T$, then $C$ and $E_A$ can be obtained by a linear least squares fit to the transformed variables, $\log \alpha$ and $1/T$:

$$\log \alpha(T) = \log C - \frac{E_A}{kT}. \tag{2.23}$$

CHAPTER 3

THE ESTIMATION OF HAZARD FUNCTION PROCEDURES

3.1 Introduction

This chapter discusses the procedures used to estimate the hazard function of the time to discontinuation of the use of an IUD for 18 women. The general purpose of this dissertation is to determine the most suitable approach, parametric or nonparametric, for estimating the hazard function in the case study of the use of an IUD for women. The parametric estimates that will be compared are the maximum likelihood estimate and the least square estimate. To assess and evaluate the most appropriate estimate, all these methods are used to estimate the hazard function of the time to discontinuation of the use of an IUD for 18 women.

3.2 The Data of Time to Discontinuation of the Use of an IUD for 18 Women

The data on the time to discontinuation of the use of an IUD for 18 women are taken from a secondary source, "Modelling Survival Data in Medical Research (Second Edition)" by Collett (2003). The data originate from research conducted by the World Health Organisation (WHO). In trials involving contraceptives, prevention of pregnancy is an obvious criterion for acceptability.
However, modern contraceptives have very low failure rates, and so the occurrence of bleeding disturbances, such as amenorrhoea (the prolonged absence of bleeding) and irregular or prolonged bleeding, becomes important in the evaluation of a particular method of contraception. To promote research into methods of analyzing menstrual bleeding data for women in contraceptive trials, the World Health Organisation has made available data from clinical trials involving a number of different types of contraceptive. Part of this data set relates to the time from which a woman commences use of a particular method until discontinuation, with the discontinuation reason being recorded when known.

The data in Table 3.1 refer to the number of weeks from the commencement of use of a particular type of intrauterine device (IUD), known as the Multiload 250, until discontinuation because of menstrual bleeding problems. Data are given for 18 women, all of whom were aged between 18 and 35 years and had experienced two previous pregnancies. Discontinuation times that are censored are labeled with an asterisk.

Table 3.1: Time in weeks to discontinuation of the use of an IUD.

10     13*    18*    19     23*    30
36     38*    54*    56*    59     75
93     97     104*   107    107*   107*

In these data, the time origin corresponds to the first day on which a woman uses the IUD, and the end point is discontinuation because of bleeding problems. Some women in the study ceased using the IUD because of the desire for pregnancy, or because they had no further need for a contraceptive, while others were simply lost to follow up. These reasons account for the censored discontinuation times of 13, 18, 23, 38, 54 and 56 weeks. The study protocol called for the menstrual bleeding experience of each woman to be documented for a period of two years from the time origin. For practical reasons,
each woman could not be examined exactly two years after recruitment to determine whether she was still using the IUD, and this is why there are three discontinuation times greater than 104 weeks that are right censored.

These data were originally analyzed to summarize the distribution of discontinuation times and to estimate the median time to discontinuation of the IUD, or the probability that a woman will stop using the device after a given period of time. In this dissertation, however, the data are used to assess and evaluate the most appropriate method of estimating the hazard function.

3.3 The Nonparametric Estimation of the Hazard Function

3.3.1 The Kaplan-Meier Estimate of Hazard Function

For the Kaplan-Meier type estimate, a natural way of estimating the hazard function for ungrouped survival data is to take the ratio of the number of deaths at a given death time to the number of individuals at risk at that time. If the hazard function is assumed to be constant between successive death times, the hazard per unit time can be obtained by further dividing by the length of the time interval. Hence, if there are $d_j$ deaths at the $j$th death time, $t_{(j)}$, $j = 1, 2, \ldots, r$, and $n_j$ individuals at risk at time $t_{(j)}$, the hazard function can be estimated by

$$\hat{h}(t) = \frac{d_j}{n_j \tau_j}, \tag{3.1}$$

for $t_{(j)} \le t < t_{(j+1)}$, where $\tau_j = t_{(j+1)} - t_{(j)}$. Note that equation (3.1) cannot be used to estimate the hazard in the interval that begins at the final death time, since this interval is open-ended.

The estimate in equation (3.1) is known as a Kaplan-Meier type estimate, since the estimated survival function obtained from it is the Kaplan-Meier estimate. To see this, note that $\hat{h}(t)$, $t_{(j)} \le t < t_{(j+1)}$, is an estimate of the risk of death per unit time in the $j$th interval. The probability of death in that interval is $\hat{h}(t)\tau_j$, that is, $d_j / n_j$.
Hence, an estimate of the corresponding survival probability in that interval is $1 - d_j / n_j$, and the estimated survival function is given by

$$\hat{S}(t) = \prod_{j=1}^{k} \left( 1 - \frac{d_j}{n_j} \right), \tag{3.2}$$

for $t_{(k)} \le t < t_{(k+1)}$.

The approximate standard error of $\hat{h}(t)$ can be found from the variance of $d_j$, which may be assumed to have a binomial distribution with parameters $n_j$ and $p_j$, where $p_j$ is the probability of death in the interval of length $\tau_j$. Consequently, $\mathrm{var}(d_j) = n_j p_j (1 - p_j)$, and estimating $p_j$ by $d_j / n_j$ gives

$$\mathrm{se}\{\hat{h}(t)\} = \hat{h}(t) \sqrt{\frac{n_j - d_j}{n_j d_j}}.$$

However, when $d_j$ is small, confidence intervals constructed using this standard error will be too wide to be of practical use.

3.3.2 Kernel Estimate of Hazard Function

The general kernel estimator $h^*(t)$ of the failure hazard at point $t$ is given by

$$h^*(t) = \sum_{i=1}^{n} \frac{d_i}{n - i + 1} K_\theta(t - t_{(i)}).$$

When $\theta = m$, that is, $K_\theta(z) = (1/m) K(z/m)$, this estimator is referred to as the 1-parameter estimator. The estimate $h^*(t)$ can be regarded as a convolution smoothing of the formal derivative of the empirical cumulative hazard

$$\hat{H}(t) = \sum_{t_i \le t} a_i,$$

where $a_i = d_i / (\text{number of items at risk at time } t_i) = d_i / (n - \text{rank of } t_i + 1)$.

Let $T_1, T_2, \ldots, T_n$ be independent and identically distributed lifetimes with distribution function $F$. Let $C_1, C_2, \ldots, C_n$ be independent and identically distributed censoring times with distribution function $G$, which are usually assumed to be independent of the lifetimes. In the random censorship model we observe pairs $(t_i, d_i)$, $i = 1, 2, \ldots, n$, where $t_i = \min(T_i, C_i)$ and $d_i = I\{t_i = T_i\}$ indicates whether the observation is censored or not. It follows that the $\{t_i\}$ are independent and identically distributed with distribution function $L$ satisfying

$$\bar{L}(t) = \bar{F}(t)\,\bar{G}(t),$$

where $\bar{E} = 1 - E$ is the survival function for any distribution function $E$. The survival function $\bar{F}(t)$ is the probability that an individual survives for a time greater than or equal to $t$.
Kaplan and Meier proposed the product limit estimate of $\bar{F}$:

$$\hat{\bar{F}}(t) = \prod_{\{j :\, t_{(j)} < t\}} \left( \frac{n - j}{n - j + 1} \right)^{d_{(j)}}, \tag{3.3}$$

where $t_{(j)}$ denotes the $j$th order statistic of $t_1, t_2, \ldots, t_n$ and $d_{(j)}$ the corresponding indicator of the censoring status.

The hazard function $h(t)$ is the instantaneous rate at which an individual dies at time $t$, conditional on he or she having survived to that time. If the life distribution $F$ has a density $f$, then for $\bar{F}(t) > 0$ the hazard function is defined by

$$h(t) = \frac{f(t)}{\bar{F}(t)} \tag{3.4}$$

and the cumulative hazard function by

$$H(t) = -\log \bar{F}(t). \tag{3.5}$$

Nelson proposed to estimate the cumulative hazard function $H$ by

$$H_n(t) = \sum_{t_{(i)} \le t} \frac{d_{(i)}}{n - i + 1}. \tag{3.6}$$

Let $[0, T]$, $T > 0$, be an interval for which $L(T) < 1$. First, some assumptions are made:

(1) $h \in C^{k_0}[0, T]$, $k_0 \ge 2$.

(2) $\nu$ and $k$ are nonnegative integers satisfying $0 \le \nu \le k - 2$, $2 \le k \le k_0$.

(3) $K$ is a real-valued function on $\mathbb{R}$ satisfying the conditions

(i) $\mathrm{support}(K) = [-1, 1]$, $K(-1) = K(1) = 0$;

(ii) $K \in \mathrm{Lip}[-1, 1]$;

(iii) $$\int_{-1}^{1} x^j K(x)\, dx = \begin{cases} 0, & 0 \le j < k,\ j \ne \nu, \\ (-1)^{\nu} \nu!, & j = \nu, \\ \beta_k \ne 0, & j = k. \end{cases}$$

Such a function is called a kernel of order $k$, and the class of these kernels is denoted by $S_{\nu k}$.

(4) $\{b(n)\}$ is a non-random sequence of positive numbers satisfying

$$\lim_{n \to \infty} b(n) = 0, \qquad \lim_{n \to \infty} b(n)^{2\nu + 1} n = \infty.$$

These numbers are called bandwidths or smoothing parameters.

The definition of the kernel given above is convenient for what follows; moreover, it is reasonable to assume that $\nu$ and $k$ have the same parity. This fact enables an optimal kernel to be chosen.

The kernel estimate of the $\nu$th derivative of the hazard function $h$ is the following convolution of the kernel $K$ with the Nelson estimator $H_n$:

$$h^{*(\nu)}_{b,K}(t) = \frac{1}{b^{\nu+1}} \int K\left( \frac{t - u}{b} \right) dH_n(u) = \frac{1}{b^{\nu+1}} \sum_{i=1}^{n} K\left( \frac{t - t_{(i)}}{b} \right) \frac{d_{(i)}}{n - i + 1}, \qquad K \in S_{\nu k}. \tag{3.7}$$

Small values of $b$ lead to very spiky estimates (not much smoothing), while larger values of $b$ lead to oversmoothing.
Spikiness in the estimate can be alleviated by increasing the bandwidth of the kernel to a larger value, such as 0.5. The variance $\sigma^2$ can be estimated by

$$\hat{\sigma}^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (t_i - \bar{t})^2. \tag{3.8}$$

The asymptotic $(1 - \alpha)$ confidence interval for $h^{*(\nu)}_{b,K}(t)$ is given by

$$h^{*(\nu)}_{b,K}(t) \pm \left\{ \frac{\hat{h}_{b,K}(t)\, V(K)}{(1 - L_n(t))\, n\, b^{2\nu+1}} \right\}^{1/2} \Phi^{-1}(1 - \alpha/2), \tag{3.9}$$

where $V(K) = \int_{-1}^{1} K^2(x)\, dx$, $\Phi$ is the normal cumulative distribution function and $L_n$ is the modified empirical distribution function of $L$,

$$L_n(t) = \frac{1}{n + 1} \sum_{i=1}^{n} 1_{\{t_i \le t\}}.$$

When estimating near 0 or $T$, boundary effects can occur because the "effective support" $[t - b, t + b]$ of the kernel $K$ is not contained in $[0, T]$. This can lead to negative estimates of the hazard function near the endpoints. The same can happen if kernels of higher order are used in the interior. In such cases it may be reasonable to truncate the estimate at zero, $h^*_{b,K}(t) = \max(h^*_{b,K}(t), 0)$. Similar considerations apply to the confidence intervals. The boundary effects can be avoided by using kernels with asymmetric supports.

3.4 The Parametric Estimation of Weibull Hazard Function

3.4.1 The Maximum Likelihood Estimate of Weibull Hazard Function

The Weibull probability distribution of survival times is defined by two parameters, the scale parameter $\lambda$ and the shape parameter $k$. The two parameters of the Weibull distribution provide additional flexibility that potentially increases the accuracy of the description of collected survival data.

The maximum likelihood estimation method will be used to estimate the shape and scale parameters of the probability density function of the Weibull distribution. After the estimates are obtained, they are substituted into the Weibull hazard function and the graph of the maximum likelihood estimate of the hazard function is plotted.

The survival times of $n$ individuals are now taken to be a censored sample from a Weibull distribution with scale parameter $\lambda$ and shape parameter $k$.
Suppose that there are $r$ deaths among the $n$ individuals and $n - r$ right-censored survival times. The likelihood of the sample data can be obtained from the expression

$$\prod_{i=1}^{n} \{f(t_i)\}^{\delta_i} \{S(t_i)\}^{1 - \delta_i}. \tag{3.10}$$

The probability density, survival and hazard functions of a $W(\lambda, k)$ distribution are given by

$$f(t) = \lambda k t^{k-1} \exp(-\lambda t^k), \qquad S(t) = \exp(-\lambda t^k), \qquad h(t) = \lambda k t^{k-1}.$$

Note that a scale parameter $\alpha$, with $\lambda = \alpha^{-1}$, is often used in place of $\lambda$. So, from expression (3.10), the likelihood of the $n$ survival times is

$$\prod_{i=1}^{n} \{\lambda k t_i^{k-1} \exp(-\lambda t_i^k)\}^{\delta_i} \{\exp(-\lambda t_i^k)\}^{1 - \delta_i},$$

where $\delta_i$ is zero if the $i$th survival time is censored and unity otherwise. Equivalently, writing (3.10) as

$$\prod_{i=1}^{n} \left\{ \frac{f(t_i)}{S(t_i)} \right\}^{\delta_i} S(t_i),$$

the likelihood function is

$$\prod_{i=1}^{n} \{\lambda k t_i^{k-1}\}^{\delta_i} \exp(-\lambda t_i^k).$$

This is regarded as a function of $\lambda$ and $k$, the unknown parameters in the Weibull distribution, and so can be written $L(\lambda, k)$. The corresponding log-likelihood function is given by

$$\log L(\lambda, k) = \sum_{i=1}^{n} \delta_i \log(\lambda k) + (k - 1) \sum_{i=1}^{n} \delta_i \log t_i - \lambda \sum_{i=1}^{n} t_i^k,$$

and noting that $\sum_{i=1}^{n} \delta_i = r$, the log-likelihood becomes

$$\log L(\lambda, k) = r \log(\lambda k) + (k - 1) \sum_{i=1}^{n} \delta_i \log t_i - \lambda \sum_{i=1}^{n} t_i^k.$$

The maximum likelihood estimates of $\lambda$ and $k$ are found by differentiating this function with respect to $\lambda$ and $k$, equating the derivatives to zero, and evaluating them at $\hat{\lambda}$ and $\hat{k}$. The resulting equations are

$$\frac{r}{\hat{\lambda}} - \sum_{i=1}^{n} t_i^{\hat{k}} = 0, \tag{3.11}$$

and

$$\frac{r}{\hat{k}} + \sum_{i=1}^{n} \delta_i \log t_i - \hat{\lambda} \sum_{i=1}^{n} t_i^{\hat{k}} \log t_i = 0. \tag{3.12}$$

From equation (3.11),

$$\hat{\lambda} = r \Big/ \sum_{i=1}^{n} t_i^{\hat{k}}, \tag{3.13}$$

and on substituting for $\hat{\lambda}$ in equation (3.12), the following equation is obtained:

$$\frac{r}{\hat{k}} + \sum_{i=1}^{n} \delta_i \log t_i - \frac{r}{\sum_{i=1}^{n} t_i^{\hat{k}}} \sum_{i=1}^{n} t_i^{\hat{k}} \log t_i = 0. \tag{3.14}$$

This is a nonlinear equation in $\hat{k}$, which can only be solved using an iterative numerical procedure. Once the estimate $\hat{k}$ that satisfies equation (3.14) has been found, equation (3.13) can be used to obtain $\hat{\lambda}$.
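Equations (3.13) and (3.14) can be solved numerically along the following lines. This is a minimal Python sketch, assuming the 18 observations of Table 3.1 with the starred times coded as censored. The function names `score_k` and `weibull_mle` and the search interval [0.1, 10] are illustrative assumptions, and bisection is used only because it is simple; it need not be the procedure used to obtain the results reported in this dissertation.

```python
import math

# Table 3.1: weeks to discontinuation; event = 1 if observed, 0 if censored (starred)
times = [10, 13, 18, 19, 23, 30, 36, 38, 54, 56, 59, 75, 93, 97, 104, 107, 107, 107]
events = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0]


def score_k(k, times, events):
    """Left-hand side of equation (3.14) as a function of the shape parameter k."""
    r = sum(events)
    sum_tk = sum(t ** k for t in times)
    sum_tk_logt = sum((t ** k) * math.log(t) for t in times)
    sum_delta_logt = sum(d * math.log(t) for t, d in zip(times, events))
    return r / k + sum_delta_logt - r * sum_tk_logt / sum_tk


def weibull_mle(times, events, lo=0.1, hi=10.0, tol=1e-10):
    """Solve (3.14) for k by bisection, then obtain lambda from (3.13)."""
    # Assumption: the root is bracketed, with a positive value at lo and a negative value at hi.
    if score_k(lo, times, events) <= 0 or score_k(hi, times, events) >= 0:
        raise ValueError("root of (3.14) is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score_k(mid, times, events) > 0:
            lo = mid
        else:
            hi = mid
    k_hat = 0.5 * (lo + hi)
    lam_hat = sum(events) / sum(t ** k_hat for t in times)
    return lam_hat, k_hat


lam_hat, k_hat = weibull_mle(times, events)
scale_weeks = lam_hat ** (-1.0 / k_hat)  # scale on the time axis, since S(t) = exp(-lambda * t^k)
print(f"k_hat = {k_hat:.4f}, lambda_hat = {lam_hat:.6f}, scale = {scale_weeks:.2f} weeks")
```

For these data the sketch should give a shape estimate of about 1.68 and a value of $\hat{\lambda}$ of about 0.00045, which corresponds to a scale of roughly 99 weeks on the time axis.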
In practice, a numerical procedure such as the Newton-Raphson algorithm is used to find the values $\hat{\lambda}$ and $\hat{k}$ which maximize the likelihood function simultaneously. This procedure is described in APPENDIX E. In that appendix it is noted that an important by-product of the Newton-Raphson procedure is an approximation to the variance-covariance matrix of the parameter estimates, from which their standard errors can be obtained.

3.4.2 The Least Square Estimate of Weibull Hazard Function

For a Weibull distribution function $F(x) = 1 - \exp\{-(x/\alpha)^{\beta}\}$ with shape parameter $\beta$ and scale parameter $\alpha$, taking logarithms twice gives the linear equation

$$\ln \ln \left[ \frac{1}{1 - F(x)} \right] = \beta \ln x - \beta \ln \alpha.$$

With the observations $x_i$ arranged in increasing order and $F(x_i)$ estimated by $i/(n+1)$, define

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} \ln \left\{ \ln \left[ \frac{1}{1 - \frac{i}{n+1}} \right] \right\}, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} \ln x_i.$$

The least squares estimates are then

$$\hat{\beta} = \frac{ n \sum_{i=1}^{n} (\ln x_i) \ln \left\{ \ln \left[ \frac{1}{1 - \frac{i}{n+1}} \right] \right\} - \sum_{i=1}^{n} \ln \left\{ \ln \left[ \frac{1}{1 - \frac{i}{n+1}} \right] \right\} \sum_{i=1}^{n} \ln x_i }{ n \sum_{i=1}^{n} (\ln x_i)^2 - \left( \sum_{i=1}^{n} \ln x_i \right)^2 },$$

and

$$\hat{\alpha} = \exp\left( \bar{y} - \frac{\bar{x}}{\hat{\beta}} \right).$$

To compare the maximum likelihood estimate and the least square estimate, the standard error is used. It is calculated on the logarithmic scale as

$$\mathrm{se}\{\log(\text{estimate})\} = \sqrt{\mathrm{Var}[\log(\text{estimate})]},$$

with 95% confidence interval

$$\log(\text{estimate}) \pm 1.96\, \mathrm{se}\{\log(\text{estimate})\}.$$

CHAPTER 4

RESULTS AND DISCUSSION

4.1 Introduction

This chapter discusses the results of the research on the application of the hazard function in the case study of the use of an IUD for women. The findings are divided into the two topics discussed in the earlier chapter, namely the parametric and nonparametric estimation of the hazard function.

4.2 The Nonparametric Estimation of Hazard Function for the Time to Discontinuation of the Use of an IUD for 18 Women

4.2.1 The Kaplan-Meier Estimate of Hazard Function

The first nonparametric method of estimating the hazard function is the Kaplan-Meier method.
The estimated hazard function in the $j$th interval, $j = 1, 2, \ldots, r$, where $t_{(j)}$ is the $j$th ordered time to discontinuation of the use of an IUD in weeks and $r$ is the number of distinct discontinuation times, is obtained from

$$\hat{h}(t) = \frac{d_j}{n_j \tau_j}.$$

Here $d_j$ is the number of discontinuations of the use of an IUD at the $j$th discontinuation time, $n_j$ is the number of women at risk of discontinuation at time $t_{(j)}$, and $\tau_j = t_{(j+1)} - t_{(j)}$. Next, the standard error of this hazard function is obtained using

$$\mathrm{se}\{\hat{h}(t)\} = \hat{h}(t) \sqrt{\frac{n_j - d_j}{n_j d_j}},$$

while the confidence interval is given by

$$\hat{h}(t) \pm z_{\alpha/2}\, \mathrm{se}\{\hat{h}(t)\}.$$

Table 4.1: Kaplan-Meier type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

Time interval   $\tau_j$   $n_j$   $d_j$   $\hat{h}(t)$   se{$\hat{h}(t)$}   95% confidence interval
0-              10         18      0       0.0000         -                  -
10-             9          18      1       0.0062         0.0060             (-0.0056, 0.0180)
19-             11         15      1       0.0061         0.0059             (-0.0055, 0.0177)
30-             6          13      1       0.0128         0.0123             (-0.0113, 0.0370)
36-             23         12      1       0.0036         0.0035             (-0.0033, 0.0105)
59-             16         8       1       0.0078         0.0073             (-0.0065, 0.0221)
75-             18         7       1       0.0079         0.0073             (-0.0064, 0.0222)
93-             4          6       1       0.0417         0.0380             (-0.0328, 0.1162)
97-             10         5       1       0.0200         0.0179             (-0.0151, 0.0551)

Table 4.1 shows the estimated hazard function together with its standard error and 95% confidence interval. From Table 4.1, the highest value of the estimated hazard function $\hat{h}(t)$ is 0.0417 and the smallest value is 0.0000. For the standard error of the hazard function, se{$\hat{h}(t)$}, the highest value is 0.0380, with 95% confidence interval (-0.0328, 0.1162). The smallest value is 0.0035, with 95% confidence interval (-0.0033, 0.0105).

The value of the estimated hazard function $\hat{h}(t)$ increases from 0 to 10 weeks, from 19 to 30 weeks and from 36 to 93 weeks as the time to discontinuation of the use of an IUD for the 18 women increases. For the rest of the weeks, the estimated hazard function $\hat{h}(t)$ is decreasing.
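The calculation behind Table 4.1 can be sketched in a few lines of Python. This is an illustration only, assuming the data of Table 3.1 with starred times coded as censored; the function name `km_type_hazard` and the tuple layout of its rows are hypothetical choices made for this sketch.

```python
import math

# Table 3.1: weeks to discontinuation; event = 1 if observed, 0 if censored (starred)
times = [10, 13, 18, 19, 23, 30, 36, 38, 54, 56, 59, 75, 93, 97, 104, 107, 107, 107]
events = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0]


def km_type_hazard(times, events):
    """Return rows (t_j, tau_j, n_j, d_j, hazard, se) for each interval between event times.

    The interval starting at the last event time is omitted because it is open-ended.
    """
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    rows = []
    for j in range(len(event_times) - 1):
        t_j = event_times[j]
        tau = event_times[j + 1] - t_j                      # interval length
        n_j = sum(1 for t in times if t >= t_j)             # number at risk at t_j
        d_j = sum(1 for t, d in zip(times, events) if d == 1 and t == t_j)
        h = d_j / (n_j * tau)                               # hazard per week in the interval
        se = h * math.sqrt((n_j - d_j) / (n_j * d_j))       # binomial-based standard error
        rows.append((t_j, tau, n_j, d_j, h, se))
    return rows


rows = km_type_hazard(times, events)
for t_j, tau, n_j, d_j, h, se in rows:
    lower, upper = h - 1.96 * se, h + 1.96 * se
    print(f"{t_j:>3}-  tau={tau:<3} n={n_j:<3} d={d_j}  h={h:.4f}  se={se:.4f}  ({lower:.4f}, {upper:.4f})")
```

The rows for the intervals beginning at 10, 19, 30, 36, 59, 75, 93 and 97 weeks can be compared with Table 4.1; the initial interval from 0 to 10 weeks, in which no discontinuation occurs, is not generated by this sketch.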
The hazard estimate rises and falls in this way because, as time goes by, the number at risk $n_j$ decreases while the length of the time interval $\tau_j$ varies from one interval to the next. The standard error of the hazard function behaves in the same way. As the estimated hazard function $\hat{h}(t)$ increases or decreases, the standard error follows the same pattern, since the standard error is proportional to the estimated hazard function.

After the estimated hazard function has been calculated, the hazard curve is plotted. Figure 4.1 shows a plot of the estimated hazard function $\hat{h}(t)$ obtained using the Kaplan-Meier type estimate.

Figure 4.1: Kaplan-Meier type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

The solid line shows the hazard curve for the time from when the 18 women commence use of an IUD until discontinuation. The time recorded is the number of weeks from the commencement of use of a particular type of intrauterine device (IUD) until discontinuation because of menstrual bleeding problems. The curve steps down or up each time a discontinuation occurs. The largest value of the estimated hazard function occurs in the time interval from 93 to 97 weeks. The estimated hazard function starts at 0.0000 at 0 weeks and ends at 0.0200 at 107 weeks. After 107 weeks, the hazard curve remains the same. This implies that there exists a proportion of long-term survival equal to 0.0200.

4.2.2 The Kernel Estimate of Hazard Function

For the kernel estimate of the hazard function, two types of kernel function $K$ were used, namely the Epanechnikov and Biweight kernels. The following equations give these kernel functions.
Epanechnikov kernel:

$$K(u) = 0.75\,(1 - u^2) \tag{4.1}$$

Biweight kernel:

$$K(u) = \frac{15}{16}\,(1 - u^2)^2 \tag{4.2}$$

where $u = (t - t_{(j)})/b$, for $t_{(j)} \le t < t_{(j+1)}$, based on $r$ ordered discontinuation times $t_{(1)}, t_{(2)}, \ldots, t_{(r)}$, and $b$ is the bandwidth. The larger the value of $b$, the greater the degree of smoothing.

The kernel estimate of the hazard function, $\tilde{h}_{b,K}(t)$, is the following convolution of the kernel $K$:

$$\tilde{h}_{b,K}(t) = \frac{1}{b} \sum_{i=1}^{n} K\left( \frac{t - t_{(i)}}{b} \right) \frac{d_{(i)}}{n - i + 1}. \tag{4.3}$$

Using the above equations, the following estimates of the hazard function are obtained, respectively:

$$\tilde{h}_{b,K}(t) = \frac{1}{b} \sum_{i=1}^{n} 0.75 \left( 1 - \left( \frac{t - t_{(i)}}{b} \right)^2 \right) \frac{d_{(i)}}{n - i + 1}$$

for the Epanechnikov kernel, and

$$\tilde{h}_{b,K}(t) = \frac{1}{b} \sum_{i=1}^{n} \frac{15}{16} \left( 1 - \left( \frac{t - t_{(i)}}{b} \right)^2 \right)^2 \frac{d_{(i)}}{n - i + 1}$$

for the Biweight kernel, where $d_{(i)}$ is the indicator of whether the $i$th ordered observation is censored or not and $n$ is the number of observations. For this case study, censored discontinuation times occur when women in the study ceased using the IUD because of the desire for pregnancy, or because they had no further need for a contraceptive, while others were simply lost to follow up.

For this study, the bandwidth $b$ is taken to be 110, since $b$ needs to be greater than $|t - t_{(i)}|$ in equation (4.3) to prevent the estimated hazard function $\tilde{h}_{b,K}(t)$ from becoming negative, and $n$ is equal to 18 observations.

The estimated hazard functions $\tilde{h}_{b,K}(t)$ for the Biweight and Epanechnikov kernels are given in the appendix. After the estimated hazard function has been calculated, it is plotted against time $t$ over the interval from 0 to 107 weeks, based on the $r$ ordered discontinuation times $t_{(1)}, t_{(2)}, \ldots, t_{(r)}$, where $t_{(r)}$ is the greatest discontinuation time and $b = 110$ is the bandwidth.
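Equation (4.3) with the kernels (4.1) and (4.2) can be evaluated as in the following Python sketch. It is an illustration under stated assumptions rather than the Mathcad calculation used for this dissertation: the function names are hypothetical, the kernels are set to zero outside [-1, 1], and at tied times the observed discontinuation is ranked before the censored observations.

```python
# Table 3.1: weeks to discontinuation; event = 1 if observed, 0 if censored (starred)
times = [10, 13, 18, 19, 23, 30, 36, 38, 54, 56, 59, 75, 93, 97, 104, 107, 107, 107]
events = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0]


def epanechnikov(u):
    """Epanechnikov kernel (4.1), taken as zero outside [-1, 1] (assumption)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def biweight(u):
    """Biweight kernel (4.2), taken as zero outside [-1, 1] (assumption)."""
    return (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def kernel_hazard(t, times, events, b, kernel):
    """Equation (4.3): (1/b) * sum_i K((t - t_(i)) / b) * d_(i) / (n - i + 1)."""
    n = len(times)
    # Order by time; at tied times place events before censored observations (assumption).
    ordered = sorted(zip(times, events), key=lambda pair: (pair[0], -pair[1]))
    total = 0.0
    for i, (t_i, d_i) in enumerate(ordered, start=1):
        total += kernel((t - t_i) / b) * d_i / (n - i + 1)
    return total / b


b = 110.0
for week in (0, 25, 50, 75, 107):
    ek = kernel_hazard(week, times, events, b, epanechnikov)
    bk = kernel_hazard(week, times, events, b, biweight)
    print(f"t = {week:>3}: Epanechnikov = {ek:.5f}, Biweight = {bk:.5f}")
```

Evaluating `kernel_hazard` on a grid of weeks from 0 to 107 gives curves of the kind plotted against time in this section; a different tie-handling rule or bandwidth would change the values.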
Figure 4.2: Epanechnikov Kernel type estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women (bandwidth b = 110; horizontal axis: weeks, t; vertical axis: h(t)).

Figure 4.2 shows the plotted Epanechnikov kernel estimate of the hazard function for the time from when the 18 women commence use of an IUD until discontinuation. The estimated hazard function starts at 0 weeks and ends at 107 weeks. According to Figure 4.2, the lowest value occurs at 0 weeks while the highest value occurs at 74 weeks.

The estimated hazard function is increasing during the interval from 0 to 74 weeks. This implies that women run a rising risk of discontinuation of the use of an IUD from about 0 to 74 weeks after starting to use it. After that, during the interval from 74 to 107 weeks, the estimated hazard function is decreasing. This means that women run a declining risk of discontinuation of the use of an IUD from about 74 to 107 weeks after starting to use it.

Figure 4.3: Biweight Kernel estimate of hazard function for the data of the time to discontinuation of the use of an IUD for 18 women (bandwidth b = 110; horizontal axis: weeks, t; vertical axis: h(t)).

Figure 4.3 shows the plotted Biweight kernel estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women. The estimated hazard function starts at 0 weeks and ends at 107 weeks. According to Figure 4.3, the lowest value occurs at 0 weeks while the highest value occurs at 75 weeks.

The estimated hazard function is increasing during the interval from 0 to 75 weeks. This implies that women run a rising risk of discontinuation of the use of an IUD from about 0 to 75 weeks after starting to use it. After that, during the interval from 75 to 107 weeks, the estimated hazard function is decreasing.
This means that women run a declining risk of discontinuation of the use of an IUD from about 75 to 107 weeks after starting to use it.

4.2.3 The Comparison of Nonparametric Approaches in Estimating the Hazard Function

The square of the standard error of the estimated hazard function, sse{$\tilde{h}_{b,K}(t)$}, is calculated using

$$\mathrm{sse}\{\tilde{h}_{b,K}(t)\} \approx \frac{\tilde{h}_{b,K}(t)\, V(K)}{(1 - L_n(t))\, n\, b},$$

while the 95% confidence interval of the estimated hazard function is obtained using

$$\tilde{h}_{b,K}(t) \pm \{\mathrm{sse}(\tilde{h}_{b,K}(t))\}^{1/2}\, \Phi^{-1}\left(1 - \frac{\alpha}{2}\right),$$

where

$$V(K) = \int_{-1}^{1} K^2(t)\, dt,$$

$L_n$ is the modified empirical distribution function of $L$,

$$L_n(t) = \frac{1}{n + 1} \sum_{i=1}^{n} 1_{\{t_i \le t\}},$$

and $\Phi$ is the normal cumulative distribution function.

Table 4.2: Kernel estimate of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

                                                             95% Confidence Interval
Kernel         Mean of the Estimate   Mean of Standard Error   Lower Bound   Upper Bound
Epanechnikov   0.0058                 0.0019                   0.0035        0.0081
Biweight       0.0062                 0.0022                   0.0036        0.0088

Table 4.2 shows the kernel estimates of the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women. The mean of the Epanechnikov kernel estimate of the hazard function is 0.0058, with standard error 0.0019 and 95% confidence interval (0.0035, 0.0081). For the Biweight kernel, the mean of the estimated hazard function is 0.0062, with standard error 0.0022 and 95% confidence interval (0.0036, 0.0088).

Since the mean standard error of the Epanechnikov kernel is smaller than that of the Biweight kernel, the Epanechnikov kernel estimate of the hazard function is the most precise nonparametric approach to estimating the hazard function for the data of time to discontinuation of the use of an IUD for 18 women.

4.3 The Parametric Estimate of Weibull Hazard Function for the Data of Time to Discontinuation of the Use of an IUD for 18 Women
4.3.1 The Maximum Likelihood Estimate of Weibull Hazard Function

The survival times of the $n$ women are taken to be a censored sample from a Weibull distribution with shape parameter $k$ and scale parameter $\lambda$. Suppose there are $r$ deaths (discontinuations) among the $n$ individuals and $n - r$ right-censored survival times. The hazard function of a Weibull $W(k, \lambda)$ distribution is given by

$$h(t; k, \lambda) = \frac{k}{\lambda} \left( \frac{t}{\lambda} \right)^{k-1}. \tag{4.4}$$

In this parameterization the survival function is $S(t) = \exp\{-(t/\lambda)^k\}$, so the scale parameter of Section 3.4.1 corresponds to $\lambda^{-k}$ here. Using maximum likelihood, equations (3.13) and (3.14) then give

$$\hat{\lambda} = \left( \frac{1}{r} \sum_{i=1}^{n} t_i^{\hat{k}} \right)^{1/\hat{k}} \tag{4.5}$$

and

$$\frac{r}{\hat{k}} + \sum_{i=1}^{n} \delta_i \log t_i - \frac{r}{\sum_{i=1}^{n} t_i^{\hat{k}}} \sum_{i=1}^{n} t_i^{\hat{k}} \log t_i = 0. \tag{4.6}$$

Equation (4.6) is a nonlinear equation in $\hat{k}$, which can be solved by a numerical procedure. Once the estimate $\hat{k}$ that satisfies equation (4.6) has been found, equation (4.5) can be used to obtain $\hat{\lambda}$.

The maximum likelihood estimates $\hat{k}$ and $\hat{\lambda}$, their standard errors and the 95% confidence intervals are obtained by fitting the Weibull distribution to the observed data. This can be done using MINITAB software.

Table 4.3: Maximum likelihood estimate of the hazard function parameter for the data of the time to discontinuation of the use of an IUD for 18 women.

                                                            95% Confidence Interval
Parameter                  Estimate   Standard Error   Lower Bound   Upper Bound
Shape, $\hat{k}$           1.6764     0.4604           0.9786        2.8717
Scale, $\hat{\lambda}$     98.6440    20.1601          66.0859       147.2420

Table 4.3 shows the maximum likelihood estimates of the hazard function parameters for the data of the time to discontinuation of the use of an IUD for 18 women. The estimated shape parameter $\hat{k}$ is 1.6764, with standard error 0.4604 and 95% confidence interval (0.9786, 2.8717). For the scale parameter, the estimated value $\hat{\lambda}$ is 98.6440, with standard error 20.1601 and 95% confidence interval (66.0859, 147.2420).

By substituting the values of the estimated shape and scale parameters into equation (4.4), the following figure is obtained.
[Plot: "MLE of Weibull Hazard Function"; vertical axis h(t), horizontal axis Weeks, t; series MLE]

Figure 4.4: Maximum likelihood estimate of Weibull hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

Figure 4.4 shows the maximum likelihood estimate of the Weibull hazard function for the data of time to discontinuation of the use of an IUD for 18 women. The solid line shows the Weibull hazard curve, which starts at 0 weeks and ends at 107 weeks. The lowest value of the Weibull hazard curve is 0 at week 0 and the highest value is 0.018 at week 107.

The Weibull hazard function increases with the time to discontinuation of the use of an IUD because the estimated shape parameter, k̂ = 1.6764, is greater than 1.

4.3.2 The Least Square Estimate of Weibull Hazard Function

For the least square estimate of the shape and scale parameters of the Weibull distribution, W(k, λ), the distribution function is first written in the linear form

$$\ln\left\{\ln\left[\frac{1}{1 - F(t)}\right]\right\} = k \ln t - k \ln \lambda$$

where F(t) is the cumulative distribution function of the Weibull distribution. With $F(t_i)$ estimated by $i/(n+1)$ for the i-th ordered time, let

$$\bar{t} = \frac{1}{n}\sum_{i=1}^{n}\ln\left\{\ln\left[\frac{1}{1 - \frac{i}{n+1}}\right]\right\}, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n}\ln t_i .$$

Then

$$\hat{k} = \frac{n\sum_{i=1}^{n}(\ln t_i)\ln\left[\ln\left\{\frac{1}{1 - \frac{i}{n+1}}\right\}\right] - \left\{\sum_{i=1}^{n}\ln\left[\ln\left\{\frac{1}{1 - \frac{i}{n+1}}\right\}\right]\right\}\left\{\sum_{i=1}^{n}\ln t_i\right\}}{n\sum_{i=1}^{n}(\ln t_i)^{2} - \left\{\sum_{i=1}^{n}\ln t_i\right\}^{2}} \quad (4.7)$$

and

$$\hat{\lambda} = \exp\left(\bar{y} - \frac{\bar{t}}{\hat{k}}\right). \quad (4.8)$$

The least square estimates k̂ and λ̂, their standard errors and their 95% confidence intervals are also obtained by fitting the Weibull distribution to the observed data. This can be done by using MINITAB software.

Table 4.4: Least square estimate of the hazard function parameter for the data of the time to discontinuation of the use of an IUD for 18 women.
Parameter   Estimate   Standard Error   95% Confidence Interval
                                        Lower Bound   Upper Bound
Shape, k̂    1.3379     0.5016           0.6416        2.7897
Scale, λ̂    109.9520   33.0248          61.0293       198.0920

Table 4.4 shows the least square estimate of the hazard function parameter for the data of the time to discontinuation of the use of an IUD for 18 women. The estimated shape parameter, k̂, is 1.3379 with a standard error of 0.5016 and a 95% confidence interval of (0.6416, 2.7897). For the scale parameter, λ̂, the estimated value is 109.9520 with a standard error of 33.0248 and a 95% confidence interval of (61.0293, 198.0920).

By substituting the values of the estimated shape and scale parameters into equation (4.4), the following figure is obtained.

[Plot: "Least Square Estimate of Weibull Hazard Function"; vertical axis h(t), horizontal axis Weeks, t; series LSE]

Figure 4.5: Least square estimate of Weibull hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

Figure 4.5 shows the least square estimate of the Weibull hazard function for the data of time to discontinuation of the use of an IUD for 18 women. The solid line shows the Weibull hazard curve, which starts at 0 weeks and ends at 107 weeks. The lowest value of the Weibull hazard curve is 0 at week 0 and the highest value is 0.0121 at week 107.

The Weibull hazard function increases with the time to discontinuation of the use of an IUD because the estimated shape parameter, k̂ = 1.3379, is greater than 1.

4.3.3 The Comparison of Standard Error of Estimated Weibull Hazard Function

Standard errors and confidence intervals for each of the estimated Weibull parameters are more accurately constructed using a logarithmic transformation. The standard error used for the Weibull parameters is

$$\mathrm{se}(\text{estimate}) = \sqrt{\mathrm{Variance}[\log(\text{estimate})]} .$$
An approximate 95% confidence interval based on the normal distribution and the transformed log-estimates is

$$\log(\text{estimate}) \pm 1.96 \times \mathrm{se}(\text{estimate}),$$

and the interval for the parameter itself is obtained by exponentiating the two limits. The log-transformation improves the normal approximation by creating a more symmetric distribution.

Table 4.5: The Comparison of Standard Error of Estimated Weibull Hazard Function.

Estimation Method             Parameter        Estimate   Standard Error   Lower Bound   Upper Bound
Maximum Likelihood Estimate   Shape, k̂ (MLE)   1.6764     0.4604           0.9786        2.8717
                              Scale, λ̂ (MLE)   98.6440    20.1601          66.0859       147.2420
Least Square Estimate         Shape, k̂ (LSE)   1.3379     0.5016           0.6416        2.7897
                              Scale, λ̂ (LSE)   109.9520   33.0248          61.0293       198.0920

Table 4.5 shows the comparison of the standard errors of the estimated shape and scale parameters of the Weibull hazard function for the maximum likelihood estimate and the least square estimate. The comparison is based on the Standard Error column.

From Table 4.5, the maximum likelihood estimation method has lower standard errors for both the shape and the scale parameter than the least square estimation method: se(k̂ MLE) = 0.4604 is less than se(k̂ LSE) = 0.5016, and se(λ̂ MLE) = 20.1601 is less than se(λ̂ LSE) = 33.0248.

Thus, the maximum likelihood estimation method is more precise than the least square estimation method in estimating the shape and scale parameters of the Weibull hazard function. This result is attributed to the fact that the maximum likelihood estimation process takes censored observations into account, which reduces the bias of the estimates.

4.4 The Comparison of the Nonparametric and Parametric Approaches in Estimating the Hazard Function

The nonparametric and parametric approaches in estimating the hazard function of the use of an IUD for 18 women are compared through their standard errors, measured here as root mean square errors (RMSE). The RMSE is given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^{2}}$$

where $e_t = h(t) - \hat{h}(t)$.
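A minimal Python sketch of this calculation is given below, taking the Kaplan-Meier values as h(t) and the Maximum Likelihood values as ĥ(t). The function name `rmse` is an illustrative choice, and the five weeks used (10, 30, 59, 93 and 107) are only a small subset of the rows tabulated in Appendix B, so the result is not expected to reproduce the RMSE reported for the full range of weeks.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equally long sequences."""
    if len(reference) != len(estimate):
        raise ValueError("sequences must have the same length")
    squared_errors = [(h - h_hat) ** 2 for h, h_hat in zip(reference, estimate)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Rows for weeks 10, 30, 59, 93 and 107 of Appendix B (illustrative subset only).
km = [0.0062, 0.0128, 0.0078, 0.0417, 0.02]
mle = [0.00361341, 0.007597, 0.01200389, 0.0163305, 0.0179553]

print(round(rmse(km, mle), 4))
```

Passing the full KM and MLE columns of Appendix B to the same function would give the comparison over every week from 0 to 107.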
Note that $h(t)$ is the Kaplan-Meier estimate of the hazard function and $\hat{h}(t)$ is, in turn, the best nonparametric and the best parametric estimate of the hazard function chosen earlier.

Table 4.6: The standard error (RMSE) of nonparametric and parametric estimation methods of hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

Estimation Method                   Standard Error (RMSE)
Maximum Likelihood Estimate (MLE)   0.0070
Epanechnikov Kernel (EK)            0.0085

Table 4.6 shows the standard error (RMSE) of the nonparametric and parametric approaches in estimating the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women. From Table 4.6, the Maximum Likelihood estimate of the hazard function has the smaller standard error, 0.0070, compared with 0.0085 for the Epanechnikov Kernel. Therefore, the Maximum Likelihood estimate is the most appropriate approach to estimate the hazard function for the data of the time to discontinuation of the use of an IUD for 18 women.

Figure 4.6: The Graphical Comparison of the Nonparametric and Parametric Approaches in Estimating the Hazard Function for the data of the time to discontinuation of the use of an IUD for 18 women.

Figure 4.6 shows the estimated hazard function for the data of the time to discontinuation of the use of an IUD for 18 women by using the Kaplan-Meier, Epanechnikov Kernel and Maximum Likelihood estimation methods. The solid line shows the hazard function estimated by the Kaplan-Meier method. This line is compared with two dotted lines, which are the Epanechnikov Kernel and Maximum Likelihood estimates of the hazard function.

From week 0 until week 36, the Epanechnikov Kernel and Maximum Likelihood estimates of the hazard function fit the Kaplan-Meier estimate about equally well. Then, from week 36 until week 94, the Maximum Likelihood estimate overestimates the Kaplan-Meier hazard function estimate more than the Epanechnikov Kernel estimate does.
Next, from week 94 until week 107, the Epanechnikov Kernel estimate underestimates the Kaplan-Meier hazard function estimate. In conclusion, the Maximum Likelihood estimate of the hazard function fits the Kaplan-Meier hazard function estimate better than the Kernel estimate does. This holds since the Maximum Likelihood estimate follows the overall Kaplan-Meier estimate of the hazard function. Furthermore, the bandwidth chosen for the Epanechnikov Kernel estimate introduces bias, whereas the Maximum Likelihood estimate takes censored observations into account, which reduces the bias of the estimates.

CHAPTER 5

CONCLUSIONS AND SUGGESTIONS

5.1 Conclusions

This dissertation has compared two nonparametric approaches, the Biweight Kernel and the Epanechnikov Kernel, and two parametric approaches, the Maximum Likelihood and Least Square estimates of the hazard function, with the Kaplan-Meier estimate of the hazard function. As stated in the introduction, the objective of this dissertation was to identify the most suitable approach for estimating the hazard function through the analysis of error and graphical comparison.

Furthermore, this dissertation was also conducted to study the basic concepts of the Kaplan-Meier method, the Kernel method, the Maximum Likelihood estimate and the Least Square method in order to estimate the hazard function of the time to discontinuation of the use of an IUD for 18 women, and to find their advantages and disadvantages.

The findings of this dissertation suggest that, overall, the most suitable estimate for analyzing the data of the time to discontinuation of the use of an IUD for 18 women is the Maximum Likelihood estimate of the hazard function. This is because it had the smallest standard error (Root Mean Square Error) and fitted the Kaplan-Meier hazard function estimate better than the other estimates of the hazard function.
Furthermore, Maximum Likelihood estimation of the hazard function takes censored observations into account, which reduces the bias of the estimates. The results of this study imply that the bias of a given estimate of the hazard function influences which approach is judged the most appropriate for estimating the hazard function. However, these results hold only for the data of the use of an IUD for 18 women. The same study needs to be carried out on other medical data in order to see whether the same approaches are again the most appropriate for estimating the hazard function.

5.2 Suggestions

5.2.1 Suggestions Based on Findings

Based on the findings and the conclusions of the study, the following are several suggestions to be considered:

1. Other parametric distributions, such as the exponential, log-logistic and gamma distributions, can be used to fit the estimated hazard function.
2. Other choices of bandwidth can be used in the Kernel estimate to get more precise results.
3. Other survival data can be analyzed with the estimated hazard function.

5.2.2 Suggestions for Future Research

Since this research focused only on the nonparametric and parametric approaches to estimating the hazard function, it is suggested that further studies be carried out on semiparametric approaches to estimating the hazard function. Future studies could also be carried out on survival data with supplementary information recorded on each individual, referred to as explanatory variables, since these variables may all have an impact on the time that patients survive.

APPENDICES

APPENDIX A

The Application of Minitab in Estimating the Hazard Function

In this dissertation, Minitab is used to estimate the Weibull hazard function of multiple myeloma patients by using the Maximum Likelihood Estimation and Least Square methods of estimation. The following steps show how to estimate the hazard function.

Step 1: Key in all the data in the columns “Patients”, “Times” and “Status”.

Step 2: Then, click “Stat”, and choose “Reliability/Survival”. Next, click “Distribution Analysis (Right Censoring)” and “Parametric Distribution Analysis”.

Step 3: Then, insert the “Time” in “Variables” and choose “Weibull” in “Assumed distributions”.
Step 4: Next, click “Censor”, click “Use censoring columns” and select the column “Status”. Insert 0 in “Censoring value” and click “OK”.

Step 5: Select “Estimate” and choose “Least Squares (Failure Time (X) on rank (Y))” for the Least Square estimate or “Maximum Likelihood” for the Maximum Likelihood estimate, and click “OK”.

APPENDIX B

Estimated Hazard Function

t     KM       EK(b=110)   BK(b=110)   LSM         MLE
0     0        0.003306    0.002948    0           0
1     0        0.003394    0.003018    0.0024865   0.00076123
2     0        0.003481    0.00309     0.0031426   0.00121656
3     0        0.003567    0.003164    0.003604    0.00160045
4     0        0.003652    0.003239    0.0039718   0.00194424
5     0        0.003735    0.003316    0.0042828   0.002261
6     0        0.003817    0.003394    0.0045549   0.00255776
7     0        0.003898    0.003473    0.0047984   0.00283885
8     0        0.003978    0.003554    0.0050199   0.00310719
9     0        0.004057    0.003635    0.0052237   0.00336486
10    0.0062   0.004134    0.003718    0.0054129   0.00361341
11    0.0062   0.00421     0.003801    0.0055901   0.00385403
12    0.0062   0.004286    0.003885    0.0057569   0.00408767
13    0.0062   0.004359    0.00397     0.0059147   0.00431508
14    0.0062   0.004432    0.004055    0.0060646   0.00453689
15    0.0062   0.004503    0.004141    0.0062076   0.00475363
16    0.0062   0.004574    0.004227    0.0063445   0.00496574
17    0.0062   0.004643    0.004313    0.0064758   0.0051736
18    0.0062   0.004711    0.0044      0.006602    0.00537754
19    0.0061   0.004777    0.004486    0.0067237   0.00557785
20    0.0061   0.004843    0.004573    0.0068413   0.00577476
21    0.0061   0.004907    0.00466     0.006955    0.00596852
22    0.0061   0.00497     0.004747    0.0070651   0.00615931
23    0.0061   0.005032    0.004833    0.0071721   0.00634732
24    0.0061   0.005093    0.004919    0.0072759   0.0065327
25    0.0061   0.005152    0.005005    0.007377    0.00671559
26    0.0061   0.005211    0.00509     0.0074754   0.00689613
27    0.0061   0.005268    0.005175    0.0075713   0.00707444
28    0.0061   0.005324    0.00526     0.0076649   0.00725062
29    0.0061   0.005378    0.005343    0.0077563   0.00742478
30    0.0128   0.005432    0.005426    0.0078457   0.007597
31    0.0128   0.005484    0.005509    0.0079331   0.00776738
32    0.0128   0.005535    0.00559     0.0080186   0.00793599
33    0.0128   0.005585    0.005671    0.0081024   0.0081029
34    0.0128   0.005634    0.00575     0.0081845   0.00826818
35    0.0128   0.005682    0.005829    0.0082651   0.0084319
36    0.0036   0.005728    0.005906    0.0083441   0.0085941
37    0.0036   0.005773    0.005983    0.0084217   0.00875486
38    0.0036   0.005817    0.006058    0.0084979   0.00891422
39    0.0036   0.00586     0.006132    0.0085728   0.00907222
40    0.0036   0.005902    0.006204    0.0086465   0.00922892
41    0.0036   0.005942    0.006276    0.0087189   0.00938436
42    0.0036   0.005981    0.006345    0.0087902   0.00953857
43    0.0036   0.006019    0.006414    0.0088604   0.0096916
44    0.0036   0.006056    0.006481    0.0089294   0.00984349
45    0.0036   0.006092    0.006546    0.0089975   0.00999426
46    0.0036   0.006126    0.00661     0.0090646   0.01014395
47    0.0036   0.00616     0.006672    0.0091307   0.01029259
48    0.0036   0.006192    0.006733    0.0091958   0.01044021
49    0.0036   0.006223    0.006791    0.0092601   0.01058684
50    0.0036   0.006252    0.006848    0.0093235   0.0107325
51    0.0036   0.006281    0.006904    0.0093861   0.01087722
52    0.0036   0.006308    0.006957    0.0094479   0.01102103
53    0.0036   0.006334    0.007009    0.0095089   0.01116395
54    0.0036   0.006359    0.007059    0.0095691   0.01130599
55    0.0036   0.006383    0.007106    0.0096286   0.01144719
56    0.0036   0.006405    0.007152    0.0096874   0.01158756
57    0.0036   0.006427    0.007196    0.0097455   0.01172712
58    0.0036   0.006447    0.007238    0.009803    0.01186589
59    0.0078   0.006466    0.007278    0.0098598   0.01200389
60    0.0078   0.006483    0.007316    0.0099159   0.01214113
61    0.0078   0.0065      0.007351    0.0099714   0.01227763
62    0.0078   0.006515    0.007385    0.0100264   0.01241342
63    0.0078   0.006529    0.007417    0.0100807   0.01254849
64    0.0078   0.006542    0.007446    0.0101345   0.01268287
65    0.0078   0.006554    0.007473    0.0101877   0.01281658
66    0.0078   0.006565    0.007498    0.0102404   0.01294962
67    0.0078   0.006574    0.007521    0.0102926   0.01308201
68    0.0078   0.006582    0.007542    0.0103442   0.01321377
69    0.0078   0.006589    0.007561    0.0103953   0.01334489
70    0.0078   0.006595    0.007577    0.010446    0.01347541
71    0.0078   0.0066      0.007591    0.0104962   0.01360532
72    0.0078   0.006603    0.007603    0.0105459   0.01373464
73    0.0078   0.006605    0.007613    0.0105952   0.01386338
74    0.0078   0.006606    0.007621    0.010644    0.01399155
75    0.0079   0.006606    0.007626    0.0106924   0.01411916
76    0.0079   0.006605    0.007629    0.0107403   0.01424623
77    0.0079   0.006602    0.00763     0.0107878   0.01437275
78    0.0079   0.006599    0.007628    0.010835    0.01449874
79    0.0079   0.006594    0.007625    0.0108817   0.01462421
80    0.0079   0.006588    0.007619    0.0109281   0.01474917
81    0.0079   0.00658     0.007611    0.010974    0.01487363
82    0.0079   0.006572    0.007601    0.0110196   0.01499758
83    0.0079   0.006562    0.007589    0.0110648   0.01512105
84    0.0079   0.006551    0.007574    0.0111097   0.01524404
85    0.0079   0.006539    0.007557    0.0111542   0.01536656
86    0.0079   0.006526    0.007539    0.0111984   0.01548861
87    0.0079   0.006511    0.007518    0.0112422   0.0156102
88    0.0079   0.006496    0.007495    0.0112857   0.01573134
89    0.0079   0.006479    0.00747     0.0113288   0.01585203
90    0.0079   0.006461    0.007442    0.0113717   0.01597229
91    0.0079   0.006441    0.007413    0.0114142   0.01609212
92    0.0079   0.006421    0.007382    0.0114564   0.01621152
93    0.0417   0.006399    0.007349    0.0114984   0.0163305
94    0.0417   0.006376    0.007314    0.01154     0.01644907
95    0.0417   0.006352    0.007276    0.0115813   0.01656723
96    0.0417   0.006327    0.007237    0.0116224   0.01668499
97    0.02     0.006301    0.007196    0.0116631   0.01680235
98    0.02     0.006273    0.007154    0.0117036   0.01691932
99    0.02     0.006244    0.007109    0.0117438   0.01703591
100   0.02     0.006214    0.007063    0.0117838   0.01715211
101   0.02     0.006183    0.007015    0.0118234   0.01726794
102   0.02     0.006151    0.006965    0.0118629   0.0173834
103   0.02     0.006117    0.006914    0.011902    0.01749849
104   0.02     0.006082    0.006861    0.0119409   0.01761323
105   0.02     0.006046    0.006806    0.0119796   0.0177276
106   0.02     0.006009    0.00675     0.012018    0.01784163
107   0.02     0.005971    0.006692    0.0120562   0.0179553

APPENDIX C

Standard Error of Kernel Estimate of Hazard Function

t         SE(EK)     SE(BK)
0         0.001001   0.001031
1         0.001014   0.001043
2         0.001027   0.001056
3         0.00104    0.001068
4         0.001052   0.001081
5         0.001064   0.001094
6         0.001075   0.001106
7         0.001087   0.001119
8         0.001098   0.001132
9         0.001109   0.001145
10        0.00115    0.00119
11        0.00116    0.001203
12        0.001171   0.001216
13        0.001215   0.001265
14        0.001225   0.001278
15        0.001235   0.001292
16        0.001245   0.001305
17        0.001254   0.001318
18        0.001302   0.001373
19        0.001354   0.001431
20        0.001363   0.001445
21        0.001372   0.001459
22        0.001381   0.001473
23        0.001439   0.001538
24        0.001447   0.001552
25        0.001456   0.001565
26        0.001464   0.001578
27        0.001472   0.001591
28        0.00148    0.001604
29        0.001487   0.001617
30        0.001551   0.001691
31        0.001558   0.001704
32        0.001566   0.001716
33        0.001573   0.001729
34        0.00158    0.001741
35        0.001586   0.001753
36        0.001658   0.001836
37        0.001664   0.001848
38        0.001745   0.001943
39        0.001751   0.001954
40        0.001758   0.001966
41        0.001764   0.001977
42        0.001769   0.001988
43        0.001775   0.001999
44        0.00178    0.002009
45        0.001786   0.002019
46        0.001791   0.002029
47        0.001796   0.002039
48        0.0018     0.002048
49        0.001805   0.002057
50        0.001809   0.002065
51        0.001813   0.002074
52        0.001817   0.002082
53        0.001821   0.002089
54        0.001913   0.002199
55        0.001917   0.002207
56        0.002024   0.002333
57        0.002028   0.002341
58        0.002031   0.002347
59        0.002157   0.002497
60        0.00216    0.002503
61        0.002163   0.002509
62        0.002165   0.002515
63        0.002168   0.00252
64        0.00217    0.002525
65        0.002172   0.00253
66        0.002174   0.002534
67        0.002175   0.002538
68        0.002176   0.002542
69        0.002178   0.002545
70        0.002179   0.002547
71        0.002179   0.00255
72        0.00218    0.002552
73        0.00218    0.002553
74        0.00218    0.002555
75        0.002331   0.002732
76        0.002331   0.002733
77        0.00233    0.002733
78        0.00233    0.002732
79        0.002329   0.002732
80        0.002328   0.002731
81        0.002326   0.002729
82        0.002325   0.002728
83        0.002323   0.002725
84        0.002321   0.002723
85        0.002319   0.00272
86        0.002317   0.002716
87        0.002314   0.002713
88        0.002311   0.002709
89        0.002308   0.002704
90        0.002305   0.002699
91        0.002302   0.002694
92        0.002298   0.002688
93        0.002478   0.002897
94        0.002474   0.00289
95        0.002469   0.002882
96        0.002464   0.002875
97        0.002694   0.00314
98        0.002688   0.003131
99        0.002681   0.003121
100       0.002675   0.003111
101       0.002668   0.0031
102       0.002661   0.003089
103       0.002654   0.003078
104       0.002959   0.003428
105       0.00295    0.003414
106       0.002941   0.0034
107       0.005863   0.006771
AVERAGE   0.001929   0.002192

APPENDIX D

The Application of Mathcad in Kernel Estimate of the Hazard Function

The worksheet defines the survival times t and the censoring indicators d as column vectors of 48 entries each, listed here row-wise:

t := (1, 1, 1, 3, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 10, 10, 10, 10, 10, 11, 12, 12, 13, 14, 15, 15, 16, 16, 17, 18, 18, 18, 18, 23, 24, 36, 40, 40, 40, 50, 51, 52, 56, 65, 66, 76, 88, 91)

d := (1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1)

h := 94.5    k := 2    i := 1 .. 48    n := 48    x := 0 .. 90

Epanechnikov Kernel

$$\lambda_E(x) := \frac{1}{h}\sum_{i=0}^{47}\left[0.75\left[1 - \left[\frac{x - t_i}{h}\right]^{2}\right]\frac{d_i}{n - i + 1}\right]$$

Standard Error

Standard Error of Epanechnikov Kernel

$$v_E(K) := \int_{-1}^{1}\left[0.75\left(1 - x^{2}\right)\right]^{2}dx, \qquad v_E(K) = 0.6$$

Biweight Kernel

$$\lambda_B(x) := \frac{1}{h}\sum_{i=0}^{47}\left[\frac{15}{16}\left[1 - \left[\frac{x - t_i}{h}\right]^{2}\right]^{2}\frac{d_i}{n - i + 1}\right]$$

APPENDIX E

The Newton-Raphson Procedure

Models for censored survival data are usually fitted by using the Newton-Raphson procedure to maximize the partial likelihood function, and so the procedure is outlined in this section.

Let $\mathbf{u}(\boldsymbol{\beta})$ be the $p \times 1$ vector of first derivatives of the log-likelihood function in the equation

$$\log L(\boldsymbol{\beta}) = \sum_{i=1}^{n}\delta_i\left\{\boldsymbol{\beta}'\mathbf{x}_i - \log\sum_{l \in R(t_i)}\exp(\boldsymbol{\beta}'\mathbf{x}_l)\right\}$$

with respect to the β-parameters. This quantity is known as the vector of the efficient scores. Also, let $\mathbf{I}(\boldsymbol{\beta})$ be the $p \times p$ matrix of negative second derivatives of the log-likelihood, so that the (j, k)th element of $\mathbf{I}(\boldsymbol{\beta})$ is

$$-\frac{\partial^{2}\log L(\boldsymbol{\beta})}{\partial\beta_j\,\partial\beta_k}.$$

The matrix $\mathbf{I}(\boldsymbol{\beta})$ is known as the observed information matrix.
According to the Newton-Raphson procedure, an estimate of the vector of β-parameters at the (s + 1)th cycle of the iterative procedure, $\hat{\boldsymbol{\beta}}_{s+1}$, is

$$\hat{\boldsymbol{\beta}}_{s+1} = \hat{\boldsymbol{\beta}}_{s} + \mathbf{I}^{-1}(\hat{\boldsymbol{\beta}}_{s})\,\mathbf{u}(\hat{\boldsymbol{\beta}}_{s}),$$

for s = 0, 1, 2, ..., where $\mathbf{u}(\hat{\boldsymbol{\beta}}_{s})$ is the vector of efficient scores and $\mathbf{I}^{-1}(\hat{\boldsymbol{\beta}}_{s})$ is the inverse of the information matrix, both evaluated at $\hat{\boldsymbol{\beta}}_{s}$. The procedure can be started by taking $\hat{\boldsymbol{\beta}}_{0} = \mathbf{0}$. The process is terminated when the change in the log-likelihood function is sufficiently small.

When the iterative procedure has converged, the variance-covariance matrix of the parameter estimates can be approximated by the inverse of the information matrix evaluated at $\hat{\boldsymbol{\beta}}$, that is, $\mathbf{I}^{-1}(\hat{\boldsymbol{\beta}})$. The square roots of the diagonal elements of this matrix are then the standard errors of the estimated values of $\beta_1, \beta_2, \ldots, \beta_p$.
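The iteration described above can be sketched in Python for the special case of a single explanatory variable (p = 1), where the efficient score and the observed information are scalars. This is only an illustrative sketch under stated assumptions, not the procedure implemented in the software used in this dissertation: the function names `cox_score_info` and `newton_raphson_cox` and the eight-subject data set are hypothetical, and tied event times are assumed to be absent.

```python
import math


def cox_score_info(beta, times, events, x):
    """Log partial likelihood, efficient score and observed information (p = 1)."""
    loglik = score = info = 0.0
    for i, (t_i, delta_i) in enumerate(zip(times, events)):
        if not delta_i:
            continue  # censored times enter only through the risk sets
        # Risk set R(t_i): subjects still under observation at time t_i.
        risk = [l for l, t_l in enumerate(times) if t_l >= t_i]
        weights = [math.exp(beta * x[l]) for l in risk]
        s0 = sum(weights)
        s1 = sum(w * x[l] for w, l in zip(weights, risk))
        s2 = sum(w * x[l] ** 2 for w, l in zip(weights, risk))
        loglik += beta * x[i] - math.log(s0)
        score += x[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return loglik, score, info


def newton_raphson_cox(times, events, x, tol=1e-10, max_iter=50):
    """Iterate beta <- beta + u(beta) / I(beta), starting from beta = 0."""
    beta = 0.0
    loglik, score, info = cox_score_info(beta, times, events, x)
    for _ in range(max_iter):
        beta += score / info
        new_loglik, score, info = cox_score_info(beta, times, events, x)
        converged = abs(new_loglik - loglik) < tol
        loglik = new_loglik
        if converged:
            break
    # Standard error from the inverse of the observed information.
    return beta, 1.0 / math.sqrt(info)


# Hypothetical data: event indicator 1 = event observed, 0 = right censored.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [1, 0, 1, 0, 1, 1, 0, 0]

beta_hat, se_beta = newton_raphson_cox(times, events, x)
print(beta_hat, se_beta)
```

For p greater than 1 the same steps apply with the score as a vector and the information as a matrix, so the division by the information becomes the solution of a linear system.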