Content Validity

OBJECTIVES
1. Be able to explain what validity and reliability mean for a research measuring instrument.
2. Be able to describe the types of validity and reliability of the measuring instruments used in research.

VALIDITY AND RELIABILITY

Validity
Validity refers to the degree to which our test or other measuring device is truly measuring what we intended it to measure.
• The extent to which an instrument measures what it is supposed to measure.
• Validity is the ability of a test to measure what it is supposed to measure (Youngman & Eggleston, 1982; Sax & Newton, 1997).
• The validity of a measuring instrument refers to the extent to which the instrument measures the data needed to achieve the objectives of the study (Mohd Majid Konting, 1990).

Types of validity:
• Construct validity: determination of the significance, meaning, purpose, and use of the scores.
• Criterion validity (criterion-referenced): scores are a predictor of an outcome or criterion they are expected to predict.
  o Concurrent evidence
  o Predictive evidence
• Content validity: the items are representative of all possible questions that could be asked. Content validation is usually carried out by experts.

Sources of validity evidence:
• Based on content
• Based on internal structure
• Based on relations to other variables

Content Validity
The extent to which an instrument covers the content of a field.
• The main goal is to ensure that all the content being measured reflects the field.
• Based on the scope, objectives, and content of the field being studied.
• The opinion of experts or external evaluators is needed to judge the suitability of the items for the chosen domain.

…is concerned with a test's ability to include or represent all of the content of a particular construct. The question "1 + 1 = ___" may be a valid basic addition question. Would it represent all of the content that makes up the study of mathematics? It may be included on a scale of intelligence, but does it represent all of intelligence?
The answer to these questions is obviously no. To develop a valid test of intelligence, there must be questions not only on math but also on verbal reasoning, analytical ability, and every other aspect of the construct we call intelligence. There is no easy way to determine content validity aside from expert opinion.

1. Do the items appear to represent the thing you are trying to measure?
2. Does the set of items underrepresent the construct's content (i.e., have you excluded any important content areas or topics)?
3. Do any of the items represent something other than what you are trying to measure (i.e., have you included any irrelevant items)?

Before an instrument can be said to have content validity, the following conditions must be met:
1. The content domain must be stated in behavioral terms whose meaning is generally accepted.
2. The domain must be described clearly.
3. The domain must be relevant to the purpose for which the test is used.
4. Qualified judges must agree that the domain has been sampled adequately.

Evidence Based on Internal Structure
• To measure several components or dimensions of a construct.
• Use factor analysis, which analyzes the correlations among test items and tells you the number of factors present. It tells you whether the test is unidimensional or multidimensional.
• Unidimensional: all the items measure a single construct.
• Multidimensional: different sets of items tap different constructs or different components of a broader construct.

…… Internal Structure
• Factor analysis tells you how many dimensions or factors your test items represent.
• You can also obtain a measure of test homogeneity (i.e., the degree to which the different items measure the same construct or trait).
• Use coefficient alpha (Cronbach's alpha) to test homogeneity.
• If alpha is low (e.g., < .70) for the test, then some items might be measuring different constructs or some items might be bad.
• Examine the items that are contributing to your low coefficient alpha and consider eliminating or revising them.

Criterion Validity
• Obtained by relating your test scores to a relevant criterion.
• A criterion is the standard or benchmark that you want to predict accurately on the basis of scores from your test.
• The extent of the relationship between the instrument and an independent external criterion (whether the items measure the criterion they are meant to measure).
• Determined by correlation analysis between two sets of scores.
• The correlation coefficients calculated in the study of validity are called validity coefficients.

Concurrent validity refers to a measurement device's ability to vary directly with a measure of the same construct or inversely with a measure of an opposite construct. It allows you to show that your test is valid by comparing it with an already valid test. Administer the focal test and the criterion test at approximately the same point in time (i.e., concurrently) and then correlate the two sets of scores. If the two sets of scores are highly correlated, you have concurrent evidence.

For example, a new test of adult intelligence would have concurrent validity if it had a high positive correlation with the Wechsler Adult Intelligence Scale, since the Wechsler is an accepted measure of the construct we call intelligence. An obvious concern is the validity of the test against which you are comparing your own. Some assumptions must be made, because there are many who argue that the Wechsler scales, for example, are not good measures of intelligence.

Predictive Evidence
• Obtain predictive evidence of validity by measuring your participants at one point in time on your test and then, at a future time, measuring them on the criterion measure.
• This takes more time and effort than concurrent evidence, but it can provide superior evidence that your test does what you want it to do.
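The correlation step described above can be sketched in a few lines of Python. This is only an illustration: the score lists are made-up values for six hypothetical participants, `pearson_r` is a helper written for this sketch, and the Pearson coefficient is assumed as the correlation measure (the slides do not name a specific one).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores (illustrative only, not real data).
new_test = [12, 15, 9, 20, 17, 11]     # focal test
criterion = [30, 36, 25, 48, 40, 29]   # criterion measure

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 3))
```

For concurrent evidence the two lists would be collected at about the same time; for predictive evidence the criterion scores would be collected later. The calculation itself is the same in both cases.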
In order for a test to be a valid screening device for some future behavior, it must have predictive validity. The SAT is used by college screening committees as one way to predict college grades. The GMAT is used to predict success in business school. And the LSAT is used as a means to predict law school performance. The main concern with these, and many other predictive measures, is predictive validity, because without it they would be worthless.

Reliability
Reliability is synonymous with the consistency of a test, survey, observation, or other measuring device. Imagine stepping on your bathroom scale and weighing 140 pounds, only to find that your weight on the same scale changes to 180 pounds an hour later and 100 pounds an hour after that. Based on the inconsistency of this scale, any research relying on it would certainly be unreliable. Consider an important study on a new diet program that relies on your inconsistent or unreliable bathroom scale as the main way to collect information regarding weight change. Would you consider its results accurate?
• The extent to which an instrument consistently measures what it is meant to measure.
• Scores from the measured variables are stable and consistent.

Types of reliability:
• Test-retest reliability
• Internal consistency reliability
• Equivalent-forms reliability

Test-Retest Reliability
Refers to the consistency or stability of test scores when the test is administered at different times.
Example: A test is given to 100 individuals at one time and repeated at a different time. The two sets of scores are then correlated. If the individuals who score highest on test 1 also score highest on test 2, and the individuals who score lowest on test 1 also score lowest on test 2, the two sets of scores are highly correlated. The test items are therefore said to have high reliability.

Equivalent-Forms Reliability
Refers to the consistency of a group of individuals' scores on two equivalent forms of a test designed to measure the same characteristic.
• Uses one instrument that has been newly constructed and another that is standardized.
• Administered to the same subjects, either at the same time or at a different time.
• Equivalent forms means that two tests are constructed so that they are identical in every way except for the specific items asked on the test.
• This means that they have the same number of items, the items are at the same difficulty level, the items measure the same construct, and the tests are administered, scored, and interpreted in the same way.
• The two sets of scores are then correlated. This reliability coefficient should be very high and positive; that is, individuals who do well on the first form of the test should also do well on the second form, and individuals who perform poorly on the first form should also perform poorly on the second form.

Internal Consistency Reliability
• Internal consistency refers to how consistently the items on a test measure a single construct or concept.
• Test-retest methods of assessing reliability are general methods that can be used with just about any test.
• Internal consistency measures are convenient and very popular with researchers because they require only one group of individuals to take the test one time.
• Two indexes of internal consistency:
  o Split-half reliability
  o Coefficient alpha

Split-half reliability
• Split a test into two equivalent halves and then assess the consistency of the scores across the two halves.
• Divide the test into halves and correlate the scores from the two halves.
• Adjust the correlation between the two halves with the Spearman-Brown formula to estimate the reliability of the full-length test.
• A low correlation indicates that the test is unreliable; a high correlation indicates that the test is reliable.

Coefficient alpha (Cronbach's alpha)
• Lee Cronbach (1951) developed coefficient alpha.
• Coefficient alpha tells you the degree to which the items are interrelated.
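Both indexes of internal consistency can be computed from a single table of item scores. The sketch below is a minimal illustration, not a replacement for a statistics package. The response matrix is hypothetical, the odd/even item split is one assumed way of forming two halves, and the function names are chosen for this example.

```python
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(scores):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    scores: one row per respondent, each row a list of item scores.
    """
    odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    # Spearman-Brown: estimate reliability of the full-length test.
    return 2 * r_half / (1 + r_half)

def cronbach_alpha(scores):
    """Coefficient alpha: (k / (k - 1)) * (1 - sum of item variances / total variance)."""
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 6 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [4, 4, 5, 5],
]

split_half = split_half_reliability(responses)
alpha = cronbach_alpha(responses)
print(round(split_half, 3), round(alpha, 3))
```

With real data, a different way of splitting the items (for example, first half versus second half) can give a different split-half value, which is one reason coefficient alpha is often reported instead.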
Rule of thumb:
• At a minimum, coefficient alpha should be greater than or equal to .70 for research purposes and somewhat greater than that value (e.g., ≥ .90) for clinical testing purposes.

• Item statements must be clear and precise.
• Instructions must be clear and concise.
• Items should be of a uniform type.
• The measurement situation and timing should be standardized, uniform, and controlled.
• Avoid disturbing the subjects.
• Reduce subjects' anxiety by assuring them of the safety and confidentiality of the information they give.

Pilot Test
• The final phase of a survey before data collection begins.
• Its goal is to find problems in the questionnaire, including weak questions, incomplete instructions, and items that are difficult to answer.
• The actual target group must not be used.
• For a new study, run the pilot test twice.
• The number of respondents is not precisely fixed; at least 25 is suggested, and 50 to 75 is better.

• Train researchers to collect observational data.
• Develop standard written procedures for administering an instrument.
• Obtain permission to collect and use public documents.
• Respect individuals and sites during data gathering (ethics).
• Institutional or organizational (e.g., school district)
• Parents of participants who are not considered adults
• Campus approval (e.g., university or college) and Institutional Review Board (IRB)
