Psychological Measurement and Testing Course Material

PsyQuesta Learn Psychology with Afa
PSYCHOLOGICAL MEASUREMENT AND TESTING
B.Sc Psychology SEMESTER - 3

MODULE - 1 INTRODUCTION TO MEASUREMENT AND SCALING TECHNIQUES

A. DEFINITION OF MEASUREMENT
• “Assignments of numerals according to rules.” - Tyler
• “Measurement consists of rules for assigning numbers to objects in such a way as to represent quantities of attributes.” - Nunnally
Measurement is the assignment of scores to individuals so that the scores represent some characteristic of the individuals. Psychological measurement can be achieved in a wide variety of ways, including self-report, behavioural and physiological measures.

B. SCALES OF MEASUREMENT
• The way in which numbers are assigned determines the scale of measurement. Each scale of measurement represents a particular property or set of properties of the abstract number system.

C. LEVELS OF MEASUREMENT
1. Nominal scale
• Lowest level of measurement
• Numbers are used to name, identify or classify persons, objects or groups
• Nominal scales are not really scales; their only purpose is to name objects
• Here, members of any two groups are never equivalent, but all members of any one group are always equivalent.
• Mathematical operations like addition, subtraction, multiplication and division are not possible
• Admissible statistical operations: counting or frequency, percentage, proportion, mode and coefficient of contingency.
• Drawback: this is the most elementary and simple scale
2. Ordinal scale
• Second level of measurement, which has the property of magnitude but not of equal intervals or an absolute zero.
• Numbers denote the rank order of the objects or individuals. Here numbers are arranged from highest to lowest or from lowest to highest.
• This measure reflects which person is larger or smaller, heavier or lighter, brighter or duller etc.
• E.g., socioeconomic status of people.
Every member of the upper class is higher in social prestige than every member of the middle or lower classes. Likewise, every member of the middle class is higher in social prestige than every member of the lower class.
• Permissible statistical operations: median, percentile and rank correlation coefficient, plus all those permissible for the nominal scale.
• Drawback: ordinal measures are not absolute quantities, nor do they convey that the distances between the different rank values are equal. Because they have neither equal intervals nor an absolute zero, there is no way to ascertain how much of the characteristic a person has, or whether they have any of it at all.
3. Interval scale
• Third level of measurement
• Includes all the characteristics of the nominal and ordinal scales
• The salient feature of this scale is that numerically equal distances on the scale indicate equal distances in the properties of the objects being measured.
• In other words, the unit of measurement is constant and equal
• This scale does not have an absolute zero; its zero point is arbitrary, i.e., the zero point does not indicate the real absence of the property being measured.
• Permissible statistical operations: AM (arithmetic mean), SD (standard deviation), Pearson r, t-test, F-test and other statistics based upon them.
• Drawback: the coefficient of variation cannot be applied.
4. Ratio scale
• Highest level of measurement
• Has all the properties of the nominal, ordinal and interval scales plus an absolute zero point
• Salient feature of this scale: the ratio of any two numbers is independent of the unit of measurement
• Anything that can be measured from absolute zero can be measured on a ratio scale. E.g., measures of height, weight, length, width etc.
• Here, the zero point is not arbitrary, but true.
• All statistical operations, including the coefficient of variation, can be used.
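
The difference between an arbitrary zero (interval scale) and a true zero (ratio scale) can be illustrated with a short Python sketch. The data values and the helper name coefficient_of_variation are hypothetical and chosen only for illustration. The sketch converts the same measurements into another unit: for weights (ratio scale) the coefficient of variation stays the same, whereas for temperatures in Celsius and Fahrenheit (interval scales) it changes, which is why it should not be applied to interval data.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return SD / mean; interpretable only when zero is a true zero."""
    return pstdev(values) / mean(values)

# Ratio scale (hypothetical data): weights in kilograms, converted to pounds.
weights_kg = [50.0, 60.0, 70.0, 80.0]
weights_lb = [w * 2.20462 for w in weights_kg]

# Interval scale (hypothetical data): Celsius converted to Fahrenheit.
temps_c = [10.0, 20.0, 30.0, 40.0]
temps_f = [t * 9 / 5 + 32 for t in temps_c]

cv_kg = coefficient_of_variation(weights_kg)
cv_lb = coefficient_of_variation(weights_lb)
cv_c = coefficient_of_variation(temps_c)
cv_f = coefficient_of_variation(temps_f)

# Same weights, different unit: the CV is unchanged.
print(f"CV of weights: {cv_kg:.4f} (kg) vs {cv_lb:.4f} (lb)")
# Same temperatures, different unit and zero point: the CV changes.
print(f"CV of temperatures: {cv_c:.4f} (C) vs {cv_f:.4f} (F)")
```

The mean and SD themselves remain meaningful on both scales; only statistics that depend on a true zero, such as ratios and the coefficient of variation, break down on an interval scale.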

In social sciences like psychology and sociology, we frequently encounter interval measurement, because most of the data obtained from measurement are such that we assume an equal unit of measurement and an arbitrary zero point.

D. PROPERTIES OF LEVELS OF MEASUREMENT
1. Identity
Numbers are assigned to a respondent’s responses purely for the sake of identification. These numbers can’t be used for mathematical operations. E.g., roll numbers of students in a class.
2. Magnitude
The property of “moreness”. A scale is said to have the property of magnitude if it can be said that a particular instance of the attribute represents more, less or an equal amount of the given quantity than does another instance. E.g., height and weight of students in a class.
3. Equal interval
A scale has the property of equal interval if the difference between any two points at any place on the scale has the same meaning as the difference between two other points that differ by the same number of scale units. E.g., the difference between 4 kg and 6 kg on a weighing scale represents the same quantity as the difference between 14 kg and 16 kg, i.e., exactly 2 kg. Psychological tests rarely have the property of equal interval. E.g., the difference between IQ 50 and 60 does not mean the same as the difference between IQ 110 and 120, although each pair differs by 10 points.
4. Absolute zero
An absolute zero is said to exist when nothing of the property being measured exists. A temperature of 0 degrees (e.g., on the Celsius scale) is not an absolute zero: it still has some effect, and we can’t say that there is no temperature. It is very difficult or impossible to define zero for a psychological measurement.

DISTINCTION BETWEEN PSYCHOLOGICAL MEASUREMENT AND PHYSICAL MEASUREMENT
PHYSICAL MEASUREMENT | PSYCHOLOGICAL MEASUREMENT
1. Unit of measurement is fixed and constant throughout the measurement | 1. Unit of measurement is not fixed and varies throughout the measurement
2. There is a true zero point | 2. There is an arbitrary zero point
3. More accurate and predictable | 3. Less accurate and predictable
4. Measurement is direct | 4. Measurement is indirect
5. Entire quantity can be measured | 5. Entire quantity cannot be measured

• Psychological (or qualitative) measurement comprises the measurement of mental processes, traits, habits, tendencies and likes of an individual.
• Physical (or quantitative) measurement comprises the measurement of objects, height, weight, length etc. which are physically present in the world.

E. PROBLEMS IN PSYCHOLOGICAL MEASUREMENT
1. Indirectness of measurement
Most psychological and educational measurements are indirect. This is because psychological and educational variables cannot be observed and studied directly. E.g., the intelligence of a student cannot be seen, touched or experienced.
2. Incompleteness of measurement
Psychological measures are generally incomplete, and therefore the measurement of psychological variables is also incomplete. E.g., when an investigator is assessing the attitude towards coeducation, he is required to construct a scale that incorporates a number of samples of behaviour expressing such an attitude. This number has no limit, so any attempt to measure such an attitude would be partial and incomplete.
3. Relativity in measurement
Psychological measurements are relative. This can be explained with an example: suppose Mohan, a student of class X, was given a test of arithmetical knowledge and a test of English language. Let us further suppose that he scored 95% in English language but could not answer even a single item on the arithmetic test. We cannot say that Mohan is a brilliant student on the basis of the English test result, nor can we say that he is dull on the basis of the arithmetic test result.
4. Errors in measurement
Measurement in the physical sciences as well as the behavioural sciences is most of the time not pure.
It contains some uncontrolled factors, which produce some gross errors.
Sources of errors in measurement:
1) Respondent: One of the important sources of error is the respondent himself. Sometimes the respondent is reluctant to express his/her true feelings, or, due to lack of knowledge, may not express himself clearly. In either case, the measurement loses its accuracy.
2) Measurer: Sometimes the behaviour, style and looks of the person who is measuring the phenomenon distort the process of measurement. They may encourage or discourage certain types of replies from the respondent, which affects the accuracy of the measurement.
3) Situation: Situational factors also contribute to errors in measurement. Any situation that puts unnecessary strain on the respondent tends to introduce errors in measurement.
4) Test instruments: Poor psychometric qualities of the test and defective measuring instruments contribute to errors in measurement.

F. CONCEPTS OF PSYCHOPHYSICS
Psychophysics is defined as the scientific study of the relationship between stimuli and the sensations and perceptions evoked by these stimuli. It was introduced by Gustav Theodor Fechner in the 1860s.
1. Psychophysical methods: A set of procedures that have been developed to investigate sensory thresholds.
2. Absolute threshold: The lowest level of a stimulus (light, sound, touch etc.) which an organism can detect at least 50% of the time. It is also known as the absolute limen (AL), detection threshold, sensation threshold etc.
3. Difference threshold: The smallest amount of change in a physical stimulus necessary for an individual to notice a difference in the intensity of the stimulus. It is also called the difference limen (DL), just noticeable difference (JND) etc.
4. Subliminal perception: Subliminal perception occurs when a stimulus is too weak to be consciously perceived, yet the person is influenced by it. A signal detected less than 50% of the time (e.g., 49%) would be subliminal.
5. Weber’s law
• Developed by E. H. Weber.
• It is the first systematic attempt to formulate a principle governing the relationship between psychological experience and physical stimulus.
• There are two types of stimuli:
  • Standard stimulus: fixed, no change in intensity or properties
  • Comparable stimulus: varies, intensity or properties can be altered
• “When the magnitude of the standard stimulus is increased, the size of the change needed for discrimination between the standard and comparable stimulus (or JND) is also increased.” E.g., candles in a room: one additional candle is easily noticed in a dimly lit room but not in a brightly lit one.
• Equation: $\Delta R / R = K$
where,
ΔR = difference threshold or just noticeable difference (JND)
R = standard stimulus
K = constant
• The constant K in Weber’s law is always a fraction, known as Weber’s fraction or proportion. It indicates the proportion by which the standard stimulus must be increased in order to produce a just noticeable difference (JND).
• Weber’s law is regarded as a good measure of the overall sensitivity in different sense modalities.
• Drawback: precision is lost at the extremes, i.e., when the standard stimulus becomes either very weak or very strong, the law loses its precision to a great extent.
6. Fechner’s law
• Developed by Gustav Theodor Fechner
• Fechner’s law is an improvement upon Weber’s law
• Fechner’s law states that the stimulus value and the resulting psychological sensation have a logarithmic relationship, i.e., when the stimulus value increases in geometric progression, the psychological sensation increases in arithmetic progression.
• Equation: R = K log S
where,
R = magnitude of sensation or response
K = Weber’s constant
S = magnitude of stimulus value
• In other words, the law states that the magnitude of sensation (or response) varies directly with the logarithm of the stimulus value.
• Increase by multiplication is termed geometric progression, and increase by addition is termed arithmetic progression.
• E.g., perceived loudness or brightness is proportional to the logarithm of the actual intensity measured with an accurate non-human instrument.

H. PSYCHOPHYSICAL & PSYCHOLOGICAL SCALING METHODS
Scaling methods are methods through which stimuli or individuals are sorted according to some known and specified characteristics or attributes. Depending upon the class of stimuli, there are two types of scaling methods: psychophysical and psychological scaling methods.

Psychophysical scaling methods
In these methods, the stimuli presented have a physical referent, i.e., they can be described on a physical scale, e.g., grams, decibels, metres, centimetres etc. The purpose of the psychophysical scaling methods is to discover some definite quantitative relationship between the subjective measurement of stimuli (as judged by the subject) and the objective measurement by physical scales. E.g., method of average error (or method of adjustment), method of minimal changes (or method of limits), method of constant stimuli (or method of right and wrong cases) etc.

Psychological scaling methods
In psychological scaling methods, stimuli possess psychological attributes, which lack an appropriate physical referent. According to Torgerson, “psychological scaling methods are procedures for constructing scales for the measurement of psychological attributes”. For measuring psychological attributes like sense of humour, extent of honesty, aggressiveness, pleasantness and so on, there are no physical scales.
Therefore, these attributes are best measured by psychological scales.

The main scaling methods are:
1. Method of average error
2. Method of minimal changes
3. Method of constant stimuli
4. Method of pair comparison
5. Method of rank order

Method of Average Error
• Also known as the method of adjustment, method of reproduction or method of equivalent stimuli
• Introduced by Gustav Fechner
• The subject is provided with a Standard stimulus (St) and a Comparable stimulus (Co), which may be greater or lesser in intensity than the standard stimulus. The subject is then required to adjust the Comparable stimulus until it appears to be equivalent to the standard stimulus.
• The difference between St and Co defines the error in each judgement.
• A large number of such judgements are obtained and their arithmetic mean is computed. Hence the name method of average error.
• The value thus obtained is the Point of Subjective Equality (PSE).
• The difference between St and PSE indicates the presence of the constant error (CE).
• The purpose of this method is to determine equivalent stimuli through active adjustment of the Co by the subject in each trial; hence the method is also known as the “method of equivalent stimuli”.

Method of Minimal Changes
• Also known as the method of limits or method of JND.
• For computing thresholds in this method, two modes of presentation are adopted: ascending series and descending series.
• For computing the differential threshold, the comparable stimulus (Co) is varied in the smallest possible steps in either ascending or descending series, and the subject is required to say at each step whether the Co is smaller (-), equal (=) or greater (+) than the St.
• For computing the absolute threshold no St is needed; the subject simply reports whether they can detect the stimulus presented in ascending and descending series.
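
The computations behind the method of average error and the method of minimal changes can be sketched in Python. This is only an illustrative sketch under stated assumptions: all data are hypothetical, the function names are invented for this example, the constant error is taken as PSE minus St (a common sign convention that the notes do not specify), and the transition point of each series is taken as the midpoint between the two stimulus values where the response changes.

```python
from statistics import mean

def point_of_subjective_equality(adjusted_values):
    """PSE: arithmetic mean of the subject's Co settings across trials."""
    return mean(adjusted_values)

def constant_error(standard, adjusted_values):
    """CE: signed difference PSE - St (assumed sign convention)."""
    return point_of_subjective_equality(adjusted_values) - standard

def series_threshold(intensities, detected):
    """Transition point of one series: midpoint between the stimulus
    values on either side of the first change in response (assumed rule)."""
    for i in range(1, len(detected)):
        if detected[i] != detected[i - 1]:
            return (intensities[i - 1] + intensities[i]) / 2
    raise ValueError("response never changed in this series")

def absolute_threshold(series):
    """Mean of the transition points over ascending and descending series."""
    return mean(series_threshold(x, d) for x, d in series)

# Hypothetical data: St is a 100 mm line; the subject adjusts Co five times.
standard_mm = 100.0
settings_mm = [98.0, 103.0, 101.0, 99.0, 104.0]
pse = point_of_subjective_equality(settings_mm)
ce = constant_error(standard_mm, settings_mm)

# Hypothetical ascending and descending series (arbitrary intensity units);
# True means the subject reported detecting the stimulus.
ascending = ([1, 2, 3, 4, 5], [False, False, False, True, True])
descending = ([6, 5, 4, 3, 2], [True, True, True, False, False])
al = absolute_threshold([ascending, descending])

print(f"PSE = {pse} mm, CE = {ce} mm, absolute threshold = {al}")
```

With these made-up numbers the subject slightly overestimates the standard (a positive CE), and both series change response between intensities 3 and 4. In practice many more trials and series would be averaged.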

Method of Constant Stimuli
• Also known as the method of right and wrong cases or the method of frequency
• In this method a number of constant stimuli are presented to the subject several times in random order.
• To determine the absolute threshold, the subject has to report each time whether he perceives or does not perceive the stimulus presented.

Method of Pair Comparison
• Originally introduced by Cohn in 1894 and further developed by Thurstone, who formulated it in 1927.
• In this method, stimuli are paired and the subject is required to make a comparative judgement by saying which member of each pair possesses more of the trait being scaled.
• Extension of the method of constant stimuli
• Each stimulus is compared with every other stimulus, and therefore each stimulus serves as a standard stimulus in turn. When the subject is presented with the two stimuli, they are required to say which one of them is greater. No equality judgement is allowed.

Method of Rank Order
• Also known as “order of merit”.
• Developed from the works of Cattell and Spearman.
• In this method, all stimuli are presented simultaneously to the subject, who is requested to rank them in order from high to low.
• Rank 1 indicates the best, and a higher rank number indicates a more inferior position of the stimulus being ranked.
• Most suited when the number of stimuli or objects to be ranked is small.
• No problem regarding the sequence of presentation of objects arises.
• The scale resulting from this method is known as an “ordinal scale”.
• Since objects are assigned higher and lower ranks, there is no way to commit the error of central tendency.

MODULE - 2 NATURE AND USE OF PSYCHOLOGICAL TESTS

A. DEFINITION OF PSYCHOLOGICAL TEST
• It is a standardized instrument consisting of a series of questions called items, which assess certain aspects of a person’s individual ability and describe it in terms of scores and categories.
• “A psychological test is essentially an objective and standardized measure of a sample of behaviour.” - Anne Anastasi (1988)

B. HISTORICAL BACKGROUND
• Psychological testing in its modern form originated little more than 100 years ago in laboratory studies of sensory discrimination and reaction time.
• It was the English biologist Sir Francis Galton who was primarily responsible for launching the testing movement.
• A unifying factor in Galton's numerous and varied research activities was his interest in human heredity. He devised the first battery of tests for assessing sensory and motor functions.
• He also set up an anthropometric laboratory at the International Exposition of 1884 where, by paying three pence, visitors could be measured in certain physical traits and could take tests of keenness of vision and hearing, muscular strength, reaction time, and other simple sensorimotor functions.
• It was Galton's belief that tests of sensory discrimination could serve as a means of gauging a person's intellect.
• The newly established science of experimental psychology and the still newer testing movement merged in the work of the American psychologist James McKeen Cattell.
• Kraepelin (1895), who was interested primarily in the clinical examination of psychiatric patients, prepared a long series of tests to measure what he regarded as basic factors in the characterization of an individual. The tests, employing chiefly simple arithmetic operations, were designed to measure practice effects, memory, and susceptibility to fatigue and to distraction.
• A few years earlier, Oehrn (1889), a pupil of Kraepelin, had employed tests of perception, memory, association, and motor functions in an investigation on the interrelations of psychological functions.
• Another German psychologist, Ebbinghaus (1897), administered tests of arithmetic computation, memory span, and sentence completion to schoolchildren.
The most complex of the three tests, sentence completion, was the only one that showed a clear correspondence with the children's scholastic achievement.
• Like Kraepelin, the Italian psychologist Ferrari and his students were interested primarily in the use of tests with pathological cases.
• An extensive and varied list of tests was proposed, covering such functions as memory, imagination, attention, comprehension, suggestibility, aesthetic appreciation, and many others. In these tests we can recognize the trends that were eventually to lead to the development of the famous Binet intelligence scale.
• The examination of the mentally ill around the middle of the 19th century resulted in the development of numerous early tests.
• In 1885 the German physician Hubert von Grashey developed the antecedent of the memory drum as a means of testing brain-injured patients. Shortly thereafter the German psychiatrist Conrad Rieger developed a test battery for brain damage.
• Experimental psychology flourished in the late 1800s in Europe and Great Britain. Pioneers such as Wundt, Galton, Cattell and Wissler showed that it was possible to expose the mind to scientific scrutiny and measurement. Wilhelm Wundt founded the first psychological laboratory in 1879 at Leipzig in Germany, followed by Galton in Britain.
• Cattell studied the new experimental psychology with both Wundt and Galton. He coined the term “mental test” in 1890.
• Thorndike studied under Cattell and made monumental contributions to learning theory and educational psychology. Another of Cattell's students, R. S. Woodworth, authored the very popular and influential book Experimental Psychology.
• Another student of Cattell, E. K. Strong, developed a vocational interest inventory, and the most influential student of Cattell, Wissler, studied intelligence extensively.
• The first modern intelligence test was invented in 1905 by Alfred Binet.
Later, in 1908, Binet and Simon published a revised version of the 1905 scale. In 1911 a third revision of the Binet-Simon scale appeared.
• Later, the Stanford-Binet was published in 1916, incorporating verbal and performance tests.
• The projective approach in psychology is associated with the word association method developed by Francis Galton. Rorschach developed a projective personality test known as the Rorschach inkblot test. In 1935 Morgan and Murray developed the Thematic Apperception Test (TAT) to study personality. In 1928 Payne developed a sentence completion test.

C. CHARACTERISTICS OF A GOOD TEST
I. OBJECTIVITY
A test must be free from subjective elements so that there is complete interpersonal agreement among experts regarding the meaning of the items and the scoring of the test.
• Objectivity here relates to two aspects of the test: objectivity of the items and objectivity of the scoring system.
• Objectivity of items means that the items should be phrased in such a manner that they are interpreted in exactly the same way by all those who take the test.
• To ensure objectivity of items, the items must have a uniform order of presentation (i.e., either ascending or descending order).
• Objectivity of scoring means that the scoring method of the test should be a standard one, so that complete uniformity can be maintained when the test is scored by different experts at different times.
II. RELIABILITY
• A test must also be reliable.
• Reliability refers to the self-correlation of the test. It shows the extent to which the results obtained are consistent when the test is administered once or more than once on the same sample with a reasonable time gap.
• Consistency in the results obtained in a single administration is the index of internal consistency of the test, and consistency in the results obtained upon testing and retesting is the index of temporal consistency.
• Reliability thus includes both internal consistency and temporal consistency.
• For a test to be called sound it must be reliable, because reliability indicates the extent to which the scores obtained on the test are free from such internal defects of standardization as are likely to produce errors of measurement.
III. VALIDITY
• Validity indicates the extent to which the test measures what it intends to measure, when compared with some outside independent criterion.
• In other words, it is the correlation of the test with some outside criterion.
• Generally, the validity of a test depends upon its reliability, because a test which yields inconsistent results (poor reliability) is ordinarily not expected to correlate with some outside independent criterion.
IV. NORMS
• A test must also be guided by certain norms. Norms refer to the average performance of a representative sample on a given test.
• There are four common types of norms: age norms, grade norms, percentile norms and standard score norms.
• Depending upon the purpose and use, a test constructor prepares any of these norms for his test.
• Norms help in the interpretation of scores; in the absence of norms, no meaning can be attached to the score obtained on the test.
V. PRACTICABILITY
• A test must also be practicable from the point of view of the time taken for its completion, its length, scoring etc.
• In other words, the test should not be lengthy, and the scoring method must be neither difficult nor one which can only be carried out by highly specialized persons.

D. ETHICAL ISSUES IN PSYCHOLOGICAL TESTING
Psychological testing refers to all the possible uses, applications and underlying important concepts of psychological and educational tests.
To maintain its proper uses and applications, the American Psychological Association (APA) officially adopted a set of standards and rules in 1953, which have undergone continual review and refinement. The main ethical and moral issues relating to psychological testing can be described under the following five headings.
I. ISSUES OF HUMAN RIGHTS
Various types of human rights have been recognized in psychological testing. Among these is the right not to be tested. Persons who don’t want to subject themselves to testing should not, and ethically can’t, be forced to accept it.
• Moreover, individuals who decide to subject themselves to testing have the right to know their test scores and their interpretation, as well as the basis of any decisions that affect their lives.
• Likewise, other human rights, such as the right to know who will have access to the data of psychological testing and the right to confidentiality of test results, are now widely discussed.
II. ISSUE OF LABELLING
On the basis of psychological testing, a person is given a certain label or diagnosed as having a certain psychiatric disorder. This labelling has many harmful effects.
• Labelling can stigmatize a person for life, and it also affects one’s access to help. Such labelling creates additional problems.
• It also lowers tolerance for stress and makes treatment difficult.
• In view of these potential negative effects and dangers of labelling, a person should have the right not to be labelled.
III. ISSUES OF INVASION OF PRIVACY
When people respond to items of psychological tests, they have little idea of what is being revealed by their responses, but somehow they feel that their privacy has been invaded.
• Psychological tests have very limited and pinpointed aims, and they can’t invade the privacy of the person.
• Psychologists don’t consider it wrong, evil or detrimental to find out or collect information about a person; the subject’s privacy is most often invaded when that information is misused.
• Psychologists are legally and ethically bound to maintain confidentiality and not to reveal any more information than is necessary to accomplish the purpose of testing.
• The ethical code of the APA (1992) includes confidentiality, which means that personal information obtained by psychologists from any source is communicated to others only with the person’s consent.
• Exceptions exist only if withholding the information would pose a danger to the person or to others.
IV. ISSUES OF DIVIDED LOYALTIES
• Psychologists face a conflict when the individual’s welfare, on the one hand, is at odds with that of the institution that employs the psychologist, on the other.
• The psychologist has to maintain test security, but must also not violate the person’s right to know the basis of an adverse decision. Such a decision, when explained to the person, might be passed on to others with the same problem, who could then try to outsmart the test.
V. RESPONSIBILITY OF TEST CONSTRUCTORS AND TEST USERS
• The test constructor is responsible for providing all the necessary information.
• Test constructors must provide a test manual that clearly states the appropriate uses of the test, includes data relating to reliability, validity and norms, and clearly specifies the scoring and administration standards.
• Test users should have adequate knowledge. They should be aware of the psychometric qualities of the test being used as well as the relevant literature.
• Under no circumstances can a test user claim ignorance.

E. FACTORS INFLUENCING TEST ADMINISTRATION
Any influences that are specific to the test situation constitute error variance and reduce test validity. It is therefore important to identify any test-related influences that may limit or impair the generalizability of test results.
1. EXAMINER INFLUENCES
• Memorizing the exact verbal instructions is essential in most individual tests. Some previous familiarity with the statements to be read prevents misreading and hesitation and permits a more natural, informal manner during test administration.
• In individual testing, especially in the administration of performance tests, such preparation involves the actual layout of the necessary materials to facilitate subsequent use with a minimum of search or fumbling.
• Materials should generally be placed on a table near the testing table so that they are within easy reach of the examiner but do not distract the test taker.
• In group testing, all test blanks, answer sheets, special pencils or other materials needed should be carefully counted, checked and arranged in advance of the testing day.
• Whether the examiner is a stranger or someone familiar to the test takers might make a significant difference in scores.
• Children are more susceptible to examiner and situational influences than are adults.
2. TESTING CONDITION
• The environment for testing should also be appropriate.
• A noise-free room, proper seating arrangement and adequate light should be provided to test takers.
• It is important to recognize the conditions which might affect test scores, e.g., noise, privacy, traffic in the testing room etc. Even the type of answer sheet employed may affect test scores.
• Many subtle testing conditions affect performance on ability tests as well as on performance tests.
3. TEST TAKER
• The motivation of the test taker to take a test can greatly influence the test scores.
• The test taker’s emotions, fatigue, any sort of handicap, mental set etc. influence the test results.
• It is important that the test taker be relaxed and that good rapport be established.
• Practices designed to enhance rapport serve also to reduce test anxiety.
Procedures tending to dispel surprise and strangeness from the testing situation and to reassure and encourage the subject should certainly help to PsyQuesta Learn Psychology with Afa lower anxiety 4. TEST ADMINISTRATION PROCESS  The main purpose of testing is to generalize the results obtained from the sample to those in non-test situations.  Any influences that are specific to the test situation constitute error variance and reduce test validity.  The testing process itself, examiner related variables, and the subject/examinee related variables all are potential confounding variables that need to be looked into, watched, and thoroughly controlled for the generalizability of the test. F. CLASSIFICATION OF PSYCHOLOGICAL TEST I. SPEED AND POWER TEST SPEED TEST A pure speed test is one in which individual differences, differ in speed of performance Items are arranged in uniformly low difficulty; the ability of test takers are concerned Speed tests trying to make every question correct if they have enough time. E.g: Clinical Speed and Accuracy Test POWER TEST Pure power tests are measure in performance of how much the testtakers knows or can do Time is up to the test. The difficulty of items is steeply graded, and items have a higher difficulty level also,that cant be answered. Power test is one where all test-takers have enough time to do their best; the only concern is what they can do. E.g: Raven’s Progressive Matrices II. INDIVIDUAL TESTS AND GROUP TESTS INDIVIDUAL TEST Administered to only person at a time Individual tests require one examiner and one subject E.g: Kaufman Scale Tests, KohsBlock Design Test. Rapport is essential Examiner’s role is important GROUP TEST Taken by a number of people together at same time One examiner works with many subject together E.g: WAIS, WISC, Standford -Binet Scales, paper and pencil tests Easy to administer, and easy to score. Examiners role is not as important PsyQuesta Learn Psychology with Afa III. 
VERBAL AND NON-VERBAL TEST
VERBAL TESTS
• Use language to ask questions and demonstrate answers.
• There is no need for the subject to manipulate objects.
• Limited to literate people.
• Require the subject to give verbal responses, by speaking or writing.
PERFORMANCE/NON-VERBAL TESTS
• Minimal or no use of language in solving the problems.
• The subject is required to manipulate objects, trace a maze, arrange pictures or complete patterns.
• Both literate and illiterate people can take the test.
• Visual-spatial tasks are used and non-verbal responses are yielded.
IV. OBJECTIVE AND SUBJECTIVE
OBJECTIVE
• A test that measures an individual's characteristics in a way that is not influenced by the examiner's own beliefs; in this sense it is said to be independent of rater bias.
• Objective items include multiple-choice, true-false, matching and completion items.
• Reliable and valid.
• E.g.: self-report measures
SUBJECTIVE
• An assessment tool that is scored according to personal judgment or to standards that are less systematic.
• Subjective items include short-answer and essay items, problem solving and performance test items.
• Reliability and validity tend to be lower.
• E.g.: case history and interview
V. CULTURE SPECIFIC AND CULTURE FAIR
CULTURE SPECIFIC
• Targets a specific population.
• Results are biased due to cultural influence.
• A population which is influenced by cultural elements displays either low or high scores in relation to the test norms.
CULTURE FAIR
• Does not target a specific population.
• Results are not biased due to cultural influence.
• Administered to identify innate abilities that are not affected by culture.
VI. NORM REFERENCED AND CRITERION REFERENCED
NORM REFERENCED
• Measures the performance of one group of test takers against another group of test takers.
• Each skill is tested by fewer than four items. The items vary in difficulty.
CRITERION REFERENCED
• Measures the performance of test takers against the criteria covered in the curriculum.
• Each skill is tested by at least four items to obtain an adequate sample of the student's performance.
• Norm-referenced test scores are reported as a percentile rank. If a test taker has a percentile rank of 95, it implies that he/she has performed better than 95% of the other test takers.
• Criterion-referenced tests need not be administered in a standardized format. The score determines how much of the curriculum is understood by the test taker.

MODULE- 3 TEST CONSTRUCTION AND ADMINISTRATION
A. GENERAL STEPS TO TEST CONSTRUCTION
I. PLANNING
• The first step in the construction of a test is careful planning. At this stage the test constructor specifies the broad and specific objectives of the test in clear terms and decides upon the nature of the content or items to be included, the type of instructions, the method of sampling, a detailed arrangement for the preliminary and final administrations, the probable length and time limit of the test, the probable statistical methods to be used, etc.
• Planning also includes deciding the number of copies of the test to be reproduced and the preparation of a manual.
II. WRITING DOWN THE ITEMS
The second step in test construction is the preparation of the items of the test.
• According to Bean, an item is defined as a single question or task that is not broken down into any smaller units.
• Item writing starts from the planning done earlier. If the test constructor has decided to prepare an essay test, essay items are written down. If he has decided to create an objective test, he writes objective items such as alternative-response items, matching items, multiple-choice items, completion items, pictorial items, etc. Depending on the purpose, he decides which of these objective item types to write.
• There are some essential prerequisites which must be met if the item writer wants to write good and appropriate items. These requirements are enumerated as follows:
• The item writer must have a thorough knowledge and complete mastery of the subject matter.
• The item writer must be fully aware of the persons for whom the test is meant. He must know their intelligence level so that he can adjust the difficulty level of the items to their ability level.
• The item writer must be familiar with the different types of items along with their advantages and disadvantages. He must also be aware of the characteristics of good items and the common errors in writing items.
• The item writer must have a large vocabulary. He must know the different meanings of a word so that confusion in writing the items may be avoided. He must be able to convey the meaning of the items in the simplest possible language.
• After the items are written, they must be submitted to a group of subject experts for their criticisms and suggestions, in the light of which the items must then be duly modified.
• The item writer must also cultivate a rich source of ideas for items. The common sources are textbooks, journals, discussions, interview questions, course outlines and other instructional materials.
• After the items have been written down, they are reviewed by some experts or by the item writer himself and then arranged in the order in which they are to appear in the final test.
• Generally, items are arranged in increasing order of difficulty, and those having the same form (e.g., alternative-response, matching) and dealing with the same content are placed together.
III. ITEM ANALYSIS
MEANING AND PURPOSE
• After the items are written, revised and edited, they are subjected to a procedure called item analysis.
• Item analysis is a technique through which those items which are valid and suited to the purpose are selected and the rest are either eliminated or modified to suit the purpose.
• It demonstrates how effectively a given test item functions within the total test.
• The validity of the whole test depends upon the validity of the individual items.
• It is a set of procedures that provides an estimate of the validity of each item.
• The main objectives of item analysis are:
• Item analysis indicates which items are difficult, easy, moderately difficult or moderately easy. In other words, it provides an index of the difficulty value of each item.
• It also provides an index of the ability of the item to discriminate between inferior and superior examinees. In other words, item analysis indicates the discrimination value of each item; this is known as item validity.
• It indicates the effectiveness of the distractors in multiple-choice items. Multiple-choice items are the most powerful and flexible objective items, and many standardized tests use this form of item, so item analysis is done to indicate the extent to which the distractors or foils are effective in each item.
• It sometimes also indicates why a particular item in the test has not functioned effectively and how it might be modified so that its functional significance can be increased.
ITEM DIFFICULTY
• In item analysis the first step is to find the difficulty value, or index of difficulty, of each item.
• The difficulty value of an item is defined as the proportion or percentage of examinees who answer the item correctly. This proportion or percentage is known as the index of difficulty of the item.
• If an item is answered correctly by 90% of the examinees, it obviously means that the item is relatively easy and does not discriminate well.
• Maximum discrimination is possible when an item is answered correctly by 50% and wrongly by 50% of the examinees.
• The proportion passing an item is inversely related to the difficulty of the item. The higher the proportion or percentage getting the item right (i.e., the higher the index of difficulty), the easier the item; the lower the proportion getting the item right (i.e., the lower the index of difficulty), the more difficult the item.
• Any test, to be called a measuring instrument, must have some items with higher indices of difficulty; the items should be normally distributed with respect to these indices of difficulty.
• Items of moderate difficulty are preferred because they yield the maximum variance. As the index of the item moves up or down from the middle, the variance of the item gradually decreases, i.e., its ability to distinguish between those who pass and those who fail decreases.
• Basically there are two important methods of determining the difficulty value of an item:
i. Method Of Judgement
• In this method the difficulty value of an item is determined on the basis of the judgement of experts.
• Items are given to a group of experts with the instruction to rank them in increasing order of difficulty. Subsequently, the test constructor takes a final decision, keeping in view the agreement among the ranks assigned to each item by the different judges.
• The demerit of this procedure is that it cannot be fully reliable and objective.
ii. Empirical Method
• Also called the statistical method.
• It is the basic and scientific method of determining the difficulty index of an item.
• There are two common statistical methods through which the difficulty index can be estimated. One is based on the responses of all the examinees, whereas the other is based on the responses of only a portion of the examinees.
P = R/N
P - difficulty index
R - number of examinees who pass the item
N - total number of examinees
ITEM DISCRIMINATION
• The index of discrimination is also known as the validity index.
• The index of discrimination is the ability of the item to distinguish between superior and inferior examinees.
• Bean has defined this index as “the degree to which the single item separates the superior from the inferior individual in the trait or group of traits being measured.”
• From the point of view of discriminatory power, all test items can be divided into items that are (a) positively discriminating, (b) negatively discriminating or (c) non-discriminating.
• A positively discriminating item is one in which the proportion or percentage of correct answers is higher in the upper group than in the lower group. When the proportion or percentage of correct answers is lower in the upper group, the item is called negatively discriminating.
• An item in which the percentage or proportion of correct answers is approximately equal in both groups is called a non-discriminating item.
• The negatively discriminating and non-discriminating items are dropped after item analysis.
IV. PRELIMINARY ADMINISTRATION
When the items have been written down and modified in the light of the suggestions and criticisms given by the experts, the test is said to be ready for its experimental try-out. The purposes of the preliminary administration of any psychological test are as given below:
i. Finding out the major weaknesses, omissions, ambiguities and inadequacies of the items.
ii. Determining the difficulty value of each item, which in turn helps in selecting items for an even and proper distribution in the final form.
iii. Determining the validity of each individual item. The preliminary administration helps in determining the discriminatory power of each individual item.
iv. Determining a reasonable time limit for the test.
v. Determining the appropriate length of the test, i.e., the number of items to be included in the final form.
vi. Determining the inter-correlations of the items so that overlapping can be avoided.
vii. Identifying any weakness and vagueness in the directions or instructions of the test as well as in the fore-exercises or sample questions of the test.
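The difficulty index P = R/N and the upper-group versus lower-group comparison that the try-out is meant to provide can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed procedure: the notes do not say how the upper and lower groups are formed or how the discrimination value is calculated, so the split by total score (top half versus bottom half), the difference of proportions, the tolerance used for "approximately equal" and all function names are assumptions made for the example.

```python
from typing import List


def difficulty_index(item_responses: List[int]) -> float:
    """P = R / N, where R is the number of examinees who pass the item
    (coded 1) and N is the total number of examinees."""
    n = len(item_responses)
    if n == 0:
        raise ValueError("need at least one examinee")
    return sum(item_responses) / n


def discrimination_index(item_responses: List[int],
                         total_scores: List[float],
                         group_fraction: float = 0.5) -> float:
    """Proportion correct in the upper group minus proportion correct in
    the lower group. Groups are formed from total test scores; the
    group_fraction of 0.5 is an assumption, not taken from the notes."""
    if len(item_responses) != len(total_scores):
        raise ValueError("one total score is needed per examinee")
    n = len(total_scores)
    k = max(1, int(n * group_fraction))
    # Examinee indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_responses[i] for i in upper) / k
    p_lower = sum(item_responses[i] for i in lower) / k
    return p_upper - p_lower


def classify_item(d: float, tolerance: float = 0.05) -> str:
    """Label an item by the sign of its discrimination value; the
    tolerance for 'approximately equal' is a hypothetical choice."""
    if d > tolerance:
        return "positively discriminating"
    if d < -tolerance:
        return "negatively discriminating"
    return "non-discriminating"


if __name__ == "__main__":
    item = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]      # 1 = passed the item
    totals = [9, 8, 8, 7, 7, 6, 4, 3, 2, 1]    # total test scores
    d = discrimination_index(item, totals)
    print(difficulty_index(item), d, classify_item(d))
```

Items labelled negatively discriminating or non-discriminating by such a check would be the ones dropped after item analysis.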
• According to Conrad (1951), there should be at least three preliminary administrations of the test:
1. The 1st administration is meant to detect any gross defects, ambiguities and omissions in the items and instructions. For this, the number of examinees should not be less than 100.
2. The 2nd administration provides data for item analysis, and for this the number of examinees should be around 400. The sample must be similar to those for whom the test is intended. It aims at obtaining three kinds of information regarding the items:
i. The difficulty value of the item
ii. The discrimination index of the item
iii. The effectiveness of the distractors
3. The 3rd preliminary administration is carried out to detect any minor defects that may not have been detected by the first two preliminary administrations. At this stage the items have been selected after item analysis and they constitute the test in its final form. The 3rd administration indicates how effective the test will really be when it is administered to the sample for which it is intended.
V. STANDARDIZATION
• Norms are established for the sake of standardization of any test. Standardization is the process whereby a test is administered to a representative sample of the population for whom the test is meant, for the sake of establishing norms. A standardized test is one that has normative data as well as clearly specified administration and scoring procedures.
VI. PREPARATION OF MANUAL AND REPRODUCTION OF THE TEST
The last step in test construction is the preparation of a manual of the test.
• In the manual the test constructor reports the psychometric properties of the test, norms and references. The manual thus gives a clear indication of the procedures of test administration, the scoring methods and the time limits, if any, of the test.
• It also includes instructions as well as the details of the arrangement of materials, i.e., whether the items have been arranged in random order or in any other order.
• In general, the test manual should yield information about the standardisation sample, reliability, validity and scoring, as well as practical considerations.
• After considering the importance of and requirement for the test, the test constructor finally orders the printing of the test and the manual.
B. RELIABILITY AND ITS TYPES
• Reliability is one of the important characteristics of any test. It refers to the precision or accuracy of the measurement; in other words, the scores should be consistent. When all other factors are held constant or somehow controlled, a reliable test is one that produces identical results for an examinee from one occasion to the other.
Test-Retest Reliability
• Test-retest reliability deals with two performances on the same test by the same persons on two different occasions with a reasonable time gap.
• The two administrations of the same test yield two independent sets of scores. The two sets, when correlated, give the value of the reliability coefficient.
• The test-retest coefficient is also known as the coefficient of stability. The test takers' scores on the first administration of the test are correlated with their scores on the second administration of the same test.
• A major advantage of this method is that there is maximum control over the test taker and test item variables.
Alternate-Form Reliability
• Also known as parallel-form reliability, equivalent-form reliability and comparable-forms reliability.
• Alternate-form reliability is computed in order to overcome the problems caused by using the same version of the test on two occasions.
• In this approach the test developer develops two alternate or parallel forms of the same test.
Ideally, the two forms should be independently developed and completely parallel or equivalent forms of the same measure. They should match each other in all respects. The two forms of the test are administered to the same sample, either immediately on the same day or with a time interval of usually a fortnight. If reliability is calculated from data collected when the two forms are administered immediately one after the other, it is called alternate-form (immediate) reliability; if it is calculated from data collected after a gap of a fortnight, it is called alternate-form (delayed) reliability.
Internal consistency reliability
• Also called split-half reliability.
• Indicates the homogeneity of the test.
• Alternate-form reliability is a popular type of reliability but has some obvious limitations: it is not easy and requires a lot of time and effort, and practice effects and prior experience may affect performance on the second occasion. The time gap between the two administrations is another intervening variable.
• To overcome these problems, split-half reliability is computed.
• The test is arbitrarily divided into two halves. Scores on one half are correlated with scores on the other, using Pearson's r. The obtained value is the 'coefficient of internal consistency'.
• In this approach, error caused by temporal variation or by different sets of items is controlled.
• The common method of splitting the test is the odd-even method.
Inter-scorer Reliability/Inter-rater Reliability
• At times the test may have to be evaluated by more than one examiner.
• In the case of objective tests, a test taker will obtain the same marks no matter who evaluates the test or how many people evaluate it. On the other hand, when essay-type or other open-ended items are to be evaluated, a need for multiple scorers may arise.
• When more than one person is evaluating, inter-scorer reliability is required.
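The coefficients described above all come down to correlating two sets of scores obtained from the same test takers. As a rough sketch of that idea, the Python below computes Pearson's r directly from its definition and applies it to a test-retest pair and to an odd-even split. The function names and the layout of the item matrix (one row per test taker, one column per item, 1 for a correct answer) are assumptions made for illustration, and the split-half value returned is simply the correlation between the two halves, as described in the notes.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two score sets."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score sets of equal length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for r to be defined")
    return sxy / sqrt(sxx * syy)


def retest_reliability(first: Sequence[float],
                       second: Sequence[float]) -> float:
    """Coefficient of stability: first administration vs. second."""
    return pearson_r(first, second)


def split_half_reliability(item_matrix: List[List[int]]) -> float:
    """Odd-even method: correlate each person's score on the
    odd-numbered items with the score on the even-numbered items."""
    odd_half = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


if __name__ == "__main__":
    # Hypothetical data: four test takers, four items each.
    responses = [
        [1, 1, 1, 1],
        [1, 0, 1, 0],
        [0, 0, 0, 0],
        [1, 1, 0, 0],
    ]
    print(split_half_reliability(responses))
    print(retest_reliability([12, 15, 9, 20], [13, 14, 10, 19]))
```

The same `pearson_r` helper could also be applied to two forms of a test for alternate-form reliability, since only the source of the two score sets changes.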
• Some common procedures include the following:
1. Two examiners evaluate and mark the tests of a group of test takers. The resulting two sets of scores are used for computing the correlation between the two sets.
2. Intra-class coefficient/coefficient of concordance: two or more examiners score the performance of a number of test takers and the coefficient is computed from these scores.
C. MEANING OF VALIDITY AND ITS TYPES
• The term validity means truth or fidelity. Thus, validity refers to the degree to which a test measures what it claims to measure.
• A test is valid to the extent that it serves the purpose for which it is to be used.
Types Of Validity
Content Validity
• Content validity is also known as curricular validity, intrinsic validity, relevance, circular validity and representativeness.
• Content validity is a non-statistical type of validity that is usually associated with achievement tests.
• When a test is constructed so that the content of its items measures what the whole test claims to measure, the test is said to have content validity. For example, suppose a professor wants to test the overall knowledge of his students in the subject of elementary statistics. The test will have content validity if it covers every topic of elementary statistics that he taught in the class.
Face validity
• Often confused with content validity.
• Face validity refers not to what the test actually measures but to what it appears to measure superficially.
• In other words, face validity is the mere appearance that the test has validity. Thus, face validity should not be taken in the technical sense, nor should it be regarded as a substitute for objectively determined validity. When a test item looks valid to the group of examinees because it provides a logical link with the objective, the test is said to have face validity.
Criterion-Related Validity
• One approach to the assessment of validity is to compare the test results with a criterion.
• Consider a situation where you are looking for a foot ruler to measure the length of a large sheet of paper. Someone brings you a stick and tells you that it is a foot long and that you can use it instead of a ruler. You doubt this and want to ensure that the stick is really a foot long. Now you look for another object that you are sure is a foot long, though it cannot itself be used for measuring another object. You see that the tiles in your kitchen are 6 x 6 inches square. You measure the stick against the tiles and see that the stick is equal to two tiles in length, i.e., 12 inches long. What you have done here is measure your tool with reference to a criterion, and the result of this exercise showed that your tool was valid. You can use it instead of a ruler.
• There are two types of criterion-related validity:
• Predictive validity: "the evidence that a test forecasts scores on the criterion at some future time". Predictive validity is also known as empirical validity or statistical validity. In predictive validity, a test is correlated against a criterion that will become available sometime in the future. The test scores are obtained and then a time gap of months or years is allowed to elapse, after which the criterion scores are obtained.
• Concurrent validity: concurrent validity evidence is defined as "evidence for criterion validity in which the test and the criterion are administered at the same point in time".
Construct Validity
• Construct validity is assessed keeping in view the particular construct that the test is supposed to measure.
• If the test is found to measure the construct in question, then it is a valid test; if it does not measure the trait, then the test is not accepted as valid.
• Construct validity is about ensuring that the method of measurement matches the construct you want to measure.
D.
NORMS AND ITS TYPES
• A norm represents a typical level of performance for a particular group.
• In a test, the norm is that score which has been obtained most often by the group.
Types Of Norms
Percentile Norm
• A percentile norm indicates, for each raw score, the percentage of the standardization sample that falls at or below that raw score.
• For example, suppose Mohan has a score of 26 on the mechanical reasoning test. If 40% of the standardization sample scores below 26, Mohan has a percentile rank of 40 (PR40), and 26 is the corresponding percentile score.
• Percentile norms provide a basis for interpreting an individual's score on a test in terms of his own standing in a particular standardization sample.
• If the percentile norm is to be meaningful, it should be based upon a sample which has been made homogeneous with respect to age, grade, sex, occupation and other factors.
Standard score norm
• A norm based upon a standard score.
• Here the units of the scale are equal, so that they convey the same meaning throughout the whole range of the scale.
• A standard score, like a percentile score, is a derived score. It has a specified or fixed mean and a fixed standard deviation.
• Different types of standard scores are the z-score, t-score, stanine score, etc.
• Standard scores are needed primarily for two reasons. Firstly, when the performance of the same person on different tests is to be compared, this is best done by converting the raw scores into standard scores. Secondly, standard scores have equal units of measurement, and their size does not vary from distribution to distribution. Hence, they are frequently used in interpreting test scores.
Age Equivalent Norms
• Age equivalent norms are defined as the average performance of a representative sample of a certain age level on the measure of a certain trait or ability.
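The percentile and standard score norms described above can be illustrated with a small Python sketch. It is offered only as an example under stated assumptions: the sample of raw scores is invented, the function names are hypothetical, the population standard deviation is used for the standardization sample, and the t-score uses the conventional scaling with a mean of 50 and a standard deviation of 10. Because the definition above counts scores "at or below" while the Mohan example counts scores "below", the `inclusive` flag lets either convention be chosen.

```python
from statistics import mean, pstdev
from typing import Sequence


def percentile_rank(score: float, sample: Sequence[float],
                    inclusive: bool = False) -> float:
    """Percentage of the standardization sample scoring below the given
    raw score (or at or below it when inclusive=True)."""
    if not sample:
        raise ValueError("sample must not be empty")
    if inclusive:
        count = sum(1 for x in sample if x <= score)
    else:
        count = sum(1 for x in sample if x < score)
    return 100.0 * count / len(sample)


def z_score(score: float, sample: Sequence[float]) -> float:
    """Raw score expressed in standard deviation units from the mean."""
    sd = pstdev(sample)  # population SD of the sample (an assumption)
    if sd == 0:
        raise ValueError("sample scores must vary")
    return (score - mean(sample)) / sd


def t_score(score: float, sample: Sequence[float]) -> float:
    """Standard score rescaled to a mean of 50 and an SD of 10."""
    return 50 + 10 * z_score(score, sample)


if __name__ == "__main__":
    # Hypothetical standardization sample of ten raw scores.
    norm_sample = [10, 15, 20, 22, 26, 28, 30, 32, 35, 40]
    print(percentile_rank(26, norm_sample))   # share scoring below 26
    print(z_score(26, norm_sample), t_score(26, norm_sample))
```

With this invented sample, four of the ten scores fall below 26, which mirrors the PR40 example for Mohan.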
• For example, if we measure the weight of a representative sample of 10-year-old girls of the state of Bihar and find the average of the obtained weights, we can determine the age norm for the weight of 10-year-old girls.
• Age norms are more suited to those traits or abilities which increase systematically with age. Most physical traits like weight and height, and cognitive abilities like intelligence, show such systematic change, so age norms can be appropriately used for these traits.
Grade-Equivalent Norms
• Grade equivalent norms are defined as the average performance of a representative sample of a certain grade or class.
• The test whose norms are being prepared is given to a representative sample selected from each of several grades or classes. After that, the average performance of each grade on the given test is determined.
• Grade equivalents represent the scores on educational achievement tests attained by children in a certain grade. These norms are obtained by calculating the mean raw scores of the children in the standardization sample representing each grade. If 6th grade children in the standardization sample obtained a mean score of 35 on an arithmetic test, then this raw score has a grade equivalent of 6. Hence a student obtaining 35 on the same test will be said to have a grade equivalent of 6.
• t SCORE: a standard score with a mean of 50 and a standard deviation of 10.

MODULE - 4 BASICS OF PSYCHOLOGICAL RESEARCH
A. MEANING AND CHARACTERISTICS OF SCIENTIFIC RESEARCH
• Scientific research is a systematic and objective attempt to provide answers to certain questions.
• Scientific research may be defined as the systematic and empirical recording and analysis of controlled observations, which may lead to the development of theories, concepts, generalizations and principles, resulting in the prediction and control of those activities that may have some cause-effect relationship.
• Kerlinger (1973) defined scientific research as “a systematic, controlled, empirical and critical investigation of hypothetical propositions about the presumed relations among natural phenomena.”
B. CHARACTERISTICS OF SCIENTIFIC RESEARCH:
1. It is directed towards the solution of a problem or towards some relation between the variables under study.
2.
3. It is always based upon empirical or observable evidence.
4. It involves precise observation, using valid and reliable instruments for the collection of data, and uses statistical methods for accurate description of the results obtained.
5. It emphasizes the development of theories, principles and generalizations regarding the variables studied and the populations.
6. It follows systematic, objective and logical procedures.
7. Research is marked by patience, courage and unhurried activity.
8. It requires full expertise and awareness of the relevant aspects of the problem, its history and literature, and sophisticated statistical methods for analysis.
9. The designs, procedures and results of scientific research must be replicable to ensure reliability and validity.
10. Research requires skill in writing and preparing a research report.
C. TYPES OF RESEARCH
1. Historical research: investigates, records, analyses and interprets the events of the past for the purpose of discovering sound generalizations that are helpful in understanding the past, the present and, to a limited extent, the anticipated future.
2. Descriptive/non-experimental/correlational research: describes, records, analyses and interprets existing conditions.
Here, an attempt is made to discover relationships that exist between non-manipulated variables, or some comparison or contrast among them.
3. Experimental research: one in which the primary focus is upon the relationship between variables. Here, certain variables are controlled or manipulated and their effect upon some other variables is examined.
1. HISTORICAL
• The application of the scientific method to the description and analysis of past events.
• It is supposed to be a factual, integrated account of the relationships between persons, times and events.
• Historical research becomes necessary not only for knowing the past but also for understanding the present and predicting the future in that context.
Steps in historical research:
1. Selection of the problem: not all problems are suitable for historical research. Only those problems that can be studied with the help of historical records and have some social utility can be picked up.
2. Formulation of hypothesis: hypotheses are explicitly formulated in historical research. The researcher gathers evidence and carefully evaluates the trustworthiness of the hypothesis.
3. Collection of data: historical data are collected from primary sources and secondary sources. Primary sources are eyewitness accounts by actual observers of or participants in the event. Secondary sources are accounts gathered after the event by talking to the observers and participants.
4. Analysis and generalization: historians employ principles of probability and rigorously subject the evidence to critical analysis.
2. DESCRIPTIVE RESEARCH V/S EXPERIMENTAL RESEARCH
• Descriptive research is research in which the independent variables cannot be manipulated and cannot be experimentally studied.
• The subjects are not randomly assigned to different treatment conditions.
• The responses of a group of subjects are measured on one variable and then compared with their measured responses on another variable.
• Such research is also called R/R research because changes in one set of responses are compared with possible changes in another set of responses.
• Field studies, ex post facto research, survey research, content analysis, case studies, etc. are examples.
• Experimental research is research in which the independent variables can be directly manipulated by the experimenter, and participants or subjects are randomly assigned to different treatment conditions.
• Experimental research is sometimes called S/R research because the researcher manipulates a stimulus in order to establish whether or not this produces a change in a certain response.
• Experiments are further of two types: laboratory experiments and field experiments.
3. BASIC VERSUS APPLIED RESEARCH
• Basic (fundamental or pure) research is the formal and systematic process in which the researcher's aim is to develop a theory or a model by identifying all the important variables in a situation.
• It tries to discover broad generalizations and principles about those variables.
• It utilizes a careful sample so that its conclusions can be generalized to immediate situations, but it has little concern with the actual application of its generalizations or principles.
• Applied research, as its name implies, applies the theory or model developed through basic research to the actual solution of problems.
• It has characteristics in common with fundamental research. It also tends to make generalizations and uses various sampling methods.
• The main purpose of applied research is not to develop theories about a fact but to test those theories in actual situations.
D. RESEARCH PROCESS
Even though research is a tedious, painful and slow-moving job, by following certain steps in conducting the research the work can be carried out smoothly and with the least difficulty.
1. IDENTIFYING THE PROBLEM
• The first step in conducting research is to identify the problem.
• The researcher must discover a suitable problem and define it operationally.
• A problem is defined as an interrogative, testable statement which shows a relationship between two or more variables in an unambiguous manner. For example, take the following statement: What is the relationship between academic ability and socioeconomic status?
Characteristics of a problem statement:
1. A problem statement is clear, specific, unambiguous and substantially relevant.
2. It expresses the relationship between two or more variables.
3. A problem statement is testable with empirical methods.
4. It must avoid moral or ethical judgments that are difficult to study.
5. It can be general or specific, with sufficient importance.
Ways in which problems manifest:
1. When there is a clear gap in the results of several investigations in the same field.
2. When the results of several studies disagree with each other.
3. When the facts of a field are based on unexplained information.
Types of problems
Research problems can be broadly categorized into two types:
a. Solvable problem
A solvable problem raises questions which can be answered with the use of our normal capacities. It has two features:
• It can be solved by empirical methods that use observation of natural events.
• It is possible to advance a suitable hypothesis as a tentative solution to it.
b. Unsolvable problem
It is one that raises questions which cannot be answered using our normal capacities. Such problems usually pertain to supernatural phenomena. It has three characteristics, as follows:
• An unsolvable problem is unstructured, as the researcher's interest is unclear and unorganized.
• Its terms and operational definitions are inadequately defined and its variables are vague.
• It is difficult to collect relevant data regarding the problem.
2.
HYPOTHESIS  When the problem has been stated, a tentative solution in the form of a testable proposition is offered by the researcher which is called a hypothesis.  According to McGuigan (1990) hypothesis is “a testable statement, of potential relationship between two (or more) variables, that is advanced as potential solution to the problem.”  On the basis of given definition, two points can be suggested about hypo thesis. First, a hypothesis is a testable statement where measurable variables are used. Second, a hypothesis exhibits either a general or specific relationship between variables. After testing the hypothesis the researcher may find it to be correct or incorrect.  A good hypothesis meets the following criteria or characteristics; 1. The hypothesis should be conceptually clear 2. The hypothesis should be testable 3. The hypothesis should be economical and parsimonious 4. The hypothesis must be related to the existing body of theory and facts 5. The hypothesis should have logical unity and comprehensiveness 6. The hypothesis should be general in scope 7. The hypothesis should be related to available scientific tools and techniques 8. The hypothesis should be in accord with other hypotheses in the field Types of hypothesis  Alternative Hypothesis (research or experimental hypothesis) - The alternative hypothesis states that there is a relationship between the two variables being studied (one variable has an effect on the other). It states that the results are not due to chance and that they are significant in terms of supporting the theory being investigated. PsyQuesta Learn Psychology with Afa  Null Hypothesis - The null hypothesis states that there is no relationship between the two variables being studied (one variable does not affect the other). It states results are due to chance and are not significant in terms of supporting the idea being investigated. 
• Directional Hypothesis - A one-tailed directional hypothesis predicts the direction of the effect of the independent variable on the dependent variable. E.g., adults will correctly recall more words than children.
• Non-directional Hypothesis - A two-tailed non-directional hypothesis predicts that the independent variable will have an effect on the dependent variable, but the direction of the effect is not specified. E.g., there will be a difference in how many numbers are correctly recalled by children and adults.
• Simple hypothesis - contains only one or two variables.
• Complex hypothesis - contains more than two variables and requires complex statistical calculations.

3. VARIABLES
• Variables may be defined as those attributes of objects, events, things and beings which can be measured.
• Variables are the characteristics or conditions that are manipulated, controlled or observed by the experimenter. Intelligence, anxiety, aptitude, income, education and authoritarianism are examples.
• Variables can be classified in several ways. Some of the commonly accepted classifications are given below.
a) Dependent variables and independent variables
• The dependent variable is defined as the one about which the experimenter makes a prediction.
• The independent variable is defined as the one which is manipulated, measured and selected by the researcher to produce changes in the measured behaviour (the dependent variable).
• The dependent variable is the characteristic or condition that changes as the experimenter manipulates the independent variables. The independent variable is the characteristic or condition which is manipulated or selected by the experimenter to find its relationship to some observable phenomenon.
• For example, if a researcher wants to find the effect of teaching methods on academic achievement, teaching methods constitute the independent variable and academic achievement constitutes the dependent variable.
• Extraneous variables are the uncontrolled variables that may affect the dependent variable. The experimenter is not interested in finding the effect of the extraneous variables and thus has to try to control them as far as possible. They are also referred to as confounding variables or relevant variables.

4. FORMULATING RESEARCH DESIGN
• A research design is the blueprint of the detailed procedures for testing the hypothesis and analyzing the obtained data.
• It is the sequence of steps taken ahead of time to ensure that the relevant data will be collected in a way that permits objective analysis of the different hypotheses.
• The selection of any research design is based upon the purpose of the investigation, the types of variables and the conditions in which the research has to take place.
• Basically, a research design serves two functions.
i. It answers the research question as objectively, validly and economically as possible.
ii. It acts as a control mechanism. This enables the researcher to control the variances involved in the study, namely experimental variance, extraneous variance and error variance.

Types of research design
I. Within-groups design (repeated treatment/measures)
If the researcher uses only one group of subjects, who are tested under different values of the independent variable, the resulting experimental design is called a within-groups design. Such designs are of two sub-types.
a. Complete within-groups design - practice effects are balanced by administering the conditions several times to each subject, in a different order each time, to obtain interpretable results.
b. Incomplete within-groups design - each condition is administered to each subject only once, while the order of administration is varied across subjects in such a way that practice effects are neutralized.
II. Between-groups design (between subjects/independent measures)
If the researcher decides to use a separate group for each value of the independent variable, the resulting design is referred to as a between-groups design.
a. Randomized groups design - subjects are randomly assigned to the different groups meant for the different conditions or values of the independent variable.
b. Matched groups design - subjects are matched on the basis of mean, standard deviation, pairs etc.
c. Factorial design - two or more independent variables are studied in all possible combinations (each combination having a separate group) in order to study their independent and interactive effects on the dependent variable.

5. REVIEWING LITERATURE
• The collective body of work done by earlier scientists is technically called the literature. Any scientific investigation starts with a review of the literature.
• The main objectives of a literature review are given below.
1. Identifying variables relevant for research: when the researcher makes a careful review of the literature, he becomes aware of the important and unimportant variables in the concerned area of research.
2. Avoidance of repetition: the review helps the researcher avoid duplicating work done earlier by someone else. Prior studies have to be the foundation of the current study, and in some cases replication of a study may also be needed.
3. Synthesis of prior works: the review enables the researcher to collect and synthesize the findings of previous studies related to the problem. It gives a wider perspective to future research.
4. Determining meaning and relationship among variables: a careful review of the literature enables the researcher to discover significant variables and identify important relations among variables.

Sources of review
• Journals and Books - Different research journals and books relevant to the areas of interest are the primary sources of literature review.
Libraries have book and periodical sections with journals. Refereed journals carry only those articles which are carefully reviewed by experts before publication.
• Reviews - Reviews are short articles that give brief information regarding the work done in a particular area over a period of time. Reviewers select research articles of their interest, organize them content-wise, criticize their findings and offer their own suggestions and conclusions. They can often guide you to serious books and journals.
• Abstracts - In an abstract, researchers can get all the relevant information regarding the paper and the problem discussed. The only limitation of abstracts is that they do not satisfy a researcher who desires detailed information regarding the methodology and results.
• Internet - Today, the internet is a very easy and quick source of literature review. It provides an easy way to find the original articles related to the problem. Plenty of sites that discuss issues relevant to the problem can give enough information and resources.
• Doctoral dissertations - Dissertations by those who studied similar issues can be found at many institutes and libraries. The researcher can choose topics of interest and find useful information and references there. Most dissertations include chapters such as an introduction, review of the literature, purpose of the study, method of the study, data collected, analysis and interpretation, summary and conclusion.
• Supervisors/research professors - Those who have been working with researchers for a long time often know the literature well and are able to guide in the right direction. Therefore, it is advised that a researcher consult supervisors, research professors and assistants in the process.

6. SAMPLING
• A population may be defined as the identifiable and well specified group of individuals or objects the researcher wants to study.
A population can be finite or infinite, like the students of a class and the fishes in a river, respectively.
• A measure based upon the entire population is called a parameter.
• A sample is any number of units selected to represent the population according to some rule or plan. A measure based upon a sample is known as a statistic.
• Most sampling methods can be categorized into two types:

1. Probability sampling methods
These methods specify the probability or likelihood of inclusion of each element or individual in the sample. That means:
i. the size of the parent population from which the sample has to be taken is known to the researcher.
ii. each element of the population must have an equal chance of being included in a subsequent sample.
iii. the desired sample must be clearly specified.
The major probability sampling methods are the following:
a. Simple random sampling - every member of the population has an equal chance of being included in the sample, i.e., sample units are selected at random.
b. Systematic sampling - every i-th item from the list is selected for the sample.
c. Stratified random sampling - used in cases where the data is heterogeneous. The population is divided into different strata with specific characteristics.
d. Area or cluster sampling - the total population is divided into different clusters/groups and sample units are randomly drawn from each of these clusters.

2. Non-probability sampling methods
With these methods there is no way of assessing the probability of the elements of the population being included in the sample. Important techniques of non-probability sampling are:
a. Quota sampling - the sample size is fixed first and then a quota is fixed for various categories of the population.
b. Accidental sampling - the investigator selects samples on the basis of convenience and economy, not on the basis of specific traits.
c. Judgemental/purposive sampling - sampling units are selected on the basis of the researcher's belief that they represent the population well.
d. Snowball sampling - initially selected respondents provide addresses of additional respondents.
e. Saturation sampling - drawing all elements or individuals having the characteristics of interest to the investigator is called saturation sampling.

FUNDAMENTALS OF SAMPLING
1. Universe or population - a population is a group of objects, animate or inanimate, finite or infinite, under study.
2. Sampling frame - the list of all items in a population.
3. Sampling design - a plan for obtaining a sample from the sampling frame. It refers to the procedure the researcher would adopt in selecting the sample.
4. Statistic and parameter - a statistic is a numerical value based upon the sample, whereas a parameter is a numerical value based upon the population.
5. Sampling error - sampling error is the difference between the parameter and the statistic, i.e., "a sampling error is a difference between parameter and estimate of that parameter which is derived from sample".
6. Confidence level and significance level - the confidence level is the likelihood that the actual value will fall within the stated precision limits. The significance level, on the other hand, indicates the likelihood that the answer will fall outside the precision range. If the confidence level is 95%, then the corresponding significance level is 100 - 95 = 5%, or 0.05.

E. DATA COLLECTION TECHNIQUES
1. QUESTIONNAIRE AND SCHEDULE
• Questionnaires are used when factual information from the respondents is desired.
• A questionnaire consists of a form containing a series of questions, where the respondents themselves fill in the answers.
• It enables researchers to get first-hand information regarding the vagueness of items and to establish a warm relationship with the persons being tested.
• A schedule consists of a form containing a series of questions, which are asked and filled in by the investigator in a face-to-face situation.
• An opinionnaire is an information form which attempts to measure the attitude or belief of an individual. Hence, an opinionnaire is also called an attitude scale.

Types of Questionnaires
1. Fixed response (closed-ended) questionnaire
• Consists of statements or questions with a fixed number of options or choices. Example: Do you feel shy talking to members of the opposite sex? (Yes/No)
• Data collection and statistical analysis are made easier with this type of questionnaire.
• It also saves time, money and energy.
• Disadvantages: the researcher may be unable to provide enough relevant response options, and these questionnaires can encourage a respondent to adopt some kind of response set or bias.
2. Open-ended questionnaire
• Consists of questions that require short or lengthy answers by the respondents.
• Particularly useful in situations where researchers have little information about the subjects to be studied.
• Elicits unanticipated and insightful responses that can increase the researchers' understanding.
• Disadvantages: biases may occur, since educational or socioeconomic background may affect the way different respondents understand and respond to items. Open-ended questionnaires are also time consuming and may reduce participation, and scoring and statistical analysis are more difficult and can be misleading. Example: What are the causes of student unrest on the campus?

Characteristics of a good questionnaire:
1. The questionnaire must be concerned with specific and relevant topics
2. The significance, objectives and aims of the questionnaire are clearly stated
3. It is short and avoids burdening the respondent
4. It uses simple and clear language for directions and questions
5. Questions must be objective, without hinting at any possible answer
6. Embarrassing questions, presuming questions and hypothetical questions must be avoided
7. The items must be placed in a good and systematic order to elicit valid responses
8. It must be clearly arranged and printed attractively and neatly

2. INTERVIEW
• An interview is a face-to-face interaction between interviewer and interviewee which is intended to elicit some desired information from the latter. Like the questionnaire, the interview aims to obtain data regarding the respondents with minimum bias and maximum efficiency. The success of an interview depends on the characteristics of the interviewer, the questions and the respondent.

Types of interviews
1. Structured or formal interview - already prepared questions are asked in a set order by the interviewer and answers are recorded in a standardized form. Since the questions and the marking of responses are structured, a relatively less trained interviewer can conduct the interview smoothly. The systematic procedure also makes structured interviews more valid than informal interviews. At the same time, they are time consuming and expensive, and less valid than other methods such as bio data analysis and standardized psychological tests.
2. Unstructured or informal interview - there are no predetermined questions nor any preset order of questions; it is left to the interviewer to ask questions in the way he likes regarding a number of points the interview is supposed to build upon. An advantage of the informal interview is that the interviewer gets to probe more to gain a deeper understanding of the respondent's behaviour. Disadvantages include the dependency on the interviewer, which reduces reliability, the need for a skilled and trained interviewer, and the difficulty of quantifying and statistically analysing the data collected.

Advantages of Interviews
1. Allows greater flexibility in the process of questioning
2. Facilitates obtaining desired information readily and quickly
3. Ensures that the interviewees interpret and answer the questions themselves
4. Validity of the verbal responses can be checked non-verbally
5. Desired level of control can be exercised over the situations and contexts

Disadvantages of Interviews
1. Interviewers' variability
2. Inter-interviewer variability
3. Doubtful validity and dependability of verbal responses
4. Time consuming
5. Variations inherent to the interviewing context
6. No foolproof system for recording responses

Important sources of error in interviews
1. Attitude of the interviewer
2. Incomprehensibility of the questions asked
3. Lack of warmth in the interview situation
4. Lack of motivation of respondents
5. Duration of the interview

3. CONTENT ANALYSIS
• Content analysis, or document analysis, is a method used to analyse qualitative data.
• It is a technique that allows a researcher to take qualitative data and transform it into quantitative (numerical) data.
• The analyst takes the communications or documents prepared by the respondents/subjects and systematically finds the frequency or proportion with which the units of interest appear in them.
• The technique can be used for data in many different formats, for example interview transcripts, letters, diaries, journals, stories, reports, academic writings, autobiographies, film and audio recordings. It can also be used for analyzing responses to projective tests.

Methods of content analysis
The most widely used method for content analysis was proposed by Berelson (1954) and has three steps.
1. Specification of the universe - what are we looking for (variables)
2. Identifying the units of analysis - the five major units are words, themes, items, characters and space-and-time measures
3. Quantification - assigning numerals to the objects of analysis

4. OBSERVATION
• Observation refers to watching and listening to the behaviour of other persons over time without manipulating or controlling it, and recording the findings in a way that allows some degree of analytical interpretation and discussion.
• Observation broadly involves selecting, recording and analysing behaviour for the empirical aims of description or development of theory.
• Observation usually occurs in natural settings, although it can also take place in contrived settings such as laboratory experiments and simulations. It captures significant events or occurrences affecting relations among the observed. It identifies regularities and recurrences in social life through comparing and contrasting other findings and theories.

Types of observation
1. Systematic observation - done according to explicit procedures as well as in accordance with the logic of scientific inference.
2. Unsystematic observation - casual observation made by the investigator without specifying any explicit and objective inference.
3. Participant observation - the investigator actively participates in the activities of the group to be observed. Also called disguised participant observation. Observation occurs in a natural setting, which makes it possible to record behaviour in a realistic manner. Observations of longer duration can yield broad and meaningful information for understanding human behaviour. It is usually unstructured and fails to be precise about data collection, making statistical analysis difficult. It is time consuming and effortful for observers. Since the participation is active, the observer sometimes develops emotions towards the observed, which biases the data collected.
4. Non-participant observation - the investigator observes the behaviour of subjects in their natural setting without any participation. It is usually structured: the observer plans for the nature of the setting to be observed, the representativeness of the data and the problems associated with the presence of the observer. The obtained data is more reliable and representative. The observer can plan the aspects and process of observation well.
The observer is also able to concentrate upon specific aspects of social behaviour and gets better chances to find data or a solution regarding the probe. Limitations of this type of observation are (1) that the observed group and its setting do not remain natural with the observer around and (2) that it captures less of the natural context of social settings than participant observation.
5. Naturalistic observation - this technique involves studying the spontaneous behaviour of participants in natural surroundings. The researcher simply records what they see in whatever way they can.
6. Controlled observation - the researcher decides where the observation will take place, at what time, with which participants and in what circumstances, and uses a standardized procedure. Participants are randomly allocated to each independent variable group.
7. Subjective observation - in subjective observation, which is also called introspection, a person observes his own mental activities. Introspection literally means "looking within". It means getting insight into one's own mental activities. In introspection, a person perceives, analyses and gives a report of his own feelings and experiences.

5. RATING SCALE
• A rating scale is a technique to assess both actual behaviour and remembered behaviour.
• It is defined as a technique through which the observer or rater categorizes objects, events or persons on a continuum, represented by a series of continuous numerals.
• The purpose of a rating scale is to know what kind of impressions the objects or persons have made upon the raters. A rating scale usually has two, three, five, seven, nine or eleven points on a line with descriptive categories.
• Rating scales are divided into a few categories.
1. Numerical scale - the observer is supplied with a sequence of numbers whose meanings are well defined.
2. Graphic scale - the scale is presented graphically, with descriptive cues given for the different scale points.
3. Percentage rating - requires the rater to place the ratees among different specified percentage groups.
4. Standard scale - the rater is presented with some standards with pre-established scale values.
5. Scale of cumulated points - based on cumulated or summated points. Here the person's total score is the sum of the individual ratings assigned to all items on the scale.
6. Forced choice rating scale - the rater is given a set of attributes in terms of verbal statements for a single item and decides which one represents the rated object appropriately.

F. CARRYING OUT STATISTICAL ANALYSIS
• It is customary to choose the appropriate statistical tests on the basis of the obtained data.
• If the data fulfil the requirements of parametric assumptions, any of the parametric tests which suit the purpose can be selected. If the data do not fulfil the parametric requirements, any of the non-parametric tests can be used.
• Other things to be kept in mind in the selection of appropriate statistical tests are the number of independent and dependent variables and the nature of the variables, that is, whether they are nominal, ordinal, interval or ratio.
• When both independent and dependent variables are interval measures and there is more than one of them, multiple correlation is the most appropriate statistic. On the other hand, when they are interval measures and there is only one of each, Pearson r may be used.
• With nominal and ordinal measures, non-parametric statistics are the common choice.
• Sometimes researchers transform the measures so that an appropriate statistical test may be applied without much loss of information. For example, if scores of two groups on interval measures are available but the data do not fulfil the requirements for the t test, the researcher can transform the interval measures to ordinal measures and subsequently apply the Mann-Whitney test.

DESCRIPTIVE AND INFERENTIAL STATISTICS
• Descriptive statistics describe a sample. You take a group that you are interested in, record data about the group members, and then use summary statistics and graphs to present the group properties.
• With descriptive statistics there is no uncertainty, because you are describing only the people or items that you actually measure. You are not trying to infer properties of a larger population.
• Inferential statistics take data from a sample and make inferences about the larger population from which the sample was drawn. Because the goal of inferential statistics is to draw conclusions from a sample and generalize them to a population, we need to have confidence that our sample accurately reflects the population.
• Statistics which enable us to make inferences or generalizations about a population with a known probability of error are called statistical inferences. Any statistical analysis that deals with the entire population, or that only describes the sample, is descriptive statistics. On the other hand, if we deal with a sample and our analysis relates not only to the characteristics of the sample but also provides information about the population, such statistics are known as inferential statistics, inductive statistics or sampling statistics.

DRAWING CONCLUSIONS
• Once a researcher has designed the study and collected the data, statistics can be used to summarize the data, analyze the results and draw conclusions based on this evidence.
• Not only can statistical analysis support (or refute) the researcher's hypothesis; it can also be used to determine if the findings are statistically significant.
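The transformation of interval scores into ranks followed by the Mann-Whitney test, and the check for statistical significance, can be sketched in Python. This is only a minimal illustration, not a standard implementation: the function names (average_ranks, mann_whitney_u), the sample scores and the 0.05 significance level are assumptions made for the example, and the p-value uses the large-sample normal approximation without tie or continuity correction, which is rough for small groups.

```python
import math


def average_ranks(values):
    """Convert interval scores to ranks (1 = smallest); tied scores share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-tailed p) for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a
    u = min(u_a, u_b)
    # Normal approximation (assumption: no tie or continuity correction).
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return u, p_two_tailed


# Hypothetical scores for two groups, invented for illustration only.
scores_a = [12, 15, 11, 18, 14]
scores_b = [22, 25, 19, 24, 21]
alpha = 0.05  # significance level corresponding to a 95% confidence level

u_value, p_value = mann_whitney_u(scores_a, scores_b)
print("U =", u_value, "approximate p =", round(p_value, 4))
print("statistically significant" if p_value < alpha else "not statistically significant")
```

In practice a statistical package or a table of critical U values would normally be used, especially for small samples; the sketch only shows where the ranking step and the comparison with the significance level fit in.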
When results are said to be statistically significant, it means that it is unlikely that these results are due to chance.
• Based on these observations, researchers must then determine what the results mean. In some cases an experiment will support a hypothesis, but in other cases it will fail to support the hypothesis.
• Just because the findings fail to support the hypothesis does not mean that the research is not useful or informative. In fact, such research plays an important role in helping scientists develop new questions and hypotheses to explore in the future.
• After conclusions have been drawn, the next step is to share the results with the rest of the scientific community. This is an important part of the process because it contributes to the overall knowledge base and can help other scientists find new research avenues to explore.

G. STRUCTURE OF A RESEARCH REPORT
• The writing of the research report requires imagination, creativity and resourcefulness. Research reports should be written in a dignified and objective style, although there is no one style which is acceptable to all.
• Research reports aim at telling the readers the problems investigated, the methods adopted, the results found and the conclusions reached.
• The research paper should be written in clear and unambiguous language so that the reader can also objectively judge the adequacy and the validity of the research.
• To attain objectivity, personal pronouns such as I, you, we, my, our etc. should be avoided and, as their substitutes, expressions like 'investigator' and 'researcher' should be used.
• The research report, whether it is based on a dissertation or a short-term research paper, should follow a fairly standardized pattern.
• Such standard conventions have been neatly outlined by the Publication Manual of the American Psychological Association, and have been universally accepted.
• The publication manual has suggestions as to how to write a report effectively, how to present ideas with concise expression, and how to avoid ambiguity and increase the readability of the report.

The following outline or format is the typical research form prepared according to the American Psychological Association's Publication Manual (1983):
I. Title page
a. Title
b. Author's name and affiliation
c. Running head
d. Acknowledgements
II. Abstract
III. Introduction (no heading)
a. Statement of the problem
b. Background/review of literature
c. Purpose and rationale/hypothesis
IV. Method
a. Subjects
b. Apparatus (if any)
c. Design
d. Procedure
V. Results
a. Tables and figures
b. Statistical presentation
VI. Discussion
a. Support or non-support of hypothesis
b. Practical and theoretical implications
c. Conclusions
VII. References
VIII. Appendix (if appropriate)

APA style of reference writing
• References comprise all documents, including journals, books, technical reports, computer programs and unpublished works, mentioned in the text of the report.
• References are arranged in alphabetical order by the last name of the author(s), with the year of publication in parentheses ( ).
• In case of unpublished citations, only the reference is cited.
• Sometimes no author is listed; in that case, the first word of the title or the sponsoring organization is used to begin the entry.
• When more than one name is cited within parentheses, the references are separated by semicolons. In parentheses, the page number is given only for direct quotations. The researcher should check carefully that all references cited in the text appear in the reference list and vice versa.
• References should not be confused with a bibliography.
• A bibliography contains everything that is included in the reference section plus other publications which are useful but were not cited in the text or manuscript.
• A bibliography is not generally included in research reports. Only references are usually included.

The Publication Manual of the American Psychological Association (1983) has given some specific guidelines for writing references of various types, as indicated below.
1. For reference of books with a single author:
Siegel, S (1956) Non-parametric Statistics for the Behavioural Sciences, New York: McGraw Hill.
2. For reference of books with multiple authors:
Guilford, JP & Fruchter, B (1978) Fundamental Statistics in Psychology and Education, New York: McGraw Hill.
3. For reference of a corporate body or association as author:
American Psychological Association (1983) Publication Manual (3rd ed.) Washington.
4. For reference of a journal article:
Edwards, AL & Kenny, KC (1946) A Comparison of Thurstone and Likert Techniques of Attitude Scale Construction, Journal of Applied Psychology, 30, 70-73.
5. For reference of a thesis or dissertation:
Singh, AK (1978) Construction and Standardization of a Verbal Intelligence Scale. Unpublished Doctoral Dissertation, Patna University, Patna.
6. For reference of a chapter in an edited book:
Atkinson, RC & Shiffrin, RM (1968) Human Memory: A proposed system and its control processes. In The Psychology of Learning and Motivation, ed. KW Spence and JT Spence, vol. 2, pp 89-195. New York: Academic Press.
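The pattern of the two book examples above, together with the rule that references are arranged alphabetically by the author's last name, can be sketched in Python. This is a hedged illustration only: the function names (format_book_reference, sort_references) and the dictionary layout are assumptions made for the example, and the output mirrors the pattern used in these notes rather than any particular edition of the APA manual.

```python
def format_book_reference(authors, year, title, place, publisher):
    """Build a book entry in the pattern 'Author (Year) Title, Place: Publisher.'"""
    author_text = " & ".join(authors)  # multiple authors are joined with '&'
    return f"{author_text} ({year}) {title}, {place}: {publisher}."


def sort_references(entries):
    """Order entries alphabetically by the first author's last name, then by year."""
    return sorted(entries, key=lambda e: (e["authors"][0].lower(), e["year"]))


# The two book examples from the notes, stored as hypothetical records.
books = [
    {"authors": ["Siegel, S"], "year": 1956,
     "title": "Non-parametric Statistics for the Behavioural Sciences",
     "place": "New York", "publisher": "McGraw Hill"},
    {"authors": ["Guilford, JP", "Fruchter, B"], "year": 1978,
     "title": "Fundamental Statistics in Psychology and Education",
     "place": "New York", "publisher": "McGraw Hill"},
]

reference_list = [format_book_reference(**book) for book in sort_references(books)]
for line in reference_list:
    print(line)
```

Other reference types (journal articles, dissertations, chapters in edited books) carry different fields and would need their own formatting functions.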
