Tutorial 1: Elementary Statistics

1. Define each of the following terms:

i. Census
An attempt to measure every item in the population of interest.

ii. Sampling
The process of selecting a sample from a population.

iii. Pilot Survey
A small-scale preliminary survey in which the researcher sends the questionnaire to a much smaller sample than the real target audience, in order to test it before the main survey.

2. Discuss the differences between:

i. Inferential Statistics and Descriptive Statistics
Descriptive statistics includes the collection, organization, summarization, and presentation of data. Inferential statistics includes making inferences from samples to populations, estimation and hypothesis testing, determining relationships, and making predictions. Inferential statistics is based on probability theory.

ii. Statistic and Parameter
A statistic is a characteristic or measure obtained by using the data values from a sample. A parameter is a characteristic or measure obtained by using all the data values from a specific population.

iii. Qualitative data and Quantitative data
Qualitative variables are variables that can be placed into distinct categories according to some characteristic or attribute. For example, if subjects are classified according to gender (male or female), then the variable gender is qualitative. Other examples of qualitative variables are religious preference and geographic location.
Quantitative variables are numerical and can be ordered or ranked. For example, the variable age is numerical, and people can be ranked in order according to their ages. Other examples of quantitative variables are heights, weights, and body temperatures. Quantitative variables can be further classified into two groups: discrete and continuous.

iv. Discrete and Continuous data
Discrete variables can be assigned values such as 0, 1, 2, 3 and are said to be countable.
Examples of discrete variables are the number of children in a family, the number of students in a classroom, and the number of calls received by a switchboard operator each day for a month. Continuous variables, by comparison, can assume an infinite number of values in an interval between any two specific values. Temperature, for example, is a continuous variable, since it can assume an infinite number of values between any two given temperatures.

v. Primary data and Secondary data
Primary data, also known as first-hand or raw data, are obtained when the researcher personally conducts the research and collects information directly from respondents. Primary data tend to be more accurate, reliable, and timely. When the necessary information is not available from other sources (secondary data), primary data become essential. Primary sources also often describe the methodology used and the limitations of the facts and figures gathered. However, primary data collection is costly, time-consuming, and needs substantial manpower.
Secondary data are taken from existing sources such as newspapers, economic reports, annual company reports, and statistical abstracts, and can also be valuable for research. Secondary data require less time, cost, and effort, and are usually readily available. Nevertheless, secondary data may contain printing or transcription errors introduced when they were copied from primary sources. Users of secondary data may also not know the conditions under which the data were collected and summarized, so their relevance must be considered carefully. Moreover, secondary data may not always fulfil the specific objectives of a research study.

vi. Random sample and non-random sample
Random samples are selected by using chance methods or random numbers. One such method is to number each subject in the population.
Then place numbered cards in a bowl, mix them thoroughly, and select as many cards as needed. The subjects whose numbers are selected constitute the sample.
Non-random sampling is a sampling technique in which the sample is selected on the basis of factors other than chance, such as the convenience, experience, or judgment of the researcher. As a result, non-random samples are prone to selection bias.

vii. Census and sample survey
A census collects data from every member of a population, while a sample survey collects data from a subset of individuals selected to represent the whole population. Because a census covers every member of the population, its results are free of sampling error, although non-sampling errors can still occur. A census is generally more time-consuming and expensive than a sample survey, as it requires reaching every individual in the population. A sample survey is more efficient and can represent the population well with far fewer observations. A census is usually taken for a country as a whole, whereas sample surveys are used for studies and research and can be carried out for a specific region or group of people.

viii. Sampling error and non-sampling error
Sampling error is the difference between a sample measure and the corresponding population measure, which arises because the sample is not a perfect representation of the population. Non-sampling error is an error that arises in the way the data are collected, recorded, or processed, and it causes the data to differ from the true values. Unlike sampling error, it does not result from observing only part of the population, so it can occur even in a census.

3. Give three steps that should be considered when designing a questionnaire.
Specify the information to be collected.
Keep the questionnaire as short as possible.
Keep the questions simple and phrase them so that they have the same meaning for all respondents.
4. Primary data are usually preferred to secondary data because primary data are more meaningful and reliable than secondary data, but why do we sometimes use secondary data?
Secondary data are sometimes used because they require less time, cost, and effort, and they are usually readily available.

5. State the sampling techniques used in each of the following cases:

i. An auditor auditing the financial standing of a department by selecting every fifth file in the cabinet.
Systematic sampling

ii. Students of a particular school are grouped according to their race. Then a random sample from each race is selected.
Stratified random sampling

iii. An interviewer selects anybody who passes his house to gather their views about the privatization of institutes of higher learning.
Convenience sampling

iv. A sociologist is interested in the mean income of households in an area of flat houses. There are 12 blocks of flats in the area, from which he chooses 3 blocks at random. All the households in the selected blocks are surveyed.
Cluster sampling

7. A study was conducted on the effectiveness of the 'A computer for every home' campaign. The study focused on the households of housing estates in a town. Five housing estates were randomly selected from a total of 60 housing estates. Every household in the selected housing estates was studied on the effectiveness of the campaign.

a. Was the data collected by the researchers primary or secondary data? State the reasons.
The data are primary data, because they were collected directly from the source for a specific research purpose.

b. State the population for the above study.
All households in the 60 housing estates in the town.

c. State the sampling frame of the study.
A list of the 60 housing estates.

d. What is the sampling technique used to select the households for the above research?
The sampling technique used in the described research is a form of cluster sampling.
The researchers first divided the population into clusters (in this case, housing estates) and then randomly selected a subset of those clusters for their study. Once the clusters were chosen, the researchers surveyed every household within the selected clusters.

e. Explain briefly how 5 housing estates can be selected from 60 housing estates if simple random sampling is employed.
First, give each estate a unique number from 1 to 60. Then use a random number generator (or a random number table) to choose five distinct numbers. The estates with those numbers form the sample. This method gives every estate an equal chance of being included in the study.

f. Explain briefly how 5 housing estates can be selected from 60 housing estates using the systematic method.
First, list the housing estates in sequence from 1 to 60. The sampling interval is 60/5 = 12. Choose a random starting point among the first 12 estates, and then select every 12th estate after it. For example, a random start of 7 gives estates 7, 19, 31, 43, and 55. This approach spreads the selection evenly across the list and is a structured and efficient way to form the sample.

g. Suggest the most appropriate data collection method for the above study and state 2
Questionnaire survey.
It allows researchers to collect data from a larger sample within a given budget and time limit. With a standardized set of questions, the questionnaire can be distributed to many households, giving a broader overview of the campaign's impact across the selected estates.
Respondents may feel more comfortable providing honest feedback on a self-administered questionnaire.
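The two selection procedures described in parts (e) and (f) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the seed values are assumptions made for the example, and the estates are simply identified by the numbers 1 to 60.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct estate numbers, each equally likely."""
    rng = random.Random(seed)  # seed is optional; used here only for repeatability
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start within the first interval, then every k-th estate."""
    rng = random.Random(seed)
    interval = population_size // sample_size  # k = 60 // 5 = 12
    start = rng.randint(1, interval)           # random start between 1 and k
    return [start + i * interval for i in range(sample_size)]


# Estates are numbered 1..60 (hypothetical identifiers for illustration).
srs_estates = simple_random_sample(60, 5, seed=1)
systematic_estates = systematic_sample(60, 5, seed=1)

print("Simple random sample:", srs_estates)
print("Systematic sample:   ", systematic_estates)
```

In the systematic version the only random element is the starting point; once it is fixed, the remaining four estates are determined by the interval of 12.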
a. State the population of this survey.
All vehicle repair workshops in Bandar J.

b. State the required sampling frame.
A list of all vehicle repair workshops in Bandar J.

c. Give an example for each of these situations (the examples should relate
(i) a question that will derive quantitative data:
What is the cost of the repair?
(ii) a question that will derive qualitative data:
What type of cost is involved? What are the common suggestions? What type of service is offered? What is the type or colour of the vehicle?

d. Which is the most appropriate sampling technique that may be used by the researchers? Give one reason for your answer.
Stratified random sampling. The workshops can be grouped by the type of vehicle they repair, and sampling from every group gives a sample that represents the whole population being studied.

e. Based on your answer in (d), explain in detail how the sample may be selected.
Allocate the sample of 80 to the strata in proportion to their sizes (total 190):
(35/190) * 80 = 15
M: (50/190) * 80 = 21
C: (85/190) * 80 = 36
H: (20/190) * 80 = 8
Each result is rounded to the nearest whole number, and the four allocations add up to 80. Then use simple random sampling within each stratum to select the allocated number of workshops.
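The proportional allocation above can be checked with a short Python sketch. The stratum sizes (35, 50, 85, 20) and the sample size of 80 come from the answer; the label "stratum 1" is a placeholder because the first stratum's label is missing in the source, and the function name is an assumption made for the example.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    return {
        label: round(size / population * total_sample)
        for label, size in stratum_sizes.items()
    }


# "stratum 1" is a placeholder label; M, C and H are the labels used in the answer.
stratum_sizes = {"stratum 1": 35, "M": 50, "C": 85, "H": 20}
allocation = proportional_allocation(stratum_sizes, 80)

for label, n in allocation.items():
    print(f"{label}: {n}")
print("Total:", sum(allocation.values()))
```

With these figures the rounded allocations add up to exactly 80. In general, rounding each stratum separately can leave the total one or two units off, in which case the allocation for the largest stratum is usually adjusted.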