Data and Analysis of Data
by Binam Ghimire

Learning Objectives
To give students a flavour of the nature of work related to data and analysis
Data
Primary and Secondary
Primary Data Collection
Secondary
Estimation
OLS, use of Excel

Research design activity: Workplace commitment
What would a definition of commitment be if you were to look it up in the dictionary?
Now, how might we “measure” commitment, i.e. the extent to which a person is committed to their job/employer?

Research Activity 1
1. In your group, find out what the concept of “workplace commitment” means. Use more than 3 found definitions and include the following in your research notes:
Where did you get the definitions from?
What characteristics do they all have in common?
What characteristics are different?
Now, using the findings from the secondary sources that you have found, write your own group definition of “workplace commitment”.

Research Activity 2
Now that you have an operational definition of the concept of “workplace commitment”, draw up a research plan detailing how you would go about measuring it in a workplace that you are familiar with.

Primary and Secondary sources
In that last pair of activities you were shown the differences between primary and secondary sources of research data.

Primary Data
Definition: “Data observed or collected directly from firsthand experience.”
Primary data is important for all areas of research because it is unvarnished information about the results of observation.
No one has tarnished it or spun it by adding their own opinion or bias, so it can form the basis of objective conclusions.
It can be collected through direct observation, participant observation, questionnaires and surveys, asking questions, focus groups, etc.

Primary Data
Data that has been collected from first-hand experience is known as primary data.
Primary data has not been published yet and is more reliable, authentic and objective.
Primary data has not been changed or altered by human beings; therefore its validity is greater than that of secondary data.

Importance of Primary Data
The importance of primary data cannot be neglected. Research can be conducted without secondary data, but research based only on secondary data is least reliable and may have biases, because secondary data has already been manipulated by human beings.

Secondary Data
Definitions:
“Published data and the data collected in the past or by other parties is called secondary data.”
“Existing primary data that was collected by someone else or for a purpose other than the current one.”

Secondary Data
Data collected from a source that has already been published in any form is called secondary data.
The review of literature in any research is based on secondary data, mostly from books, journals and periodicals.

Importance of Secondary Data
Secondary data can be less valid, but it is still important.
Sometimes it is difficult to obtain primary data; in these cases getting information from secondary sources is easier and possible.
Sometimes primary data does not exist; in such a situation one has to confine the research to secondary data.
Sometimes primary data is present but the respondents are not willing to reveal it; in such a case, too, secondary data can suffice.

A Secondary Source is...
“A secondary source is a report on the findings of the primary source. While not as authoritative as the primary source, the secondary source often provides a broad background and readily improves one's learning curve. Most textbooks are secondary sources; they report and summarize the primary sources.”
(Don W. Stacks, Primer of Public Relations Research.
Guilford Press, 2002)

Sources of Secondary Data
Textbooks
Research Reports
Learned Journals
Census
Official (government or NGO) statistics
Trade Body Reports
Company Reports
Plus a whole variety of other sources depending on the subject that you are researching

Market research and market reports
Using market reports and other data
Once you have identified the information you need, you can start to assemble it.
Initially it's worth looking at information that's already been published, e.g. market reports, official statistics, trade publications, etc.
Some of this information is free, but some you'll have to pay for.
You can obtain market reports and other information from a wide range of sources:

Internet sources
UK Trade & Investment is a useful resource for exporters, with sectoral information for more than 200 countries worldwide.
You can read country overviews on the UK Trade & Investment website: http://www.ukti.gov.uk/export/sectors.html
You can also read sector reports on the UK Trade & Investment website: http://www.ukti.gov.uk/export/accessinginternationalmarkets/businessopportunitiesalerts.html
Statistics on the web: RBA Information Services (http://www.rba.co.uk/sources/stats.htm) has an index and explanation of most of the statistics that are available on the web (won a Really Useful Site award).

Other Sources (commercial)
Commercial publishers of market reports include:
KeyNote (http://www.keynote.co.uk/)
Euromonitor (http://www.euromonitor.com/)
Mintel (http://www.marketresearch.com/vendors/viewVendor.asp?VendorID=614)
Datamonitor (http://www.datamonitor.com/)
The Economist Intelligence Unit (http://www.eiu.com/public/)
Datastream
Good but expensive

Sources for Resources
Penn World Table
The World Bank Data

ESDS International
One platform to find data from various resources (mainly international aggregate macro databases)
Many resources, including the World Bank and IMF
http://www.esds.ac.uk/International/
You can access it from university or home
ESDS International
Big list of series (variables), long annual data etc.

ESDS International
Easy to plot them in a graph for a group of countries
Figure: GDP growth, high income countries and world

ESDS International
Or for a single country compared with the rest of the world
Figure: GDP growth, the world and China

Yahoo Finance
http://uk.finance.yahoo.com/

Google Finance
http://www.google.com/finance

Interpreting information
Though there's a lot of readily available market information, you need to be careful how you interpret it.
External data might not be in a format that is easy to use. It may have been collected for other purposes or be from a range that doesn't tally with your target market.
Beware of out-of-date market information. This can be misleading, as the market may have changed significantly since the information was published. It can be particularly hard to tell how recent any information published on the internet is.
Some information on the web can be unreliable or biased.
Remember that statistics can sometimes mask the true picture. For example, an 'average' income for the population in your area might conceal a high proportion of low earners, meaning fewer people can afford your product than it appears.

Methods of primary data collection
Method: Interactive interviewing
Description: People asked to verbally describe their experiences of a phenomenon.
Method: Written descriptions by participants
Description: People asked to write descriptions of their experiences of a phenomenon.
Method: Observation
Description: Descriptive observations of verbal and non-verbal behavior.

Methods of primary data collection
Method: Survey
Description: The survey is a non-experimental, descriptive research method. Surveys can be useful when a researcher wants to collect data on phenomena that cannot be directly observed. Surveys are used extensively to assess attitudes and characteristics of a wide range of subjects. In a survey, researchers sample a population.
Busha and Harter (1980) state that "a population is any set of persons or objects that possesses at least one common characteristic."

Methods of primary data collection
Method: Case Study
Description: A case study is an intensive analysis of an individual unit (e.g., a person, group, or event) stressing developmental factors in relation to context. Case studies may be descriptive or explanatory. The latter type is used to explore causation in order to find underlying principles. They may be prospective (in which criteria are established and cases fitting the criteria are included as they become available) or retrospective (in which criteria are established for selecting cases from historical records for inclusion in the study).

Beware!
Regardless of the type of primary data that you collect, you need to bear in mind that before the data will support any generalisable conclusions it has to be based on a sound sampling methodology.
Good method + sound sample + good data analysis = good conclusions
Good method + unrepresentative sample + good data analysis = poor conclusions

So, answer the following questions
How large does my sample need to be in order to form the basis of a valid conclusion?
What population will my conclusions represent or refer to?
What characteristics does the population have (age, gender, incomes etc.)?
Are all of the relevant characteristics of the population represented (proportionately) in my sample?

Work this one out
In an examination of the factors that motivate non-qualified workers in a particular industry, Mohammed decides to survey a particular company.
He chooses a software development company that he used to work at, on the basis that he can get easy access to collect data.
The company has 100 full-time employees, of whom 60% are men, 40% are aged under 35, 85% hold a qualification at degree level and 20% have either Masters or postgraduate qualifications.
He decides to sample 10 people from the programming department that he used to work in. However, in that department everyone is aged under 35 and none of them holds any postgraduate qualification.

Your task
Decide how representative any data that Mohammed manages to collect is likely to be.
How reliable will his conclusions be?
What are the strengths and weaknesses of his data collection efforts?
How would you go about collecting the data that is needed in this case?

So how do we know what “good” data is?
By measuring the data against 3 standards:
Validity
Authenticity
Reliability

Validity
Validity is one of the major concerns in research. Validity is the quality of research that makes it trustworthy and scientific. Validity relies on the use of scientific methods in research to make it logical and acceptable. Using primary data in research can improve the validity of research. First-hand information obtained from a sample that is representative of the target population will yield data that will be valid for the entire target population.

Authenticity
Authenticity refers to the genuineness of the research. Authenticity can be at stake if the researcher invests personal biases or uses misleading information in the research. Primary research tools and data can become more authentic if the methods chosen to analyze and interpret data are valid and reasonably suitable for the data type. Primary sources are more authentic because the facts have not been overdone. A primary source can be less authentic if the source hides information or alters facts for personal reasons. There are methods that can be employed to ensure that the source yields factual data.

Reliability
Reliability is the certainty that the research is true enough to be trusted.
For example, suppose a research study concludes that junk food consumption does not increase the risk of cancer and heart disease. This conclusion should be drawn from a sample whose size, sampling technique and variability are not questionable. Reliability improves with the use of primary data. In the research mentioned above, if the researcher uses the experimental method and questionnaires, the results will be highly reliable. On the other hand, if he relies on the data available in books and on the internet, he will collect information that does not represent the real facts.

What next? How?

What can be done
Secondary Data:
Descriptive
Correlations
Estimation

Statistical Analysis - Software
Statistical Analysis Tools

What Next?
Or R

Analysis and Estimation
Example (see using Excel)
Income (X):        6   12   10   20
Consumption (Y):  22   22   12    8

Analysis and Estimation
Countries         Death Heart Disease    Wine    Beer  Liquor
France                           61.1    63.5    40.1     2.5
Italy                            94.1      58    25.1     0.9
Switzerland                     106.4      46      65     1.7
Australia                         173    15.7   102.1     1.2
Britain                         199.7    12.2     100     1.5
USA                               176     8.9    87.8       2
Russia                          373.6     2.7    17.1     3.8
Czech Republic                  283.7     1.7     140       1
Japan                            34.7       1      55     2.1
Mexico                           36.4     0.2    50.4     0.8
Death Heart Disease per 100,000; alcohol per capita consumption in litres. Source: Indiana University

Analysis and Estimation
Based on established theory
Rajan and Zingales (1995)
Beck et al (2000)

This week's work

Before we meet again?

Thank you
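
Worked example: OLS outside Excel
The Income (X) and Consumption (Y) example from the Analysis and Estimation slides can also be worked outside Excel. What follows is a minimal sketch in Python of a simple OLS fit of Y = a + bX, using only the standard library. It assumes the values pair up in the order they are listed (X = 6, 12, 10, 20 with Y = 22, 22, 12, 8), which the slide does not state explicitly. The function name ols_fit is hypothetical and chosen only for illustration.

```python
from math import sqrt


def ols_fit(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (intercept a, slope b, correlation coefficient r).
    ols_fit is an illustrative helper, not part of any library.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    slope = sxy / sxx                    # b = Sxy / Sxx
    intercept = mean_y - slope * mean_x  # a = mean(Y) - b * mean(X)
    r = sxy / sqrt(sxx * syy)            # Pearson correlation
    return intercept, slope, r


# Assumption: observations are paired in the order listed on the slide.
income = [6, 12, 10, 20]
consumption = [22, 22, 12, 8]

intercept, slope, r = ols_fit(income, consumption)
print(f"Consumption = {intercept:.3f} + ({slope:.3f}) * Income")
print(f"r = {r:.3f}, R squared = {r * r:.3f}")
```

Under that pairing assumption the fitted slope is negative (about -0.885) for these four points. Excel's SLOPE, INTERCEPT and CORREL functions should give the same values. The same function could be applied to any two columns of the heart disease and alcohol table to look at correlations.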