Uploaded by Simple X

Statistics revision papers

Chapter 1
Introduction to Statistics
Statistics
The practice or science of collecting and analysing numerical data in large quantities,
especially for the purpose of inferring proportions in a whole from those in a representative
sample.
Characteristics of Statistics*read notes
● Statistics are numerically expressed.
● Statistics are aggregates of facts.
● Data are collected in a systematic order.
● Data should be comparable with each other.
● Data are collected for a planned purpose.
Limitations of Statistics* read notes
● Statistics does not deal with individual measurements. ...
● Statistics deals only with quantitative characteristics. ...
● Statistical results are true only on an average. ...
● Statistics is only one of the methods of studying a problem. ...
● Statistics can be misused.
Descriptive Statistics
Descriptive statistics are used to summarize a set of data. Suppose you were teaching a
class of 25 students, and you wanted to know what the average score was for the test that
you just gave. You would use descriptive statistics; you are interested in the performance of
that particular set of students. One limitation of descriptive statistics is that they do not allow
us to make any inferences about the population at large. This type of statistics simply
describes the data set that has been collected.
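As a minimal sketch of the classroom example, the Python snippet below summarizes one set of 25 test scores. The scores are made up for illustration, and the choice of summaries (mean, median, standard deviation, minimum, maximum) is an assumption rather than something these notes prescribe.

```python
import statistics

# Hypothetical test scores for a class of 25 students (illustrative values only).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 77,
          69, 84, 92, 73, 80, 76, 87, 91, 68, 79,
          83, 75, 89, 74, 82]

def describe(values):
    """Return basic descriptive statistics for one data set."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Population SD: this class is the whole group of interest.
        "stdev": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }

summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The summary describes only these 25 students; nothing in it says how other classes would score.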
Inferential Statistics
Now, suppose you need to collect data on a very large population. For example, suppose
you want to know the average height of all the men in a city with a population of several
million residents. It is not practical to measure the height of every man.
This is where inferential statistics comes into play. Inferential statistics makes inferences
about a population using sample data drawn from that population. Instead of using the entire
population to gather the data, the statistician will collect a sample or samples from the
millions of residents and make inferences about the entire population using the sample.
The sample is a set of data taken from the population to represent the population. Probability
distributions, hypothesis testing, correlation testing and regression analysis all fall under the
category of inferential statistics.
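The idea of estimating a population value from a sample can be sketched in Python as follows. The population here is simulated (100,000 heights with an assumed mean of 175 cm and standard deviation of 7 cm), the sample size of 500 is arbitrary, and the 95% interval uses the normal approximation; all of these are illustrative assumptions, not part of the notes.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 100,000 men, simulated for illustration.
population = [random.gauss(175, 7) for _ in range(100_000)]

# Draw a simple random sample instead of measuring everyone.
sample = random.sample(population, 500)

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
standard_error = sample_sd / len(sample) ** 0.5

# Approximate 95% confidence interval for the population mean.
low = sample_mean - 1.96 * standard_error
high = sample_mean + 1.96 * standard_error

print(f"sample mean: {sample_mean:.2f} cm")
print(f"95% CI: ({low:.2f}, {high:.2f}) cm")
print(f"population mean: {statistics.mean(population):.2f} cm")
```

In a real study the population mean on the last line would be unknown; it is printed here only because the population is simulated and the estimate can be checked against it.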
Population
In statistics, a population is the entire set of items from which you draw data for a statistical
study. It can be a group of individuals, a set of items, etc. It makes up the data pool for a
study.
What is a Sample?
A sample is defined as a smaller and more manageable representation of a larger group. It is a
subset of a larger population that contains the characteristics of that population. A sample is
used in statistical testing when the population size is too large for all members or
observations to be included in the test.
Data
Data are raw, unorganized facts that need to be processed. Data can seem simple,
random and useless until they are organized.
Primary Data
Primary data refers to first-hand data gathered by the researcher himself or herself.
Sources: surveys, observations, experiments, questionnaires, personal interviews, etc.
Secondary Data
Secondary data means data collected earlier by someone else.
Sources: government publications, websites, books, journal articles, internal records, etc.
Information
When data is processed, organized, structured or presented in a given context so as to
make it useful, it is called information.
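A small sketch of turning data into information, assuming a hypothetical survey question about how people travel to work (the answers are invented for illustration): the raw list is just a sequence of words, while the frequency table built from it can support a decision.

```python
from collections import Counter

# Raw, unorganized data: hypothetical survey answers in the order they arrived.
raw_answers = ["bus", "car", "walk", "bus", "bus",
               "car", "cycle", "walk", "bus", "car"]

# Processing step: organize the answers into a frequency table.
counts = Counter(raw_answers)
total = len(raw_answers)

# Information: the same facts, structured and presented in context.
for mode, count in counts.most_common():
    print(f"{mode:<6} {count:>2}  ({count / total:.0%})")
```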
Good quality information is:
● Relevant. Information obtained and used should be needed for decision-making - it
doesn't matter how interesting it is. ...
● Up-to-date. Information needs to be timely if it is to be actioned. ...
● Accurate. ...
● Meeting user needs. ...
● Easy to use and understand. ...
● Worth the cost. ...
● Reliable.
● Complete
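Constant
A quantity whose value does not change.
Variable (or data)
A characteristic whose value is not fixed and is liable to change from one observation to another.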
Numerical (or Quantitative) Variables
Variables whose values result from counting or measuring something.
Examples:
height, weight, time in the 100-yard dash, number of items sold to a shopper
Categorical (or Qualitative) Variables
Variables that are not measurement variables. Their values do not result from measuring or
counting.
Examples:
hair color, religion, political party, profession
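The distinction between quantitative and qualitative variables can be sketched in Python with a simple rule of thumb: values that come from counting or measuring are stored as numbers, and the rest as labels. The record and the helper `kind_of` below are hypothetical and exist only for illustration.

```python
from numbers import Number

# One hypothetical record mixing both kinds of variable (illustrative values).
person = {
    "height_cm": 172.5,
    "weight_kg": 68.0,
    "items_bought": 3,
    "hair_color": "brown",
    "profession": "teacher",
}

def kind_of(value):
    """Label a value as quantitative if it is a number, otherwise qualitative."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, Number) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"

kinds = {name: kind_of(value) for name, value in person.items()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

The rule is only a heuristic: a numeric code such as a postal code or jersey number is stored as a number but does not result from counting or measuring, so it is still a qualitative variable.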
Discrete Variables
A discrete variable is a variable that takes on distinct, countable values. In theory, you
should always be able to count the values of a discrete variable.
Examples
● Years of schooling
● Number of goals scored in a soccer match
● Number of red M&M’s in a candy jar
● Votes for a particular politician
● Number of times a coin lands on heads after ten coin tosses
Continuous Variables
A continuous variable is a variable that can take on any value within a range. A continuous
variable takes on an infinite number of possible values within a given range.
Because the possible values for a continuous variable are infinite, we measure continuous
variables (rather than count), often using a measuring device like a ruler or stopwatch.
Continuous variables include all the fractional or decimal values within a range.
Examples
● The time it takes sprinters to run 100 meters
● The size of real estate lots in a city
● The weight of baby elephants
● The body temperature of patients with the flu
● The deployment altitude of skydivers
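The contrast between counting and measuring can be sketched in Python using two of the examples above. The coin tosses and the sprint time are simulated, and the 9.5 to 12.0 second range is an arbitrary assumption chosen for illustration.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Discrete: the number of heads in ten coin tosses is counted,
# so it can only be one of the whole numbers 0, 1, ..., 10.
tosses = [random.choice(["H", "T"]) for _ in range(10)]
heads = tosses.count("H")

# Continuous: a 100 m sprint time is measured, so any value in a range is
# possible; how it is reported depends on the precision of the stopwatch.
sprint_time = random.uniform(9.5, 12.0)  # hypothetical time in seconds

print(f"heads in 10 tosses: {heads}")
print(f"sprint time: {sprint_time:.2f} s (to hundredths)")
print(f"sprint time: {sprint_time:.4f} s (finer stopwatch)")
```

A finer stopwatch reveals more decimal places of the same sprint time, whereas no instrument can make the number of heads anything other than a whole number.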
QUIZ Question