Uploaded by françeska al

BUSINESS STATISTICS 1 CRAM FOR EXAM L1 JONKOPING UNIVERSITY

L1 THEORY
DATA AND DATASET
1. Data are the facts collected, summarized, analyzed and interpreted. The data collected
in a particular study are referred to as the data set.
2. Elements are the entities on which data are collected, such as individuals, firms,
countries, etc.
3. A variable is a characteristic of interest for the elements, such as shoe size, number of
employees, GDP, etc.
4. An observation is the set of measurements collected for a particular element, such as shoe
size 44, 10 employees, 583 billion USD, etc.
5. The total number of data values in a data set is the number of elements multiplied by the
number of variables.
Example: a simple table with names and subject grades for each student
● The elements correspond to individuals (the students)
● The variables are the number of points on tests in mathematics, physics, etc.
● The observations are the actual points that each individual received, e.g. for Matt: 38
(Math), 58 (Physics), 66 (Chemistry), 49 (Biology)
● The data set consists of 32 data values in total (the number of elements multiplied by
the number of variables)
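The terms above can be illustrated with a small sketch. Only Matt's scores come from the notes; the second student ("Anna") and her scores are invented purely for illustration, so this toy data set has 8 data values rather than 32.

```python
# Toy data set: each key is an element (a student), each inner dict is
# that element's observation (one measurement per variable).
# Matt's scores are from the notes; "Anna" is a hypothetical student.
data_set = {
    "Matt": {"Math": 38, "Physics": 58, "Chemistry": 66, "Biology": 49},
    "Anna": {"Math": 51, "Physics": 47, "Chemistry": 72, "Biology": 60},
}

elements = list(data_set)                         # the students
variables = list(next(iter(data_set.values())))   # the subjects
observation_matt = data_set["Matt"]               # all measurements for one element

# Total number of data values = number of elements * number of variables
n_data_values = len(elements) * len(variables)

print("Elements:", elements)
print("Variables:", variables)
print("Observation for Matt:", observation_matt)
print("Data values:", n_data_values)
```

Counting the stored values directly gives the same total as the multiplication rule, as long as every element has a measurement for every variable.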
SCALES OF MEASUREMENT
There are four different scales of measurement.
The scale determines the amount of information contained in the data and indicates the
statistical analyses that are most appropriate.
● Nominal: data are divided into different categories; there is no natural ranking of the
categories; the values of the variable can only be described in words, not
numbers; a nonnumeric label or numeric code may be used; the only meaningful
comparisons are = and ≠.
● Ordinal: data are divided into different categories; there exists a natural ranking of the
categories; it isn't possible to indicate in any meaningful way differences or distances
between the values; a nonnumeric label or numeric code may be used; the meaningful
comparisons are =, ≠, < and >.
● Interval: data are always numeric; there exists a natural ranking of the data; variables
measured on the interval scale have fixed measurement units; it makes sense to specify
differences or distances between values. The zero point is arbitrary, which means that
interval-scaled variables do not have a true or absolute zero point. Because of this it is
technically incorrect to declare that something is so many times larger or smaller than
something else.
● Ratio: data are always numeric; there exists a natural ranking of the data; variables
measured on the ratio scale have fixed measurement units; it makes sense to specify
differences or distances between values; there is a true (absolute) zero point, so it is
also meaningful to say that one value is so many times larger or smaller than another.
● Data measured on the nominal and ordinal scales are usually called qualitative or
categorical data
● Data measured on the interval and ratio scales are usually called quantitative data
● There are more options for statistical analysis when the data are quantitative
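The difference between the interval and ratio scales can be checked numerically. The sketch below uses temperature as an assumed illustration (it is not an example from the notes): Celsius has an arbitrary zero point (interval scale), while Kelvin has an absolute zero (ratio scale).

```python
import math

def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

# Two illustrative temperatures in degrees Celsius (assumed values)
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on an interval scale: both scales agree.
diff_c = b_c - a_c
diff_k = b_k - a_k
print("Same difference on both scales:", math.isclose(diff_c, diff_k))

# Ratios are only meaningful with a true zero point.
ratio_c = b_c / a_c   # 2.0, but "twice as warm" is not a valid claim
ratio_k = b_k / a_k   # roughly 1.035 on the ratio scale
print("Ratio in Celsius:", ratio_c)
print("Ratio in Kelvin:", round(ratio_k, 3))
```

The "ratio" computed in Celsius changes when the zero point moves, which is why statements such as "twice as large" are reserved for ratio-scaled variables.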
Statistical inference: the process of drawing conclusions about an underlying population
based on a sample or subset of the data
Graphical methods
● Pie chart: a chart type where circle sectors show proportions of a total
● Bar chart: a chart type that shows the values of different groups by the
height of the bars
● Histogram: a chart type that shows how many observations there are in
each interval
● Scatter plot: a chart type that uses dots to represent values for two
different numeric variables. It is used to observe relationships between the two
variables
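As a rough illustration of what a histogram summarizes, the sketch below counts how many observations fall in each interval and prints a text version of the bars. The scores, the interval width of 10 and the helper name `histogram_counts` are all assumptions made for this example.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count observations per half-open interval [lower, lower + bin_width)."""
    counts = Counter()
    for v in values:
        # Lower edge of the interval that contains v
        lower = start + ((v - start) // bin_width) * bin_width
        counts[(lower, lower + bin_width)] += 1
    return dict(sorted(counts.items()))

# Hypothetical test scores (illustrative values only)
scores = [38, 58, 66, 49, 71, 45, 52, 60]
hist = histogram_counts(scores, bin_width=10, start=30)

# Text rendering: one '#' per observation in the interval
for (lower, upper), count in hist.items():
    print(f"[{lower}, {upper}): {'#' * count}")
```

A plotting library would draw the same counts as adjacent bars; the counting step is the part that distinguishes a histogram from a bar chart of group values.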
Measures of central tendency
A measure of central tendency is a central or typical value for a data set or a probability distribution
● Mode is the most frequent value in the dataset
● Median is the middle value that separates the higher half from the lower
half of the dataset
● Mean is the average value, calculated as the sum of all values divided by the
number of values: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
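The three measures can be computed with Python's standard `statistics` module. The shoe sizes below are hypothetical values chosen so that one value clearly occurs most often.

```python
import statistics

# Hypothetical shoe sizes for seven people (illustrative values only)
shoe_sizes = [42, 44, 44, 41, 43, 44, 42]

mode_size = statistics.mode(shoe_sizes)      # most frequent value
median_size = statistics.median(shoe_sizes)  # middle value of the sorted data
mean_size = statistics.mean(shoe_sizes)      # sum of values / number of values

print("Mode:", mode_size)
print("Median:", median_size)
print("Mean:", round(mean_size, 2))
```

With an even number of values, `statistics.median` returns the average of the two middle values.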
Measures of variability - Percentiles
● The pth percentile is the value such that at least p percent of the
observations are less than or equal to this value and at least (100 - p) percent
are greater than or equal to it.
● The percentiles that divide the observations into four parts (P25, P50 and P75)
are called quartiles. P25 is called the first quartile, P50 is called the second
quartile or the median, and P75 is called the third quartile.
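One common textbook way to compute a percentile that is consistent with this definition is sketched below. It is an assumption that the course uses exactly this convention; software packages often interpolate differently and may return slightly different values. The function name `percentile` and the scores are hypothetical.

```python
def percentile(values, p):
    """Return the pth percentile (integer p, 0 < p < 100) of values.

    Convention assumed here: sort the data and compute i = (p / 100) * n.
    If i is not an integer, round up and take the value in that position.
    If i is an integer, average the values in positions i and i + 1.
    """
    if not isinstance(p, int) or not 0 < p < 100:
        raise ValueError("p must be an integer with 0 < p < 100")
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")
    # Integer arithmetic avoids floating-point surprises in (p / 100) * n
    q, r = divmod(p * n, 100)
    if r != 0:
        return data[q]                   # position ceil(i), as a 0-based index
    return (data[q - 1] + data[q]) / 2   # average of positions i and i + 1

# Hypothetical test scores (illustrative values only)
scores = [38, 58, 66, 49, 71, 45, 52, 60]

q1 = percentile(scores, 25)  # first quartile
q2 = percentile(scores, 50)  # second quartile (the median)
q3 = percentile(scores, 75)  # third quartile
print("Q1:", q1, "Q2:", q2, "Q3:", q3)
```

Under this convention the second quartile equals the ordinary median of the data, which matches the description of P50 above.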