Uploaded by shribavan kanesamoorthy

Omnibus test

Omnibus tests are a kind of statistical test. They test whether the explained variance in a set of data is
significantly greater than the unexplained variance, overall. One example is the F-test in the analysis of
variance. There can be legitimate significant effects within a model even if the omnibus test is not
significant. For instance, in a model with two independent variables, if only one variable exerts a significant
effect on the dependent variable and the other does not, then the omnibus test may be non-significant. This
fact does not affect the conclusions that may be drawn from the one significant variable. In order to test
effects within an omnibus test, researchers often use contrasts.
Omnibus test, as a general name, refers to an overall or a global test. Other names include F-test or Chi-squared test. It is a statistical test implemented on an overall hypothesis that tends to find general
significance between parameters' variance, while examining parameters of the same type, such as:
Hypotheses regarding equality vs. inequality between k expectancies μ1 = μ2 = ⋯ = μk vs. at least one
pair μj ≠ μj′, where j, j′ = 1, ..., k and j ≠ j′, in Analysis of Variance (ANOVA); or regarding equality
between k standard deviations σ1 = σ2 = ⋯ = σk vs. at least one pair σj ≠ σj′ in testing equality of
variances in ANOVA; or regarding coefficients β1 = β2 = ⋯ = βk vs. at least one pair βj ≠ βj′ in
multiple linear regression or in logistic regression.
Usually, it tests more than two parameters of the same type and its role is to find general significance of at
least one of the parameters involved.
Contents
Definitions
In one-way analysis of variance
Model assumptions in one-way ANOVA
Example
ANOVA
Dependent variable: time minutes to respond
Test of Homogeneity of Variances
Dependent variable: time minutes to respond
Considerations
In multiple regression
Model assumptions in multiple linear regression
The omnibus F test regarding the hypotheses over the coefficients
Example 1- omnibus F test on SPSS
ANOVA
Model summary
Coefficients
Example 2- multiple linear regression omnibus F test on R
Coefficients
In logistic regression
Omnibus test relates to the hypotheses
Model fitting: maximum likelihood method
Test's statistic and distribution: Wilks' theorem
Other considerations
Example 1 of logistic regression
Independent variables
Dependent variable
Block 1: method = forward stepwise (conditional)
Omnibus tests of model coefficients
Block 2: method = forward stepwise (conditional)
Omnibus tests of model coefficients
Variables in the equation
Example 2 of logistic regression
Dependent variable (coded as a dummy variable)
Independent variables (coded as a dummy variables)
Omnibus tests of model coefficients
Variables in the equation
See also
External links
Definitions
Omnibus test commonly refers to any one of the following statistical tests:
ANOVA F test of the significance of differences between all factor means, and/or of the
equality of their variances, in the Analysis of Variance procedure;
The omnibus multivariate F test in ANOVA with repeated measures;
F test for equality/inequality of the regression coefficients in multiple regression;
Chi-square test for exploring significant differences between blocks of independent
explanatory variables or their coefficients in a logistic regression.
These omnibus tests are usually conducted whenever one wants to test an overall hypothesis on a quadratic
statistic (like a sum of squares, variance or covariance) or a rational quadratic statistic (like the ANOVA
overall F test in Analysis of Variance, the F test in Analysis of Covariance, the F test in linear
regression, or the chi-square test in logistic regression).
A significant omnibus test establishes that a difference exists, so that at least two of the tested
parameters are statistically different, but it does not specify where the difference occurs. None of these
tests tells specifically which mean differs from the others (in ANOVA), which coefficient differs from the
others (in regression), and so on.
In one-way analysis of variance
The F-test in ANOVA is an example of an omnibus test, which tests the overall significance of the model.
A significant F test means that among the tested means, at least two are significantly different,
but this result doesn't specify exactly which means differ from one another. The differences between means
are tested by the quadratic rational F statistic (F = MSB/MSW). In order to determine which mean
differs from another mean, or which contrast of means is significantly different, post hoc tests (multiple
comparison tests) or planned tests should be conducted after obtaining a significant omnibus F test. The
simple Bonferroni correction or another suitable correction may be considered. Another omnibus test
found in ANOVA is the F test of one of the ANOVA assumptions: the equality of variance
between groups. In one-way ANOVA, for example, the hypotheses tested by the omnibus F test are:
H0: μ1 = μ2 = ... = μk
H1: at least one pair μj ≠ μj′
These hypotheses examine the fit of the most common model: yij = μj + εij, where yij is the dependent
variable, μj is the expectancy of the j-th level of the independent variable, usually referred to as the
"group expectancy" or "factor expectancy", and εij are the errors that result from using the model.
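The F statistic of the omnibus test is:
F = \frac{\sum_{j=1}^{k} n_j (\bar{y}_j - \bar{y})^2 / (k-1)}{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 / (n-k)}
where \bar{y} is the overall sample mean, \bar{y}_j is the group j sample mean, k is the number of groups,
n_j is the sample size of group j and n is the total sample size (the sum of the n_j).
Under the null hypothesis and the normality assumption, the F statistic is distributed F(k-1, n-k). The F
test is considered robust in some situations, even when the normality assumption isn't met.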
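The ratio F = MSB/MSW described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name one_way_f and the toy data are hypothetical and do not come from the article.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one list per group."""
    k = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean_j = sum(g) / len(g)                 # group j sample mean
        ss_between += len(g) * (mean_j - grand_mean) ** 2
        ss_within += sum((y - mean_j) ** 2 for y in g)
    msb = ss_between / (k - 1)                   # mean square between groups
    msw = ss_within / (n - k)                    # mean square within groups
    return msb / msw, k - 1, n - k


# Hypothetical toy data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df1, df2 = one_way_f(groups)
print(f_stat, df1, df2)
```

The returned degrees of freedom are the ones needed to look up the F(k-1, n-k) reference distribution.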
Model assumptions in one-way ANOVA
Random sampling.
Normal or approximately normal distribution of the dependent variable in each group.
Equal variances between groups.
If the assumption of equality of variances is not met, Tamhane's test is preferred. When this assumption is
satisfied, one can choose among several tests. Although the LSD (Fisher's Least Significant Difference) is
a very powerful test for detecting differences between pairs of means, it is applied only when the F test is
significant, and it is mostly less preferable since it fails to keep the error rate low. The Bonferroni test
is a good choice because of the correction its method applies: if n independent tests are to
be applied, then the α in each test should be equal to α/n. Tukey's method is also preferred by many
statisticians because it controls the overall error rate. On small sample sizes, when the assumption of
normality is not met, a nonparametric analysis of variance can be made by the Kruskal-Wallis test.
An alternative option is to use bootstrap methods to assess whether the group means are different. Bootstrap
methods do not have any specific distributional assumptions and may be an appropriate tool; re-sampling
is one of the simplest such methods. The idea can be extended to the case
of multiple groups to estimate p-values.
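A minimal re-sampling sketch for several groups is given below. It is an assumption-laden illustration, not the article's procedure: it shuffles the pooled observations across groups without replacement (a permutation-style re-sampling rather than a bootstrap with replacement), and the names f_statistic and resampling_p_value, the seed and the toy data are all hypothetical.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F = MSB / MSW for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def resampling_p_value(groups, n_resamples=999, seed=0):
    """Estimate a p-value by reshuffling observations across the groups."""
    rng = random.Random(seed)                 # fixed seed for reproducibility
    observed = f_statistic(groups)
    pooled = [y for g in groups for y in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        resampled, start = [], 0
        for size in sizes:                    # cut the shuffled pool into groups
            resampled.append(pooled[start:start + size])
            start += size
        if f_statistic(resampled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical toy data with clearly separated group means.
groups = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0], [21.0, 22.0, 23.0]]
p_value = resampling_p_value(groups)
print(p_value)
```

A small estimated p-value suggests that a between-group separation as large as the observed one rarely arises when group labels are assigned at random.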
Example
A cellular survey of customers' waiting time covered 1,963 different customers during 7 days on
each one of 20 in-sequential weeks. Assuming none of the customers called twice and none of them has
customer relations with another, a one-way ANOVA was run in SPSS to find significant differences
between the days' waiting times:
ANOVA
Dependent variable: time minutes to respond
Source           Sum of Squares   df     Mean Square   F         Sig.
Between Groups   12823.921        6      2137.320      158.266   .000
Within Groups    26414.958        1956   13.505
Total            39238.879        1962
The omnibus F ANOVA test results above indicate significant differences between the days' waiting times (P-value = 0.000 < 0.05, α = 0.05).
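The entries of the ANOVA table can be cross-checked with simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. A short sketch using the table's own figures (the variable names are illustrative):

```python
# Figures copied from the ANOVA table above.
ss_between, df_between = 12823.921, 6
ss_within, df_within = 26414.958, 1956

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_stat = ms_between / ms_within        # omnibus F statistic

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
```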
The other omnibus test was of the assumption of equality of variances, tested by the Levene F test:
Test of Homogeneity of Variances
Dependent variable: time minutes to respond
Levene Statistic   df1   df2    Sig.
36.192             6     1956   .000
The results suggest that the equality of variances assumption can't be made. In that case Tamhane's test can
be used for post hoc comparisons.
Considerations
A significant omnibus F test in the ANOVA procedure is a prerequisite for conducting the
post hoc comparisons; otherwise those comparisons are not required. If the omnibus test fails to find
significant differences between all means, it means that no difference has been found between any
combinations of the tested means. As such, it protects the family-wise Type I error rate, which may be
inflated if the omnibus test is overlooked. Some debates have occurred about the efficiency of the omnibus
F test in ANOVA.
In a paper in Review of Educational Research (66(3), 269-306), reviewed by Greg Hancock, those
problems are discussed:
William B. Ware (1997) claims that whether omnibus test significance is required depends on which post hoc
test is conducted or planned: "... Tukey's HSD and Scheffé's procedure are one-step procedures and can be
done without the omnibus F having to be significant. They are "a posteriori" tests, but in this case, "a
posteriori" means "without prior knowledge", as in "without specific hypotheses." On the other hand,
Fisher's Least Significant Difference test is a two-step procedure. It should not be done without the
omnibus F-statistic being significant."
William B. Ware (1997) argued that there are a number of problems associated with the requirement of an
omnibus test rejection prior to conducting multiple comparisons. Hancock agrees with that approach and
sees the omnibus requirement in ANOVA, when performing planned tests, as an unnecessary and potentially
detrimental hurdle, unless it is related to Fisher's LSD, which is a viable option for k = 3 groups.
Another reason for requiring omnibus test significance is to protect the family-wise Type I error rate.
The publication "Review of Educational Research" discusses four problems in the omnibus F test
requirement:
First, in a well-planned study, the researcher's questions involve specific contrasts of group means, while
the omnibus test addresses each question only tangentially and is rather used to facilitate control over the
rate of Type I error.
Second, and related to this issue of control, the belief that an omnibus test offers protection
is not completely accurate. When the complete null hypothesis is true, weak family-wise Type I error
control is facilitated by the omnibus test; but when the complete null is false and partial nulls exist, the
F-test does not maintain strong control over the family-wise error rate.
A third point, which Games (1971) demonstrated in his study, is that the F-test may not be completely
consistent with the results of a pairwise comparison approach. Consider, for example, a researcher who is
instructed to conduct Tukey's test only if an alpha-level F-test rejects the complete null. It is possible for the
complete null to be rejected but for the widest ranging means not to differ significantly. This is an example
of what has been referred to as non-consonance/dissonance (Gabriel, 1969) or incompatibility (Lehmann,
1957). On the other hand, the complete null may be retained while the null associated with the widest
ranging means would have been rejected had the decision structure allowed it to be tested. This has been
referred to by Gabriel (1969) as incoherence. One wonders if, in fact, a practitioner in this situation would
simply conduct the MCP contrary to the omnibus test's recommendation.
The fourth argument against the traditional implementation of an initial omnibus F-test stems from the fact
that its well-intentioned but unnecessary protection contributes to a decrease in power. The first test in a
pairwise MCP, such as that of the most disparate means in Tukey's test, is a form of omnibus test all by
itself, controlling the family-wise error rate at the α-level in the weak sense. Requiring a preliminary
omnibus F-test amounts to forcing a researcher to negotiate two hurdles to proclaim the most disparate
means significantly different, a task that the range test accomplished at an acceptable α -level all by itself. If
these two tests were perfectly redundant, the results of both would be identical to the omnibus test;
probabilistically speaking, the joint probability of rejecting both would be α when the complete null
hypothesis was true. However, the two tests are not completely redundant; as a result the joint probability of
their rejection is less than α. The F-protection therefore imposes unnecessary conservatism (see
Bernhardson, 1975, for a simulation of this conservatism). For this reason, and those listed before, we agree
with Games' (1971) statement regarding the traditional implementation of a preliminary omnibus F-test:
There seems to be little point in applying the overall F test prior to running c contrasts by procedures that
set [the family-wise error rate] α .... If the c contrasts express the experimental interest directly, they are
justified whether the overall F is significant or not and (family-wise error rate) is still controlled.
In multiple regression
In multiple regression, the omnibus test is an ANOVA F test on all the coefficients, which is equivalent to
the F test of the multiple correlation R Square. The omnibus F test is an overall test that examines model
fit; thus failure to reject the null hypothesis implies that the suggested linear model does not fit the data
significantly better than a model with no predictors: none of the independent variables has been found to be
significant in explaining the variation of the dependent variable. These hypotheses examine the fit of the
most common model: yi = β0 + β1 xi1 + ... + βk xik + εi,
estimated by E(yi|xi1,...,xik) = β0 + β1 xi1 + ... + βk xik, where E(yi|xi1,...,xik) is the expectation of
the dependent variable for the i-th observation, xij is the j-th independent (explanatory) variable, and βj
is the j-th coefficient of xij, indicating its influence on the dependent variable y through its partial
correlation with y.
The F statistic of the omnibus test is:
F = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 / k}{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / (n-k-1)}
where \bar{y} is the overall sample mean of yi, \hat{y}_i is the regression estimated mean for a specific set
of k independent (explanatory) variables and n is the sample size.
Under the null hypothesis and the normality assumption, the F statistic is distributed F(k, n-k-1).
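The stated equivalence between the omnibus F test and the R Square F test can be made explicit with a short derivation. The symbols SSR (regression sum of squares), SSE (residual sum of squares) and SST (total sum of squares) are not used in the text above and are introduced here as assumptions, with SST = SSR + SSE and R^2 = SSR/SST:

```latex
\begin{aligned}
F &= \frac{SSR / k}{SSE / (n-k-1)}
   = \frac{(SSR / SST) / k}{(SSE / SST) / (n-k-1)} \\
  &= \frac{R^2 / k}{(1 - R^2) / (n-k-1)}
\end{aligned}
```

So, for fixed k and n, a larger R Square always corresponds to a larger omnibus F.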
Model assumptions in multiple linear regression
Random sampling.
Normal or approximately normal distribution of the errors εi.
The expectation of the errors equals zero, E(εi) = 0.
Equal variances of the errors εi. This assumption can itself be checked by an omnibus F test (like the Levene F test).
No multicollinearity between explanatory/predictor variables, meaning: cov(xi, xj) = 0 where
i ≠ j, for any i or j.
The omnibus F test regarding the hypotheses over the coefficients
H0 : β1 = β2 =....= βk = 0
H1 : at least one βj ≠ 0
The omnibus test examines whether there are any regression coefficients that are significantly non-zero,
except for the coefficient β0. The β0 coefficient goes with the constant predictor and is usually not of
interest. The null hypothesis is generally thought to be false and is easily rejected with a reasonable amount
of data, but in contrast to ANOVA, it is important to do the test anyway. When the null hypothesis cannot
be rejected, the data give no evidence that the predictors explain the dependent variable: the model with
the constant regression function fits as well as the regression model, which means that no further analysis
need be done. In many statistical studies the omnibus test is significant although some or most of the
independent variables have no significant influence on the dependent variable. So the omnibus test is useful
only to indicate whether the model fits or not, but it doesn't offer a corrected, recommended model that can
be fitted to the data. The omnibus test is mostly significant if at least one of the independent variables is
significant. This means that any other variable may enter the model, under the model assumption of
non-collinearity between independent variables, while the omnibus test still shows significance. The
suggested model is then considered to fit the data.
Example 1- omnibus F test on SPSS
An insurance company intends to predict "Average cost of claims" (variable name "claimamt") by three
independent variables (predictors): "Number of claims" (variable name "nclaims"), "Policyholder age"
(variable name "holderage") and "Vehicle age" (variable name "vehicleage"). The Linear Regression procedure
was run on the data, with the following results. The omnibus F test in the ANOVA table implies that the
model involving these three predictors is suitable for predicting "Average cost of claims", since the null
hypothesis is rejected (P-value = 0.000 < 0.01, α = 0.01). This rejection implies that at least one of the
coefficients of the predictors in the model has been found to be non-zero. The multiple R Square reported in
the Model Summary table is 0.362, which means that the three predictors explain 36.2% of the variation in
"Average cost of claims".
ANOVA
Source       Sum of Squares   df    Mean Square   F        Sig.
Regression   605407.143       3     201802.381    22.527   .000a
Residual     1066019.508      119   8958.147
Total        1671426.650      122
a. Predictors: (Constant), nclaims Number of claims, holderage Policyholder age, vehicleage Vehicle age
b. Dependent Variable: claimamt Average cost of claims
Model summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .602a   .362       .346                94.647
a. Predictors: (Constant), nclaims Number of claims, holderage Policyholder age, vehicleage Vehicle age
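The figures reported in the ANOVA and Model Summary tables are linked by simple arithmetic, which the sketch below reproduces from the two sums of squares. The variable names are illustrative, and the sample size n = 123 is inferred from the residual degrees of freedom (119 = n - k - 1 with k = 3) rather than stated in the example.

```python
# Figures copied from the ANOVA table of the insurance example.
ss_regression, ss_residual = 605407.143, 1066019.508
k = 3                      # number of predictors
df_residual = 119          # n - k - 1
n = df_residual + k + 1    # inferred sample size (assumption: 123)

# Omnibus F: regression mean square over residual mean square.
f_stat = (ss_regression / k) / (ss_residual / df_residual)

# R Square and adjusted R Square from the same sums of squares.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# The same F expressed through R Square.
f_from_r2 = (r_squared / k) / ((1 - r_squared) / df_residual)

print(round(f_stat, 3), round(r_squared, 3), round(adj_r_squared, 3))
```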
However, only the predictors "Vehicle age" and "Number of claims" have a statistically significant influence
on "Average cost of claims", as shown in the following Coefficients table, whereas
"Policyholder age" is not significant as a predictor (P-value = 0.116 > 0.05). That means that a model without
this predictor may be suitable.
Coefficients
Model 1                      B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
(Constant)                   447.668              29.647                             15.100   .000
vehicleage Vehicle age       -67.877              9.366        -.644                 -7.247   .000
holderage Policyholder age   -6.624               4.184        -.128                 -1.583   .116
nclaims Number of claims     -.274                .119         -.217                 -2.30    .023
a. Dependent Variable: claimamt Average cost of claims
Example 2- multiple linear regression omnibus F test on R
The following R output illustrates the linear regression and model fit of two predictors, x1 and x2. The last
line describes the omnibus F test for model fit. The interpretation is that the null hypothesis is rejected
(P = 0.02692 < 0.05, α = 0.05). So either β1 or β2 appears to be non-zero (or perhaps both). Note that the
conclusion from the Coefficients table is that only β1 is significant (the P-value shown in the Pr(>|t|)
column is 4.37e-05 << 0.001). Thus a one-step test, like the omnibus F test for model fit, is not sufficient
to determine model fit for each of those predictors.
Coefficients
              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)   -0.7451    0.7319       -1.018    0.343
X1             0.6186    0.7500        0.825    4.37e-05 ***
X2             0.0126    0.1373        0.092    0.929
Residual standard error: 1.157 on 7 degrees of freedom
Multiple R-Squared: 0.644, Adjusted R-squared: 0.5423
F-statistic: 6.332 on 2 and 7 DF, p-value: 0.02692
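The last three lines of the R output are mutually consistent, which can be checked without any statistics library. The sketch below assumes only the reported R-squared, the degrees of freedom and the F value. The closed-form tail probability it uses holds specifically for an F distribution with 2 numerator degrees of freedom, and the function name is hypothetical.

```python
r_squared = 0.644          # Multiple R-Squared from the R output
k, df_residual = 2, 7      # 2 predictors, 7 residual degrees of freedom

# Omnibus F from R-squared: (R^2 / k) / ((1 - R^2) / (n - k - 1)).
f_from_r2 = (r_squared / k) / ((1 - r_squared) / df_residual)

# Adjusted R-squared, with n - 1 = df_residual + k.
adj_r_squared = 1 - (1 - r_squared) * (df_residual + k) / df_residual


def f_upper_tail_2_numerator_df(f_value, df2):
    """P(F > f_value) for an F(2, df2) distribution (closed form for df1 = 2 only)."""
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


p_value = f_upper_tail_2_numerator_df(6.332, df_residual)
print(round(f_from_r2, 3), round(adj_r_squared, 4), round(p_value, 5))
```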
In logistic regression
In statistics, logistic regression is a type of regression analysis used for predicting the outcome of a
categorical dependent variable (with a limited number of categories) or a dichotomous dependent variable
based on one or more predictor variables. The probabilities describing the possible outcomes of a single trial
are modeled, as a function of explanatory (independent) variables, using a logistic function or a multinomial
distribution. Logistic regression measures the relationship between a categorical or dichotomous dependent
variable and usually a continuous independent variable (or several), by converting the dependent variable to
probability scores. The probabilities can be retrieved using the logistic function or the multinomial
distribution; as in probability theory, they take on values between zero and one:
P(y_i) = \frac{e^{\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}}}{1 + e^{\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}}} = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik})}}
So the model tested can be defined by:
f(y_i) = \ln \frac{P(y_i)}{1 - P(y_i)} = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}
where yi is the category of the dependent variable for the i-th observation, xij is the j-th independent
variable (j = 1, 2, ..., k) for that observation, and βj is the j-th coefficient of xij, indicating its
influence on the fitted model.
Note: independent variables in logistic regression can also be continuous.
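As a small illustration of how the logistic function turns a linear predictor into a probability between zero and one, and how the log-odds recover the linear predictor, consider the sketch below. The function names and the coefficient values are hypothetical and chosen only for the example.

```python
import math


def logistic_probability(beta0, betas, xs):
    """Probability from the logistic function of beta0 + sum(beta_j * x_j)."""
    linear = beta0 + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-linear))


def log_odds(p):
    """Logit ln(p / (1 - p)), the inverse of the logistic function."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and predictor values.
p = logistic_probability(-1.0, [0.5, 2.0], [3.0, 1.0])
print(p, log_odds(p))   # the log-odds equal the linear predictor, here 2.5
```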
Omnibus test relates to the hypotheses
H0 : β1 = β2 =....= βk = 0
H1 : at least one βj ≠ 0
Model fitting: maximum likelihood method
The omnibus test, among the other parts of the logistic regression procedure, is a likelihood-ratio test based
on the maximum likelihood method. Unlike the linear regression procedure, in which the regression
coefficients can be estimated by the least squares procedure, i.e. by minimizing the sum of squared
residuals, in logistic regression there is no such analytical solution or
set of equations from which one can derive estimates of the regression coefficients. So logistic
regression uses the maximum likelihood procedure to estimate the coefficients that maximize the likelihood
of the regression coefficients given the predictors and criterion. The maximum likelihood solution is an
iterative process that begins with a tentative solution, revises it slightly to see if it can be improved, and
repeats this process until no further improvement is made, at which point the model is said to have converged.
Applying the procedure is conditional on convergence (see also "Other considerations" below).
In general, regarding simple hypotheses on a parameter θ (for example): H0: θ = θ0 vs. H1: θ = θ1,
the likelihood ratio test statistic can be written as:
\lambda(y_i) = \frac{L(y_i \mid \theta_0)}{L(y_i \mid \theta_1)}
where L(yi|θ) is the likelihood function, which refers to the specific θ.
The numerator corresponds to the maximum likelihood of an observed outcome under the null hypothesis.
The denominator corresponds to the maximum likelihood of an observed outcome varying parameters over
the whole parameter space. The numerator of this ratio is less than the denominator. The likelihood ratio
hence is between 0 and 1.
Lower values of the likelihood ratio mean that the observed result was much less likely to occur under the
null hypothesis as compared to the alternative. Higher values of the statistic mean that the observed
outcome was more than or equally likely or nearly as likely to occur under the null hypothesis as compared
to the alternative, and the null hypothesis cannot be rejected.
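A minimal numeric sketch of such a ratio, assuming a sample of independent Bernoulli trials (an assumption made purely for illustration, as are the function names and the data): the numerator is the likelihood at the null value θ0, and the denominator is the likelihood maximized over the whole parameter space, which for Bernoulli data is attained at the sample proportion.

```python
import math


def bernoulli_log_likelihood(successes, n, theta):
    """Log-likelihood of `successes` out of n Bernoulli(theta) trials."""
    return successes * math.log(theta) + (n - successes) * math.log(1 - theta)


def likelihood_ratio(successes, n, theta0):
    """Likelihood under H0: theta = theta0 over the unrestricted maximum."""
    theta_hat = successes / n   # unrestricted MLE; requires 0 < successes < n
    log_lambda = (bernoulli_log_likelihood(successes, n, theta0)
                  - bernoulli_log_likelihood(successes, n, theta_hat))
    return math.exp(log_lambda)


# Hypothetical data: 14 successes in 20 trials, testing theta0 = 0.5.
lam = likelihood_ratio(successes=14, n=20, theta0=0.5)
print(lam, -2 * math.log(lam))
```

The ratio stays in the interval (0, 1], and it equals 1 only when the null value coincides with the unrestricted estimate.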
The likelihood ratio test provides the following decision rule:
If λ(yi) > C, do not reject H0;
if λ(yi) < C, reject H0;
and if λ(yi) = C, reject H0 with probability q.
The critical values C and q are usually chosen to obtain a specified significance level α, through
the relation:
q \cdot P(\lambda(y_i) = C \mid H_0) + P(\lambda(y_i) < C \mid H_0) = \alpha
Thus, the likelihood-ratio test rejects the null hypothesis if the value of this statistic is too small. How small
is too small depends on the significance level of the test, i.e., on what probability of Type I error is
considered tolerable. The Neyman-Pearson lemma states that this likelihood ratio test is the most powerful
among all level-α tests for this problem.
Test's statistic and distribution: Wilks' theorem
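First we define the test statistic as the deviance
D = -2 \ln \lambda(y_i)
which indicates testing the ratio:
D = -2 \ln \frac{\text{likelihood under the fitted model if the null hypothesis is true}}{\text{likelihood under the saturated model}}
The saturated model is a model with a theoretically perfect fit. Given that deviance is a measure of the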
difference between a given model and the saturated model, smaller values indicate better fit as the fitted
model deviates less from the saturated model. When assessed upon a chi-square distribution, non-significant
chi-square values indicate very little unexplained variance and thus, good model fit. Conversely, a
significant chi-square value indicates that a significant amount of the variance is unexplained. Two
measures of deviance D are particularly important in logistic regression: null deviance and model deviance.
The null deviance represents the difference between a model with only the intercept and no predictors and
the saturated model. And, the model deviance represents the difference between a model with at least one
predictor and the saturated model. In this respect, the null model provides a baseline upon which to
compare predictor models. Therefore, to assess the contribution of a predictor or set of predictors, one can
subtract the model deviance from the null deviance and assess the difference on a chi-square distribution
with degrees of freedom equal to the number of predictors added (one degree of freedom for a single predictor). If the model deviance is significantly smaller than the null deviance then one
can conclude that the predictor or set of predictors significantly improved model fit. This is analogous to the
F-test used in linear regression analysis to assess the significance of prediction.
In most cases, the exact distribution of the likelihood ratio corresponding to specific hypotheses is very
difficult to determine. A convenient result, attributed to Samuel S. Wilks, says that as the sample size n
approaches ∞, the test statistic -2 ln λ is asymptotically χ² distributed, with degrees of freedom equal to
the difference in dimensionality between the full parameter space Θ and the null parameter space Θ0, that
is, of the β coefficients mentioned before for the omnibus test. For example, if n is
large enough, the fitted model under the null hypothesis consists of 3 predictors and the saturated
(full) model consists of 5 predictors, then the Wilks statistic is approximately χ² distributed with 2 degrees
of freedom. This means that we can retrieve the critical value C from the chi-squared distribution with 2
degrees of freedom under a specific significance level.
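For the 2 degrees of freedom case just described, the critical value can be computed in closed form, because the upper tail probability of a chi-squared variable with 2 degrees of freedom is exp(-x/2). The sketch below relies on that special case only. The function names and the deviance difference are hypothetical.

```python
import math


def chi2_upper_tail_2df(x):
    """P(X > x) for a chi-squared variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_critical_value_2df(alpha):
    """Value C with P(X > C) = alpha, for 2 degrees of freedom."""
    return -2.0 * math.log(alpha)


critical = chi2_critical_value_2df(0.05)
deviance_difference = 7.4          # hypothetical value of -2 ln(lambda)
reject = deviance_difference > critical
print(round(critical, 3), reject)
```

On the -2 ln λ scale the null hypothesis is rejected for large values, which corresponds to small values of the likelihood ratio λ itself.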
Other considerations
1. In some instances the model may not reach convergence. When a model does not converge
this indicates that the coefficients are not reliable as the model never reached a final
solution. Lack of convergence may result from a number of problems: having a large ratio of
predictors to cases, multi-collinearity, sparseness, or complete separation. Although not a
precise number, as a rule of thumb, logistic regression models require a minimum of 10
cases per variable. Having a large proportion of variables to cases results in an overly
conservative Wald statistic and can lead to non convergence.
2. Multi-collinearity refers to unacceptably high correlations between predictors. As multicollinearity increases, coefficients remain unbiased but standard errors increase and the
likelihood of model convergence decreases. To detect multi-collinearity among the
predictors, one can conduct a linear regression analysis with the predictors of interest for the
sole purpose of examining the tolerance statistic used to assess whether multi-collinearity is
unacceptably high.
3. Sparseness in the data refers to having a large proportion of empty cells (cells with zero
counts). Zero cell counts are particularly problematic with categorical predictors. With
continuous predictors, the model can infer values for the zero cell counts, but this is not the
case with categorical predictors. The reason the model will not converge with zero cell
counts for categorical predictors is because the natural logarithm of zero is an undefined
value, so final solutions to the model cannot be reached. To remedy this problem,
researchers may collapse categories in a theoretically meaningful way or may consider
adding a constant to all cells. Another numerical problem that may lead to a lack of
convergence is complete separation, which refers to the instance in which the predictors
perfectly predict the criterion - all cases are accurately classified. In such instances, one
should reexamine the data, as there is likely some kind of error.
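4. The Wald statistic is defined by W = \left( \hat{\beta}_j / SE(\hat{\beta}_j) \right)^2, where \hat{\beta}_j is the sample estimate of \beta_j and SE(\hat{\beta}_j) is the standard error of \hat{\beta}_j.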
Alternatively, when assessing the contribution of individual predictors in a given model, one
may examine the significance of the Wald statistic. The Wald statistic, analogous to the t-test
in linear regression, is used to assess the significance of coefficients. The Wald statistic is
the ratio of the square of the regression coefficient to the square of the standard error of the
coefficient and is asymptotically distributed as a chi-square distribution. Although several
statistical packages (e.g., SPSS, SAS) report the Wald statistic to assess the contribution of
individual predictors, the Wald statistic has some limitations. First, when the regression
coefficient is large, the standard error of the regression coefficient also tends to be large
increasing the probability of Type-II error. Secondly, the Wald statistic also tends to be
biased when data are sparse.
5. Model Fit involving categorical predictors may be achieved by using log-linear modeling.
Example 1 of logistic regression
Spector and Mazzeo examined the effect of a teaching method known as PSI on the performance of
students in a course, intermediate macroeconomics. The question was whether students exposed to the
method scored higher on exams in the class. They collected data from students in two classes, one in which
PSI was used and another in which a traditional teaching method was employed. For each of 32 students,
they gathered data on
Independent variables
GPA-Grade point average before taking the class.
TUCE-the score on an exam given at the beginning of the term to test entering knowledge of
the material.
PSI- a dummy variable indicating the teaching method used (1 = used Psi, 0 = other
method).
Dependent variable
• GRADE — coded 1 if the final grade was an A, 0 if the final grade was a B or C.
The particular interest in the research was whether PSI had a significant effect on GRADE. TUCE and
GPA are included as control variables.
Statistical analysis using logistic regression of GRADE on GPA, TUCE and PSI was conducted in SPSS using
stepwise logistic regression.
In the output, the "block" line relates to the chi-square test on the set of independent variables that are
tested and included in the model fitting. The "step" line relates to the chi-square test on the step level,
while variables are included in the model step by step. Note that in the output a step chi-square is the same
as the block chi-square when both test the same hypothesis about whether the coefficients of the variables
entered on this step are non-zero. If you were doing stepwise regression, however, the results would be
different. Using forward stepwise selection, the researchers divided the variables into two blocks (see
METHOD in the syntax below).
LOGISTIC REGRESSION VAR=grade
/METHOD=fstep psi / fstep gpa tuce
/CRITERIA PIN(.50) POUT(.10) ITERATE(20) CUT(.5).
The default PIN value is .05; the researchers changed it to .5 so that the insignificant TUCE would be
entered. In the first block, PSI alone is entered, so the block and step chi-square tests relate to the
hypothesis H0: βPSI = 0. The results of the omnibus chi-square tests imply that PSI is significant for
predicting that GRADE is more likely to be a final grade of A.
Block 1: method = forward stepwise (conditional)
Omnibus tests of model coefficients
                 Chi-Square   df   Sig.
Step 1   Step    5.842        1    .016
         Block   5.842        1    .016
         Model   5.842        1    .016
Then, in the next block, the forward selection procedure causes GPA to be entered first, then TUCE (see the
METHOD command in the syntax above).
Block 2: method = forward stepwise (conditional)
Omnibus tests of model coefficients
                 Chi-Square   df   Sig.
Step 1   Step    9.088        1    .003
         Block   9.088        1    .003
         Model   14.930       2    .001
Step 2   Step    .474         1    .491
         Block   9.562        2    .008
         Model   15.404       3    .002
The first step in block 2 indicates that GPA is significant (P-value = 0.003 < 0.05, α = 0.05).
So, looking at the final entries on step2 in block2,
The step chi-square, .474, tells you whether the effect of the variable that was entered in the
final step, TUCE, significantly differs from zero. It is the equivalent of an incremental F test of
the parameter, i.e. it tests H0: βTUCE = 0.
The block chi-square, 9.562, tests whether either or both of the variables included in this
block (GPA and TUCE) have effects that differ from zero. This is the equivalent of an
incremental F test, i.e. it tests H0: βGPA = βTUCE = 0.
The model chi-square, 15.404, tells you whether any of the three independent variables has a
significant effect. It is the equivalent of a global F test, i.e. it tests H0: βGPA = βTUCE = βPSI = 0.
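The relationship between the step, block and model lines can be checked with a few lines of Python. The following is a minimal sketch, not the SPSS computation itself: it assumes each statistic is a likelihood-ratio chi-square referred to a chi-square distribution with the stated degrees of freedom. The helper `chi2_sf` is a hypothetical name introduced here; it uses only the standard library and is valid for integer degrees of freedom only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-z), 1.0             # closed form for df = 2
    else:
        q, a = math.erfc(math.sqrt(z)), 0.5  # closed form for df = 1
    # Raise the degrees of freedom two at a time using
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Chi-square values copied from the Block 1 and Block 2 tables.
block1_model = 5.842                 # PSI entered in block 1, df = 1
step_gpa, step_tuce = 9.088, 0.474   # block 2, steps 1 and 2, df = 1 each

block2 = step_gpa + step_tuce        # block chi-square after step 2
model = block1_model + block2        # model chi-square after step 2

for label, stat, df in [
    ("step (TUCE)", step_tuce, 1),
    ("block (GPA, TUCE)", block2, 2),
    ("model (PSI, GPA, TUCE)", model, 3),
]:
    print(f"{label}: chi-square = {stat:.3f}, df = {df}, p = {chi2_sf(stat, df):.3f}")
```

The sums reproduce the block and model chi-squares reported for step 2, which illustrates why these lines are described as incremental and global tests: each one adds the chi-square contributions of the variables it covers.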
Tests of individual parameters are shown in the "Variables in the equation" table, which reports the Wald
test (W = (b/sb)², where b is the estimate of β and sb is the estimate of its standard error), testing
whether each individual parameter equals zero. You can, if you want, do an incremental LR chi-square test
instead. That, in fact, is the better way to do it, since the Wald test is biased in certain situations. When
the parameters are tested separately, controlling for the other parameters, the effects of GPA and PSI are
statistically significant, but the effect of TUCE is not. Both GPA and PSI have Exp(β) greater than 1,
implying that the odds of getting an "A" grade rather than another grade increase with the teaching method
PSI and with a higher former grade average GPA.
Variables in the equation
                        B      S.E.    Wald    df   Sig.   Exp(B)
Step 1a   GPA         2.826   1.263   5.007    1   .025   16.872
          TUCE         .095    .142    .452    1   .502    1.100
          PSI         2.378   1.064   4.992    1   .025   10.786
          Constant  -13.019   4.930   6.972    1   .008     .000
a. Variable(s) entered on step 1: PSI
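The Wald statistics and odds ratios in the table can be recomputed from the B and S.E. columns. This is a rough check under stated assumptions: the inputs are the rounded values printed in the table, so the results agree with the table only to about two decimals, and the p-value relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal. The names `coefficients` and `wald_test` are illustrative and do not come from SPSS.

```python
import math

# (B, S.E.) pairs copied from the "Variables in the equation" table.
coefficients = {
    "GPA": (2.826, 1.263),
    "TUCE": (0.095, 0.142),
    "PSI": (2.378, 1.064),
    "Constant": (-13.019, 4.930),
}

def wald_test(b, se):
    """Return (W, p, Exp(B)) for one coefficient, with W = (b / se)**2 on 1 df."""
    w = (b / se) ** 2
    p = math.erfc(math.sqrt(w / 2.0))  # upper tail of chi-square with 1 df
    return w, p, math.exp(b)

results = {name: wald_test(b, se) for name, (b, se) in coefficients.items()}
for name, (w, p, odds_ratio) in results.items():
    print(f"{name:<9} Wald = {w:6.3f}  Sig. = {p:.3f}  Exp(B) = {odds_ratio:.3f}")
```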
Example 2 of logistic regression
Research subject: "The Effects of Employment, Education, Rehabilitation and Seriousness of Offense on
Re-Arrest". A social worker in a criminal justice probation agency wants to examine whether some of these
factors lead to re-arrest among the clients managed by the agency over the past five years who were
convicted and then released. The data consist of 1,000 clients with the following variables:
Dependent variable (coded as a dummy variable)
Re-arrested vs. not re-arrested (0 = not re-arrested; 1 = re-arrested) – categorical, nominal
Independent variables (coded as dummy variables)
Whether or not the client was adjudicated for a second criminal offense (1 = adjudicated; 0 = not)
Seriousness of first offense (1 = felony; 0 = misdemeanor) – categorical, nominal
High school graduate vs. not (0 = not graduated; 1 = graduated) – categorical, nominal
Whether or not the client completed a rehabilitation program after the first offense (0 = no rehab
completed; 1 = rehab completed) – categorical, nominal
Employment status after first offense (0 = not employed; 1 = employed)
Note: Continuous independent variables were not measured in this scenario.
The null hypothesis for the overall model fit: the overall model does not predict re-arrest; that is, the
independent variables as a group are not related to being re-arrested. (For the individual independent
variables: none of the separate independent variables is related to the likelihood of re-arrest.)
The alternative hypothesis for the overall model fit: the overall model predicts the likelihood of re-arrest.
(For the individual independent variables: having committed a felony (vs. a misdemeanor), not
completing high school, not completing a rehab program, and being unemployed are related to the
likelihood of being re-arrested.)
Logistic regression was applied to the data in SPSS, since the dependent variable is categorical
(dichotomous) and the researcher examines the odds ratio of being re-arrested vs. not being re-arrested.
Omnibus tests of model coefficients
                  Chi-Square   df   Sig.
Step 1   Step       41.155      4   .000
         Block      41.155      4   .000
         Model      41.155      4   .000
The table shows the "Omnibus Tests of Model Coefficients", based on the chi-square test. It implies that
the overall model is predictive of re-arrest (the focus is on row three, "Model"): χ²(4) = 41.155,
p < .001, so the null hypothesis can be rejected. The null hypothesis being tested is that the model, i.e.
the group of independent variables taken together, does not predict the likelihood of being re-arrested.
This result means that the model with the predictors fits the data better than a model without them.
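The Sig. value is printed as .000 because it is rounded to three decimals. As a quick check, assuming the model chi-square is referred to a chi-square distribution with 4 degrees of freedom, the upper-tail probability has the closed form exp(-x/2)(1 + x/2) for this particular number of degrees of freedom. The function name below is illustrative.

```python
import math

def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square variable with exactly 4 degrees of freedom."""
    half = x / 2.0
    return math.exp(-half) * (1.0 + half)

model_chi_square = 41.155  # "Model" row of the omnibus table
p_value = chi2_sf_df4(model_chi_square)
print(f"p = {p_value:.2e}")  # well below .001, hence displayed as .000
```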
Variables in the equation
                           B      S.E.    Wald     df   Sig.   Exp(B)
Step 1   felony          0.283   0.142    3.997    1   .046   1.327
         high school     0.023   0.138    0.028    1   .867   1.023
         rehab          -0.679   0.142   22.725    1   .000   0.507
         employ         -0.513   0.142   13.031    1   .000   0.599
         Constant        1.035   0.154   45.381    1   .000   2.816
One can also reject the null hypothesis that the B coefficients for having committed a felony, completing a
rehab program, and being employed are equal to zero: they are statistically significant and predictive of
re-arrest. Education level, however, was not found to be predictive of re-arrest. Controlling for other
variables, having committed a felony for the first offense increases the odds of being re-arrested by 33%
(p = .046), compared to having committed a misdemeanor. Completing a rehab program and being employed
after the first offense decrease the odds of re-arrest, by about 49% and 40% respectively (p < .001).
The last column, Exp(B) (obtained by taking the inverse natural log of B, i.e. e raised to the power B),
gives the odds ratio. The odds are the probability of an event occurring divided by the probability of the
event not occurring, and the odds ratio is the factor by which the odds change when the independent
variable increases by one unit. An Exp(B) value over 1.0 signifies that the independent variable increases
the odds of the dependent variable occurring. An Exp(B) under 1.0 signifies that the independent variable
decreases the odds of the dependent variable occurring, given the coding described in the variable list above.
A negative B coefficient results in an Exp(B) less than 1.0, and a positive B coefficient results in an
Exp(B) greater than 1.0. The statistical significance of each B is tested by the Wald chi-square, which tests
the null hypothesis that the B coefficient equals 0 (the alternative hypothesis is that it does not equal 0).
p-values lower than alpha are significant, leading to rejection of the null. Here, only the independent
variables felony, rehab and employment are significant (p-value < 0.05). Examining the odds ratio of being
re-arrested vs. not re-arrested means comparing the odds (re-arrested = 1 in the numerator, re-arrested = 0
in the denominator) for one group with the odds for a baseline group, for example the felony group compared
to the baseline misdemeanor group. Exp(B) = 1.327 for "felony" indicates that having committed a felony
vs. a misdemeanor increases the odds of re-arrest by 33%. For "rehab", Exp(B) = 0.507 indicates that having
completed rehab reduces the odds of being re-arrested by about 49%.
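The conversion from B to a percent change in the odds can be sketched as follows, using the B column and the constant from the "Variables in the equation" table. This is an illustration under stated assumptions rather than part of the original analysis: the model is assumed to contain only the four listed main effects, and the names `percent_change_in_odds` and `predicted_probability` are hypothetical helpers introduced here.

```python
import math

# B coefficients and constant copied from the "Variables in the equation" table of Example 2.
b = {"felony": 0.283, "high_school": 0.023, "rehab": -0.679, "employ": -0.513}
constant = 1.035

def percent_change_in_odds(coef):
    """Percent change in the odds for a one-unit increase: (Exp(B) - 1) * 100."""
    return (math.exp(coef) - 1.0) * 100.0

def predicted_probability(**client):
    """Probability of re-arrest for a client given 0/1 indicators (omitted ones are 0)."""
    logit = constant + sum(b[name] * value for name, value in client.items())
    return 1.0 / (1.0 + math.exp(-logit))

for name, coef in b.items():
    change = percent_change_in_odds(coef)
    print(f"{name:<12} Exp(B) = {math.exp(coef):.3f}  change in odds = {change:+.1f}%")

# Odds versus probability for an otherwise baseline client.
p_baseline = predicted_probability()       # all indicators 0
p_rehab = predicted_probability(rehab=1)   # completed rehab, everything else 0
print(f"baseline p = {p_baseline:.3f}, with rehab p = {p_rehab:.3f}")
```

Under these assumptions, an Exp(B) of 0.507 corresponds to a reduction of roughly 49% in the odds, while the predicted probability for an otherwise baseline client falls by a smaller proportion. This is why statements about odds should not be read as statements about probabilities.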
See also
Logistic regression § Introduction
Likelihood-ratio test
Neyman–Pearson lemma
External links
http://www.math.yorku.ca/Who/Faculty/Monette/Ed-stat/0525.html
http://www.stat.umn.edu/geyer/aster/short/examp/reg.html
http://www.nd.edu/~rwilliam/xsoc63993/
http://www.sjsu.edu/people/edward.cohen/courses/c2/s1/Week_15_handout.pdf
Retrieved from "https://en.wikipedia.org/w/index.php?title=Omnibus_test&oldid=1089654194"
This page was last edited on 25 May 2022, at 00:20 (UTC).