Proceedings of 23rd International Business Research Conference
18 - 20 November, 2013, Marriott Hotel, Melbourne, Australia, ISBN: 978-1-922069-36-8
Analytical Formula for Model Selection Probability
M. Shafiqur Rahman* and Syfun Nahar**
At present, many different information criteria are used for choosing a better model from competing alternative models in applied Economics, Econometrics and Statistics. Correct selection probabilities are usually used for comparing the performances of different information criteria and choosing a good criterion. For almost all existing procedures, correct model selection probabilities have been studied empirically. In this paper an analytical formula for the probability of correct selection in the context of the linear regression model is developed. It is applied to find correct selection probabilities and to compare the performances of some commonly used information criteria. It is observed that when the lower-parameter model is true, BIC performs best and the other criteria follow in the order HQC, JIC, AIC, $S_p$, GCV, $C_p$ and $\bar{R}^2$. When the higher-parameter model is true, the performances of BIC, HQC, JIC, AIC, $S_p$, GCV, $C_p$ and $\bar{R}^2$ are exactly reversed.
Field of Research: Econometrics, Applied Economics
JEL Codes: B23, C10 and C52
1. Introduction
Model selection criteria play an important role in applied Economics, Econometrics and Statistics. A large number of model selection criteria are now available in the literature, including Akaike's (1973) information criterion (AIC), Schwarz's (1978) Bayesian information criterion (BIC), Theil's (1961) adjusted $\bar{R}^2$ (RBAR) criterion, Craven and Wahba's (1979) generalized cross-validation (GCV) criterion, Hannan and Quinn's (1979) criterion (HQC), Hocking's (1976) $S_p$ criterion, Rahman and King's (1999) joint information criterion (JIC), and Mallows' (1964) $C_p$ criterion, all for choosing an appropriate model from a number of competing alternative models for a particular data set. The performance of any criterion varies from situation to situation; none of them is better in all situations. We therefore need to compare the available criteria with each other to investigate which one performs better in which situation.
Mills and Prasad (1992), Fox (1995) and many other authors have studied the performances of model selection criteria. Mills and Prasad compared the performances of some criteria in a number of situations: robustness to collinearity among regressors, to distributional assumptions, and to nonstationarity in time series. Fox expressed some model selection criteria as penalized log-likelihood functions, ranked them in terms of the penalties paid for the addition of an extra parameter, and interpreted them as $\chi^2$-statistics.
_________________________
* M. Shafiqur Rahman, Department of Operations Management and Business Statistics, College of Economics and Political Science, Sultan Qaboos University, Muscat, Sultanate of Oman.
** Syfun Nahar, Department of Mathematics and Statistics, College of Science, Sultan Qaboos University, Muscat, Sultanate of Oman.
We have compared some commonly used criteria on the basis of probabilities of correct selection. In almost all previous research on the small-sample properties of model selection procedures, selection probabilities were computed empirically. For example, King et al. (1996), Forbes et al. (1995) and Grose and King (1994) estimated probabilities of correct selection using Monte Carlo techniques in the general model selection problem. The level of estimation error depends on the number of replications and also on the probability being estimated. In this paper an analytical formula for the probability of correct selection is obtained. Using this formula, the probabilities of correct selection are computed for different IC procedures. It is observed that when the model with the lower number of regressors is true, BIC performs better than HQC, HQC better than JIC, JIC better than AIC, AIC better than $S_p$, $S_p$ better than GCV, GCV better than $C_p$, and $C_p$ better than the $\bar{R}^2$ criterion. On the other hand, when the model with the higher number of regressors is true, the performances of BIC, HQC, JIC, AIC, $S_p$, GCV, $C_p$ and $\bar{R}^2$ are exactly reversed.
The plan of this paper is as follows. Section 2 expresses eight model selection criteria in one common form based on the residual sum of squares (SS). Section 3 derives an analytical formula for finding the probability of correct selection and compares the eight model selection criteria on that basis. The final section contains some concluding remarks.
2. Expressing All Criteria in Residual SS form
Suppose we are interested in selecting a model from $m$ alternative regression models $M_1, M_2, \ldots, M_m$ for a given data set. Let the model $M_j$ ($j = 1, 2, \ldots, m$) be represented by
$$Y = X_j \beta_j + U_j \qquad (2.1)$$
where $Y$ is an $n \times 1$ vector of observations on the dependent variable, $X_j$ is an $n \times (k_j - 1)$ matrix of observations on the regressors, $\beta_j$ is a $(k_j - 1) \times 1$ vector of regression coefficients and $U_j$ is a vector of random disturbances following $N(0, \sigma_j^2 I)$. The log-likelihood function for the model $M_j$ is then given by
$$L_j(\beta_j, \sigma_j^2) = -\frac{n}{2}\left[\ln \sigma_j^2 + \ln(2\pi) + \frac{1}{n\sigma_j^2}\,(Y - X_j\beta_j)'(Y - X_j\beta_j)\right]. \qquad (2.2)$$
The log-likelihood can be regarded as an estimator of the expected log-likelihood. The mean expected log-likelihood is the mean, with respect to the data, of the expected log-likelihood of the maximum likelihood model, and is a measure of the goodness of fit of a model. The model with the largest mean expected log-likelihood can be considered the best model. The mean expected log-likelihood can be estimated by the maximum log-likelihood, which for the jth model is given by
$$L_j(\hat{\beta}_j, \hat{\sigma}_j^2) = -\frac{n}{2}\left[\ln \hat{\sigma}_j^2 + \ln(2\pi) + 1\right] \qquad (2.3)$$
where $\hat{\sigma}_j^2 = E_j^2/n$ is the maximum likelihood estimator (MLE) of $\sigma_j^2$, $E_j^2 = (Y - X_j\hat{\beta}_j)'(Y - X_j\hat{\beta}_j)$ is the residual sum of squares and $\hat{\beta}_j = (X_j'X_j)^{-1}X_j'Y$.
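As a quick illustration of (2.2) and (2.3), the following sketch (ours, not the authors'; the simulated design and variable names are hypothetical) verifies numerically that the log-likelihood evaluated at the MLE collapses to the concentrated form, so ranking models by maximized log-likelihood is the same as ranking them by $E_j^2$.

```python
import numpy as np

# A numerical check (ours, with a hypothetical simulated design) that the
# log-likelihood (2.2) evaluated at the MLE equals the concentrated form (2.3),
# so ranking models by maximized log-likelihood is ranking them by E_j^2.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]          # (X'X)^{-1} X'y
resid = y - X @ beta_hat
E2 = resid @ resid                                       # residual sum of squares
sigma2_hat = E2 / n                                      # MLE of sigma_j^2

L_direct = -(n / 2) * (np.log(sigma2_hat) + np.log(2 * np.pi)
                       + E2 / (n * sigma2_hat))          # equation (2.2) at the MLE
L_concentrated = -(n / 2) * (np.log(sigma2_hat) + np.log(2 * np.pi) + 1)  # (2.3)
assert np.isclose(L_direct, L_concentrated)
```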
Unfortunately, the maximum log-likelihood has a general tendency to overestimate the true value of the mean expected log-likelihood. This tendency is more prominent for models with a large number of parameters, and it implies that if we choose the model with the largest maximum log-likelihood, a model with an unnecessarily large number of parameters is likely to be chosen. It is evident from (2.3) that choosing the model with the largest maximum log-likelihood is equivalent to choosing the model with the smallest residual sum of squares $E_j^2$. Therefore, the model with the smallest $E_j^2$ can be considered the best model. But again, if we choose the model with the smallest $E_j^2$, a model with an unnecessarily large number of parameters is likely to be chosen. To overcome this problem we need some adjustment to $E_j^2$ before using it for model selection. This is done by using a penalty function that depends on, among other things, the number of parameters. Let $P_j$ be the penalty function for the model $M_j$. Then we select the model with the smallest $I_j$ given by
$$I_j = E_j^2 P_j. \qquad (2.4)$$
That is, if
$$E_j^2 P_j < E_i^2 P_i, \quad \forall\, i = 1, 2, \ldots, (j-1), (j+1), \ldots, m, \qquad (2.5)$$
then the model $M_j$ is our choice of the best model.
All existing model selection criteria can be expressed in the above form.
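To make the general rule (2.4)-(2.5) concrete, here is a minimal sketch in Python. It is our illustration, not code from the paper: the helper names (`rss`, `select_model`), the simulated data and the use of AIC's penalty from Section 2(f) are all assumptions, and for simplicity $k_j$ is taken to be the number of columns of $X_j$.

```python
import numpy as np

# A minimal sketch of the selection rule (2.4)-(2.5): fit each candidate model by
# OLS, multiply its residual sum of squares E_j^2 by a penalty P_j, and keep the
# model with the smallest I_j = E_j^2 * P_j. Helper names, the simulated data and
# the choice of AIC's penalty are our illustrative assumptions.

def rss(X, y):
    """Residual sum of squares E_j^2 from an OLS fit of y on X."""
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_hat
    return resid @ resid

def aic_penalty(n, k):
    """P_j for AIC (see Section 2(f)): e^{2 k_j / n}."""
    return np.exp(2 * k / n)

def select_model(candidates, y, penalty):
    """Return the index of the candidate minimising I_j = E_j^2 * P_j(n, k_j)."""
    n = len(y)
    I = [rss(X, y) * penalty(n, X.shape[1]) for X in candidates]
    return int(np.argmin(I))

rng = np.random.default_rng(1)
n = 40
Z = rng.normal(size=(n, 4))
y = 1.5 * Z[:, 0] - 2.0 * Z[:, 1] + rng.normal(size=n)   # true model: 2 regressors
candidates = [Z[:, :1], Z[:, :2], Z[:, :3], Z]           # nested candidate models
print("selected model index:", select_model(candidates, y, aic_penalty))
```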
(a) Theil's adjusted $\bar{R}^2$ criterion
Theil (1961) suggested the adjusted $\bar{R}^2$ criterion for model comparison, given by
$$\bar{R}^2 = 1 - \frac{nE_j^2}{(n - k_j)\,S_y^2}, \qquad (2.6)$$
where $S_y^2$ is the total sum of squares of the dependent variable, which is the same for all candidate models.
This criterion selects the model with the largest $\bar{R}^2$. The probability of correct selection under this criterion when the model $M_j$ is true can be written as
$$P(CS \mid M_j, \bar{R}^2) = P\left[\frac{nE_j^2}{n-k_j} \le \frac{nE_i^2}{n-k_i};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right]. \qquad (2.7)$$
Equation (2.7) implies that for model selection the $\bar{R}^2$ criterion can be written in the equivalent form
$$\bar{R}^2 \equiv \frac{nE_j^2}{n-k_j}. \qquad (2.8)$$
Therefore (2.8) is a particular case of (2.4) with $P_j = n/(n-k_j)$; that is, the $\bar{R}^2$ criterion is a special case of (2.4). Note also that using the $\bar{R}^2$ criterion we select the model with the smallest estimated error variance. Theil (1961) showed that this criterion will select the true model at least as often as any other model. Later, Schmidt (1973, 1975) showed that the $\bar{R}^2$ criterion will not help in selecting the true model from a set of competing alternative regression models when a rival model contains all the variables of the true model together with some extra, irrelevant regressors.
(b) Mallows' $C_p$ criterion
Mallows (1964, 1973) proposed a criterion for model selection which can be written for the jth model as
$$C_p = \frac{(n+k_j)\,E_j^2}{n-k_j}. \qquad (2.9)$$
Rothman (1968), Akaike (1969) and Amemiya (1980) have also suggested this criterion: Rothman called it $J_p$, Akaike called it the Final Prediction Error (FPE) and Amemiya called it the Prediction Criterion (PC). The $C_p$ criterion is therefore a special case of (2.4) with $P_j = \dfrac{n+k_j}{n-k_j}$.
(c) Hocking's $S_p$ criterion
Hocking (1976) suggested the $S_p$ criterion for model selection, given by
$$S_p = \frac{E_j^2}{(n-k_j)(n-k_j-1)}. \qquad (2.10)$$
Therefore the $S_p$ criterion is a special case of (2.4) with $P_j = \dfrac{1}{(n-k_j)(n-k_j-1)}$.
(d) Generalised Cross-Validation (GCV) criterion
Craven and Wahba (1979) proposed the Generalised Cross-Validation (GCV) criterion, which can be written as
$$GCV = \frac{E_j^2}{\left(1 - \dfrac{k_j}{n}\right)^2}. \qquad (2.11)$$
Therefore the GCV criterion is a special case of (2.4) with $P_j = \dfrac{1}{(1 - k_j/n)^2}$.
(e) Hannan and Quinn criterion (HQC)
Hannan and Quinn (1979) suggested a model selection criterion which can be expressed as
$$HQC = E_j^2\,(\ln n)^{2k_j/n}. \qquad (2.12)$$
Therefore the HQC is a special case of (2.4) with $P_j = (\ln n)^{2k_j/n}$.
(f) Akaike information criterion (AIC)
Akaike (1973) proposed a model selection criterion, usually denoted AIC, which can be expressed as
$$AIC = L_j(\hat{\beta}_j, \hat{\sigma}_j^2) - k_j. \qquad (2.13)$$
The probability of correct selection under AIC when the model $M_j$ is true can be written as
$$\begin{aligned}
P(CS \mid M_j, AIC) &= P\left[L_j(\hat{\beta}_j, \hat{\sigma}_j^2) - k_j \ge L_i(\hat{\beta}_i, \hat{\sigma}_i^2) - k_i;\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right] \\
&= P\left[-\tfrac{n}{2}\left(\ln \hat{\sigma}_j^2 + \ln(2\pi) + 1\right) - k_j \ge -\tfrac{n}{2}\left(\ln \hat{\sigma}_i^2 + \ln(2\pi) + 1\right) - k_i;\ i \ne j\right] \\
&= P\left[-\tfrac{n}{2}\ln \hat{\sigma}_j^2 - k_j \ge -\tfrac{n}{2}\ln \hat{\sigma}_i^2 - k_i;\ i \ne j\right] \\
&= P\left[E_j^2\, e^{2k_j/n} \le E_i^2\, e^{2k_i/n};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right]. \qquad (2.14)
\end{aligned}$$
Therefore the AIC criterion can be written in the equivalent form
$$AIC \equiv E_j^2\, e^{2k_j/n}, \qquad (2.15)$$
and AIC is a special case of (2.4) with $P_j = e^{2k_j/n}$.
(g) Schwarz Bayesian Information Criterion (BIC)
Schwarz (1978) proposed the Bayesian information criterion, usually denoted BIC, which can be expressed as
$$BIC = L_j(\hat{\beta}_j, \hat{\sigma}_j^2) - \frac{k_j}{2}\ln n. \qquad (2.16)$$
The probability of correct selection under BIC when the model $M_j$ is true can be written as
$$\begin{aligned}
P(CS \mid M_j, BIC) &= P\left[L_j(\hat{\beta}_j, \hat{\sigma}_j^2) - \tfrac{k_j}{2}\ln n \ge L_i(\hat{\beta}_i, \hat{\sigma}_i^2) - \tfrac{k_i}{2}\ln n;\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right] \\
&= P\left[-\tfrac{n}{2}\left(\ln \hat{\sigma}_j^2 + \ln(2\pi) + 1\right) - \tfrac{k_j}{2}\ln n \ge -\tfrac{n}{2}\left(\ln \hat{\sigma}_i^2 + \ln(2\pi) + 1\right) - \tfrac{k_i}{2}\ln n;\ i \ne j\right] \\
&= P\left[-\tfrac{n}{2}\ln \hat{\sigma}_j^2 - \tfrac{k_j}{2}\ln n \ge -\tfrac{n}{2}\ln \hat{\sigma}_i^2 - \tfrac{k_i}{2}\ln n;\ i \ne j\right] \\
&= P\left[E_j^2\, n^{k_j/n} \le E_i^2\, n^{k_i/n};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right]. \qquad (2.17)
\end{aligned}$$
Hence the BIC criterion can be written in the equivalent form
$$BIC \equiv E_j^2\, n^{k_j/n}, \qquad (2.18)$$
and BIC is a special case of (2.4) with $P_j = n^{k_j/n}$.
(h) Joint Information Criterion (JIC)
Rahman and King (1999) proposed the joint information criterion, usually denoted JIC, which can be expressed as
$$JIC = L_j(\hat{\beta}_j, \hat{\sigma}_j^2) - \frac{1}{4}\left[k_j \ln n - n \ln\left(1 - \frac{k_j}{n}\right)\right]. \qquad (2.19)$$
The probability of correct selection under JIC when the model $M_j$ is true can be written as
$$\begin{aligned}
P(CS \mid M_j, JIC) &= P\Big[L_j(\hat{\beta}_j, \hat{\sigma}_j^2) - \tfrac{1}{4}\big(k_j \ln n - n\ln(1 - \tfrac{k_j}{n})\big) \\
&\qquad \ge L_i(\hat{\beta}_i, \hat{\sigma}_i^2) - \tfrac{1}{4}\big(k_i \ln n - n\ln(1 - \tfrac{k_i}{n})\big);\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\Big] \\
&= P\left[E_j^2\,(n-k_j)^{-1/2}\, n^{k_j/(2n)} \le E_i^2\,(n-k_i)^{-1/2}\, n^{k_i/(2n)};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right]. \qquad (2.20)
\end{aligned}$$
Hence the JIC criterion can be written in the equivalent form
$$JIC \equiv \frac{E_j^2\, n^{k_j/(2n)}}{(n-k_j)^{1/2}}, \qquad (2.21)$$
which is of the form (2.4) with $P_j = \dfrac{n^{k_j/(2n)}}{(n-k_j)^{1/2}}$. Therefore the AIC, BIC, HQC, $\bar{R}^2$, $S_p$, $C_p$, JIC and GCV criteria are all special cases of (2.4) and can be obtained by suitable choices of $P_j$.
Table 1: Values of $P_j$ for Some Model Selection Criteria

Criterion      $P_j$
AIC            $e^{2k_j/n}$
BIC            $n^{k_j/n}$
HQC            $(\ln n)^{2k_j/n}$
JIC            $n^{k_j/(2n)}/(n-k_j)^{1/2}$
$\bar{R}^2$    $n/(n-k_j)$
$S_p$          $1/[(n-k_j)(n-k_j-1)]$
GCV            $1/(1-k_j/n)^2$
$C_p$          $(n+k_j)/(n-k_j)$
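For experimentation, the penalties of Table 1 can be transcribed directly as functions of $n$ and $k_j$. The following is our transcription, a sketch rather than the authors' code; the dictionary keys are arbitrary labels.

```python
import numpy as np

# Our transcription of the penalties P_j in Table 1 as functions of the sample
# size n and the parameter count k_j. A sketch for experimentation, not the
# authors' code.
PENALTIES = {
    "AIC":   lambda n, k: np.exp(2 * k / n),
    "BIC":   lambda n, k: n ** (k / n),
    "HQC":   lambda n, k: np.log(n) ** (2 * k / n),
    "JIC":   lambda n, k: n ** (k / (2 * n)) / np.sqrt(n - k),
    "RBAR2": lambda n, k: n / (n - k),
    "Sp":    lambda n, k: 1.0 / ((n - k) * (n - k - 1)),
    "GCV":   lambda n, k: 1.0 / (1 - k / n) ** 2,
    "Cp":    lambda n, k: (n + k) / (n - k),
}

for name, P in PENALTIES.items():
    print(f"{name:6s} P_j(n=30, k_j=3) = {P(30, 3):.6f}")
```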
3. Probability of Correct Selection
Let CS denote the event of correct selection and $P(CS \mid M_j)$ the probability of correct selection when the model $M_j$ is true. Then
$$P(CS \mid M_j) = P\left[I_j - I_i < 0 \mid M_j;\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right] \qquad (3.1)$$
$$= P\left[E_j^2 P_j < E_i^2 P_i;\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right].$$
As $E_j^2$ is the residual sum of squares for the model $M_j$, $E_j^2/\sigma_j^2$ follows a chi-square distribution with $(n-k_j)$ degrees of freedom. Hence
$$P(CS \mid M_j) = P\left[\chi^2_{(n-k_j)}\,\sigma_j^2 P_j < \chi^2_{(n-k_i)}\,\sigma_i^2 P_i;\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right]$$
$$= P\left[\frac{\chi^2_{(n-k_j)}/(n-k_j)}{\chi^2_{(n-k_i)}/(n-k_i)} < \frac{(n-k_i)\,\sigma_i^2 P_i}{(n-k_j)\,\sigma_j^2 P_j};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right].$$
As the ratio of two independent chi-squares, each divided by its degrees of freedom, follows Snedecor's F-distribution,
$$P(CS \mid M_j) = P\left[F_{(n-k_j),(n-k_i)} < \frac{(n-k_i)\,\sigma_i^2 P_i}{(n-k_j)\,\sigma_j^2 P_j};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right]$$
$$= P\left[F_{(n-k_j),(n-k_i)} < \frac{\sigma_i^2}{\sigma_j^2}\, P_{ij};\ i = 1, 2, \ldots, (j-1), (j+1), \ldots, m\right], \quad \text{where } P_{ij} = \frac{(n-k_i)P_i}{(n-k_j)P_j}.$$
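For the two-model case ($m = 2$) the formula above can be evaluated directly from the F-distribution. The following sketch is our illustration of that computation; the function name, the BIC example values and the assumption $\sigma_i^2 = \sigma_j^2$ (var_ratio = 1) are ours.

```python
from scipy.stats import f

# A sketch (ours) of the analytical formula for the two-model case (m = 2):
# when M_j is true, P(CS | M_j) = P[F_{(n-k_j),(n-k_i)} < (sigma_i^2/sigma_j^2) P_ij]
# with P_ij = (n - k_i) P_i / ((n - k_j) P_j). Example values are hypothetical, and
# var_ratio = 1 assumes equal disturbance variances.
def prob_correct_selection(n, k_j, k_i, P_j, P_i, var_ratio=1.0):
    """P(CS | M_j) against a single rival model M_i."""
    P_ij = (n - k_i) * P_i / ((n - k_j) * P_j)
    return f.cdf(var_ratio * P_ij, n - k_j, n - k_i)

# Example: BIC penalties P = n**(k/n), with n = 30, true k_j = 3, rival k_i = 7.
n, kj, ki = 30, 3, 7
print(prob_correct_selection(n, kj, ki, n ** (kj / n), n ** (ki / n)))
```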
For Theil's adjusted $\bar{R}^2$, $P_{ij} = \dfrac{(n-k_i)\cdot n/(n-k_i)}{(n-k_j)\cdot n/(n-k_j)} = 1$. For Mallows' $C_p$, $P_{ij} = \dfrac{(n-k_i)(n+k_i)/(n-k_i)}{(n-k_j)(n+k_j)/(n-k_j)} = \dfrac{n+k_i}{n+k_j}$. The values of $P_{ij}$ for the other criteria can be obtained similarly and are presented in Table 2.
Table 2: Values of $P_{ij}$ for Some Model Selection Criteria

Criterion      $P_{ij}$
AIC            $\dfrac{n-k_i}{n-k_j}\, e^{2(k_i-k_j)/n}$
BIC            $\dfrac{n-k_i}{n-k_j}\, n^{(k_i-k_j)/n}$
HQC            $\dfrac{n-k_i}{n-k_j}\, (\ln n)^{2(k_i-k_j)/n}$
JIC            $\left(\dfrac{n-k_i}{n-k_j}\right)^{1/2} n^{(k_i-k_j)/(2n)}$
$\bar{R}^2$    $1$
$S_p$          $\dfrac{n-k_j-1}{n-k_i-1}$
GCV            $\dfrac{n-k_j}{n-k_i}$
$C_p$          $\dfrac{n+k_i}{n+k_j}$
3.1 Comparative Study Based on the Probability of Correct Selection
Since $P(CS \mid M_j)$ is increasing in $P_{ij}$, a larger value of $P_{ij}$ implies a larger probability of correct selection, so the criteria can be compared through their $P_{ij}$ values. If $k_i > k_j$, then for $C_p$, $P_{ij} = \dfrac{n+k_i}{n+k_j} > 1$, but if $k_i < k_j$, then $P_{ij} = \dfrac{n+k_i}{n+k_j} < 1$. Let $P_{cs}(C_p)$ be the probability of correct selection under Mallows' $C_p$ criterion. As $P_{ij}(\bar{R}^2) = 1$, it follows that
$$P_{cs}(C_p) > P_{cs}(\bar{R}^2) \text{ if } k_i > k_j, \quad\text{and}\quad P_{cs}(C_p) < P_{cs}(\bar{R}^2) \text{ if } k_i < k_j. \qquad (3.2)$$
Next, consider
$$P_{ij}(GCV) - P_{ij}(C_p) = \frac{n-k_j}{n-k_i} - \frac{n+k_i}{n+k_j} = \frac{(k_i-k_j)(k_i+k_j)}{(n-k_i)(n+k_j)},$$
which is positive if $k_i > k_j$ and negative if $k_i < k_j$. Therefore
$$P_{cs}(GCV) > P_{cs}(C_p) \text{ if } k_i > k_j, \quad\text{and}\quad P_{cs}(GCV) < P_{cs}(C_p) \text{ if } k_i < k_j. \qquad (3.3)$$
Similarly,
$$P_{ij}(S_p) - P_{ij}(GCV) = \frac{n-k_j-1}{n-k_i-1} - \frac{n-k_j}{n-k_i} = \frac{k_i-k_j}{(n-k_i)(n-k_i-1)},$$
which is positive if $k_i > k_j$ and negative if $k_i < k_j$. Therefore
$$P_{cs}(S_p) > P_{cs}(GCV) \text{ if } k_i > k_j, \quad\text{and}\quad P_{cs}(S_p) < P_{cs}(GCV) \text{ if } k_i < k_j. \qquad (3.4)$$
Similarly,
$$P_{ij}(AIC) - P_{ij}(S_p) = \frac{n-k_i}{n-k_j}\, e^{2(k_i-k_j)/n} - \frac{n-k_j-1}{n-k_i-1},$$
which is positive if $k_i > k_j$ and negative if $k_i < k_j$. Therefore
$$P_{cs}(AIC) > P_{cs}(S_p) \text{ if } k_i > k_j, \quad\text{and}\quad P_{cs}(AIC) < P_{cs}(S_p) \text{ if } k_i < k_j. \qquad (3.5)$$
Similarly,
$$P_{ij}(BIC) - P_{ij}(AIC) = \frac{n-k_i}{n-k_j}\, n^{(k_i-k_j)/n} - \frac{n-k_i}{n-k_j}\, e^{2(k_i-k_j)/n},$$
which is positive if $k_i > k_j$ and negative if $k_i < k_j$. Therefore
$$P_{cs}(BIC) > P_{cs}(AIC) \text{ if } k_i > k_j, \quad\text{and}\quad P_{cs}(BIC) < P_{cs}(AIC) \text{ if } k_i < k_j. \qquad (3.6)$$
Combining (3.2), (3.3), (3.4), (3.5) and (3.6) we have
$$P_{cs}(BIC) > P_{cs}(AIC) > P_{cs}(S_p) > P_{cs}(GCV) > P_{cs}(C_p) > P_{cs}(\bar{R}^2) \text{ if } k_i > k_j,$$
$$P_{cs}(BIC) < P_{cs}(AIC) < P_{cs}(S_p) < P_{cs}(GCV) < P_{cs}(C_p) < P_{cs}(\bar{R}^2) \text{ if } k_i < k_j.$$
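The two closed-form differences above (GCV versus $C_p$ and $S_p$ versus GCV) can be spot-checked numerically; the following sketch, with arbitrary test values of our choosing, confirms the algebra and the sign pattern.

```python
import numpy as np

# A numerical spot-check (ours, with arbitrary test values) of two of the pairwise
# differences used above:
#   P_ij(GCV) - P_ij(Cp)  = (k_i - k_j)(k_i + k_j) / ((n - k_i)(n + k_j))
#   P_ij(Sp)  - P_ij(GCV) = (k_i - k_j) / ((n - k_i)(n - k_i - 1))
# so both are positive exactly when k_i > k_j.
for n, kj, ki in [(20, 3, 7), (20, 7, 3), (60, 2, 4), (100, 4, 2)]:
    gcv = (n - kj) / (n - ki)
    cp = (n + ki) / (n + kj)
    sp = (n - kj - 1) / (n - ki - 1)
    assert np.isclose(gcv - cp, (ki - kj) * (ki + kj) / ((n - ki) * (n + kj)))
    assert np.isclose(sp - gcv, (ki - kj) / ((n - ki) * (n - ki - 1)))
    print(f"n={n}, k_j={kj}, k_i={ki}: differences positive? {gcv - cp > 0}")
```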
Therefore, when the model with the lower number of regressors is true, BIC performs better than AIC, AIC better than $S_p$, $S_p$ better than GCV, GCV better than $C_p$, and $C_p$ better than the $\bar{R}^2$ criterion. On the other hand, when the model with the higher number of regressors is true, the performances of BIC, AIC, $S_p$, GCV, $C_p$ and $\bar{R}^2$ are exactly reversed. The probabilities of correct selection for some hypothetical values of $n$, $k_i$, $k_j$, $\sigma_i^2$ and $\sigma_j^2$ are calculated and presented in Table 3.
Table 3: Probabilities of Correct Selection for Some Criteria

  n   k_j  k_i   BIC    HQC    JIC    AIC    S_p    GCV    C_p    R̄²
 20    3    7   0.988  0.976  0.976  0.972  0.895  0.889  0.847  0.768
 20    7    3   0.184  0.276  0.279  0.301  0.579  0.593  0.669  0.768
 20    2    4   0.931  0.910  0.907  0.904  0.832  0.828  0.814  0.768
 20    4    2   0.482  0.542  0.550  0.556  0.692  0.697  0.716  0.768
 20    3    4   0.868  0.852  0.850  0.847  0.803  0.801  0.791  0.768
 20    4    3   0.634  0.662  0.665  0.668  0.730  0.733  0.744  0.768
 30    3    7   0.966  0.946  0.942  0.933  0.850  0.847  0.827  0.768
 30    7    3   0.338  0.430  0.444  0.474  0.664  0.669  0.699  0.768
 30    2    4   0.899  0.878  0.873  0.866  0.809  0.807  0.801  0.768
 30    4    2   0.569  0.615  0.624  0.636  0.723  0.724  0.733  0.768
 30    3    4   0.843  0.829  0.826  0.822  0.789  0.789  0.784  0.768
 30    4    3   0.675  0.696  0.700  0.705  0.746  0.746  0.751  0.768
 40    3    7   0.945  0.921  0.915  0.903  0.828  0.827  0.815  0.768
 40    7    3   0.432  0.513  0.528  0.559  0.697  0.699  0.715  0.768
 40    2    4   0.878  0.857  0.853  0.845  0.798  0.797  0.793  0.768
 40    4    2   0.614  0.652  0.660  0.673  0.736  0.737  0.741  0.768
 40    3    4   0.829  0.816  0.814  0.809  0.784  0.783  0.781  0.768
 40    4    3   0.696  0.713  0.716  0.722  0.752  0.752  0.755  0.768
 60    3    7   0.913  0.886  0.881  0.866  0.807  0.807  0.801  0.768
 60    7    3   0.536  0.598  0.609  0.637  0.724  0.725  0.732  0.768
 60    2    4   0.852  0.834  0.830  0.821  0.788  0.787  0.785  0.768
 60    4    2   0.661  0.689  0.695  0.707  0.748  0.748  0.750  0.768
 60    3    4   0.813  0.803  0.801  0.796  0.778  0.778  0.777  0.768
 60    4    3   0.717  0.730  0.733  0.738  0.758  0.758  0.759  0.768
100    3    7   0.874  0.849  0.845  0.830  0.791  0.791  0.789  0.768
100    7    3   0.623  0.666  0.672  0.694  0.744  0.744  0.746  0.768
100    2    4   0.826  0.811  0.809  0.801  0.780  0.779  0.779  0.768
100    4    2   0.700  0.720  0.723  0.733  0.756  0.756  0.757  0.768
100    3    4   0.798  0.790  0.789  0.785  0.774  0.774  0.773  0.768
100    4    3   0.735  0.745  0.746  0.751  0.762  0.762  0.763  0.768
Based on the information in Table 3 we may conclude that in general
$$P_{cs}(BIC) > P_{cs}(HQC) > P_{cs}(JIC) > P_{cs}(AIC) > P_{cs}(S_p) > P_{cs}(GCV) > P_{cs}(C_p) > P_{cs}(\bar{R}^2) \text{ if } k_i > k_j,$$
$$P_{cs}(BIC) < P_{cs}(HQC) < P_{cs}(JIC) < P_{cs}(AIC) < P_{cs}(S_p) < P_{cs}(GCV) < P_{cs}(C_p) < P_{cs}(\bar{R}^2) \text{ if } k_i < k_j.$$
4. Concluding Remarks
We have developed an analytical formula for finding the probability of correct selection in regression model selection, and we have applied it to find the probabilities of correct selection under the AIC, BIC, JIC, GCV, HQC, $S_p$, $C_p$ and $\bar{R}^2$ criteria and to compare these criteria among themselves. The analytical formula is easy to apply and less time-consuming than Monte Carlo simulation. It is observed that when the lower-parameter model is true, BIC performs best and the other criteria follow in the order HQC, JIC, AIC, $S_p$, GCV, $C_p$ and $\bar{R}^2$. When the higher-parameter model is true, the performances of BIC, HQC, JIC, AIC, $S_p$, GCV, $C_p$ and $\bar{R}^2$ are exactly reversed.
It is well known that choosing an appropriate penalty function is the main problem in developing a new criterion. By calculating the probability of correct selection exactly, we expect to gain better control over the choice of penalty function.

At present many information criteria are used to choose between competing alternative models. If we rank the available criteria on the basis of probabilities of correct selection, the rankings vary dramatically from one model to another. We therefore need to develop a new criterion that gives a consistent probability of correct selection across all competing models. We are currently working on this problem and expect to report on it in a future paper.
References
Akaike, H 1969, 'Fitting autoregressive models for prediction', Annals of the Institute of Statistical Mathematics, 21, 243-247.
Akaike, H 1973, 'Information theory and an extension of the maximum likelihood principle', in BN Petrov and F Csaki (eds), Proceedings of the Second International Symposium on Information Theory, Akademiai Kiado, Budapest, 267-281.
Amemiya, T 1980, 'Selection of regressors', International Economic Review, 21, 331-354.
Craven, P and Wahba, G 1979, 'Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross validation', Numerische Mathematik, 31, 377-403.
Forbes, CS, King, ML and Morgan, A 1995, 'Small sample variable selection procedures', Proceedings of the 1995 Econometrics Conference at Monash, 343-360.
Fox, KJ 1995, 'Model selection criteria: a reference source', University of British Columbia and University of NSW, School of Economics, Sydney, Australia.
Grose, SD and King, ML 1994, 'The use of information criteria for model selection between models with equal number of parameters', paper presented at the 1994 Australian Meeting of the Econometric Society.
Hannan, EJ and Quinn, BG 1979, 'The determination of the order of an autoregression', Journal of the Royal Statistical Society, Series B, 41, 190-195.
Hocking, RR 1976, 'The analysis and selection of variables in linear regression', Biometrics, 32, 1-49.
King, ML 1981, 'The Durbin-Watson bounds test and regressions without an intercept', Australian Economic Papers, 20, 161-170.
King, ML, Forbes, CS and Morgan, A 1996, 'Improved small sample model selection procedures', paper presented at the World Congress of the Econometric Society, Tokyo.
Mallows, CL 1964, 'Choosing variables in a linear regression: a graphical aid', presented at the Central Regional Meeting of the Institute of Mathematical Statistics, Manhattan, Kansas (May).
Mallows, CL 1973, 'Some comments on Cp', Technometrics, 15, 661-676.
Mills, JA and Prasad, K 1992, 'A comparison of model selection criteria', Econometric Reviews, 11, 201-233.
Rahman, MS and King, ML 1999, 'Improved model selection criterion', Communications in Statistics - Simulation and Computation, 28(1), 51-71.
Rahman, MS and Nahar, S 2004, 'Generalized model selection criterion', Far East Journal of Theoretical Statistics, 12(2), 117-147.
Rothman, D 1968, 'Letter to the editor', Technometrics, 10, 432.
Schmidt, P 1973, 'Calculating the power of the minimum standard error choice criterion', International Economic Review, 14, 253-255.
Schmidt, P 1975, 'Choosing among alternative linear regression models: a correction and some further results', Atlantic Economic Journal, 3, 61-63.
Schwarz, G 1978, 'Estimating the dimension of a model', The Annals of Statistics, 6, 461-464.
Theil, H 1961, Economic Forecasts and Policy, 2nd edn, North-Holland, Amsterdam.