
Estimating the Polychoric Correlation from Misclassified Data
by
Choi-Fan Yiu and Wai-Yin Poon
British Journal of Mathematical and Statistical Psychology, 2008, Vol. 61,
pp. 49-74.
User guide for the Excel programmes described in the paper
Excel files with Visual Basic for Applications (VBA) macros have been developed to
reproduce the results of the examples in the captioned paper. This user guide
describes these files in detail.
A. Excel Prerequisite Settings
As the Excel files contain macros, the user may need to consult a computer officer for
the prerequisite Excel settings. The following are the steps that we have completed
before opening a specific Excel file.
1. Open Microsoft Excel, select Tools → Macro → Security, and under Security
Level select Medium → OK.
2. Select Tools → Add-Ins, tick Analysis ToolPak – VBA and Solver Add-in, then click OK.
3. Close Excel completely.
The settings made in these steps are saved on the user’s computer, so the steps need
not be repeated the next time the Excel files are used.
B. The File “Model1_3x3.xls”
This file can be used to compute the estimate of the thresholds and the polychoric
correlation of two variables with known misclassification probabilities. Example 1 in
Section 2.3 of the paper is used for illustration.
Steps to obtain the estimate and standard error of the thresholds and the
polychoric correlation:
Step 1:
Open the file “Model1_3x3.xls” and press Enable Macros in the message box. In the
spreadsheet “Estimation”, the four main regions for data entry are highlighted in blue.
Step 2: Data Entry
Region 1: Cells E2 to E10
Observed cell frequencies (n11, n12, n13, n21, n22, n23, n31, n32, n33) should be input into
these cells.
Example 1 in the paper: The frequencies (209, 101, 237, 151, 126, 426, 16, 21, 138)
should be input into these cells.
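As a quick check before typing the data in, the nine frequencies can be arranged back into a 3 x 3 table and totalled. The sketch below is only an illustration in Python and is not part of the Excel files. The variable names are hypothetical, and treating the first subscript as the row index is an assumption about the layout.

```python
# Observed frequencies of Example 1, in the order n11, n12, ..., n33
# (the same order as cells E2 to E10).
freqs = [209, 101, 237, 151, 126, 426, 16, 21, 138]

# Rebuild the 3 x 3 table row by row (assumption: first subscript = row).
table = [freqs[3 * h: 3 * h + 3] for h in range(3)]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(freqs)

print(table)
print(row_totals, col_totals, n)
```

Comparing the totals with those of the published table is a simple way to catch a mistyped cell.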
Region 2: Cells E14 to E22
The values indexed by (u,v) that are defined in Section 2.1 (b) of the paper should be
input in the order of (u,v) = (1,1), (1,2), …, (3,3) into these cells.
Example 1 in the paper: When a model of no misclassification is employed, each of these
values equals 1 for all u and v, and 1 should be input into each of these cells.
Region 3: Cells F14 to N22
The values with subscript hk(uv) (see Section 2.1 (c) of the paper) should be input into
these cells. For example, the values with subscript hk(11) should be input in the order of
(h,k) = (1,1), (1,2), …, (3,3) into cells F14 to F22. Accordingly, the values with
subscript hk(33) should be input in the order of (h,k) = (1,1), (1,2), …, (3,3) into cells
N14 to N22.
Example 1 in the paper: As the 3 x 3 matrix of values with subscript hk(11) is
    0    0.5  0
    0.5  0    0
    0    0    0
the values (0, 0.5, 0, 0.5, 0, 0, 0, 0, 0) should be input into cells F14 to F22.
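The layout of Region 3 can be sketched as a small lookup from the subscript hk(uv) to a worksheet address. This is an illustrative Python sketch only. The helper `region3_cell` is hypothetical, and the order of the intermediate columns G to M is an assumption, since the guide states only that (u,v) = (1,1) uses column F and (u,v) = (3,3) uses column N.

```python
def region3_cell(h, k, u, v):
    """Hypothetical helper: address of the value with subscript hk(uv)."""
    # Columns F..N index (u,v); the order (1,1), (1,2), ..., (3,3) is assumed.
    col = "FGHIJKLMN"[3 * (u - 1) + (v - 1)]
    # Rows 14..22 index (h,k) in the order (1,1), (1,2), ..., (3,3).
    row = 14 + 3 * (h - 1) + (k - 1)
    return f"{col}{row}"

# The 3 x 3 matrix for (u,v) = (1,1) from Example 1.
matrix_11 = [[0, 0.5, 0],
             [0.5, 0, 0],
             [0, 0, 0]]

# Flatten the matrix row by row into the cells of column F.
entries = {region3_cell(h, k, 1, 1): matrix_11[h - 1][k - 1]
           for h in (1, 2, 3) for k in (1, 2, 3)}

for address, value in entries.items():
    print(address, value)
```

Reading the dictionary in insertion order gives the values for F14 to F22 in the same sequence as listed in the example.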
Region 4: Cells C2 to C6
These cells store the values of the five estimates (the polychoric correlation, followed
by the two thresholds of each variable) in each iteration of the optimization procedure.
The user can also input a chosen set of starting values in these cells. The values in the
cells are updated upon the completion of Step 4.
Step 3:
Press Ctrl + Q to obtain the values defined in equation (3) of the paper. These
values are shown in Cells F2 to N10.
Step 4:
Select Tools → Solver → Solve → OK to obtain the results.
Results
The ML estimate and the standard error of the parameters are shown in the green
region.
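For readers who want a rough cross-check outside Excel, the sketch below evaluates the standard polychoric log-likelihood for a 3 x 3 table with no misclassification. It is a simplified illustration under stated assumptions, not the Solver routine in the file and not necessarily the likelihood used in the paper. The thresholds are fixed at the normal quantiles of the marginal proportions, only the correlation is searched on a coarse grid, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def norm_cdf(x):
    """Standard normal CDF (also valid for +/- infinity)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bvn_cdf(a, b, rho, steps=1000):
    """P(X <= a, Y <= b) for a standard bivariate normal with correlation rho,
    by Simpson's rule applied to phi(x) * Phi((b - rho*x) / sqrt(1 - rho^2))."""
    if a == -math.inf or b == -math.inf:
        return 0.0
    if a == math.inf:
        return norm_cdf(b)
    if b == math.inf:
        return norm_cdf(a)
    lo = -8.0                       # numerical stand-in for minus infinity
    if a <= lo:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    h = (a - lo) / steps            # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * norm_pdf(x) * norm_cdf((b - rho * x) / s)
    return total * h / 3.0

def cell_probs(rho, row_cuts, col_cuts):
    """Rectangle probabilities of the latent bivariate normal for each cell."""
    xs = [-math.inf, *row_cuts, math.inf]
    ys = [-math.inf, *col_cuts, math.inf]
    F = [[bvn_cdf(x, y, rho) for y in ys] for x in xs]
    return [[F[h + 1][k + 1] - F[h][k + 1] - F[h + 1][k] + F[h][k]
             for k in range(len(ys) - 1)] for h in range(len(xs) - 1)]

def log_lik(rho, row_cuts, col_cuts, table):
    """Multinomial log-likelihood (constant dropped); tiny cells are floored."""
    probs = cell_probs(rho, row_cuts, col_cuts)
    return sum(n * math.log(max(p, 1e-12))
               for n_row, p_row in zip(table, probs)
               for n, p in zip(n_row, p_row))

# Example 1 frequencies (assumption: first subscript = row).
table = [[209, 101, 237], [151, 126, 426], [16, 21, 138]]
n = sum(map(sum, table))
col_tot = [sum(c) for c in zip(*table)]
row_cum = [sum(table[0]) / n, (sum(table[0]) + sum(table[1])) / n]
col_cum = [col_tot[0] / n, (col_tot[0] + col_tot[1]) / n]

inv = NormalDist().inv_cdf
row_cuts = [inv(p) for p in row_cum]    # thresholds fixed at marginal quantiles
col_cuts = [inv(p) for p in col_cum]

grid = [r / 100 for r in range(-90, 91, 5)]   # coarse grid for the correlation
best_rho = max(grid, key=lambda r: log_lik(r, row_cuts, col_cuts, table))
print(best_rho, row_cuts, col_cuts)
```

Because the thresholds are held fixed and the grid is coarse, the output is at best a plausible set of starting values for the cells in Region 4; the estimates and standard errors reported in the green region come from the full optimization in the Excel file.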
Note
DO NOT delete or modify any cells except the values in the blue regions.
C. The Files “Model2_2x2.xls”, “Model2_2x3.xls” and “Model2_3x3.xls”
These files can be used to compute the thresholds and the polychoric correlation of
two variables based on double sampling data. They can be used to produce the results
of the examples in Section 4 of the paper. The results in Example 2 (2 x 2 table),
Example 3 (2 x 3 table) and Example 4 (3 x 3 table) can be produced by the files
“Model2_2x2.xls”, “Model2_2x3.xls” and “Model2_3x3.xls” respectively. As the
structure of the three files is the same, we use “Model2_3x3.xls” and Example 4 to
illustrate the basic steps.
Steps to obtain the estimate and standard error of the thresholds and the
polychoric correlation:
Step 1:
Open the file “Model2_3x3.xls” and press Enable Macros in the message box. In the
spreadsheet “3by3”, two main regions are highlighted in blue for data entry.
Step 2: Data Entry
Region 1: Cells E2 to M10
The observed cell frequencies nhk(uv) of the first sample should be input into these cells.
For example, the cell frequencies (n11(11), n12(11), …, n33(11)) should be input into cells
E2 to M2, and the cell frequencies (n11(33), n12(33), …, n33(33)) should be input into
cells E10 to M10.
Example 4 in the paper: The observed frequencies (91, 12, 0, 2, 0, 0, 7, 0, 0) should be
input into cells E2 to M2, and the observed frequencies (0, 0, 2, 0, 1, 1, 0, 1, 20)
should be input into cells E10 to M10.
Region 2: Cells E12 to M12
The observed cell frequencies (n*11, n*12, …, n*33) of the second sample should be
input into cells E12 to M12.
Example 4 in the paper: The observed cell frequencies of the second sample (227,
259, 63, 182, 378, 123, 40, 76, 52) should be input into cells E12 to M12.
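The data-entry layout of the two regions can be sketched as a mapping from values to worksheet addresses. The Python below is an illustration only. The helper names are hypothetical, and the row order for the intermediate (u,v) pairs is an assumption, since the guide states only that (1,1) goes in row 2 and (3,3) in row 10. Only the two first-sample rows quoted above are included; the remaining rows have to be taken from the paper.

```python
COLUMNS = "EFGHIJKLM"   # one column per (h,k) = (1,1), (1,2), ..., (3,3)

def first_sample_row(u, v):
    """Hypothetical helper: worksheet row (2..10) for index (u,v).
    The order (1,1), (1,2), ..., (3,3) is assumed."""
    return 2 + 3 * (u - 1) + (v - 1)

def fill_row(row, values):
    """Pair nine values with the cells E<row> to M<row>."""
    return {f"{col}{row}": val for col, val in zip(COLUMNS, values)}

cells = {}
# Region 1: the two first-sample rows quoted in Example 4.
cells.update(fill_row(first_sample_row(1, 1), [91, 12, 0, 2, 0, 0, 7, 0, 0]))
cells.update(fill_row(first_sample_row(3, 3), [0, 0, 2, 0, 1, 1, 0, 1, 20]))

# Region 2: the second sample in row 12.
second_sample = [227, 259, 63, 182, 378, 123, 40, 76, 52]
cells.update(fill_row(12, second_sample))
second_total = sum(second_sample)

print(cells["E2"], cells["M10"], cells["E12"], second_total)
```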
Region 3: Cells C2 to C6
These cells store the values of the five estimates (the polychoric correlation, followed
by the two thresholds of each variable) in each iteration of the optimization procedure.
The user can also input a chosen set of starting values in these cells.
The values in the cells are updated upon the completion of Step 3.
Step 3:
Select Tools → Solver → Solve → OK to obtain the results.
Results
The ML estimate and the standard error of the parameters are shown in the green
region.
Note
DO NOT delete or modify any cells except the values in the blue regions.