Estimating the Polychoric Correlation from Misclassified Data
by Choi-Fan Yiu and Wai-Yin Poon
British Journal of Mathematical and Statistical Psychology, 2008, Vol 61, p.49-74.

User guide for the Excel programmes described in the paper

Excel files have been developed to produce the results of the examples in the above paper. This user’s guide describes these Excel files, which use Visual Basic for Applications (VBA).

A. Excel Prerequisite Settings

As the Excel files contain macros, the user may need to consult a computer officer about the prerequisite Excel settings. We completed the following steps before opening any of the Excel files.

1. Open Microsoft Excel, select Tools → Macro → Security, and under Security Level select Medium → OK.
2. Select Tools → Add-Ins, tick Analysis ToolPak – VBA and Solver Add-in → OK.
3. Close Excel completely.

The settings made in these steps are saved on the user’s computer, so there is no need to repeat them the next time the Excel files are used.

B. The File “Model1_3x3.xls”

This file can be used to compute the estimates of the thresholds and the polychoric correlation of two variables with known misclassification probabilities. Example 1 in Section 2.3 of the paper is used for illustration.

Steps to obtain the estimates and standard errors of the thresholds and the polychoric correlation:

Step 1: Open the file “Model1_3x3.xls” and press Enable Macros in the message box. In the spreadsheet “Estimation”, the four main regions for data entry are highlighted in blue.

Step 2: Data Entry

Region 1: Cells E2 to E10
The observed cell frequencies (n11, n12, n13, n21, n22, n23, n31, n32, n33) should be input into these cells.
Example 1 in the paper: The frequencies (209, 101, 237, 151, 126, 426, 16, 21, 138) should be input into these cells.
Region 2: Cells E14 to E22
The values indexed by (u,v) that are defined in Section 2.1 (b) of the paper should be input into these cells in the order (u,v) = (1,1), (1,2), …, (3,3).
Example 1 in the paper: When a model of no misclassification is employed, this value equals 1 for all u and v, so 1 should be input into each of these cells.

Region 3: Cells F14 to N22
The values indexed by hk(uv) that are defined in Section 2.1 (c) of the paper should be input into these cells, one column for each (u,v). For example, the values for (u,v) = (1,1) should be input into cells F14 to F22 in the order (h,k) = (1,1), (1,2), …, (3,3). Accordingly, the values for (u,v) = (3,3) should be input in the same order into cells N14 to N22.
Example 1 in the paper: As the 3 x 3 array of these values for (u,v) = (1,1) has rows (0, 0.5, 0), (0.5, 0, 0) and (0, 0, 0), the values (0, 0.5, 0, 0.5, 0, 0, 0, 0, 0) should be input into cells F14 to F22.

Region 4: Cells C2 to C6
These cells store the values of the five estimates (the polychoric correlation and the two thresholds of each variable) in each iteration of the optimization procedure. The user can also input a chosen set of starting values in these cells. The values in the cells are updated upon the completion of Step 4.

Step 3: Press Ctrl + Q to compute the values indexed by hk(uv) that are defined in equation (3) of the paper. These values are shown in cells F2 to N10.

Step 4: Select Tools → Solver → Solve → OK to obtain the results.

Results
The ML estimates and standard errors of the parameters are shown in the green region.

Note
DO NOT delete or modify any cells other than those in the blue regions.

C. The Files “Model2_2x2.xls”, “Model2_2x3.xls” and “Model2_3x3.xls”

These files can be used to compute the estimates of the thresholds and the polychoric correlation of two variables based on double sampling data. They can be used to produce the results of the examples in Section 4 of the paper: the results in Example 2 (2 x 2 table), Example 3 (2 x 3 table) and Example 4 (3 x 3 table) can be produced by the files “Model2_2x2.xls”, “Model2_2x3.xls” and “Model2_3x3.xls” respectively.
As the structure of the three files is the same, we use “Model2_3x3.xls” and Example 4 to illustrate the basic steps.

Steps to obtain the estimates and standard errors of the thresholds and the polychoric correlation:

Step 1: Open the file “Model2_3x3.xls” and press Enable Macros in the message box. In the spreadsheet “3by3”, the main regions for data entry are highlighted in blue.

Step 2: Data Entry

Region 1: Cells E2 to M10
The observed cell frequencies nhk(uv) of the first sample should be input into these cells. For example, the cell frequencies (n11(11), n12(11), …, n33(11)) should be input into cells E2 to M2, and the cell frequencies (n11(33), n12(33), …, n33(33)) should be input into cells E10 to M10.
Example 4 in the paper: The observed frequencies (91, 12, 0, 2, 0, 0, 7, 0, 0) should be input into cells E2 to M2, and the observed frequencies (0, 0, 2, 0, 1, 1, 0, 1, 20) should be input into cells E10 to M10.

Region 2: Cells E12 to M12
The observed cell frequencies (n*11, n*12, …, n*33) of the second sample should be input into cells E12 to M12.
Example 4 in the paper: The observed cell frequencies of the second sample (227, 259, 63, 182, 378, 123, 40, 76, 52) should be input into cells E12 to M12.

Region 3: Cells C2 to C6
These cells store the values of the five estimates (the polychoric correlation and the two thresholds of each variable) in each iteration of the optimization procedure. The user can also input a chosen set of starting values in these cells. The values in the cells are updated upon the completion of Step 3.

Step 3: Select Tools → Solver → Solve → OK to obtain the results.

Results
The ML estimates and standard errors of the parameters are shown in the green region.

Note
DO NOT delete or modify any cells other than those in the blue regions.
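
Both procedures let the user enter starting values in cells C2 to C6, but this guide does not say how to choose them. The following is a minimal sketch of one common heuristic, which is an assumption and is not taken from the paper or from the Excel files: each threshold is started at the inverse standard normal CDF of the corresponding cumulative marginal proportion, and the correlation is started at 0. The function name starting_thresholds and the ordering of the printed values are illustrative only; the order in which the five cells expect their values should be checked against the labels in the spreadsheet.

```python
from statistics import NormalDist


def starting_thresholds(freqs, n_rows, n_cols):
    """Return (row_thresholds, col_thresholds) for a row-major frequency list.

    Hypothetical helper, not part of the Excel files described in this guide.
    """
    if len(freqs) != n_rows * n_cols:
        raise ValueError("freqs must contain n_rows * n_cols values")
    total = sum(freqs)
    # Marginal totals of the first variable (rows) and the second (columns).
    row_totals = [sum(freqs[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
    col_totals = [sum(freqs[c::n_cols]) for c in range(n_cols)]

    def cuts(totals):
        # Inverse normal CDF of each cumulative proportion, leaving out the
        # last category, whose cumulative proportion is 1. inv_cdf raises an
        # error if a cumulative proportion is exactly 0 or 1 (empty category).
        inv = NormalDist().inv_cdf
        running, out = 0, []
        for t in totals[:-1]:
            running += t
            out.append(inv(running / total))
        return out

    return cuts(row_totals), cuts(col_totals)


# Second-sample frequencies of Example 4, in the order n*11, n*12, ..., n*33.
second_sample = [227, 259, 63, 182, 378, 123, 40, 76, 52]
row_cuts, col_cuts = starting_thresholds(second_sample, 3, 3)

# Assumed layout: correlation first (started at 0), then the thresholds.
start = [0.0] + row_cuts + col_cuts
for value in start:
    print(round(value, 4))
```

The printed values can be typed into cells C2 to C6 before running Solver. Because they are only starting points, any reasonable choice should lead Solver to the same ML estimates when the optimization converges.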