Solution

BOLGATANGA POLYTECHNIC
School of Applied Science and Arts
Department of Statistics
End of First Semester Examination: 2013/2014
(December 2013)
Course: Statistical Computing I (STA217)
Marking scheme
Software: QBASIC/JUST BASIC & MICROSOFT OFFICE EXCEL
Q1
(a)
input "Sample size: "; n
input "Hypothesized variance: "; h
for i = 1 to n
input "x";i; " = "; x: x(i)= x
sum = sum + x(i)
next i
ave=sum/n
cls
print "THE SUM = "; sum
print
print "MEAN = "; ave
' THIS PORTION CALCULATES AND PRINTS
' THE VARIANCE
for i = 1 to n
diff = x(i) - ave
s = diff^2
sum2 = sum2 + s
next i
D = (sum2/(n-1))
print
print "THE VAR = "; D
' Chi-square statistic
chi = (D/h)*(n-1)
print
print "Chi(";(n-1);")"; " = "; chi
end
Marks for Q1(a): eighteen statements at ½ mark each and one statement at 1 mark (10 marks in total).
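As a cross-check on the Q1(a) listing, the same calculation can be sketched in Python. This is only an illustrative sketch, not part of the marking scheme: the function name and the sample data are hypothetical, and `statistics.variance` is used because it divides by n - 1, as the listing does when it computes D.

```python
from statistics import mean, variance


def chi_square_for_variance(sample, hypothesized_variance):
    """Return (mean, sample variance, chi-square statistic) for one sample."""
    n = len(sample)
    ave = mean(sample)
    var = variance(sample)  # divides by n - 1, like D = sum2/(n-1)
    chi = (var / hypothesized_variance) * (n - 1)
    return ave, var, chi


# Hypothetical data, used only to exercise the function.
data = [2, 4, 4, 4, 5, 5, 7, 9]
ave, var, chi = chi_square_for_variance(data, 4)
print("MEAN =", ave)
print("THE VAR =", var)
print("Chi(%d) = %s" % (len(data) - 1, chi))
```

The statistic has n - 1 degrees of freedom, which is why the listing prints `Chi(n-1)`.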
(b)
Advantages of array formulae (any three of the following)
3 marks
i. Array formulae are concise: you can eliminate intermediate columns and rows by packing the calculations into a single array formula.
ii. Array formulae are powerful: you can easily perform complex calculations, such as multiple-condition sums and counts.
iii. Array formulae save disk space: one array formula covering a range of cells can reduce workbook size compared with equivalent individual formulae, although memory use does not usually fall significantly.
iv. Array formulae offer increased protection: Excel only allows a complete array formula block to be altered, so the user cannot accidentally change a single formula within it.
Disadvantages of array formulae (any two of the following)
2 marks
i. The black-box effect: array formulae can be complex and hard to understand, and most Excel users do not understand them at all. This can reduce confidence in, and the usability of, your spreadsheet.
ii. Calculation overhead: each time an array formula is calculated, all of the virtual cells it needs are calculated, whether or not this is required. This may make the array formula slower than an equivalent set of non-array formulae.
iii. All components of an array formula must be the same size: this may force the array formula to perform a large number of unnecessary calculations.
(c)
The student is expected to design an Excel template similar to the figure below:
10 marks
Q2.
(a)
Worksheet layout (rows 3 to 12, columns B to F):

Row  B                    C      D      E             F
3    product              KB     KR
4    barrels to produce   18     24                   total profit
5    profit per barrel    ¢20    ¢30                  ¢1,080.0
9    ingredients          KB     KR     total usage   qty available
10   corn                 20            360           500
11   hops                 2      1      60            60
12   malt                 10     30     900           900
6 marks
Objective function:
F5 = SUMPRODUCT(C4:D4,C5:D5)
2 marks
Decision variables:
E10 - amount of corn required for a given production
E11 - amount of hops required for a given production
E12 - amount of malt required for a given production
E10 = SUMPRODUCT($C$4:$D$4,C10:D10) - the same formula is applied to E11 and E12
2 marks
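The two SUMPRODUCT formulae can be checked with a short Python sketch. The cell values used here are an interpretation of the flattened worksheet figure (barrels 18 and 24, profits ¢20 and ¢30, and the blank corn cell for KR treated as 0), so they should be read as assumptions; they were chosen because they reproduce the printed totals.

```python
def sumproduct(xs, ys):
    """Sum of pairwise products, like Excel's SUMPRODUCT on two ranges."""
    return sum(x * y for x, y in zip(xs, ys))


# Assumed reading of the worksheet (KB first, KR second).
barrels = [18, 24]            # C4:D4
profit_per_barrel = [20, 30]  # C5:D5
usage_per_barrel = {          # C10:D12, blank cell taken as 0
    "corn": [20, 0],
    "hops": [2, 1],
    "malt": [10, 30],
}
available = {"corn": 500, "hops": 60, "malt": 900}

total_profit = sumproduct(barrels, profit_per_barrel)        # F5
total_usage = {name: sumproduct(barrels, per_barrel)         # E10:E12
               for name, per_barrel in usage_per_barrel.items()}
feasible = all(total_usage[k] <= available[k] for k in available)

print("total profit:", total_profit)
print("total usage:", total_usage)
print("within available quantities:", feasible)
```

Under this reading, hops and malt are used up completely while corn is left over.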
(b)
INPUT A, B              ½ mark
IF A > B THEN           ½ mark
GOTO [GREATER]          ½ mark
END IF                  ½ mark
Y = A + B               ½ mark
GOTO [QUIT]             ½ mark
[GREATER]               ½ mark
Y = A * B               ½ mark
[QUIT]                  ½ mark
PRINT "Y = "; Y         ½ mark
END                     ½ mark
Q3
(a)
INPUT N                 ½ mark
K = 1                   ½ mark
J = 1                   ½ mark
[START]                 ½ mark
T = 1/K                 ½ mark
S = S + T * J           ½ mark
K = K + 2               ½ mark
J = 0 - J               ½ mark
IF K <= N THEN          ½ mark
GOTO [START]            ½ mark
END IF                  ½ mark
PRINT "S = "; S         ½ mark
END
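The Q3(a) listing sums 1 - 1/3 + 1/5 - ... over odd denominators up to N. A minimal Python sketch of the same loop is given below as a cross-check; the function name is hypothetical. Like the BASIC version, the loop body runs once before N is tested.

```python
import math


def alternating_odd_series(n):
    """Sum 1 - 1/3 + 1/5 - ... over odd denominators k <= n (body runs at least once)."""
    s = 0.0
    k = 1
    sign = 1
    while True:
        s += sign / k      # S = S + T * J with T = 1/K
        k += 2             # next odd denominator
        sign = -sign       # J = 0 - J
        if k > n:          # IF K <= N THEN GOTO [START]
            break
    return s


print("S =", alternating_odd_series(5))
# For large N the sum approaches pi/4 (the Leibniz series).
print("4*S =", 4 * alternating_odd_series(100001), "pi =", math.pi)
```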
(b)
Worksheet (columns A to K):

Row  A                    B     C     D     E     F     G     H     I     J     K
1    Candidate            1     2     3     4     5     6     7     8     9     10
2    Score Before         38    41    52    27    18    19    14    50    38    40
3    Score After          40    45    49    30    24    24    19    49    36    39
4    Difference           2     4     -3    3     6     5     5     -1    -2    -1
6    Average Difference   1.8
7    Standard Deviation   3.293
8    Count                10
9    Significance Level   0.05
10   T-Calculated         1.729
11   T-Critical           2.262
12   Decision             Do not reject null hypothesis

Marks for the worksheet entries: one allocation of 1 mark and seven of ½ mark (4½ marks).

Formulae                                                                      Marks
B6 = AVERAGE(B4:K4)                                                           ½
B7 = ROUND(STDEV(B4:K4),3)                                                    ½
B8 = COUNT(B4:K4)                                                             ½
B10 = ROUND(ABS(B6-0)/(B7/SQRT(B8)),3)                                        1½
B11 = ROUND(TINV(B9,B8-1),3)                                                  1
B12 = IF(B10<B11,"Do not reject null hypothesis","Reject null hypothesis")    1½
Total marks: 10
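The paired t-test figures can be reproduced with a short Python sketch. The before and after scores below are a reading of the flattened worksheet (an assumption, though the differences reproduce the printed mean and standard deviation). The standard deviation is rounded to three places before use, as B7 is, and the critical value 2.262 is copied from the worksheet rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed reading of rows 2 and 3 of the worksheet (candidates 1 to 10).
before = [38, 41, 52, 27, 18, 19, 14, 50, 38, 40]
after = [40, 45, 49, 30, 24, 24, 19, 49, 36, 39]
T_CRITICAL = 2.262  # two-tailed, 9 degrees of freedom, taken from the worksheet


def paired_t(before, after):
    """Return (differences, mean, rounded SD, count, rounded |t|) for paired data."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    avg = mean(diffs)
    sd = round(stdev(diffs), 3)                  # like B7 = ROUND(STDEV(...),3)
    t = round(abs(avg - 0) / (sd / sqrt(n)), 3)  # like B10
    return diffs, avg, sd, n, t


diffs, avg, sd, n, t = paired_t(before, after)
decision = ("Do not reject null hypothesis" if t < T_CRITICAL
            else "Reject null hypothesis")
print(avg, sd, n, t, decision)
```

Because B10 is built from the rounded B7, using the unrounded standard deviation would give a slightly smaller t; the decision is unaffected.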
(c)
Coefficients (B2:F6)                 Constants (H2:H6)
 2    1    1    1    2                7
 2    2    3   -1   -1                8
 4   -1   -1    2    1                1
 6   -2    0    1   -2               -3
 2    0    1   -2    1                2
½ mark                               ½ mark

Inverse of coefficient matrix (B10:F14)
=ROUND(MINVERSE(B2:F6),3)
-0.385    0.269    0.538   -0.192    0.115
-1.333    1        1.667   -1        0
 1.513   -0.692   -1.718    0.923   -0.154
 0.615   -0.231   -0.462    0.308   -0.385
 0.487   -0.308   -0.282    0.077    0.154

answer:
=MMULT(B10:F14,H2:H6)
x1 = 0.801
x2 = 3.336
x3 = 0.26
x4 = 0.301
x5 = 0.74

Marks: four allocations of 2 marks across the inverse and the answer.
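As a cross-check on the MINVERSE/MMULT approach, the system can be solved exactly in Python. This sketch uses Gauss-Jordan elimination with fractions, which is a different technique from the worksheet's matrix inverse, and it reads the coefficient entries as the 5×5 matrix shown in the code (an interpretation of the flattened worksheet). The function name is hypothetical.

```python
from fractions import Fraction


def solve_exact(a, b):
    """Solve a x = b by Gauss-Jordan elimination using exact fractions."""
    n = len(a)
    # Augmented matrix [a | b]
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot (raises StopIteration if singular).
        pivot_row = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot_row] = m[pivot_row], m[col]
        pivot = m[col][col]
        m[col] = [v / pivot for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Assumed reading of B2:F6 and H2:H6.
coefficients = [
    [2, 1, 1, 1, 2],
    [2, 2, 3, -1, -1],
    [4, -1, -1, 2, 1],
    [6, -2, 0, 1, -2],
    [2, 0, 1, -2, 1],
]
constants = [7, 8, 1, -3, 2]

solution = solve_exact(coefficients, constants)
print([round(float(x), 3) for x in solution])
```

The exact values round to 0.808, 3.333, 0.256, 0.308 and 0.744. The worksheet answers differ in the third decimal place because the inverse is rounded to three places before MMULT is applied.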
Q4.
(a)
Data entry
1 mark
ANOVA
Source of Variation   SS      df   MS      F        P-value   F crit
Temperature           0.301   2    0.151   8.469    0.009     4.256
Pressure              0.768   2    0.384   21.594   0.000     4.256
Interaction           0.069   4    0.017   0.969    0.470     3.633
Error                 0.160   9    0.018
Total                 1.298   17
3 marks
i. Hypothesis:
H0: There is no main factor effect
H1: There is a main factor effect
1 mark
Both pressure and temperature are significant at the 5% significance level (F = 21.594 and F = 8.469 respectively, each greater than F crit = 4.256). Thus, both factors influence yield.
2 marks
ii. Hypothesis:
H0: There is no interaction
H1: There is interaction
1 mark
There is no evidence of interaction between the factors at the 5% level of significance (F = 0.969 is less than F crit = 3.633; p-value = 0.470).
2 marks
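The decision rule behind both conclusions can be sketched in Python: an effect is declared significant when its F statistic exceeds the critical value (equivalently, when its p-value is below 0.05). The figures are copied from the ANOVA output; the dictionary layout and function name are hypothetical.

```python
ALPHA = 0.05

# F, p-value and F crit as printed in the ANOVA output.
anova = {
    "Temperature": {"F": 8.469, "p": 0.009, "F_crit": 4.256},
    "Pressure": {"F": 21.594, "p": 0.000, "F_crit": 4.256},
    "Interaction": {"F": 0.969, "p": 0.470, "F_crit": 3.633},
}


def is_significant(row, alpha=ALPHA):
    """True when the effect is significant at the given level."""
    return row["F"] > row["F_crit"] and row["p"] < alpha


decisions = {source: is_significant(row) for source, row in anova.items()}
for source, significant in decisions.items():
    print(source, "significant" if significant else "not significant")
```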
(b)
INPUT "SAMPLE SIZE = "; N
PRINT "ENTER THE X-VALUES"
FOR I = 1 TO N
INPUT "X";I; " = "; X:X(I)= X
SUM = SUM + X(I)
NEXT I
AVE1 = SUM/N
' DV ACCUMULATES THE SUM OF SQUARED DEVIATIONS OF X
FOR I = 1 TO N
SSX = X(I)-AVE1
SX = SSX^2
DV = DV + SX
NEXT I
PRINT "ENTER THE Y-VALUES"
FOR I = 1 TO N
INPUT "Y"; I; " = "; Y:Y(I)= Y
SUM1 = SUM1 + Y(I)
NEXT I
AVE2 = SUM1/N
' SXY ACCUMULATES THE SUM OF CROSS-PRODUCTS OF DEVIATIONS
FOR I = 1 TO N
SSXY = ((X(I)- AVE1)*((Y(I)-AVE2)))
SXY = SXY + SSXY
NEXT I
' THE GRADIENT DIVIDES BY DV, SO DV IS THE VALUE TO TEST
IF DV = 0 THEN
PRINT "DIVISION BY ZERO (0) NOT POSSIBLE"
GOTO [QUIT]
END IF
GRAD = SXY/DV
CONST = AVE2 - GRAD*AVE1
IF GRAD<0 THEN
G1 = 0-GRAD
PRINT "Y = "; CONST; " - "; G1; "*X"
ELSE
PRINT "Y = "; CONST; " + "; GRAD; "*X"
END IF
[QUIT]
END
Marks for Q4(b): thirty allocations of ½ mark (15 marks in total).
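The least-squares calculation in the Q4(b) listing can be sketched in Python as a cross-check. The function names are hypothetical. The sketch guards against a zero sum of squared x-deviations, since that sum is the divisor in the gradient.

```python
def fit_line(xs, ys):
    """Return (constant, gradient) of the least-squares line y = const + grad*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    n = len(xs)
    ave1 = sum(xs) / n
    ave2 = sum(ys) / n
    dv = sum((x - ave1) ** 2 for x in xs)                        # DV in the listing
    sxy = sum((x - ave1) * (y - ave2) for x, y in zip(xs, ys))   # SXY in the listing
    if dv == 0:
        raise ValueError("all x-values are equal; the gradient is undefined")
    grad = sxy / dv
    const = ave2 - grad * ave1
    return const, grad


def format_line(const, grad):
    """Format the fitted equation, showing a negative gradient with a minus sign."""
    if grad < 0:
        return "Y = %s - %s*X" % (const, -grad)
    return "Y = %s + %s*X" % (const, grad)


# Hypothetical data, used only to exercise the functions.
const, grad = fit_line([1, 2, 3], [2, 4, 6])
print(format_line(const, grad))
```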
Q5
(a)
i. A flowchart is a pictorial presentation of an algorithm that shows its logical structure clearly.
2 marks
ii. An algorithm is a step-by-step procedure to solve a problem.
2 marks
iii. An array is a list of variables of the same type.
2 marks
iv. A prompt is a message displayed by the computer indicating that it is waiting for the user to enter instructions or data.
2 marks
v. Trailer data is an extra item of data entered after all the real data to signal that all the real data have been entered.
2 marks
(b)
OPEN "members.dat" FOR RANDOM AS #1 LEN=256                                 1½ marks
FIELD #1, 90 AS Name$, 110 AS Address$, 50 AS Rank$, 6 AS IDnumber          1½ marks
Name$ = "John Q. Public"                                                    ½ mark
Address$ = "456 Maple Street, Anytown, USA"                                 ½ mark
Rank$ = "Expert Programmer"                                                 ½ mark
IDnumber = 99                                                               ½ mark
PUT #1, 3                                                                   1 mark
GET #1,3                                                                    1 mark
Print Name$                                                                 ½ mark
Print Rank$                                                                 ½ mark
Print IDnumber                                                              ½ mark
close #1                                                                    1 mark
end                                                                         ½ mark
(c)
The formula below produces the frequency table:
=FREQUENCY(A1:A1000,C3:C8)

Bin      Frequency
10       160
17       158
31       270
38       162
45       158
>45      92
Total    1000
5 marks
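The counting rule that FREQUENCY applies can be sketched in Python: each value is counted in the first bin whose upper limit it does not exceed, and values above the last limit fall into a final overflow count (the ">45" row). The function name and the sample data are hypothetical; the bin limits are those of the table.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin like Excel's FREQUENCY; bins must be sorted ascending.

    counts[i] counts values v with bins[i-1] < v <= bins[i];
    the last element counts values greater than bins[-1].
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        counts[bisect_left(bins, value)] += 1
    return counts


bins = [10, 17, 31, 38, 45]
sample = [5, 10, 11, 17, 18, 46, 100]  # hypothetical data
counts = frequency(sample, bins)
print(counts, "total:", sum(counts))
```

As in the table, the counts always add up to the number of observations.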
Q6
(a)
(i)
Equality of variances
F-Test Two-Sample for Variances
                      H1         H2
Mean                  486.000    544.933
Variance              9369.571   9770.781
Observations          15.000     15.000
df                    14.000     14.000
F                     0.959
P(F<=f) one-tail      0.469
F Critical one-tail   0.403
2 marks
H0: Equal variance
H1: Unequal variance
1 mark
Since the p-value (0.469) is greater than the 5% significance level, we fail to reject the null hypothesis of equal variance. We therefore treat the two samples as having equal variances.
2 marks
(ii)
Homoscedastic t-test, that is, the two-sample t-test assuming equal variances
2 marks
(iii)
t-Test: Two-Sample Assuming Equal Variances
                               H1         H2
Mean                           486.000    544.933
Variance                       9369.571   9770.781
Observations                   15.000     15.000
Pooled Variance                9570.176
Hypothesized Mean Difference   0.000
df                             28.000
t Stat                         -1.650
P(T<=t) one-tail               0.055
t Critical one-tail            1.701
P(T<=t) two-tail               0.110
t Critical two-tail            2.048
2 marks
(α)
H0: μ1 = μ2
H1: μ1 ≠ μ2
1 mark
(β)
At the 5% significance level, we fail to reject the null hypothesis of equal means (two-tail p-value 0.110 is greater than 0.05, and |t Stat| = 1.650 is less than 2.048) and conclude that the two hospitals do not charge differently. Therefore, there is no need for Mr. Good to report the matter to Medicare payments.
2 marks
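The F statistic, pooled variance and t statistic in the two Excel outputs can be recomputed from the summary figures with a short Python sketch. The function names are hypothetical, and the two-tail critical value 2.048 is copied from the output rather than computed.

```python
from math import sqrt


def f_statistic(var1, var2):
    """Ratio of the two sample variances, as in the F-test output."""
    return var1 / var2


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, df, t) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / sqrt(pooled * (1 / n1 + 1 / n2))
    return pooled, df, t


# Summary figures for the two hospitals, taken from the Excel outputs.
mean_h1, var_h1, n_h1 = 486.000, 9369.571, 15
mean_h2, var_h2, n_h2 = 544.933, 9770.781, 15
T_CRITICAL_TWO_TAIL = 2.048  # from the output, df = 28

f = f_statistic(var_h1, var_h2)
pooled, df, t = pooled_t(mean_h1, var_h1, n_h1, mean_h2, var_h2, n_h2)
reject = abs(t) > T_CRITICAL_TWO_TAIL
print(round(f, 3), round(pooled, 3), df, round(t, 3), "reject H0:", reject)
```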
(b)
NEXT K                 1 mark
NEXT J                 1 mark
4 5 6 5 6 7 6 7 8      1 mark
(c)
i. The relationship between any one independent series and the dependent series can be captured by a straight line in a 2-axis graph.
2 marks
ii. The independent variables do not change if the sampling is replicated.
2 marks
iii. The sample size must be greater than the number of independent variables (N should be greater than k - 1).
2 marks
iv. Not all the values of any one independent series can be the same.
2 marks
v. The residual or disturbance error terms follow several rules.
2 marks
vi. There are no linear relationships among the independent variables.
2 marks