NCSS

Dan Dillon
Homework Problem #7 – Part 1
STAT 582, Statistical Consulting and Collaboration
Dr. Jennings
Evaluation of NCSS Software
NCSS is published by NCSS, LLC (Kaysville, Utah), a company founded in 1981. Apparently
the company is relatively small and/or has a significant interest in customer service because their
website states, “When you call NCSS, chances are Dr. Jerry Hintze [Ph.D. statistician, NCSS
President] will answer your questions”.1
The latest version of the software is NCSS 2007. One might suppose that the software is
targeted to applied statisticians in the life sciences, because the other two products the company
offers are PASS 2008 (Power Analysis and Sample Size, “…the best power analysis and sample
size tool on the market. No other sample size program covers as many statistical procedures.”2)
and GESS 2006 (Gene Expression Statistical System for Microarrays). In addition, the software
has specific functions for real estate appraisers, business analytics (e.g., forecasting), industrial
quality control and pharmaceutical development (e.g., bioequivalence).
The software appears to target single users or small group purchases (as opposed to “enterprise”
systems), as the price list does not mention bulk discounts (but does mention “custom quotes”
and “multi-user discounts”) and instead focuses on “a price much lower than the competition’s.”3
The licenses are single-user, perpetual licenses with lifetime phone and email support.4
The system uses a Windows-style GUI (menus and buttons) and is compatible with Windows
versions from 95 to Vista, and both 32- and 64-bit operating systems. It requires a Pentium-class
processor, 32 MB of RAM, 200 MB of hard disk space and Adobe Reader® version 7.3
1 http://www.ncss.com/about_ncss.html. Accessed April 27, 2010.
2 http://www.ncss.com/index.htm. Accessed April 27, 2010.
3 http://www.ncss.com/ncss.html. Accessed April 27, 2010.
4 https://www.ncssorders.com/ncss_pricelist.asp?Pricing=Academic. Accessed April 27, 2010.
Dan Dillon, STAT 582
Major Statistical Topics in NCSS
The following are the major areas covered by NCSS:5
Analysis of Variance
Descriptive Statistics
Multivariate I
Repeated Measures
Appraisal Methods
Design of Experiments
Multivariate II
ROC Curves
Binary Diagnostic Tests
Forecasting
Proportions
Survival Analysis
Charts and Graphs
General Linear Models
Quality Control
Time Series Analysis
Cross Tabulation
Meta-Analysis
Regression Analysis
T-Tests
Curve Fitting
Mixed Models
Reliability Analysis
There are over 180 distinct statistical and graphics procedures. Some of the more important ones
(from my standpoint) are:
• Numerous charts (including bar, error bar, and pie) and plots (including box, dot,
  histograms, percentile, scatter, and surface).
• Curve and distribution fitting functions (ratio of polynomials, exponential smoothing,
  extreme value fitting, beta, gamma, loglinear, lognormal).
• ANOVA tools (one-way, multiple and repeated measures, mixed models, general linear
  models).
• Various regression analyses (linear, logistic, multinomial logistic, Poisson, lognormal,
  “all-possible” regression search, non-linear, principal components, ridge, variable
  selection).
• Time-related analyses (survival (including Kaplan-Meier), life-table, longitudinal mixed
  models, time series).
• Various statistical tests (Chi-square, equivalence, Fisher's Exact, Mann-Whitney,
  Mantel-Haenszel, multiple comparison, normality, one-sample t-tests, paired t-tests,
  two-sample t-tests).
• Multivariate tools (including analysis of covariance, canonical correlation, clustering
  (hierarchical and K-means), discriminant analysis, factor analysis, MANOVA,
  multivariate and principal components analysis).
• Design tools (power calculations, balanced incomplete block, case-control, central
  composite designs, factorial, fractional factorial, Latin square, matched case-control,
  Plackett-Burman, response surface and screening designs).
5 http://www.ncss.com/ncss_procedures.html. Accessed April 27, 2010.
Page 2 of 7
User Interface and Output
The opening screen mimics a spreadsheet. See Figure 1.
Figure 1. Opening Screen – Spreadsheet.
One can enter the data directly or import the data. File formats from the following programs are
supported: Access, BMDP, DBase, Epi Info, Excel (including *.xlsx), Gauss, JMP, LimDep,
Lotus, MatLab, Minitab, Paradox, Quattro, SAS, SigmaPlot, Solo Dos, SPlus, SPSS, Stata,
Statistica, Symphony, Systat, *.txt (ASCII fixed and delimited), *.html, and ODBC.
Figure 2 is a close-up of the toolbar. It appears intuitive for a moderately trained statistician:
separate buttons lead to different types of analysis (linear regression, multiple regression, 1-way
ANOVA, GLM ANOVA, etc.).
Figure 2. Close-up of tool-bar.
I tested three different basic routines – graphing a scatterplot, linear regression and ANOVA – to
get a feel for the software. The graphics interface was easy and produced crisp graphics. See
Figure 3. I especially liked the fact that one could directly edit and reformat the text on the upper
part of the page and rescale the main graph as one saw fit.
Figure 3. Graphing Output.
Simple linear regression was easily accomplished. Data were imported from a comma-delimited
file. Figure 4 shows the screen that appears after the Linear Regression button is pressed.
Figure 4. Tabbed Dialog Box for Linear Regression.
Fourteen different tabs are available, covering features such as which variables are to be
analyzed, formatting of output, and types of graphs to be included. Bootstrap resampling is also
included. The right-hand side of the screen includes context-sensitive help: it varies not only
with the tab chosen but also with the box selected (e.g., the Dependent Variable box).
The Guide Me button will automatically move the cursor through most of the major options for
all of the tabs and finally prompt the user to press Run. (The graphing routine described above
had a similar tabbed dialog box, but with tabs and features unique to it.) Once the user has
chosen all the desired options, pressing the Run button executes the procedure.
The test run was done without much customization of the output. Ten (10) pages of
output were immediately produced, including several graphs and a three-paragraph summary.
Figure 5 is one page of the output.
Page/Date/Time   5  4/29/2010 10:15:37 PM
Database         C:\DOCUMENTS AND SETTINGS\OWNER\DESKTOP\TEMP2.S0
Y = C1   X = C2

Analysis of Variance Section
                     Sum of      Mean                  Prob     Power
Source         DF    Squares     Square      F-Ratio   Level    (5%)
Intercept       1    8405.813    8405.813
Slope           1    793.2805    793.2805    40.3478   0.0000   1.0000
Error          25    491.5262    19.66105
Lack of Fit    13    332.8328    25.60253    1.9360    0.1311
Pure Error     12    158.6933    13.22444
Adj. Total     26    1284.807    49.41564
Total          27    9690.62

s = Square Root(19.66105) = 4.434078
Notes:
The above report shows the F-Ratio for testing whether the slope is zero, the degrees of freedom,
and the mean square error. The mean square error, which estimates the variance of the residuals,
is used extensively in the calculation of hypothesis tests and confidence intervals.
Figure 5. Linear Regression Output.
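As a cross-check outside NCSS, the derived quantities in the Analysis of Variance Section of Figure 5 can be recomputed from the reported sums of squares and degrees of freedom. The following is a minimal Python sketch, not part of the original evaluation; the variable names are my own, and R-squared is computed here as an extra (it is not shown on this page of the output).

```python
import math

# Sums of squares and degrees of freedom copied from Figure 5.
ss_slope, df_slope = 793.2805, 1
ss_error, df_error = 491.5262, 25
ss_lack_of_fit, df_lack_of_fit = 332.8328, 13
ss_pure_error, df_pure_error = 158.6933, 12
ss_adj_total = 1284.807

# Mean square = sum of squares / degrees of freedom.
ms_slope = ss_slope / df_slope
mse = ss_error / df_error                      # mean square error
ms_lack_of_fit = ss_lack_of_fit / df_lack_of_fit
ms_pure_error = ss_pure_error / df_pure_error

# F-ratio for testing whether the slope is zero.
f_slope = ms_slope / mse
# F-ratio for the lack-of-fit test (lack of fit vs. pure error).
f_lack_of_fit = ms_lack_of_fit / ms_pure_error
# s, the square root of the mean square error.
s = math.sqrt(mse)
# Share of the adjusted total variation explained by the slope.
r_squared = ss_slope / ss_adj_total

print(f"MSE           = {mse:.5f}")
print(f"F (slope)     = {f_slope:.4f}")
print(f"F (lack fit)  = {f_lack_of_fit:.4f}")
print(f"s             = {s:.6f}")
print(f"R-squared     = {r_squared:.4f}")
```

The F-ratios and s printed by this sketch should agree with the values NCSS reports (40.3478, 1.9360 and 4.434078) up to rounding.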
The ANOVA procedure worked in much the same way but, not surprisingly, had fewer options:
there were 5 tabs, entitled Variables, Reports, Box Plot, Means Plot and Template. Figure 6
contains excerpts from the output.
Analysis of Variance Report
Page/Date/Time   1  4/30/2010 9:29:25 PM
Database         C:\DOCUMENTS AND SETTINGS\OWNER\DESKTOP\TEMP3.S0
Response         C1
[Ed.: output deleted…]
Box Plot Section
[Box plot of C1 (vertical axis, 20.00 to 45.00) by C2 (levels 1, 2, 3)]
[Ed.: output deleted…]
Analysis of Variance Table
Source                    Sum of     Mean                 Prob        Power
Term                DF    Squares    Square      F-Ratio  Level       (Alpha=0.05)
A: C2                2    672        336         16.96    0.000041*   0.998982
S(A)                21    416        19.80952
Total (Adjusted)    23    1088
Total               24
* Term significant at alpha = 0.05
[Ed.: output deleted…]
Plots of Means Section
[Plot of means of C1 (vertical axis, 20.00 to 40.00) by C2 (levels 1, 2, 3)]
Figure 6. ANOVA Output.
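The arithmetic behind the Analysis of Variance Table in the ANOVA output above can be verified the same way. This is a small Python sketch under my own naming (the function `one_way_anova` is hypothetical and not an NCSS feature); it takes only the sums of squares and degrees of freedom reported for the factor A: C2 and the error term S(A).

```python
def one_way_anova(ss_between, df_between, ss_within, df_within):
    """Rebuild the derived columns of a one-way ANOVA table."""
    ms_between = ss_between / df_between   # mean square for the factor
    ms_within = ss_within / df_within      # mean square for error
    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
        # Adjusted totals are the sums of the two rows above them.
        "ss_total_adj": ss_between + ss_within,
        "df_total_adj": df_between + df_within,
    }

# Values copied from the NCSS output: A: C2 (DF 2, SS 672), S(A) (DF 21, SS 416).
table = one_way_anova(672, 2, 416, 21)
for name, value in table.items():
    print(f"{name}: {value:.5f}")
```

The mean squares, F-ratio and adjusted totals it prints should match the NCSS table (336, 19.80952, 16.96, 1088 and 23) up to rounding.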
Conclusion
I found the software very easy to manipulate and very intuitive, with excellent graphics. It was
my first experience with a Windows-style statistical package. I’m very familiar with SAS and its
coding requirements and somewhat familiar with R and its interactive features. I could easily
unlearn both and learn this software for most of my needs.